* [PATCH v5 00/15] paravirtual IOMMU interface
From: Paul Durrant @ 2018-08-03 17:22 UTC
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
	Razvan Cojocaru, Andrew Cooper, Ian Jackson, George Dunlap,
	Tim Deegan, Julien Grall, Paul Durrant, Tamas K Lengyel,
	Jan Beulich, Daniel De Graaf, Brian Woods, Suravee Suthikulpanit

The idea of a paravirtual IOMMU interface was last discussed on xen-devel
several years ago and converged on a draft specification [1]. An RFC patch
series with an implementation was also posted, but it was never followed
through.

In this patch series I have tried to simplify the interface and have
therefore moved away from the draft specification. There is no new
specification yet, but hopefully the interface in the introduced iommu_op
header file will be understandable without one.

[1] https://lists.xenproject.org/archives/html/xen-devel/2016-02/msg01428.html
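
To give a flavour of the shape of the interface, the fragment below is a
purely illustrative sketch of how a guest might drive the new hypercall.
The structure name, op constant, fields and calling convention are all
assumptions made for illustration only; the definitive definitions are in
the iommu_op header introduced by this series.

    /* HYPOTHETICAL sketch only: xen_iommu_op, XEN_IOMMUOP_query_reserved,
     * the status field and the (buf, count) calling convention are
     * illustrative assumptions, not quotations from public/iommu_op.h. */
    struct xen_iommu_op op = {
        .op = XEN_IOMMUOP_query_reserved,  /* e.g. query reserved BFN ranges */
    };
    int rc = HYPERVISOR_iommu_op(&op, 1);  /* one op in the buffer */

    if ( !rc )
        rc = op.status;                    /* hypothetical per-op status */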

Paul Durrant (15):
  iommu: turn need_iommu back into a boolean.
  iommu: introduce the concept of BFN...
  iommu: make use of type-safe BFN and MFN in exported functions
  iommu: push use of type-safe BFN and MFN into iommu_ops
  iommu: don't domain_crash() inside iommu_map/unmap_page()
  public / x86: introduce __HYPERCALL_iommu_op
  iommu: track reserved ranges using a rangeset
  x86: add iommu_op to query reserved ranges
  vtd: add lookup_page method to iommu_ops
  mm / iommu: include need_iommu() test in iommu_use_hap_pt()
  mm / iommu: split need_iommu() into has_iommu_pt() and sync_iommu_pt()
  x86: add iommu_op to enable modification of IOMMU mappings
  memory: add get_paged_gfn() as a wrapper...
  x86: add iommu_ops to modify and flush IOMMU mappings
  x86: extend the map and unmap iommu_ops to support grant references

 tools/flask/policy/modules/xen.if             |   1 +
 xen/arch/arm/p2m.c                            |   9 +-
 xen/arch/x86/Makefile                         |   1 +
 xen/arch/x86/hvm/emulate.c                    |  32 +-
 xen/arch/x86/hvm/hvm.c                        |  16 +-
 xen/arch/x86/hvm/hypercall.c                  |   1 +
 xen/arch/x86/hvm/mtrr.c                       |   5 +-
 xen/arch/x86/hypercall.c                      |   1 +
 xen/arch/x86/iommu_op.c                       | 558 ++++++++++++++++++++++++++
 xen/arch/x86/mm.c                             |  15 +-
 xen/arch/x86/mm/mem_sharing.c                 |   2 +-
 xen/arch/x86/mm/p2m-ept.c                     |  19 +-
 xen/arch/x86/mm/p2m-pt.c                      |  52 ++-
 xen/arch/x86/mm/p2m.c                         |  44 +-
 xen/arch/x86/mm/paging.c                      |   2 +-
 xen/arch/x86/pv/hypercall.c                   |   1 +
 xen/arch/x86/x86_64/mm.c                      |   8 +-
 xen/common/grant_table.c                      | 201 ++++++++--
 xen/common/memory.c                           |  69 +++-
 xen/common/vm_event.c                         |   2 +-
 xen/drivers/passthrough/amd/iommu_cmd.c       |  18 +-
 xen/drivers/passthrough/amd/iommu_map.c       |  86 ++--
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |   6 +-
 xen/drivers/passthrough/arm/smmu.c            |  20 +-
 xen/drivers/passthrough/device_tree.c         |  19 +-
 xen/drivers/passthrough/iommu.c               |  84 ++--
 xen/drivers/passthrough/pci.c                 |   6 +-
 xen/drivers/passthrough/vtd/iommu.c           |  90 +++--
 xen/drivers/passthrough/vtd/iommu.h           |   3 +
 xen/drivers/passthrough/vtd/x86/vtd.c         |  23 +-
 xen/drivers/passthrough/x86/iommu.c           |   8 +-
 xen/include/Makefile                          |   2 +
 xen/include/asm-arm/grant_table.h             |   2 +-
 xen/include/asm-arm/iommu.h                   |   2 +-
 xen/include/asm-arm/p2m.h                     |   3 +
 xen/include/asm-x86/grant_table.h             |   2 +-
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |   8 +-
 xen/include/asm-x86/iommu.h                   |   5 +-
 xen/include/asm-x86/p2m.h                     |   2 +
 xen/include/public/iommu_op.h                 | 171 ++++++++
 xen/include/public/xen.h                      |   1 +
 xen/include/xen/grant_table.h                 |   7 +
 xen/include/xen/hypercall.h                   |  12 +
 xen/include/xen/iommu.h                       |  66 ++-
 xen/include/xen/mm.h                          |  10 +-
 xen/include/xen/sched.h                       |  12 +-
 xen/include/xlat.lst                          |   6 +
 xen/include/xsm/dummy.h                       |   6 +
 xen/include/xsm/xsm.h                         |   6 +
 xen/xsm/dummy.c                               |   1 +
 xen/xsm/flask/hooks.c                         |   6 +
 xen/xsm/flask/policy/access_vectors           |   2 +
 52 files changed, 1426 insertions(+), 308 deletions(-)
 create mode 100644 xen/arch/x86/iommu_op.c
 create mode 100644 xen/include/public/iommu_op.h
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Brian Woods <brian.woods@amd.com>
Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Tamas K Lengyel <tamas@tklengyel.com>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

* [PATCH v5 01/15] iommu: turn need_iommu back into a boolean.
From: Paul Durrant @ 2018-08-03 17:22 UTC
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Paul Durrant, Jan Beulich

As noted in [1], the tri-state nature of need_iommu after commit 3e06b989
is confusing.

Because arch_iommu_populate_page_table() removes pages from the page list
as it maps them, it is safe to re-invoke multiple times without any
specific indication that it has been called before. It is therefore safe
to simply turn need_iommu back into a boolean, with no change to the
population algorithm.

[1] https://lists.xenproject.org/archives/html/xen-devel/2018-07/msg01870.html
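
As a quick summary of the semantic change (distilled from the sched.h and
iommu.c hunks below), the field loses its intermediate state and callers
move from a three-way comparison to a plain boolean test:

    /* Field in struct domain (xen/include/xen/sched.h): */
    s8   need_iommu;        /* before: -1 = "being set up", 0 = no, 1 = yes */
    bool need_iommu;        /* after:  a simple yes/no */

    /* Caller-side tests change accordingly, e.g. in iommu_construct(): */
    if ( need_iommu(d) > 0 )   /* before */
        return 0;
    if ( need_iommu(d) )       /* after */
        return 0;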

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>

v5:
 - Simpler patch replacing the more significant reversion of 3e06b989 in
   v4.

v4:
 - New in v4.
---
 xen/drivers/passthrough/device_tree.c | 19 ++++++++-----------
 xen/drivers/passthrough/iommu.c       |  8 ++++----
 xen/drivers/passthrough/x86/iommu.c   |  2 --
 xen/include/xen/sched.h               |  4 ++--
 4 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/xen/drivers/passthrough/device_tree.c b/xen/drivers/passthrough/device_tree.c
index 421f003438..671c8db1d0 100644
--- a/xen/drivers/passthrough/device_tree.c
+++ b/xen/drivers/passthrough/device_tree.c
@@ -40,17 +40,14 @@ int iommu_assign_dt_device(struct domain *d, struct dt_device_node *dev)
     if ( !list_empty(&dev->domain_list) )
         goto fail;
 
-    if ( need_iommu(d) <= 0 )
-    {
-        /*
-         * The hwdom is forced to use IOMMU for protecting assigned
-         * device. Therefore the IOMMU data is already set up.
-         */
-        ASSERT(!is_hardware_domain(d));
-        rc = iommu_construct(d);
-        if ( rc )
-            goto fail;
-    }
+    /*
+     * The hwdom is forced to use IOMMU for protecting assigned
+     * device. Therefore the IOMMU data is already set up.
+     */
+    ASSERT(!is_hardware_domain(d) || need_iommu(d));
+    rc = iommu_construct(d);
+    if ( rc )
+        goto fail;
 
     /* The flag field doesn't matter to DT device. */
     rc = hd->platform_ops->assign_device(d, 0, dt_to_dev(dev), 0);
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 70d218f910..49974e5634 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -214,14 +214,14 @@ void iommu_teardown(struct domain *d)
 {
     const struct domain_iommu *hd = dom_iommu(d);
 
-    d->need_iommu = 0;
+    d->need_iommu = false;
     hd->platform_ops->teardown(d);
     tasklet_schedule(&iommu_pt_cleanup_tasklet);
 }
 
 int iommu_construct(struct domain *d)
 {
-    if ( need_iommu(d) > 0 )
+    if ( need_iommu(d) )
         return 0;
 
     if ( !iommu_use_hap_pt(d) )
@@ -233,7 +233,7 @@ int iommu_construct(struct domain *d)
             return rc;
     }
 
-    d->need_iommu = 1;
+    d->need_iommu = true;
     /*
      * There may be dirty cache lines when a device is assigned
      * and before need_iommu(d) becoming true, this will cause
@@ -493,7 +493,7 @@ static void iommu_dump_p2m_table(unsigned char key)
     ops = iommu_get_ops();
     for_each_domain(d)
     {
-        if ( is_hardware_domain(d) || need_iommu(d) <= 0 )
+        if ( is_hardware_domain(d) || !need_iommu(d) )
             continue;
 
         if ( iommu_use_hap_pt(d) )
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 68182afd91..d48d16b51b 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -45,8 +45,6 @@ int arch_iommu_populate_page_table(struct domain *d)
     struct page_info *page;
     int rc = 0, n = 0;
 
-    d->need_iommu = -1;
-
     this_cpu(iommu_dont_flush_iotlb) = 1;
     spin_lock(&d->page_alloc_lock);
 
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 851f11ecf7..5b9820e4e1 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -372,8 +372,8 @@ struct domain
 #ifdef CONFIG_HAS_PASSTHROUGH
     struct domain_iommu iommu;
 
-    /* Does this guest need iommu mappings (-1 meaning "being set up")? */
-    s8               need_iommu;
+    /* Does this guest need iommu mappings? */
+    bool             need_iommu;
 #endif
     /* is node-affinity automatically computed? */
     bool             auto_node_affinity;
-- 
2.11.0


* [PATCH v5 02/15] iommu: introduce the concept of BFN...
From: Paul Durrant @ 2018-08-03 17:22 UTC
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Suravee Suthikulpanit,
	Julien Grall, Paul Durrant, Jan Beulich

...meaning 'bus frame number' i.e. a frame number mapped in the IOMMU
rather than the MMU.

This patch is a largely cosmetic change that substitutes the terms 'bfn'
and 'baddr' for 'gfn' and 'gaddr' in all the places where the frame number
or address relates to the IOMMU rather than the MMU.

The parts that are not purely cosmetic are:

 - the introduction of a type-safe declaration of bfn_t and definition of
   INVALID_BFN to make the substitution of gfn_x(INVALID_GFN) mechanical.
 - the introduction of __bfn_to_baddr and __baddr_to_bfn (and type-safe
   variants without the leading __) with some use of the former.

Subsequent patches will convert code to make use of type-safe BFNs.
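
For orientation, here is a minimal sketch (using only the definitions added
to xen/include/xen/iommu.h in the hunk below; the numeric value is
arbitrary) of how the raw and type-safe conversion helpers compose:

    bfn_t   bfn  = _bfn(0x1000UL);       /* box a raw bus frame number */
    baddr_t addr = bfn_to_baddr(bfn);    /* 0x1000 << IOMMU_PAGE_SHIFT */

    ASSERT(addr == __bfn_to_baddr(0x1000UL));         /* raw variant */
    ASSERT(bfn_x(baddr_to_bfn(addr)) == bfn_x(bfn));  /* round trip */

Comparisons against INVALID_BFN go through bfn_x(), mirroring the existing
gfn_x(INVALID_GFN) idiom, as the converted call sites below show.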

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Kevin Tian <kevin.tian@intel.com>

v3:
 - Get rid of intermediate 'frame' variables again.

v2:
 - Addressed comments from Jan.
---
 xen/drivers/passthrough/amd/iommu_cmd.c     | 18 +++----
 xen/drivers/passthrough/amd/iommu_map.c     | 76 ++++++++++++++---------------
 xen/drivers/passthrough/amd/pci_amd_iommu.c |  2 +-
 xen/drivers/passthrough/arm/smmu.c          | 16 +++---
 xen/drivers/passthrough/iommu.c             | 28 +++++------
 xen/drivers/passthrough/vtd/iommu.c         | 30 ++++++------
 xen/include/xen/iommu.h                     | 38 ++++++++++++---
 xen/include/xen/mm.h                        | 10 +++-
 8 files changed, 125 insertions(+), 93 deletions(-)

diff --git a/xen/drivers/passthrough/amd/iommu_cmd.c b/xen/drivers/passthrough/amd/iommu_cmd.c
index 08247fa354..f93becd6e1 100644
--- a/xen/drivers/passthrough/amd/iommu_cmd.c
+++ b/xen/drivers/passthrough/amd/iommu_cmd.c
@@ -284,7 +284,7 @@ void invalidate_iommu_all(struct amd_iommu *iommu)
 }
 
 void amd_iommu_flush_iotlb(u8 devfn, const struct pci_dev *pdev,
-                           uint64_t gaddr, unsigned int order)
+                           baddr_t baddr, unsigned int order)
 {
     unsigned long flags;
     struct amd_iommu *iommu;
@@ -315,12 +315,12 @@ void amd_iommu_flush_iotlb(u8 devfn, const struct pci_dev *pdev,
 
     /* send INVALIDATE_IOTLB_PAGES command */
     spin_lock_irqsave(&iommu->lock, flags);
-    invalidate_iotlb_pages(iommu, maxpend, 0, queueid, gaddr, req_id, order);
+    invalidate_iotlb_pages(iommu, maxpend, 0, queueid, baddr, req_id, order);
     flush_command_buffer(iommu);
     spin_unlock_irqrestore(&iommu->lock, flags);
 }
 
-static void amd_iommu_flush_all_iotlbs(struct domain *d, uint64_t gaddr,
+static void amd_iommu_flush_all_iotlbs(struct domain *d, baddr_t baddr,
                                        unsigned int order)
 {
     struct pci_dev *pdev;
@@ -333,7 +333,7 @@ static void amd_iommu_flush_all_iotlbs(struct domain *d, uint64_t gaddr,
         u8 devfn = pdev->devfn;
 
         do {
-            amd_iommu_flush_iotlb(devfn, pdev, gaddr, order);
+            amd_iommu_flush_iotlb(devfn, pdev, baddr, order);
             devfn += pdev->phantom_stride;
         } while ( devfn != pdev->devfn &&
                   PCI_SLOT(devfn) == PCI_SLOT(pdev->devfn) );
@@ -342,7 +342,7 @@ static void amd_iommu_flush_all_iotlbs(struct domain *d, uint64_t gaddr,
 
 /* Flush iommu cache after p2m changes. */
 static void _amd_iommu_flush_pages(struct domain *d,
-                                   uint64_t gaddr, unsigned int order)
+                                   baddr_t baddr, unsigned int order)
 {
     unsigned long flags;
     struct amd_iommu *iommu;
@@ -352,13 +352,13 @@ static void _amd_iommu_flush_pages(struct domain *d,
     for_each_amd_iommu ( iommu )
     {
         spin_lock_irqsave(&iommu->lock, flags);
-        invalidate_iommu_pages(iommu, gaddr, dom_id, order);
+        invalidate_iommu_pages(iommu, baddr, dom_id, order);
         flush_command_buffer(iommu);
         spin_unlock_irqrestore(&iommu->lock, flags);
     }
 
     if ( ats_enabled )
-        amd_iommu_flush_all_iotlbs(d, gaddr, order);
+        amd_iommu_flush_all_iotlbs(d, baddr, order);
 }
 
 void amd_iommu_flush_all_pages(struct domain *d)
@@ -367,9 +367,9 @@ void amd_iommu_flush_all_pages(struct domain *d)
 }
 
 void amd_iommu_flush_pages(struct domain *d,
-                           unsigned long gfn, unsigned int order)
+                           unsigned long bfn, unsigned int order)
 {
-    _amd_iommu_flush_pages(d, (uint64_t) gfn << PAGE_SHIFT, order);
+    _amd_iommu_flush_pages(d, __bfn_to_baddr(bfn), order);
 }
 
 void amd_iommu_flush_device(struct amd_iommu *iommu, uint16_t bdf)
diff --git a/xen/drivers/passthrough/amd/iommu_map.c b/xen/drivers/passthrough/amd/iommu_map.c
index 70b4345b37..4deab9cd2f 100644
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -35,12 +35,12 @@ static unsigned int pfn_to_pde_idx(unsigned long pfn, unsigned int level)
     return idx;
 }
 
-void clear_iommu_pte_present(unsigned long l1_mfn, unsigned long gfn)
+void clear_iommu_pte_present(unsigned long l1_mfn, unsigned long bfn)
 {
     u64 *table, *pte;
 
     table = map_domain_page(_mfn(l1_mfn));
-    pte = table + pfn_to_pde_idx(gfn, IOMMU_PAGING_MODE_LEVEL_1);
+    pte = table + pfn_to_pde_idx(bfn, IOMMU_PAGING_MODE_LEVEL_1);
     *pte = 0;
     unmap_domain_page(table);
 }
@@ -104,7 +104,7 @@ static bool_t set_iommu_pde_present(u32 *pde, unsigned long next_mfn,
     return need_flush;
 }
 
-static bool_t set_iommu_pte_present(unsigned long pt_mfn, unsigned long gfn, 
+static bool_t set_iommu_pte_present(unsigned long pt_mfn, unsigned long bfn,
                                     unsigned long next_mfn, int pde_level, 
                                     bool_t iw, bool_t ir)
 {
@@ -114,7 +114,7 @@ static bool_t set_iommu_pte_present(unsigned long pt_mfn, unsigned long gfn,
 
     table = map_domain_page(_mfn(pt_mfn));
 
-    pde = (u32*)(table + pfn_to_pde_idx(gfn, pde_level));
+    pde = (u32*)(table + pfn_to_pde_idx(bfn, pde_level));
 
     need_flush = set_iommu_pde_present(pde, next_mfn, 
                                        IOMMU_PAGING_MODE_LEVEL_0, iw, ir);
@@ -331,7 +331,7 @@ static void set_pde_count(u64 *pde, unsigned int count)
  * otherwise increase pde count if mfn is contigous with mfn - 1
  */
 static int iommu_update_pde_count(struct domain *d, unsigned long pt_mfn,
-                                  unsigned long gfn, unsigned long mfn,
+                                  unsigned long bfn, unsigned long mfn,
                                   unsigned int merge_level)
 {
     unsigned int pde_count, next_level;
@@ -347,7 +347,7 @@ static int iommu_update_pde_count(struct domain *d, unsigned long pt_mfn,
 
     /* get pde at merge level */
     table = map_domain_page(_mfn(pt_mfn));
-    pde = table + pfn_to_pde_idx(gfn, merge_level);
+    pde = table + pfn_to_pde_idx(bfn, merge_level);
 
     /* get page table of next level */
     ntable_maddr = amd_iommu_get_next_table_from_pte((u32*)pde);
@@ -362,7 +362,7 @@ static int iommu_update_pde_count(struct domain *d, unsigned long pt_mfn,
     mask = (1ULL<< (PTE_PER_TABLE_SHIFT * next_level)) - 1;
 
     if ( ((first_mfn & mask) == 0) &&
-         (((gfn & mask) | first_mfn) == mfn) )
+         (((bfn & mask) | first_mfn) == mfn) )
     {
         pde_count = get_pde_count(*pde);
 
@@ -387,7 +387,7 @@ out:
 }
 
 static int iommu_merge_pages(struct domain *d, unsigned long pt_mfn,
-                             unsigned long gfn, unsigned int flags,
+                             unsigned long bfn, unsigned int flags,
                              unsigned int merge_level)
 {
     u64 *table, *pde, *ntable;
@@ -398,7 +398,7 @@ static int iommu_merge_pages(struct domain *d, unsigned long pt_mfn,
     ASSERT( spin_is_locked(&hd->arch.mapping_lock) && pt_mfn );
 
     table = map_domain_page(_mfn(pt_mfn));
-    pde = table + pfn_to_pde_idx(gfn, merge_level);
+    pde = table + pfn_to_pde_idx(bfn, merge_level);
 
     /* get first mfn */
     ntable_mfn = amd_iommu_get_next_table_from_pte((u32*)pde) >> PAGE_SHIFT;
@@ -436,7 +436,7 @@ static int iommu_merge_pages(struct domain *d, unsigned long pt_mfn,
  * {Re, un}mapping super page frames causes re-allocation of io
  * page tables.
  */
-static int iommu_pde_from_gfn(struct domain *d, unsigned long pfn, 
+static int iommu_pde_from_bfn(struct domain *d, unsigned long pfn,
                               unsigned long pt_mfn[])
 {
     u64 *pde, *next_table_vaddr;
@@ -477,11 +477,11 @@ static int iommu_pde_from_gfn(struct domain *d, unsigned long pfn,
              next_table_mfn != 0 )
         {
             int i;
-            unsigned long mfn, gfn;
+            unsigned long mfn, bfn;
             unsigned int page_sz;
 
             page_sz = 1 << (PTE_PER_TABLE_SHIFT * (next_level - 1));
-            gfn =  pfn & ~((1 << (PTE_PER_TABLE_SHIFT * next_level)) - 1);
+            bfn =  pfn & ~((1 << (PTE_PER_TABLE_SHIFT * next_level)) - 1);
             mfn = next_table_mfn;
 
             /* allocate lower level page table */
@@ -499,10 +499,10 @@ static int iommu_pde_from_gfn(struct domain *d, unsigned long pfn,
 
             for ( i = 0; i < PTE_PER_TABLE_SIZE; i++ )
             {
-                set_iommu_pte_present(next_table_mfn, gfn, mfn, next_level,
+                set_iommu_pte_present(next_table_mfn, bfn, mfn, next_level,
                                       !!IOMMUF_writable, !!IOMMUF_readable);
                 mfn += page_sz;
-                gfn += page_sz;
+                bfn += page_sz;
              }
 
             amd_iommu_flush_all_pages(d);
@@ -540,7 +540,7 @@ static int iommu_pde_from_gfn(struct domain *d, unsigned long pfn,
     return 0;
 }
 
-static int update_paging_mode(struct domain *d, unsigned long gfn)
+static int update_paging_mode(struct domain *d, unsigned long bfn)
 {
     u16 bdf;
     void *device_entry;
@@ -554,13 +554,13 @@ static int update_paging_mode(struct domain *d, unsigned long gfn)
     unsigned long old_root_mfn;
     struct domain_iommu *hd = dom_iommu(d);
 
-    if ( gfn == gfn_x(INVALID_GFN) )
+    if ( bfn == bfn_x(INVALID_BFN) )
         return -EADDRNOTAVAIL;
-    ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
+    ASSERT(!(bfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
 
     level = hd->arch.paging_mode;
     old_root = hd->arch.root_table;
-    offset = gfn >> (PTE_PER_TABLE_SHIFT * (level - 1));
+    offset = bfn >> (PTE_PER_TABLE_SHIFT * (level - 1));
 
     ASSERT(spin_is_locked(&hd->arch.mapping_lock) && is_hvm_domain(d));
 
@@ -631,7 +631,7 @@ static int update_paging_mode(struct domain *d, unsigned long gfn)
     return 0;
 }
 
-int amd_iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
+int amd_iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
                        unsigned int flags)
 {
     bool_t need_flush = 0;
@@ -651,34 +651,34 @@ int amd_iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
     if ( rc )
     {
         spin_unlock(&hd->arch.mapping_lock);
-        AMD_IOMMU_DEBUG("Root table alloc failed, gfn = %lx\n", gfn);
+        AMD_IOMMU_DEBUG("Root table alloc failed, bfn = %lx\n", bfn);
         domain_crash(d);
         return rc;
     }
 
     /* Since HVM domain is initialized with 2 level IO page table,
-     * we might need a deeper page table for lager gfn now */
+     * we might need a deeper page table for wider bfn now */
     if ( is_hvm_domain(d) )
     {
-        if ( update_paging_mode(d, gfn) )
+        if ( update_paging_mode(d, bfn) )
         {
             spin_unlock(&hd->arch.mapping_lock);
-            AMD_IOMMU_DEBUG("Update page mode failed gfn = %lx\n", gfn);
+            AMD_IOMMU_DEBUG("Update page mode failed bfn = %lx\n", bfn);
             domain_crash(d);
             return -EFAULT;
         }
     }
 
-    if ( iommu_pde_from_gfn(d, gfn, pt_mfn) || (pt_mfn[1] == 0) )
+    if ( iommu_pde_from_bfn(d, bfn, pt_mfn) || (pt_mfn[1] == 0) )
     {
         spin_unlock(&hd->arch.mapping_lock);
-        AMD_IOMMU_DEBUG("Invalid IO pagetable entry gfn = %lx\n", gfn);
+        AMD_IOMMU_DEBUG("Invalid IO pagetable entry bfn = %lx\n", bfn);
         domain_crash(d);
         return -EFAULT;
     }
 
     /* Install 4k mapping first */
-    need_flush = set_iommu_pte_present(pt_mfn[1], gfn, mfn, 
+    need_flush = set_iommu_pte_present(pt_mfn[1], bfn, mfn,
                                        IOMMU_PAGING_MODE_LEVEL_1,
                                        !!(flags & IOMMUF_writable),
                                        !!(flags & IOMMUF_readable));
@@ -690,7 +690,7 @@ int amd_iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
     /* 4K mapping for PV guests never changes, 
      * no need to flush if we trust non-present bits */
     if ( is_hvm_domain(d) )
-        amd_iommu_flush_pages(d, gfn, 0);
+        amd_iommu_flush_pages(d, bfn, 0);
 
     for ( merge_level = IOMMU_PAGING_MODE_LEVEL_2;
           merge_level <= hd->arch.paging_mode; merge_level++ )
@@ -698,15 +698,15 @@ int amd_iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
         if ( pt_mfn[merge_level] == 0 )
             break;
         if ( !iommu_update_pde_count(d, pt_mfn[merge_level],
-                                     gfn, mfn, merge_level) )
+                                     bfn, mfn, merge_level) )
             break;
 
-        if ( iommu_merge_pages(d, pt_mfn[merge_level], gfn, 
+        if ( iommu_merge_pages(d, pt_mfn[merge_level], bfn,
                                flags, merge_level) )
         {
             spin_unlock(&hd->arch.mapping_lock);
             AMD_IOMMU_DEBUG("Merge iommu page failed at level %d, "
-                            "gfn = %lx mfn = %lx\n", merge_level, gfn, mfn);
+                            "bfn = %lx mfn = %lx\n", merge_level, bfn, mfn);
             domain_crash(d);
             return -EFAULT;
         }
@@ -720,7 +720,7 @@ out:
     return 0;
 }
 
-int amd_iommu_unmap_page(struct domain *d, unsigned long gfn)
+int amd_iommu_unmap_page(struct domain *d, unsigned long bfn)
 {
     unsigned long pt_mfn[7];
     struct domain_iommu *hd = dom_iommu(d);
@@ -739,34 +739,34 @@ int amd_iommu_unmap_page(struct domain *d, unsigned long gfn)
     }
 
     /* Since HVM domain is initialized with 2 level IO page table,
-     * we might need a deeper page table for lager gfn now */
+     * we might need a deeper page table for lager bfn now */
     if ( is_hvm_domain(d) )
     {
-        int rc = update_paging_mode(d, gfn);
+        int rc = update_paging_mode(d, bfn);
 
         if ( rc )
         {
             spin_unlock(&hd->arch.mapping_lock);
-            AMD_IOMMU_DEBUG("Update page mode failed gfn = %lx\n", gfn);
+            AMD_IOMMU_DEBUG("Update page mode failed bfn = %lx\n", bfn);
             if ( rc != -EADDRNOTAVAIL )
                 domain_crash(d);
             return rc;
         }
     }
 
-    if ( iommu_pde_from_gfn(d, gfn, pt_mfn) || (pt_mfn[1] == 0) )
+    if ( iommu_pde_from_bfn(d, bfn, pt_mfn) || (pt_mfn[1] == 0) )
     {
         spin_unlock(&hd->arch.mapping_lock);
-        AMD_IOMMU_DEBUG("Invalid IO pagetable entry gfn = %lx\n", gfn);
+        AMD_IOMMU_DEBUG("Invalid IO pagetable entry bfn = %lx\n", bfn);
         domain_crash(d);
         return -EFAULT;
     }
 
     /* mark PTE as 'page not present' */
-    clear_iommu_pte_present(pt_mfn[1], gfn);
+    clear_iommu_pte_present(pt_mfn[1], bfn);
     spin_unlock(&hd->arch.mapping_lock);
 
-    amd_iommu_flush_pages(d, gfn, 0);
+    amd_iommu_flush_pages(d, bfn, 0);
 
     return 0;
 }
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index 12d2695b89..d608631e6e 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -578,7 +578,7 @@ static void amd_dump_p2m_table_level(struct page_info* pg, int level,
                 maddr_to_page(next_table_maddr), next_level,
                 address, indent + 1);
         else
-            printk("%*sgfn: %08lx  mfn: %08lx\n",
+            printk("%*sbfn: %08lx  mfn: %08lx\n",
                    indent, "",
                    (unsigned long)PFN_DOWN(address),
                    (unsigned long)PFN_DOWN(next_table_maddr));
diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
index 74c09b0991..1e4d561b47 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2551,7 +2551,7 @@ static int __must_check arm_smmu_iotlb_flush_all(struct domain *d)
 }
 
 static int __must_check arm_smmu_iotlb_flush(struct domain *d,
-                                             unsigned long gfn,
+                                             unsigned long bfn,
                                              unsigned int page_count)
 {
 	/* ARM SMMU v1 doesn't have flush by VMA and VMID */
@@ -2737,7 +2737,7 @@ static void arm_smmu_iommu_domain_teardown(struct domain *d)
 	xfree(xen_domain);
 }
 
-static int __must_check arm_smmu_map_page(struct domain *d, unsigned long gfn,
+static int __must_check arm_smmu_map_page(struct domain *d, unsigned long bfn,
 			unsigned long mfn, unsigned int flags)
 {
 	p2m_type_t t;
@@ -2748,10 +2748,10 @@ static int __must_check arm_smmu_map_page(struct domain *d, unsigned long gfn,
 	 * protected by an IOMMU, Xen needs to add a 1:1 mapping in the domain
 	 * p2m to allow DMA request to work.
 	 * This is only valid when the domain is directed mapped. Hence this
-	 * function should only be used by gnttab code with gfn == mfn.
+	 * function should only be used by gnttab code with gfn == mfn == bfn.
 	 */
 	BUG_ON(!is_domain_direct_mapped(d));
-	BUG_ON(mfn != gfn);
+	BUG_ON(mfn != bfn);
 
 	/* We only support readable and writable flags */
 	if (!(flags & (IOMMUF_readable | IOMMUF_writable)))
@@ -2763,19 +2763,19 @@ static int __must_check arm_smmu_map_page(struct domain *d, unsigned long gfn,
 	 * The function guest_physmap_add_entry replaces the current mapping
 	 * if there is already one...
 	 */
-	return guest_physmap_add_entry(d, _gfn(gfn), _mfn(mfn), 0, t);
+	return guest_physmap_add_entry(d, _gfn(bfn), _mfn(bfn), 0, t);
 }
 
-static int __must_check arm_smmu_unmap_page(struct domain *d, unsigned long gfn)
+static int __must_check arm_smmu_unmap_page(struct domain *d, unsigned long bfn)
 {
 	/*
 	 * This function should only be used by gnttab code when the domain
-	 * is direct mapped
+	 * is direct mapped (i.e. gfn == mfn == bfn).
 	 */
 	if ( !is_domain_direct_mapped(d) )
 		return -EINVAL;
 
-	return guest_physmap_remove_page(d, _gfn(gfn), _mfn(gfn), 0);
+	return guest_physmap_remove_page(d, _gfn(bfn), _mfn(bfn), 0);
 }
 
 static const struct iommu_ops arm_smmu_iommu_ops = {
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 49974e5634..0faab18e88 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -185,7 +185,7 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
         page_list_for_each ( page, &d->page_list )
         {
             unsigned long mfn = mfn_x(page_to_mfn(page));
-            unsigned long gfn = mfn_to_gmfn(d, mfn);
+            unsigned long bfn = mfn_to_gmfn(d, mfn);
             unsigned int mapping = IOMMUF_readable;
             int ret;
 
@@ -194,7 +194,7 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
                   == PGT_writable_page) )
                 mapping |= IOMMUF_writable;
 
-            ret = hd->platform_ops->map_page(d, gfn, mfn, mapping);
+            ret = hd->platform_ops->map_page(d, bfn, mfn, mapping);
             if ( !rc )
                 rc = ret;
 
@@ -255,7 +255,7 @@ void iommu_domain_destroy(struct domain *d)
     arch_iommu_domain_destroy(d);
 }
 
-int iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
+int iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
                    unsigned int flags)
 {
     const struct domain_iommu *hd = dom_iommu(d);
@@ -264,13 +264,13 @@ int iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->map_page(d, gfn, mfn, flags);
+    rc = hd->platform_ops->map_page(d, bfn, mfn, flags);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
             printk(XENLOG_ERR
-                   "d%d: IOMMU mapping gfn %#lx to mfn %#lx failed: %d\n",
-                   d->domain_id, gfn, mfn, rc);
+                   "d%d: IOMMU mapping bfn %#lx to mfn %#lx failed: %d\n",
+                   d->domain_id, bfn, mfn, rc);
 
         if ( !is_hardware_domain(d) )
             domain_crash(d);
@@ -279,7 +279,7 @@ int iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
     return rc;
 }
 
-int iommu_unmap_page(struct domain *d, unsigned long gfn)
+int iommu_unmap_page(struct domain *d, unsigned long bfn)
 {
     const struct domain_iommu *hd = dom_iommu(d);
     int rc;
@@ -287,13 +287,13 @@ int iommu_unmap_page(struct domain *d, unsigned long gfn)
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->unmap_page(d, gfn);
+    rc = hd->platform_ops->unmap_page(d, bfn);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
             printk(XENLOG_ERR
-                   "d%d: IOMMU unmapping gfn %#lx failed: %d\n",
-                   d->domain_id, gfn, rc);
+                   "d%d: IOMMU unmapping bfn %#lx failed: %d\n",
+                   d->domain_id, bfn, rc);
 
         if ( !is_hardware_domain(d) )
             domain_crash(d);
@@ -319,7 +319,7 @@ static void iommu_free_pagetables(unsigned long unused)
                             cpumask_cycle(smp_processor_id(), &cpu_online_map));
 }
 
-int iommu_iotlb_flush(struct domain *d, unsigned long gfn,
+int iommu_iotlb_flush(struct domain *d, unsigned long bfn,
                       unsigned int page_count)
 {
     const struct domain_iommu *hd = dom_iommu(d);
@@ -328,13 +328,13 @@ int iommu_iotlb_flush(struct domain *d, unsigned long gfn,
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush )
         return 0;
 
-    rc = hd->platform_ops->iotlb_flush(d, gfn, page_count);
+    rc = hd->platform_ops->iotlb_flush(d, bfn, page_count);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
             printk(XENLOG_ERR
-                   "d%d: IOMMU IOTLB flush failed: %d, gfn %#lx, page count %u\n",
-                   d->domain_id, rc, gfn, page_count);
+                   "d%d: IOMMU IOTLB flush failed: %d, bfn %#lx, page count %u\n",
+                   d->domain_id, rc, bfn, page_count);
 
         if ( !is_hardware_domain(d) )
             domain_crash(d);
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 1710256823..48f62e0e8d 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -585,7 +585,7 @@ static int __must_check iommu_flush_all(void)
 }
 
 static int __must_check iommu_flush_iotlb(struct domain *d,
-                                          unsigned long gfn,
+                                          unsigned long bfn,
                                           bool_t dma_old_pte_present,
                                           unsigned int page_count)
 {
@@ -612,12 +612,12 @@ static int __must_check iommu_flush_iotlb(struct domain *d,
         if ( iommu_domid == -1 )
             continue;
 
-        if ( page_count != 1 || gfn == gfn_x(INVALID_GFN) )
+        if ( page_count != 1 || bfn == bfn_x(INVALID_BFN) )
             rc = iommu_flush_iotlb_dsi(iommu, iommu_domid,
                                        0, flush_dev_iotlb);
         else
             rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
-                                       (paddr_t)gfn << PAGE_SHIFT_4K,
+                                       __bfn_to_baddr(bfn),
                                        PAGE_ORDER_4K,
                                        !dma_old_pte_present,
                                        flush_dev_iotlb);
@@ -633,15 +633,15 @@ static int __must_check iommu_flush_iotlb(struct domain *d,
 }
 
 static int __must_check iommu_flush_iotlb_pages(struct domain *d,
-                                                unsigned long gfn,
+                                                unsigned long bfn,
                                                 unsigned int page_count)
 {
-    return iommu_flush_iotlb(d, gfn, 1, page_count);
+    return iommu_flush_iotlb(d, bfn, 1, page_count);
 }
 
 static int __must_check iommu_flush_iotlb_all(struct domain *d)
 {
-    return iommu_flush_iotlb(d, gfn_x(INVALID_GFN), 0, 0);
+    return iommu_flush_iotlb(d, bfn_x(INVALID_BFN), 0, 0);
 }
 
 /* clear one page's page table */
@@ -1767,7 +1767,7 @@ static void iommu_domain_teardown(struct domain *d)
 }
 
 static int __must_check intel_iommu_map_page(struct domain *d,
-                                             unsigned long gfn,
+                                             unsigned long bfn,
                                              unsigned long mfn,
                                              unsigned int flags)
 {
@@ -1786,14 +1786,14 @@ static int __must_check intel_iommu_map_page(struct domain *d,
 
     spin_lock(&hd->arch.mapping_lock);
 
-    pg_maddr = addr_to_dma_page_maddr(d, (paddr_t)gfn << PAGE_SHIFT_4K, 1);
+    pg_maddr = addr_to_dma_page_maddr(d, __bfn_to_baddr(bfn), 1);
     if ( pg_maddr == 0 )
     {
         spin_unlock(&hd->arch.mapping_lock);
         return -ENOMEM;
     }
     page = (struct dma_pte *)map_vtd_domain_page(pg_maddr);
-    pte = page + (gfn & LEVEL_MASK);
+    pte = page + (bfn & LEVEL_MASK);
     old = *pte;
     dma_set_pte_addr(new, (paddr_t)mfn << PAGE_SHIFT_4K);
     dma_set_pte_prot(new,
@@ -1817,22 +1817,22 @@ static int __must_check intel_iommu_map_page(struct domain *d,
     unmap_vtd_domain_page(page);
 
     if ( !this_cpu(iommu_dont_flush_iotlb) )
-        rc = iommu_flush_iotlb(d, gfn, dma_pte_present(old), 1);
+        rc = iommu_flush_iotlb(d, bfn, dma_pte_present(old), 1);
 
     return rc;
 }
 
 static int __must_check intel_iommu_unmap_page(struct domain *d,
-                                               unsigned long gfn)
+                                               unsigned long bfn)
 {
     /* Do nothing if hardware domain and iommu supports pass thru. */
     if ( iommu_passthrough && is_hardware_domain(d) )
         return 0;
 
-    return dma_pte_clear_one(d, (paddr_t)gfn << PAGE_SHIFT_4K);
+    return dma_pte_clear_one(d, __bfn_to_baddr(bfn));
 }
 
-int iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
+int iommu_pte_flush(struct domain *d, uint64_t bfn, uint64_t *pte,
                     int order, int present)
 {
     struct acpi_drhd_unit *drhd;
@@ -1856,7 +1856,7 @@ int iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
             continue;
 
         rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
-                                   (paddr_t)gfn << PAGE_SHIFT_4K,
+                                   __bfn_to_baddr(bfn),
                                    order, !present, flush_dev_iotlb);
         if ( rc > 0 )
         {
@@ -2626,7 +2626,7 @@ static void vtd_dump_p2m_table_level(paddr_t pt_maddr, int level, paddr_t gpa,
             vtd_dump_p2m_table_level(dma_pte_addr(*pte), next_level, 
                                      address, indent + 1);
         else
-            printk("%*sgfn: %08lx mfn: %08lx\n",
+            printk("%*sbfn: %08lx mfn: %08lx\n",
                    indent, "",
                    (unsigned long)(address >> PAGE_SHIFT_4K),
                    (unsigned long)(dma_pte_addr(*pte) >> PAGE_SHIFT_4K));
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 6b42e3b876..14ada0c14e 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -23,11 +23,37 @@
 #include <xen/page-defs.h>
 #include <xen/spinlock.h>
 #include <xen/pci.h>
+#include <xen/typesafe.h>
 #include <public/hvm/ioreq.h>
 #include <public/domctl.h>
 #include <asm/device.h>
 #include <asm/iommu.h>
 
+TYPE_SAFE(uint64_t, bfn);
+#define PRI_bfn     PRIx64
+#define INVALID_BFN _bfn(~0UL)
+
+#ifndef bfn_t
+#define bfn_t /* Grep fodder: bfn_t, _bfn() and bfn_x() are defined above */
+#define _bfn
+#define bfn_x
+#undef bfn_t
+#undef _bfn
+#undef bfn_x
+#endif
+
+#define IOMMU_PAGE_SHIFT 12
+#define IOMMU_PAGE_SIZE  (_AC(1,L) << IOMMU_PAGE_SHIFT)
+#define IOMMU_PAGE_MASK  (~(IOMMU_PAGE_SIZE - 1))
+
+typedef uint64_t baddr_t;
+
+#define __bfn_to_baddr(bfn) ((baddr_t)(bfn) << IOMMU_PAGE_SHIFT)
+#define __baddr_to_bfn(baddr) ((uint64_t)(baddr >> IOMMU_PAGE_SHIFT))
+
+#define bfn_to_baddr(bfn) __bfn_to_baddr(bfn_x(bfn))
+#define baddr_to_bfn(baddr) _bfn(__baddr_to_bfn(baddr))
+
 extern bool_t iommu_enable, iommu_enabled;
 extern bool_t force_iommu, iommu_dom0_strict, iommu_verbose;
 extern bool_t iommu_workaround_bios_bug, iommu_igfx, iommu_passthrough;
@@ -60,9 +86,9 @@ void iommu_teardown(struct domain *d);
 #define IOMMUF_readable  (1u<<_IOMMUF_readable)
 #define _IOMMUF_writable 1
 #define IOMMUF_writable  (1u<<_IOMMUF_writable)
-int __must_check iommu_map_page(struct domain *d, unsigned long gfn,
+int __must_check iommu_map_page(struct domain *d, unsigned long bfn,
                                 unsigned long mfn, unsigned int flags);
-int __must_check iommu_unmap_page(struct domain *d, unsigned long gfn);
+int __must_check iommu_unmap_page(struct domain *d, unsigned long bfn);
 
 enum iommu_feature
 {
@@ -152,9 +178,9 @@ struct iommu_ops {
 #endif /* HAS_PCI */
 
     void (*teardown)(struct domain *d);
-    int __must_check (*map_page)(struct domain *d, unsigned long gfn,
+    int __must_check (*map_page)(struct domain *d, unsigned long bfn,
                                  unsigned long mfn, unsigned int flags);
-    int __must_check (*unmap_page)(struct domain *d, unsigned long gfn);
+    int __must_check (*unmap_page)(struct domain *d, unsigned long bfn);
     void (*free_page_table)(struct page_info *);
 #ifdef CONFIG_X86
     void (*update_ire_from_apic)(unsigned int apic, unsigned int reg, unsigned int value);
@@ -165,7 +191,7 @@ struct iommu_ops {
     void (*resume)(void);
     void (*share_p2m)(struct domain *d);
     void (*crash_shutdown)(void);
-    int __must_check (*iotlb_flush)(struct domain *d, unsigned long gfn,
+    int __must_check (*iotlb_flush)(struct domain *d, unsigned long bfn,
                                     unsigned int page_count);
     int __must_check (*iotlb_flush_all)(struct domain *d);
     int (*get_reserved_device_memory)(iommu_grdm_t *, void *);
@@ -187,7 +213,7 @@ int iommu_do_pci_domctl(struct xen_domctl *, struct domain *d,
 int iommu_do_domctl(struct xen_domctl *, struct domain *d,
                     XEN_GUEST_HANDLE_PARAM(xen_domctl_t));
 
-int __must_check iommu_iotlb_flush(struct domain *d, unsigned long gfn,
+int __must_check iommu_iotlb_flush(struct domain *d, unsigned long bfn,
                                    unsigned int page_count);
 int __must_check iommu_iotlb_flush_all(struct domain *d);
 
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index 24654e8e22..4ebf91f17e 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -14,8 +14,9 @@
  * should be adhered to.
  *
  * mfn: Machine Frame Number
- *   The values Xen puts into its own pagetables.  This is the host physical
- *   memory address space with RAM, MMIO etc.
+ *   The values Xen puts into its own pagetables, 2nd stage pagetables (where
+ *   hardware assisted 2nd stage translation is used) or IOMMU page tables.
+ *   This is the host physical memory address space with RAM, MMIO etc.
  *
  * gfn: Guest Frame Number
  *   The values a guest puts in its own pagetables.  For an auto-translated
@@ -26,6 +27,11 @@
  *   A linear idea of a guest physical address space. For an auto-translated
  *   guest, pfn == gfn while for a non-translated guest, pfn != gfn.
  *
+ * bfn: Bus Frame Number (definitions in include/xen/iommu.h)
+ *   The linear frame numbers of IOMMU address space. All initiators for a
+ *   given domain share a single IOMMU address space and, by default, Xen will
+ *   ensure bfn == pfn.
+ *
  * WARNING: Some of these terms have changed over time while others have been
  * used inconsistently, meaning that a lot of existing code does not match the
  * definitions above.  New code should use these terms as described here, and
-- 
2.11.0


* [PATCH v5 03/15] iommu: make use of type-safe BFN and MFN in exported functions
From: Paul Durrant @ 2018-08-03 17:22 UTC
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Jun Nakajima, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall,
	Paul Durrant, Jan Beulich

This patch modifies the declaration of the entry points to the IOMMU
sub-system to use bfn_t and mfn_t in place of unsigned long. A subsequent
patch will similarly modify the methods in the iommu_ops structure.
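
For illustration, a representative call site (paraphrasing the p2m hunks
below) now boxes the frame numbers explicitly instead of passing raw
unsigned longs:

    /* Before: raw unsigned longs, easy to mix up gfn/mfn/bfn. */
    rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, flags);

    /* After: the bfn is constructed explicitly and the mfn stays boxed. */
    bfn_t bfn = _bfn(gfn);

    rc = iommu_map_page(d, bfn_add(bfn, i), mfn_add(mfn, i), flags);

(bfn_add() parallels mfn_add(); its definition is added by this series and
is not visible in the hunks quoted above.)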

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Tim Deegan <tim@xen.org>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>

v3:
 - Removed most of the uses of an intermediate 'frame' variable.

v2:
 - Addressed comments from Jan.
 - Use intermediate 'frame' variable to avoid directly encapsulating
   mfn or gfn values as bfns.
---
 xen/arch/arm/p2m.c                    |  3 ++-
 xen/arch/x86/mm.c                     | 10 ++++----
 xen/arch/x86/mm/p2m-ept.c             | 10 +++++---
 xen/arch/x86/mm/p2m-pt.c              | 45 ++++++++++++++++++++---------------
 xen/arch/x86/mm/p2m.c                 | 16 ++++++++-----
 xen/arch/x86/x86_64/mm.c              |  5 ++--
 xen/common/grant_table.c              | 12 +++++-----
 xen/common/memory.c                   |  4 ++--
 xen/drivers/passthrough/iommu.c       | 25 ++++++++++---------
 xen/drivers/passthrough/vtd/x86/vtd.c |  3 ++-
 xen/include/xen/iommu.h               | 14 +++++++----
 11 files changed, 85 insertions(+), 62 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 14791388ad..d719db1e30 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -953,7 +953,8 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
 
     if ( need_iommu(p2m->domain) &&
          (lpae_valid(orig_pte) || lpae_valid(*entry)) )
-        rc = iommu_iotlb_flush(p2m->domain, gfn_x(sgfn), 1UL << page_order);
+        rc = iommu_iotlb_flush(p2m->domain, _bfn(gfn_x(sgfn)),
+                               1UL << page_order);
     else
         rc = 0;
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index a1a1f5f3c3..99070f916d 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2729,14 +2729,14 @@ static int _get_page_type(struct page_info *page, unsigned long type,
         struct domain *d = page_get_owner(page);
         if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
         {
-            gfn_t gfn = _gfn(mfn_to_gmfn(d, mfn_x(page_to_mfn(page))));
+            mfn_t mfn = page_to_mfn(page);
 
             if ( (x & PGT_type_mask) == PGT_writable_page )
-                iommu_ret = iommu_unmap_page(d, gfn_x(gfn));
+                iommu_ret = iommu_unmap_page(d, _bfn(mfn_x(mfn)));
             else if ( type == PGT_writable_page )
-                iommu_ret = iommu_map_page(d, gfn_x(gfn),
-                                           mfn_x(page_to_mfn(page)),
-                                           IOMMUF_readable|IOMMUF_writable);
+                iommu_ret = iommu_map_page(d, _bfn(mfn_x(mfn)), mfn,
+                                           IOMMUF_readable |
+                                           IOMMUF_writable);
         }
     }
 
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 14b593923b..2089b5232d 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -870,15 +870,19 @@ out:
             rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
         else
         {
+            bfn_t bfn = _bfn(gfn);
+
             if ( iommu_flags )
                 for ( i = 0; i < (1 << order); i++ )
                 {
-                    rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
+                    rc = iommu_map_page(d, bfn_add(bfn, i),
+                                        mfn_add(mfn, i), iommu_flags);
                     if ( unlikely(rc) )
                     {
                         while ( i-- )
                             /* If statement to satisfy __must_check. */
-                            if ( iommu_unmap_page(p2m->domain, gfn + i) )
+                            if ( iommu_unmap_page(p2m->domain,
+                                                  bfn_add(bfn, i)) )
                                 continue;
 
                         break;
@@ -887,7 +891,7 @@ out:
             else
                 for ( i = 0; i < (1 << order); i++ )
                 {
-                    ret = iommu_unmap_page(d, gfn + i);
+                    ret = iommu_unmap_page(d, bfn_add(bfn, i));
                     if ( !rc )
                         rc = ret;
                 }
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index b8c5d2ed26..a441af388a 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -687,29 +687,36 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
             if ( iommu_old_flags )
                 amd_iommu_flush_pages(p2m->domain, gfn, page_order);
         }
-        else if ( iommu_pte_flags )
-            for ( i = 0; i < (1UL << page_order); i++ )
-            {
-                rc = iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i,
-                                    iommu_pte_flags);
-                if ( unlikely(rc) )
+        else
+        {
+            bfn_t bfn = _bfn(gfn);
+
+            if ( iommu_pte_flags )
+                for ( i = 0; i < (1UL << page_order); i++ )
                 {
-                    while ( i-- )
-                        /* If statement to satisfy __must_check. */
-                        if ( iommu_unmap_page(p2m->domain, gfn + i) )
-                            continue;
+                    rc = iommu_map_page(p2m->domain, bfn_add(bfn, i),
+                                        mfn_add(mfn, i), iommu_pte_flags);
+                    if ( unlikely(rc) )
+                    {
+                        while ( i-- )
+                            /* If statement to satisfy __must_check. */
+                            if ( iommu_unmap_page(p2m->domain,
+                                                  bfn_add(bfn, i)) )
+                                continue;
 
-                    break;
+                        break;
+                    }
                 }
-            }
-        else
-            for ( i = 0; i < (1UL << page_order); i++ )
-            {
-                int ret = iommu_unmap_page(p2m->domain, gfn + i);
+            else
+                for ( i = 0; i < (1UL << page_order); i++ )
+                {
+                    int ret = iommu_unmap_page(p2m->domain,
+                                               bfn_add(bfn, i));
 
-                if ( !rc )
-                    rc = ret;
-            }
+                    if ( !rc )
+                        rc = ret;
+                }
+        }
     }
 
     /*
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 8e9fbb5a14..fbf67def50 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -715,9 +715,11 @@ p2m_remove_page(struct p2m_domain *p2m, unsigned long gfn_l, unsigned long mfn,
 
         if ( need_iommu(p2m->domain) )
         {
+            bfn_t bfn = _bfn(mfn);
+
             for ( i = 0; i < (1 << page_order); i++ )
             {
-                int ret = iommu_unmap_page(p2m->domain, mfn + i);
+                int ret = iommu_unmap_page(p2m->domain, bfn_add(bfn, i));
 
                 if ( !rc )
                     rc = ret;
@@ -774,16 +776,17 @@ guest_physmap_add_entry(struct domain *d, gfn_t gfn, mfn_t mfn,
     {
         if ( need_iommu(d) && t == p2m_ram_rw )
         {
+            bfn_t bfn = _bfn(mfn_x(mfn));
+
             for ( i = 0; i < (1 << page_order); i++ )
             {
-                rc = iommu_map_page(d, mfn_x(mfn_add(mfn, i)),
-                                    mfn_x(mfn_add(mfn, i)),
+                rc = iommu_map_page(d, bfn_add(bfn, i), mfn_add(mfn, i),
                                     IOMMUF_readable|IOMMUF_writable);
                 if ( rc != 0 )
                 {
                     while ( i-- > 0 )
                         /* If statement to satisfy __must_check. */
-                        if ( iommu_unmap_page(d, mfn_x(mfn_add(mfn, i))) )
+                        if ( iommu_unmap_page(d, bfn_add(bfn, i)) )
                             continue;
 
                     return rc;
@@ -1158,7 +1161,8 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l,
     {
         if ( !need_iommu(d) )
             return 0;
-        return iommu_map_page(d, gfn_l, gfn_l, IOMMUF_readable|IOMMUF_writable);
+        return iommu_map_page(d, _bfn(gfn_l), _mfn(gfn_l),
+                              IOMMUF_readable | IOMMUF_writable);
     }
 
     gfn_lock(p2m, gfn, 0);
@@ -1248,7 +1252,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l)
     {
         if ( !need_iommu(d) )
             return 0;
-        return iommu_unmap_page(d, gfn_l);
+        return iommu_unmap_page(d, _bfn(gfn_l));
     }
 
     gfn_lock(p2m, gfn, 0);
diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index cca4ae926e..cc58e4cef4 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -1429,13 +1429,14 @@ int memory_add(unsigned long spfn, unsigned long epfn, unsigned int pxm)
     if ( iommu_enabled && !iommu_passthrough && !need_iommu(hardware_domain) )
     {
         for ( i = spfn; i < epfn; i++ )
-            if ( iommu_map_page(hardware_domain, i, i, IOMMUF_readable|IOMMUF_writable) )
+            if ( iommu_map_page(hardware_domain, _bfn(i), _mfn(i),
+                                IOMMUF_readable | IOMMUF_writable) )
                 break;
         if ( i != epfn )
         {
             while (i-- > old_max)
                 /* If statement to satisfy __must_check. */
-                if ( iommu_unmap_page(hardware_domain, i) )
+                if ( iommu_unmap_page(hardware_domain, _bfn(i)) )
                     continue;
 
             goto destroy_m2p;
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index d9ec711c73..a83c8832af 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -1135,13 +1135,13 @@ map_grant_ref(
              !(old_pin & (GNTPIN_hstw_mask|GNTPIN_devw_mask)) )
         {
             if ( !(kind & MAPKIND_WRITE) )
-                err = iommu_map_page(ld, mfn_x(mfn), mfn_x(mfn),
-                                     IOMMUF_readable|IOMMUF_writable);
+                err = iommu_map_page(ld, _bfn(mfn_x(mfn)), mfn,
+                                     IOMMUF_readable | IOMMUF_writable);
         }
         else if ( act_pin && !old_pin )
         {
             if ( !kind )
-                err = iommu_map_page(ld, mfn_x(mfn), mfn_x(mfn),
+                err = iommu_map_page(ld, _bfn(mfn_x(mfn)), mfn,
                                      IOMMUF_readable);
         }
         if ( err )
@@ -1404,10 +1404,10 @@ unmap_common(
 
         kind = mapkind(lgt, rd, op->mfn);
         if ( !kind )
-            err = iommu_unmap_page(ld, mfn_x(op->mfn));
+            err = iommu_unmap_page(ld, _bfn(mfn_x(op->mfn)));
         else if ( !(kind & MAPKIND_WRITE) )
-            err = iommu_map_page(ld, mfn_x(op->mfn),
-                                 mfn_x(op->mfn), IOMMUF_readable);
+            err = iommu_map_page(ld, _bfn(mfn_x(op->mfn)), op->mfn,
+                                 IOMMUF_readable);
 
         double_gt_unlock(lgt, rgt);
 
diff --git a/xen/common/memory.c b/xen/common/memory.c
index e29d596727..a5656743c2 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -833,11 +833,11 @@ int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap *xatp,
 
         this_cpu(iommu_dont_flush_iotlb) = 0;
 
-        ret = iommu_iotlb_flush(d, xatp->idx - done, done);
+        ret = iommu_iotlb_flush(d, _bfn(xatp->idx - done), done);
         if ( unlikely(ret) && rc >= 0 )
             rc = ret;
 
-        ret = iommu_iotlb_flush(d, xatp->gpfn - done, done);
+        ret = iommu_iotlb_flush(d, _bfn(xatp->gpfn - done), done);
         if ( unlikely(ret) && rc >= 0 )
             rc = ret;
     }
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 0faab18e88..303ac56e08 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -255,7 +255,7 @@ void iommu_domain_destroy(struct domain *d)
     arch_iommu_domain_destroy(d);
 }
 
-int iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
+int iommu_map_page(struct domain *d, bfn_t bfn, mfn_t mfn,
                    unsigned int flags)
 {
     const struct domain_iommu *hd = dom_iommu(d);
@@ -264,13 +264,13 @@ int iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->map_page(d, bfn, mfn, flags);
+    rc = hd->platform_ops->map_page(d, bfn_x(bfn), mfn_x(mfn), flags);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
             printk(XENLOG_ERR
-                   "d%d: IOMMU mapping bfn %#lx to mfn %#lx failed: %d\n",
-                   d->domain_id, bfn, mfn, rc);
+                   "d%d: IOMMU mapping bfn %"PRI_bfn" to mfn %"PRI_mfn" failed: %d\n",
+                   d->domain_id, bfn_x(bfn), mfn_x(mfn), rc);
 
         if ( !is_hardware_domain(d) )
             domain_crash(d);
@@ -279,7 +279,7 @@ int iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
     return rc;
 }
 
-int iommu_unmap_page(struct domain *d, unsigned long bfn)
+int iommu_unmap_page(struct domain *d, bfn_t bfn)
 {
     const struct domain_iommu *hd = dom_iommu(d);
     int rc;
@@ -287,13 +287,13 @@ int iommu_unmap_page(struct domain *d, unsigned long bfn)
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->unmap_page(d, bfn);
+    rc = hd->platform_ops->unmap_page(d, bfn_x(bfn));
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
             printk(XENLOG_ERR
-                   "d%d: IOMMU unmapping bfn %#lx failed: %d\n",
-                   d->domain_id, bfn, rc);
+                   "d%d: IOMMU unmapping bfn %"PRI_bfn" failed: %d\n",
+                   d->domain_id, bfn_x(bfn), rc);
 
         if ( !is_hardware_domain(d) )
             domain_crash(d);
@@ -319,8 +319,7 @@ static void iommu_free_pagetables(unsigned long unused)
                             cpumask_cycle(smp_processor_id(), &cpu_online_map));
 }
 
-int iommu_iotlb_flush(struct domain *d, unsigned long bfn,
-                      unsigned int page_count)
+int iommu_iotlb_flush(struct domain *d, bfn_t bfn, unsigned int page_count)
 {
     const struct domain_iommu *hd = dom_iommu(d);
     int rc;
@@ -328,13 +327,13 @@ int iommu_iotlb_flush(struct domain *d, unsigned long bfn,
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush )
         return 0;
 
-    rc = hd->platform_ops->iotlb_flush(d, bfn, page_count);
+    rc = hd->platform_ops->iotlb_flush(d, bfn_x(bfn), page_count);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
             printk(XENLOG_ERR
-                   "d%d: IOMMU IOTLB flush failed: %d, bfn %#lx, page count %u\n",
-                   d->domain_id, rc, bfn, page_count);
+                   "d%d: IOMMU IOTLB flush failed: %d, bfn %"PRI_bfn", page count %u\n",
+                   d->domain_id, rc, bfn_x(bfn), page_count);
 
         if ( !is_hardware_domain(d) )
             domain_crash(d);
diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c b/xen/drivers/passthrough/vtd/x86/vtd.c
index cc2bfea162..dc37dce4b6 100644
--- a/xen/drivers/passthrough/vtd/x86/vtd.c
+++ b/xen/drivers/passthrough/vtd/x86/vtd.c
@@ -155,7 +155,8 @@ void __hwdom_init vtd_set_hwdom_mapping(struct domain *d)
         tmp = 1 << (PAGE_SHIFT - PAGE_SHIFT_4K);
         for ( j = 0; j < tmp; j++ )
         {
-            int ret = iommu_map_page(d, pfn * tmp + j, pfn * tmp + j,
+            int ret = iommu_map_page(d, _bfn(pfn * tmp + j),
+                                     _mfn(pfn * tmp + j),
                                      IOMMUF_readable|IOMMUF_writable);
 
             if ( !rc )
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 14ada0c14e..a3c36c1148 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -24,6 +24,7 @@
 #include <xen/spinlock.h>
 #include <xen/pci.h>
 #include <xen/typesafe.h>
+#include <xen/mm.h>
 #include <public/hvm/ioreq.h>
 #include <public/domctl.h>
 #include <asm/device.h>
@@ -42,6 +43,11 @@ TYPE_SAFE(uint64_t, bfn);
 #undef bfn_x
 #endif
 
+static inline bfn_t bfn_add(bfn_t bfn, unsigned long i)
+{
+    return _bfn(bfn_x(bfn) + i);
+}
+
 #define IOMMU_PAGE_SHIFT 12
 #define IOMMU_PAGE_SIZE  (_AC(1,L) << IOMMU_PAGE_SHIFT)
 #define IOMMU_PAGE_MASK  (~(IOMMU_PAGE_SIZE - 1))
@@ -86,9 +92,9 @@ void iommu_teardown(struct domain *d);
 #define IOMMUF_readable  (1u<<_IOMMUF_readable)
 #define _IOMMUF_writable 1
 #define IOMMUF_writable  (1u<<_IOMMUF_writable)
-int __must_check iommu_map_page(struct domain *d, unsigned long bfn,
-                                unsigned long mfn, unsigned int flags);
-int __must_check iommu_unmap_page(struct domain *d, unsigned long bfn);
+int __must_check iommu_map_page(struct domain *d, bfn_t bfn,
+                                mfn_t mfn, unsigned int flags);
+int __must_check iommu_unmap_page(struct domain *d, bfn_t bfn);
 
 enum iommu_feature
 {
@@ -213,7 +219,7 @@ int iommu_do_pci_domctl(struct xen_domctl *, struct domain *d,
 int iommu_do_domctl(struct xen_domctl *, struct domain *d,
                     XEN_GUEST_HANDLE_PARAM(xen_domctl_t));
 
-int __must_check iommu_iotlb_flush(struct domain *d, unsigned long bfn,
+int __must_check iommu_iotlb_flush(struct domain *d, bfn_t bfn,
                                    unsigned int page_count);
 int __must_check iommu_iotlb_flush_all(struct domain *d);
 
-- 
2.11.0



* [PATCH v5 04/15] iommu: push use of type-safe BFN and MFN into iommu_ops
  2018-08-03 17:22 [PATCH v5 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (2 preceding siblings ...)
  2018-08-03 17:22 ` [PATCH v5 03/15] iommu: make use of type-safe BFN and MFN in exported functions Paul Durrant
@ 2018-08-03 17:22 ` Paul Durrant
  2018-08-07  2:49   ` Tian, Kevin
  2018-08-03 17:22 ` [PATCH v5 05/15] iommu: don't domain_crash() inside iommu_map/unmap_page() Paul Durrant
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2018-08-03 17:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Suravee Suthikulpanit, Andrew Cooper, George Dunlap,
	Paul Durrant, Jan Beulich

This patch modifies the methods in struct iommu_ops to use type-safe BFN
and MFN. This follows on from the prior patch that modified the functions
exported in xen/iommu.h.
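
To illustrate what the type-safety buys, here is a minimal, self-contained
sketch of the wrapper idiom (hand-rolled struct wrappers standing in for the
real TYPE_SAFE()-generated bfn_t/mfn_t; all names below are illustrative
stand-ins, not code from this series):

  #include <inttypes.h>
  #include <stdint.h>
  #include <stdio.h>

  /* Stand-ins for the TYPE_SAFE()-generated wrappers. */
  typedef struct { uint64_t v; } bfn_t;
  typedef struct { uint64_t v; } mfn_t;

  static inline bfn_t _bfn(uint64_t v) { return (bfn_t){ v }; }
  static inline uint64_t bfn_x(bfn_t b) { return b.v; }
  static inline mfn_t _mfn(uint64_t v) { return (mfn_t){ v }; }
  static inline uint64_t mfn_x(mfn_t m) { return m.v; }

  static inline bfn_t bfn_add(bfn_t b, unsigned long i)
  {
      return _bfn(bfn_x(b) + i);
  }

  /* An iommu_ops-style hook now states which space each argument is in. */
  static int map_page(bfn_t bfn, mfn_t mfn, unsigned int flags)
  {
      printf("map bfn %#"PRIx64" -> mfn %#"PRIx64" (flags %#x)\n",
             bfn_x(bfn), mfn_x(mfn), flags);
      return 0;
  }

  int main(void)
  {
      bfn_t bfn = _bfn(0x1000);
      mfn_t mfn = _mfn(0x2000);

      map_page(bfn_add(bfn, 1), mfn, 3);
      /* map_page(mfn, bfn, 3); would no longer compile. */
      return 0;
  }

With distinct wrapper types the compiler rejects a call that swaps the BFN
and MFN arguments, which is exactly the class of bug the plain unsigned long
parameters allowed.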

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>

v3:
 - Remove some use of intermediate 'frame' variables.

v2:
 - Addressed comments from Jan.
 - Extend use of intermediate 'frame' variable to avoid directly
   encapsulating gfn values as bfns or vice versa.
---
 xen/drivers/passthrough/amd/iommu_map.c       | 46 ++++++++++++++++-----------
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |  2 +-
 xen/drivers/passthrough/arm/smmu.c            | 16 +++++-----
 xen/drivers/passthrough/iommu.c               |  9 +++---
 xen/drivers/passthrough/vtd/iommu.c           | 26 +++++++--------
 xen/drivers/passthrough/vtd/x86/vtd.c         |  7 ++--
 xen/drivers/passthrough/x86/iommu.c           |  2 +-
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |  8 ++---
 xen/include/xen/iommu.h                       | 13 +++++---
 9 files changed, 72 insertions(+), 57 deletions(-)

diff --git a/xen/drivers/passthrough/amd/iommu_map.c b/xen/drivers/passthrough/amd/iommu_map.c
index 4deab9cd2f..5a9a0af320 100644
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -631,7 +631,7 @@ static int update_paging_mode(struct domain *d, unsigned long bfn)
     return 0;
 }
 
-int amd_iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
+int amd_iommu_map_page(struct domain *d, bfn_t bfn, mfn_t mfn,
                        unsigned int flags)
 {
     bool_t need_flush = 0;
@@ -651,7 +651,8 @@ int amd_iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
     if ( rc )
     {
         spin_unlock(&hd->arch.mapping_lock);
-        AMD_IOMMU_DEBUG("Root table alloc failed, bfn = %lx\n", bfn);
+        AMD_IOMMU_DEBUG("Root table alloc failed, bfn = %"PRI_bfn"\n",
+                        bfn_x(bfn));
         domain_crash(d);
         return rc;
     }
@@ -660,25 +661,27 @@ int amd_iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
      * we might need a deeper page table for wider bfn now */
     if ( is_hvm_domain(d) )
     {
-        if ( update_paging_mode(d, bfn) )
+        if ( update_paging_mode(d, bfn_x(bfn)) )
         {
             spin_unlock(&hd->arch.mapping_lock);
-            AMD_IOMMU_DEBUG("Update page mode failed bfn = %lx\n", bfn);
+            AMD_IOMMU_DEBUG("Update page mode failed bfn = %"PRI_bfn"\n",
+                            bfn_x(bfn));
             domain_crash(d);
             return -EFAULT;
         }
     }
 
-    if ( iommu_pde_from_bfn(d, bfn, pt_mfn) || (pt_mfn[1] == 0) )
+    if ( iommu_pde_from_bfn(d, bfn_x(bfn), pt_mfn) || (pt_mfn[1] == 0) )
     {
         spin_unlock(&hd->arch.mapping_lock);
-        AMD_IOMMU_DEBUG("Invalid IO pagetable entry bfn = %lx\n", bfn);
+        AMD_IOMMU_DEBUG("Invalid IO pagetable entry bfn = %"PRI_bfn"\n",
+                        bfn_x(bfn));
         domain_crash(d);
         return -EFAULT;
     }
 
     /* Install 4k mapping first */
-    need_flush = set_iommu_pte_present(pt_mfn[1], bfn, mfn,
+    need_flush = set_iommu_pte_present(pt_mfn[1], bfn_x(bfn), mfn_x(mfn),
                                        IOMMU_PAGING_MODE_LEVEL_1,
                                        !!(flags & IOMMUF_writable),
                                        !!(flags & IOMMUF_readable));
@@ -690,7 +693,7 @@ int amd_iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
     /* 4K mapping for PV guests never changes, 
      * no need to flush if we trust non-present bits */
     if ( is_hvm_domain(d) )
-        amd_iommu_flush_pages(d, bfn, 0);
+        amd_iommu_flush_pages(d, bfn_x(bfn), 0);
 
     for ( merge_level = IOMMU_PAGING_MODE_LEVEL_2;
           merge_level <= hd->arch.paging_mode; merge_level++ )
@@ -698,15 +701,16 @@ int amd_iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
         if ( pt_mfn[merge_level] == 0 )
             break;
         if ( !iommu_update_pde_count(d, pt_mfn[merge_level],
-                                     bfn, mfn, merge_level) )
+                                     bfn_x(bfn), mfn_x(mfn), merge_level) )
             break;
 
-        if ( iommu_merge_pages(d, pt_mfn[merge_level], bfn,
+        if ( iommu_merge_pages(d, pt_mfn[merge_level], bfn_x(bfn),
                                flags, merge_level) )
         {
             spin_unlock(&hd->arch.mapping_lock);
             AMD_IOMMU_DEBUG("Merge iommu page failed at level %d, "
-                            "bfn = %lx mfn = %lx\n", merge_level, bfn, mfn);
+                            "bfn = %"PRI_bfn" mfn = %"PRI_mfn"\n",
+                            merge_level, bfn_x(bfn), mfn_x(mfn));
             domain_crash(d);
             return -EFAULT;
         }
@@ -720,7 +724,7 @@ out:
     return 0;
 }
 
-int amd_iommu_unmap_page(struct domain *d, unsigned long bfn)
+int amd_iommu_unmap_page(struct domain *d, bfn_t bfn)
 {
     unsigned long pt_mfn[7];
     struct domain_iommu *hd = dom_iommu(d);
@@ -742,31 +746,33 @@ int amd_iommu_unmap_page(struct domain *d, unsigned long bfn)
      * we might need a deeper page table for lager bfn now */
     if ( is_hvm_domain(d) )
     {
-        int rc = update_paging_mode(d, bfn);
+        int rc = update_paging_mode(d, bfn_x(bfn));
 
         if ( rc )
         {
             spin_unlock(&hd->arch.mapping_lock);
-            AMD_IOMMU_DEBUG("Update page mode failed bfn = %lx\n", bfn);
+            AMD_IOMMU_DEBUG("Update page mode failed bfn = %"PRI_bfn"\n",
+                            bfn_x(bfn));
             if ( rc != -EADDRNOTAVAIL )
                 domain_crash(d);
             return rc;
         }
     }
 
-    if ( iommu_pde_from_bfn(d, bfn, pt_mfn) || (pt_mfn[1] == 0) )
+    if ( iommu_pde_from_bfn(d, bfn_x(bfn), pt_mfn) || (pt_mfn[1] == 0) )
     {
         spin_unlock(&hd->arch.mapping_lock);
-        AMD_IOMMU_DEBUG("Invalid IO pagetable entry bfn = %lx\n", bfn);
+        AMD_IOMMU_DEBUG("Invalid IO pagetable entry bfn = %"PRI_bfn"\n",
+                        bfn_x(bfn));
         domain_crash(d);
         return -EFAULT;
     }
 
     /* mark PTE as 'page not present' */
-    clear_iommu_pte_present(pt_mfn[1], bfn);
+    clear_iommu_pte_present(pt_mfn[1], bfn_x(bfn));
     spin_unlock(&hd->arch.mapping_lock);
 
-    amd_iommu_flush_pages(d, bfn, 0);
+    amd_iommu_flush_pages(d, bfn_x(bfn), 0);
 
     return 0;
 }
@@ -787,7 +793,9 @@ int amd_iommu_reserve_domain_unity_map(struct domain *domain,
     gfn = phys_addr >> PAGE_SHIFT;
     for ( i = 0; i < npages; i++ )
     {
-        rt = amd_iommu_map_page(domain, gfn +i, gfn +i, flags);
+        unsigned long frame = gfn + i;
+
+        rt = amd_iommu_map_page(domain, _bfn(frame), _mfn(frame), flags);
         if ( rt != 0 )
             return rt;
     }
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index d608631e6e..eea22c3d0d 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -271,7 +271,7 @@ static void __hwdom_init amd_iommu_hwdom_init(struct domain *d)
              */
             if ( mfn_valid(_mfn(pfn)) )
             {
-                int ret = amd_iommu_map_page(d, pfn, pfn,
+                int ret = amd_iommu_map_page(d, _bfn(pfn), _mfn(pfn),
                                              IOMMUF_readable|IOMMUF_writable);
 
                 if ( !rc )
diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
index 1e4d561b47..221b62a59c 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2550,8 +2550,7 @@ static int __must_check arm_smmu_iotlb_flush_all(struct domain *d)
 	return 0;
 }
 
-static int __must_check arm_smmu_iotlb_flush(struct domain *d,
-                                             unsigned long bfn,
+static int __must_check arm_smmu_iotlb_flush(struct domain *d, bfn_t bfn,
                                              unsigned int page_count)
 {
 	/* ARM SMMU v1 doesn't have flush by VMA and VMID */
@@ -2737,8 +2736,8 @@ static void arm_smmu_iommu_domain_teardown(struct domain *d)
 	xfree(xen_domain);
 }
 
-static int __must_check arm_smmu_map_page(struct domain *d, unsigned long bfn,
-			unsigned long mfn, unsigned int flags)
+static int __must_check arm_smmu_map_page(struct domain *d, bfn_t bfn,
+					  mfn_t mfn, unsigned int flags)
 {
 	p2m_type_t t;
 
@@ -2751,7 +2750,7 @@ static int __must_check arm_smmu_map_page(struct domain *d, unsigned long bfn,
 	 * function should only be used by gnttab code with gfn == mfn == bfn.
 	 */
 	BUG_ON(!is_domain_direct_mapped(d));
-	BUG_ON(mfn != bfn);
+	BUG_ON(mfn_x(mfn) != bfn_x(bfn));
 
 	/* We only support readable and writable flags */
 	if (!(flags & (IOMMUF_readable | IOMMUF_writable)))
@@ -2763,10 +2762,11 @@ static int __must_check arm_smmu_map_page(struct domain *d, unsigned long bfn,
 	 * The function guest_physmap_add_entry replaces the current mapping
 	 * if there is already one...
 	 */
-	return guest_physmap_add_entry(d, _gfn(bfn), _mfn(bfn), 0, t);
+	return guest_physmap_add_entry(d, _gfn(bfn_x(bfn)), _mfn(bfn_x(bfn)),
+				       0, t);
 }
 
-static int __must_check arm_smmu_unmap_page(struct domain *d, unsigned long bfn)
+static int __must_check arm_smmu_unmap_page(struct domain *d, bfn_t bfn)
 {
 	/*
 	 * This function should only be used by gnttab code when the domain
@@ -2775,7 +2775,7 @@ static int __must_check arm_smmu_unmap_page(struct domain *d, unsigned long bfn)
 	if ( !is_domain_direct_mapped(d) )
 		return -EINVAL;
 
-	return guest_physmap_remove_page(d, _gfn(bfn), _mfn(bfn), 0);
+	return guest_physmap_remove_page(d, _gfn(bfn_x(bfn)), _mfn(bfn_x(bfn)), 0);
 }
 
 static const struct iommu_ops arm_smmu_iommu_ops = {
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 303ac56e08..1f32958816 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -194,7 +194,8 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
                   == PGT_writable_page) )
                 mapping |= IOMMUF_writable;
 
-            ret = hd->platform_ops->map_page(d, bfn, mfn, mapping);
+            ret = hd->platform_ops->map_page(d, _bfn(bfn), _mfn(mfn),
+                                             mapping);
             if ( !rc )
                 rc = ret;
 
@@ -264,7 +265,7 @@ int iommu_map_page(struct domain *d, bfn_t bfn, mfn_t mfn,
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->map_page(d, bfn_x(bfn), mfn_x(mfn), flags);
+    rc = hd->platform_ops->map_page(d, bfn, mfn, flags);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
@@ -287,7 +288,7 @@ int iommu_unmap_page(struct domain *d, bfn_t bfn)
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->unmap_page(d, bfn_x(bfn));
+    rc = hd->platform_ops->unmap_page(d, bfn);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
@@ -327,7 +328,7 @@ int iommu_iotlb_flush(struct domain *d, bfn_t bfn, unsigned int page_count)
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush )
         return 0;
 
-    rc = hd->platform_ops->iotlb_flush(d, bfn_x(bfn), page_count);
+    rc = hd->platform_ops->iotlb_flush(d, bfn, page_count);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 48f62e0e8d..c9f50f04ad 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -584,8 +584,7 @@ static int __must_check iommu_flush_all(void)
     return rc;
 }
 
-static int __must_check iommu_flush_iotlb(struct domain *d,
-                                          unsigned long bfn,
+static int __must_check iommu_flush_iotlb(struct domain *d, bfn_t bfn,
                                           bool_t dma_old_pte_present,
                                           unsigned int page_count)
 {
@@ -612,12 +611,12 @@ static int __must_check iommu_flush_iotlb(struct domain *d,
         if ( iommu_domid == -1 )
             continue;
 
-        if ( page_count != 1 || bfn == bfn_x(INVALID_BFN) )
+        if ( page_count != 1 || bfn_eq(bfn, INVALID_BFN) )
             rc = iommu_flush_iotlb_dsi(iommu, iommu_domid,
                                        0, flush_dev_iotlb);
         else
             rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
-                                       __bfn_to_baddr(bfn),
+                                       bfn_to_baddr(bfn),
                                        PAGE_ORDER_4K,
                                        !dma_old_pte_present,
                                        flush_dev_iotlb);
@@ -633,7 +632,7 @@ static int __must_check iommu_flush_iotlb(struct domain *d,
 }
 
 static int __must_check iommu_flush_iotlb_pages(struct domain *d,
-                                                unsigned long bfn,
+                                                bfn_t bfn,
                                                 unsigned int page_count)
 {
     return iommu_flush_iotlb(d, bfn, 1, page_count);
@@ -641,7 +640,7 @@ static int __must_check iommu_flush_iotlb_pages(struct domain *d,
 
 static int __must_check iommu_flush_iotlb_all(struct domain *d)
 {
-    return iommu_flush_iotlb(d, bfn_x(INVALID_BFN), 0, 0);
+    return iommu_flush_iotlb(d, INVALID_BFN, 0, 0);
 }
 
 /* clear one page's page table */
@@ -676,7 +675,7 @@ static int __must_check dma_pte_clear_one(struct domain *domain, u64 addr)
     iommu_flush_cache_entry(pte, sizeof(struct dma_pte));
 
     if ( !this_cpu(iommu_dont_flush_iotlb) )
-        rc = iommu_flush_iotlb_pages(domain, addr >> PAGE_SHIFT_4K, 1);
+        rc = iommu_flush_iotlb_pages(domain, baddr_to_bfn(addr), 1);
 
     unmap_vtd_domain_page(page);
 
@@ -1767,8 +1766,7 @@ static void iommu_domain_teardown(struct domain *d)
 }
 
 static int __must_check intel_iommu_map_page(struct domain *d,
-                                             unsigned long bfn,
-                                             unsigned long mfn,
+                                             bfn_t bfn, mfn_t mfn,
                                              unsigned int flags)
 {
     struct domain_iommu *hd = dom_iommu(d);
@@ -1786,16 +1784,16 @@ static int __must_check intel_iommu_map_page(struct domain *d,
 
     spin_lock(&hd->arch.mapping_lock);
 
-    pg_maddr = addr_to_dma_page_maddr(d, __bfn_to_baddr(bfn), 1);
+    pg_maddr = addr_to_dma_page_maddr(d, bfn_to_baddr(bfn), 1);
     if ( pg_maddr == 0 )
     {
         spin_unlock(&hd->arch.mapping_lock);
         return -ENOMEM;
     }
     page = (struct dma_pte *)map_vtd_domain_page(pg_maddr);
-    pte = page + (bfn & LEVEL_MASK);
+    pte = page + (bfn_x(bfn) & LEVEL_MASK);
     old = *pte;
-    dma_set_pte_addr(new, (paddr_t)mfn << PAGE_SHIFT_4K);
+    dma_set_pte_addr(new, mfn_to_maddr(mfn));
     dma_set_pte_prot(new,
                      ((flags & IOMMUF_readable) ? DMA_PTE_READ  : 0) |
                      ((flags & IOMMUF_writable) ? DMA_PTE_WRITE : 0));
@@ -1823,13 +1821,13 @@ static int __must_check intel_iommu_map_page(struct domain *d,
 }
 
 static int __must_check intel_iommu_unmap_page(struct domain *d,
-                                               unsigned long bfn)
+                                               bfn_t bfn)
 {
     /* Do nothing if hardware domain and iommu supports pass thru. */
     if ( iommu_passthrough && is_hardware_domain(d) )
         return 0;
 
-    return dma_pte_clear_one(d, __bfn_to_baddr(bfn));
+    return dma_pte_clear_one(d, bfn_to_baddr(bfn));
 }
 
 int iommu_pte_flush(struct domain *d, uint64_t bfn, uint64_t *pte,
diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c b/xen/drivers/passthrough/vtd/x86/vtd.c
index dc37dce4b6..6fed4a92cb 100644
--- a/xen/drivers/passthrough/vtd/x86/vtd.c
+++ b/xen/drivers/passthrough/vtd/x86/vtd.c
@@ -121,6 +121,8 @@ void __hwdom_init vtd_set_hwdom_mapping(struct domain *d)
     {
         unsigned long pfn = pdx_to_pfn(i);
         bool map;
+        bfn_t bfn;
+        mfn_t mfn;
         int rc = 0;
 
         /*
@@ -153,10 +155,11 @@ void __hwdom_init vtd_set_hwdom_mapping(struct domain *d)
             continue;
 
         tmp = 1 << (PAGE_SHIFT - PAGE_SHIFT_4K);
+        bfn = _bfn(pfn * tmp);
+        mfn = _mfn(pfn * tmp);
         for ( j = 0; j < tmp; j++ )
         {
-            int ret = iommu_map_page(d, _bfn(pfn * tmp + j),
-                                     _mfn(pfn * tmp + j),
+            int ret = iommu_map_page(d, bfn_add(bfn, j), mfn_add(mfn, j),
                                      IOMMUF_readable|IOMMUF_writable);
 
             if ( !rc )
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index d48d16b51b..9f2ad15ba0 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -63,7 +63,7 @@ int arch_iommu_populate_page_table(struct domain *d)
             {
                 ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
                 BUG_ON(SHARED_M2P(gfn));
-                rc = hd->platform_ops->map_page(d, gfn, mfn,
+                rc = hd->platform_ops->map_page(d, _bfn(gfn), _mfn(mfn),
                                                 IOMMUF_readable |
                                                 IOMMUF_writable);
             }
diff --git a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
index 99bc21c7b3..dce9ed6b83 100644
--- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
@@ -52,9 +52,9 @@ int amd_iommu_init(void);
 int amd_iommu_update_ivrs_mapping_acpi(void);
 
 /* mapping functions */
-int __must_check amd_iommu_map_page(struct domain *d, unsigned long gfn,
-                                    unsigned long mfn, unsigned int flags);
-int __must_check amd_iommu_unmap_page(struct domain *d, unsigned long gfn);
+int __must_check amd_iommu_map_page(struct domain *d, bfn_t bfn,
+                                    mfn_t mfn, unsigned int flags);
+int __must_check amd_iommu_unmap_page(struct domain *d, bfn_t bfn);
 u64 amd_iommu_get_next_table_from_pte(u32 *entry);
 int __must_check amd_iommu_alloc_root(struct domain_iommu *hd);
 int amd_iommu_reserve_domain_unity_map(struct domain *domain,
@@ -77,7 +77,7 @@ void iommu_dte_set_guest_cr3(u32 *dte, u16 dom_id, u64 gcr3,
 
 /* send cmd to iommu */
 void amd_iommu_flush_all_pages(struct domain *d);
-void amd_iommu_flush_pages(struct domain *d, unsigned long gfn,
+void amd_iommu_flush_pages(struct domain *d, unsigned long bfn,
                            unsigned int order);
 void amd_iommu_flush_iotlb(u8 devfn, const struct pci_dev *pdev,
                            uint64_t gaddr, unsigned int order);
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index a3c36c1148..624784fec8 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -48,6 +48,11 @@ static inline bfn_t bfn_add(bfn_t bfn, unsigned long i)
     return _bfn(bfn_x(bfn) + i);
 }
 
+static inline bool_t bfn_eq(bfn_t x, bfn_t y)
+{
+    return bfn_x(x) == bfn_x(y);
+}
+
 #define IOMMU_PAGE_SHIFT 12
 #define IOMMU_PAGE_SIZE  (_AC(1,L) << IOMMU_PAGE_SHIFT)
 #define IOMMU_PAGE_MASK  (~(IOMMU_PAGE_SIZE - 1))
@@ -184,9 +189,9 @@ struct iommu_ops {
 #endif /* HAS_PCI */
 
     void (*teardown)(struct domain *d);
-    int __must_check (*map_page)(struct domain *d, unsigned long bfn,
-                                 unsigned long mfn, unsigned int flags);
-    int __must_check (*unmap_page)(struct domain *d, unsigned long bfn);
+    int __must_check (*map_page)(struct domain *d, bfn_t bfn, mfn_t mfn,
+                                 unsigned int flags);
+    int __must_check (*unmap_page)(struct domain *d, bfn_t bfn);
     void (*free_page_table)(struct page_info *);
 #ifdef CONFIG_X86
     void (*update_ire_from_apic)(unsigned int apic, unsigned int reg, unsigned int value);
@@ -197,7 +202,7 @@ struct iommu_ops {
     void (*resume)(void);
     void (*share_p2m)(struct domain *d);
     void (*crash_shutdown)(void);
-    int __must_check (*iotlb_flush)(struct domain *d, unsigned long bfn,
+    int __must_check (*iotlb_flush)(struct domain *d, bfn_t bfn,
                                     unsigned int page_count);
     int __must_check (*iotlb_flush_all)(struct domain *d);
     int (*get_reserved_device_memory)(iommu_grdm_t *, void *);
-- 
2.11.0



* [PATCH v5 05/15] iommu: don't domain_crash() inside iommu_map/unmap_page()
  2018-08-03 17:22 [PATCH v5 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (3 preceding siblings ...)
  2018-08-03 17:22 ` [PATCH v5 04/15] iommu: push use of type-safe BFN and MFN into iommu_ops Paul Durrant
@ 2018-08-03 17:22 ` Paul Durrant
  2018-08-07  2:55   ` Tian, Kevin
  2018-08-03 17:22 ` [PATCH v5 06/15] public / x86: introduce __HYPERCALL_iommu_op Paul Durrant
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2018-08-03 17:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Jun Nakajima, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall,
	Paul Durrant, Jan Beulich

Turn iommu_map/unmap_page() into straightforward wrappers that check the
existence of the relevant iommu_op and call through to it. This makes them
usable by the PV IOMMU code to be delivered in future patches.

Leave the decision on whether to invoke domain_crash() up to the caller.
This has the added benefit that the (module/line number) message that
domain_crash() emits will point more directly at where the problem lies.

NOTE: This patch includes one bit of clean-up in set_identity_p2m_entry()
      replacing use of p2m->domain with the domain pointer passed into the
      function.
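
As an illustration of the resulting call-site pattern, here is a toy,
self-contained sketch (the types and helpers below are stand-ins, not the
real Xen ones): the wrapper only logs and propagates the error, and the
caller applies the crash policy, so domain_crash()'s message identifies the
call site.

  #include <errno.h>
  #include <stdbool.h>
  #include <stdio.h>

  struct domain { int domain_id; bool is_hw; };

  static bool is_hardware_domain(const struct domain *d) { return d->is_hw; }

  static void domain_crash(struct domain *d)
  {
      printf("d%d: domain_crash() called here\n", d->domain_id);
  }

  /* The wrapper now only logs and returns the error... */
  static int iommu_map_page(struct domain *d, unsigned long bfn,
                            unsigned long mfn, unsigned int flags)
  {
      int rc = -EFAULT; /* pretend the underlying map_page hook failed */

      if ( rc )
          printf("d%d: IOMMU mapping bfn %#lx to mfn %#lx failed: %d\n",
                 d->domain_id, bfn, mfn, rc);
      return rc;
  }

  /* ...and each caller decides whether the failure is fatal. */
  static int caller(struct domain *d)
  {
      int rc = iommu_map_page(d, 0x1000, 0x1000, 3);

      if ( rc && !is_hardware_domain(d) )
          domain_crash(d);

      return rc;
  }

  int main(void)
  {
      struct domain domu = { .domain_id = 1, .is_hw = false };

      return caller(&domu) ? 1 : 0;
  }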

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: George Dunlap <George.Dunlap@eu.citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tim Deegan <tim@xen.org>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>

v2:
 - New in v2.
---
 xen/arch/arm/p2m.c                  |  4 ++++
 xen/arch/x86/mm.c                   |  3 +++
 xen/arch/x86/mm/p2m-ept.c           |  3 +++
 xen/arch/x86/mm/p2m-pt.c            |  3 +++
 xen/arch/x86/mm/p2m.c               | 24 ++++++++++++++++++++----
 xen/common/grant_table.c            |  8 ++++++++
 xen/common/memory.c                 |  3 +++
 xen/drivers/passthrough/iommu.c     | 12 ------------
 xen/drivers/passthrough/x86/iommu.c |  4 ++++
 9 files changed, 48 insertions(+), 16 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index d719db1e30..eb39861b73 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -953,8 +953,12 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
 
     if ( need_iommu(p2m->domain) &&
          (lpae_valid(orig_pte) || lpae_valid(*entry)) )
+    {
         rc = iommu_iotlb_flush(p2m->domain, _bfn(gfn_x(sgfn)),
                                1UL << page_order);
+        if ( unlikely(rc) && !is_hardware_domain(p2m->domain) )
+            domain_crash(p2m->domain);
+    }
     else
         rc = 0;
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 99070f916d..08878574f3 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2737,6 +2737,9 @@ static int _get_page_type(struct page_info *page, unsigned long type,
                 iommu_ret = iommu_map_page(d, _bfn(mfn_x(mfn)), mfn,
                                            IOMMUF_readable |
                                            IOMMUF_writable);
+
+            if ( unlikely(iommu_ret) && !is_hardware_domain(d) )
+                domain_crash(d);
         }
     }
 
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 2089b5232d..33e77903d6 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -895,6 +895,9 @@ out:
                     if ( !rc )
                         rc = ret;
                 }
+
+            if ( unlikely(rc) && !is_hardware_domain(d) )
+                domain_crash(d);
         }
     }
 
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index a441af388a..9ff0b3fe7a 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -717,6 +717,9 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
                         rc = ret;
                 }
         }
+
+        if ( unlikely(rc) && !is_hardware_domain(p2m->domain) )
+            domain_crash(p2m->domain);
     }
 
     /*
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index fbf67def50..036f52f4f1 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -724,6 +724,9 @@ p2m_remove_page(struct p2m_domain *p2m, unsigned long gfn_l, unsigned long mfn,
                 if ( !rc )
                     rc = ret;
             }
+
+            if ( unlikely(rc) && !is_hardware_domain(p2m->domain) )
+                domain_crash(p2m->domain);
         }
 
         return rc;
@@ -789,6 +792,9 @@ guest_physmap_add_entry(struct domain *d, gfn_t gfn, mfn_t mfn,
                         if ( iommu_unmap_page(d, bfn_add(bfn, i)) )
                             continue;
 
+                    if ( !is_hardware_domain(d) )
+                        domain_crash(d);
+
                     return rc;
                 }
             }
@@ -1157,12 +1163,17 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l,
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
     int ret;
 
-    if ( !paging_mode_translate(p2m->domain) )
+    if ( !paging_mode_translate(d) )
     {
         if ( !need_iommu(d) )
             return 0;
-        return iommu_map_page(d, _bfn(gfn_l), _mfn(gfn_l),
-                              IOMMUF_readable | IOMMUF_writable);
+
+        ret = iommu_map_page(d, _bfn(gfn_l), _mfn(gfn_l),
+                             IOMMUF_readable | IOMMUF_writable);
+        if ( unlikely(ret) && !is_hardware_domain(d) )
+            domain_crash(d);
+
+        return ret;
     }
 
     gfn_lock(p2m, gfn, 0);
@@ -1252,7 +1263,12 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l)
     {
         if ( !need_iommu(d) )
             return 0;
-        return iommu_unmap_page(d, _bfn(gfn_l));
+
+        ret = iommu_unmap_page(d, _bfn(gfn_l));
+        if ( unlikely(ret) && !is_hardware_domain(d) )
+            domain_crash(d);
+
+        return ret;
     }
 
     gfn_lock(p2m, gfn, 0);
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index a83c8832af..1840b656c9 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -1146,6 +1146,9 @@ map_grant_ref(
         }
         if ( err )
         {
+            if ( !is_hardware_domain(ld) )
+                domain_crash(ld);
+
             double_gt_unlock(lgt, rgt);
             rc = GNTST_general_error;
             goto undo_out;
@@ -1412,7 +1415,12 @@ unmap_common(
         double_gt_unlock(lgt, rgt);
 
         if ( err )
+        {
+            if ( !is_hardware_domain(ld) )
+                domain_crash(ld);
+
             rc = GNTST_general_error;
+        }
     }
 
     /* If just unmapped a writable mapping, mark as dirtied */
diff --git a/xen/common/memory.c b/xen/common/memory.c
index a5656743c2..0a34677cc3 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -840,6 +840,9 @@ int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap *xatp,
         ret = iommu_iotlb_flush(d, _bfn(xatp->gpfn - done), done);
         if ( unlikely(ret) && rc >= 0 )
             rc = ret;
+
+        if ( unlikely(rc < 0) && !is_hardware_domain(d) )
+            domain_crash(d);
     }
 #endif
 
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 1f32958816..21e6886a3f 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -272,9 +272,6 @@ int iommu_map_page(struct domain *d, bfn_t bfn, mfn_t mfn,
             printk(XENLOG_ERR
                    "d%d: IOMMU mapping bfn %"PRI_bfn" to mfn %"PRI_mfn" failed: %d\n",
                    d->domain_id, bfn_x(bfn), mfn_x(mfn), rc);
-
-        if ( !is_hardware_domain(d) )
-            domain_crash(d);
     }
 
     return rc;
@@ -295,9 +292,6 @@ int iommu_unmap_page(struct domain *d, bfn_t bfn)
             printk(XENLOG_ERR
                    "d%d: IOMMU unmapping bfn %"PRI_bfn" failed: %d\n",
                    d->domain_id, bfn_x(bfn), rc);
-
-        if ( !is_hardware_domain(d) )
-            domain_crash(d);
     }
 
     return rc;
@@ -335,9 +329,6 @@ int iommu_iotlb_flush(struct domain *d, bfn_t bfn, unsigned int page_count)
             printk(XENLOG_ERR
                    "d%d: IOMMU IOTLB flush failed: %d, bfn %"PRI_bfn", page count %u\n",
                    d->domain_id, rc, bfn_x(bfn), page_count);
-
-        if ( !is_hardware_domain(d) )
-            domain_crash(d);
     }
 
     return rc;
@@ -358,9 +349,6 @@ int iommu_iotlb_flush_all(struct domain *d)
             printk(XENLOG_ERR
                    "d%d: IOMMU IOTLB flush all failed: %d\n",
                    d->domain_id, rc);
-
-        if ( !is_hardware_domain(d) )
-            domain_crash(d);
     }
 
     return rc;
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 9f2ad15ba0..1b3d2a2c8f 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -102,7 +102,11 @@ int arch_iommu_populate_page_table(struct domain *d)
     this_cpu(iommu_dont_flush_iotlb) = 0;
 
     if ( !rc )
+    {
         rc = iommu_iotlb_flush_all(d);
+        if ( unlikely(rc) )
+            domain_crash(d);
+    }
 
     if ( rc && rc != -ERESTART )
         iommu_teardown(d);
-- 
2.11.0



* [PATCH v5 06/15] public / x86: introduce __HYPERCALL_iommu_op
  2018-08-03 17:22 [PATCH v5 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (4 preceding siblings ...)
  2018-08-03 17:22 ` [PATCH v5 05/15] iommu: don't domain_crash() inside iommu_map/unmap_page() Paul Durrant
@ 2018-08-03 17:22 ` Paul Durrant
  2018-08-07  3:00   ` Tian, Kevin
  2018-08-17 21:10   ` Daniel De Graaf
  2018-08-03 17:22 ` [PATCH v5 07/15] iommu: track reserved ranges using a rangeset Paul Durrant
                   ` (8 subsequent siblings)
  14 siblings, 2 replies; 57+ messages in thread
From: Paul Durrant @ 2018-08-03 17:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Paul Durrant, Jan Beulich,
	Daniel De Graaf

This patch introduces the boilerplate for a new hypercall to allow a
domain to control IOMMU mappings for its own pages.

Whilst there is duplication of code between the native and compat entry
points, which appears ripe for some form of consolidation, I think it is
better to maintain the separation as-is because the compat entry point
will necessarily gain complexity in subsequent patches.

NOTE: This hypercall is only implemented for x86 and is currently
      restricted by XSM to dom0. Its scope can be expanded in future
      if need be.
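
For reference, a rough guest-side sketch of the intended calling convention.
The structures mirror the new public header; the hypercall_iommu_op() stub
is hypothetical and merely stands in for the guest's real hypercall
mechanism (e.g. privcmd/xencall):

  #include <errno.h>
  #include <stdint.h>

  /* Guest-side mirror of the structures in public/iommu_op.h. */
  struct xen_iommu_op {
      uint16_t op;     /* op type */
      uint16_t pad;    /* must be zero */
      int32_t  status; /* filled in by Xen: 0 or -errno */
  };

  struct xen_iommu_op_buf {
      void *h;            /* stands in for XEN_GUEST_HANDLE(void) */
      unsigned long size; /* stands in for xen_ulong_t */
  };

  /* Hypothetical stub; a real guest would issue the hypercall proper. */
  static long hypercall_iommu_op(unsigned int nr_bufs,
                                 struct xen_iommu_op_buf *bufs)
  {
      (void)nr_bufs;
      (void)bufs;
      return -ENOSYS;
  }

  static long issue_one_op(uint16_t op_type)
  {
      struct xen_iommu_op op = { .op = op_type };
      struct xen_iommu_op_buf buf = { .h = &op, .size = sizeof(op) };
      long rc = hypercall_iommu_op(1, &buf);

      /*
       * rc reports marshalling/continuation errors; op.status holds the
       * per-op result (always -EOPNOTSUPP for now, since this patch
       * defines no ops).
       */
      return rc ? rc : op.status;
  }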

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov>

v3:
 - Push op code into XSM check.

v2:
 - Get rid of the can_control_iommu() function, leaving this patch as pure
   boilerplate.
 - Re-structure the hypercall to use a buffer array, similar to that used
   by __HYPERCALL_dm_op, to allow for future expansion of op structure
   without affecting binary compatibility.
 - Drop use of __ in public header guard.
---
 tools/flask/policy/modules/xen.if   |   1 +
 xen/arch/x86/Makefile               |   1 +
 xen/arch/x86/hvm/hypercall.c        |   1 +
 xen/arch/x86/hypercall.c            |   1 +
 xen/arch/x86/iommu_op.c             | 184 ++++++++++++++++++++++++++++++++++++
 xen/arch/x86/pv/hypercall.c         |   1 +
 xen/include/Makefile                |   2 +
 xen/include/public/iommu_op.h       |  64 +++++++++++++
 xen/include/public/xen.h            |   1 +
 xen/include/xen/hypercall.h         |  12 +++
 xen/include/xlat.lst                |   2 +
 xen/include/xsm/dummy.h             |   6 ++
 xen/include/xsm/xsm.h               |   6 ++
 xen/xsm/dummy.c                     |   1 +
 xen/xsm/flask/hooks.c               |   6 ++
 xen/xsm/flask/policy/access_vectors |   2 +
 16 files changed, 291 insertions(+)
 create mode 100644 xen/arch/x86/iommu_op.c
 create mode 100644 xen/include/public/iommu_op.h

diff --git a/tools/flask/policy/modules/xen.if b/tools/flask/policy/modules/xen.if
index 7aefd0061e..5161611471 100644
--- a/tools/flask/policy/modules/xen.if
+++ b/tools/flask/policy/modules/xen.if
@@ -60,6 +60,7 @@ define(`create_domain_common', `
 	allow $1 $2:grant setup;
 	allow $1 $2:hvm { getparam hvmctl sethvmc
 			setparam nested altp2mhvm altp2mhvm_op dm };
+	allow $1 $2:resource control_iommu;
 ')
 
 # create_domain(priv, target)
diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index 17e7d3fa34..35bfaf7e8a 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -33,6 +33,7 @@ obj-$(CONFIG_CRASH_DEBUG) += gdbstub.o
 obj-y += hypercall.o
 obj-y += i387.o
 obj-y += i8259.o
+obj-y += iommu_op.o
 obj-y += io_apic.o
 obj-$(CONFIG_LIVEPATCH) += alternative.o livepatch.o
 obj-y += msi.o
diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
index 85eacd7d33..3574966827 100644
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -137,6 +137,7 @@ static const hypercall_table_t hvm_hypercall_table[] = {
     COMPAT_CALL(mmuext_op),
     HYPERCALL(xenpmu_op),
     COMPAT_CALL(dm_op),
+    COMPAT_CALL(iommu_op),
     HYPERCALL(arch_1)
 };
 
diff --git a/xen/arch/x86/hypercall.c b/xen/arch/x86/hypercall.c
index 90e88c1d2c..045753e702 100644
--- a/xen/arch/x86/hypercall.c
+++ b/xen/arch/x86/hypercall.c
@@ -68,6 +68,7 @@ const hypercall_args_t hypercall_args_table[NR_hypercalls] =
     ARGS(xenpmu_op, 2),
     ARGS(dm_op, 3),
     ARGS(mca, 1),
+    ARGS(iommu_op, 2),
     ARGS(arch_1, 1),
 };
 
diff --git a/xen/arch/x86/iommu_op.c b/xen/arch/x86/iommu_op.c
new file mode 100644
index 0000000000..744c0fce27
--- /dev/null
+++ b/xen/arch/x86/iommu_op.c
@@ -0,0 +1,184 @@
+/******************************************************************************
+ * x86/iommu_op.c
+ *
+ * Paravirtualised IOMMU functionality
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2018 Citrix Systems Inc
+ */
+
+#include <xen/event.h>
+#include <xen/guest_access.h>
+#include <xen/hypercall.h>
+
+static void iommu_op(xen_iommu_op_t *op)
+{
+    switch ( op->op )
+    {
+    default:
+        op->status = -EOPNOTSUPP;
+        break;
+    }
+}
+
+int do_one_iommu_op(xen_iommu_op_buf_t *buf)
+{
+    xen_iommu_op_t op;
+    int rc;
+
+    if ( buf->size < sizeof(op) )
+        return -EFAULT;
+
+    if ( copy_from_guest((void *)&op, buf->h, sizeof(op)) )
+        return -EFAULT;
+
+    if ( op.pad )
+        return -EINVAL;
+
+    rc = xsm_iommu_op(XSM_PRIV, current->domain, op.op);
+    if ( rc )
+        return rc;
+
+    iommu_op(&op);
+
+    if ( __copy_field_to_guest(guest_handle_cast(buf->h, xen_iommu_op_t),
+                               &op, status) )
+        return -EFAULT;
+
+    return 0;
+}
+
+long do_iommu_op(unsigned int nr_bufs,
+                 XEN_GUEST_HANDLE_PARAM(xen_iommu_op_buf_t) bufs)
+{
+    unsigned int i;
+    long rc = 0;
+
+    for ( i = 0; i < nr_bufs; i++ )
+    {
+        xen_iommu_op_buf_t buf;
+
+        if ( ((i & 0xff) == 0xff) && hypercall_preempt_check() )
+        {
+            rc = i;
+            break;
+        }
+
+        if ( copy_from_guest_offset(&buf, bufs, i, 1) )
+        {
+            rc = -EFAULT;
+            break;
+        }
+
+        rc = do_one_iommu_op(&buf);
+        if ( rc )
+            break;
+    }
+
+    if ( rc > 0 )
+    {
+        ASSERT(rc < nr_bufs);
+        nr_bufs -= rc;
+        guest_handle_add_offset(bufs, rc);
+
+        rc = hypercall_create_continuation(__HYPERVISOR_iommu_op,
+                                           "ih", nr_bufs, bufs);
+    }
+
+    return rc;
+}
+
+int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
+{
+    compat_iommu_op_t cmp;
+    xen_iommu_op_t nat;
+    int rc;
+
+    if ( buf->size < sizeof(cmp) )
+        return -EFAULT;
+
+    if ( copy_from_compat((void *)&cmp, buf->h, sizeof(cmp)) )
+        return -EFAULT;
+
+    if ( cmp.pad )
+        return -EINVAL;
+
+    rc = xsm_iommu_op(XSM_PRIV, current->domain, cmp.op);
+    if ( rc )
+        return rc;
+
+    XLAT_iommu_op(&nat, &cmp);
+
+    iommu_op(&nat);
+
+    XLAT_iommu_op(&cmp, &nat);
+
+    if ( __copy_field_to_compat(compat_handle_cast(buf->h,
+                                                   compat_iommu_op_t),
+                                &cmp, status) )
+        return -EFAULT;
+
+    return 0;
+}
+
+int compat_iommu_op(unsigned int nr_bufs,
+                    XEN_GUEST_HANDLE_PARAM(compat_iommu_op_buf_t) bufs)
+{
+    unsigned int i;
+    long rc = 0;
+
+    for ( i = 0; i < nr_bufs; i++ )
+    {
+        compat_iommu_op_buf_t buf;
+
+        if ( ((i & 0xff) == 0xff) && hypercall_preempt_check() )
+        {
+            rc = i;
+            break;
+        }
+
+        if ( copy_from_guest_offset(&buf, bufs, i, 1) )
+        {
+            rc = -EFAULT;
+            break;
+        }
+
+        rc = compat_one_iommu_op(&buf);
+        if ( rc )
+            break;
+    }
+
+    if ( rc > 0 )
+    {
+        ASSERT(rc < nr_bufs);
+        nr_bufs -= rc;
+        guest_handle_add_offset(bufs, rc);
+
+        rc = hypercall_create_continuation(__HYPERVISOR_iommu_op,
+                                           "ih", nr_bufs, bufs);
+    }
+
+    return rc;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/x86/pv/hypercall.c b/xen/arch/x86/pv/hypercall.c
index bbc3011d1a..d23f9af42f 100644
--- a/xen/arch/x86/pv/hypercall.c
+++ b/xen/arch/x86/pv/hypercall.c
@@ -80,6 +80,7 @@ const hypercall_table_t pv_hypercall_table[] = {
     HYPERCALL(xenpmu_op),
     COMPAT_CALL(dm_op),
     HYPERCALL(mca),
+    COMPAT_CALL(iommu_op),
     HYPERCALL(arch_1),
 };
 
diff --git a/xen/include/Makefile b/xen/include/Makefile
index df04182965..af54d8833f 100644
--- a/xen/include/Makefile
+++ b/xen/include/Makefile
@@ -11,6 +11,7 @@ headers-y := \
     compat/features.h \
     compat/grant_table.h \
     compat/kexec.h \
+    compat/iommu_op.h \
     compat/memory.h \
     compat/nmi.h \
     compat/physdev.h \
@@ -29,6 +30,7 @@ headers-$(CONFIG_X86)     += compat/arch-x86/xen-$(compat-arch-y).h
 headers-$(CONFIG_X86)     += compat/hvm/dm_op.h
 headers-$(CONFIG_X86)     += compat/hvm/hvm_op.h
 headers-$(CONFIG_X86)     += compat/hvm/hvm_vcpu.h
+headers-$(CONFIG_X86)     += compat/iommu_op.h
 headers-y                 += compat/arch-$(compat-arch-y).h compat/pmu.h compat/xlat.h
 headers-$(CONFIG_FLASK)   += compat/xsm/flask_op.h
 
diff --git a/xen/include/public/iommu_op.h b/xen/include/public/iommu_op.h
new file mode 100644
index 0000000000..c3b68f665a
--- /dev/null
+++ b/xen/include/public/iommu_op.h
@@ -0,0 +1,64 @@
+/*
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Copyright (C) 2018 Citrix Systems Inc
+ */
+
+#ifndef XEN_PUBLIC_IOMMU_OP_H
+#define XEN_PUBLIC_IOMMU_OP_H
+
+#include "xen.h"
+
+struct xen_iommu_op {
+    uint16_t op;    /* op type */
+    uint16_t pad;
+    int32_t status; /* op completion status: */
+                    /* 0 for success otherwise, negative errno */
+};
+typedef struct xen_iommu_op xen_iommu_op_t;
+DEFINE_XEN_GUEST_HANDLE(xen_iommu_op_t);
+
+struct xen_iommu_op_buf {
+    XEN_GUEST_HANDLE(void) h;
+    xen_ulong_t size;
+};
+typedef struct xen_iommu_op_buf xen_iommu_op_buf_t;
+DEFINE_XEN_GUEST_HANDLE(xen_iommu_op_buf_t);
+
+/* ` enum neg_errnoval
+ * ` HYPERVISOR_iommu_op(unsigned int nr_bufs,
+ * `                     xen_iommu_op_buf_t bufs[])
+ * `
+ *
+ * @nr_bufs is the number of buffers in the @bufs array.
+ * @bufs points to an array of buffers where each contains a struct
+ * xen_iommu_op.
+ */
+
+#endif /* XEN_PUBLIC_IOMMU_OP_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index fb1df8f293..68b0968e7d 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -121,6 +121,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
 #define __HYPERVISOR_xc_reserved_op       39 /* reserved for XenClient */
 #define __HYPERVISOR_xenpmu_op            40
 #define __HYPERVISOR_dm_op                41
+#define __HYPERVISOR_iommu_op             42
 
 /* Architecture-specific hypercall definitions. */
 #define __HYPERVISOR_arch_0               48
diff --git a/xen/include/xen/hypercall.h b/xen/include/xen/hypercall.h
index cc99aea57d..2ebc999f4b 100644
--- a/xen/include/xen/hypercall.h
+++ b/xen/include/xen/hypercall.h
@@ -16,6 +16,7 @@
 #include <public/version.h>
 #include <public/pmu.h>
 #include <public/hvm/dm_op.h>
+#include <public/iommu_op.h>
 #include <asm/hypercall.h>
 #include <xsm/xsm.h>
 
@@ -148,6 +149,10 @@ do_dm_op(
     unsigned int nr_bufs,
     XEN_GUEST_HANDLE_PARAM(xen_dm_op_buf_t) bufs);
 
+extern long
+do_iommu_op(unsigned int nr_bufs,
+            XEN_GUEST_HANDLE_PARAM(xen_iommu_op_buf_t) bufs);
+
 #ifdef CONFIG_COMPAT
 
 extern int
@@ -205,6 +210,13 @@ compat_dm_op(
     unsigned int nr_bufs,
     XEN_GUEST_HANDLE_PARAM(void) bufs);
 
+#include <compat/iommu_op.h>
+
+DEFINE_XEN_GUEST_HANDLE(compat_iommu_op_buf_t);
+extern int
+compat_iommu_op(unsigned int nr_bufs,
+                XEN_GUEST_HANDLE_PARAM(compat_iommu_op_buf_t) bufs);
+
 #endif
 
 void arch_get_xen_caps(xen_capabilities_info_t *info);
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 527332054a..3b15c18c4e 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -77,6 +77,8 @@
 ?	vcpu_hvm_context		hvm/hvm_vcpu.h
 ?	vcpu_hvm_x86_32			hvm/hvm_vcpu.h
 ?	vcpu_hvm_x86_64			hvm/hvm_vcpu.h
+!	iommu_op			iommu_op.h
+!	iommu_op_buf			iommu_op.h
 ?	kexec_exec			kexec.h
 !	kexec_image			kexec.h
 !	kexec_range			kexec.h
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index ff6b2dbf39..8891da0759 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -715,6 +715,12 @@ static XSM_INLINE int xsm_dm_op(XSM_DEFAULT_ARG struct domain *d)
     return xsm_default_action(action, current->domain, d);
 }
 
+static XSM_INLINE int xsm_iommu_op(XSM_DEFAULT_ARG struct domain *d, unsigned int op)
+{
+    XSM_ASSERT_ACTION(XSM_PRIV);
+    return xsm_default_action(action, current->domain, d);
+}
+
 #endif /* CONFIG_X86 */
 
 #include <public/version.h>
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index f0c6fc7e29..c47c1418eb 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -178,6 +178,7 @@ struct xsm_operations {
     int (*ioport_mapping) (struct domain *d, uint32_t s, uint32_t e, uint8_t allow);
     int (*pmu_op) (struct domain *d, unsigned int op);
     int (*dm_op) (struct domain *d);
+    int (*iommu_op) (struct domain *d, unsigned int op);
 #endif
     int (*xen_version) (uint32_t cmd);
     int (*domain_resource_map) (struct domain *d);
@@ -686,6 +687,11 @@ static inline int xsm_dm_op(xsm_default_t def, struct domain *d)
     return xsm_ops->dm_op(d);
 }
 
+static inline int xsm_iommu_op(xsm_default_t def, struct domain *d, unsigned int op)
+{
+    return xsm_ops->iommu_op(d, op);
+}
+
 #endif /* CONFIG_X86 */
 
 static inline int xsm_xen_version (xsm_default_t def, uint32_t op)
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index 6e751199ee..02512a1566 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -155,6 +155,7 @@ void __init xsm_fixup_ops (struct xsm_operations *ops)
     set_to_dummy_if_null(ops, ioport_mapping);
     set_to_dummy_if_null(ops, pmu_op);
     set_to_dummy_if_null(ops, dm_op);
+    set_to_dummy_if_null(ops, iommu_op);
 #endif
     set_to_dummy_if_null(ops, xen_version);
     set_to_dummy_if_null(ops, domain_resource_map);
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 78bc32602e..a04786106f 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1672,6 +1672,11 @@ static int flask_dm_op(struct domain *d)
     return current_has_perm(d, SECCLASS_HVM, HVM__DM);
 }
 
+static int flask_iommu_op(struct domain *d)
+{
+    return current_has_perm(d, SECCLASS_RESOURCE, RESOURCE__CONTROL_IOMMU);
+}
+
 #endif /* CONFIG_X86 */
 
 static int flask_xen_version (uint32_t op)
@@ -1850,6 +1855,7 @@ static struct xsm_operations flask_ops = {
     .ioport_mapping = flask_ioport_mapping,
     .pmu_op = flask_pmu_op,
     .dm_op = flask_dm_op,
+    .iommu_op = flask_iommu_op,
 #endif
     .xen_version = flask_xen_version,
     .domain_resource_map = flask_domain_resource_map,
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index c5d85486d0..0c894b733e 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -473,6 +473,8 @@ class resource
 # checked for PHYSDEVOP_setup_gsi (target IRQ)
 # checked for PHYSDEVOP_pci_mmcfg_reserved (target xen_t)
     setup
+# checked for IOMMU_OP
+    control_iommu
 }
 
 # Class security describes the FLASK security server itself; these operations
-- 
2.11.0



* [PATCH v5 07/15] iommu: track reserved ranges using a rangeset
  2018-08-03 17:22 [PATCH v5 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (5 preceding siblings ...)
  2018-08-03 17:22 ` [PATCH v5 06/15] public / x86: introduce __HYPERCALL_iommu_op Paul Durrant
@ 2018-08-03 17:22 ` Paul Durrant
  2018-08-07  3:04   ` Tian, Kevin
  2018-08-03 17:22 ` [PATCH v5 08/15] x86: add iommu_op to query reserved ranges Paul Durrant
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2018-08-03 17:22 UTC (permalink / raw)
  To: xen-devel; +Cc: Kevin Tian, Paul Durrant, Jan Beulich

Ranges that should be considered reserved in the IOMMU are not necessarily
limited to RMRRs. If iommu_inclusive_mapping is set then any frame number
falling within an E820 reserved region should also be considered reserved
in the IOMMU.

This patch adds a rangeset to the domain_iommu structure that is then used
to track all reserved ranges. This will be needed by a subsequent patch
to test whether it is safe to modify a particular IOMMU entry.
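
For illustration only, the kind of check a later patch can perform against
the new rangeset is sketched below. The helper name is hypothetical;
rangeset_contains_singleton() is the existing rangeset primitive:

    /* Sketch, not part of this patch: is this BFN off-limits? */
    static bool bfn_is_reserved(const struct domain *d, bfn_t bfn)
    {
        const struct domain_iommu *hd = dom_iommu(d);

        return rangeset_contains_singleton(hd->reserved_ranges, bfn_x(bfn));
    }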

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Kevin Tian <kevin.tian@intel.com>

v2:
 - New in v2.
---
 xen/drivers/passthrough/iommu.c       | 10 +++++++++-
 xen/drivers/passthrough/vtd/iommu.c   | 20 +++++++++++++-------
 xen/drivers/passthrough/vtd/x86/vtd.c | 17 ++++++++++++++++-
 xen/include/xen/iommu.h               |  6 ++++++
 4 files changed, 44 insertions(+), 9 deletions(-)

diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 21e6886a3f..b10a37e5d7 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -147,6 +147,10 @@ int iommu_domain_init(struct domain *d)
     if ( !iommu_enabled )
         return 0;
 
+    hd->reserved_ranges = rangeset_new(d, NULL, 0);
+    if ( !hd->reserved_ranges )
+        return -ENOMEM;
+
     hd->platform_ops = iommu_get_ops();
     return hd->platform_ops->init(d);
 }
@@ -248,12 +252,16 @@ int iommu_construct(struct domain *d)
 
 void iommu_domain_destroy(struct domain *d)
 {
-    if ( !iommu_enabled || !dom_iommu(d)->platform_ops )
+    const struct domain_iommu *hd = dom_iommu(d);
+
+    if ( !iommu_enabled || !hd->platform_ops )
         return;
 
     iommu_teardown(d);
 
     arch_iommu_domain_destroy(d);
+
+    rangeset_destroy(hd->reserved_ranges);
 }
 
 int iommu_map_page(struct domain *d, bfn_t bfn, mfn_t mfn,
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index c9f50f04ad..282e227414 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1910,6 +1910,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
     unsigned long end_pfn = PAGE_ALIGN_4K(rmrr->end_address) >> PAGE_SHIFT_4K;
     struct mapped_rmrr *mrmrr;
     struct domain_iommu *hd = dom_iommu(d);
+    int err = 0;
 
     ASSERT(pcidevs_locked());
     ASSERT(rmrr->base_address < rmrr->end_address);
@@ -1923,8 +1924,6 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
         if ( mrmrr->base == rmrr->base_address &&
              mrmrr->end == rmrr->end_address )
         {
-            int ret = 0;
-
             if ( map )
             {
                 ++mrmrr->count;
@@ -1934,28 +1933,35 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
             if ( --mrmrr->count )
                 return 0;
 
-            while ( base_pfn < end_pfn )
+            err = rangeset_remove_range(hd->reserved_ranges,
+                                        base_pfn, end_pfn);
+            while ( !err && base_pfn < end_pfn )
             {
                 if ( clear_identity_p2m_entry(d, base_pfn) )
-                    ret = -ENXIO;
+                    err = -ENXIO;
+
                 base_pfn++;
             }
 
             list_del(&mrmrr->list);
             xfree(mrmrr);
-            return ret;
+            return err;
         }
     }
 
     if ( !map )
         return -ENOENT;
 
+    err = rangeset_add_range(hd->reserved_ranges, base_pfn, end_pfn);
+    if ( err )
+        return err;
+
     while ( base_pfn < end_pfn )
     {
-        int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag);
-
+        err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag);
         if ( err )
             return err;
+
         base_pfn++;
     }
 
diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c b/xen/drivers/passthrough/vtd/x86/vtd.c
index 6fed4a92cb..032412b8c6 100644
--- a/xen/drivers/passthrough/vtd/x86/vtd.c
+++ b/xen/drivers/passthrough/vtd/x86/vtd.c
@@ -164,10 +164,25 @@ void __hwdom_init vtd_set_hwdom_mapping(struct domain *d)
 
             if ( !rc )
                rc = ret;
+
+            /*
+             * The only reason a reserved page would be mapped is that
+             * iommu_inclusive_mapping is set, in which case the BFN
+             * needs to be marked as reserved in the IOMMU.
+             */
+            if ( page_is_ram_type(pfn, RAM_TYPE_RESERVED) )
+            {
+                ASSERT(iommu_inclusive_mapping);
+
+                ret = rangeset_add_singleton(dom_iommu(d)->reserved_ranges,
+                                             bfn_x(bfn));
+                if ( !rc )
+                    rc = ret;
+            }
         }
 
         if ( rc )
-            printk(XENLOG_WARNING VTDPREFIX " d%d: IOMMU mapping failed: %d\n",
+            printk(XENLOG_WARNING VTDPREFIX " d%d: IOMMU mapping/reservation failed: %d\n",
                    d->domain_id, rc);
 
         if (!(i & (0xfffff >> (PAGE_SHIFT - PAGE_SHIFT_4K))))
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 624784fec8..cc0be81b4e 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -122,6 +122,12 @@ struct domain_iommu {
 
     /* Features supported by the IOMMU */
     DECLARE_BITMAP(features, IOMMU_FEAT_count);
+
+    /*
+     * BFN ranges that are reserved in the domain IOMMU mappings and
+     * must not be modified after initialization.
+     */
+    struct rangeset *reserved_ranges;
 };
 
 #define dom_iommu(d)              (&(d)->iommu)
-- 
2.11.0



* [PATCH v5 08/15] x86: add iommu_op to query reserved ranges
  2018-08-03 17:22 [PATCH v5 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (6 preceding siblings ...)
  2018-08-03 17:22 ` [PATCH v5 07/15] iommu: track reserved ranges using a rangeset Paul Durrant
@ 2018-08-03 17:22 ` Paul Durrant
  2018-08-03 17:22 ` [PATCH v5 09/15] vtd: add lookup_page method to iommu_ops Paul Durrant
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 57+ messages in thread
From: Paul Durrant @ 2018-08-03 17:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Paul Durrant, Jan Beulich

This patch adds an iommu_op to allow the domain IOMMU reserved ranges to be
queried by the guest.

NOTE: The number of reserved ranges is determined by system firmware, in
      conjunction with Xen command line options, and is expected to be
      small. Thus, to avoid over-complicating the code, there is no
      pre-emption check within the operation.
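
For illustration, the public header added below documents a two-pass calling
convention. A guest-side sketch of that convention follows; iommu_op_one()
and alloc_ranges() are hypothetical guest helpers (the wrapping of the op in
a buffer and the hypercall itself are as introduced by the earlier
__HYPERCALL_iommu_op patch):

    xen_iommu_reserved_range_t *ranges;
    struct xen_iommu_op op = {
        .op = XEN_IOMMUOP_query_reserved,
    };

    /* Pass 1: NULL handle and nr_entries == 0 simply report the count. */
    op.u.query_reserved.nr_entries = 0;
    set_xen_guest_handle(op.u.query_reserved.ranges, NULL);
    iommu_op_one(&op);                      /* hypothetical wrapper */

    /* Pass 2: supply a buffer sized for the count reported by pass 1. */
    ranges = alloc_ranges(op.u.query_reserved.nr_entries); /* hypothetical */
    set_xen_guest_handle(op.u.query_reserved.ranges, ranges);
    iommu_op_one(&op);      /* op.status is -ENOBUFS only if the count grew */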

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

v4:
 - Make xen_bfn_t strictly 64 bits wide and drop associated compat
   translation.

v3:
 - Avoid speculation beyond array bounds check.

v2:
 - Re-implemented for v2 based on new rangeset.
---
 xen/arch/x86/iommu_op.c       | 164 ++++++++++++++++++++++++++++++++++++++++--
 xen/include/public/iommu_op.h |  39 ++++++++++
 xen/include/xlat.lst          |   2 +
 3 files changed, 199 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/iommu_op.c b/xen/arch/x86/iommu_op.c
index 744c0fce27..bcfcd49102 100644
--- a/xen/arch/x86/iommu_op.c
+++ b/xen/arch/x86/iommu_op.c
@@ -22,11 +22,70 @@
 #include <xen/event.h>
 #include <xen/guest_access.h>
 #include <xen/hypercall.h>
+#include <xen/nospec.h>
+
+struct get_reserved_ctxt {
+    unsigned int max_entries;
+    unsigned int nr_entries;
+    XEN_GUEST_HANDLE(xen_iommu_reserved_range_t) ranges;
+};
+
+static int get_reserved(unsigned long s, unsigned long e, void *arg)
+{
+    struct get_reserved_ctxt *ctxt = arg;
+
+    if ( ctxt->nr_entries < ctxt->max_entries )
+    {
+        xen_iommu_reserved_range_t range = {
+            .start_bfn = s,
+            .nr_frames = e - s,
+        };
+
+        if ( copy_to_guest_offset(ctxt->ranges, ctxt->nr_entries, &range,
+                                  1) )
+            return -EFAULT;
+    }
+
+    ctxt->nr_entries++;
+    return 0;
+}
+
+static int iommu_op_query_reserved(struct xen_iommu_op_query_reserved *op)
+{
+    struct domain *currd = current->domain;
+    struct domain_iommu *iommu = dom_iommu(currd);
+    struct get_reserved_ctxt ctxt = {
+        .max_entries = op->nr_entries,
+        .ranges = op->ranges,
+    };
+    int rc;
+
+    if ( op->pad )
+        return -EINVAL;
+
+    rc = rangeset_report_ranges(iommu->reserved_ranges, 0, ~0ul,
+                                get_reserved, &ctxt);
+    if ( rc )
+        return rc;
+
+    /* Pass back the actual number of reserved ranges */
+    op->nr_entries = ctxt.nr_entries;
+
+    if ( !guest_handle_is_null(ctxt.ranges) &&
+         ctxt.nr_entries > ctxt.max_entries )
+        return -ENOBUFS;
+
+    return 0;
+}
 
 static void iommu_op(xen_iommu_op_t *op)
 {
     switch ( op->op )
     {
+    case XEN_IOMMUOP_query_reserved:
+        op->status = iommu_op_query_reserved(&op->u.query_reserved);
+        break;
+
     default:
         op->status = -EOPNOTSUPP;
         break;
@@ -35,13 +94,20 @@ static void iommu_op(xen_iommu_op_t *op)
 
 int do_one_iommu_op(xen_iommu_op_buf_t *buf)
 {
-    xen_iommu_op_t op;
+    xen_iommu_op_t op = {};
+    size_t offset;
+    static const size_t op_size[] = {
+        [XEN_IOMMUOP_query_reserved] = sizeof(struct xen_iommu_op_query_reserved),
+    };
+    size_t size;
     int rc;
 
-    if ( buf->size < sizeof(op) )
+    offset = offsetof(struct xen_iommu_op, u);
+
+    if ( buf->size < offset )
         return -EFAULT;
 
-    if ( copy_from_guest((void *)&op, buf->h, sizeof(op)) )
+    if ( copy_from_guest((void *)&op, buf->h, offset) )
         return -EFAULT;
 
     if ( op.pad )
@@ -51,6 +117,16 @@ int do_one_iommu_op(xen_iommu_op_buf_t *buf)
     if ( rc )
         return rc;
 
+    if ( op.op >= ARRAY_SIZE(op_size) )
+        return -EOPNOTSUPP;
+
+    size = op_size[array_index_nospec(op.op, ARRAY_SIZE(op_size))];
+    if ( buf->size < offset + size )
+        return -EFAULT;
+
+    if ( copy_from_guest_offset((void *)&op.u, buf->h, offset, size) )
+        return -EFAULT;
+
     iommu_op(&op);
 
     if ( __copy_field_to_guest(guest_handle_cast(buf->h, xen_iommu_op_t),
@@ -100,16 +176,27 @@ long do_iommu_op(unsigned int nr_bufs,
     return rc;
 }
 
+CHECK_iommu_reserved_range;
+
 int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
 {
-    compat_iommu_op_t cmp;
+    compat_iommu_op_t cmp = {};
+    size_t offset;
+    static const size_t op_size[] = {
+        [XEN_IOMMUOP_query_reserved] = sizeof(struct compat_iommu_op_query_reserved),
+    };
+    size_t size;
     xen_iommu_op_t nat;
+    unsigned int u;
+    int32_t status;
     int rc;
 
-    if ( buf->size < sizeof(cmp) )
+    offset = offsetof(struct compat_iommu_op, u);
+
+    if ( buf->size < offset )
         return -EFAULT;
 
-    if ( copy_from_compat((void *)&cmp, buf->h, sizeof(cmp)) )
+    if ( copy_from_compat((void *)&cmp, buf->h, offset) )
         return -EFAULT;
 
     if ( cmp.pad )
@@ -119,17 +206,82 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
     if ( rc )
         return rc;
 
+    if ( cmp.op >= ARRAY_SIZE(op_size) )
+        return -EOPNOTSUPP;
+
+    size = op_size[array_index_nospec(cmp.op, ARRAY_SIZE(op_size))];
+    if ( buf->size < offset + size )
+        return -EFAULT;
+
+    if ( copy_from_compat_offset((void *)&cmp.u, buf->h, offset, size) )
+        return -EFAULT;
+
+    /*
+     * The xlat magic doesn't quite know how to handle the union so
+     * we need to fix things up here.
+     */
+#define XLAT_iommu_op_u_query_reserved XEN_IOMMUOP_query_reserved
+    u = cmp.op;
+
+#define XLAT_iommu_op_query_reserved_HNDL_ranges(_d_, _s_)            \
+    do                                                                \
+    {                                                                 \
+        if ( !compat_handle_is_null((_s_)->ranges) )                  \
+        {                                                             \
+            unsigned int *nr_entries = COMPAT_ARG_XLAT_VIRT_BASE;     \
+            xen_iommu_reserved_range_t *ranges =                      \
+                (void *)(nr_entries + 1);                             \
+                                                                      \
+            if ( sizeof(*nr_entries) +                                \
+                 (sizeof(*ranges) * (_s_)->nr_entries) >              \
+                 COMPAT_ARG_XLAT_SIZE )                               \
+                return -E2BIG;                                        \
+                                                                      \
+            *nr_entries = (_s_)->nr_entries;                          \
+            set_xen_guest_handle((_d_)->ranges, ranges);              \
+        }                                                             \
+        else                                                          \
+            set_xen_guest_handle((_d_)->ranges, NULL);                \
+    } while (false)
+
     XLAT_iommu_op(&nat, &cmp);
 
+#undef XLAT_iommu_op_query_reserved_HNDL_ranges
+
     iommu_op(&nat);
 
+    status = nat.status;
+
+#define XLAT_iommu_op_query_reserved_HNDL_ranges(_d_, _s_)               \
+    do                                                                   \
+    {                                                                    \
+        if ( !compat_handle_is_null((_d_)->ranges) )                     \
+        {                                                                \
+            unsigned int *nr_entries = COMPAT_ARG_XLAT_VIRT_BASE;        \
+            compat_iommu_reserved_range_t *ranges =                      \
+                (void *)(nr_entries + 1);                                \
+            unsigned int nr =                                            \
+                min_t(unsigned int, (_d_)->nr_entries, *nr_entries);     \
+                                                                         \
+            if ( __copy_to_compat_offset((_d_)->ranges, 0, ranges, nr) ) \
+                status = -EFAULT;                                        \
+        }                                                                \
+    } while (false)
+
     XLAT_iommu_op(&cmp, &nat);
 
+    /* status will have been modified if __copy_to_compat_offset() failed */
+    cmp.status = status;
+
+#undef XLAT_iommu_op_query_reserved_HNDL_ranges
+
     if ( __copy_field_to_compat(compat_handle_cast(buf->h,
                                                    compat_iommu_op_t),
                                 &cmp, status) )
         return -EFAULT;
 
+#undef XLAT_iommu_op_u_query_reserved
+
     return 0;
 }
 
diff --git a/xen/include/public/iommu_op.h b/xen/include/public/iommu_op.h
index c3b68f665a..ade404a877 100644
--- a/xen/include/public/iommu_op.h
+++ b/xen/include/public/iommu_op.h
@@ -25,11 +25,50 @@
 
 #include "xen.h"
 
+typedef uint64_t xen_bfn_t;
+
+/* Structure describing a single range reserved in the IOMMU */
+struct xen_iommu_reserved_range {
+    xen_bfn_t start_bfn;
+    unsigned int nr_frames;
+    unsigned int pad;
+};
+typedef struct xen_iommu_reserved_range xen_iommu_reserved_range_t;
+DEFINE_XEN_GUEST_HANDLE(xen_iommu_reserved_range_t);
+
+/*
+ * XEN_IOMMUOP_query_reserved: Query ranges reserved in the IOMMU.
+ */
+#define XEN_IOMMUOP_query_reserved 1
+
+struct xen_iommu_op_query_reserved {
+    /*
+     * IN/OUT - On entry this is the number of entries available
+     *          in the ranges array below.
+     *          On exit this is the actual number of reserved ranges.
+     */
+    unsigned int nr_entries;
+    unsigned int pad;
+    /*
+     * OUT - This array is populated with reserved ranges. If it is
+     *       not sufficiently large then available entries are populated,
+     *       but the op status code will be set to -ENOBUFS.
+     *       It is permissible to set this to NULL if nr_entries is also
+     *       set to zero. In this case, on exit, nr_entries will still be
+     *       set to the actual number of reserved ranges but the status
+     *       code will be set to zero.
+     */
+    XEN_GUEST_HANDLE(xen_iommu_reserved_range_t) ranges;
+};
+
 struct xen_iommu_op {
     uint16_t op;    /* op type */
     uint16_t pad;
     int32_t status; /* op completion status: */
                     /* 0 for success otherwise, negative errno */
+    union {
+        struct xen_iommu_op_query_reserved query_reserved;
+    } u;
 };
 typedef struct xen_iommu_op xen_iommu_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_iommu_op_t);
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 3b15c18c4e..d2f9b1034b 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -79,6 +79,8 @@
 ?	vcpu_hvm_x86_64			hvm/hvm_vcpu.h
 !	iommu_op			iommu_op.h
 !	iommu_op_buf			iommu_op.h
+!	iommu_op_query_reserved		iommu_op.h
+?	iommu_reserved_range		iommu_op.h
 ?	kexec_exec			kexec.h
 !	kexec_image			kexec.h
 !	kexec_range			kexec.h
-- 
2.11.0



* [PATCH v5 09/15] vtd: add lookup_page method to iommu_ops
  2018-08-03 17:22 [PATCH v5 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (7 preceding siblings ...)
  2018-08-03 17:22 ` [PATCH v5 08/15] x86: add iommu_op to query reserved ranges Paul Durrant
@ 2018-08-03 17:22 ` Paul Durrant
  2018-08-07  3:25   ` Tian, Kevin
  2018-08-03 17:22 ` [PATCH v5 10/15] mm / iommu: include need_iommu() test in iommu_use_hap_pt() Paul Durrant
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2018-08-03 17:22 UTC (permalink / raw)
  To: xen-devel; +Cc: Kevin Tian, Paul Durrant, George Dunlap, Jan Beulich

This patch adds a new method to the VT-d IOMMU implementation to find the
MFN currently mapped by a specified BFN, along with a wrapper function in
generic IOMMU code that calls the implementation if it exists.

This functionality will be used by a subsequent patch.
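
For illustration, the calling convention of the new wrapper is simply the
sketch below (the caller that actually needs this is added later in the
series):

    mfn_t mfn;
    unsigned int flags;
    int rc = iommu_lookup_page(d, bfn, &mfn, &flags);

    if ( rc == -ENOENT )
        /* nothing is currently mapped at this BFN */;
    else if ( !rc && (flags & IOMMUF_writable) )
        /* the BFN is currently mapped writable onto mfn */;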

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: George Dunlap <george.dunlap@citrix.com>

v3:
 - Addressed comments from George.

v2:
 - Addressed some comments from Jan.
---
 xen/drivers/passthrough/iommu.c     | 11 +++++++++++
 xen/drivers/passthrough/vtd/iommu.c | 34 ++++++++++++++++++++++++++++++++++
 xen/drivers/passthrough/vtd/iommu.h |  3 +++
 xen/include/xen/iommu.h             |  4 ++++
 4 files changed, 52 insertions(+)

diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index b10a37e5d7..9b7baca93f 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -305,6 +305,17 @@ int iommu_unmap_page(struct domain *d, bfn_t bfn)
     return rc;
 }
 
+int iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
+                      unsigned int *flags)
+{
+    const struct domain_iommu *hd = dom_iommu(d);
+
+    if ( !iommu_enabled || !hd->platform_ops )
+        return -EOPNOTSUPP;
+
+    return hd->platform_ops->lookup_page(d, bfn, mfn, flags);
+}
+
 static void iommu_free_pagetables(unsigned long unused)
 {
     do {
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 282e227414..8cd3b59aa0 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1830,6 +1830,39 @@ static int __must_check intel_iommu_unmap_page(struct domain *d,
     return dma_pte_clear_one(d, bfn_to_baddr(bfn));
 }
 
+static int intel_iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
+                                   unsigned int *flags)
+{
+    struct domain_iommu *hd = dom_iommu(d);
+    struct dma_pte *page = NULL, *pte = NULL, val;
+    u64 pg_maddr;
+
+    spin_lock(&hd->arch.mapping_lock);
+
+    pg_maddr = addr_to_dma_page_maddr(d, bfn_to_baddr(bfn), 0);
+    if ( pg_maddr == 0 )
+    {
+        spin_unlock(&hd->arch.mapping_lock);
+        return -ENOMEM;
+    }
+
+    page = map_vtd_domain_page(pg_maddr);
+    pte = page + (bfn_x(bfn) & LEVEL_MASK);
+    val = *pte;
+
+    unmap_vtd_domain_page(page);
+    spin_unlock(&hd->arch.mapping_lock);
+
+    if ( !dma_pte_present(val) )
+        return -ENOENT;
+
+    *mfn = maddr_to_mfn(dma_pte_addr(val));
+    *flags = dma_pte_read(val) ? IOMMUF_readable : 0;
+    *flags |= dma_pte_write(val) ? IOMMUF_writable : 0;
+
+    return 0;
+}
+
 int iommu_pte_flush(struct domain *d, uint64_t bfn, uint64_t *pte,
                     int order, int present)
 {
@@ -2661,6 +2694,7 @@ const struct iommu_ops intel_iommu_ops = {
     .teardown = iommu_domain_teardown,
     .map_page = intel_iommu_map_page,
     .unmap_page = intel_iommu_unmap_page,
+    .lookup_page = intel_iommu_lookup_page,
     .free_page_table = iommu_free_page_table,
     .reassign_device = reassign_device_ownership,
     .get_device_group_id = intel_iommu_group_id,
diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index 72c1a2e3cd..47bdfcb5ea 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -272,6 +272,9 @@ struct dma_pte {
 #define dma_set_pte_prot(p, prot) do { \
         (p).val = ((p).val & ~DMA_PTE_PROT) | ((prot) & DMA_PTE_PROT); \
     } while (0)
+#define dma_pte_prot(p) ((p).val & DMA_PTE_PROT)
+#define dma_pte_read(p) (dma_pte_prot(p) & DMA_PTE_READ)
+#define dma_pte_write(p) (dma_pte_prot(p) & DMA_PTE_WRITE)
 #define dma_pte_addr(p) ((p).val & PADDR_MASK & PAGE_MASK_4K)
 #define dma_set_pte_addr(p, addr) do {\
             (p).val |= ((addr) & PAGE_MASK_4K); } while (0)
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index cc0be81b4e..7c5d46df81 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -100,6 +100,8 @@ void iommu_teardown(struct domain *d);
 int __must_check iommu_map_page(struct domain *d, bfn_t bfn,
                                 mfn_t mfn, unsigned int flags);
 int __must_check iommu_unmap_page(struct domain *d, bfn_t bfn);
+int __must_check iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
+                                   unsigned int *flags);
 
 enum iommu_feature
 {
@@ -198,6 +200,8 @@ struct iommu_ops {
     int __must_check (*map_page)(struct domain *d, bfn_t bfn, mfn_t mfn,
                                  unsigned int flags);
     int __must_check (*unmap_page)(struct domain *d, bfn_t bfn);
+    int __must_check (*lookup_page)(struct domain *d, bfn_t bfn, mfn_t *mfn,
+                                    unsigned int *flags);
     void (*free_page_table)(struct page_info *);
 #ifdef CONFIG_X86
     void (*update_ire_from_apic)(unsigned int apic, unsigned int reg, unsigned int value);
-- 
2.11.0



* [PATCH v5 10/15] mm / iommu: include need_iommu() test in iommu_use_hap_pt()
  2018-08-03 17:22 [PATCH v5 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (8 preceding siblings ...)
  2018-08-03 17:22 ` [PATCH v5 09/15] vtd: add lookup_page method to iommu_ops Paul Durrant
@ 2018-08-03 17:22 ` Paul Durrant
  2018-08-07  3:32   ` Tian, Kevin
  2018-08-03 17:22 ` [PATCH v5 11/15] mm / iommu: split need_iommu() into has_iommu_pt() and sync_iommu_pt() Paul Durrant
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2018-08-03 17:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Jun Nakajima, George Dunlap,
	Andrew Cooper, Julien Grall, Paul Durrant, Jan Beulich

The name 'iommu_use_hap_pt' suggests that the P2M table is in use as the
domain's IOMMU pagetable which, prior to this patch, was not strictly true
since the macro did not test whether the domain actually has IOMMU
mappings.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>

v4:
 - New in v4.
---
 xen/arch/x86/mm/p2m-ept.c       | 6 +++---
 xen/arch/x86/mm/p2m-pt.c        | 6 +++---
 xen/arch/x86/mm/p2m.c           | 2 +-
 xen/drivers/passthrough/iommu.c | 2 +-
 xen/include/asm-arm/iommu.h     | 2 +-
 xen/include/asm-x86/iommu.h     | 5 +++--
 6 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 33e77903d6..e52dbb6203 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -863,12 +863,12 @@ out:
         ept_sync_domain(p2m);
 
     /* For host p2m, may need to change VT-d page table.*/
-    if ( rc == 0 && p2m_is_hostp2m(p2m) && need_iommu(d) &&
+    if ( rc == 0 && p2m_is_hostp2m(p2m) &&
          need_modify_vtd_table )
     {
-        if ( iommu_hap_pt_share )
+        if ( iommu_use_hap_pt(d) )
             rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
-        else
+        else if ( need_iommu(d) )
         {
             bfn_t bfn = _bfn(gfn);
 
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 9ff0b3fe7a..9255194070 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -677,8 +677,8 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
          && (gfn + (1UL << page_order) - 1 > p2m->max_mapped_pfn) )
         p2m->max_mapped_pfn = gfn + (1UL << page_order) - 1;
 
-    if ( iommu_enabled && need_iommu(p2m->domain) &&
-         (iommu_old_flags != iommu_pte_flags || old_mfn != mfn_x(mfn)) )
+    if ( iommu_enabled && (iommu_old_flags != iommu_pte_flags ||
+                           old_mfn != mfn_x(mfn)) )
     {
         ASSERT(rc == 0);
 
@@ -687,7 +687,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
             if ( iommu_old_flags )
                 amd_iommu_flush_pages(p2m->domain, gfn, page_order);
         }
-        else
+        else if ( need_iommu(p2m->domain) )
         {
             bfn_t bfn = _bfn(gfn);
 
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 036f52f4f1..7465489074 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -2092,7 +2092,7 @@ static unsigned int mmio_order(const struct domain *d,
      * - exclude PV guests, should execution reach this code for such.
      * So be careful when altering this.
      */
-    if ( !need_iommu(d) || !iommu_use_hap_pt(d) ||
+    if ( !iommu_use_hap_pt(d) ||
          (start_fn & ((1UL << PAGE_ORDER_2M) - 1)) || !(nr >> PAGE_ORDER_2M) )
         return PAGE_ORDER_4K;
 
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 9b7baca93f..ad8ffcfcb2 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -453,7 +453,7 @@ int iommu_do_domctl(
 
 void iommu_share_p2m_table(struct domain* d)
 {
-    if ( iommu_enabled && iommu_use_hap_pt(d) )
+    if ( iommu_use_hap_pt(d) )
         iommu_get_ops()->share_p2m(d);
 }
 
diff --git a/xen/include/asm-arm/iommu.h b/xen/include/asm-arm/iommu.h
index 57d9b1e14a..8d1506c6f7 100644
--- a/xen/include/asm-arm/iommu.h
+++ b/xen/include/asm-arm/iommu.h
@@ -21,7 +21,7 @@ struct arch_iommu
 };
 
 /* Always share P2M Table between the CPU and the IOMMU */
-#define iommu_use_hap_pt(d) (1)
+#define iommu_use_hap_pt(d) (need_iommu(d))
 
 const struct iommu_ops *iommu_get_ops(void);
 void __init iommu_set_ops(const struct iommu_ops *ops);
diff --git a/xen/include/asm-x86/iommu.h b/xen/include/asm-x86/iommu.h
index 14ad0489a6..0e7d66be54 100644
--- a/xen/include/asm-x86/iommu.h
+++ b/xen/include/asm-x86/iommu.h
@@ -77,8 +77,9 @@ static inline int iommu_hardware_setup(void)
     return -ENODEV;
 }
 
-/* Does this domain have a P2M table we can use as its IOMMU pagetable? */
-#define iommu_use_hap_pt(d) (hap_enabled(d) && iommu_hap_pt_share)
+/* Are we using the domain P2M table as its IOMMU pagetable? */
+#define iommu_use_hap_pt(d) \
+    (hap_enabled(d) && need_iommu(d) && iommu_hap_pt_share)
 
 void iommu_update_ire_from_apic(unsigned int apic, unsigned int reg, unsigned int value);
 unsigned int iommu_read_apic_from_ire(unsigned int apic, unsigned int reg);
-- 
2.11.0



* [PATCH v5 11/15] mm / iommu: split need_iommu() into has_iommu_pt() and sync_iommu_pt()
  2018-08-03 17:22 [PATCH v5 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (9 preceding siblings ...)
  2018-08-03 17:22 ` [PATCH v5 10/15] mm / iommu: include need_iommu() test in iommu_use_hap_pt() Paul Durrant
@ 2018-08-03 17:22 ` Paul Durrant
  2018-08-03 18:18   ` Razvan Cojocaru
  2018-08-07  3:41   ` Tian, Kevin
  2018-08-03 17:22 ` [PATCH v5 12/15] x86: add iommu_op to enable modification of IOMMU mappings Paul Durrant
                   ` (3 subsequent siblings)
  14 siblings, 2 replies; 57+ messages in thread
From: Paul Durrant @ 2018-08-03 17:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
	Razvan Cojocaru, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, Paul Durrant, Tamas K Lengyel,
	Jan Beulich, Brian Woods, Suravee Suthikulpanit

The name 'need_iommu' is a little confusing: it suggests that a domain needs
to use the IOMMU but that something might not yet be set up, when in fact
the flag does not become true until IOMMU mappings for the domain have been
created.

Two different meanings are also inferred from it in various places in the
code:

- Some callers want to test whether a domain has IOMMU mappings
- Some callers want to test whether they need to synchronize the domain's
  P2M and IOMMU mappings

This patch therefore creates two new boolean flags, 'has_iommu_pt' and
'sync_iommu_pt' to convey those meanings separately and creates the
corresponding macros. All callers of need_iommu() are then modified to
use the macro appropriate to what they are trying to test.

NOTE: The test of need_iommu(d) in the AMD IOMMU initialization code has
      been replaced with a test of iommu_dom0_strict since this more
      accurately reflects the meaning of the test and brings it into
      line with a similar test in the Intel VT-d code.
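
To make the intended split concrete, a typical caller is expected to end up
looking like the sketch below (illustrative only, not a hunk from this
patch):

    if ( sync_iommu_pt(d) )
    {
        /* Xen maintains separate IOMMU page tables for this domain and
         * must mirror the P2M change into them. */
        rc = iommu_map_page(d, bfn, mfn, flags);
    }
    else if ( has_iommu_pt(d) )
    {
        /* IOMMU mappings exist but are not synchronized here, e.g.
         * because the P2M is shared with the IOMMU, or (after a later
         * patch) because the guest controls the mappings itself. */
    }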

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Tamas K Lengyel <tamas@tklengyel.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Brian Woods <brian.woods@amd.com>

v4:
 - New in v4.
---
 xen/arch/arm/p2m.c                          |  2 +-
 xen/arch/x86/hvm/mtrr.c                     |  5 +++--
 xen/arch/x86/mm.c                           |  2 +-
 xen/arch/x86/mm/mem_sharing.c               |  2 +-
 xen/arch/x86/mm/p2m-ept.c                   |  2 +-
 xen/arch/x86/mm/p2m-pt.c                    |  2 +-
 xen/arch/x86/mm/p2m.c                       |  8 ++++----
 xen/arch/x86/mm/paging.c                    |  2 +-
 xen/arch/x86/x86_64/mm.c                    |  3 ++-
 xen/common/memory.c                         |  6 +++---
 xen/common/vm_event.c                       |  2 +-
 xen/drivers/passthrough/amd/pci_amd_iommu.c |  2 +-
 xen/drivers/passthrough/device_tree.c       |  2 +-
 xen/drivers/passthrough/iommu.c             | 17 ++++++++++-------
 xen/drivers/passthrough/pci.c               |  6 +++---
 xen/include/asm-arm/grant_table.h           |  2 +-
 xen/include/asm-arm/iommu.h                 |  2 +-
 xen/include/asm-x86/grant_table.h           |  2 +-
 xen/include/asm-x86/iommu.h                 |  2 +-
 xen/include/xen/sched.h                     | 12 ++++++++----
 20 files changed, 46 insertions(+), 37 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index eb39861b73..c972b72baf 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -951,7 +951,7 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
     if ( lpae_valid(orig_pte) && entry->p2m.base != orig_pte.p2m.base )
         p2m_free_entry(p2m, orig_pte, level);
 
-    if ( need_iommu(p2m->domain) &&
+    if ( sync_iommu_pt(p2m->domain) &&
          (lpae_valid(orig_pte) || lpae_valid(*entry)) )
     {
         rc = iommu_iotlb_flush(p2m->domain, _bfn(gfn_x(sgfn)),
diff --git a/xen/arch/x86/hvm/mtrr.c b/xen/arch/x86/hvm/mtrr.c
index c502dda693..61131d4b61 100644
--- a/xen/arch/x86/hvm/mtrr.c
+++ b/xen/arch/x86/hvm/mtrr.c
@@ -824,7 +824,8 @@ HVM_REGISTER_SAVE_RESTORE(MTRR, hvm_save_mtrr_msr, hvm_load_mtrr_msr,
 
 void memory_type_changed(struct domain *d)
 {
-    if ( need_iommu(d) && d->vcpu && d->vcpu[0] )
+    if ( (sync_iommu_pt(d) || iommu_use_hap_pt(d)) &&
+         d->vcpu && d->vcpu[0] )
     {
         p2m_memory_type_changed(d);
         flush_all(FLUSH_CACHE);
@@ -872,7 +873,7 @@ int epte_get_entry_emt(struct domain *d, unsigned long gfn, mfn_t mfn,
         return MTRR_TYPE_UNCACHABLE;
     }
 
-    if ( !need_iommu(d) && !cache_flush_permitted(d) )
+    if ( !has_iommu_pt(d) && !cache_flush_permitted(d) )
     {
         *ipat = 1;
         return MTRR_TYPE_WRBACK;
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 08878574f3..462398096f 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2727,7 +2727,7 @@ static int _get_page_type(struct page_info *page, unsigned long type,
     {
         /* Special pages should not be accessible from devices. */
         struct domain *d = page_get_owner(page);
-        if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
+        if ( d && is_pv_domain(d) && unlikely(sync_iommu_pt(d)) )
         {
             mfn_t mfn = page_to_mfn(page);
 
diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index fad8a9df13..c54845275f 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -1610,7 +1610,7 @@ int mem_sharing_domctl(struct domain *d, struct xen_domctl_mem_sharing_op *mec)
         case XEN_DOMCTL_MEM_SHARING_CONTROL:
         {
             rc = 0;
-            if ( unlikely(need_iommu(d) && mec->u.enable) )
+            if ( unlikely(has_iommu_pt(d) && mec->u.enable) )
                 rc = -EXDEV;
             else
                 d->arch.hvm_domain.mem_sharing_enabled = mec->u.enable;
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index e52dbb6203..d90424a9a3 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -868,7 +868,7 @@ out:
     {
         if ( iommu_use_hap_pt(d) )
             rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
-        else if ( need_iommu(d) )
+        else if ( sync_iommu_pt(d) )
         {
             bfn_t bfn = _bfn(gfn);
 
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 9255194070..e0ae6bf20f 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -687,7 +687,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
             if ( iommu_old_flags )
                 amd_iommu_flush_pages(p2m->domain, gfn, page_order);
         }
-        else if ( need_iommu(p2m->domain) )
+        else if ( sync_iommu_pt(p2m->domain) )
         {
             bfn_t bfn = _bfn(gfn);
 
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 7465489074..4ffce37c45 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -713,7 +713,7 @@ p2m_remove_page(struct p2m_domain *p2m, unsigned long gfn_l, unsigned long mfn,
     {
         int rc = 0;
 
-        if ( need_iommu(p2m->domain) )
+        if ( sync_iommu_pt(p2m->domain) )
         {
             bfn_t bfn = _bfn(mfn);
 
@@ -777,7 +777,7 @@ guest_physmap_add_entry(struct domain *d, gfn_t gfn, mfn_t mfn,
 
     if ( !paging_mode_translate(d) )
     {
-        if ( need_iommu(d) && t == p2m_ram_rw )
+        if ( sync_iommu_pt(d) && t == p2m_ram_rw )
         {
             bfn_t bfn = _bfn(mfn_x(mfn));
 
@@ -1165,7 +1165,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l,
 
     if ( !paging_mode_translate(d) )
     {
-        if ( !need_iommu(d) )
+        if ( !sync_iommu_pt(d) )
             return 0;
 
         ret = iommu_map_page(d, _bfn(gfn_l), _mfn(gfn_l),
@@ -1261,7 +1261,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l)
 
     if ( !paging_mode_translate(d) )
     {
-        if ( !need_iommu(d) )
+        if ( !sync_iommu_pt(d) )
             return 0;
 
         ret = iommu_unmap_page(d, _bfn(gfn_l));
diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
index 2b0445ffe9..d54f6dd496 100644
--- a/xen/arch/x86/mm/paging.c
+++ b/xen/arch/x86/mm/paging.c
@@ -213,7 +213,7 @@ int paging_log_dirty_enable(struct domain *d, bool_t log_global)
 {
     int ret;
 
-    if ( need_iommu(d) && log_global )
+    if ( has_iommu_pt(d) && log_global )
     {
         /*
          * Refuse to turn on global log-dirty mode
diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index cc58e4cef4..29c3050a7f 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -1426,7 +1426,8 @@ int memory_add(unsigned long spfn, unsigned long epfn, unsigned int pxm)
     if ( ret )
         goto destroy_m2p;
 
-    if ( iommu_enabled && !iommu_passthrough && !need_iommu(hardware_domain) )
+    if ( iommu_enabled && !iommu_passthrough &&
+         !sync_iommu_pt(hardware_domain) )
     {
         for ( i = spfn; i < epfn; i++ )
             if ( iommu_map_page(hardware_domain, _bfn(i), _mfn(i),
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 0a34677cc3..ad7aa09a5c 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -804,8 +804,8 @@ int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap *xatp,
     xatp->size -= start;
 
 #ifdef CONFIG_HAS_PASSTHROUGH
-    if ( need_iommu(d) )
-        this_cpu(iommu_dont_flush_iotlb) = 1;
+    if ( sync_iommu_pt(d) || iommu_use_hap_pt(d) )
+       this_cpu(iommu_dont_flush_iotlb) = 1;
 #endif
 
     while ( xatp->size > done )
@@ -827,7 +827,7 @@ int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap *xatp,
     }
 
 #ifdef CONFIG_HAS_PASSTHROUGH
-    if ( need_iommu(d) )
+    if ( sync_iommu_pt(d) || iommu_use_hap_pt(d) )
     {
         int ret;
 
diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
index 144ab81c86..ec3dfd1dae 100644
--- a/xen/common/vm_event.c
+++ b/xen/common/vm_event.c
@@ -644,7 +644,7 @@ int vm_event_domctl(struct domain *d, struct xen_domctl_vm_event_op *vec,
 
             /* No paging if iommu is used */
             rc = -EMLINK;
-            if ( unlikely(need_iommu(d)) )
+            if ( unlikely(has_iommu_pt(d)) )
                 break;
 
             rc = -EXDEV;
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index eea22c3d0d..752751b103 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -256,7 +256,7 @@ static void __hwdom_init amd_iommu_hwdom_init(struct domain *d)
     if ( allocate_domain_resources(dom_iommu(d)) )
         BUG();
 
-    if ( !iommu_passthrough && !need_iommu(d) )
+    if ( !iommu_passthrough && !iommu_dom0_strict )
     {
         int rc = 0;
 
diff --git a/xen/drivers/passthrough/device_tree.c b/xen/drivers/passthrough/device_tree.c
index 671c8db1d0..1aae1a1f93 100644
--- a/xen/drivers/passthrough/device_tree.c
+++ b/xen/drivers/passthrough/device_tree.c
@@ -44,7 +44,7 @@ int iommu_assign_dt_device(struct domain *d, struct dt_device_node *dev)
      * The hwdom is forced to use IOMMU for protecting assigned
      * device. Therefore the IOMMU data is already set up.
      */
-    ASSERT(!is_hardware_domain(d) || need_iommu(d));
+    ASSERT(!is_hardware_domain(d) || has_iommu_pt(d));
     rc = iommu_construct(d);
     if ( rc )
         goto fail;
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index ad8ffcfcb2..caf3d125ae 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -179,8 +179,9 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
         return;
 
     register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m table", 0);
-    d->need_iommu = !!iommu_dom0_strict;
-    if ( need_iommu(d) && !iommu_use_hap_pt(d) )
+    d->has_iommu_pt = true;
+    d->sync_iommu_pt = !!iommu_dom0_strict && !iommu_use_hap_pt(d);
+    if ( sync_iommu_pt(d) )
     {
         struct page_info *page;
         unsigned int i = 0;
@@ -219,14 +220,14 @@ void iommu_teardown(struct domain *d)
 {
     const struct domain_iommu *hd = dom_iommu(d);
 
-    d->need_iommu = false;
+    d->has_iommu_pt = false;
     hd->platform_ops->teardown(d);
     tasklet_schedule(&iommu_pt_cleanup_tasklet);
 }
 
 int iommu_construct(struct domain *d)
 {
-    if ( need_iommu(d) )
+    if ( has_iommu_pt(d) )
         return 0;
 
     if ( !iommu_use_hap_pt(d) )
@@ -236,12 +237,14 @@ int iommu_construct(struct domain *d)
         rc = arch_iommu_populate_page_table(d);
         if ( rc )
             return rc;
+
+        d->sync_iommu_pt = true;
     }
 
-    d->need_iommu = true;
+    d->has_iommu_pt = true;
     /*
      * There may be dirty cache lines when a device is assigned
-     * and before need_iommu(d) becoming true, this will cause
+     * and before has_iommu_pt(d) becoming true, this will cause
      * memory_type_changed lose effect if memory type changes.
      * Call memory_type_changed here to amend this.
      */
@@ -500,7 +503,7 @@ static void iommu_dump_p2m_table(unsigned char key)
     ops = iommu_get_ops();
     for_each_domain(d)
     {
-        if ( is_hardware_domain(d) || !need_iommu(d) )
+        if ( is_hardware_domain(d) || !has_iommu_pt(d) )
             continue;
 
         if ( iommu_use_hap_pt(d) )
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index c4890a4295..3d3ad484e7 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1416,7 +1416,7 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag)
 
     /* Prevent device assign if mem paging or mem sharing have been 
      * enabled for this domain */
-    if ( unlikely(!need_iommu(d) &&
+    if ( unlikely(!has_iommu_pt(d) &&
             (d->arch.hvm_domain.mem_sharing_enabled ||
              vm_event_check_ring(d->vm_event_paging) ||
              p2m_get_hostp2m(d)->global_logdirty)) )
@@ -1460,7 +1460,7 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag)
     }
 
  done:
-    if ( !has_arch_pdevs(d) && need_iommu(d) )
+    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
         iommu_teardown(d);
     pcidevs_unlock();
 
@@ -1510,7 +1510,7 @@ int deassign_device(struct domain *d, u16 seg, u8 bus, u8 devfn)
 
     pdev->fault.count = 0;
 
-    if ( !has_arch_pdevs(d) && need_iommu(d) )
+    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
         iommu_teardown(d);
 
     return ret;
diff --git a/xen/include/asm-arm/grant_table.h b/xen/include/asm-arm/grant_table.h
index 9c2c815526..f01c5e36c9 100644
--- a/xen/include/asm-arm/grant_table.h
+++ b/xen/include/asm-arm/grant_table.h
@@ -86,7 +86,7 @@ static inline unsigned int gnttab_dom0_max(void)
     gfn_x(((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])
 
 #define gnttab_need_iommu_mapping(d)                    \
-    (is_domain_direct_mapped(d) && need_iommu(d))
+    (is_domain_direct_mapped(d) && sync_iommu_pt(d))
 
 #endif /* __ASM_GRANT_TABLE_H__ */
 /*
diff --git a/xen/include/asm-arm/iommu.h b/xen/include/asm-arm/iommu.h
index 8d1506c6f7..f6df32f860 100644
--- a/xen/include/asm-arm/iommu.h
+++ b/xen/include/asm-arm/iommu.h
@@ -21,7 +21,7 @@ struct arch_iommu
 };
 
 /* Always share P2M Table between the CPU and the IOMMU */
-#define iommu_use_hap_pt(d) (need_iommu(d))
+#define iommu_use_hap_pt(d) (has_iommu_pt(d))
 
 const struct iommu_ops *iommu_get_ops(void);
 void __init iommu_set_ops(const struct iommu_ops *ops);
diff --git a/xen/include/asm-x86/grant_table.h b/xen/include/asm-x86/grant_table.h
index 76ec5dda2c..80e64b74b7 100644
--- a/xen/include/asm-x86/grant_table.h
+++ b/xen/include/asm-x86/grant_table.h
@@ -99,6 +99,6 @@ static inline void gnttab_clear_flag(unsigned int nr, uint16_t *st)
 #define gnttab_release_host_mappings(domain) ( paging_mode_external(domain) )
 
 #define gnttab_need_iommu_mapping(d)                \
-    (!paging_mode_translate(d) && need_iommu(d))
+    (!paging_mode_translate(d) && sync_iommu_pt(d))
 
 #endif /* __ASM_GRANT_TABLE_H__ */
diff --git a/xen/include/asm-x86/iommu.h b/xen/include/asm-x86/iommu.h
index 0e7d66be54..291859520a 100644
--- a/xen/include/asm-x86/iommu.h
+++ b/xen/include/asm-x86/iommu.h
@@ -79,7 +79,7 @@ static inline int iommu_hardware_setup(void)
 
 /* Are we using the domain P2M table as its IOMMU pagetable? */
 #define iommu_use_hap_pt(d) \
-    (hap_enabled(d) && need_iommu(d) && iommu_hap_pt_share)
+    (hap_enabled(d) && has_iommu_pt(d) && iommu_hap_pt_share)
 
 void iommu_update_ire_from_apic(unsigned int apic, unsigned int reg, unsigned int value);
 unsigned int iommu_read_apic_from_ire(unsigned int apic, unsigned int reg);
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 5b9820e4e1..66da6d6b23 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -372,8 +372,10 @@ struct domain
 #ifdef CONFIG_HAS_PASSTHROUGH
     struct domain_iommu iommu;
 
-    /* Does this guest need iommu mappings? */
-    bool             need_iommu;
+    /* Does this guest have iommu mappings? */
+    bool             has_iommu_pt;
+    /* Does this guest need synchronization of iommu mappings? */
+    bool             sync_iommu_pt;
 #endif
     /* is node-affinity automatically computed? */
     bool             auto_node_affinity;
@@ -894,9 +896,11 @@ void watchdog_domain_destroy(struct domain *d);
 #define is_pinned_vcpu(v) ((v)->domain->is_pinned || \
                            cpumask_weight((v)->cpu_hard_affinity) == 1)
 #ifdef CONFIG_HAS_PASSTHROUGH
-#define need_iommu(d)    ((d)->need_iommu)
+#define has_iommu_pt(d) ((d)->has_iommu_pt)
+#define sync_iommu_pt(d) ((d)->sync_iommu_pt)
 #else
-#define need_iommu(d)    (0)
+#define has_iommu_pt(d) (0)
+#define sync_iommu_pt(d) (0)
 #endif
 
 static inline bool is_vcpu_online(const struct vcpu *v)
-- 
2.11.0



* [PATCH v5 12/15] x86: add iommu_op to enable modification of IOMMU mappings
  2018-08-03 17:22 [PATCH v5 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (10 preceding siblings ...)
  2018-08-03 17:22 ` [PATCH v5 11/15] mm / iommu: split need_iommu() into has_iommu_pt() and sync_iommu_pt() Paul Durrant
@ 2018-08-03 17:22 ` Paul Durrant
  2018-08-07  4:08   ` Tian, Kevin
  2018-08-03 17:22 ` [PATCH v5 13/15] memory: add get_paged_gfn() as a wrapper Paul Durrant
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2018-08-03 17:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Paul Durrant, Jan Beulich

This patch adds an iommu_op which checks whether it is possible or
safe for a domain to modify its own IOMMU mappings and, if so, creates
a rangeset to track modifications.

NOTE: The actual map and unmap operations are introduced by subsequent
      patches.
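
For illustration, the expected guest-side sequence (a sketch only; the
helpers named in comments are hypothetical and the map/unmap ops referred
to are added by subsequent patches) is to query the reserved ranges first
and only then enable modification, since from that point Xen no longer
synchronizes the IOMMU mappings with the P2M:

    struct xen_iommu_op op = { .op = XEN_IOMMUOP_enable_modification };

    query_reserved_ranges();  /* hypothetical, as per the earlier patch */
    iommu_op_one(&op);        /* hypothetical wrapper around the hypercall */

    if ( op.status == -EACCES )
        /* mappings are shared or fixed; PV-IOMMU cannot be used */;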

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

v4:
 - Set sync_iommu_pt to false instead of need_iommu.

v2:
 - New in v2.
---
 xen/arch/x86/iommu_op.c         | 42 +++++++++++++++++++++++++++++++++++++++++
 xen/drivers/passthrough/iommu.c |  2 +-
 xen/drivers/passthrough/pci.c   |  4 ++--
 xen/include/public/iommu_op.h   |  6 ++++++
 xen/include/xen/iommu.h         |  3 +++
 5 files changed, 54 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/iommu_op.c b/xen/arch/x86/iommu_op.c
index bcfcd49102..b29547bffd 100644
--- a/xen/arch/x86/iommu_op.c
+++ b/xen/arch/x86/iommu_op.c
@@ -78,6 +78,42 @@ static int iommu_op_query_reserved(struct xen_iommu_op_query_reserved *op)
     return 0;
 }
 
+static int iommu_op_enable_modification(void)
+{
+    struct domain *currd = current->domain;
+    struct domain_iommu *iommu = dom_iommu(currd);
+    const struct iommu_ops *ops = iommu->platform_ops;
+
+    /* Has modification already been enabled? */
+    if ( iommu->iommu_op_ranges )
+        return 0;
+
+    /*
+     * The IOMMU mappings cannot be modified if:
+     * - the IOMMU is not enabled or,
+     * - the current domain is dom0 and translation is disabled or,
+     * - HAP is enabled and the IOMMU shares the mappings.
+     */
+    if ( !iommu_enabled ||
+         (is_hardware_domain(currd) && iommu_passthrough) ||
+         iommu_use_hap_pt(currd) )
+        return -EACCES;
+
+    /*
+     * The IOMMU implementation must provide the lookup method if
+     * modification of the mappings is to be supported.
+     */
+    if ( !ops->lookup_page )
+        return -EOPNOTSUPP;
+
+    iommu->iommu_op_ranges = rangeset_new(currd, NULL, 0);
+    if ( !iommu->iommu_op_ranges )
+        return -ENOMEM;
+
+    currd->sync_iommu_pt = 0; /* Disable synchronization */
+    return 0;
+}
+
 static void iommu_op(xen_iommu_op_t *op)
 {
     switch ( op->op )
@@ -86,6 +122,10 @@ static void iommu_op(xen_iommu_op_t *op)
         op->status = iommu_op_query_reserved(&op->u.query_reserved);
         break;
 
+    case XEN_IOMMUOP_enable_modification:
+        op->status = iommu_op_enable_modification();
+        break;
+
     default:
         op->status = -EOPNOTSUPP;
         break;
@@ -98,6 +138,7 @@ int do_one_iommu_op(xen_iommu_op_buf_t *buf)
     size_t offset;
     static const size_t op_size[] = {
         [XEN_IOMMUOP_query_reserved] = sizeof(struct xen_iommu_op_query_reserved),
+        [XEN_IOMMUOP_enable_modification] = 0,
     };
     size_t size;
     int rc;
@@ -184,6 +225,7 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
     size_t offset;
     static const size_t op_size[] = {
         [XEN_IOMMUOP_query_reserved] = sizeof(struct compat_iommu_op_query_reserved),
+        [XEN_IOMMUOP_enable_modification] = 0,
     };
     size_t size;
     xen_iommu_op_t nat;
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index caf3d125ae..8f635a5cdb 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -26,7 +26,6 @@ static void iommu_dump_p2m_table(unsigned char key);
 
 unsigned int __read_mostly iommu_dev_iotlb_timeout = 1000;
 integer_param("iommu_dev_iotlb_timeout", iommu_dev_iotlb_timeout);
-
 /*
  * The 'iommu' parameter enables the IOMMU.  Optional comma separated
  * value may contain:
@@ -265,6 +264,7 @@ void iommu_domain_destroy(struct domain *d)
     arch_iommu_domain_destroy(d);
 
     rangeset_destroy(hd->reserved_ranges);
+    rangeset_destroy(hd->iommu_op_ranges);
 }
 
 int iommu_map_page(struct domain *d, bfn_t bfn, mfn_t mfn,
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 3d3ad484e7..d4033af41a 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1460,7 +1460,7 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag)
     }
 
  done:
-    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
+    if ( !has_arch_pdevs(d) && has_iommu_pt(d) && !hd->iommu_op_ranges )
         iommu_teardown(d);
     pcidevs_unlock();
 
@@ -1510,7 +1510,7 @@ int deassign_device(struct domain *d, u16 seg, u8 bus, u8 devfn)
 
     pdev->fault.count = 0;
 
-    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
+    if ( !has_arch_pdevs(d) && has_iommu_pt(d) && !hd->iommu_op_ranges )
         iommu_teardown(d);
 
     return ret;
diff --git a/xen/include/public/iommu_op.h b/xen/include/public/iommu_op.h
index ade404a877..9bf74bd007 100644
--- a/xen/include/public/iommu_op.h
+++ b/xen/include/public/iommu_op.h
@@ -61,6 +61,12 @@ struct xen_iommu_op_query_reserved {
     XEN_GUEST_HANDLE(xen_iommu_reserved_range_t) ranges;
 };
 
+/*
+ * XEN_IOMMUOP_enable_modification: Enable operations that modify IOMMU
+ *                                  mappings.
+ */
+#define XEN_IOMMUOP_enable_modification 2
+
 struct xen_iommu_op {
     uint16_t op;    /* op type */
     uint16_t pad;
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 7c5d46df81..08b163cbcb 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -130,6 +130,9 @@ struct domain_iommu {
      * must not be modified after initialization.
      */
     struct rangeset *reserved_ranges;
+
+    /* Ranges under the control of iommu_op */
+    struct rangeset *iommu_op_ranges;
 };
 
 #define dom_iommu(d)              (&(d)->iommu)
-- 
2.11.0



* [PATCH v5 13/15] memory: add get_paged_gfn() as a wrapper...
  2018-08-03 17:22 [PATCH v5 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (11 preceding siblings ...)
  2018-08-03 17:22 ` [PATCH v5 12/15] x86: add iommu_op to enable modification of IOMMU mappings Paul Durrant
@ 2018-08-03 17:22 ` Paul Durrant
  2018-08-03 17:22 ` [PATCH v5 14/15] x86: add iommu_ops to modify and flush IOMMU mappings Paul Durrant
  2018-08-03 17:22 ` [PATCH v5 15/15] x86: extend the map and unmap iommu_ops to support grant references Paul Durrant
  14 siblings, 0 replies; 57+ messages in thread
From: Paul Durrant @ 2018-08-03 17:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Paul Durrant, Jan Beulich

...for some uses of get_page_from_gfn().

There are many occurrences of the following pattern in the code:

    q = <readonly look-up> ? P2M_ALLOC : P2M_UNSHARE;
    page = get_page_from_gfn(d, gfn, &p2mt, q);

    if ( p2m_is_paging(p2mt) )
    {
        if ( page )
            put_page(page);

        p2m_mem_paging_populate(d, gfn);
        return <-EAGAIN or equivalent>;
    }

    if ( (q & P2M_UNSHARE) && p2m_is_shared(p2mt) )
    {
        if ( page )
            put_page(page);

        return <-EAGAIN or equivalent>;
    }

    if ( !page )
        return <-EINVAL or equivalent>;

    if ( !p2m_is_ram(p2mt) ||
         (!<readonly look-up> && p2m_is_readonly(p2mt)) )
    {
        put_page(page);
        return <-EINVAL or equivalent>;
    }

There are some small differences in exactly how the occurrences are coded,
but the desired semantics are the same.

This patch introduces a new common implementation of this code in
get_paged_gfn() and then converts the various open-coded patterns into
calls to this new function.
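
For illustration, a converted call site then collapses to something like the
following (a sketch only; the exact error values returned vary per caller, as
can be seen in the hunks below):

    switch ( get_paged_gfn(d, gfn, <readonly look-up>, &p2mt, &page) )
    {
    case -EAGAIN:
        return <-EAGAIN or equivalent>;
    case -EINVAL:
        return <-EINVAL or equivalent>;
    default:
        ASSERT_UNREACHABLE();
    case 0:
        break;
    }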

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

v3:
 - Addressed comments from George.

v2:
 - New in v2.
---
 xen/arch/x86/hvm/emulate.c | 32 ++++++--------------------
 xen/arch/x86/hvm/hvm.c     | 16 ++-----------
 xen/common/grant_table.c   | 38 +++++++++----------------------
 xen/common/memory.c        | 56 +++++++++++++++++++++++++++++++++++++---------
 xen/include/asm-arm/p2m.h  |  3 +++
 xen/include/asm-x86/p2m.h  |  2 ++
 6 files changed, 71 insertions(+), 76 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 8385c62145..c26281ea1c 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -332,34 +332,16 @@ static int hvmemul_do_io_buffer(
 
 static int hvmemul_acquire_page(unsigned long gmfn, struct page_info **page)
 {
-    struct domain *curr_d = current->domain;
-    p2m_type_t p2mt;
-
-    *page = get_page_from_gfn(curr_d, gmfn, &p2mt, P2M_UNSHARE);
-
-    if ( *page == NULL )
-        return X86EMUL_UNHANDLEABLE;
-
-    if ( p2m_is_paging(p2mt) )
-    {
-        put_page(*page);
-        p2m_mem_paging_populate(curr_d, gmfn);
-        return X86EMUL_RETRY;
-    }
-
-    if ( p2m_is_shared(p2mt) )
+    switch ( get_paged_gfn(current->domain, _gfn(gmfn), false, NULL, page) )
     {
-        put_page(*page);
+    case -EAGAIN:
         return X86EMUL_RETRY;
-    }
-
-    /* This code should not be reached if the gmfn is not RAM */
-    if ( p2m_is_mmio(p2mt) )
-    {
-        domain_crash(curr_d);
-
-        put_page(*page);
+    case -EINVAL:
         return X86EMUL_UNHANDLEABLE;
+    default:
+        ASSERT_UNREACHABLE();
+    case 0:
+        break;
     }
 
     return X86EMUL_OKAY;
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 72c51faecb..03430e6f07 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2557,24 +2557,12 @@ static void *_hvm_map_guest_frame(unsigned long gfn, bool_t permanent,
                                   bool_t *writable)
 {
     void *map;
-    p2m_type_t p2mt;
     struct page_info *page;
     struct domain *d = current->domain;
+    p2m_type_t p2mt;
 
-    page = get_page_from_gfn(d, gfn, &p2mt,
-                             writable ? P2M_UNSHARE : P2M_ALLOC);
-    if ( (p2m_is_shared(p2mt) && writable) || !page )
-    {
-        if ( page )
-            put_page(page);
-        return NULL;
-    }
-    if ( p2m_is_paging(p2mt) )
-    {
-        put_page(page);
-        p2m_mem_paging_populate(d, gfn);
+    if ( get_paged_gfn(d, _gfn(gfn), !writable, &p2mt, &page) )
         return NULL;
-    }
 
     if ( writable )
     {
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 1840b656c9..cc7080bc98 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -366,39 +366,23 @@ static int get_paged_frame(unsigned long gfn, mfn_t *mfn,
                            struct page_info **page, bool readonly,
                            struct domain *rd)
 {
-    int rc = GNTST_okay;
-    p2m_type_t p2mt;
-
-    *mfn = INVALID_MFN;
-    *page = get_page_from_gfn(rd, gfn, &p2mt,
-                              readonly ? P2M_ALLOC : P2M_UNSHARE);
-    if ( !*page )
-    {
-#ifdef P2M_SHARED_TYPES
-        if ( p2m_is_shared(p2mt) )
-            return GNTST_eagain;
-#endif
-#ifdef P2M_PAGES_TYPES
-        if ( p2m_is_paging(p2mt) )
-        {
-            p2m_mem_paging_populate(rd, gfn);
-            return GNTST_eagain;
-        }
-#endif
-        return GNTST_bad_page;
-    }
+    int rc;
 
-    if ( p2m_is_foreign(p2mt) )
+    rc = get_paged_gfn(rd, _gfn(gfn), readonly, NULL, page);
+    switch ( rc )
     {
-        put_page(*page);
-        *page = NULL;
-
+    case -EAGAIN:
+        return GNTST_eagain;
+    case -EINVAL:
         return GNTST_bad_page;
+    default:
+        ASSERT_UNREACHABLE();
+    case 0:
+        break;
     }
 
     *mfn = page_to_mfn(*page);
-
-    return rc;
+    return GNTST_okay;
 }
 
 static inline void
diff --git a/xen/common/memory.c b/xen/common/memory.c
index ad7aa09a5c..4c34e6c2d9 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -1571,37 +1571,73 @@ void destroy_ring_for_helper(
     }
 }
 
-int prepare_ring_for_helper(
-    struct domain *d, unsigned long gmfn, struct page_info **_page,
-    void **_va)
+/*
+ * Acquire a pointer to struct page_info for a specified domain and GFN,
+ * checking whether the page has been paged out, or needs unsharing.
+ * If the function succeeds then zero is returned and page_p is written
+ * with a pointer to the struct page_info with a reference taken. The
+ * caller is responsible for dropping the reference. If p2mt_p is non-NULL
+ * then it is also written with the P2M type of the page.
+ * If the function fails then an appropriate errno is returned and the
+ * values referenced by page_p and p2mt_p are undefined.
+ */
+int get_paged_gfn(struct domain *d, gfn_t gfn, bool readonly,
+                  p2m_type_t *p2mt_p, struct page_info **page_p)
 {
-    struct page_info *page;
+    p2m_query_t q = readonly ? P2M_ALLOC : P2M_UNSHARE;
     p2m_type_t p2mt;
-    void *va;
+    struct page_info *page;
 
-    page = get_page_from_gfn(d, gmfn, &p2mt, P2M_UNSHARE);
+    page = get_page_from_gfn(d, gfn_x(gfn), &p2mt, q);
 
 #ifdef CONFIG_HAS_MEM_PAGING
     if ( p2m_is_paging(p2mt) )
     {
         if ( page )
             put_page(page);
-        p2m_mem_paging_populate(d, gmfn);
-        return -ENOENT;
+
+        p2m_mem_paging_populate(d, gfn_x(gfn));
+        return -EAGAIN;
     }
 #endif
 #ifdef CONFIG_HAS_MEM_SHARING
-    if ( p2m_is_shared(p2mt) )
+    if ( (q & P2M_UNSHARE) && p2m_is_shared(p2mt) )
     {
         if ( page )
             put_page(page);
-        return -ENOENT;
+
+        return -EAGAIN;
     }
 #endif
 
     if ( !page )
         return -EINVAL;
 
+    if ( !p2m_is_ram(p2mt) || (!readonly && p2m_is_readonly(p2mt)) )
+    {
+        put_page(page);
+        return -EINVAL;
+    }
+
+    if ( p2mt_p )
+        *p2mt_p = p2mt;
+
+    *page_p = page;
+    return 0;
+}
+
+int prepare_ring_for_helper(
+    struct domain *d, unsigned long gmfn, struct page_info **_page,
+    void **_va)
+{
+    struct page_info *page;
+    void *va;
+    int rc;
+
+    rc = get_paged_gfn(d, _gfn(gmfn), false, NULL, &page);
+    if ( rc )
+        return (rc == -EAGAIN) ? -ENOENT : rc;
+
     if ( !get_page_type(page, PGT_writable_page) )
     {
         put_page(page);
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 8823707c17..a39a4faabd 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -303,6 +303,9 @@ static inline struct page_info *get_page_from_gfn(
     return page;
 }
 
+int get_paged_gfn(struct domain *d, gfn_t gfn, bool readonly,
+                  p2m_type_t *p2mt_p, struct page_info **page_p);
+
 int get_page_type(struct page_info *page, unsigned long type);
 bool is_iomem_page(mfn_t mfn);
 static inline int get_page_and_type(struct page_info *page,
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index d4b3cfcb6e..e890bcd3e1 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -492,6 +492,8 @@ static inline struct page_info *get_page_from_gfn(
     return mfn_valid(_mfn(gfn)) && get_page(page, d) ? page : NULL;
 }
 
+int get_paged_gfn(struct domain *d, gfn_t gfn, bool readonly,
+                  p2m_type_t *p2mt_p, struct page_info **page_p);
 
 /* General conversion function from mfn to gfn */
 static inline unsigned long mfn_to_gfn(struct domain *d, mfn_t mfn)
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v5 14/15] x86: add iommu_ops to modify and flush IOMMU mappings
  2018-08-03 17:22 [PATCH v5 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (12 preceding siblings ...)
  2018-08-03 17:22 ` [PATCH v5 13/15] memory: add get_paged_gfn() as a wrapper Paul Durrant
@ 2018-08-03 17:22 ` Paul Durrant
  2018-08-03 17:22 ` [PATCH v5 15/15] x86: extend the map and unmap iommu_ops to support grant references Paul Durrant
  14 siblings, 0 replies; 57+ messages in thread
From: Paul Durrant @ 2018-08-03 17:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Paul Durrant, Jan Beulich

This patch adds iommu_ops to add (map) or remove (unmap) frames in the
domain's IOMMU mappings, and an iommu_op to synchronize (flush) those
manipulations with the hardware.

Mappings added by the map operation are tracked and only those mappings
may be removed by a subsequent unmap operation. Frames are specified by the
owning domain and GFN. It is, of course, permissible for a domain to map
its own frames using DOMID_SELF.
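
For illustration only, a domain might map one of its own frames and then
flush the IOMMU TLB roughly as follows. Note that issue_iommu_op() is a
hypothetical stand-in for whatever guest-side wrapper marshals the op into a
xen_iommu_op_buf_t and issues the hypercall; it is not part of this series,
and bfn/gfn are simply frame numbers chosen by the caller:

    xen_iommu_op_t op = {
        .op = XEN_IOMMUOP_map,
        .u.map = {
            .domid = DOMID_SELF,
            .flags = 0,     /* read/write mapping */
            .bfn = bfn,     /* IOMMU frame to populate */
            .gfn = gfn,     /* caller-owned frame to map */
        },
    };
    int rc = issue_iommu_op(&op); /* hypothetical guest-side wrapper */

    if ( !rc )
        rc = op.status;

    if ( !rc )
    {
        /* make the new mapping visible to the hardware */
        op = (xen_iommu_op_t){ .op = XEN_IOMMUOP_flush };
        rc = issue_iommu_op(&op);
    }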

NOTE: The owning domain and GFN must also be specified in the unmap
      operation, as well as the BFN, so that they can be cross-checked
      with the existing mapping.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

v4:
 - Fixed logic inversion when checking return of iommu_unmap_page().

v3:
 - Add type pinning.

v2:
 - Heavily re-worked in v2, including explicit tracking of mappings.
   This avoids the need to clear non-reserved mappings from IOMMU
   at start of day, which would be prohibitively slow on a large host.
---
 xen/arch/x86/iommu_op.c       | 151 ++++++++++++++++++++++++++++++++++++++++++
 xen/include/public/iommu_op.h |  43 ++++++++++++
 xen/include/xlat.lst          |   2 +
 3 files changed, 196 insertions(+)

diff --git a/xen/arch/x86/iommu_op.c b/xen/arch/x86/iommu_op.c
index b29547bffd..35daeeed92 100644
--- a/xen/arch/x86/iommu_op.c
+++ b/xen/arch/x86/iommu_op.c
@@ -114,6 +114,131 @@ static int iommu_op_enable_modification(void)
     return 0;
 }
 
+static int iommuop_map(struct xen_iommu_op_map *op)
+{
+    struct domain *d, *currd = current->domain;
+    struct domain_iommu *iommu = dom_iommu(currd);
+    bool readonly = op->flags & XEN_IOMMUOP_map_readonly;
+    bfn_t bfn = _bfn(op->bfn);
+    struct page_info *page;
+    unsigned int prot;
+    int rc, ignore;
+
+    if ( op->pad || (op->flags & ~XEN_IOMMUOP_map_readonly) )
+        return -EINVAL;
+
+    if ( !iommu->iommu_op_ranges )
+        return -EOPNOTSUPP;
+
+    /* Check whether the specified BFN falls in a reserved region */
+    if ( rangeset_contains_singleton(iommu->reserved_ranges, bfn_x(bfn)) )
+        return -EINVAL;
+
+    d = rcu_lock_domain_by_any_id(op->domid);
+    if ( !d )
+        return -ESRCH;
+
+    rc = get_paged_gfn(d, _gfn(op->gfn), readonly, NULL, &page);
+    if ( rc )
+        goto unlock;
+
+    rc = -EINVAL;
+    if ( !readonly && !get_page_type(page, PGT_writable_page) )
+    {
+        put_page(page);
+        goto unlock;
+    }
+
+    prot = IOMMUF_readable;
+    if ( !readonly )
+        prot |= IOMMUF_writable;
+
+    rc = -EIO;
+    if ( iommu_map_page(currd, bfn, page_to_mfn(page), prot) )
+        goto release;
+
+    rc = rangeset_add_singleton(iommu->iommu_op_ranges, bfn_x(bfn));
+    if ( rc )
+        goto unmap;
+
+    rc = 0;
+    goto unlock; /* retain mapping and references */
+
+ unmap:
+    ignore = iommu_unmap_page(currd, bfn);
+
+ release:
+    if ( !readonly )
+        put_page_type(page);
+    put_page(page);
+
+ unlock:
+    rcu_unlock_domain(d);
+    return rc;
+}
+
+static int iommuop_unmap(struct xen_iommu_op_unmap *op)
+{
+    struct domain *d, *currd = current->domain;
+    struct domain_iommu *iommu = dom_iommu(currd);
+    bfn_t bfn = _bfn(op->bfn);
+    mfn_t mfn;
+    bool readonly;
+    unsigned int prot;
+    struct page_info *page;
+    int rc;
+
+    if ( op->pad0 || op->pad1 )
+        return -EINVAL;
+
+    if ( !iommu->iommu_op_ranges )
+        return -EOPNOTSUPP;
+
+    if ( !rangeset_contains_singleton(iommu->iommu_op_ranges, bfn_x(bfn)) ||
+         iommu_lookup_page(currd, bfn, &mfn, &prot) ||
+         !mfn_valid(mfn) )
+        return -ENOENT;
+
+    readonly = !(prot & IOMMUF_writable);
+    
+    d = rcu_lock_domain_by_any_id(op->domid);
+    if ( !d )
+        return -ESRCH;
+
+    rc = get_paged_gfn(d, _gfn(op->gfn), !(prot & IOMMUF_writable), NULL,
+                       &page);
+    if ( rc )
+        goto unlock;
+
+    put_page(page); /* release extra reference just taken */
+
+    rc = -EINVAL;
+    if ( !mfn_eq(page_to_mfn(page), mfn) )
+        goto unlock;
+
+    /* release reference taken in map */
+    if ( !readonly )
+        put_page_type(page);
+    put_page(page);
+
+    rc = rangeset_remove_singleton(iommu->iommu_op_ranges, bfn_x(bfn));
+    if ( rc )
+        goto unlock;
+
+    if ( iommu_unmap_page(currd, bfn) )
+        rc = -EIO;
+
+ unlock:
+    rcu_unlock_domain(d);
+
+    return rc;
+}
+
+static int iommuop_flush(void)
+{
+    return !iommu_iotlb_flush_all(current->domain) ? 0 : -EIO;
+}
+
 static void iommu_op(xen_iommu_op_t *op)
 {
     switch ( op->op )
@@ -126,6 +251,22 @@ static void iommu_op(xen_iommu_op_t *op)
         op->status = iommu_op_enable_modification();
         break;
 
+    case XEN_IOMMUOP_map:
+        this_cpu(iommu_dont_flush_iotlb) = 1;
+        op->status = iommuop_map(&op->u.map);
+        this_cpu(iommu_dont_flush_iotlb) = 0;
+        break;
+
+    case XEN_IOMMUOP_unmap:
+        this_cpu(iommu_dont_flush_iotlb) = 1;
+        op->status = iommuop_unmap(&op->u.unmap);
+        this_cpu(iommu_dont_flush_iotlb) = 0;
+        break;
+
+    case XEN_IOMMUOP_flush:
+        op->status = iommuop_flush();
+        break;
+
     default:
         op->status = -EOPNOTSUPP;
         break;
@@ -139,6 +280,9 @@ int do_one_iommu_op(xen_iommu_op_buf_t *buf)
     static const size_t op_size[] = {
         [XEN_IOMMUOP_query_reserved] = sizeof(struct xen_iommu_op_query_reserved),
         [XEN_IOMMUOP_enable_modification] = 0,
+        [XEN_IOMMUOP_map] = sizeof(struct xen_iommu_op_map),
+        [XEN_IOMMUOP_unmap] = sizeof(struct xen_iommu_op_unmap),
+        [XEN_IOMMUOP_flush] = 0,
     };
     size_t size;
     int rc;
@@ -226,6 +370,9 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
     static const size_t op_size[] = {
         [XEN_IOMMUOP_query_reserved] = sizeof(struct compat_iommu_op_query_reserved),
         [XEN_IOMMUOP_enable_modification] = 0,
+        [XEN_IOMMUOP_map] = sizeof(struct compat_iommu_op_map),
+        [XEN_IOMMUOP_unmap] = sizeof(struct compat_iommu_op_unmap),
+        [XEN_IOMMUOP_flush] = 0,
     };
     size_t size;
     xen_iommu_op_t nat;
@@ -263,6 +410,8 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
      * we need to fix things up here.
      */
 #define XLAT_iommu_op_u_query_reserved XEN_IOMMUOP_query_reserved
+#define XLAT_iommu_op_u_map XEN_IOMMUOP_map
+#define XLAT_iommu_op_u_unmap XEN_IOMMUOP_unmap
     u = cmp.op;
 
 #define XLAT_iommu_op_query_reserved_HNDL_ranges(_d_, _s_)            \
@@ -322,6 +471,8 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
                                 &cmp, status) )
         return -EFAULT;
 
+#undef XLAT_iommu_op_u_unmap
+#undef XLAT_iommu_op_u_map
 #undef XLAT_iommu_op_u_query_reserved
 
     return 0;
diff --git a/xen/include/public/iommu_op.h b/xen/include/public/iommu_op.h
index 9bf74bd007..c8dc531c83 100644
--- a/xen/include/public/iommu_op.h
+++ b/xen/include/public/iommu_op.h
@@ -67,6 +67,47 @@ struct xen_iommu_op_query_reserved {
  */
 #define XEN_IOMMUOP_enable_modification 2
 
+/*
+ * XEN_IOMMUOP_map: Map a guest page in the IOMMU.
+ */
+#define XEN_IOMMUOP_map 3
+
+struct xen_iommu_op_map {
+    /* IN - The domid of the guest */
+    domid_t domid;
+    uint16_t flags;
+
+#define _XEN_IOMMUOP_map_readonly 0
+#define XEN_IOMMUOP_map_readonly (1 << (_XEN_IOMMUOP_map_readonly))
+
+    uint32_t pad;
+    /* IN - The IOMMU frame number which will hold the new mapping */
+    xen_bfn_t bfn;
+    /* IN - The guest frame number of the page to be mapped */
+    xen_pfn_t gfn;
+};
+
+/*
+ * XEN_IOMMUOP_unmap: Remove a mapping in the IOMMU.
+ */
+#define XEN_IOMMUOP_unmap 4
+
+struct xen_iommu_op_unmap {
+    /* IN - The domid of the guest */
+    domid_t domid;
+    uint16_t pad0;
+    uint32_t pad1;
+    /* IN - The IOMMU frame number which holds the mapping to be removed */
+    xen_bfn_t bfn;
+    /* IN - The guest frame number of the page that is mapped */
+    xen_pfn_t gfn;
+};
+
+/*
+ * XEN_IOMMUOP_flush: Flush the IOMMU TLB.
+ */
+#define XEN_IOMMUOP_flush 5
+
 struct xen_iommu_op {
     uint16_t op;    /* op type */
     uint16_t pad;
@@ -74,6 +115,8 @@ struct xen_iommu_op {
                     /* 0 for success otherwise, negative errno */
     union {
         struct xen_iommu_op_query_reserved query_reserved;
+        struct xen_iommu_op_map map;
+        struct xen_iommu_op_unmap unmap;
     } u;
 };
 typedef struct xen_iommu_op xen_iommu_op_t;
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index d2f9b1034b..3ad7eadb5a 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -79,7 +79,9 @@
 ?	vcpu_hvm_x86_64			hvm/hvm_vcpu.h
 !	iommu_op			iommu_op.h
 !	iommu_op_buf			iommu_op.h
+!	iommu_op_map			iommu_op.h
 !	iommu_op_query_reserved		iommu_op.h
+!	iommu_op_unmap			iommu_op.h
 ?	iommu_reserved_range		iommu_op.h
 ?	kexec_exec			kexec.h
 !	kexec_image			kexec.h
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v5 15/15] x86: extend the map and unmap iommu_ops to support grant references
  2018-08-03 17:22 [PATCH v5 00/15] paravirtual IOMMU interface Paul Durrant
                   ` (13 preceding siblings ...)
  2018-08-03 17:22 ` [PATCH v5 14/15] x86: add iommu_ops to modify and flush IOMMU mappings Paul Durrant
@ 2018-08-03 17:22 ` Paul Durrant
  14 siblings, 0 replies; 57+ messages in thread
From: Paul Durrant @ 2018-08-03 17:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Paul Durrant, Jan Beulich

This patch allows a domain to add or remove foreign frames from its
IOMMU mappings by grant reference as well as GFN. This is necessary,
for example, to support a PV network backend that needs to construct a
packet buffer that can be directly accessed by a NIC.
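
As a rough sketch of the intended usage (issue_iommu_op() here is a
hypothetical guest-side wrapper rather than part of this series), a backend
that has been granted a frame by its frontend could map it by grant reference
rather than by GFN:

    xen_iommu_op_t op = {
        .op = XEN_IOMMUOP_map,
        .u.map = {
            .domid = frontend_domid,       /* granting domain */
            .flags = XEN_IOMMUOP_map_gref, /* interpret u.gref, not u.gfn */
            .bfn = bfn,                    /* backend-chosen IOMMU frame */
            .u.gref = gref,                /* grant ref supplied by the frontend */
        },
    };
    int rc = issue_iommu_op(&op);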

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

v2:
 - New in v2.
---
 xen/arch/x86/iommu_op.c       |  83 ++++++++++++++++--------
 xen/common/grant_table.c      | 143 ++++++++++++++++++++++++++++++++++++++++++
 xen/include/public/iommu_op.h |  31 +++++++--
 xen/include/xen/grant_table.h |   7 +++
 4 files changed, 231 insertions(+), 33 deletions(-)

diff --git a/xen/arch/x86/iommu_op.c b/xen/arch/x86/iommu_op.c
index 35daeeed92..6b563c0227 100644
--- a/xen/arch/x86/iommu_op.c
+++ b/xen/arch/x86/iommu_op.c
@@ -23,6 +23,7 @@
 #include <xen/guest_access.h>
 #include <xen/hypercall.h>
 #include <xen/nospec.h>
+#include <xen/grant_table.h>
 
 struct get_reserved_ctxt {
     unsigned int max_entries;
@@ -121,10 +122,12 @@ static int iommuop_map(struct xen_iommu_op_map *op)
     bool readonly = op->flags & XEN_IOMMUOP_map_readonly;
     bfn_t bfn = _bfn(op->bfn);
     struct page_info *page;
+    mfn_t mfn;
     unsigned int prot;
     int rc, ignore;
 
-    if ( op->pad || (op->flags & ~XEN_IOMMUOP_map_readonly) )
+    if ( op->pad || (op->flags & ~(XEN_IOMMUOP_map_readonly |
+                                   XEN_IOMMUOP_map_gref)) )
         return -EINVAL;
 
     if ( !iommu->iommu_op_ranges )
@@ -138,15 +141,28 @@ static int iommuop_map(struct xen_iommu_op_map *op)
     if ( !d )
         return -ESRCH;
 
-    rc = get_paged_gfn(d, _gfn(op->gfn), readonly, NULL, &page);
-    if ( rc )
-        goto unlock;
+    if ( op->flags & XEN_IOMMUOP_map_gref )
+    {
+        rc = acquire_gref_for_iommu(d, op->u.gref, readonly, &mfn);
+        if ( rc )
+            goto unlock;
 
-    rc = -EINVAL;
-    if ( !readonly && !get_page_type(page, PGT_writable_page) )
+        page = mfn_to_page(mfn);
+    }
+    else
     {
-        put_page(page);
-        goto unlock;
+        rc = get_paged_gfn(d, _gfn(op->u.gfn), readonly, NULL, &page);
+        if ( rc )
+            goto unlock;
+
+        rc = -EINVAL;
+        if ( !readonly && !get_page_type(page, PGT_writable_page) )
+        {
+            put_page(page);
+            goto unlock;
+        }
+
+        mfn = page_to_mfn(page);
     }
 
     prot = IOMMUF_readable;
@@ -154,7 +170,7 @@ static int iommuop_map(struct xen_iommu_op_map *op)
         prot |= IOMMUF_writable;
 
     rc = -EIO;
-    if ( iommu_map_page(currd, bfn, page_to_mfn(page), prot) )
+    if ( iommu_map_page(currd, bfn, mfn, prot) )
         goto release;
 
     rc = rangeset_add_singleton(iommu->iommu_op_ranges, bfn_x(bfn));
@@ -168,9 +184,14 @@ static int iommuop_map(struct xen_iommu_op_map *op)
     ignore = iommu_unmap_page(currd, bfn);
 
  release:
-    if ( !readonly )
-        put_page_type(page);
-    put_page(page);
+    if ( op->flags & XEN_IOMMUOP_map_gref )
+        release_gref_for_iommu(d, op->u.gref, readonly, mfn);
+    else
+    {
+        if ( !readonly )
+            put_page_type(page);
+        put_page(page);
+    }
 
  unlock:
     rcu_unlock_domain(d);
@@ -185,11 +206,9 @@ static int iommuop_unmap(struct xen_iommu_op_unmap *op)
     mfn_t mfn;
     bool readonly;
     unsigned int prot;
-    struct page_info *page;
     int rc;
 
-    if ( op->pad0 || op->pad1 )
-        return -EINVAL;
+    if ( op->pad || (op->flags & ~XEN_IOMMUOP_unmap_gref) )
+        return -EINVAL;
 
     if ( !iommu->iommu_op_ranges )
         return -EOPNOTSUPP;
@@ -205,21 +224,31 @@ static int iommuop_unmap(struct xen_iommu_op_unmap *op)
     if ( !d )
         return -ESRCH;
 
-    rc = get_paged_gfn(d, _gfn(op->gfn), !(prot & IOMMUF_writable), NULL,
-                       &page);
-    if ( rc )
-        goto unlock;
+    if ( op->flags & XEN_IOMMUOP_unmap_gref )
+    {
+        rc = release_gref_for_iommu(d, op->u.gref, readonly, mfn);
+        if ( rc )
+            goto unlock;
+    }
+    else
+    {
+        struct page_info *page;
 
-    put_page(page); /* release extra reference just taken */
+        rc = get_paged_gfn(d, _gfn(op->u.gfn), readonly, NULL, &page);
+        if ( rc )
+            goto unlock;
 
-    rc = -EINVAL;
-    if ( !mfn_eq(page_to_mfn(page), mfn) )
-        goto unlock;
+        put_page(page); /* release extra reference just taken */
 
-    /* release reference taken in map */
-    if ( !readonly )
-        put_page_type(page);
-    put_page(page);
+        rc = -EINVAL;
+        if ( !mfn_eq(page_to_mfn(page), mfn) )
+            goto unlock;
+
+        /* release reference taken in map */
+        if ( !readonly )
+            put_page_type(page);
+        put_page(page);
+    }
 
     rc = rangeset_remove_singleton(iommu->iommu_op_ranges, bfn_x(bfn));
     if ( rc )
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index cc7080bc98..fc285f9753 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -3898,6 +3898,149 @@ int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn,
     return rc;
 }
 
+int
+acquire_gref_for_iommu(struct domain *d, grant_ref_t gref,
+                       bool readonly, mfn_t *mfn)
+{
+    struct domain *currd = current->domain;
+    struct grant_table *gt = d->grant_table;
+    grant_entry_header_t *shah;
+    struct active_grant_entry *act;
+    uint16_t *status;
+    int rc;
+
+    grant_read_lock(gt);
+
+    rc = -ENOENT;
+    if ( gref >= nr_grant_entries(gt) )
+        goto unlock;
+
+    act = active_entry_acquire(gt, gref);
+    shah = shared_entry_header(gt, gref);
+    status = ( gt->gt_version == 2 ) ?
+        &status_entry(gt, gref) :
+        &shah->flags;
+
+    rc = -EACCES;
+    if ( (shah->flags & GTF_type_mask) != GTF_permit_access ||
+         (shah->flags & GTF_sub_page) )
+        goto release;
+
+    rc = -ERANGE;
+    if ( act->pin && ((act->domid != currd->domain_id) ||
+                      (act->pin & 0x80808080U) != 0) )
+        goto release;
+
+    rc = -EINVAL;
+    if ( !act->pin ||
+         (!readonly && !(act->pin & GNTPIN_devw_mask)) ) {
+        if ( _set_status(gt->gt_version, currd->domain_id, readonly,
+                         0, shah, act, status) != GNTST_okay )
+            goto release;
+    }
+
+    if ( !act->pin )
+    {
+        gfn_t gfn = gt->gt_version == 1 ?
+            _gfn(shared_entry_v1(gt, gref).frame) :
+            _gfn(shared_entry_v2(gt, gref).full_page.frame);
+        struct page_info *page;
+
+        rc = get_paged_gfn(d, gfn, readonly, NULL, &page);
+        if ( rc )
+            goto clear;
+
+        act_set_gfn(act, gfn);
+        act->mfn = page_to_mfn(page);
+        act->domid = currd->domain_id;
+        act->start = 0;
+        act->length = PAGE_SIZE;
+        act->is_sub_page = false;
+        act->trans_domain = d;
+        act->trans_gref = gref;
+    }
+    else
+    {
+        ASSERT(mfn_valid(act->mfn));
+        if ( !get_page(mfn_to_page(act->mfn), d) )
+            goto clear;
+    }
+
+    rc = 0;
+    act->pin += readonly ? GNTPIN_devr_inc : GNTPIN_devw_inc;
+    *mfn = act->mfn;
+    goto release;
+
+ clear:
+    if ( !readonly && !(act->pin & GNTPIN_devw_mask) )
+        gnttab_clear_flag(_GTF_writing, status);
+
+    if ( !act->pin )
+        gnttab_clear_flag(_GTF_reading, status);
+
+ release:
+    active_entry_release(act);
+
+ unlock:
+    grant_read_unlock(gt);
+
+    return rc;
+}
+
+int
+release_gref_for_iommu(struct domain *d, grant_ref_t gref,
+                       bool readonly, mfn_t mfn)
+{
+    struct domain *currd = current->domain;
+    struct grant_table *gt = d->grant_table;
+    grant_entry_header_t *shah;
+    struct active_grant_entry *act;
+    uint16_t *status;
+    int rc;
+
+    grant_read_lock(gt);
+
+    rc = -ENOENT;
+    if ( gref >= nr_grant_entries(gt) )
+        goto unlock;
+
+    act = active_entry_acquire(gt, gref);
+    shah = shared_entry_header(gt, gref);
+    status = ( gt->gt_version == 2 ) ?
+        &status_entry(gt, gref) :
+        &shah->flags;
+
+    rc = -EINVAL;
+    if ( !act->pin || (act->domid != currd->domain_id) ||
+         !mfn_eq(act->mfn, mfn) )
+        goto release;
+
+    rc = 0;
+    if ( readonly )
+        act->pin -= GNTPIN_devr_inc;
+    else
+    {
+        gnttab_mark_dirty(d, mfn);
+
+        act->pin -= GNTPIN_devw_inc;
+        if ( !(act->pin & GNTPIN_devw_mask) )
+            gnttab_clear_flag(_GTF_writing, status);
+    }
+
+    if ( !act->pin )
+        gnttab_clear_flag(_GTF_reading, status);
+
+    put_page(mfn_to_page(mfn));
+
+ release:
+    active_entry_release(act);
+
+ unlock:
+    grant_read_unlock(gt);
+
+    return rc;
+}
+
 static void gnttab_usage_print(struct domain *rd)
 {
     int first = 1;
diff --git a/xen/include/public/iommu_op.h b/xen/include/public/iommu_op.h
index c8dc531c83..5a672bcead 100644
--- a/xen/include/public/iommu_op.h
+++ b/xen/include/public/iommu_op.h
@@ -24,6 +24,7 @@
 #define XEN_PUBLIC_IOMMU_OP_H
 
 #include "xen.h"
+#include "grant_table.h"
 
 typedef uint64_t xen_bfn_t;
 
@@ -79,12 +80,20 @@ struct xen_iommu_op_map {
 
 #define _XEN_IOMMUOP_map_readonly 0
 #define XEN_IOMMUOP_map_readonly (1 << (_XEN_IOMMUOP_map_readonly))
+#define _XEN_IOMMUOP_map_gref 1
+#define XEN_IOMMUOP_map_gref (1 << (_XEN_IOMMUOP_map_gref))
 
     uint32_t pad;
     /* IN - The IOMMU frame number which will hold the new mapping */
     xen_bfn_t bfn;
-    /* IN - The guest frame number of the page to be mapped */
-    xen_pfn_t gfn;
+    /*
+     * IN - The guest frame number or grant reference of the page to
+     * be mapped.
+     */
+    union {
+        xen_pfn_t gfn;
+        grant_ref_t gref;
+    } u;
 };
 
 /*
@@ -95,12 +104,22 @@ struct xen_iommu_op_map {
 struct xen_iommu_op_unmap {
     /* IN - The domid of the guest */
     domid_t domid;
-    uint16_t pad0;
-    uint32_t pad1;
+    uint16_t flags;
+
+#define _XEN_IOMMUOP_unmap_gref 0
+#define XEN_IOMMUOP_unmap_gref (1 << (_XEN_IOMMUOP_unmap_gref))
+
+    uint32_t pad;
     /* IN - The IOMMU frame number which holds the mapping to be removed */
     xen_bfn_t bfn;
-    /* IN - The guest frame number of the page that is mapped */
-    xen_pfn_t gfn;
+    /*
+     * IN - The guest frame number or grant reference of the page that
+     * is mapped.
+     */
+    union {
+        xen_pfn_t gfn;
+        grant_ref_t gref;
+    } u;
 };
 
 /*
diff --git a/xen/include/xen/grant_table.h b/xen/include/xen/grant_table.h
index 0286ba33dd..0f44c6587c 100644
--- a/xen/include/xen/grant_table.h
+++ b/xen/include/xen/grant_table.h
@@ -59,6 +59,13 @@ int mem_sharing_gref_to_gfn(struct grant_table *gt, grant_ref_t ref,
 int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn,
                      mfn_t *mfn);
 
+int
+acquire_gref_for_iommu(struct domain *d, grant_ref_t gref,
+                       bool readonly, mfn_t *mfn);
+int
+release_gref_for_iommu(struct domain *d, grant_ref_t gref,
+                       bool readonly, mfn_t mfn);
+
 unsigned int gnttab_dom0_frames(void);
 
 #endif /* __XEN_GRANT_TABLE_H__ */
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 11/15] mm / iommu: split need_iommu() into has_iommu_pt() and sync_iommu_pt()
  2018-08-03 17:22 ` [PATCH v5 11/15] mm / iommu: split need_iommu() into has_iommu_pt() and sync_iommu_pt() Paul Durrant
@ 2018-08-03 18:18   ` Razvan Cojocaru
  2018-08-07  3:41   ` Tian, Kevin
  1 sibling, 0 replies; 57+ messages in thread
From: Razvan Cojocaru @ 2018-08-03 18:18 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jan Beulich,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Tamas K Lengyel, Jun Nakajima, Brian Woods,
	Suravee Suthikulpanit

On 08/03/2018 08:22 PM, Paul Durrant wrote:
> The name 'need_iommu' is a little confusing as it suggests a domain needs
> to use the IOMMU but something might not be set up yet, when in fact it
> doesn't become true until IOMMU mappings for the domain have been created.
> 
> Two different meanings are also inferred from it in various places in the
> code:
> 
> - Some callers want to test whether a domain has IOMMU mappings
> - Some callers want to test whether they need to synchronize the domain's
>   P2M and IOMMU mappings
> 
> This patch therefore creates two new boolean flags, 'has_iommu_pt' and
> 'sync_iommu_pt' to convey those meanings separately and creates the
> corresponding macros. All callers of need_iommu() are then modified to
> use the macro appropriate to what they are trying to test.
> 
> NOTE: The test of need_iommu(d) in the AMD IOMMU initialization code has
>       been replaced with a test of iommu_dom0_strict since this more
>       accurately reflects the meaning of the test and brings it into
>       line with a similar test in the Intel VT-d code.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>

Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>


Thanks,
Razvan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 02/15] iommu: introduce the concept of BFN...
  2018-08-03 17:22 ` [PATCH v5 02/15] iommu: introduce the concept of BFN Paul Durrant
@ 2018-08-07  2:38   ` Tian, Kevin
  2018-08-07  7:59     ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Tian, Kevin @ 2018-08-07  2:38 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Julien Grall, Stefano Stabellini, Jan Beulich, Suravee Suthikulpanit

> From: Paul Durrant [mailto:paul.durrant@citrix.com]
> Sent: Saturday, August 4, 2018 1:22 AM
> 
> ...meaning 'bus frame number' i.e. a frame number mapped in the IOMMU
> rather than the MMU.
> 
> This patch is a largely cosmetic change that substitutes the terms 'gfn'
> and 'gaddr' for 'bfn' and 'baddr' in all the places where the frame number
> or address relate to the IOMMU rather than the MMU.
> 
> The parts that are not purely cosmetic are:
> 
>  - the introduction of a type-safe declaration of bfn_t and definition of
>    INVALID_BFN to make the substitution of gfn_x(INVALID_GFN) mechanical.
>  - the introduction of __bfn_to_baddr and __baddr_to_bfn (and type-safe
>    variants without the leading __) with some use of the former.
> 
> Subsequent patches will convert code to make use of type-safe BFNs.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Reviewed-by: Wei Liu <wei.liu2@citrix.com>

[...]

> diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
> index 24654e8e22..4ebf91f17e 100644
> --- a/xen/include/xen/mm.h
> +++ b/xen/include/xen/mm.h
> @@ -14,8 +14,9 @@
>   * should be adhered to.
>   *
>   * mfn: Machine Frame Number
> - *   The values Xen puts into its own pagetables.  This is the host physical
> - *   memory address space with RAM, MMIO etc.
> + *   The values Xen puts into its own pagetables, 2nd stage pagetables
> (where
> + *   hardware assisted 2nd stage translation is used) or IOMMU page
> tables.
> + *   This is the host physical memory address space with RAM, MMIO etc.

there are also shadow page tables... not sure whether it is really helpful
to list them all out. "Xen-owned pagetables" should be sufficient to cover them.

>   *
>   * gfn: Guest Frame Number
>   *   The values a guest puts in its own pagetables.  For an auto-translated
> @@ -26,6 +27,11 @@
>   *   A linear idea of a guest physical address space. For an auto-translated
>   *   guest, pfn == gfn while for a non-translated guest, pfn != gfn.
>   *
> + * bfn: Bus Frame Number (definitions in include/xen/iommu.h)
> + *   The linear frame numbers of IOMMU address space. All initiators for a
> + *   given domain share a single IOMMU address space and, by default,
> Xen will
> + *   ensure bfn == pfn.
> + *

The above description is a bit confusing:

- for 'domain', are you talking about the VM or the IOMMU domain? If the
former then each initiator could have its own IOMMU address space once a
virtual IOMMU (either pvIOMMU or emulated IOMMU) is exposed.

- since you talk about 'by default', it would be better to also describe the
non-default case, i.e. when a virtual IOMMU is exposed, bfn becomes a
separate address space managed by the guest

>   * WARNING: Some of these terms have changed over time while others
> have been
>   * used inconsistently, meaning that a lot of existing code does not match
> the
>   * definitions above.  New code should use these terms as described here,
> and
> --
> 2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 03/15] iommu: make use of type-safe BFN and MFN in exported functions
  2018-08-03 17:22 ` [PATCH v5 03/15] iommu: make use of type-safe BFN and MFN in exported functions Paul Durrant
@ 2018-08-07  2:45   ` Tian, Kevin
  0 siblings, 0 replies; 57+ messages in thread
From: Tian, Kevin @ 2018-08-07  2:45 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Stefano Stabellini, Nakajima, Jun, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich

> From: Paul Durrant [mailto:paul.durrant@citrix.com]
> Sent: Saturday, August 4, 2018 1:22 AM
> 
> This patch modifies the declaration of the entry points to the IOMMU
> sub-system to use bfn_t and mfn_t in place of unsigned long. A subsequent
> patch will similarly modify the methods in the iommu_ops structure.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Reviewed-by: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 04/15] iommu: push use of type-safe BFN and MFN into iommu_ops
  2018-08-03 17:22 ` [PATCH v5 04/15] iommu: push use of type-safe BFN and MFN into iommu_ops Paul Durrant
@ 2018-08-07  2:49   ` Tian, Kevin
  0 siblings, 0 replies; 57+ messages in thread
From: Tian, Kevin @ 2018-08-07  2:49 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Andrew Cooper, Jan Beulich, Suravee Suthikulpanit, George Dunlap

> From: Paul Durrant [mailto:paul.durrant@citrix.com]
> Sent: Saturday, August 4, 2018 1:22 AM
> 
> This patch modifies the methods in struct iommu_ops to use type-safe BFN
> and MFN. This follows on from the prior patch that modified the functions
> exported in xen/iommu.h.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Reviewed-by: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 05/15] iommu: don't domain_crash() inside iommu_map/unmap_page()
  2018-08-03 17:22 ` [PATCH v5 05/15] iommu: don't domain_crash() inside iommu_map/unmap_page() Paul Durrant
@ 2018-08-07  2:55   ` Tian, Kevin
  2018-08-07  8:05     ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Tian, Kevin @ 2018-08-07  2:55 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Stefano Stabellini, Nakajima, Jun, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich

> From: Paul Durrant [mailto:paul.durrant@citrix.com]
> Sent: Saturday, August 4, 2018 1:22 AM
> 
> Turn iommu_map/unmap_page() into straightforward wrappers that check
> the

iommu_iotlb_flush is also changed.

> existence of the relevant iommu_op and call through to it. This makes them
> usable by PV IOMMU code to be delivered in future patches.
> Leave the decision on whether to invoke domain_crash() up to the caller.
> This has the added benefit that the (module/line number) message that
> domain_crash() spits out will be more indicative of where the problem lies.
> 
> NOTE: This patch includes one bit of clean-up in set_identity_p2m_entry()
>       replacing use of p2m->domain with the domain pointer passed into the
>       function.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Reviewed-by: George Dunlap <George.Dunlap@eu.citrix.com>
> Reviewed-by: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>, with one small comment:

> 
>      if ( need_iommu(p2m->domain) &&
>           (lpae_valid(orig_pte) || lpae_valid(*entry)) )
> +    {
>          rc = iommu_iotlb_flush(p2m->domain, _bfn(gfn_x(sgfn)),
>                                 1UL << page_order);
> +        if ( unlikely(rc) && !is_hardware_domain(p2m->domain) )
> +            domain_crash(p2m->domain);
> +    }

to reduce duplication, does it make sense to introduce a wrapper
like domain_crash_nd() ('nd' indicating !is_hardware_domain, so it
becomes a nop for the hardware domain)? Then it becomes:

	if ( unlikely(rc) )
		domain_crash_nd(p2m->domain);
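
e.g. (just a sketch) kept as a macro, so that domain_crash() still reports
the real caller's file/line:

    /* sketch only: no-op for the hardware domain */
    #define domain_crash_nd(d) do {        \
        if ( !is_hardware_domain(d) )      \
            domain_crash(d);               \
    } while ( 0 )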

Thanks
Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 06/15] public / x86: introduce __HYPERCALL_iommu_op
  2018-08-03 17:22 ` [PATCH v5 06/15] public / x86: introduce __HYPERCALL_iommu_op Paul Durrant
@ 2018-08-07  3:00   ` Tian, Kevin
  2018-08-07  8:10     ` Paul Durrant
  2018-08-17 21:10   ` Daniel De Graaf
  1 sibling, 1 reply; 57+ messages in thread
From: Tian, Kevin @ 2018-08-07  3:00 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Jan Beulich, Daniel De Graaf

> From: Paul Durrant
> Sent: Saturday, August 4, 2018 1:22 AM
> 
> This patch introduces the boilerplate for a new hypercall to allow a
> domain to control IOMMU mappings for its own pages.
> Whilst there is duplication of code between the native and compat entry
> points which appears ripe for some form of combination, I think it is
> better to maintain the separation as-is because the compat entry point
> will necessarily gain complexity in subsequent patches.
> 
> NOTE: This hypercall is only implemented for x86 and is currently
>       restricted by XSM to dom0. Its scope can be expanded in future
>       if need be.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
[...]

> diff --git a/xen/arch/x86/iommu_op.c b/xen/arch/x86/iommu_op.c
> new file mode 100644
> index 0000000000..744c0fce27
> --- /dev/null
> +++ b/xen/arch/x86/iommu_op.c
> @@ -0,0 +1,184 @@
> +/************************************************************
> ******************
> + * x86/iommu_op.c
> + *
> + * Paravirtualised IOMMU functionality

although only x86 is supported, the logic in this file is vendor-agnostic.
Better to move it to a common place now...

Thanks
Kevin
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 07/15] iommu: track reserved ranges using a rangeset
  2018-08-03 17:22 ` [PATCH v5 07/15] iommu: track reserved ranges using a rangeset Paul Durrant
@ 2018-08-07  3:04   ` Tian, Kevin
  2018-08-07  8:16     ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Tian, Kevin @ 2018-08-07  3:04 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Jan Beulich

> From: Paul Durrant [mailto:paul.durrant@citrix.com]
> Sent: Saturday, August 4, 2018 1:22 AM
> 
> Ranges that should be considered reserved in the IOMMU are not
> necessarily
> limited to RMRRs. If iommu_inclusive_mapping is set then any frame
> number
> falling within an E820 reserved region should also be considered as
> reserved in the IOMMU.

I don't think this is a good extension. An RMRR by definition requires an
identity mapping in the IOMMU page table, and thus must also be reserved
in the bfn address space (to avoid it being used for another purpose).
Normal E820 reserved regions have no such requirement: the guest
can use the same bfn location for any purpose. I'm not sure why we
want to add more limitations here.

> This patch adds a rangeset to the domain_iommu structure that is then
> used
> to track all reserved ranges. This will be needed by a subsequent patch
> to test whether it is safe to modify a particular IOMMU entry.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
> Reviewed-by: Wei Liu <wei.liu2@citrix.com>
> ---
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> 
> v2:
>  - New in v2.
> ---
>  xen/drivers/passthrough/iommu.c       | 10 +++++++++-
>  xen/drivers/passthrough/vtd/iommu.c   | 20 +++++++++++++-------
>  xen/drivers/passthrough/vtd/x86/vtd.c | 17 ++++++++++++++++-
>  xen/include/xen/iommu.h               |  6 ++++++
>  4 files changed, 44 insertions(+), 9 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/iommu.c
> b/xen/drivers/passthrough/iommu.c
> index 21e6886a3f..b10a37e5d7 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -147,6 +147,10 @@ int iommu_domain_init(struct domain *d)
>      if ( !iommu_enabled )
>          return 0;
> 
> +    hd->reserved_ranges = rangeset_new(d, NULL, 0);
> +    if ( !hd->reserved_ranges )
> +        return -ENOMEM;
> +
>      hd->platform_ops = iommu_get_ops();
>      return hd->platform_ops->init(d);
>  }
> @@ -248,12 +252,16 @@ int iommu_construct(struct domain *d)
> 
>  void iommu_domain_destroy(struct domain *d)
>  {
> -    if ( !iommu_enabled || !dom_iommu(d)->platform_ops )
> +    const struct domain_iommu *hd = dom_iommu(d);
> +
> +    if ( !iommu_enabled || !hd->platform_ops )
>          return;
> 
>      iommu_teardown(d);
> 
>      arch_iommu_domain_destroy(d);
> +
> +    rangeset_destroy(hd->reserved_ranges);
>  }
> 
>  int iommu_map_page(struct domain *d, bfn_t bfn, mfn_t mfn,
> diff --git a/xen/drivers/passthrough/vtd/iommu.c
> b/xen/drivers/passthrough/vtd/iommu.c
> index c9f50f04ad..282e227414 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -1910,6 +1910,7 @@ static int rmrr_identity_mapping(struct domain
> *d, bool_t map,
>      unsigned long end_pfn = PAGE_ALIGN_4K(rmrr->end_address) >>
> PAGE_SHIFT_4K;
>      struct mapped_rmrr *mrmrr;
>      struct domain_iommu *hd = dom_iommu(d);
> +    int err = 0;
> 
>      ASSERT(pcidevs_locked());
>      ASSERT(rmrr->base_address < rmrr->end_address);
> @@ -1923,8 +1924,6 @@ static int rmrr_identity_mapping(struct domain
> *d, bool_t map,
>          if ( mrmrr->base == rmrr->base_address &&
>               mrmrr->end == rmrr->end_address )
>          {
> -            int ret = 0;
> -
>              if ( map )
>              {
>                  ++mrmrr->count;
> @@ -1934,28 +1933,35 @@ static int rmrr_identity_mapping(struct
> domain *d, bool_t map,
>              if ( --mrmrr->count )
>                  return 0;
> 
> -            while ( base_pfn < end_pfn )
> +            err = rangeset_remove_range(hd->reserved_ranges,
> +                                        base_pfn, end_pfn);
> +            while ( !err && base_pfn < end_pfn )
>              {
>                  if ( clear_identity_p2m_entry(d, base_pfn) )
> -                    ret = -ENXIO;
> +                    err = -ENXIO;
> +
>                  base_pfn++;
>              }
> 
>              list_del(&mrmrr->list);
>              xfree(mrmrr);
> -            return ret;
> +            return err;
>          }
>      }
> 
>      if ( !map )
>          return -ENOENT;
> 
> +    err = rangeset_add_range(hd->reserved_ranges, base_pfn, end_pfn);
> +    if ( err )
> +        return err;
> +
>      while ( base_pfn < end_pfn )
>      {
> -        int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag);
> -
> +        err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag);
>          if ( err )
>              return err;
> +
>          base_pfn++;
>      }
> 
> diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c
> b/xen/drivers/passthrough/vtd/x86/vtd.c
> index 6fed4a92cb..032412b8c6 100644
> --- a/xen/drivers/passthrough/vtd/x86/vtd.c
> +++ b/xen/drivers/passthrough/vtd/x86/vtd.c
> @@ -164,10 +164,25 @@ void __hwdom_init
> vtd_set_hwdom_mapping(struct domain *d)
> 
>              if ( !rc )
>                 rc = ret;
> +
> +            /*
> +             * The only reason a reserved page would be mapped is that
> +             * iommu_inclusive_mapping is set, in which case the BFN
> +             * needs to be marked as reserved in the IOMMU.
> +             */
> +            if ( page_is_ram_type(pfn, RAM_TYPE_RESERVED) )
> +            {
> +                ASSERT(iommu_inclusive_mapping);
> +
> +                ret = rangeset_add_singleton(dom_iommu(d)->reserved_ranges,
> +                                             bfn_x(bfn));
> +                if ( !rc )
> +                    rc = ret;
> +            }
>          }
> 
>          if ( rc )
> -            printk(XENLOG_WARNING VTDPREFIX " d%d: IOMMU mapping
> failed: %d\n",
> +            printk(XENLOG_WARNING VTDPREFIX " d%d: IOMMU
> mapping/reservation failed: %d\n",
>                     d->domain_id, rc);
> 
>          if (!(i & (0xfffff >> (PAGE_SHIFT - PAGE_SHIFT_4K))))
> diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> index 624784fec8..cc0be81b4e 100644
> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -122,6 +122,12 @@ struct domain_iommu {
> 
>      /* Features supported by the IOMMU */
>      DECLARE_BITMAP(features, IOMMU_FEAT_count);
> +
> +    /*
> +     * BFN ranges that are reserved in the domain IOMMU mappings and
> +     * must not be modified after initialization.
> +     */
> +    struct rangeset *reserved_ranges;
>  };
> 
>  #define dom_iommu(d)              (&(d)->iommu)
> --
> 2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 09/15] vtd: add lookup_page method to iommu_ops
  2018-08-03 17:22 ` [PATCH v5 09/15] vtd: add lookup_page method to iommu_ops Paul Durrant
@ 2018-08-07  3:25   ` Tian, Kevin
  2018-08-07  8:21     ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Tian, Kevin @ 2018-08-07  3:25 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: George Dunlap, Jan Beulich

> From: Paul Durrant [mailto:paul.durrant@citrix.com]
> Sent: Saturday, August 4, 2018 1:22 AM
> 
> This patch adds a new method to the VT-d IOMMU implementation to find
> the
> MFN currently mapped by the specified BFN along with a wrapper function
> in
> generic IOMMU code to call the implementation if it exists.
> 
> This functionality will be used by a subsequent patch.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Reviewed-by: Wei Liu <wei.liu2@citrix.com>
> ---
> Cc: Kevin Tian <kevin.tian@intel.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: George Dunlap <george.dunlap@citrix.com>
> 
> v3:
>  - Addressed comments from George.
> 
> v2:
>  - Addressed some comments from Jan.
> ---
>  xen/drivers/passthrough/iommu.c     | 11 +++++++++++
>  xen/drivers/passthrough/vtd/iommu.c | 34
> ++++++++++++++++++++++++++++++++++
>  xen/drivers/passthrough/vtd/iommu.h |  3 +++
>  xen/include/xen/iommu.h             |  4 ++++
>  4 files changed, 52 insertions(+)
> 
> diff --git a/xen/drivers/passthrough/iommu.c
> b/xen/drivers/passthrough/iommu.c
> index b10a37e5d7..9b7baca93f 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -305,6 +305,17 @@ int iommu_unmap_page(struct domain *d, bfn_t
> bfn)
>      return rc;
>  }
> 
> +int iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
> +                      unsigned int *flags)
> +{
> +    const struct domain_iommu *hd = dom_iommu(d);
> +
> +    if ( !iommu_enabled || !hd->platform_ops )
> +        return -EOPNOTSUPP;
> +
> +    return hd->platform_ops->lookup_page(d, bfn, mfn, flags);
> +}
> +
>  static void iommu_free_pagetables(unsigned long unused)
>  {
>      do {
> diff --git a/xen/drivers/passthrough/vtd/iommu.c
> b/xen/drivers/passthrough/vtd/iommu.c
> index 282e227414..8cd3b59aa0 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -1830,6 +1830,39 @@ static int __must_check
> intel_iommu_unmap_page(struct domain *d,
>      return dma_pte_clear_one(d, bfn_to_baddr(bfn));
>  }
> 
> +static int intel_iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t
> *mfn,
> +                                   unsigned int *flags)

Not looking at later patches yet... but conceptually the bfn address
space is per device rather than per domain. In the default situation
(w/o pvIOMMU exposed), all devices assigned to dom0 share
the same address space (bfn == pfn), which is currently linked
from the domain structure. Then, with pvIOMMU exposed, dom0
starts to manage an individual address space (called an IOVA
address space within dom0) per assigned device. In that case the
lookup should accept a BDF number and then find the right
page table...
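
e.g. something along these lines (a sketch only):

    int __must_check (*lookup_page)(struct domain *d, uint16_t bdf,
                                    bfn_t bfn, mfn_t *mfn,
                                    unsigned int *flags);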

> +{
> +    struct domain_iommu *hd = dom_iommu(d);
> +    struct dma_pte *page = NULL, *pte = NULL, val;
> +    u64 pg_maddr;
> +
> +    spin_lock(&hd->arch.mapping_lock);
> +
> +    pg_maddr = addr_to_dma_page_maddr(d, bfn_to_baddr(bfn), 0);
> +    if ( pg_maddr == 0 )
> +    {
> +        spin_unlock(&hd->arch.mapping_lock);
> +        return -ENOMEM;
> +    }
> +
> +    page = map_vtd_domain_page(pg_maddr);
> +    pte = page + (bfn_x(bfn) & LEVEL_MASK);
> +    val = *pte;
> +
> +    unmap_vtd_domain_page(page);
> +    spin_unlock(&hd->arch.mapping_lock);
> +
> +    if ( !dma_pte_present(val) )
> +        return -ENOENT;
> +
> +    *mfn = maddr_to_mfn(dma_pte_addr(val));
> +    *flags = dma_pte_read(val) ? IOMMUF_readable : 0;
> +    *flags |= dma_pte_write(val) ? IOMMUF_writable : 0;
> +
> +    return 0;
> +}
> +
>  int iommu_pte_flush(struct domain *d, uint64_t bfn, uint64_t *pte,
>                      int order, int present)
>  {
> @@ -2661,6 +2694,7 @@ const struct iommu_ops intel_iommu_ops = {
>      .teardown = iommu_domain_teardown,
>      .map_page = intel_iommu_map_page,
>      .unmap_page = intel_iommu_unmap_page,
> +    .lookup_page = intel_iommu_lookup_page,
>      .free_page_table = iommu_free_page_table,
>      .reassign_device = reassign_device_ownership,
>      .get_device_group_id = intel_iommu_group_id,
> diff --git a/xen/drivers/passthrough/vtd/iommu.h
> b/xen/drivers/passthrough/vtd/iommu.h
> index 72c1a2e3cd..47bdfcb5ea 100644
> --- a/xen/drivers/passthrough/vtd/iommu.h
> +++ b/xen/drivers/passthrough/vtd/iommu.h
> @@ -272,6 +272,9 @@ struct dma_pte {
>  #define dma_set_pte_prot(p, prot) do { \
>          (p).val = ((p).val & ~DMA_PTE_PROT) | ((prot) & DMA_PTE_PROT); \
>      } while (0)
> +#define dma_pte_prot(p) ((p).val & DMA_PTE_PROT)
> +#define dma_pte_read(p) (dma_pte_prot(p) & DMA_PTE_READ)
> +#define dma_pte_write(p) (dma_pte_prot(p) & DMA_PTE_WRITE)
>  #define dma_pte_addr(p) ((p).val & PADDR_MASK & PAGE_MASK_4K)
>  #define dma_set_pte_addr(p, addr) do {\
>              (p).val |= ((addr) & PAGE_MASK_4K); } while (0)
> diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> index cc0be81b4e..7c5d46df81 100644
> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -100,6 +100,8 @@ void iommu_teardown(struct domain *d);
>  int __must_check iommu_map_page(struct domain *d, bfn_t bfn,
>                                  mfn_t mfn, unsigned int flags);
>  int __must_check iommu_unmap_page(struct domain *d, bfn_t bfn);
> +int __must_check iommu_lookup_page(struct domain *d, bfn_t bfn,
> mfn_t *mfn,
> +                                   unsigned int *flags);
> 
>  enum iommu_feature
>  {
> @@ -198,6 +200,8 @@ struct iommu_ops {
>      int __must_check (*map_page)(struct domain *d, bfn_t bfn, mfn_t mfn,
>                                   unsigned int flags);
>      int __must_check (*unmap_page)(struct domain *d, bfn_t bfn);
> +    int __must_check (*lookup_page)(struct domain *d, bfn_t bfn, mfn_t
> *mfn,
> +                                    unsigned int *flags);
>      void (*free_page_table)(struct page_info *);
>  #ifdef CONFIG_X86
>      void (*update_ire_from_apic)(unsigned int apic, unsigned int reg,
> unsigned int value);
> --
> 2.11.0



* Re: [PATCH v5 10/15] mm / iommu: include need_iommu() test in iommu_use_hap_pt()
  2018-08-03 17:22 ` [PATCH v5 10/15] mm / iommu: include need_iommu() test in iommu_use_hap_pt() Paul Durrant
@ 2018-08-07  3:32   ` Tian, Kevin
  0 siblings, 0 replies; 57+ messages in thread
From: Tian, Kevin @ 2018-08-07  3:32 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Stefano Stabellini, Nakajima, Jun, George Dunlap, Andrew Cooper,
	Julien Grall, Jan Beulich

> From: Paul Durrant [mailto:paul.durrant@citrix.com]
> Sent: Saturday, August 4, 2018 1:22 AM
> 
> The name 'iommu_use_hap_pt' suggests that the P2M table is in use as
> the
> domain's IOMMU pagetable which, prior to this patch, is not strictly true
> since the macro did not test whether the domain actually has IOMMU
> mappings.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>


* Re: [PATCH v5 11/15] mm / iommu: split need_iommu() into has_iommu_pt() and sync_iommu_pt()
  2018-08-03 17:22 ` [PATCH v5 11/15] mm / iommu: split need_iommu() into has_iommu_pt() and sync_iommu_pt() Paul Durrant
  2018-08-03 18:18   ` Razvan Cojocaru
@ 2018-08-07  3:41   ` Tian, Kevin
  2018-08-07  8:24     ` Paul Durrant
  1 sibling, 1 reply; 57+ messages in thread
From: Tian, Kevin @ 2018-08-07  3:41 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Nakajima, Jun, Razvan Cojocaru,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Tamas K Lengyel, Jan Beulich, Brian Woods,
	Suravee Suthikulpanit

> From: Paul Durrant [mailto:paul.durrant@citrix.com]
> Sent: Saturday, August 4, 2018 1:22 AM
> 
> The name 'need_iommu' is a little confusing as it suggests a domain needs
> to use the IOMMU but something might not be set up yet, when in fact it
> doesn't become true until IOMMU mappings for the domain have been
> created.
> 
> Two different meanings are also inferred from it in various places in the
> code:
> 
> - Some callers want to test whether a domain has IOMMU mappings
> - Some callers want to test whether they need to synchronize the domain's
>   P2M and IOMMU mappings
> 
> This patch therefore creates two new boolean flags, 'has_iommu_pt' and
> 'sync_iommu_pt' to convey those meanings separately and creates the
> corresponding macros. All callers of need_iommu() are then modified to
> use the macro appropriate to what they are trying to test.

sync_iommu_pt sounds like an operation. What about need_sync_iommu?

And for has_iommu_pt, what about iommu_enabled, where 'enabled' implies
that the page table has been created?
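
(Editor's illustration, not part of the posted patch: with the suggested
rename, the sched.h hunk at the end of this patch would read roughly as
below. The has_iommu_pt half is left alone here because a per-domain
iommu_enabled() macro would collide with the existing global
'iommu_enabled' flag used elsewhere in the series.)

#ifdef CONFIG_HAS_PASSTHROUGH
#define has_iommu_pt(d)     ((d)->has_iommu_pt)
#define need_sync_iommu(d)  ((d)->need_sync_iommu) /* was sync_iommu_pt(d) */
#else
#define has_iommu_pt(d)     (0)
#define need_sync_iommu(d)  (0)
#endif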

> 
> NOTE: The test of need_iommu(d) in the AMD IOMMU initialization code
> has
>       been replaced with a test of iommu_dom0_strict since this more
>       accurately reflects the meaning of the test and brings it into
>       line with a similar test in the Intel VT-d code.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> ---
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Julien Grall <julien.grall@arm.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Tim Deegan <tim@xen.org>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: Tamas K Lengyel <tamas@tklengyel.com>
> Cc: George Dunlap <george.dunlap@eu.citrix.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> Cc: Brian Woods <brian.woods@amd.com>
> 
> v4:
>  - New in v4.
> ---
>  xen/arch/arm/p2m.c                          |  2 +-
>  xen/arch/x86/hvm/mtrr.c                     |  5 +++--
>  xen/arch/x86/mm.c                           |  2 +-
>  xen/arch/x86/mm/mem_sharing.c               |  2 +-
>  xen/arch/x86/mm/p2m-ept.c                   |  2 +-
>  xen/arch/x86/mm/p2m-pt.c                    |  2 +-
>  xen/arch/x86/mm/p2m.c                       |  8 ++++----
>  xen/arch/x86/mm/paging.c                    |  2 +-
>  xen/arch/x86/x86_64/mm.c                    |  3 ++-
>  xen/common/memory.c                         |  6 +++---
>  xen/common/vm_event.c                       |  2 +-
>  xen/drivers/passthrough/amd/pci_amd_iommu.c |  2 +-
>  xen/drivers/passthrough/device_tree.c       |  2 +-
>  xen/drivers/passthrough/iommu.c             | 17 ++++++++++-------
>  xen/drivers/passthrough/pci.c               |  6 +++---
>  xen/include/asm-arm/grant_table.h           |  2 +-
>  xen/include/asm-arm/iommu.h                 |  2 +-
>  xen/include/asm-x86/grant_table.h           |  2 +-
>  xen/include/asm-x86/iommu.h                 |  2 +-
>  xen/include/xen/sched.h                     | 12 ++++++++----
>  20 files changed, 46 insertions(+), 37 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index eb39861b73..c972b72baf 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -951,7 +951,7 @@ static int __p2m_set_entry(struct p2m_domain
> *p2m,
>      if ( lpae_valid(orig_pte) && entry->p2m.base != orig_pte.p2m.base )
>          p2m_free_entry(p2m, orig_pte, level);
> 
> -    if ( need_iommu(p2m->domain) &&
> +    if ( sync_iommu_pt(p2m->domain) &&
>           (lpae_valid(orig_pte) || lpae_valid(*entry)) )
>      {
>          rc = iommu_iotlb_flush(p2m->domain, _bfn(gfn_x(sgfn)),
> diff --git a/xen/arch/x86/hvm/mtrr.c b/xen/arch/x86/hvm/mtrr.c
> index c502dda693..61131d4b61 100644
> --- a/xen/arch/x86/hvm/mtrr.c
> +++ b/xen/arch/x86/hvm/mtrr.c
> @@ -824,7 +824,8 @@ HVM_REGISTER_SAVE_RESTORE(MTRR,
> hvm_save_mtrr_msr, hvm_load_mtrr_msr,
> 
>  void memory_type_changed(struct domain *d)
>  {
> -    if ( need_iommu(d) && d->vcpu && d->vcpu[0] )
> +    if ( (sync_iommu_pt(d) || iommu_use_hap_pt(d)) &&
> +         d->vcpu && d->vcpu[0] )
>      {
>          p2m_memory_type_changed(d);
>          flush_all(FLUSH_CACHE);
> @@ -872,7 +873,7 @@ int epte_get_entry_emt(struct domain *d, unsigned
> long gfn, mfn_t mfn,
>          return MTRR_TYPE_UNCACHABLE;
>      }
> 
> -    if ( !need_iommu(d) && !cache_flush_permitted(d) )
> +    if ( !has_iommu_pt(d) && !cache_flush_permitted(d) )
>      {
>          *ipat = 1;
>          return MTRR_TYPE_WRBACK;
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index 08878574f3..462398096f 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -2727,7 +2727,7 @@ static int _get_page_type(struct page_info *page,
> unsigned long type,
>      {
>          /* Special pages should not be accessible from devices. */
>          struct domain *d = page_get_owner(page);
> -        if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
> +        if ( d && is_pv_domain(d) && unlikely(sync_iommu_pt(d)) )
>          {
>              mfn_t mfn = page_to_mfn(page);
> 
> diff --git a/xen/arch/x86/mm/mem_sharing.c
> b/xen/arch/x86/mm/mem_sharing.c
> index fad8a9df13..c54845275f 100644
> --- a/xen/arch/x86/mm/mem_sharing.c
> +++ b/xen/arch/x86/mm/mem_sharing.c
> @@ -1610,7 +1610,7 @@ int mem_sharing_domctl(struct domain *d,
> struct xen_domctl_mem_sharing_op *mec)
>          case XEN_DOMCTL_MEM_SHARING_CONTROL:
>          {
>              rc = 0;
> -            if ( unlikely(need_iommu(d) && mec->u.enable) )
> +            if ( unlikely(has_iommu_pt(d) && mec->u.enable) )
>                  rc = -EXDEV;
>              else
>                  d->arch.hvm_domain.mem_sharing_enabled = mec->u.enable;
> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> index e52dbb6203..d90424a9a3 100644
> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -868,7 +868,7 @@ out:
>      {
>          if ( iommu_use_hap_pt(d) )
>              rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order,
> vtd_pte_present);
> -        else if ( need_iommu(d) )
> +        else if ( sync_iommu_pt(d) )
>          {
>              bfn_t bfn = _bfn(gfn);
> 
> diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
> index 9255194070..e0ae6bf20f 100644
> --- a/xen/arch/x86/mm/p2m-pt.c
> +++ b/xen/arch/x86/mm/p2m-pt.c
> @@ -687,7 +687,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t
> gfn_, mfn_t mfn,
>              if ( iommu_old_flags )
>                  amd_iommu_flush_pages(p2m->domain, gfn, page_order);
>          }
> -        else if ( need_iommu(p2m->domain) )
> +        else if ( sync_iommu_pt(p2m->domain) )
>          {
>              bfn_t bfn = _bfn(gfn);
> 
> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> index 7465489074..4ffce37c45 100644
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -713,7 +713,7 @@ p2m_remove_page(struct p2m_domain *p2m,
> unsigned long gfn_l, unsigned long mfn,
>      {
>          int rc = 0;
> 
> -        if ( need_iommu(p2m->domain) )
> +        if ( sync_iommu_pt(p2m->domain) )
>          {
>              bfn_t bfn = _bfn(mfn);
> 
> @@ -777,7 +777,7 @@ guest_physmap_add_entry(struct domain *d, gfn_t
> gfn, mfn_t mfn,
> 
>      if ( !paging_mode_translate(d) )
>      {
> -        if ( need_iommu(d) && t == p2m_ram_rw )
> +        if ( sync_iommu_pt(d) && t == p2m_ram_rw )
>          {
>              bfn_t bfn = _bfn(mfn_x(mfn));
> 
> @@ -1165,7 +1165,7 @@ int set_identity_p2m_entry(struct domain *d,
> unsigned long gfn_l,
> 
>      if ( !paging_mode_translate(d) )
>      {
> -        if ( !need_iommu(d) )
> +        if ( !sync_iommu_pt(d) )
>              return 0;
> 
>          ret = iommu_map_page(d, _bfn(gfn_l), _mfn(gfn_l),
> @@ -1261,7 +1261,7 @@ int clear_identity_p2m_entry(struct domain *d,
> unsigned long gfn_l)
> 
>      if ( !paging_mode_translate(d) )
>      {
> -        if ( !need_iommu(d) )
> +        if ( !sync_iommu_pt(d) )
>              return 0;
> 
>          ret = iommu_unmap_page(d, _bfn(gfn_l));
> diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
> index 2b0445ffe9..d54f6dd496 100644
> --- a/xen/arch/x86/mm/paging.c
> +++ b/xen/arch/x86/mm/paging.c
> @@ -213,7 +213,7 @@ int paging_log_dirty_enable(struct domain *d,
> bool_t log_global)
>  {
>      int ret;
> 
> -    if ( need_iommu(d) && log_global )
> +    if ( has_iommu_pt(d) && log_global )
>      {
>          /*
>           * Refuse to turn on global log-dirty mode
> diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
> index cc58e4cef4..29c3050a7f 100644
> --- a/xen/arch/x86/x86_64/mm.c
> +++ b/xen/arch/x86/x86_64/mm.c
> @@ -1426,7 +1426,8 @@ int memory_add(unsigned long spfn, unsigned
> long epfn, unsigned int pxm)
>      if ( ret )
>          goto destroy_m2p;
> 
> -    if ( iommu_enabled && !iommu_passthrough
> && !need_iommu(hardware_domain) )
> +    if ( iommu_enabled && !iommu_passthrough &&
> +         !sync_iommu_pt(hardware_domain) )
>      {
>          for ( i = spfn; i < epfn; i++ )
>              if ( iommu_map_page(hardware_domain, _bfn(i), _mfn(i),
> diff --git a/xen/common/memory.c b/xen/common/memory.c
> index 0a34677cc3..ad7aa09a5c 100644
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -804,8 +804,8 @@ int xenmem_add_to_physmap(struct domain *d,
> struct xen_add_to_physmap *xatp,
>      xatp->size -= start;
> 
>  #ifdef CONFIG_HAS_PASSTHROUGH
> -    if ( need_iommu(d) )
> -        this_cpu(iommu_dont_flush_iotlb) = 1;
> +    if ( sync_iommu_pt(d) || iommu_use_hap_pt(d) )
> +       this_cpu(iommu_dont_flush_iotlb) = 1;
>  #endif
> 
>      while ( xatp->size > done )
> @@ -827,7 +827,7 @@ int xenmem_add_to_physmap(struct domain *d,
> struct xen_add_to_physmap *xatp,
>      }
> 
>  #ifdef CONFIG_HAS_PASSTHROUGH
> -    if ( need_iommu(d) )
> +    if ( sync_iommu_pt(d) || iommu_use_hap_pt(d) )
>      {
>          int ret;
> 
> diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
> index 144ab81c86..ec3dfd1dae 100644
> --- a/xen/common/vm_event.c
> +++ b/xen/common/vm_event.c
> @@ -644,7 +644,7 @@ int vm_event_domctl(struct domain *d, struct
> xen_domctl_vm_event_op *vec,
> 
>              /* No paging if iommu is used */
>              rc = -EMLINK;
> -            if ( unlikely(need_iommu(d)) )
> +            if ( unlikely(has_iommu_pt(d)) )
>                  break;
> 
>              rc = -EXDEV;
> diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> index eea22c3d0d..752751b103 100644
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -256,7 +256,7 @@ static void __hwdom_init
> amd_iommu_hwdom_init(struct domain *d)
>      if ( allocate_domain_resources(dom_iommu(d)) )
>          BUG();
> 
> -    if ( !iommu_passthrough && !need_iommu(d) )
> +    if ( !iommu_passthrough && !iommu_dom0_strict )
>      {
>          int rc = 0;
> 
> diff --git a/xen/drivers/passthrough/device_tree.c
> b/xen/drivers/passthrough/device_tree.c
> index 671c8db1d0..1aae1a1f93 100644
> --- a/xen/drivers/passthrough/device_tree.c
> +++ b/xen/drivers/passthrough/device_tree.c
> @@ -44,7 +44,7 @@ int iommu_assign_dt_device(struct domain *d, struct
> dt_device_node *dev)
>       * The hwdom is forced to use IOMMU for protecting assigned
>       * device. Therefore the IOMMU data is already set up.
>       */
> -    ASSERT(!is_hardware_domain(d) || need_iommu(d));
> +    ASSERT(!is_hardware_domain(d) || has_iommu_pt(d));
>      rc = iommu_construct(d);
>      if ( rc )
>          goto fail;
> diff --git a/xen/drivers/passthrough/iommu.c
> b/xen/drivers/passthrough/iommu.c
> index ad8ffcfcb2..caf3d125ae 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -179,8 +179,9 @@ void __hwdom_init iommu_hwdom_init(struct
> domain *d)
>          return;
> 
>      register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu
> p2m table", 0);
> -    d->need_iommu = !!iommu_dom0_strict;
> -    if ( need_iommu(d) && !iommu_use_hap_pt(d) )
> +    d->has_iommu_pt = true;
> +    d->sync_iommu_pt = !!iommu_dom0_strict && !iommu_use_hap_pt(d);
> +    if ( sync_iommu_pt(d) )
>      {
>          struct page_info *page;
>          unsigned int i = 0;
> @@ -219,14 +220,14 @@ void iommu_teardown(struct domain *d)
>  {
>      const struct domain_iommu *hd = dom_iommu(d);
> 
> -    d->need_iommu = false;
> +    d->has_iommu_pt = false;
>      hd->platform_ops->teardown(d);
>      tasklet_schedule(&iommu_pt_cleanup_tasklet);
>  }
> 
>  int iommu_construct(struct domain *d)
>  {
> -    if ( need_iommu(d) )
> +    if ( has_iommu_pt(d) )
>          return 0;
> 
>      if ( !iommu_use_hap_pt(d) )
> @@ -236,12 +237,14 @@ int iommu_construct(struct domain *d)
>          rc = arch_iommu_populate_page_table(d);
>          if ( rc )
>              return rc;
> +
> +        d->sync_iommu_pt = true;
>      }
> 
> -    d->need_iommu = true;
> +    d->has_iommu_pt = true;
>      /*
>       * There may be dirty cache lines when a device is assigned
> -     * and before need_iommu(d) becoming true, this will cause
> +     * and before has_iommu_pt(d) becoming true, this will cause
>       * memory_type_changed lose effect if memory type changes.
>       * Call memory_type_changed here to amend this.
>       */
> @@ -500,7 +503,7 @@ static void iommu_dump_p2m_table(unsigned char
> key)
>      ops = iommu_get_ops();
>      for_each_domain(d)
>      {
> -        if ( is_hardware_domain(d) || !need_iommu(d) )
> +        if ( is_hardware_domain(d) || !has_iommu_pt(d) )
>              continue;
> 
>          if ( iommu_use_hap_pt(d) )
> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
> index c4890a4295..3d3ad484e7 100644
> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -1416,7 +1416,7 @@ static int assign_device(struct domain *d, u16 seg,
> u8 bus, u8 devfn, u32 flag)
> 
>      /* Prevent device assign if mem paging or mem sharing have been
>       * enabled for this domain */
> -    if ( unlikely(!need_iommu(d) &&
> +    if ( unlikely(!has_iommu_pt(d) &&
>              (d->arch.hvm_domain.mem_sharing_enabled ||
>               vm_event_check_ring(d->vm_event_paging) ||
>               p2m_get_hostp2m(d)->global_logdirty)) )
> @@ -1460,7 +1460,7 @@ static int assign_device(struct domain *d, u16 seg,
> u8 bus, u8 devfn, u32 flag)
>      }
> 
>   done:
> -    if ( !has_arch_pdevs(d) && need_iommu(d) )
> +    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
>          iommu_teardown(d);
>      pcidevs_unlock();
> 
> @@ -1510,7 +1510,7 @@ int deassign_device(struct domain *d, u16 seg,
> u8 bus, u8 devfn)
> 
>      pdev->fault.count = 0;
> 
> -    if ( !has_arch_pdevs(d) && need_iommu(d) )
> +    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
>          iommu_teardown(d);
> 
>      return ret;
> diff --git a/xen/include/asm-arm/grant_table.h b/xen/include/asm-
> arm/grant_table.h
> index 9c2c815526..f01c5e36c9 100644
> --- a/xen/include/asm-arm/grant_table.h
> +++ b/xen/include/asm-arm/grant_table.h
> @@ -86,7 +86,7 @@ static inline unsigned int gnttab_dom0_max(void)
>      gfn_x(((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])
> 
>  #define gnttab_need_iommu_mapping(d)                    \
> -    (is_domain_direct_mapped(d) && need_iommu(d))
> +    (is_domain_direct_mapped(d) && sync_iommu_pt(d))
> 
>  #endif /* __ASM_GRANT_TABLE_H__ */
>  /*
> diff --git a/xen/include/asm-arm/iommu.h b/xen/include/asm-
> arm/iommu.h
> index 8d1506c6f7..f6df32f860 100644
> --- a/xen/include/asm-arm/iommu.h
> +++ b/xen/include/asm-arm/iommu.h
> @@ -21,7 +21,7 @@ struct arch_iommu
>  };
> 
>  /* Always share P2M Table between the CPU and the IOMMU */
> -#define iommu_use_hap_pt(d) (need_iommu(d))
> +#define iommu_use_hap_pt(d) (has_iommu_pt(d))
> 
>  const struct iommu_ops *iommu_get_ops(void);
>  void __init iommu_set_ops(const struct iommu_ops *ops);
> diff --git a/xen/include/asm-x86/grant_table.h b/xen/include/asm-
> x86/grant_table.h
> index 76ec5dda2c..80e64b74b7 100644
> --- a/xen/include/asm-x86/grant_table.h
> +++ b/xen/include/asm-x86/grant_table.h
> @@ -99,6 +99,6 @@ static inline void gnttab_clear_flag(unsigned int nr,
> uint16_t *st)
>  #define gnttab_release_host_mappings(domain)
> ( paging_mode_external(domain) )
> 
>  #define gnttab_need_iommu_mapping(d)                \
> -    (!paging_mode_translate(d) && need_iommu(d))
> +    (!paging_mode_translate(d) && sync_iommu_pt(d))
> 
>  #endif /* __ASM_GRANT_TABLE_H__ */
> diff --git a/xen/include/asm-x86/iommu.h b/xen/include/asm-
> x86/iommu.h
> index 0e7d66be54..291859520a 100644
> --- a/xen/include/asm-x86/iommu.h
> +++ b/xen/include/asm-x86/iommu.h
> @@ -79,7 +79,7 @@ static inline int iommu_hardware_setup(void)
> 
>  /* Are we using the domain P2M table as its IOMMU pagetable? */
>  #define iommu_use_hap_pt(d) \
> -    (hap_enabled(d) && need_iommu(d) && iommu_hap_pt_share)
> +    (hap_enabled(d) && has_iommu_pt(d) && iommu_hap_pt_share)
> 
>  void iommu_update_ire_from_apic(unsigned int apic, unsigned int reg,
> unsigned int value);
>  unsigned int iommu_read_apic_from_ire(unsigned int apic, unsigned int
> reg);
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index 5b9820e4e1..66da6d6b23 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -372,8 +372,10 @@ struct domain
>  #ifdef CONFIG_HAS_PASSTHROUGH
>      struct domain_iommu iommu;
> 
> -    /* Does this guest need iommu mappings? */
> -    bool             need_iommu;
> +    /* Does this guest have iommu mappings? */
> +    bool             has_iommu_pt;
> +    /* Does this guest need synchronization of iommu mappings? */
> +    bool             sync_iommu_pt;
>  #endif
>      /* is node-affinity automatically computed? */
>      bool             auto_node_affinity;
> @@ -894,9 +896,11 @@ void watchdog_domain_destroy(struct domain
> *d);
>  #define is_pinned_vcpu(v) ((v)->domain->is_pinned || \
>                             cpumask_weight((v)->cpu_hard_affinity) == 1)
>  #ifdef CONFIG_HAS_PASSTHROUGH
> -#define need_iommu(d)    ((d)->need_iommu)
> +#define has_iommu_pt(d) ((d)->has_iommu_pt)
> +#define sync_iommu_pt(d) ((d)->sync_iommu_pt)
>  #else
> -#define need_iommu(d)    (0)
> +#define has_iommu_pt(d) (0)
> +#define sync_iommu_pt(d) (0)
>  #endif
> 
>  static inline bool is_vcpu_online(const struct vcpu *v)
> --
> 2.11.0



* Re: [PATCH v5 12/15] x86: add iommu_op to enable modification of IOMMU mappings
  2018-08-03 17:22 ` [PATCH v5 12/15] x86: add iommu_op to enable modification of IOMMU mappings Paul Durrant
@ 2018-08-07  4:08   ` Tian, Kevin
  2018-08-07  8:32     ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Tian, Kevin @ 2018-08-07  4:08 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Jan Beulich

> From: Paul Durrant
> Sent: Saturday, August 4, 2018 1:22 AM
> 
> This patch adds an iommu_op which checks whether it is possible or
> safe for a domain to modify its own IOMMU mappings and, if so, creates
> a rangeset to track modifications.

I have to say that there might be a concept mismatch between us, so I
will stop reviewing here until we are aligned on the basic
understanding.

What an IOMMU does is to provide DMA isolation between devices. 
Each device can be hooked with a different translation structure 
(representing a different bfn address space). Linux kernel uses this
mechanism to harden kernel drivers (through dma APIs). Multiple 
devices can be also attached to the same address space, used by
hypervisor when devices are assigned to the same VM.

Now with pvIOMMU exposed to dom0, dom0 could use it to harden kernel
drivers too. Then there will be multiple bfn address spaces:

- A default bfn address space created by Xen, where bfn = pfn
- multiple per-bdf bfn address spaces created by Dom0, where
bfn is completely irrelevant to pfn.

The default space should not be changed by Dom0. It is attached to
devices for which dom0 doesn't enable pvIOMMU mapping.

The per-BDF address spaces can be changed by Dom0 and are attached to
devices for which dom0 enables pvIOMMU mapping. The pvIOMMU ops should
then accept a BDF parameter, and internally Xen needs to maintain
multiple page tables for dom0 and find the right one, according to the
specified BDF, to complete the operation.

Now your series looks to assume just one bfn address space across all
assigned devices per domain... I'm not sure how it works.

Did I misunderstand anything?
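
(Editor's sketch, not from the posted series: to make the per-device
model described above concrete, a lookup keyed by SBDF rather than only
by domain might look roughly like the prototype below. The name and
parameters are hypothetical; they follow the seg/bus/devfn style used
by assign_device() elsewhere in this thread.)

/* Hypothetical per-device lookup: first resolve the page table that
 * dom0 manages for this SBDF, then walk it for the given BFN. */
int __must_check pv_iommu_lookup_page(struct domain *d,
                                      u16 seg, u8 bus, u8 devfn,
                                      bfn_t bfn, mfn_t *mfn,
                                      unsigned int *flags);

Nothing in v5 implements this; the series instead assumes a single BFN
address space per domain, which is the point of disagreement in this
thread.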

> 
> NOTE: The actual map and unmap operations are introduced by
> subsequent
>       patches.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> ---
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Julien Grall <julien.grall@arm.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Tim Deegan <tim@xen.org>
> Cc: Wei Liu <wei.liu2@citrix.com>
> 
> v4:
>  - Set sync_iommu_pt to false instead of need_iommu.
> 
> v2:
>  - New in v2.
> ---
>  xen/arch/x86/iommu_op.c         | 42
> +++++++++++++++++++++++++++++++++++++++++
>  xen/drivers/passthrough/iommu.c |  2 +-
>  xen/drivers/passthrough/pci.c   |  4 ++--
>  xen/include/public/iommu_op.h   |  6 ++++++
>  xen/include/xen/iommu.h         |  3 +++
>  5 files changed, 54 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/x86/iommu_op.c b/xen/arch/x86/iommu_op.c
> index bcfcd49102..b29547bffd 100644
> --- a/xen/arch/x86/iommu_op.c
> +++ b/xen/arch/x86/iommu_op.c
> @@ -78,6 +78,42 @@ static int iommu_op_query_reserved(struct
> xen_iommu_op_query_reserved *op)
>      return 0;
>  }
> 
> +static int iommu_op_enable_modification(void)
> +{
> +    struct domain *currd = current->domain;
> +    struct domain_iommu *iommu = dom_iommu(currd);
> +    const struct iommu_ops *ops = iommu->platform_ops;
> +
> +    /* Has modification already been enabled? */
> +    if ( iommu->iommu_op_ranges )
> +        return 0;
> +
> +    /*
> +     * The IOMMU mappings cannot be modified if:
> +     * - the IOMMU is not enabled or,
> +     * - the current domain is dom0 and translation is disabled or,
> +     * - HAP is enabled and the IOMMU shares the mappings.
> +     */
> +    if ( !iommu_enabled ||
> +         (is_hardware_domain(currd) && iommu_passthrough) ||
> +         iommu_use_hap_pt(currd) )
> +        return -EACCES;
> +
> +    /*
> +     * The IOMMU implementation must provide the lookup method if
> +     * modification of the mappings is to be supported.
> +     */
> +    if ( !ops->lookup_page )
> +        return -EOPNOTSUPP;
> +
> +    iommu->iommu_op_ranges = rangeset_new(currd, NULL, 0);
> +    if ( !iommu->iommu_op_ranges )
> +        return -ENOMEM;
> +
> +    currd->sync_iommu_pt = 0; /* Disable synchronization */
> +    return 0;
> +}
> +
>  static void iommu_op(xen_iommu_op_t *op)
>  {
>      switch ( op->op )
> @@ -86,6 +122,10 @@ static void iommu_op(xen_iommu_op_t *op)
>          op->status = iommu_op_query_reserved(&op->u.query_reserved);
>          break;
> 
> +    case XEN_IOMMUOP_enable_modification:
> +        op->status = iommu_op_enable_modification();
> +        break;
> +
>      default:
>          op->status = -EOPNOTSUPP;
>          break;
> @@ -98,6 +138,7 @@ int do_one_iommu_op(xen_iommu_op_buf_t *buf)
>      size_t offset;
>      static const size_t op_size[] = {
>          [XEN_IOMMUOP_query_reserved] = sizeof(struct
> xen_iommu_op_query_reserved),
> +        [XEN_IOMMUOP_enable_modification] = 0,
>      };
>      size_t size;
>      int rc;
> @@ -184,6 +225,7 @@ int
> compat_one_iommu_op(compat_iommu_op_buf_t *buf)
>      size_t offset;
>      static const size_t op_size[] = {
>          [XEN_IOMMUOP_query_reserved] = sizeof(struct
> compat_iommu_op_query_reserved),
> +        [XEN_IOMMUOP_enable_modification] = 0,
>      };
>      size_t size;
>      xen_iommu_op_t nat;
> diff --git a/xen/drivers/passthrough/iommu.c
> b/xen/drivers/passthrough/iommu.c
> index caf3d125ae..8f635a5cdb 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -26,7 +26,6 @@ static void iommu_dump_p2m_table(unsigned char
> key);
> 
>  unsigned int __read_mostly iommu_dev_iotlb_timeout = 1000;
>  integer_param("iommu_dev_iotlb_timeout", iommu_dev_iotlb_timeout);
> -
>  /*
>   * The 'iommu' parameter enables the IOMMU.  Optional comma
> separated
>   * value may contain:
> @@ -265,6 +264,7 @@ void iommu_domain_destroy(struct domain *d)
>      arch_iommu_domain_destroy(d);
> 
>      rangeset_destroy(hd->reserved_ranges);
> +    rangeset_destroy(hd->iommu_op_ranges);
>  }
> 
>  int iommu_map_page(struct domain *d, bfn_t bfn, mfn_t mfn,
> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
> index 3d3ad484e7..d4033af41a 100644
> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -1460,7 +1460,7 @@ static int assign_device(struct domain *d, u16 seg,
> u8 bus, u8 devfn, u32 flag)
>      }
> 
>   done:
> -    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
> +    if ( !has_arch_pdevs(d) && has_iommu_pt(d) && !hd-
> >iommu_op_ranges )
>          iommu_teardown(d);
>      pcidevs_unlock();
> 
> @@ -1510,7 +1510,7 @@ int deassign_device(struct domain *d, u16 seg,
> u8 bus, u8 devfn)
> 
>      pdev->fault.count = 0;
> 
> -    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
> +    if ( !has_arch_pdevs(d) && has_iommu_pt(d) && !hd-
> >iommu_op_ranges )
>          iommu_teardown(d);
> 
>      return ret;
> diff --git a/xen/include/public/iommu_op.h
> b/xen/include/public/iommu_op.h
> index ade404a877..9bf74bd007 100644
> --- a/xen/include/public/iommu_op.h
> +++ b/xen/include/public/iommu_op.h
> @@ -61,6 +61,12 @@ struct xen_iommu_op_query_reserved {
>      XEN_GUEST_HANDLE(xen_iommu_reserved_range_t) ranges;
>  };
> 
> +/*
> + * XEN_IOMMUOP_enable_modification: Enable operations that modify
> IOMMU
> + *                                  mappings.
> + */
> +#define XEN_IOMMUOP_enable_modification 2
> +
>  struct xen_iommu_op {
>      uint16_t op;    /* op type */
>      uint16_t pad;
> diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> index 7c5d46df81..08b163cbcb 100644
> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -130,6 +130,9 @@ struct domain_iommu {
>       * must not be modified after initialization.
>       */
>      struct rangeset *reserved_ranges;
> +
> +    /* Ranges under the control of iommu_op */
> +    struct rangeset *iommu_op_ranges;
>  };
> 
>  #define dom_iommu(d)              (&(d)->iommu)
> --
> 2.11.0
> 
> 

* Re: [PATCH v5 02/15] iommu: introduce the concept of BFN...
  2018-08-07  2:38   ` Tian, Kevin
@ 2018-08-07  7:59     ` Paul Durrant
  2018-08-07  8:26       ` Tian, Kevin
  0 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2018-08-07  7:59 UTC (permalink / raw)
  To: Kevin Tian, xen-devel
  Cc: Julien Grall, Stefano Stabellini, Jan Beulich, Suravee Suthikulpanit

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: 07 August 2018 03:38
> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Stefano
> Stabellini <sstabellini@kernel.org>; Julien Grall <julien.grall@arm.com>; Jan
> Beulich <jbeulich@suse.com>
> Subject: RE: [PATCH v5 02/15] iommu: introduce the concept of BFN...
> 
> > From: Paul Durrant [mailto:paul.durrant@citrix.com]
> > Sent: Saturday, August 4, 2018 1:22 AM
> >
> > ...meaning 'bus frame number' i.e. a frame number mapped in the IOMMU
> > rather than the MMU.
> >
> > This patch is a largely cosmetic change that substitutes the terms 'gfn'
> > and 'gaddr' for 'bfn' and 'baddr' in all the places where the frame number
> > or address relate to the IOMMU rather than the MMU.
> >
> > The parts that are not purely cosmetic are:
> >
> >  - the introduction of a type-safe declaration of bfn_t and definition of
> >    INVALID_BFN to make the substitution of gfn_x(INVALID_GFN)
> mechanical.
> >  - the introduction of __bfn_to_baddr and __baddr_to_bfn (and type-safe
> >    variants without the leading __) with some use of the former.
> >
> > Subsequent patches will convert code to make use of type-safe BFNs.
> >
> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> > Reviewed-by: Wei Liu <wei.liu2@citrix.com>
> 
> [...]
> 
> > diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
> > index 24654e8e22..4ebf91f17e 100644
> > --- a/xen/include/xen/mm.h
> > +++ b/xen/include/xen/mm.h
> > @@ -14,8 +14,9 @@
> >   * should be adhered to.
> >   *
> >   * mfn: Machine Frame Number
> > - *   The values Xen puts into its own pagetables.  This is the host physical
> > - *   memory address space with RAM, MMIO etc.
> > + *   The values Xen puts into its own pagetables, 2nd stage pagetables
> > (where
> > + *   hardware assisted 2nd stage translation is used) or IOMMU page
> > tables.
> > + *   This is the host physical memory address space with RAM, MMIO etc.
> 
> there is also shadow page table... not sure whether it is really helpful
> to list all of them out. "Xen-owned pagetables" should be sufficient to cover.
> 
> >   *
> >   * gfn: Guest Frame Number
> >   *   The values a guest puts in its own pagetables.  For an auto-translated
> > @@ -26,6 +27,11 @@
> >   *   A linear idea of a guest physical address space. For an auto-translated
> >   *   guest, pfn == gfn while for a non-translated guest, pfn != gfn.
> >   *
> > + * bfn: Bus Frame Number (definitions in include/xen/iommu.h)
> > + *   The linear frame numbers of IOMMU address space. All initiators for a
> > + *   given domain share a single IOMMU address space and, by default,
> > Xen will
> > + *   ensure bfn == pfn.
> > + *
> 
> Above description is a bit confusing:
> 
> - for 'domain', are you talking about VM or IOMMU domain? If the former
> then each initiator could have its own IOMMU address space when a virtual
> IOMMU (either pvIOMMU or emulated IOMMU) is exposed.
> 

I mean VM and, yes, while each initiator *could* have its own IOMMU address space, that is not what we do (or have ever done) in Xen. I can replace the word 'domain' with the abbreviation 'VM' if you feel it makes things clearer.

> - since you talk about 'by default', better also talk about non-default case,
> i.e. when virtual IOMMU is exposed then bfn is a separate address space
> managed by guest

That's putting the cart before the horse really. There is no PV-IOMMU at this stage in the series. I guess I could modify the comment in the later patch where I add the hypercall to enable PV-IOMMU.

  Paul

> 
> >   * WARNING: Some of these terms have changed over time while others
> > have been
> >   * used inconsistently, meaning that a lot of existing code does not match
> > the
> >   * definitions above.  New code should use these terms as described here,
> > and
> > --
> > 2.11.0



* Re: [PATCH v5 05/15] iommu: don't domain_crash() inside iommu_map/unmap_page()
  2018-08-07  2:55   ` Tian, Kevin
@ 2018-08-07  8:05     ` Paul Durrant
  2018-08-07  8:23       ` Jan Beulich
  0 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2018-08-07  8:05 UTC (permalink / raw)
  To: Kevin Tian, xen-devel
  Cc: Stefano Stabellini, Nakajima, Jun, Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Jan Beulich, Ian Jackson

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: 07 August 2018 03:56
> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
> Cc: Stefano Stabellini <sstabellini@kernel.org>; Julien Grall
> <julien.grall@arm.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>;
> Ian Jackson <Ian.Jackson@citrix.com>; Jan Beulich <jbeulich@suse.com>;
> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Tim (Xen.org)
> <tim@xen.org>; Nakajima, Jun <jun.nakajima@intel.com>; George Dunlap
> <George.Dunlap@citrix.com>
> Subject: RE: [PATCH v5 05/15] iommu: don't domain_crash() inside
> iommu_map/unmap_page()
> 
> > From: Paul Durrant [mailto:paul.durrant@citrix.com]
> > Sent: Saturday, August 4, 2018 1:22 AM
> >
> > Turn iommu_map/unmap_page() into straightforward wrappers that check
> > the
> 
> iommu_iotlb_flush is also changed.

Indeed it is. I'll call it out.

> 
> > existence of the relevant iommu_op and call through to it. This makes
> them
> > usable by PV IOMMU code to be delivered in future patches.
> > Leave the decision on whether to invoke domain_crash() up to the caller.
> > This has the added benefit that the (module/line number) message that
> > domain_crash() spits out will be more indicative of where the problem lies.
> >
> > NOTE: This patch includes one bit of clean-up in set_identity_p2m_entry()
> >       replacing use of p2m->domain with the domain pointer passed into the
> >       function.
> >
> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> > Reviewed-by: George Dunlap <George.Dunlap@eu.citrix.com>
> > Reviewed-by: Wei Liu <wei.liu2@citrix.com>
> 
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>, with one small comment:

Thanks.

> 
> >
> >      if ( need_iommu(p2m->domain) &&
> >           (lpae_valid(orig_pte) || lpae_valid(*entry)) )
> > +    {
> >          rc = iommu_iotlb_flush(p2m->domain, _bfn(gfn_x(sgfn)),
> >                                 1UL << page_order);
> > +        if ( unlikely(rc) && !is_hardware_domain(p2m->domain) )
> > +            domain_crash(p2m->domain);
> > +    }
> 
> to reduce duplication, does it make sense to introduce a wrapper
> like domain_crash_nd ('nd' indicate !is_hardware_domain, and
> becomes a nop for hardware domain)? Then it becomes:
> 
> 	if ( unlikely(rc) )
> 		domain_crash_nd(p2m->domain);

That's a bigger change and I'd like to defer to the other maintainers as to whether they think it is a good idea. I'm happy to change this in v6 if anyone gives me a +1.

  Paul

> 
> Thanks
> Kevin


* Re: [PATCH v5 06/15] public / x86: introduce __HYPERCALL_iommu_op
  2018-08-07  3:00   ` Tian, Kevin
@ 2018-08-07  8:10     ` Paul Durrant
  2018-08-07  8:25       ` Jan Beulich
  0 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2018-08-07  8:10 UTC (permalink / raw)
  To: Kevin Tian, xen-devel, Julien Grall (julien.grall@arm.com)
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Tim (Xen.org),
	George Dunlap, Jan Beulich, Ian Jackson, Daniel De Graaf

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: 07 August 2018 04:00
> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
> Cc: Stefano Stabellini <sstabellini@kernel.org>; Wei Liu
> <wei.liu2@citrix.com>; George Dunlap <George.Dunlap@citrix.com>;
> Andrew Cooper <Andrew.Cooper3@citrix.com>; Ian Jackson
> <Ian.Jackson@citrix.com>; Tim (Xen.org) <tim@xen.org>; Jan Beulich
> <jbeulich@suse.com>; Daniel De Graaf <dgdegra@tycho.nsa.gov>
> Subject: RE: [Xen-devel] [PATCH v5 06/15] public / x86: introduce
> __HYPERCALL_iommu_op
> 
> > From: Paul Durrant
> > Sent: Saturday, August 4, 2018 1:22 AM
> >
> > This patch introduces the boilerplate for a new hypercall to allow a
> > domain to control IOMMU mappings for its own pages.
> > Whilst there is duplication of code between the native and compat entry
> > points which appears ripe for some form of combination, I think it is
> > better to maintain the separation as-is because the compat entry point
> > will necessarily gain complexity in subsequent patches.
> >
> > NOTE: This hypercall is only implemented for x86 and is currently
> >       restricted by XSM to dom0. Its scope can be expanded in future
> >       if need be.
> >
> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> [...]
> 
> > diff --git a/xen/arch/x86/iommu_op.c b/xen/arch/x86/iommu_op.c
> > new file mode 100644
> > index 0000000000..744c0fce27
> > --- /dev/null
> > +++ b/xen/arch/x86/iommu_op.c
> > @@ -0,0 +1,184 @@
> >
> +/*********************************************************
> ***
> > ******************
> > + * x86/iommu_op.c
> > + *
> > + * Paravirtualised IOMMU functionality
> 
> although only x86 is supported, logic in this file is vendor agnostic. Better
> move to a common place now...
> 

Personally I don't like to expand scope into architectures which I have no facility to test. IIUC ARM also very much relies on an identity GFN:BFN map, so this functionality is unlikely to get used by anything other than x86 for the foreseeable future.

Julien, would you prefer that this be in common code?

  Paul

> Thanks
> Kevin

* Re: [PATCH v5 07/15] iommu: track reserved ranges using a rangeset
  2018-08-07  3:04   ` Tian, Kevin
@ 2018-08-07  8:16     ` Paul Durrant
  2018-08-07  8:23       ` Tian, Kevin
  0 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2018-08-07  8:16 UTC (permalink / raw)
  To: Kevin Tian, xen-devel; +Cc: Jan Beulich

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: 07 August 2018 04:04
> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
> Cc: Jan Beulich <jbeulich@suse.com>
> Subject: RE: [PATCH v5 07/15] iommu: track reserved ranges using a rangeset
> 
> > From: Paul Durrant [mailto:paul.durrant@citrix.com]
> > Sent: Saturday, August 4, 2018 1:22 AM
> >
> > Ranges that should be considered reserved in the IOMMU are not
> > necessarily
> > limited to RMRRs. If iommu_inclusive_mapping is set then any frame
> > number
> > falling within an E820 reserved region should also be considered as
> > reserved in the IOMMU.
> 
> I don't think it is a good extension. RMRR by definition requires
> identity mapping in IOMMU page table, thus must be also reserved
> in bfn address space (to avoid being used for other purpose).
> Normal E820 reserved regions have no such requirement. Guest
> can use same bfn location for any purpose. I'm not sure why we
> want to add more limitations here.

Have you tried booting an average Intel-based server without identity-mapped E820 reserved regions? Good luck with that, because I think about half the servers on the planet will wedge up if you only give them RMRRs. My Dell R730 certainly will, and I'd hazard a guess that any other Dell box we have with an iDRAC will too.
Note the behaviour is predicated on 'iommu_inclusive_mapping' being set, so if you feel lucky then you can turn it off :-)

  Paul
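
(Editor's note, not code from the series: the rangeset added here is
what the later iommu_op patches consult before letting a domain touch a
mapping. A check along the following lines is the sort of "is it safe
to modify this entry" test the commit message refers to;
rangeset_contains_singleton() is assumed to be the existing rangeset
helper, and the -EPERM value is illustrative.)

static int bfn_is_modifiable(const struct domain *d, bfn_t bfn)
{
    const struct domain_iommu *hd = dom_iommu(d);

    /* BFNs recorded at initialization (RMRRs and, with
     * iommu_inclusive_mapping, E820 reserved regions) must keep their
     * identity mappings. */
    if ( rangeset_contains_singleton(hd->reserved_ranges, bfn_x(bfn)) )
        return -EPERM;

    return 0;
}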

> 
> > This patch adds a rangeset to the domain_iommu structure that is then
> > used
> > to track all reserved ranges. This will be needed by a subsequent patch
> > to test whether it is safe to modify a particular IOMMU entry.
> >
> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> > Reviewed-by: George Dunlap <george.dunlap@citrix.com>
> > Reviewed-by: Wei Liu <wei.liu2@citrix.com>
> > ---
> > Cc: Jan Beulich <jbeulich@suse.com>
> > Cc: Kevin Tian <kevin.tian@intel.com>
> >
> > v2:
> >  - New in v2.
> > ---
> >  xen/drivers/passthrough/iommu.c       | 10 +++++++++-
> >  xen/drivers/passthrough/vtd/iommu.c   | 20 +++++++++++++-------
> >  xen/drivers/passthrough/vtd/x86/vtd.c | 17 ++++++++++++++++-
> >  xen/include/xen/iommu.h               |  6 ++++++
> >  4 files changed, 44 insertions(+), 9 deletions(-)
> >
> > diff --git a/xen/drivers/passthrough/iommu.c
> > b/xen/drivers/passthrough/iommu.c
> > index 21e6886a3f..b10a37e5d7 100644
> > --- a/xen/drivers/passthrough/iommu.c
> > +++ b/xen/drivers/passthrough/iommu.c
> > @@ -147,6 +147,10 @@ int iommu_domain_init(struct domain *d)
> >      if ( !iommu_enabled )
> >          return 0;
> >
> > +    hd->reserved_ranges = rangeset_new(d, NULL, 0);
> > +    if ( !hd->reserved_ranges )
> > +        return -ENOMEM;
> > +
> >      hd->platform_ops = iommu_get_ops();
> >      return hd->platform_ops->init(d);
> >  }
> > @@ -248,12 +252,16 @@ int iommu_construct(struct domain *d)
> >
> >  void iommu_domain_destroy(struct domain *d)
> >  {
> > -    if ( !iommu_enabled || !dom_iommu(d)->platform_ops )
> > +    const struct domain_iommu *hd = dom_iommu(d);
> > +
> > +    if ( !iommu_enabled || !hd->platform_ops )
> >          return;
> >
> >      iommu_teardown(d);
> >
> >      arch_iommu_domain_destroy(d);
> > +
> > +    rangeset_destroy(hd->reserved_ranges);
> >  }
> >
> >  int iommu_map_page(struct domain *d, bfn_t bfn, mfn_t mfn,
> > diff --git a/xen/drivers/passthrough/vtd/iommu.c
> > b/xen/drivers/passthrough/vtd/iommu.c
> > index c9f50f04ad..282e227414 100644
> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > @@ -1910,6 +1910,7 @@ static int rmrr_identity_mapping(struct domain
> > *d, bool_t map,
> >      unsigned long end_pfn = PAGE_ALIGN_4K(rmrr->end_address) >>
> > PAGE_SHIFT_4K;
> >      struct mapped_rmrr *mrmrr;
> >      struct domain_iommu *hd = dom_iommu(d);
> > +    int err = 0;
> >
> >      ASSERT(pcidevs_locked());
> >      ASSERT(rmrr->base_address < rmrr->end_address);
> > @@ -1923,8 +1924,6 @@ static int rmrr_identity_mapping(struct domain
> > *d, bool_t map,
> >          if ( mrmrr->base == rmrr->base_address &&
> >               mrmrr->end == rmrr->end_address )
> >          {
> > -            int ret = 0;
> > -
> >              if ( map )
> >              {
> >                  ++mrmrr->count;
> > @@ -1934,28 +1933,35 @@ static int rmrr_identity_mapping(struct
> > domain *d, bool_t map,
> >              if ( --mrmrr->count )
> >                  return 0;
> >
> > -            while ( base_pfn < end_pfn )
> > +            err = rangeset_remove_range(hd->reserved_ranges,
> > +                                        base_pfn, end_pfn);
> > +            while ( !err && base_pfn < end_pfn )
> >              {
> >                  if ( clear_identity_p2m_entry(d, base_pfn) )
> > -                    ret = -ENXIO;
> > +                    err = -ENXIO;
> > +
> >                  base_pfn++;
> >              }
> >
> >              list_del(&mrmrr->list);
> >              xfree(mrmrr);
> > -            return ret;
> > +            return err;
> >          }
> >      }
> >
> >      if ( !map )
> >          return -ENOENT;
> >
> > +    err = rangeset_add_range(hd->reserved_ranges, base_pfn, end_pfn);
> > +    if ( err )
> > +        return err;
> > +
> >      while ( base_pfn < end_pfn )
> >      {
> > -        int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag);
> > -
> > +        err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag);
> >          if ( err )
> >              return err;
> > +
> >          base_pfn++;
> >      }
> >
> > diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c
> > b/xen/drivers/passthrough/vtd/x86/vtd.c
> > index 6fed4a92cb..032412b8c6 100644
> > --- a/xen/drivers/passthrough/vtd/x86/vtd.c
> > +++ b/xen/drivers/passthrough/vtd/x86/vtd.c
> > @@ -164,10 +164,25 @@ void __hwdom_init
> > vtd_set_hwdom_mapping(struct domain *d)
> >
> >              if ( !rc )
> >                 rc = ret;
> > +
> > +            /*
> > +             * The only reason a reserved page would be mapped is that
> > +             * iommu_inclusive_mapping is set, in which case the BFN
> > +             * needs to be marked as reserved in the IOMMU.
> > +             */
> > +            if ( page_is_ram_type(pfn, RAM_TYPE_RESERVED) )
> > +            {
> > +                ASSERT(iommu_inclusive_mapping);
> > +
> > +                ret = rangeset_add_singleton(dom_iommu(d)-
> >reserved_ranges,
> > +                                             bfn_x(bfn));
> > +                if ( !rc )
> > +                    rc = ret;
> > +            }
> >          }
> >
> >          if ( rc )
> > -            printk(XENLOG_WARNING VTDPREFIX " d%d: IOMMU mapping
> > failed: %d\n",
> > +            printk(XENLOG_WARNING VTDPREFIX " d%d: IOMMU
> > mapping/reservation failed: %d\n",
> >                     d->domain_id, rc);
> >
> >          if (!(i & (0xfffff >> (PAGE_SHIFT - PAGE_SHIFT_4K))))
> > diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> > index 624784fec8..cc0be81b4e 100644
> > --- a/xen/include/xen/iommu.h
> > +++ b/xen/include/xen/iommu.h
> > @@ -122,6 +122,12 @@ struct domain_iommu {
> >
> >      /* Features supported by the IOMMU */
> >      DECLARE_BITMAP(features, IOMMU_FEAT_count);
> > +
> > +    /*
> > +     * BFN ranges that are reserved in the domain IOMMU mappings and
> > +     * must not be modified after initialization.
> > +     */
> > +    struct rangeset *reserved_ranges;
> >  };
> >
> >  #define dom_iommu(d)              (&(d)->iommu)
> > --
> > 2.11.0



* Re: [PATCH v5 09/15] vtd: add lookup_page method to iommu_ops
  2018-08-07  3:25   ` Tian, Kevin
@ 2018-08-07  8:21     ` Paul Durrant
  2018-08-07  8:29       ` Jan Beulich
  2018-08-07  8:31       ` Tian, Kevin
  0 siblings, 2 replies; 57+ messages in thread
From: Paul Durrant @ 2018-08-07  8:21 UTC (permalink / raw)
  To: Kevin Tian, xen-devel; +Cc: George Dunlap, Jan Beulich

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: 07 August 2018 04:25
> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
> Cc: Jan Beulich <jbeulich@suse.com>; George Dunlap
> <George.Dunlap@citrix.com>
> Subject: RE: [PATCH v5 09/15] vtd: add lookup_page method to iommu_ops
> 
> > From: Paul Durrant [mailto:paul.durrant@citrix.com]
> > Sent: Saturday, August 4, 2018 1:22 AM
> >
> > This patch adds a new method to the VT-d IOMMU implementation to find
> > the
> > MFN currently mapped by the specified BFN along with a wrapper function
> > in
> > generic IOMMU code to call the implementation if it exists.
> >
> > This functionality will be used by a subsequent patch.
> >
> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> > Reviewed-by: Wei Liu <wei.liu2@citrix.com>
> > ---
> > Cc: Kevin Tian <kevin.tian@intel.com>
> > Cc: Jan Beulich <jbeulich@suse.com>
> > Cc: George Dunlap <george.dunlap@citrix.com>
> >
> > v3:
> >  - Addressed comments from George.
> >
> > v2:
> >  - Addressed some comments from Jan.
> > ---
> >  xen/drivers/passthrough/iommu.c     | 11 +++++++++++
> >  xen/drivers/passthrough/vtd/iommu.c | 34
> > ++++++++++++++++++++++++++++++++++
> >  xen/drivers/passthrough/vtd/iommu.h |  3 +++
> >  xen/include/xen/iommu.h             |  4 ++++
> >  4 files changed, 52 insertions(+)
> >
> > diff --git a/xen/drivers/passthrough/iommu.c
> > b/xen/drivers/passthrough/iommu.c
> > index b10a37e5d7..9b7baca93f 100644
> > --- a/xen/drivers/passthrough/iommu.c
> > +++ b/xen/drivers/passthrough/iommu.c
> > @@ -305,6 +305,17 @@ int iommu_unmap_page(struct domain *d, bfn_t
> > bfn)
> >      return rc;
> >  }
> >
> > +int iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
> > +                      unsigned int *flags)
> > +{
> > +    const struct domain_iommu *hd = dom_iommu(d);
> > +
> > +    if ( !iommu_enabled || !hd->platform_ops )
> > +        return -EOPNOTSUPP;
> > +
> > +    return hd->platform_ops->lookup_page(d, bfn, mfn, flags);
> > +}
> > +
> >  static void iommu_free_pagetables(unsigned long unused)
> >  {
> >      do {
> > diff --git a/xen/drivers/passthrough/vtd/iommu.c
> > b/xen/drivers/passthrough/vtd/iommu.c
> > index 282e227414..8cd3b59aa0 100644
> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > @@ -1830,6 +1830,39 @@ static int __must_check
> > intel_iommu_unmap_page(struct domain *d,
> >      return dma_pte_clear_one(d, bfn_to_baddr(bfn));
> >  }
> >
> > +static int intel_iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t
> > *mfn,
> > +                                   unsigned int *flags)
> 
> Not looking at later patches yet... but in concept bfn address
> space is per device instead of per domain.

Not in this case. Xen has always maintained a single IOMMU address space per virtual machine. That is what BFN refers to.

  Paul

> In default situation
> (w/o pvIOMMU exposed), all devices assigned to dom0 share
> the same address space (bfn=pfn) which is currently linked
> from domain structure. Then with pvIOMMU exposed, dom0
> starts to manage individual pfn address space (called IOVA
> address space within dom0) per assigned device. In that case
> lookup should accept a bdf number and then find the right
> page table...
> 

No. That is over-complicating things and would probably involve re-writing most of the IOMMU code in Xen AFAICT.

> > +{
> > +    struct domain_iommu *hd = dom_iommu(d);
> > +    struct dma_pte *page = NULL, *pte = NULL, val;
> > +    u64 pg_maddr;
> > +
> > +    spin_lock(&hd->arch.mapping_lock);
> > +
> > +    pg_maddr = addr_to_dma_page_maddr(d, bfn_to_baddr(bfn), 0);
> > +    if ( pg_maddr == 0 )
> > +    {
> > +        spin_unlock(&hd->arch.mapping_lock);
> > +        return -ENOMEM;
> > +    }
> > +
> > +    page = map_vtd_domain_page(pg_maddr);
> > +    pte = page + (bfn_x(bfn) & LEVEL_MASK);
> > +    val = *pte;
> > +
> > +    unmap_vtd_domain_page(page);
> > +    spin_unlock(&hd->arch.mapping_lock);
> > +
> > +    if ( !dma_pte_present(val) )
> > +        return -ENOENT;
> > +
> > +    *mfn = maddr_to_mfn(dma_pte_addr(val));
> > +    *flags = dma_pte_read(val) ? IOMMUF_readable : 0;
> > +    *flags |= dma_pte_write(val) ? IOMMUF_writable : 0;
> > +
> > +    return 0;
> > +}
> > +
> >  int iommu_pte_flush(struct domain *d, uint64_t bfn, uint64_t *pte,
> >                      int order, int present)
> >  {
> > @@ -2661,6 +2694,7 @@ const struct iommu_ops intel_iommu_ops = {
> >      .teardown = iommu_domain_teardown,
> >      .map_page = intel_iommu_map_page,
> >      .unmap_page = intel_iommu_unmap_page,
> > +    .lookup_page = intel_iommu_lookup_page,
> >      .free_page_table = iommu_free_page_table,
> >      .reassign_device = reassign_device_ownership,
> >      .get_device_group_id = intel_iommu_group_id,
> > diff --git a/xen/drivers/passthrough/vtd/iommu.h
> > b/xen/drivers/passthrough/vtd/iommu.h
> > index 72c1a2e3cd..47bdfcb5ea 100644
> > --- a/xen/drivers/passthrough/vtd/iommu.h
> > +++ b/xen/drivers/passthrough/vtd/iommu.h
> > @@ -272,6 +272,9 @@ struct dma_pte {
> >  #define dma_set_pte_prot(p, prot) do { \
> >          (p).val = ((p).val & ~DMA_PTE_PROT) | ((prot) & DMA_PTE_PROT); \
> >      } while (0)
> > +#define dma_pte_prot(p) ((p).val & DMA_PTE_PROT)
> > +#define dma_pte_read(p) (dma_pte_prot(p) & DMA_PTE_READ)
> > +#define dma_pte_write(p) (dma_pte_prot(p) & DMA_PTE_WRITE)
> >  #define dma_pte_addr(p) ((p).val & PADDR_MASK & PAGE_MASK_4K)
> >  #define dma_set_pte_addr(p, addr) do {\
> >              (p).val |= ((addr) & PAGE_MASK_4K); } while (0)
> > diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> > index cc0be81b4e..7c5d46df81 100644
> > --- a/xen/include/xen/iommu.h
> > +++ b/xen/include/xen/iommu.h
> > @@ -100,6 +100,8 @@ void iommu_teardown(struct domain *d);
> >  int __must_check iommu_map_page(struct domain *d, bfn_t bfn,
> >                                  mfn_t mfn, unsigned int flags);
> >  int __must_check iommu_unmap_page(struct domain *d, bfn_t bfn);
> > +int __must_check iommu_lookup_page(struct domain *d, bfn_t bfn,
> > mfn_t *mfn,
> > +                                   unsigned int *flags);
> >
> >  enum iommu_feature
> >  {
> > @@ -198,6 +200,8 @@ struct iommu_ops {
> >      int __must_check (*map_page)(struct domain *d, bfn_t bfn, mfn_t
> mfn,
> >                                   unsigned int flags);
> >      int __must_check (*unmap_page)(struct domain *d, bfn_t bfn);
> > +    int __must_check (*lookup_page)(struct domain *d, bfn_t bfn, mfn_t
> > *mfn,
> > +                                    unsigned int *flags);
> >      void (*free_page_table)(struct page_info *);
> >  #ifdef CONFIG_X86
> >      void (*update_ire_from_apic)(unsigned int apic, unsigned int reg,
> > unsigned int value);
> > --
> > 2.11.0
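
(Editor's sketch illustrating the point above that Xen keeps a single
BFN address space per VM: a caller of the new wrapper needs only the
domain, a BFN and somewhere to put the result - no device identifier.
Variable names are illustrative.)

    mfn_t mfn;
    unsigned int flags;
    int rc = iommu_lookup_page(d, bfn, &mfn, &flags);

    if ( rc )
        return rc;  /* e.g. -ENOENT if no mapping exists at this BFN */

    /* 'flags' carries IOMMUF_readable / IOMMUF_writable as set by the
     * VT-d implementation quoted above. */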



* Re: [PATCH v5 05/15] iommu: don't domain_crash() inside iommu_map/unmap_page()
  2018-08-07  8:05     ` Paul Durrant
@ 2018-08-07  8:23       ` Jan Beulich
  0 siblings, 0 replies; 57+ messages in thread
From: Jan Beulich @ 2018-08-07  8:23 UTC (permalink / raw)
  To: Paul Durrant, Kevin Tian
  Cc: Stefano Stabellini, Andrew Cooper, Tim Deegan, george.dunlap,
	Julien Grall, Jun Nakajima, xen-devel, IanJackson

>>> On 07.08.18 at 10:05, <Paul.Durrant@citrix.com> wrote:
>> From: Tian, Kevin [mailto:kevin.tian@intel.com]
>> Sent: 07 August 2018 03:56
>> > From: Paul Durrant [mailto:paul.durrant@citrix.com]
>> > Sent: Saturday, August 4, 2018 1:22 AM
>> >      if ( need_iommu(p2m->domain) &&
>> >           (lpae_valid(orig_pte) || lpae_valid(*entry)) )
>> > +    {
>> >          rc = iommu_iotlb_flush(p2m->domain, _bfn(gfn_x(sgfn)),
>> >                                 1UL << page_order);
>> > +        if ( unlikely(rc) && !is_hardware_domain(p2m->domain) )
>> > +            domain_crash(p2m->domain);
>> > +    }
>> 
>> to reduce duplication, does it make sense to introduce a wrapper
>> like domain_crash_nd ('nd' indicate !is_hardware_domain, and
>> becomes a nop for hardware domain)? Then it becomes:
>> 
>> 	if ( unlikely(rc) )
>> 		domain_crash_nd(p2m->domain);
> 
> That's a bigger change and I'd like to defer to the other maintainers as to 
> whether they think it is a good idea. I'm happy to change this in v6 if 
> anyone gives me a +1.

Well, if this is a common idiom, then yes, I think a helper would
be desirable. However - what is the _nd suffix supposed to
stand for? I'm inclined to suggest domu_crash().

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 07/15] iommu: track reserved ranges using a rangeset
  2018-08-07  8:16     ` Paul Durrant
@ 2018-08-07  8:23       ` Tian, Kevin
  0 siblings, 0 replies; 57+ messages in thread
From: Tian, Kevin @ 2018-08-07  8:23 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: Jan Beulich

> From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> Sent: Tuesday, August 7, 2018 4:16 PM
> 
> > -----Original Message-----
> > From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > Sent: 07 August 2018 04:04
> > To: Paul Durrant <Paul.Durrant@citrix.com>; xen-
> devel@lists.xenproject.org
> > Cc: Jan Beulich <jbeulich@suse.com>
> > Subject: RE: [PATCH v5 07/15] iommu: track reserved ranges using a
> rangeset
> >
> > > From: Paul Durrant [mailto:paul.durrant@citrix.com]
> > > Sent: Saturday, August 4, 2018 1:22 AM
> > >
> > > Ranges that should be considered reserved in the IOMMU are not
> > > necessarily
> > > limited to RMRRs. If iommu_inclusive_mapping is set then any frame
> > > number
> > > falling within an E820 reserved region should also be considered as
> > > reserved in the IOMMU.
> >
> > I don't think it is a good extension. RMRR by definition requires
> > identity mapping in IOMMU page table, thus must be also reserved
> > in bfn address space (to avoid being used for other purpose).
> > Normal E820 reserved regions have no such requirement. Guest
> > can use same bfn location for any purpose. I'm not sure why we
> > want to add more limitations here.
> 
> Have you tried booting an average Intel based server without identity
> mapped E820 reserved regions? Good luck with that, because I think about
> half the servers on the planet will wedge up if you only give them RMRRs.
> My Dell R730 certainly will and I'd hazard a guess that any other Dell box
> we have with an iDRAC will too.
> Note the behaviour is predicated on 'iommu_inclusive_mapping' being set
> so if you feel lucky then you can turn it off :-)
> 

I know that fact... it looks like I misunderstood the patch - I thought you
were blindly adding all reserved regions in this patch, bypassing the existing
control from inclusive_mapping. Having now re-read the description and
code, I see you are just giving background :-)

Thanks
Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 11/15] mm / iommu: split need_iommu() into has_iommu_pt() and sync_iommu_pt()
  2018-08-07  3:41   ` Tian, Kevin
@ 2018-08-07  8:24     ` Paul Durrant
  0 siblings, 0 replies; 57+ messages in thread
From: Paul Durrant @ 2018-08-07  8:24 UTC (permalink / raw)
  To: Kevin Tian, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Nakajima, Jun, Razvan Cojocaru,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Tamas K Lengyel, Jan Beulich,
	Ian Jackson, Brian Woods, Suravee Suthikulpanit

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: 07 August 2018 04:41
> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
> Cc: Stefano Stabellini <sstabellini@kernel.org>; Julien Grall
> <julien.grall@arm.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>;
> George Dunlap <George.Dunlap@citrix.com>; Ian Jackson
> <Ian.Jackson@citrix.com>; Jan Beulich <jbeulich@suse.com>; Konrad
> Rzeszutek Wilk <konrad.wilk@oracle.com>; Tim (Xen.org) <tim@xen.org>;
> Wei Liu <wei.liu2@citrix.com>; Tamas K Lengyel <tamas@tklengyel.com>;
> George Dunlap <George.Dunlap@citrix.com>; Nakajima, Jun
> <jun.nakajima@intel.com>; Razvan Cojocaru <rcojocaru@bitdefender.com>;
> Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Brian Woods
> <brian.woods@amd.com>
> Subject: RE: [PATCH v5 11/15] mm / iommu: split need_iommu() into
> has_iommu_pt() and sync_iommu_pt()
> 
> > From: Paul Durrant [mailto:paul.durrant@citrix.com]
> > Sent: Saturday, August 4, 2018 1:22 AM
> >
> > The name 'need_iommu' is a little confusing as it suggests a domain needs
> > to use the IOMMU but something might not be set up yet, when in fact it
> > doesn't become true until IOMMU mappings for the domain have been
> > created.
> >
> > Two different meanings are also inferred from it in various places in the
> > code:
> >
> > - Some callers want to test whether a domain has IOMMU mappings
> > - Some callers want to test whether they need to synchronize the domain's
> >   P2M and IOMMU mappings
> >
> > This patch therefore creates two new boolean flags, 'has_iommu_pt' and
> > 'sync_iommu_pt' to convey those meanings separately and creates the
> > corresponding macros. All callers of need_iommu() are then modified to
> > use the macro appropriate to what they are trying to test.
> 
> sync_iommu_pt sounds like an operation. what about need_sync_iommu?
> 
> and for has_iommu_pt, what about iommu_enabled where 'enabled'
> implies page table created?

I could do 'need_iommu_pt_sync', perhaps, but 'iommu_enabled' would clash with the global host flag, which is why I went with 'has_iommu_pt'.
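
For concreteness, with that naming the pair might look something like the sketch below (the flag fields follow the patch description; the exact definitions are an assumption, not the final code):

/* Replaces the single, ambiguous need_iommu(d) test with two questions. */
#define has_iommu_pt(d)       ((d)->has_iommu_pt)  /* do IOMMU mappings exist? */
#define need_iommu_pt_sync(d) ((d)->sync_iommu_pt) /* must P2M changes be
                                                    * mirrored to the IOMMU? */

/* Callers then pick whichever meaning they actually want, e.g.:
 *   if ( has_iommu_pt(d) )       ... tear down / flush IOMMU state ...
 *   if ( need_iommu_pt_sync(d) ) ... mirror this P2M update ...
 */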

  Paul

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 06/15] public / x86: introduce __HYPERCALL_iommu_op
  2018-08-07  8:10     ` Paul Durrant
@ 2018-08-07  8:25       ` Jan Beulich
  0 siblings, 0 replies; 57+ messages in thread
From: Jan Beulich @ 2018-08-07  8:25 UTC (permalink / raw)
  To: Julien Grall, Paul Durrant, Kevin Tian
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Tim Deegan,
	george.dunlap, Ian Jackson, xen-devel, Daniel de Graaf

>>> On 07.08.18 at 10:10, <Paul.Durrant@citrix.com> wrote:
>> From: Tian, Kevin [mailto:kevin.tian@intel.com]
>> Sent: 07 August 2018 04:00
>> 
>> > From: Paul Durrant
>> > Sent: Saturday, August 4, 2018 1:22 AM
>> >
>> > diff --git a/xen/arch/x86/iommu_op.c b/xen/arch/x86/iommu_op.c
>> > new file mode 100644
>> > index 0000000000..744c0fce27
>> > --- /dev/null
>> > +++ b/xen/arch/x86/iommu_op.c
>> > @@ -0,0 +1,184 @@
>> >
>> +/*********************************************************
>> ***
>> > ******************
>> > + * x86/iommu_op.c
>> > + *
>> > + * Paravirtualised IOMMU functionality
>> 
>> although only x86 is supported, logic in this file is vendor agnostic. Better
>> move to a common place now...
>> 
> 
> Personally I don't like to expand scope into architectures which I have no 
> facility to test.  IIUC ARM also very much relies on an identity GFN:BFN map 
> so this functionality is unlikely to get used by anything other than x86 for 
> the foreseeable future.

Well - moving to a common place doesn't mean it needs to be built
for ARM.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 02/15] iommu: introduce the concept of BFN...
  2018-08-07  7:59     ` Paul Durrant
@ 2018-08-07  8:26       ` Tian, Kevin
  0 siblings, 0 replies; 57+ messages in thread
From: Tian, Kevin @ 2018-08-07  8:26 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Julien Grall, Stefano Stabellini, Jan Beulich, Suravee Suthikulpanit

> From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> Sent: Tuesday, August 7, 2018 4:00 PM
> 
> > >   *
> > >   * gfn: Guest Frame Number
> > >   *   The values a guest puts in its own pagetables.  For an auto-
> translated
> > > @@ -26,6 +27,11 @@
> > >   *   A linear idea of a guest physical address space. For an auto-
> translated
> > >   *   guest, pfn == gfn while for a non-translated guest, pfn != gfn.
> > >   *
> > > + * bfn: Bus Frame Number (definitions in include/xen/iommu.h)
> > > + *   The linear frame numbers of IOMMU address space. All initiators
> for a
> > > + *   given domain share a single IOMMU address space and, by default,
> > > Xen will
> > > + *   ensure bfn == pfn.
> > > + *
> >
> > Above description is a bit confusing:
> >
> > - for 'domain', are you talking about VM or IOMMU domain? If the
> former
> > then each initiator could have its own IOMMU address space when a
> virtual
> > IOMMU (either pvIOMMU or emulated IOMMU) is exposed.
> >
> 
> I mean VM and, yes, while each initiator *could* have its own IOMMU
> address space, that is not what we do in (or have ever done) in Xen. I can
> replace the word 'domain' with the abbreviation 'VM' if you feel it makes
> things clearer.
> 

domain is OK here. just want to confirm.

> > - since you talk about 'by default', better also talk about non-default case,
> > i.e. when virtual IOMMU is exposed then bfn is a separate address space
> > managed by guest
> 
> That's putting the cart before the horse really. There is no PV-IOMMU at
> this stage in the series. I guess I could modify the comment in the later
> patch where I add the hypercall to enable PV-IOMMU.
> 

it's fine as long as the information is eventually complete.

Thanks
Kevin


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 09/15] vtd: add lookup_page method to iommu_ops
  2018-08-07  8:21     ` Paul Durrant
@ 2018-08-07  8:29       ` Jan Beulich
  2018-08-07  8:32         ` Tian, Kevin
  2018-08-07  8:31       ` Tian, Kevin
  1 sibling, 1 reply; 57+ messages in thread
From: Jan Beulich @ 2018-08-07  8:29 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Kevin Tian, george.dunlap

>>> On 07.08.18 at 10:21, <Paul.Durrant@citrix.com> wrote:
>> From: Tian, Kevin [mailto:kevin.tian@intel.com]
>> Sent: 07 August 2018 04:25
>> 
>> > From: Paul Durrant [mailto:paul.durrant@citrix.com]
>> > Sent: Saturday, August 4, 2018 1:22 AM
>> >
>> > --- a/xen/drivers/passthrough/vtd/iommu.c
>> > +++ b/xen/drivers/passthrough/vtd/iommu.c
>> > @@ -1830,6 +1830,39 @@ static int __must_check
>> > intel_iommu_unmap_page(struct domain *d,
>> >      return dma_pte_clear_one(d, bfn_to_baddr(bfn));
>> >  }
>> >
>> > +static int intel_iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t
>> > *mfn,
>> > +                                   unsigned int *flags)
>> 
>> Not looking at later patches yet... but in concept bfn address
>> space is per device instead of per domain.
> 
> > Not in this case. Xen has always maintained a single IOMMU address space
> > per virtual machine. That is what BFN refers to.

But is that a model we can maintain mid and long term? In particular
on ARM, where Julien has told me a single system could have multiple
_different_ IOMMUs, I could easily see the address spaces diverging.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 09/15] vtd: add lookup_page method to iommu_ops
  2018-08-07  8:21     ` Paul Durrant
  2018-08-07  8:29       ` Jan Beulich
@ 2018-08-07  8:31       ` Tian, Kevin
  2018-08-07  8:35         ` Paul Durrant
  1 sibling, 1 reply; 57+ messages in thread
From: Tian, Kevin @ 2018-08-07  8:31 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: George Dunlap, Jan Beulich

> From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> Sent: Tuesday, August 7, 2018 4:22 PM
> 
> > -----Original Message-----
> > From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > Sent: 07 August 2018 04:25
> > To: Paul Durrant <Paul.Durrant@citrix.com>; xen-
> devel@lists.xenproject.org
> > Cc: Jan Beulich <jbeulich@suse.com>; George Dunlap
> > <George.Dunlap@citrix.com>
> > Subject: RE: [PATCH v5 09/15] vtd: add lookup_page method to
> iommu_ops
> >
> > > From: Paul Durrant [mailto:paul.durrant@citrix.com]
> > > Sent: Saturday, August 4, 2018 1:22 AM
> > >
> > > This patch adds a new method to the VT-d IOMMU implementation to
> find
> > > the
> > > MFN currently mapped by the specified BFN along with a wrapper
> function
> > > in
> > > generic IOMMU code to call the implementation if it exists.
> > >
> > > This functionality will be used by a subsequent patch.
> > >
> > > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> > > Reviewed-by: Wei Liu <wei.liu2@citrix.com>
> > > ---
> > > Cc: Kevin Tian <kevin.tian@intel.com>
> > > Cc: Jan Beulich <jbeulich@suse.com>
> > > Cc: George Dunlap <george.dunlap@citrix.com>
> > >
> > > v3:
> > >  - Addressed comments from George.
> > >
> > > v2:
> > >  - Addressed some comments from Jan.
> > > ---
> > >  xen/drivers/passthrough/iommu.c     | 11 +++++++++++
> > >  xen/drivers/passthrough/vtd/iommu.c | 34
> > > ++++++++++++++++++++++++++++++++++
> > >  xen/drivers/passthrough/vtd/iommu.h |  3 +++
> > >  xen/include/xen/iommu.h             |  4 ++++
> > >  4 files changed, 52 insertions(+)
> > >
> > > diff --git a/xen/drivers/passthrough/iommu.c
> > > b/xen/drivers/passthrough/iommu.c
> > > index b10a37e5d7..9b7baca93f 100644
> > > --- a/xen/drivers/passthrough/iommu.c
> > > +++ b/xen/drivers/passthrough/iommu.c
> > > @@ -305,6 +305,17 @@ int iommu_unmap_page(struct domain *d,
> bfn_t
> > > bfn)
> > >      return rc;
> > >  }
> > >
> > > +int iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
> > > +                      unsigned int *flags)
> > > +{
> > > +    const struct domain_iommu *hd = dom_iommu(d);
> > > +
> > > +    if ( !iommu_enabled || !hd->platform_ops )
> > > +        return -EOPNOTSUPP;
> > > +
> > > +    return hd->platform_ops->lookup_page(d, bfn, mfn, flags);
> > > +}
> > > +
> > >  static void iommu_free_pagetables(unsigned long unused)
> > >  {
> > >      do {
> > > diff --git a/xen/drivers/passthrough/vtd/iommu.c
> > > b/xen/drivers/passthrough/vtd/iommu.c
> > > index 282e227414..8cd3b59aa0 100644
> > > --- a/xen/drivers/passthrough/vtd/iommu.c
> > > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > > @@ -1830,6 +1830,39 @@ static int __must_check
> > > intel_iommu_unmap_page(struct domain *d,
> > >      return dma_pte_clear_one(d, bfn_to_baddr(bfn));
> > >  }
> > >
> > > +static int intel_iommu_lookup_page(struct domain *d, bfn_t bfn,
> mfn_t
> > > *mfn,
> > > +                                   unsigned int *flags)
> >
> > Not looking at later patches yet... but in concept bfn address
> > space is per device instead of per domain.
> 
> Not in this case. Xen has always maintained a single IOMMU address per
> virtual machine. That is what BFN refers to.
> 
>   Paul
> 
> > In default situation
> > (w/o pvIOMMU exposed), all devices assigned to dom0 share
> > the same address space (bfn=pfn) which is currently linked
> > from domain structure. Then with pvIOMMU exposed, dom0
> > starts to manage individual pfn address space (called IOVA
> > address space within dom0) per assigned device. In that case
> > lookup should accept a bdf number and then find the right
> > page table...
> >
> 
> No. That is over-complicating things and would probably involve re-writing
> most of the IOMMU code in Xen AFAICT.
> 

it's not over-complicating. it's about correctness.

think about the role of the IOMMU. Now you expose it to dom0, and
dom0 wants to use it to isolate devices. What is the point of having
an IOMMU which supports only one global address space across all
devices, from dom0's p.o.v.? If that is the case, why would dom0 ever
want to change a bfn globally on all devices?

Thanks
Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 09/15] vtd: add lookup_page method to iommu_ops
  2018-08-07  8:29       ` Jan Beulich
@ 2018-08-07  8:32         ` Tian, Kevin
  2018-08-07  8:37           ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Tian, Kevin @ 2018-08-07  8:32 UTC (permalink / raw)
  To: Jan Beulich, Paul Durrant; +Cc: xen-devel, george.dunlap

> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Tuesday, August 7, 2018 4:30 PM
> 
> >>> On 07.08.18 at 10:21, <Paul.Durrant@citrix.com> wrote:
> >> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> >> Sent: 07 August 2018 04:25
> >>
> >> > From: Paul Durrant [mailto:paul.durrant@citrix.com]
> >> > Sent: Saturday, August 4, 2018 1:22 AM
> >> >
> >> > --- a/xen/drivers/passthrough/vtd/iommu.c
> >> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> >> > @@ -1830,6 +1830,39 @@ static int __must_check
> >> > intel_iommu_unmap_page(struct domain *d,
> >> >      return dma_pte_clear_one(d, bfn_to_baddr(bfn));
> >> >  }
> >> >
> >> > +static int intel_iommu_lookup_page(struct domain *d, bfn_t bfn,
> mfn_t
> >> > *mfn,
> >> > +                                   unsigned int *flags)
> >>
> >> Not looking at later patches yet... but in concept bfn address
> >> space is per device instead of per domain.
> >
> > Not in this case. Xen has always maintained a single IOMMU address per
> > virtual machine. That is what BFN refers to.
> 
> Nut is that a model we can maintain mid and long term? In particular
> on ARM, where Julien has told me a single system could have multiple
> _different_ IOMMUs, I could easily see the address spaces diverging.
> 

multiple IOMMUs is another thing.

what I questioned is that even one IOMMU needs to support multiple
address spaces. That is the point of an IOMMU...

Thanks
Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 12/15] x86: add iommu_op to enable modification of IOMMU mappings
  2018-08-07  4:08   ` Tian, Kevin
@ 2018-08-07  8:32     ` Paul Durrant
  2018-08-07  8:37       ` Tian, Kevin
  0 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2018-08-07  8:32 UTC (permalink / raw)
  To: Kevin Tian, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Jan Beulich, Ian Jackson

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: 07 August 2018 05:08
> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
> Cc: Stefano Stabellini <sstabellini@kernel.org>; Wei Liu
> <wei.liu2@citrix.com>; George Dunlap <George.Dunlap@citrix.com>;
> Andrew Cooper <Andrew.Cooper3@citrix.com>; Ian Jackson
> <Ian.Jackson@citrix.com>; Tim (Xen.org) <tim@xen.org>; Julien Grall
> <julien.grall@arm.com>; Jan Beulich <jbeulich@suse.com>
> Subject: RE: [Xen-devel] [PATCH v5 12/15] x86: add iommu_op to enable
> modification of IOMMU mappings
> 
> > From: Paul Durrant
> > Sent: Saturday, August 4, 2018 1:22 AM
> >
> > This patch adds an iommu_op which checks whether it is possible or
> > safe for a domain to modify its own IOMMU mappings and, if so, creates
> > a rangeset to track modifications.
> 
> Have to say that there might be a concept mismatch between us,
> so I will stop review here until we get aligned on the basic
> understanding.
> 
> What an IOMMU does is to provide DMA isolation between devices.
> Each device can be hooked with a different translation structure
> (representing a different bfn address space). Linux kernel uses this
> mechanism to harden kernel drivers (through dma APIs). Multiple
> devices can be also attached to the same address space, used by
> hypervisor when devices are assigned to the same VM.
> 

Indeed.

> Now with pvIOMMU exposed to dom0, , dom0 could use it to harden
> kernel drivers too. Then there will be multiple bfn address spaces:
> 
> - A default bfn address space created by Xen, where bfn = pfn
> - multiple per-bdf bfn address spaces created by Dom0, where
> bfn is completely irrelevant to pfn.
> 
> the default space should not be changed by Dom0. It is attached
> to devices which dom0 doesn't enable pviommu mapping.

No that's not the point here. I'm not trying to re-architect Xen's IOMMU handling. All the IOMMU code in Xen AFAICT is built around the assumption there is one set of page tables per-VM and all devices assigned to the VM get the same page tables. I suspect trying to change that will be a huge can of worms and I have no need to go there for my purposes.

> 
> per-bdf address spaces can be changed by Dom0, attached to
> devices which dom0 enables pviommu mapping. then pviommu ops
> should accept a bdf parameter. and internally Xen needs to maintain
> multiple page tables under dom0, and find a right page table according
> to specified bdf to complete the operation.
> 
> Now your series look assuming always just one bfn address space
> cross all assigned devices per domain... I'm not sure how it works.
> 

It does make that assumption because that assumption is baked into Xen's IOMMU support.

> Did I misunderstand anything?

Only perhaps the assumption that moving away from per-VM IOMMU pagetables is something I could do without making very invasive and lengthy changes to Xen's IOMMU code.

  Paul

> 
> >
> > NOTE: The actual map and unmap operations are introduced by
> > subsequent
> >       patches.
> >
> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> > ---
> > Cc: Jan Beulich <jbeulich@suse.com>
> > Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> > Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> > Cc: Julien Grall <julien.grall@arm.com>
> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > Cc: Stefano Stabellini <sstabellini@kernel.org>
> > Cc: Tim Deegan <tim@xen.org>
> > Cc: Wei Liu <wei.liu2@citrix.com>
> >
> > v4:
> >  - Set sync_iommu_pt to false instead of need_iommu.
> >
> > v2:
> >  - New in v2.
> > ---
> >  xen/arch/x86/iommu_op.c         | 42
> > +++++++++++++++++++++++++++++++++++++++++
> >  xen/drivers/passthrough/iommu.c |  2 +-
> >  xen/drivers/passthrough/pci.c   |  4 ++--
> >  xen/include/public/iommu_op.h   |  6 ++++++
> >  xen/include/xen/iommu.h         |  3 +++
> >  5 files changed, 54 insertions(+), 3 deletions(-)
> >
> > diff --git a/xen/arch/x86/iommu_op.c b/xen/arch/x86/iommu_op.c
> > index bcfcd49102..b29547bffd 100644
> > --- a/xen/arch/x86/iommu_op.c
> > +++ b/xen/arch/x86/iommu_op.c
> > @@ -78,6 +78,42 @@ static int iommu_op_query_reserved(struct
> > xen_iommu_op_query_reserved *op)
> >      return 0;
> >  }
> >
> > +static int iommu_op_enable_modification(void)
> > +{
> > +    struct domain *currd = current->domain;
> > +    struct domain_iommu *iommu = dom_iommu(currd);
> > +    const struct iommu_ops *ops = iommu->platform_ops;
> > +
> > +    /* Has modification already been enabled? */
> > +    if ( iommu->iommu_op_ranges )
> > +        return 0;
> > +
> > +    /*
> > +     * The IOMMU mappings cannot be modified if:
> > +     * - the IOMMU is not enabled or,
> > +     * - the current domain is dom0 and translation is disabled or,
> > +     * - HAP is enabled and the IOMMU shares the mappings.
> > +     */
> > +    if ( !iommu_enabled ||
> > +         (is_hardware_domain(currd) && iommu_passthrough) ||
> > +         iommu_use_hap_pt(currd) )
> > +        return -EACCES;
> > +
> > +    /*
> > +     * The IOMMU implementation must provide the lookup method if
> > +     * modification of the mappings is to be supported.
> > +     */
> > +    if ( !ops->lookup_page )
> > +        return -EOPNOTSUPP;
> > +
> > +    iommu->iommu_op_ranges = rangeset_new(currd, NULL, 0);
> > +    if ( !iommu->iommu_op_ranges )
> > +        return -ENOMEM;
> > +
> > +    currd->sync_iommu_pt = 0; /* Disable synchronization */
> > +    return 0;
> > +}
> > +
> >  static void iommu_op(xen_iommu_op_t *op)
> >  {
> >      switch ( op->op )
> > @@ -86,6 +122,10 @@ static void iommu_op(xen_iommu_op_t *op)
> >          op->status = iommu_op_query_reserved(&op->u.query_reserved);
> >          break;
> >
> > +    case XEN_IOMMUOP_enable_modification:
> > +        op->status = iommu_op_enable_modification();
> > +        break;
> > +
> >      default:
> >          op->status = -EOPNOTSUPP;
> >          break;
> > @@ -98,6 +138,7 @@ int do_one_iommu_op(xen_iommu_op_buf_t *buf)
> >      size_t offset;
> >      static const size_t op_size[] = {
> >          [XEN_IOMMUOP_query_reserved] = sizeof(struct
> > xen_iommu_op_query_reserved),
> > +        [XEN_IOMMUOP_enable_modification] = 0,
> >      };
> >      size_t size;
> >      int rc;
> > @@ -184,6 +225,7 @@ int
> > compat_one_iommu_op(compat_iommu_op_buf_t *buf)
> >      size_t offset;
> >      static const size_t op_size[] = {
> >          [XEN_IOMMUOP_query_reserved] = sizeof(struct
> > compat_iommu_op_query_reserved),
> > +        [XEN_IOMMUOP_enable_modification] = 0,
> >      };
> >      size_t size;
> >      xen_iommu_op_t nat;
> > diff --git a/xen/drivers/passthrough/iommu.c
> > b/xen/drivers/passthrough/iommu.c
> > index caf3d125ae..8f635a5cdb 100644
> > --- a/xen/drivers/passthrough/iommu.c
> > +++ b/xen/drivers/passthrough/iommu.c
> > @@ -26,7 +26,6 @@ static void iommu_dump_p2m_table(unsigned char
> > key);
> >
> >  unsigned int __read_mostly iommu_dev_iotlb_timeout = 1000;
> >  integer_param("iommu_dev_iotlb_timeout", iommu_dev_iotlb_timeout);
> > -
> >  /*
> >   * The 'iommu' parameter enables the IOMMU.  Optional comma
> > separated
> >   * value may contain:
> > @@ -265,6 +264,7 @@ void iommu_domain_destroy(struct domain *d)
> >      arch_iommu_domain_destroy(d);
> >
> >      rangeset_destroy(hd->reserved_ranges);
> > +    rangeset_destroy(hd->iommu_op_ranges);
> >  }
> >
> >  int iommu_map_page(struct domain *d, bfn_t bfn, mfn_t mfn,
> > diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
> > index 3d3ad484e7..d4033af41a 100644
> > --- a/xen/drivers/passthrough/pci.c
> > +++ b/xen/drivers/passthrough/pci.c
> > @@ -1460,7 +1460,7 @@ static int assign_device(struct domain *d, u16 seg,
> > u8 bus, u8 devfn, u32 flag)
> >      }
> >
> >   done:
> > -    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
> > +    if ( !has_arch_pdevs(d) && has_iommu_pt(d) && !hd-
> > >iommu_op_ranges )
> >          iommu_teardown(d);
> >      pcidevs_unlock();
> >
> > @@ -1510,7 +1510,7 @@ int deassign_device(struct domain *d, u16 seg,
> > u8 bus, u8 devfn)
> >
> >      pdev->fault.count = 0;
> >
> > -    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
> > +    if ( !has_arch_pdevs(d) && has_iommu_pt(d) && !hd-
> > >iommu_op_ranges )
> >          iommu_teardown(d);
> >
> >      return ret;
> > diff --git a/xen/include/public/iommu_op.h
> > b/xen/include/public/iommu_op.h
> > index ade404a877..9bf74bd007 100644
> > --- a/xen/include/public/iommu_op.h
> > +++ b/xen/include/public/iommu_op.h
> > @@ -61,6 +61,12 @@ struct xen_iommu_op_query_reserved {
> >      XEN_GUEST_HANDLE(xen_iommu_reserved_range_t) ranges;
> >  };
> >
> > +/*
> > + * XEN_IOMMUOP_enable_modification: Enable operations that modify
> > IOMMU
> > + *                                  mappings.
> > + */
> > +#define XEN_IOMMUOP_enable_modification 2
> > +
> >  struct xen_iommu_op {
> >      uint16_t op;    /* op type */
> >      uint16_t pad;
> > diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> > index 7c5d46df81..08b163cbcb 100644
> > --- a/xen/include/xen/iommu.h
> > +++ b/xen/include/xen/iommu.h
> > @@ -130,6 +130,9 @@ struct domain_iommu {
> >       * must not be modified after initialization.
> >       */
> >      struct rangeset *reserved_ranges;
> > +
> > +    /* Ranges under the control of iommu_op */
> > +    struct rangeset *iommu_op_ranges;
> >  };
> >
> >  #define dom_iommu(d)              (&(d)->iommu)
> > --
> > 2.11.0
> >
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xenproject.org
> > https://lists.xenproject.org/mailman/listinfo/xen-devel
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 09/15] vtd: add lookup_page method to iommu_ops
  2018-08-07  8:31       ` Tian, Kevin
@ 2018-08-07  8:35         ` Paul Durrant
  2018-08-07  8:47           ` Tian, Kevin
  0 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2018-08-07  8:35 UTC (permalink / raw)
  To: Kevin Tian, xen-devel; +Cc: George Dunlap, Jan Beulich

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: 07 August 2018 09:31
> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
> Cc: Jan Beulich <jbeulich@suse.com>; George Dunlap
> <George.Dunlap@citrix.com>
> Subject: RE: [PATCH v5 09/15] vtd: add lookup_page method to iommu_ops
> 
> > From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> > Sent: Tuesday, August 7, 2018 4:22 PM
> >
> > > -----Original Message-----
> > > From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > > Sent: 07 August 2018 04:25
> > > To: Paul Durrant <Paul.Durrant@citrix.com>; xen-
> > devel@lists.xenproject.org
> > > Cc: Jan Beulich <jbeulich@suse.com>; George Dunlap
> > > <George.Dunlap@citrix.com>
> > > Subject: RE: [PATCH v5 09/15] vtd: add lookup_page method to
> > iommu_ops
> > >
> > > > From: Paul Durrant [mailto:paul.durrant@citrix.com]
> > > > Sent: Saturday, August 4, 2018 1:22 AM
> > > >
> > > > This patch adds a new method to the VT-d IOMMU implementation to
> > find
> > > > the
> > > > MFN currently mapped by the specified BFN along with a wrapper
> > function
> > > > in
> > > > generic IOMMU code to call the implementation if it exists.
> > > >
> > > > This functionality will be used by a subsequent patch.
> > > >
> > > > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> > > > Reviewed-by: Wei Liu <wei.liu2@citrix.com>
> > > > ---
> > > > Cc: Kevin Tian <kevin.tian@intel.com>
> > > > Cc: Jan Beulich <jbeulich@suse.com>
> > > > Cc: George Dunlap <george.dunlap@citrix.com>
> > > >
> > > > v3:
> > > >  - Addressed comments from George.
> > > >
> > > > v2:
> > > >  - Addressed some comments from Jan.
> > > > ---
> > > >  xen/drivers/passthrough/iommu.c     | 11 +++++++++++
> > > >  xen/drivers/passthrough/vtd/iommu.c | 34
> > > > ++++++++++++++++++++++++++++++++++
> > > >  xen/drivers/passthrough/vtd/iommu.h |  3 +++
> > > >  xen/include/xen/iommu.h             |  4 ++++
> > > >  4 files changed, 52 insertions(+)
> > > >
> > > > diff --git a/xen/drivers/passthrough/iommu.c
> > > > b/xen/drivers/passthrough/iommu.c
> > > > index b10a37e5d7..9b7baca93f 100644
> > > > --- a/xen/drivers/passthrough/iommu.c
> > > > +++ b/xen/drivers/passthrough/iommu.c
> > > > @@ -305,6 +305,17 @@ int iommu_unmap_page(struct domain *d,
> > bfn_t
> > > > bfn)
> > > >      return rc;
> > > >  }
> > > >
> > > > +int iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
> > > > +                      unsigned int *flags)
> > > > +{
> > > > +    const struct domain_iommu *hd = dom_iommu(d);
> > > > +
> > > > +    if ( !iommu_enabled || !hd->platform_ops )
> > > > +        return -EOPNOTSUPP;
> > > > +
> > > > +    return hd->platform_ops->lookup_page(d, bfn, mfn, flags);
> > > > +}
> > > > +
> > > >  static void iommu_free_pagetables(unsigned long unused)
> > > >  {
> > > >      do {
> > > > diff --git a/xen/drivers/passthrough/vtd/iommu.c
> > > > b/xen/drivers/passthrough/vtd/iommu.c
> > > > index 282e227414..8cd3b59aa0 100644
> > > > --- a/xen/drivers/passthrough/vtd/iommu.c
> > > > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > > > @@ -1830,6 +1830,39 @@ static int __must_check
> > > > intel_iommu_unmap_page(struct domain *d,
> > > >      return dma_pte_clear_one(d, bfn_to_baddr(bfn));
> > > >  }
> > > >
> > > > +static int intel_iommu_lookup_page(struct domain *d, bfn_t bfn,
> > mfn_t
> > > > *mfn,
> > > > +                                   unsigned int *flags)
> > >
> > > Not looking at later patches yet... but in concept bfn address
> > > space is per device instead of per domain.
> >
> > Not in this case. Xen has always maintained a single IOMMU address per
> > virtual machine. That is what BFN refers to.
> >
> >   Paul
> >
> > > In default situation
> > > (w/o pvIOMMU exposed), all devices assigned to dom0 share
> > > the same address space (bfn=pfn) which is currently linked
> > > from domain structure. Then with pvIOMMU exposed, dom0
> > > starts to manage individual pfn address space (called IOVA
> > > address space within dom0) per assigned device. In that case
> > > lookup should accept a bdf number and then find the right
> > > page table...
> > >
> >
> > No. That is over-complicating things and would probably involve re-writing
> > most of the IOMMU code in Xen AFAICT.
> >
> 
> it's not over-complicating. it's about correctness.
> 
> think about the role of IOMMU. Now you expose it to dom0. Then
> dom0 wants to use it to isolate devices. What is the matter having
> an IOMMU which supports only one global address space cross all
> devices, from dom0's p.o.v? If that is the case, why does dom0 ever
> want to change bfn globally on all devices?
> 

Because even a PV dom0 would be better off with a BFN:GFN map rather than a BFN:MFN map, since then the xen_swiotlb code can be avoided and lots of things that are contiguous in GFN space become contiguous in BFN space and hence become DMA-able without bouncing.

  Paul

> Thanks
> Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 09/15] vtd: add lookup_page method to iommu_ops
  2018-08-07  8:32         ` Tian, Kevin
@ 2018-08-07  8:37           ` Paul Durrant
  2018-08-07  8:48             ` Tian, Kevin
  0 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2018-08-07  8:37 UTC (permalink / raw)
  To: Kevin Tian, Jan Beulich; +Cc: xen-devel, George Dunlap

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: 07 August 2018 09:33
> To: Jan Beulich <JBeulich@suse.com>; Paul Durrant
> <Paul.Durrant@citrix.com>
> Cc: George Dunlap <George.Dunlap@citrix.com>; xen-devel <xen-
> devel@lists.xenproject.org>
> Subject: RE: [PATCH v5 09/15] vtd: add lookup_page method to iommu_ops
> 
> > From: Jan Beulich [mailto:JBeulich@suse.com]
> > Sent: Tuesday, August 7, 2018 4:30 PM
> >
> > >>> On 07.08.18 at 10:21, <Paul.Durrant@citrix.com> wrote:
> > >> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > >> Sent: 07 August 2018 04:25
> > >>
> > >> > From: Paul Durrant [mailto:paul.durrant@citrix.com]
> > >> > Sent: Saturday, August 4, 2018 1:22 AM
> > >> >
> > >> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > >> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > >> > @@ -1830,6 +1830,39 @@ static int __must_check
> > >> > intel_iommu_unmap_page(struct domain *d,
> > >> >      return dma_pte_clear_one(d, bfn_to_baddr(bfn));
> > >> >  }
> > >> >
> > >> > +static int intel_iommu_lookup_page(struct domain *d, bfn_t bfn,
> > mfn_t
> > >> > *mfn,
> > >> > +                                   unsigned int *flags)
> > >>
> > >> Not looking at later patches yet... but in concept bfn address
> > >> space is per device instead of per domain.
> > >
> > > Not in this case. Xen has always maintained a single IOMMU address per
> > > virtual machine. That is what BFN refers to.
> >
> > Nut is that a model we can maintain mid and long term? In particular
> > on ARM, where Julien has told me a single system could have multiple
> > _different_ IOMMUs, I could easily see the address spaces diverging.
> >
> 
> multiple IOMMUs is another thing.
> 
> what I questioned is that even one IOMMU needs to support mulitiple
> address spaces. That is the point of an IOMMU...

Indeed and that is why we use it to enforce domain separation. I see no need, as yet, to enforce separation within a domain. That need may arise later and the code can be modified at that point.

  Paul

> 
> Thanks
> Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 12/15] x86: add iommu_op to enable modification of IOMMU mappings
  2018-08-07  8:32     ` Paul Durrant
@ 2018-08-07  8:37       ` Tian, Kevin
  2018-08-07  8:44         ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Tian, Kevin @ 2018-08-07  8:37 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Jan Beulich, Ian Jackson

> From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> Sent: Tuesday, August 7, 2018 4:33 PM
> 
> >
> > > From: Paul Durrant
> > > Sent: Saturday, August 4, 2018 1:22 AM
> > >
> > > This patch adds an iommu_op which checks whether it is possible or
> > > safe for a domain to modify its own IOMMU mappings and, if so,
> creates
> > > a rangeset to track modifications.
> >
> > Have to say that there might be a concept mismatch between us,
> > so I will stop review here until we get aligned on the basic
> > understanding.
> >
> > What an IOMMU does is to provide DMA isolation between devices.
> > Each device can be hooked with a different translation structure
> > (representing a different bfn address space). Linux kernel uses this
> > mechanism to harden kernel drivers (through dma APIs). Multiple
> > devices can be also attached to the same address space, used by
> > hypervisor when devices are assigned to the same VM.
> >
> 
> Indeed.
> 
> > Now with pvIOMMU exposed to dom0, , dom0 could use it to harden
> > kernel drivers too. Then there will be multiple bfn address spaces:
> >
> > - A default bfn address space created by Xen, where bfn = pfn
> > - multiple per-bdf bfn address spaces created by Dom0, where
> > bfn is completely irrelevant to pfn.
> >
> > the default space should not be changed by Dom0. It is attached
> > to devices which dom0 doesn't enable pviommu mapping.
> 
> No that's not the point here. I'm not trying to re-architect Xen's IOMMU
> handling. All the IOMMU code in Xen AFAICT is built around the assumption
> there is one set of page tables per-VM and all devices assigned to the VM
> get the same page tables. I suspect trying to change that will be a huge can
> of worms and I have no need to go there for my purposes.

don't just think from the Xen side. think about how Dom0 sees
this IOMMU.

ideally the pviommu driver is a new vendor driver attached to the iommu
core within dom0. it needs to provide iommu dma ops to support
dma_alloc/map operations from different device drivers. the iommu
core maintains a separate iova space for each device, so device
drivers can be isolated from each other.

Now dom0 only gets one global space, so why does dom0 need
to enable pviommu at all?

> 
> >
> > per-bdf address spaces can be changed by Dom0, attached to
> > devices which dom0 enables pviommu mapping. then pviommu ops
> > should accept a bdf parameter. and internally Xen needs to maintain
> > multiple page tables under dom0, and find a right page table according
> > to specified bdf to complete the operation.
> >
> > Now your series look assuming always just one bfn address space
> > cross all assigned devices per domain... I'm not sure how it works.
> >
> 
> It does make that assumption because that assumption is baked into Xen's
> IOMMU support.
> 
> > Did I misunderstand anything?
> 
> Only perhaps that moving away from per-VM IOMMU pagetables will be
> something that is something I could do without making very invasive and
> lengthy changes to Xen's IOMMU code.
> 

it's a must imo.

Thanks
Kevin
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 12/15] x86: add iommu_op to enable modification of IOMMU mappings
  2018-08-07  8:37       ` Tian, Kevin
@ 2018-08-07  8:44         ` Paul Durrant
  2018-08-07  9:01           ` Tian, Kevin
  0 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2018-08-07  8:44 UTC (permalink / raw)
  To: Kevin Tian, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Jan Beulich, Ian Jackson

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: 07 August 2018 09:38
> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
> Cc: Stefano Stabellini <sstabellini@kernel.org>; Wei Liu
> <wei.liu2@citrix.com>; George Dunlap <George.Dunlap@citrix.com>;
> Andrew Cooper <Andrew.Cooper3@citrix.com>; Ian Jackson
> <Ian.Jackson@citrix.com>; Tim (Xen.org) <tim@xen.org>; Julien Grall
> <julien.grall@arm.com>; Jan Beulich <jbeulich@suse.com>
> Subject: RE: [Xen-devel] [PATCH v5 12/15] x86: add iommu_op to enable
> modification of IOMMU mappings
> 
> > From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> > Sent: Tuesday, August 7, 2018 4:33 PM
> >
> > >
> > > > From: Paul Durrant
> > > > Sent: Saturday, August 4, 2018 1:22 AM
> > > >
> > > > This patch adds an iommu_op which checks whether it is possible or
> > > > safe for a domain to modify its own IOMMU mappings and, if so,
> > creates
> > > > a rangeset to track modifications.
> > >
> > > Have to say that there might be a concept mismatch between us,
> > > so I will stop review here until we get aligned on the basic
> > > understanding.
> > >
> > > What an IOMMU does is to provide DMA isolation between devices.
> > > Each device can be hooked with a different translation structure
> > > (representing a different bfn address space). Linux kernel uses this
> > > mechanism to harden kernel drivers (through dma APIs). Multiple
> > > devices can be also attached to the same address space, used by
> > > hypervisor when devices are assigned to the same VM.
> > >
> >
> > Indeed.
> >
> > > Now with pvIOMMU exposed to dom0, , dom0 could use it to harden
> > > kernel drivers too. Then there will be multiple bfn address spaces:
> > >
> > > - A default bfn address space created by Xen, where bfn = pfn
> > > - multiple per-bdf bfn address spaces created by Dom0, where
> > > bfn is completely irrelevant to pfn.
> > >
> > > the default space should not be changed by Dom0. It is attached
> > > to devices which dom0 doesn't enable pviommu mapping.
> >
> > No that's not the point here. I'm not trying to re-architect Xen's IOMMU
> > handling. All the IOMMU code in Xen AFAICT is built around the assumption
> > there is one set of page tables per-VM and all devices assigned to the VM
> > get the same page tables. I suspect trying to change that will be a huge can
> > of worms and I have no need to go there for my purposes.
> 
> don't just think from Xen side. think about what Dom0 feels about
> this IOMMU.
> 
> ideally pviommu driver is a new vendor driver attached to iommu
> core within dom0. it needs to provide iommu dma ops to support
> dma_alloc/map operations from different device drivers. iommu
> core maintains a separate iova space for each device, so device
> drivers can be isolated from each other.

But there is nothing that means that the IOVA space cannot be global, and that is good enough for a PV dom0.

> 
> Now dom0 got only one global space. then why does dom0 need
> to enable pviommu at all?

As I explained in another reply, it is primarily to allow a PV dom0 to have a BFN:GFN map. Since a PV domain maintains its own P2M, it is the domain that maintains the mapping. That is all I need to do.

> 
> >
> > >
> > > per-bdf address spaces can be changed by Dom0, attached to
> > > devices which dom0 enables pviommu mapping. then pviommu ops
> > > should accept a bdf parameter. and internally Xen needs to maintain
> > > multiple page tables under dom0, and find a right page table according
> > > to specified bdf to complete the operation.
> > >
> > > Now your series look assuming always just one bfn address space
> > > cross all assigned devices per domain... I'm not sure how it works.
> > >
> >
> > It does make that assumption because that assumption is baked into Xen's
> > IOMMU support.
> >
> > > Did I misunderstand anything?
> >
> > Only perhaps that moving away from per-VM IOMMU pagetables will be
> > something that is something I could do without making very invasive and
> > lengthy changes to Xen's IOMMU code.
> >
> 
> it's a must imo.

I may as well give up then. That's a mammoth task akin to e.g. moving to per-vcpu rather than per-domain P2Ms. I don't have a spare year to do the work.

  Paul

> 
> Thanks
> Kevin
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 09/15] vtd: add lookup_page method to iommu_ops
  2018-08-07  8:35         ` Paul Durrant
@ 2018-08-07  8:47           ` Tian, Kevin
  0 siblings, 0 replies; 57+ messages in thread
From: Tian, Kevin @ 2018-08-07  8:47 UTC (permalink / raw)
  To: Paul Durrant, xen-devel; +Cc: George Dunlap, Jan Beulich

> From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> Sent: Tuesday, August 7, 2018 4:36 PM
> 
> > > > > +static int intel_iommu_lookup_page(struct domain *d, bfn_t bfn,
> > > mfn_t
> > > > > *mfn,
> > > > > +                                   unsigned int *flags)
> > > >
> > > > Not looking at later patches yet... but in concept bfn address
> > > > space is per device instead of per domain.
> > >
> > > Not in this case. Xen has always maintained a single IOMMU address
> per
> > > virtual machine. That is what BFN refers to.
> > >
> > >   Paul
> > >
> > > > In default situation
> > > > (w/o pvIOMMU exposed), all devices assigned to dom0 share
> > > > the same address space (bfn=pfn) which is currently linked
> > > > from domain structure. Then with pvIOMMU exposed, dom0
> > > > starts to manage individual pfn address space (called IOVA
> > > > address space within dom0) per assigned device. In that case
> > > > lookup should accept a bdf number and then find the right
> > > > page table...
> > > >
> > >
> > > No. That is over-complicating things and would probably involve re-
> writing
> > > most of the IOMMU code in Xen AFAICT.
> > >
> >
> > it's not over-complicating. it's about correctness.
> >
> > think about the role of IOMMU. Now you expose it to dom0. Then
> > dom0 wants to use it to isolate devices. What is the matter having
> > an IOMMU which supports only one global address space cross all
> > devices, from dom0's p.o.v? If that is the case, why does dom0 ever
> > want to change bfn globally on all devices?
> >
> 
> Because even a PV dom0 would be beter off with a BFN:GFN map rather
> than a BFN:MFN map, since then the xen_swiotlb code and be avoided and
> lots of things that are contiguous in GFN space become contiguous in BFN
> space and hence become DMA-able without bouncing.
> 

so this pvIOMMU really loses most of the functionality which a physical IOMMU
provides... maybe don't call it pvIOMMU; it's more of a pvDMA engine.

Thanks
Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 09/15] vtd: add lookup_page method to iommu_ops
  2018-08-07  8:37           ` Paul Durrant
@ 2018-08-07  8:48             ` Tian, Kevin
  2018-08-07  8:56               ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Tian, Kevin @ 2018-08-07  8:48 UTC (permalink / raw)
  To: Paul Durrant, Jan Beulich; +Cc: xen-devel, George Dunlap

> From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> Sent: Tuesday, August 7, 2018 4:37 PM
> 
> > -----Original Message-----
> > From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > Sent: 07 August 2018 09:33
> > To: Jan Beulich <JBeulich@suse.com>; Paul Durrant
> > <Paul.Durrant@citrix.com>
> > Cc: George Dunlap <George.Dunlap@citrix.com>; xen-devel <xen-
> > devel@lists.xenproject.org>
> > Subject: RE: [PATCH v5 09/15] vtd: add lookup_page method to
> iommu_ops
> >
> > > From: Jan Beulich [mailto:JBeulich@suse.com]
> > > Sent: Tuesday, August 7, 2018 4:30 PM
> > >
> > > >>> On 07.08.18 at 10:21, <Paul.Durrant@citrix.com> wrote:
> > > >> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > > >> Sent: 07 August 2018 04:25
> > > >>
> > > >> > From: Paul Durrant [mailto:paul.durrant@citrix.com]
> > > >> > Sent: Saturday, August 4, 2018 1:22 AM
> > > >> >
> > > >> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > > >> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > > >> > @@ -1830,6 +1830,39 @@ static int __must_check
> > > >> > intel_iommu_unmap_page(struct domain *d,
> > > >> >      return dma_pte_clear_one(d, bfn_to_baddr(bfn));
> > > >> >  }
> > > >> >
> > > >> > +static int intel_iommu_lookup_page(struct domain *d, bfn_t bfn,
> > > mfn_t
> > > >> > *mfn,
> > > >> > +                                   unsigned int *flags)
> > > >>
> > > >> Not looking at later patches yet... but in concept bfn address
> > > >> space is per device instead of per domain.
> > > >
> > > > Not in this case. Xen has always maintained a single IOMMU address
> per
> > > > virtual machine. That is what BFN refers to.
> > >
> > > Nut is that a model we can maintain mid and long term? In particular
> > > on ARM, where Julien has told me a single system could have multiple
> > > _different_ IOMMUs, I could easily see the address spaces diverging.
> > >
> >
> > multiple IOMMUs is another thing.
> >
> > what I questioned is that even one IOMMU needs to support mulitiple
> > address spaces. That is the point of an IOMMU...
> 
> Indeed and that is why we use it to enforce domain separation. I see no
> need, as yet, to enforce separation within a domain. That need may arise
> later and the code can be modified at that point.
> 

but then you'd need a completely different set of APIs at that time...

Thanks
Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 09/15] vtd: add lookup_page method to iommu_ops
  2018-08-07  8:48             ` Tian, Kevin
@ 2018-08-07  8:56               ` Paul Durrant
  2018-08-07  9:03                 ` Tian, Kevin
  0 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2018-08-07  8:56 UTC (permalink / raw)
  To: Kevin Tian, Jan Beulich; +Cc: xen-devel, George Dunlap

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: 07 August 2018 09:48
> To: Paul Durrant <Paul.Durrant@citrix.com>; Jan Beulich
> <JBeulich@suse.com>
> Cc: George Dunlap <George.Dunlap@citrix.com>; xen-devel <xen-
> devel@lists.xenproject.org>
> Subject: RE: [PATCH v5 09/15] vtd: add lookup_page method to iommu_ops
> 
> > From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> > Sent: Tuesday, August 7, 2018 4:37 PM
> >
> > > -----Original Message-----
> > > From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > > Sent: 07 August 2018 09:33
> > > To: Jan Beulich <JBeulich@suse.com>; Paul Durrant
> > > <Paul.Durrant@citrix.com>
> > > Cc: George Dunlap <George.Dunlap@citrix.com>; xen-devel <xen-
> > > devel@lists.xenproject.org>
> > > Subject: RE: [PATCH v5 09/15] vtd: add lookup_page method to
> > iommu_ops
> > >
> > > > From: Jan Beulich [mailto:JBeulich@suse.com]
> > > > Sent: Tuesday, August 7, 2018 4:30 PM
> > > >
> > > > >>> On 07.08.18 at 10:21, <Paul.Durrant@citrix.com> wrote:
> > > > >> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > > > >> Sent: 07 August 2018 04:25
> > > > >>
> > > > >> > From: Paul Durrant [mailto:paul.durrant@citrix.com]
> > > > >> > Sent: Saturday, August 4, 2018 1:22 AM
> > > > >> >
> > > > >> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > > > >> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > > > >> > @@ -1830,6 +1830,39 @@ static int __must_check
> > > > >> > intel_iommu_unmap_page(struct domain *d,
> > > > >> >      return dma_pte_clear_one(d, bfn_to_baddr(bfn));
> > > > >> >  }
> > > > >> >
> > > > >> > +static int intel_iommu_lookup_page(struct domain *d, bfn_t bfn,
> > > > mfn_t
> > > > >> > *mfn,
> > > > >> > +                                   unsigned int *flags)
> > > > >>
> > > > >> Not looking at later patches yet... but in concept bfn address
> > > > >> space is per device instead of per domain.
> > > > >
> > > > > Not in this case. Xen has always maintained a single IOMMU address
> > per
> > > > > virtual machine. That is what BFN refers to.
> > > >
> > > > Nut is that a model we can maintain mid and long term? In particular
> > > > on ARM, where Julien has told me a single system could have multiple
> > > > _different_ IOMMUs, I could easily see the address spaces diverging.
> > > >
> > >
> > > multiple IOMMUs is another thing.
> > >
> > > what I questioned is that even one IOMMU needs to support mulitiple
> > > address spaces. That is the point of an IOMMU...
> >
> > Indeed and that is why we use it to enforce domain separation. I see no
> > need, as yet, to enforce separation within a domain. That need may arise
> > later and the code can be modified at that point.
> >
> 
> but then you need completely different set of APIs at that time...

Ok. Would you be happy if I add the option to supply an SBDF in the map and unmap hypercalls but ignore the value for now? Or even have a 'global' flag and return -EOPNOTSUPP if it is not specified? That would avoid the need to add new hypercalls for per-device mapping.
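
Purely as a sketch of that idea (the structure layout, field names and the 'map all devices' flag below are assumptions for illustration, not something present in v5):

struct xen_iommu_op_map {
    uint32_t sbdf;   /* initiator segment/bus/devfn; ignored while Xen keeps
                      * a single per-domain IOMMU address space */
    uint32_t flags;
#define _XEN_IOMMUOP_map_all 0
#define XEN_IOMMUOP_map_all  (1u << _XEN_IOMMUOP_map_all)
                     /* must be set for now; otherwise the op fails with
                      * -EOPNOTSUPP until per-device mappings are supported */
    uint64_t bfn;
    uint64_t gfn;
};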

  Paul

> 
> Thanks
> Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 12/15] x86: add iommu_op to enable modification of IOMMU mappings
  2018-08-07  8:44         ` Paul Durrant
@ 2018-08-07  9:01           ` Tian, Kevin
  2018-08-07  9:12             ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Tian, Kevin @ 2018-08-07  9:01 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Jan Beulich, Ian Jackson

> From: Paul Durrant
> Sent: Tuesday, August 7, 2018 4:44 PM
> 
> > -----Original Message-----
> > From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > Sent: 07 August 2018 09:38
> > To: Paul Durrant <Paul.Durrant@citrix.com>; xen-
> devel@lists.xenproject.org
> > Cc: Stefano Stabellini <sstabellini@kernel.org>; Wei Liu
> > <wei.liu2@citrix.com>; George Dunlap <George.Dunlap@citrix.com>;
> > Andrew Cooper <Andrew.Cooper3@citrix.com>; Ian Jackson
> > <Ian.Jackson@citrix.com>; Tim (Xen.org) <tim@xen.org>; Julien Grall
> > <julien.grall@arm.com>; Jan Beulich <jbeulich@suse.com>
> > Subject: RE: [Xen-devel] [PATCH v5 12/15] x86: add iommu_op to enable
> > modification of IOMMU mappings
> >
> > > From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> > > Sent: Tuesday, August 7, 2018 4:33 PM
> > >
> > > >
> > > > > From: Paul Durrant
> > > > > Sent: Saturday, August 4, 2018 1:22 AM
> > > > >
> > > > > This patch adds an iommu_op which checks whether it is possible or
> > > > > safe for a domain to modify its own IOMMU mappings and, if so, creates
> > > > > a rangeset to track modifications.
> > > >
> > > > Have to say that there might be a concept mismatch between us,
> > > > so I will stop review here until we get aligned on the basic
> > > > understanding.
> > > >
> > > > What an IOMMU does is to provide DMA isolation between devices.
> > > > Each device can be hooked with a different translation structure
> > > > (representing a different bfn address space). Linux kernel uses this
> > > > mechanism to harden kernel drivers (through dma APIs). Multiple
> > > > devices can be also attached to the same address space, used by
> > > > hypervisor when devices are assigned to the same VM.
> > > >
> > >
> > > Indeed.
> > >
> > > > Now with pvIOMMU exposed to dom0, dom0 could use it to harden
> > > > kernel drivers too. Then there will be multiple bfn address spaces:
> > > >
> > > > - A default bfn address space created by Xen, where bfn = pfn
> > > > - multiple per-bdf bfn address spaces created by Dom0, where
> > > > bfn is completely irrelevant to pfn.
> > > >
> > > > the default space should not be changed by Dom0. It is attached
> > > > to devices for which dom0 doesn't enable pviommu mapping.
> > >
> > > No that's not the point here. I'm not trying to re-architect Xen's IOMMU
> > > handling. All the IOMMU code in Xen AFAICT is built around the assumption
> > > there is one set of page tables per-VM and all devices assigned to the VM
> > > get the same page tables. I suspect trying to change that will be a huge can
> > > of worms and I have no need to go there for my purposes.
> >
> > don't just think from Xen side. think about what Dom0 feels about
> > this IOMMU.
> >
> > ideally pviommu driver is a new vendor driver attached to iommu
> > core within dom0. it needs to provide iommu dma ops to support
> > dma_alloc/map operations from different device drivers. iommu
> > core maintains a separate iova space for each device, so device
> > drivers can be isolated from each other.
> 
> But there is nothing that means that the IOVA space cannot be global, and
> that is good enough for a PV dom0.

You are right! Although my concept is all built around physical IOMMU
capability, I did a check that Linux doesn't state that IOVA cannot be
global. It's purely up to the vendor IOMMU driver to decide.

So here the current version of pvIOMMU only provides a global address
space, though that is unlike any existing IOMMU. Maybe we should
explicitly call out this fact in some capability field for future extension.

> 
> >
> > Now dom0 got only one global space. then why does dom0 need
> > to enable pviommu at all?
> 
> As I explained in another reply, it is primarily to allow a PV dom0 to have a
> BFN:GFN map. Since a PV domain maintains its own P2M then it is the
> domain that maintains the mapping. That is all I need to do.

yes, I got this one.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 09/15] vtd: add lookup_page method to iommu_ops
  2018-08-07  8:56               ` Paul Durrant
@ 2018-08-07  9:03                 ` Tian, Kevin
  2018-08-07  9:07                   ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Tian, Kevin @ 2018-08-07  9:03 UTC (permalink / raw)
  To: Paul Durrant, Jan Beulich; +Cc: xen-devel, George Dunlap

> From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> Sent: Tuesday, August 7, 2018 4:56 PM
> 
> > -----Original Message-----
> > From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > Sent: 07 August 2018 09:48
> > To: Paul Durrant <Paul.Durrant@citrix.com>; Jan Beulich
> > <JBeulich@suse.com>
> > Cc: George Dunlap <George.Dunlap@citrix.com>; xen-devel <xen-
> > devel@lists.xenproject.org>
> > Subject: RE: [PATCH v5 09/15] vtd: add lookup_page method to
> iommu_ops
> >
> > > From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> > > Sent: Tuesday, August 7, 2018 4:37 PM
> > >
> > > > -----Original Message-----
> > > > From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > > > Sent: 07 August 2018 09:33
> > > > To: Jan Beulich <JBeulich@suse.com>; Paul Durrant
> > > > <Paul.Durrant@citrix.com>
> > > > Cc: George Dunlap <George.Dunlap@citrix.com>; xen-devel <xen-
> > > > devel@lists.xenproject.org>
> > > > Subject: RE: [PATCH v5 09/15] vtd: add lookup_page method to
> > > iommu_ops
> > > >
> > > > > From: Jan Beulich [mailto:JBeulich@suse.com]
> > > > > Sent: Tuesday, August 7, 2018 4:30 PM
> > > > >
> > > > > >>> On 07.08.18 at 10:21, <Paul.Durrant@citrix.com> wrote:
> > > > > >> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > > > > >> Sent: 07 August 2018 04:25
> > > > > >>
> > > > > >> > From: Paul Durrant [mailto:paul.durrant@citrix.com]
> > > > > >> > Sent: Saturday, August 4, 2018 1:22 AM
> > > > > >> >
> > > > > >> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > > > > >> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > > > > >> > @@ -1830,6 +1830,39 @@ static int __must_check
> > > > > >> > intel_iommu_unmap_page(struct domain *d,
> > > > > >> >      return dma_pte_clear_one(d, bfn_to_baddr(bfn));
> > > > > >> >  }
> > > > > >> >
> > > > > >> > +static int intel_iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
> > > > > >> > +                                   unsigned int *flags)
> > > > > >>
> > > > > >> Not looking at later patches yet... but in concept bfn address
> > > > > >> space is per device instead of per domain.
> > > > > >
> > > > > > Not in this case. Xen has always maintained a single IOMMU address
> > > > > > space per virtual machine. That is what BFN refers to.
> > > > >
> > > > > But is that a model we can maintain mid and long term? In particular
> > > > > on ARM, where Julien has told me a single system could have multiple
> > > > > _different_ IOMMUs, I could easily see the address spaces diverging.
> > > > >
> > > >
> > > > multiple IOMMUs is another thing.
> > > >
> > > > what I questioned is that even one IOMMU needs to support multiple
> > > > address spaces. That is the point of an IOMMU...
> > >
> > > Indeed and that is why we use it to enforce domain separation. I see no
> > > need, as yet, to enforce separation within a domain. That need may arise
> > > later and the code can be modified at that point.
> > >
> >
> > but then you need a completely different set of APIs at that time...
> 
> Ok. Would you be happy if I add the option to supply an SBDF in the map
> and unmap hypercalls but ignore the value for now? Or even have a 'global'
> flag but return -EOPNOTSUPP if it is not specified. That would avoid the
> need to add new hypercalls for per-device mapping.
> 

That would be better! Maybe it would also be worth a capability query
interface to tell whether global or per-bdf mapping is supported?

Thanks
Kevin


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 09/15] vtd: add lookup_page method to iommu_ops
  2018-08-07  9:03                 ` Tian, Kevin
@ 2018-08-07  9:07                   ` Paul Durrant
  0 siblings, 0 replies; 57+ messages in thread
From: Paul Durrant @ 2018-08-07  9:07 UTC (permalink / raw)
  To: Kevin Tian, Jan Beulich; +Cc: xen-devel, George Dunlap

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: 07 August 2018 10:03
> To: Paul Durrant <Paul.Durrant@citrix.com>; Jan Beulich
> <JBeulich@suse.com>
> Cc: George Dunlap <George.Dunlap@citrix.com>; xen-devel <xen-
> devel@lists.xenproject.org>
> Subject: RE: [PATCH v5 09/15] vtd: add lookup_page method to iommu_ops
> 
> > From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> > Sent: Tuesday, August 7, 2018 4:56 PM
> >
> > > -----Original Message-----
> > > From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > > Sent: 07 August 2018 09:48
> > > To: Paul Durrant <Paul.Durrant@citrix.com>; Jan Beulich
> > > <JBeulich@suse.com>
> > > Cc: George Dunlap <George.Dunlap@citrix.com>; xen-devel <xen-
> > > devel@lists.xenproject.org>
> > > Subject: RE: [PATCH v5 09/15] vtd: add lookup_page method to
> > iommu_ops
> > >
> > > > From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> > > > Sent: Tuesday, August 7, 2018 4:37 PM
> > > >
> > > > > -----Original Message-----
> > > > > From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > > > > Sent: 07 August 2018 09:33
> > > > > To: Jan Beulich <JBeulich@suse.com>; Paul Durrant
> > > > > <Paul.Durrant@citrix.com>
> > > > > Cc: George Dunlap <George.Dunlap@citrix.com>; xen-devel <xen-
> > > > > devel@lists.xenproject.org>
> > > > > Subject: RE: [PATCH v5 09/15] vtd: add lookup_page method to
> > > > iommu_ops
> > > > >
> > > > > > From: Jan Beulich [mailto:JBeulich@suse.com]
> > > > > > Sent: Tuesday, August 7, 2018 4:30 PM
> > > > > >
> > > > > > >>> On 07.08.18 at 10:21, <Paul.Durrant@citrix.com> wrote:
> > > > > > >> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > > > > > >> Sent: 07 August 2018 04:25
> > > > > > >>
> > > > > > >> > From: Paul Durrant [mailto:paul.durrant@citrix.com]
> > > > > > >> > Sent: Saturday, August 4, 2018 1:22 AM
> > > > > > >> >
> > > > > > >> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > > > > > >> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > > > > > >> > @@ -1830,6 +1830,39 @@ static int __must_check
> > > > > > >> > intel_iommu_unmap_page(struct domain *d,
> > > > > > >> >      return dma_pte_clear_one(d, bfn_to_baddr(bfn));
> > > > > > >> >  }
> > > > > > >> >
> > > > > > >> > +static int intel_iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
> > > > > > >> > +                                   unsigned int *flags)
> > > > > > >>
> > > > > > >> Not looking at later patches yet... but in concept bfn address
> > > > > > >> space is per device instead of per domain.
> > > > > > >
> > > > > > > Not in this case. Xen has always maintained a single IOMMU address
> > > > > > > space per virtual machine. That is what BFN refers to.
> > > > > >
> > > > > > But is that a model we can maintain mid and long term? In particular
> > > > > > on ARM, where Julien has told me a single system could have multiple
> > > > > > _different_ IOMMUs, I could easily see the address spaces diverging.
> > > > > >
> > > > >
> > > > > multiple IOMMUs is another thing.
> > > > >
> > > > > what I questioned is that even one IOMMU needs to support multiple
> > > > > address spaces. That is the point of an IOMMU...
> > > >
> > > > Indeed and that is why we use it to enforce domain separation. I see no
> > > > need, as yet, to enforce separation within a domain. That need may arise
> > > > later and the code can be modified at that point.
> > > >
> > >
> > > but then you need a completely different set of APIs at that time...
> >
> > Ok. Would you be happy if I add the option to supply an SBDF in the map
> > and unmap hypercalls but ignore the value for now? Or even have a 'global'
> > flag but return -EOPNOTSUPP if it is not specified. That would avoid the
> > need to add new hypercalls for per-device mapping.
> >
> 
> That would be better! Maybe it would also be worth a capability query
> interface to tell whether global or per-bdf mapping is supported?

I could make it an option to the 'enable' hypercall. A caller could try to enable per-device mappings first and, if that fails, fall back to global.
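
E.g. (flag and function names purely illustrative):

/* Illustrative sketch only - not a final interface. */
#define XEN_IOMMUOP_enable_per_device (1u << 0)

/* In the dom0 pv-iommu code, roughly (pv_iommu_enable() is a
 * hypothetical wrapper around the 'enable' iommu_op): */
rc = pv_iommu_enable(XEN_IOMMUOP_enable_per_device);
if ( rc == -EOPNOTSUPP )
    rc = pv_iommu_enable(0);    /* fall back to the global mapping space */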

  Paul

> 
> Thanks
> Kevin


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 12/15] x86: add iommu_op to enable modification of IOMMU mappings
  2018-08-07  9:01           ` Tian, Kevin
@ 2018-08-07  9:12             ` Paul Durrant
  2018-08-07  9:19               ` Tian, Kevin
  0 siblings, 1 reply; 57+ messages in thread
From: Paul Durrant @ 2018-08-07  9:12 UTC (permalink / raw)
  To: Kevin Tian, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Jan Beulich, Ian Jackson

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: 07 August 2018 10:01
> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
> Cc: Stefano Stabellini <sstabellini@kernel.org>; Wei Liu
> <wei.liu2@citrix.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>; Tim
> (Xen.org) <tim@xen.org>; George Dunlap <George.Dunlap@citrix.com>;
> Julien Grall <julien.grall@arm.com>; Jan Beulich <jbeulich@suse.com>; Ian
> Jackson <Ian.Jackson@citrix.com>
> Subject: RE: [Xen-devel] [PATCH v5 12/15] x86: add iommu_op to enable
> modification of IOMMU mappings
> 
> > From: Paul Durrant
> > Sent: Tuesday, August 7, 2018 4:44 PM
> >
> > > -----Original Message-----
> > > From: Tian, Kevin [mailto:kevin.tian@intel.com]
> > > Sent: 07 August 2018 09:38
> > > To: Paul Durrant <Paul.Durrant@citrix.com>; xen-
> > devel@lists.xenproject.org
> > > Cc: Stefano Stabellini <sstabellini@kernel.org>; Wei Liu
> > > <wei.liu2@citrix.com>; George Dunlap <George.Dunlap@citrix.com>;
> > > Andrew Cooper <Andrew.Cooper3@citrix.com>; Ian Jackson
> > > <Ian.Jackson@citrix.com>; Tim (Xen.org) <tim@xen.org>; Julien Grall
> > > <julien.grall@arm.com>; Jan Beulich <jbeulich@suse.com>
> > > Subject: RE: [Xen-devel] [PATCH v5 12/15] x86: add iommu_op to enable
> > > modification of IOMMU mappings
> > >
> > > > From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> > > > Sent: Tuesday, August 7, 2018 4:33 PM
> > > >
> > > > >
> > > > > > From: Paul Durrant
> > > > > > Sent: Saturday, August 4, 2018 1:22 AM
> > > > > >
> > > > > > This patch adds an iommu_op which checks whether it is possible or
> > > > > > safe for a domain to modify its own IOMMU mappings and, if so, creates
> > > > > > a rangeset to track modifications.
> > > > >
> > > > > Have to say that there might be a concept mismatch between us,
> > > > > so I will stop review here until we get aligned on the basic
> > > > > understanding.
> > > > >
> > > > > What an IOMMU does is to provide DMA isolation between devices.
> > > > > Each device can be hooked with a different translation structure
> > > > > (representing a different bfn address space). Linux kernel uses this
> > > > > mechanism to harden kernel drivers (through dma APIs). Multiple
> > > > > devices can be also attached to the same address space, used by
> > > > > hypervisor when devices are assigned to the same VM.
> > > > >
> > > >
> > > > Indeed.
> > > >
> > > > > Now with pvIOMMU exposed to dom0, dom0 could use it to harden
> > > > > kernel drivers too. Then there will be multiple bfn address spaces:
> > > > >
> > > > > - A default bfn address space created by Xen, where bfn = pfn
> > > > > - multiple per-bdf bfn address spaces created by Dom0, where
> > > > > bfn is completely irrelevant to pfn.
> > > > >
> > > > > the default space should not be changed by Dom0. It is attached
> > > > > to devices for which dom0 doesn't enable pviommu mapping.
> > > >
> > > > No that's not the point here. I'm not trying to re-architect Xen's IOMMU
> > > > handling. All the IOMMU code in Xen AFAICT is built around the assumption
> > > > there is one set of page tables per-VM and all devices assigned to the VM
> > > > get the same page tables. I suspect trying to change that will be a huge can
> > > > of worms and I have no need to go there for my purposes.
> > >
> > > don't just think from Xen side. think about what Dom0 feels about
> > > this IOMMU.
> > >
> > > ideally pviommu driver is a new vendor driver attached to iommu
> > > core within dom0. it needs to provide iommu dma ops to support
> > > dma_alloc/map operations from different device drivers. iommu
> > > core maintains a separate iova space for each device, so device
> > > drivers can be isolated from each other.
> >
> > But there is nothing that means that the IOVA space cannot be global, and
> > that is good enough for a PV dom0.
> 
> You are right! Although my concept is all built around physical IOMMU
> capability, I did a check that Linux doesn't state that IOVA cannot be
> global. It's purely up to the vendor IOMMU driver to decide.
> 

Yes, that's my understanding. So Xen would be the vendor in this case and the strictly global IOVA space is not a problem.

> So here the current version of pvIOMMU only provides a global address
> space, though that is unlike any existing IOMMU. Maybe we should
> explicitly call out this fact in some capability field for future extension.
> 

Does Linux expose such a thing through the dma_ops? I didn't see one, but if there is one then we can of course set it correctly according to the limitations of the underlying hypervisor. Thus, if Xen is modified in future to support per-device mappings, this could be exposed through to the drivers using DMA.

> >
> > >
> > > Now dom0 got only one global space. then why does dom0 need
> > > to enable pviommu at all?
> >
> > As I explained in another reply, it is primarily to allow a PV dom0 to have a
> > BFN:GFN map. Since a PV domain maintains its own P2M then it is the
> > domain that maintains the mapping. That is all I need to do.
> 
> yes, I got this one.

Ok, cool.

   Paul

> 
> Thanks
> Kevin

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 12/15] x86: add iommu_op to enable modification of IOMMU mappings
  2018-08-07  9:12             ` Paul Durrant
@ 2018-08-07  9:19               ` Tian, Kevin
  2018-08-07  9:22                 ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Tian, Kevin @ 2018-08-07  9:19 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Jan Beulich, Ian Jackson

> From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> Sent: Tuesday, August 7, 2018 5:12 PM
> 
> 
> > So here the current version of pvIOMMU only provides a global address
> > space, though that is unlike any existing IOMMU. Maybe we should
> > explicitly call out this fact in some capability field for future extension.
> >
> 
> Does Linux expose such a thing through the dma_ops? I didn't see one, but
> if there is one then we can of course set it correctly according to the
> limitations of the underlying hypervisor. Thus, if Xen is modified in future to
> support per-device mappings, this could be exposed through to the drivers
> using DMA.
> 

It's not about Linux-internal exposing; it's about the hypervisor
exposing the capability to the pvIOMMU driver, so the latter knows
whether to enable per-device mapping. But, as you commented in another
mail, it could be achieved through the enabling sequence.

Btw, could you put a clarification of our discussion in your next
version's cover letter, to set the tone clearly from the start? I don't
think I'll be the only one to misinterpret the goal of this work. :-)

Thanks
Kevin

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 12/15] x86: add iommu_op to enable modification of IOMMU mappings
  2018-08-07  9:19               ` Tian, Kevin
@ 2018-08-07  9:22                 ` Paul Durrant
  0 siblings, 0 replies; 57+ messages in thread
From: Paul Durrant @ 2018-08-07  9:22 UTC (permalink / raw)
  To: Kevin Tian, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Jan Beulich, Ian Jackson

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: 07 August 2018 10:20
> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
> Cc: Stefano Stabellini <sstabellini@kernel.org>; Wei Liu
> <wei.liu2@citrix.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>; Tim
> (Xen.org) <tim@xen.org>; George Dunlap <George.Dunlap@citrix.com>;
> Julien Grall <julien.grall@arm.com>; Jan Beulich <jbeulich@suse.com>; Ian
> Jackson <Ian.Jackson@citrix.com>
> Subject: RE: [Xen-devel] [PATCH v5 12/15] x86: add iommu_op to enable
> modification of IOMMU mappings
> 
> > From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> > Sent: Tuesday, August 7, 2018 5:12 PM
> >
> >
> > > So here the current version of pvIOMMU only provides a global address
> > > space, though that is unlike any existing IOMMU. Maybe we should
> > > explicitly call out this fact in some capability field for future extension.
> > >
> >
> > Does Linux expose such a thing through the dma_ops? I didn't see one, but
> > if there is one then we can of course set it correctly according to the
> > limitations of the underlying hypervisor. Thus, if Xen is modified in future to
> > support per-device mappings, this could be exposed through to the drivers
> > using DMA.
> >
> 
> It's not about Linux-internal exposing; it's about the hypervisor
> exposing the capability to the pvIOMMU driver, so the latter knows
> whether to enable per-device mapping. But, as you commented in another
> mail, it could be achieved through the enabling sequence.
> 

Ok. Let's go with that then.

> Btw, could you put a clarification of our discussion in your next
> version's cover letter, to set the tone clearly from the start? I don't
> think I'll be the only one to misinterpret the goal of this work. :-)
> 

I can certainly do that. I'll put an explanation of the XenServer use-case in as a concrete example too.

  Paul

> Thanks
> Kevin

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 01/15] iommu: turn need_iommu back into a boolean.
  2018-08-03 17:22 ` [PATCH v5 01/15] iommu: turn need_iommu back into a boolean Paul Durrant
@ 2018-08-08 13:39   ` Jan Beulich
  2018-08-08 13:56     ` Paul Durrant
  0 siblings, 1 reply; 57+ messages in thread
From: Jan Beulich @ 2018-08-08 13:39 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, xen-devel

>>> On 03.08.18 at 19:22, <paul.durrant@citrix.com> wrote:
> As noted in [1] the tri-state nature of need_iommu after commit 3e06b989 is
> confusing.
> 
> Because arch_iommu_populate_page_table() removes pages from the page list
> as it maps them it is actually safe to re-invoke multiple times without
> the need for any specific indication it has been called before, so it
> is actually safe to simply turn need_iommu back into a boolean with no
> change to the population algorithm.
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2018-07/msg01870.html 

I have to admit that I wouldn't read "confusing" into that mail. And
given the below, I continue to wonder whether you really, really
need to change this.

> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -214,14 +214,14 @@ void iommu_teardown(struct domain *d)
>  {
>      const struct domain_iommu *hd = dom_iommu(d);
>  
> -    d->need_iommu = 0;
> +    d->need_iommu = false;
>      hd->platform_ops->teardown(d);
>      tasklet_schedule(&iommu_pt_cleanup_tasklet);
>  }
>  
>  int iommu_construct(struct domain *d)
>  {
> -    if ( need_iommu(d) > 0 )
> +    if ( need_iommu(d) )
>          return 0;
>  
>      if ( !iommu_use_hap_pt(d) )
> @@ -233,7 +233,7 @@ int iommu_construct(struct domain *d)
>              return rc;
>      }
>  
> -    d->need_iommu = 1;
> +    d->need_iommu = true;

So with the setting to -1 gone from the caller, need_iommu(d) will
continue to return false until this completion point is reached. The
fundamental idea of the tristate was that once we've started
populating the IOMMU page tables (recall, the domain is not
paused if this is a hotplug), all _other_ operations requiring IOMMU
page table manipulation (grant table code, for example) will
DTRT (tm) despite the code here perhaps never getting to notice
the relevant page.

Trust me, it wasn't a lightweight decision to make this a tristate,
I just couldn't see a better approach (short of using a second
boolean, which I would have liked even less), and I still can't.
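
To illustrate (a rough sketch of the old flow, not the literal call sites):

/* Tristate arrangement, roughly: */
d->need_iommu = -1;                     /* population in progress */
rc = arch_iommu_populate_page_table(d);
/* ... on success, construction eventually ends with ... */
d->need_iommu = 1;                      /* population complete */

/* Meanwhile, e.g. grant-table code still does the right thing: */
if ( need_iommu(d) )                    /* true for -1 as well as 1 */
    rc = iommu_map_page(d, bfn, mfn, flags);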

> @@ -493,7 +493,7 @@ static void iommu_dump_p2m_table(unsigned char key)
>      ops = iommu_get_ops();
>      for_each_domain(d)
>      {
> -        if ( is_hardware_domain(d) || need_iommu(d) <= 0 )
> +        if ( is_hardware_domain(d) || !need_iommu(d) )
>              continue;

I don't think the original semantics, whereby dumping is skipped for
domains with their IOMMU page tables still under construction, are
being retained here.

Jan




^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 01/15] iommu: turn need_iommu back into a boolean.
  2018-08-08 13:39   ` Jan Beulich
@ 2018-08-08 13:56     ` Paul Durrant
  0 siblings, 0 replies; 57+ messages in thread
From: Paul Durrant @ 2018-08-08 13:56 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, xen-devel, Ian Jackson

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 08 August 2018 14:39
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Wei Liu <wei.liu2@citrix.com>; George
> Dunlap <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>;
> Stefano Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> devel@lists.xenproject.org>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Tim (Xen.org) <tim@xen.org>
> Subject: Re: [PATCH v5 01/15] iommu: turn need_iommu back into a
> boolean.
> 
> >>> On 03.08.18 at 19:22, <paul.durrant@citrix.com> wrote:
> > As noted in [1] the tri-state nature of need_iommu after commit 3e06b989
> > is confusing.
> >
> > Because arch_iommu_populate_page_table() removes pages from the page list
> > as it maps them it is actually safe to re-invoke multiple times without
> > the need for any specific indication it has been called before, so it
> > is actually safe to simply turn need_iommu back into a boolean with no
> > change to the population algorithm.
> >
> > [1] https://lists.xenproject.org/archives/html/xen-devel/2018-07/msg01870.html
> 
> I have to admit that I wouldn't read "confusing" into that mail. And
> given the below, I continue to wonder whether you really, really
> need to change this.

Ok, I'll squash this patch into the subsequent patch where I separate the ideas of a domain having IOMMU page tables and requiring them to be kept in sync.
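
Roughly what I have in mind for the split (sketch only; names and details
may well change, and it assumes a per-domain status field replacing the
plain boolean):

/* Sketch: 'status' would be a new field in struct domain_iommu. */
enum iommu_status {
    IOMMU_STATUS_disabled,
    IOMMU_STATUS_initializing,  /* page tables under construction */
    IOMMU_STATUS_initialized,
};

#define has_iommu_pt(d)  (dom_iommu(d)->status != IOMMU_STATUS_disabled)
#define sync_iommu_pt(d) (dom_iommu(d)->status == IOMMU_STATUS_initialized)

Exactly which predicate the various call sites (grant table etc.) should
use, given your point about in-progress population, is something I still
need to work through.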

  Paul

> 
> > --- a/xen/drivers/passthrough/iommu.c
> > +++ b/xen/drivers/passthrough/iommu.c
> > @@ -214,14 +214,14 @@ void iommu_teardown(struct domain *d)
> >  {
> >      const struct domain_iommu *hd = dom_iommu(d);
> >
> > -    d->need_iommu = 0;
> > +    d->need_iommu = false;
> >      hd->platform_ops->teardown(d);
> >      tasklet_schedule(&iommu_pt_cleanup_tasklet);
> >  }
> >
> >  int iommu_construct(struct domain *d)
> >  {
> > -    if ( need_iommu(d) > 0 )
> > +    if ( need_iommu(d) )
> >          return 0;
> >
> >      if ( !iommu_use_hap_pt(d) )
> > @@ -233,7 +233,7 @@ int iommu_construct(struct domain *d)
> >              return rc;
> >      }
> >
> > -    d->need_iommu = 1;
> > +    d->need_iommu = true;
> 
> So with the setting to -1 gone from the caller, need_iommu(d) will
> continue to return false until this completion point is reached. The
> fundamental idea of the tristate was that once we've started
> populating the IOMMU page tables (recall, the domain is not
> paused if this is a hotplug), all _other_ operations requiring IOMMU
> page table manipulation (grant table code, for example) will
> DTRT (tm) despite the code here perhaps never getting to notice
> the relevant page.
> 
> Trust me, it wasn't a lightweight decision to make this a tristate,
> I just couldn't see a better approach (short of using a second
> boolean, which I would have liked even less), and I still can't.
> 
> > @@ -493,7 +493,7 @@ static void iommu_dump_p2m_table(unsigned char key)
> >      ops = iommu_get_ops();
> >      for_each_domain(d)
> >      {
> > -        if ( is_hardware_domain(d) || need_iommu(d) <= 0 )
> > +        if ( is_hardware_domain(d) || !need_iommu(d) )
> >              continue;
> 
> I don't think the original semantics of the dumping to be skipped for
> domains with their IOMMU page tables under construction is being
> retained here.
> 
> Jan
> 



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v5 06/15] public / x86: introduce __HYPERCALL_iommu_op
  2018-08-03 17:22 ` [PATCH v5 06/15] public / x86: introduce __HYPERCALL_iommu_op Paul Durrant
  2018-08-07  3:00   ` Tian, Kevin
@ 2018-08-17 21:10   ` Daniel De Graaf
  1 sibling, 0 replies; 57+ messages in thread
From: Daniel De Graaf @ 2018-08-17 21:10 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Jan Beulich

On 08/03/2018 01:22 PM, Paul Durrant wrote:
> This patch introduces the boilerplate for a new hypercall to allow a
> domain to control IOMMU mappings for its own pages.
> Whilst there is duplication of code between the native and compat entry
> points which appears ripe for some form of combination, I think it is
> better to maintain the separation as-is because the compat entry point
> will necessarily gain complexity in subsequent patches.
> 
> NOTE: This hypercall is only implemented for x86 and is currently
>        restricted by XSM to dom0. Its scope can be expanded in future
>        if need be.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>

Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>


^ permalink raw reply	[flat|nested] 57+ messages in thread

end of thread, other threads:[~2018-08-17 21:10 UTC | newest]

Thread overview: 57+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-03 17:22 [PATCH v5 00/15] paravirtual IOMMU interface Paul Durrant
2018-08-03 17:22 ` [PATCH v5 01/15] iommu: turn need_iommu back into a boolean Paul Durrant
2018-08-08 13:39   ` Jan Beulich
2018-08-08 13:56     ` Paul Durrant
2018-08-03 17:22 ` [PATCH v5 02/15] iommu: introduce the concept of BFN Paul Durrant
2018-08-07  2:38   ` Tian, Kevin
2018-08-07  7:59     ` Paul Durrant
2018-08-07  8:26       ` Tian, Kevin
2018-08-03 17:22 ` [PATCH v5 03/15] iommu: make use of type-safe BFN and MFN in exported functions Paul Durrant
2018-08-07  2:45   ` Tian, Kevin
2018-08-03 17:22 ` [PATCH v5 04/15] iommu: push use of type-safe BFN and MFN into iommu_ops Paul Durrant
2018-08-07  2:49   ` Tian, Kevin
2018-08-03 17:22 ` [PATCH v5 05/15] iommu: don't domain_crash() inside iommu_map/unmap_page() Paul Durrant
2018-08-07  2:55   ` Tian, Kevin
2018-08-07  8:05     ` Paul Durrant
2018-08-07  8:23       ` Jan Beulich
2018-08-03 17:22 ` [PATCH v5 06/15] public / x86: introduce __HYPERCALL_iommu_op Paul Durrant
2018-08-07  3:00   ` Tian, Kevin
2018-08-07  8:10     ` Paul Durrant
2018-08-07  8:25       ` Jan Beulich
2018-08-17 21:10   ` Daniel De Graaf
2018-08-03 17:22 ` [PATCH v5 07/15] iommu: track reserved ranges using a rangeset Paul Durrant
2018-08-07  3:04   ` Tian, Kevin
2018-08-07  8:16     ` Paul Durrant
2018-08-07  8:23       ` Tian, Kevin
2018-08-03 17:22 ` [PATCH v5 08/15] x86: add iommu_op to query reserved ranges Paul Durrant
2018-08-03 17:22 ` [PATCH v5 09/15] vtd: add lookup_page method to iommu_ops Paul Durrant
2018-08-07  3:25   ` Tian, Kevin
2018-08-07  8:21     ` Paul Durrant
2018-08-07  8:29       ` Jan Beulich
2018-08-07  8:32         ` Tian, Kevin
2018-08-07  8:37           ` Paul Durrant
2018-08-07  8:48             ` Tian, Kevin
2018-08-07  8:56               ` Paul Durrant
2018-08-07  9:03                 ` Tian, Kevin
2018-08-07  9:07                   ` Paul Durrant
2018-08-07  8:31       ` Tian, Kevin
2018-08-07  8:35         ` Paul Durrant
2018-08-07  8:47           ` Tian, Kevin
2018-08-03 17:22 ` [PATCH v5 10/15] mm / iommu: include need_iommu() test in iommu_use_hap_pt() Paul Durrant
2018-08-07  3:32   ` Tian, Kevin
2018-08-03 17:22 ` [PATCH v5 11/15] mm / iommu: split need_iommu() into has_iommu_pt() and sync_iommu_pt() Paul Durrant
2018-08-03 18:18   ` Razvan Cojocaru
2018-08-07  3:41   ` Tian, Kevin
2018-08-07  8:24     ` Paul Durrant
2018-08-03 17:22 ` [PATCH v5 12/15] x86: add iommu_op to enable modification of IOMMU mappings Paul Durrant
2018-08-07  4:08   ` Tian, Kevin
2018-08-07  8:32     ` Paul Durrant
2018-08-07  8:37       ` Tian, Kevin
2018-08-07  8:44         ` Paul Durrant
2018-08-07  9:01           ` Tian, Kevin
2018-08-07  9:12             ` Paul Durrant
2018-08-07  9:19               ` Tian, Kevin
2018-08-07  9:22                 ` Paul Durrant
2018-08-03 17:22 ` [PATCH v5 13/15] memory: add get_paged_gfn() as a wrapper Paul Durrant
2018-08-03 17:22 ` [PATCH v5 14/15] x86: add iommu_ops to modify and flush IOMMU mappings Paul Durrant
2018-08-03 17:22 ` [PATCH v5 15/15] x86: extend the map and unmap iommu_ops to support grant references Paul Durrant
