* [PATCH v6 00/14] paravirtual IOMMU interface
From: Paul Durrant @ 2018-08-23  9:46 UTC
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
	Razvan Cojocaru, Andrew Cooper, Ian Jackson, George Dunlap,
	Tim Deegan, Julien Grall, Paul Durrant, Tamas K Lengyel,
	Jan Beulich, Brian Woods, Suravee Suthikulpanit

The idea of a paravirtual IOMMU interface was last discussed on xen-devel
several years ago, where it was narrowed down to a draft specification [1].
An RFC patch series with an implementation was also posted, but it was
never followed through.

In this patch series I have tried to simplify the interface and have
therefore moved away from the draft specification. There is no new
specification as yet, but the interface in the introduced iommu_op header
file should hopefully be understandable without one.

[1] https://lists.xenproject.org/archives/html/xen-devel/2016-02/msg01428.html
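
To give a feel for the shape of the interface ahead of reading the header
itself, here is a purely illustrative sketch of a batched-op hypercall
structure. The struct, field and sub-op names below are hypothetical
stand-ins, not the actual definitions from the iommu_op header introduced
by this series:

    /* Hypothetical illustration only; see xen/include/public/iommu_op.h
     * in this series for the real interface. */
    struct illustrative_iommu_op {
        uint16_t op;        /* sub-operation: query reserved ranges,
                             * enable modification, map, unmap, flush */
        int16_t status;     /* per-op result, filled in by Xen */
        union {
            struct {
                uint64_t bfn;   /* bus frame number to establish */
                uint64_t gfn;   /* guest frame it should map to */
                uint32_t flags; /* e.g. read/write permissions */
            } map;
            /* ... one member per sub-operation ... */
        } u;
    };

An interface of this shape would typically be invoked with a guest handle
to an array of such ops plus a count, each op reporting success or failure
individually via its status field.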

Paul Durrant (14):
  iommu: introduce the concept of BFN...
  iommu: make use of type-safe BFN and MFN in exported functions
  iommu: push use of type-safe BFN and MFN into iommu_ops
  iommu: don't domain_crash() inside iommu_map/unmap_page()
  public / x86: introduce __HYPERCALL_iommu_op
  iommu: track reserved ranges using a rangeset
  x86: add iommu_op to query reserved ranges
  vtd: add lookup_page method to iommu_ops
  mm / iommu: include need_iommu() test in iommu_use_hap_pt()
  mm / iommu: split need_iommu() into has_iommu_pt() and
    need_iommu_pt_sync()
  x86: add iommu_op to enable modification of IOMMU mappings
  memory: add get_paged_gfn() as a wrapper...
  x86: add iommu_ops to modify and flush IOMMU mappings
  x86: extend the map and unmap iommu_ops to support grant references

 tools/flask/policy/modules/xen.if             |   1 +
 xen/arch/arm/p2m.c                            |   9 +-
 xen/arch/x86/hvm/emulate.c                    |  32 +-
 xen/arch/x86/hvm/hvm.c                        |  16 +-
 xen/arch/x86/hvm/hypercall.c                  |   1 +
 xen/arch/x86/hvm/mtrr.c                       |   5 +-
 xen/arch/x86/hypercall.c                      |   1 +
 xen/arch/x86/mm.c                             |  15 +-
 xen/arch/x86/mm/mem_sharing.c                 |   2 +-
 xen/arch/x86/mm/p2m-ept.c                     |  19 +-
 xen/arch/x86/mm/p2m-pt.c                      |  52 ++-
 xen/arch/x86/mm/p2m.c                         |  42 +-
 xen/arch/x86/mm/paging.c                      |   2 +-
 xen/arch/x86/pv/hypercall.c                   |   1 +
 xen/arch/x86/x86_64/mm.c                      |   8 +-
 xen/common/Makefile                           |   1 +
 xen/common/grant_table.c                      | 197 +++++++--
 xen/common/iommu_op.c                         | 598 ++++++++++++++++++++++++++
 xen/common/memory.c                           |  69 ++-
 xen/common/vm_event.c                         |   2 +-
 xen/drivers/passthrough/amd/iommu_cmd.c       |  18 +-
 xen/drivers/passthrough/amd/iommu_map.c       |  86 ++--
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |   6 +-
 xen/drivers/passthrough/arm/smmu.c            |  20 +-
 xen/drivers/passthrough/device_tree.c         |  21 +-
 xen/drivers/passthrough/iommu.c               | 106 +++--
 xen/drivers/passthrough/pci.c                 |   6 +-
 xen/drivers/passthrough/vtd/iommu.c           |  90 ++--
 xen/drivers/passthrough/vtd/iommu.h           |   3 +
 xen/drivers/passthrough/vtd/x86/vtd.c         |  18 +-
 xen/drivers/passthrough/x86/iommu.c           |   8 +-
 xen/include/Makefile                          |   2 +
 xen/include/asm-arm/grant_table.h             |   2 +-
 xen/include/asm-arm/iommu.h                   |   2 +-
 xen/include/asm-arm/p2m.h                     |   3 +
 xen/include/asm-x86/grant_table.h             |   2 +-
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |   8 +-
 xen/include/asm-x86/iommu.h                   |   5 +-
 xen/include/asm-x86/p2m.h                     |   2 +
 xen/include/public/iommu_op.h                 | 254 +++++++++++
 xen/include/public/xen.h                      |   1 +
 xen/include/xen/grant_table.h                 |   7 +
 xen/include/xen/hypercall.h                   |  12 +
 xen/include/xen/iommu.h                       |  83 +++-
 xen/include/xen/mm.h                          |   5 +
 xen/include/xen/sched.h                       |  14 +-
 xen/include/xlat.lst                          |   8 +
 xen/include/xsm/dummy.h                       |   6 +
 xen/include/xsm/xsm.h                         |   6 +
 xen/xsm/dummy.c                               |   1 +
 xen/xsm/flask/hooks.c                         |   6 +
 xen/xsm/flask/policy/access_vectors           |   2 +
 52 files changed, 1576 insertions(+), 310 deletions(-)
 create mode 100644 xen/common/iommu_op.c
 create mode 100644 xen/include/public/iommu_op.h
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Brian Woods <brian.woods@amd.com>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Tamas K Lengyel <tamas@tklengyel.com>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
-- 
2.11.0



* [PATCH v6 01/14] iommu: introduce the concept of BFN...
From: Paul Durrant @ 2018-08-23  9:46 UTC
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Suravee Suthikulpanit,
	Julien Grall, Paul Durrant, Jan Beulich

...meaning 'bus frame number', i.e. a frame number mapped in the IOMMU
rather than the MMU.

This patch is a largely cosmetic change that substitutes the terms 'bfn'
and 'baddr' for 'gfn' and 'gaddr' in all the places where the frame number
or address relates to the IOMMU rather than the MMU.

The parts that are not purely cosmetic are:

 - the introduction of a type-safe declaration of bfn_t and definition of
   INVALID_BFN to make the substitution of gfn_x(INVALID_GFN) mechanical.
 - the introduction of __bfn_to_baddr and __baddr_to_bfn (and type-safe
   variants without the leading __) with some use of the former.

Subsequent patches will convert code to make use of type-safe BFNs.
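
As a standalone illustration of what the type-safe declaration buys, the
sketch below assumes the usual shape of Xen's TYPE_SAFE() macro from
xen/include/xen/typesafe.h: in debug builds the value is wrapped in a
single-member struct so the compiler rejects accidental mixing of frame
number spaces, while release builds collapse it to a plain typedef.

    #include <stdint.h> /* standalone sketch; Xen uses its own types */

    /* Approximation of TYPE_SAFE() as used here (debug flavour). */
    #define TYPE_SAFE(_type, _name)                                        \
        typedef struct { _type _name; } _name##_t;                         \
        static inline _name##_t _##_name(_type n)                          \
            { return (_name##_t) { n }; }                                  \
        static inline _type _name##_x(_name##_t n) { return n._name; }

    TYPE_SAFE(uint64_t, bfn);      /* yields bfn_t, _bfn() and bfn_x() */

    uint64_t example(void)
    {
        bfn_t bfn = _bfn(0x1000);  /* wrap a raw frame number explicitly */
        /* uint64_t oops = bfn; */ /* would not compile: no implicit mix */
        return bfn_x(bfn);         /* unwrap explicitly */
    }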

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Kevin Tian <kevin.tian@intel.com>

v6:
 - Dropped changes to 'mfn' section in xen/mm.h as suggested by Kevin.

v3:
 - Get rid of intermediate 'frame' variables again.

v2:
 - Addressed comments from Jan.
---
 xen/drivers/passthrough/amd/iommu_cmd.c     | 18 +++----
 xen/drivers/passthrough/amd/iommu_map.c     | 76 ++++++++++++++---------------
 xen/drivers/passthrough/amd/pci_amd_iommu.c |  2 +-
 xen/drivers/passthrough/arm/smmu.c          | 16 +++---
 xen/drivers/passthrough/iommu.c             | 28 +++++------
 xen/drivers/passthrough/vtd/iommu.c         | 30 ++++++------
 xen/include/xen/iommu.h                     | 38 ++++++++++++---
 xen/include/xen/mm.h                        |  5 ++
 8 files changed, 122 insertions(+), 91 deletions(-)

diff --git a/xen/drivers/passthrough/amd/iommu_cmd.c b/xen/drivers/passthrough/amd/iommu_cmd.c
index 08247fa354..f93becd6e1 100644
--- a/xen/drivers/passthrough/amd/iommu_cmd.c
+++ b/xen/drivers/passthrough/amd/iommu_cmd.c
@@ -284,7 +284,7 @@ void invalidate_iommu_all(struct amd_iommu *iommu)
 }
 
 void amd_iommu_flush_iotlb(u8 devfn, const struct pci_dev *pdev,
-                           uint64_t gaddr, unsigned int order)
+                           baddr_t baddr, unsigned int order)
 {
     unsigned long flags;
     struct amd_iommu *iommu;
@@ -315,12 +315,12 @@ void amd_iommu_flush_iotlb(u8 devfn, const struct pci_dev *pdev,
 
     /* send INVALIDATE_IOTLB_PAGES command */
     spin_lock_irqsave(&iommu->lock, flags);
-    invalidate_iotlb_pages(iommu, maxpend, 0, queueid, gaddr, req_id, order);
+    invalidate_iotlb_pages(iommu, maxpend, 0, queueid, baddr, req_id, order);
     flush_command_buffer(iommu);
     spin_unlock_irqrestore(&iommu->lock, flags);
 }
 
-static void amd_iommu_flush_all_iotlbs(struct domain *d, uint64_t gaddr,
+static void amd_iommu_flush_all_iotlbs(struct domain *d, baddr_t baddr,
                                        unsigned int order)
 {
     struct pci_dev *pdev;
@@ -333,7 +333,7 @@ static void amd_iommu_flush_all_iotlbs(struct domain *d, uint64_t gaddr,
         u8 devfn = pdev->devfn;
 
         do {
-            amd_iommu_flush_iotlb(devfn, pdev, gaddr, order);
+            amd_iommu_flush_iotlb(devfn, pdev, baddr, order);
             devfn += pdev->phantom_stride;
         } while ( devfn != pdev->devfn &&
                   PCI_SLOT(devfn) == PCI_SLOT(pdev->devfn) );
@@ -342,7 +342,7 @@ static void amd_iommu_flush_all_iotlbs(struct domain *d, uint64_t gaddr,
 
 /* Flush iommu cache after p2m changes. */
 static void _amd_iommu_flush_pages(struct domain *d,
-                                   uint64_t gaddr, unsigned int order)
+                                   baddr_t baddr, unsigned int order)
 {
     unsigned long flags;
     struct amd_iommu *iommu;
@@ -352,13 +352,13 @@ static void _amd_iommu_flush_pages(struct domain *d,
     for_each_amd_iommu ( iommu )
     {
         spin_lock_irqsave(&iommu->lock, flags);
-        invalidate_iommu_pages(iommu, gaddr, dom_id, order);
+        invalidate_iommu_pages(iommu, baddr, dom_id, order);
         flush_command_buffer(iommu);
         spin_unlock_irqrestore(&iommu->lock, flags);
     }
 
     if ( ats_enabled )
-        amd_iommu_flush_all_iotlbs(d, gaddr, order);
+        amd_iommu_flush_all_iotlbs(d, baddr, order);
 }
 
 void amd_iommu_flush_all_pages(struct domain *d)
@@ -367,9 +367,9 @@ void amd_iommu_flush_all_pages(struct domain *d)
 }
 
 void amd_iommu_flush_pages(struct domain *d,
-                           unsigned long gfn, unsigned int order)
+                           unsigned long bfn, unsigned int order)
 {
-    _amd_iommu_flush_pages(d, (uint64_t) gfn << PAGE_SHIFT, order);
+    _amd_iommu_flush_pages(d, __bfn_to_baddr(bfn), order);
 }
 
 void amd_iommu_flush_device(struct amd_iommu *iommu, uint16_t bdf)
diff --git a/xen/drivers/passthrough/amd/iommu_map.c b/xen/drivers/passthrough/amd/iommu_map.c
index 70b4345b37..4deab9cd2f 100644
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -35,12 +35,12 @@ static unsigned int pfn_to_pde_idx(unsigned long pfn, unsigned int level)
     return idx;
 }
 
-void clear_iommu_pte_present(unsigned long l1_mfn, unsigned long gfn)
+void clear_iommu_pte_present(unsigned long l1_mfn, unsigned long bfn)
 {
     u64 *table, *pte;
 
     table = map_domain_page(_mfn(l1_mfn));
-    pte = table + pfn_to_pde_idx(gfn, IOMMU_PAGING_MODE_LEVEL_1);
+    pte = table + pfn_to_pde_idx(bfn, IOMMU_PAGING_MODE_LEVEL_1);
     *pte = 0;
     unmap_domain_page(table);
 }
@@ -104,7 +104,7 @@ static bool_t set_iommu_pde_present(u32 *pde, unsigned long next_mfn,
     return need_flush;
 }
 
-static bool_t set_iommu_pte_present(unsigned long pt_mfn, unsigned long gfn, 
+static bool_t set_iommu_pte_present(unsigned long pt_mfn, unsigned long bfn,
                                     unsigned long next_mfn, int pde_level, 
                                     bool_t iw, bool_t ir)
 {
@@ -114,7 +114,7 @@ static bool_t set_iommu_pte_present(unsigned long pt_mfn, unsigned long gfn,
 
     table = map_domain_page(_mfn(pt_mfn));
 
-    pde = (u32*)(table + pfn_to_pde_idx(gfn, pde_level));
+    pde = (u32*)(table + pfn_to_pde_idx(bfn, pde_level));
 
     need_flush = set_iommu_pde_present(pde, next_mfn, 
                                        IOMMU_PAGING_MODE_LEVEL_0, iw, ir);
@@ -331,7 +331,7 @@ static void set_pde_count(u64 *pde, unsigned int count)
  * otherwise increase pde count if mfn is contigous with mfn - 1
  */
 static int iommu_update_pde_count(struct domain *d, unsigned long pt_mfn,
-                                  unsigned long gfn, unsigned long mfn,
+                                  unsigned long bfn, unsigned long mfn,
                                   unsigned int merge_level)
 {
     unsigned int pde_count, next_level;
@@ -347,7 +347,7 @@ static int iommu_update_pde_count(struct domain *d, unsigned long pt_mfn,
 
     /* get pde at merge level */
     table = map_domain_page(_mfn(pt_mfn));
-    pde = table + pfn_to_pde_idx(gfn, merge_level);
+    pde = table + pfn_to_pde_idx(bfn, merge_level);
 
     /* get page table of next level */
     ntable_maddr = amd_iommu_get_next_table_from_pte((u32*)pde);
@@ -362,7 +362,7 @@ static int iommu_update_pde_count(struct domain *d, unsigned long pt_mfn,
     mask = (1ULL<< (PTE_PER_TABLE_SHIFT * next_level)) - 1;
 
     if ( ((first_mfn & mask) == 0) &&
-         (((gfn & mask) | first_mfn) == mfn) )
+         (((bfn & mask) | first_mfn) == mfn) )
     {
         pde_count = get_pde_count(*pde);
 
@@ -387,7 +387,7 @@ out:
 }
 
 static int iommu_merge_pages(struct domain *d, unsigned long pt_mfn,
-                             unsigned long gfn, unsigned int flags,
+                             unsigned long bfn, unsigned int flags,
                              unsigned int merge_level)
 {
     u64 *table, *pde, *ntable;
@@ -398,7 +398,7 @@ static int iommu_merge_pages(struct domain *d, unsigned long pt_mfn,
     ASSERT( spin_is_locked(&hd->arch.mapping_lock) && pt_mfn );
 
     table = map_domain_page(_mfn(pt_mfn));
-    pde = table + pfn_to_pde_idx(gfn, merge_level);
+    pde = table + pfn_to_pde_idx(bfn, merge_level);
 
     /* get first mfn */
     ntable_mfn = amd_iommu_get_next_table_from_pte((u32*)pde) >> PAGE_SHIFT;
@@ -436,7 +436,7 @@ static int iommu_merge_pages(struct domain *d, unsigned long pt_mfn,
  * {Re, un}mapping super page frames causes re-allocation of io
  * page tables.
  */
-static int iommu_pde_from_gfn(struct domain *d, unsigned long pfn, 
+static int iommu_pde_from_bfn(struct domain *d, unsigned long pfn,
                               unsigned long pt_mfn[])
 {
     u64 *pde, *next_table_vaddr;
@@ -477,11 +477,11 @@ static int iommu_pde_from_gfn(struct domain *d, unsigned long pfn,
              next_table_mfn != 0 )
         {
             int i;
-            unsigned long mfn, gfn;
+            unsigned long mfn, bfn;
             unsigned int page_sz;
 
             page_sz = 1 << (PTE_PER_TABLE_SHIFT * (next_level - 1));
-            gfn =  pfn & ~((1 << (PTE_PER_TABLE_SHIFT * next_level)) - 1);
+            bfn =  pfn & ~((1 << (PTE_PER_TABLE_SHIFT * next_level)) - 1);
             mfn = next_table_mfn;
 
             /* allocate lower level page table */
@@ -499,10 +499,10 @@ static int iommu_pde_from_gfn(struct domain *d, unsigned long pfn,
 
             for ( i = 0; i < PTE_PER_TABLE_SIZE; i++ )
             {
-                set_iommu_pte_present(next_table_mfn, gfn, mfn, next_level,
+                set_iommu_pte_present(next_table_mfn, bfn, mfn, next_level,
                                       !!IOMMUF_writable, !!IOMMUF_readable);
                 mfn += page_sz;
-                gfn += page_sz;
+                bfn += page_sz;
              }
 
             amd_iommu_flush_all_pages(d);
@@ -540,7 +540,7 @@ static int iommu_pde_from_gfn(struct domain *d, unsigned long pfn,
     return 0;
 }
 
-static int update_paging_mode(struct domain *d, unsigned long gfn)
+static int update_paging_mode(struct domain *d, unsigned long bfn)
 {
     u16 bdf;
     void *device_entry;
@@ -554,13 +554,13 @@ static int update_paging_mode(struct domain *d, unsigned long gfn)
     unsigned long old_root_mfn;
     struct domain_iommu *hd = dom_iommu(d);
 
-    if ( gfn == gfn_x(INVALID_GFN) )
+    if ( bfn == bfn_x(INVALID_BFN) )
         return -EADDRNOTAVAIL;
-    ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
+    ASSERT(!(bfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
 
     level = hd->arch.paging_mode;
     old_root = hd->arch.root_table;
-    offset = gfn >> (PTE_PER_TABLE_SHIFT * (level - 1));
+    offset = bfn >> (PTE_PER_TABLE_SHIFT * (level - 1));
 
     ASSERT(spin_is_locked(&hd->arch.mapping_lock) && is_hvm_domain(d));
 
@@ -631,7 +631,7 @@ static int update_paging_mode(struct domain *d, unsigned long gfn)
     return 0;
 }
 
-int amd_iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
+int amd_iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
                        unsigned int flags)
 {
     bool_t need_flush = 0;
@@ -651,34 +651,34 @@ int amd_iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
     if ( rc )
     {
         spin_unlock(&hd->arch.mapping_lock);
-        AMD_IOMMU_DEBUG("Root table alloc failed, gfn = %lx\n", gfn);
+        AMD_IOMMU_DEBUG("Root table alloc failed, bfn = %lx\n", bfn);
         domain_crash(d);
         return rc;
     }
 
     /* Since HVM domain is initialized with 2 level IO page table,
-     * we might need a deeper page table for lager gfn now */
+     * we might need a deeper page table for wider bfn now */
     if ( is_hvm_domain(d) )
     {
-        if ( update_paging_mode(d, gfn) )
+        if ( update_paging_mode(d, bfn) )
         {
             spin_unlock(&hd->arch.mapping_lock);
-            AMD_IOMMU_DEBUG("Update page mode failed gfn = %lx\n", gfn);
+            AMD_IOMMU_DEBUG("Update page mode failed bfn = %lx\n", bfn);
             domain_crash(d);
             return -EFAULT;
         }
     }
 
-    if ( iommu_pde_from_gfn(d, gfn, pt_mfn) || (pt_mfn[1] == 0) )
+    if ( iommu_pde_from_bfn(d, bfn, pt_mfn) || (pt_mfn[1] == 0) )
     {
         spin_unlock(&hd->arch.mapping_lock);
-        AMD_IOMMU_DEBUG("Invalid IO pagetable entry gfn = %lx\n", gfn);
+        AMD_IOMMU_DEBUG("Invalid IO pagetable entry bfn = %lx\n", bfn);
         domain_crash(d);
         return -EFAULT;
     }
 
     /* Install 4k mapping first */
-    need_flush = set_iommu_pte_present(pt_mfn[1], gfn, mfn, 
+    need_flush = set_iommu_pte_present(pt_mfn[1], bfn, mfn,
                                        IOMMU_PAGING_MODE_LEVEL_1,
                                        !!(flags & IOMMUF_writable),
                                        !!(flags & IOMMUF_readable));
@@ -690,7 +690,7 @@ int amd_iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
     /* 4K mapping for PV guests never changes, 
      * no need to flush if we trust non-present bits */
     if ( is_hvm_domain(d) )
-        amd_iommu_flush_pages(d, gfn, 0);
+        amd_iommu_flush_pages(d, bfn, 0);
 
     for ( merge_level = IOMMU_PAGING_MODE_LEVEL_2;
           merge_level <= hd->arch.paging_mode; merge_level++ )
@@ -698,15 +698,15 @@ int amd_iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
         if ( pt_mfn[merge_level] == 0 )
             break;
         if ( !iommu_update_pde_count(d, pt_mfn[merge_level],
-                                     gfn, mfn, merge_level) )
+                                     bfn, mfn, merge_level) )
             break;
 
-        if ( iommu_merge_pages(d, pt_mfn[merge_level], gfn, 
+        if ( iommu_merge_pages(d, pt_mfn[merge_level], bfn,
                                flags, merge_level) )
         {
             spin_unlock(&hd->arch.mapping_lock);
             AMD_IOMMU_DEBUG("Merge iommu page failed at level %d, "
-                            "gfn = %lx mfn = %lx\n", merge_level, gfn, mfn);
+                            "bfn = %lx mfn = %lx\n", merge_level, bfn, mfn);
             domain_crash(d);
             return -EFAULT;
         }
@@ -720,7 +720,7 @@ out:
     return 0;
 }
 
-int amd_iommu_unmap_page(struct domain *d, unsigned long gfn)
+int amd_iommu_unmap_page(struct domain *d, unsigned long bfn)
 {
     unsigned long pt_mfn[7];
     struct domain_iommu *hd = dom_iommu(d);
@@ -739,34 +739,34 @@ int amd_iommu_unmap_page(struct domain *d, unsigned long gfn)
     }
 
     /* Since HVM domain is initialized with 2 level IO page table,
-     * we might need a deeper page table for lager gfn now */
+     * we might need a deeper page table for wider bfn now */
     if ( is_hvm_domain(d) )
     {
-        int rc = update_paging_mode(d, gfn);
+        int rc = update_paging_mode(d, bfn);
 
         if ( rc )
         {
             spin_unlock(&hd->arch.mapping_lock);
-            AMD_IOMMU_DEBUG("Update page mode failed gfn = %lx\n", gfn);
+            AMD_IOMMU_DEBUG("Update page mode failed bfn = %lx\n", bfn);
             if ( rc != -EADDRNOTAVAIL )
                 domain_crash(d);
             return rc;
         }
     }
 
-    if ( iommu_pde_from_gfn(d, gfn, pt_mfn) || (pt_mfn[1] == 0) )
+    if ( iommu_pde_from_bfn(d, bfn, pt_mfn) || (pt_mfn[1] == 0) )
     {
         spin_unlock(&hd->arch.mapping_lock);
-        AMD_IOMMU_DEBUG("Invalid IO pagetable entry gfn = %lx\n", gfn);
+        AMD_IOMMU_DEBUG("Invalid IO pagetable entry bfn = %lx\n", bfn);
         domain_crash(d);
         return -EFAULT;
     }
 
     /* mark PTE as 'page not present' */
-    clear_iommu_pte_present(pt_mfn[1], gfn);
+    clear_iommu_pte_present(pt_mfn[1], bfn);
     spin_unlock(&hd->arch.mapping_lock);
 
-    amd_iommu_flush_pages(d, gfn, 0);
+    amd_iommu_flush_pages(d, bfn, 0);
 
     return 0;
 }
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index 12d2695b89..d608631e6e 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -578,7 +578,7 @@ static void amd_dump_p2m_table_level(struct page_info* pg, int level,
                 maddr_to_page(next_table_maddr), next_level,
                 address, indent + 1);
         else
-            printk("%*sgfn: %08lx  mfn: %08lx\n",
+            printk("%*sbfn: %08lx  mfn: %08lx\n",
                    indent, "",
                    (unsigned long)PFN_DOWN(address),
                    (unsigned long)PFN_DOWN(next_table_maddr));
diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
index 74c09b0991..1e4d561b47 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2551,7 +2551,7 @@ static int __must_check arm_smmu_iotlb_flush_all(struct domain *d)
 }
 
 static int __must_check arm_smmu_iotlb_flush(struct domain *d,
-                                             unsigned long gfn,
+                                             unsigned long bfn,
                                              unsigned int page_count)
 {
 	/* ARM SMMU v1 doesn't have flush by VMA and VMID */
@@ -2737,7 +2737,7 @@ static void arm_smmu_iommu_domain_teardown(struct domain *d)
 	xfree(xen_domain);
 }
 
-static int __must_check arm_smmu_map_page(struct domain *d, unsigned long gfn,
+static int __must_check arm_smmu_map_page(struct domain *d, unsigned long bfn,
 			unsigned long mfn, unsigned int flags)
 {
 	p2m_type_t t;
@@ -2748,10 +2748,10 @@ static int __must_check arm_smmu_map_page(struct domain *d, unsigned long gfn,
 	 * protected by an IOMMU, Xen needs to add a 1:1 mapping in the domain
 	 * p2m to allow DMA request to work.
 	 * This is only valid when the domain is directed mapped. Hence this
-	 * function should only be used by gnttab code with gfn == mfn.
+	 * function should only be used by gnttab code with gfn == mfn == bfn.
 	 */
 	BUG_ON(!is_domain_direct_mapped(d));
-	BUG_ON(mfn != gfn);
+	BUG_ON(mfn != bfn);
 
 	/* We only support readable and writable flags */
 	if (!(flags & (IOMMUF_readable | IOMMUF_writable)))
@@ -2763,19 +2763,19 @@ static int __must_check arm_smmu_map_page(struct domain *d, unsigned long gfn,
 	 * The function guest_physmap_add_entry replaces the current mapping
 	 * if there is already one...
 	 */
-	return guest_physmap_add_entry(d, _gfn(gfn), _mfn(mfn), 0, t);
+	return guest_physmap_add_entry(d, _gfn(bfn), _mfn(bfn), 0, t);
 }
 
-static int __must_check arm_smmu_unmap_page(struct domain *d, unsigned long gfn)
+static int __must_check arm_smmu_unmap_page(struct domain *d, unsigned long bfn)
 {
 	/*
 	 * This function should only be used by gnttab code when the domain
-	 * is direct mapped
+	 * is direct mapped (i.e. gfn == mfn == bfn).
 	 */
 	if ( !is_domain_direct_mapped(d) )
 		return -EINVAL;
 
-	return guest_physmap_remove_page(d, _gfn(gfn), _mfn(gfn), 0);
+	return guest_physmap_remove_page(d, _gfn(bfn), _mfn(bfn), 0);
 }
 
 static const struct iommu_ops arm_smmu_iommu_ops = {
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 70d218f910..f88dad0177 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -185,7 +185,7 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
         page_list_for_each ( page, &d->page_list )
         {
             unsigned long mfn = mfn_x(page_to_mfn(page));
-            unsigned long gfn = mfn_to_gmfn(d, mfn);
+            unsigned long bfn = mfn_to_gmfn(d, mfn);
             unsigned int mapping = IOMMUF_readable;
             int ret;
 
@@ -194,7 +194,7 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
                   == PGT_writable_page) )
                 mapping |= IOMMUF_writable;
 
-            ret = hd->platform_ops->map_page(d, gfn, mfn, mapping);
+            ret = hd->platform_ops->map_page(d, bfn, mfn, mapping);
             if ( !rc )
                 rc = ret;
 
@@ -255,7 +255,7 @@ void iommu_domain_destroy(struct domain *d)
     arch_iommu_domain_destroy(d);
 }
 
-int iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
+int iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
                    unsigned int flags)
 {
     const struct domain_iommu *hd = dom_iommu(d);
@@ -264,13 +264,13 @@ int iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->map_page(d, gfn, mfn, flags);
+    rc = hd->platform_ops->map_page(d, bfn, mfn, flags);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
             printk(XENLOG_ERR
-                   "d%d: IOMMU mapping gfn %#lx to mfn %#lx failed: %d\n",
-                   d->domain_id, gfn, mfn, rc);
+                   "d%d: IOMMU mapping bfn %#lx to mfn %#lx failed: %d\n",
+                   d->domain_id, bfn, mfn, rc);
 
         if ( !is_hardware_domain(d) )
             domain_crash(d);
@@ -279,7 +279,7 @@ int iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
     return rc;
 }
 
-int iommu_unmap_page(struct domain *d, unsigned long gfn)
+int iommu_unmap_page(struct domain *d, unsigned long bfn)
 {
     const struct domain_iommu *hd = dom_iommu(d);
     int rc;
@@ -287,13 +287,13 @@ int iommu_unmap_page(struct domain *d, unsigned long gfn)
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->unmap_page(d, gfn);
+    rc = hd->platform_ops->unmap_page(d, bfn);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
             printk(XENLOG_ERR
-                   "d%d: IOMMU unmapping gfn %#lx failed: %d\n",
-                   d->domain_id, gfn, rc);
+                   "d%d: IOMMU unmapping bfn %#lx failed: %d\n",
+                   d->domain_id, bfn, rc);
 
         if ( !is_hardware_domain(d) )
             domain_crash(d);
@@ -319,7 +319,7 @@ static void iommu_free_pagetables(unsigned long unused)
                             cpumask_cycle(smp_processor_id(), &cpu_online_map));
 }
 
-int iommu_iotlb_flush(struct domain *d, unsigned long gfn,
+int iommu_iotlb_flush(struct domain *d, unsigned long bfn,
                       unsigned int page_count)
 {
     const struct domain_iommu *hd = dom_iommu(d);
@@ -328,13 +328,13 @@ int iommu_iotlb_flush(struct domain *d, unsigned long gfn,
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush )
         return 0;
 
-    rc = hd->platform_ops->iotlb_flush(d, gfn, page_count);
+    rc = hd->platform_ops->iotlb_flush(d, bfn, page_count);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
             printk(XENLOG_ERR
-                   "d%d: IOMMU IOTLB flush failed: %d, gfn %#lx, page count %u\n",
-                   d->domain_id, rc, gfn, page_count);
+                   "d%d: IOMMU IOTLB flush failed: %d, bfn %#lx, page count %u\n",
+                   d->domain_id, rc, bfn, page_count);
 
         if ( !is_hardware_domain(d) )
             domain_crash(d);
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 1710256823..48f62e0e8d 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -585,7 +585,7 @@ static int __must_check iommu_flush_all(void)
 }
 
 static int __must_check iommu_flush_iotlb(struct domain *d,
-                                          unsigned long gfn,
+                                          unsigned long bfn,
                                           bool_t dma_old_pte_present,
                                           unsigned int page_count)
 {
@@ -612,12 +612,12 @@ static int __must_check iommu_flush_iotlb(struct domain *d,
         if ( iommu_domid == -1 )
             continue;
 
-        if ( page_count != 1 || gfn == gfn_x(INVALID_GFN) )
+        if ( page_count != 1 || bfn == bfn_x(INVALID_BFN) )
             rc = iommu_flush_iotlb_dsi(iommu, iommu_domid,
                                        0, flush_dev_iotlb);
         else
             rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
-                                       (paddr_t)gfn << PAGE_SHIFT_4K,
+                                       __bfn_to_baddr(bfn),
                                        PAGE_ORDER_4K,
                                        !dma_old_pte_present,
                                        flush_dev_iotlb);
@@ -633,15 +633,15 @@ static int __must_check iommu_flush_iotlb(struct domain *d,
 }
 
 static int __must_check iommu_flush_iotlb_pages(struct domain *d,
-                                                unsigned long gfn,
+                                                unsigned long bfn,
                                                 unsigned int page_count)
 {
-    return iommu_flush_iotlb(d, gfn, 1, page_count);
+    return iommu_flush_iotlb(d, bfn, 1, page_count);
 }
 
 static int __must_check iommu_flush_iotlb_all(struct domain *d)
 {
-    return iommu_flush_iotlb(d, gfn_x(INVALID_GFN), 0, 0);
+    return iommu_flush_iotlb(d, bfn_x(INVALID_BFN), 0, 0);
 }
 
 /* clear one page's page table */
@@ -1767,7 +1767,7 @@ static void iommu_domain_teardown(struct domain *d)
 }
 
 static int __must_check intel_iommu_map_page(struct domain *d,
-                                             unsigned long gfn,
+                                             unsigned long bfn,
                                              unsigned long mfn,
                                              unsigned int flags)
 {
@@ -1786,14 +1786,14 @@ static int __must_check intel_iommu_map_page(struct domain *d,
 
     spin_lock(&hd->arch.mapping_lock);
 
-    pg_maddr = addr_to_dma_page_maddr(d, (paddr_t)gfn << PAGE_SHIFT_4K, 1);
+    pg_maddr = addr_to_dma_page_maddr(d, __bfn_to_baddr(bfn), 1);
     if ( pg_maddr == 0 )
     {
         spin_unlock(&hd->arch.mapping_lock);
         return -ENOMEM;
     }
     page = (struct dma_pte *)map_vtd_domain_page(pg_maddr);
-    pte = page + (gfn & LEVEL_MASK);
+    pte = page + (bfn & LEVEL_MASK);
     old = *pte;
     dma_set_pte_addr(new, (paddr_t)mfn << PAGE_SHIFT_4K);
     dma_set_pte_prot(new,
@@ -1817,22 +1817,22 @@ static int __must_check intel_iommu_map_page(struct domain *d,
     unmap_vtd_domain_page(page);
 
     if ( !this_cpu(iommu_dont_flush_iotlb) )
-        rc = iommu_flush_iotlb(d, gfn, dma_pte_present(old), 1);
+        rc = iommu_flush_iotlb(d, bfn, dma_pte_present(old), 1);
 
     return rc;
 }
 
 static int __must_check intel_iommu_unmap_page(struct domain *d,
-                                               unsigned long gfn)
+                                               unsigned long bfn)
 {
     /* Do nothing if hardware domain and iommu supports pass thru. */
     if ( iommu_passthrough && is_hardware_domain(d) )
         return 0;
 
-    return dma_pte_clear_one(d, (paddr_t)gfn << PAGE_SHIFT_4K);
+    return dma_pte_clear_one(d, __bfn_to_baddr(bfn));
 }
 
-int iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
+int iommu_pte_flush(struct domain *d, uint64_t bfn, uint64_t *pte,
                     int order, int present)
 {
     struct acpi_drhd_unit *drhd;
@@ -1856,7 +1856,7 @@ int iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
             continue;
 
         rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
-                                   (paddr_t)gfn << PAGE_SHIFT_4K,
+                                   __bfn_to_baddr(bfn),
                                    order, !present, flush_dev_iotlb);
         if ( rc > 0 )
         {
@@ -2626,7 +2626,7 @@ static void vtd_dump_p2m_table_level(paddr_t pt_maddr, int level, paddr_t gpa,
             vtd_dump_p2m_table_level(dma_pte_addr(*pte), next_level, 
                                      address, indent + 1);
         else
-            printk("%*sgfn: %08lx mfn: %08lx\n",
+            printk("%*sbfn: %08lx mfn: %08lx\n",
                    indent, "",
                    (unsigned long)(address >> PAGE_SHIFT_4K),
                    (unsigned long)(dma_pte_addr(*pte) >> PAGE_SHIFT_4K));
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index e35d941f3c..40af7bd7c9 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -23,11 +23,37 @@
 #include <xen/page-defs.h>
 #include <xen/spinlock.h>
 #include <xen/pci.h>
+#include <xen/typesafe.h>
 #include <public/hvm/ioreq.h>
 #include <public/domctl.h>
 #include <asm/device.h>
 #include <asm/iommu.h>
 
+TYPE_SAFE(uint64_t, bfn);
+#define PRI_bfn     PRIx64
+#define INVALID_BFN _bfn(~0UL)
+
+#ifndef bfn_t
+#define bfn_t /* Grep fodder: bfn_t, _bfn() and bfn_x() are defined above */
+#define _bfn
+#define bfn_x
+#undef bfn_t
+#undef _bfn
+#undef bfn_x
+#endif
+
+#define IOMMU_PAGE_SHIFT 12
+#define IOMMU_PAGE_SIZE  (_AC(1,L) << IOMMU_PAGE_SHIFT)
+#define IOMMU_PAGE_MASK  (~(IOMMU_PAGE_SIZE - 1))
+
+typedef uint64_t baddr_t;
+
+#define __bfn_to_baddr(bfn) ((baddr_t)(bfn) << IOMMU_PAGE_SHIFT)
+#define __baddr_to_bfn(baddr) ((uint64_t)(baddr >> IOMMU_PAGE_SHIFT))
+
+#define bfn_to_baddr(bfn) __bfn_to_baddr(bfn_x(bfn))
+#define baddr_to_bfn(baddr) _bfn(__baddr_to_bfn(baddr))
+
 extern bool_t iommu_enable, iommu_enabled;
 extern bool_t force_iommu, iommu_dom0_strict, iommu_verbose;
 extern bool_t iommu_workaround_bios_bug, iommu_igfx, iommu_passthrough;
@@ -60,9 +86,9 @@ void iommu_teardown(struct domain *d);
 #define IOMMUF_readable  (1u<<_IOMMUF_readable)
 #define _IOMMUF_writable 1
 #define IOMMUF_writable  (1u<<_IOMMUF_writable)
-int __must_check iommu_map_page(struct domain *d, unsigned long gfn,
+int __must_check iommu_map_page(struct domain *d, unsigned long bfn,
                                 unsigned long mfn, unsigned int flags);
-int __must_check iommu_unmap_page(struct domain *d, unsigned long gfn);
+int __must_check iommu_unmap_page(struct domain *d, unsigned long bfn);
 
 enum iommu_feature
 {
@@ -150,9 +176,9 @@ struct iommu_ops {
 #endif /* HAS_PCI */
 
     void (*teardown)(struct domain *d);
-    int __must_check (*map_page)(struct domain *d, unsigned long gfn,
+    int __must_check (*map_page)(struct domain *d, unsigned long bfn,
                                  unsigned long mfn, unsigned int flags);
-    int __must_check (*unmap_page)(struct domain *d, unsigned long gfn);
+    int __must_check (*unmap_page)(struct domain *d, unsigned long bfn);
     void (*free_page_table)(struct page_info *);
 #ifdef CONFIG_X86
     void (*update_ire_from_apic)(unsigned int apic, unsigned int reg, unsigned int value);
@@ -163,7 +189,7 @@ struct iommu_ops {
     void (*resume)(void);
     void (*share_p2m)(struct domain *d);
     void (*crash_shutdown)(void);
-    int __must_check (*iotlb_flush)(struct domain *d, unsigned long gfn,
+    int __must_check (*iotlb_flush)(struct domain *d, unsigned long bfn,
                                     unsigned int page_count);
     int __must_check (*iotlb_flush_all)(struct domain *d);
     int (*get_reserved_device_memory)(iommu_grdm_t *, void *);
@@ -185,7 +211,7 @@ int iommu_do_pci_domctl(struct xen_domctl *, struct domain *d,
 int iommu_do_domctl(struct xen_domctl *, struct domain *d,
                     XEN_GUEST_HANDLE_PARAM(xen_domctl_t));
 
-int __must_check iommu_iotlb_flush(struct domain *d, unsigned long gfn,
+int __must_check iommu_iotlb_flush(struct domain *d, unsigned long bfn,
                                    unsigned int page_count);
 int __must_check iommu_iotlb_flush_all(struct domain *d);
 
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index 24654e8e22..e0b1ff41a6 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -26,6 +26,11 @@
  *   A linear idea of a guest physical address space. For an auto-translated
  *   guest, pfn == gfn while for a non-translated guest, pfn != gfn.
  *
+ * bfn: Bus Frame Number (definitions in include/xen/iommu.h)
+ *   The linear frame numbers of IOMMU address space. All initiators for (i.e.
+ *   all devices assigned to) a guest share a single IOMMU address space and,
+ *   by default, Xen will ensure bfn == pfn.
+ *
  * WARNING: Some of these terms have changed over time while others have been
  * used inconsistently, meaning that a lot of existing code does not match the
  * definitions above.  New code should use these terms as described here, and
-- 
2.11.0



* [PATCH v6 02/14] iommu: make use of type-safe BFN and MFN in exported functions
From: Paul Durrant @ 2018-08-23  9:46 UTC
  To: xen-devel
  Cc: Stefano Stabellini, Jun Nakajima, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Paul Durrant, Jan Beulich

This patch modifies the declarations of the entry points to the IOMMU
sub-system to use bfn_t and mfn_t in place of unsigned long. A subsequent
patch will similarly modify the methods in the iommu_ops structure.
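
The resulting call-site convention is that a raw frame number is wrapped
exactly once, at the point where it enters IOMMU address space, and any
arithmetic is then done on the typed value. A condensed, hypothetical
helper (not code from the patch) showing the idiom, using the bfn_add()
helper this patch introduces:

    /* Illustrative only: map a 2^order run of frames for a domain where
     * bfn == gfn (the default 1:1 IOMMU layout). */
    static int map_contiguous(struct domain *d, unsigned long gfn_l,
                              mfn_t mfn, unsigned int order)
    {
        bfn_t bfn = _bfn(gfn_l);   /* wrap once, on entry to IOMMU space */
        unsigned long i;
        int rc = 0;

        for ( i = 0; i < (1UL << order); i++ )
        {
            rc = iommu_map_page(d, bfn_add(bfn, i), mfn_add(mfn, i),
                                IOMMUF_readable | IOMMUF_writable);
            if ( rc )
                break;
        }

        return rc;
    }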

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Tim Deegan <tim@xen.org>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>

v6:
 - Re-base.

v3:
 - Removed most of the uses of an intermediate 'frame' variable.

v2:
 - Addressed comments from Jan.
 - Use intermediate 'frame' variable to avoid directly encapsulating
   mfn or gfn values as bfns.
---
 xen/arch/arm/p2m.c                    |  3 ++-
 xen/arch/x86/mm.c                     | 10 ++++----
 xen/arch/x86/mm/p2m-ept.c             | 10 +++++---
 xen/arch/x86/mm/p2m-pt.c              | 45 ++++++++++++++++++++---------------
 xen/arch/x86/mm/p2m.c                 | 16 ++++++++-----
 xen/arch/x86/x86_64/mm.c              |  5 ++--
 xen/common/grant_table.c              | 12 +++++-----
 xen/common/memory.c                   |  4 ++--
 xen/drivers/passthrough/iommu.c       | 25 ++++++++++---------
 xen/drivers/passthrough/vtd/x86/vtd.c |  3 ++-
 xen/include/xen/iommu.h               | 14 +++++++----
 11 files changed, 85 insertions(+), 62 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 56b5474625..072029dfbe 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -957,7 +957,8 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
 
     if ( need_iommu(p2m->domain) &&
          (lpae_is_valid(orig_pte) || lpae_is_valid(*entry)) )
-        rc = iommu_iotlb_flush(p2m->domain, gfn_x(sgfn), 1UL << page_order);
+        rc = iommu_iotlb_flush(p2m->domain, _bfn(gfn_x(sgfn)),
+                               1UL << page_order);
     else
         rc = 0;
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 8ac4412554..9e9fb9421e 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2752,14 +2752,14 @@ static int _get_page_type(struct page_info *page, unsigned long type,
         struct domain *d = page_get_owner(page);
         if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
         {
-            gfn_t gfn = _gfn(mfn_to_gmfn(d, mfn_x(page_to_mfn(page))));
+            mfn_t mfn = page_to_mfn(page);
 
             if ( (x & PGT_type_mask) == PGT_writable_page )
-                iommu_ret = iommu_unmap_page(d, gfn_x(gfn));
+                iommu_ret = iommu_unmap_page(d, _bfn(mfn_x(mfn)));
             else if ( type == PGT_writable_page )
-                iommu_ret = iommu_map_page(d, gfn_x(gfn),
-                                           mfn_x(page_to_mfn(page)),
-                                           IOMMUF_readable|IOMMUF_writable);
+                iommu_ret = iommu_map_page(d, _bfn(mfn_x(mfn)), mfn,
+                                           IOMMUF_readable |
+                                           IOMMUF_writable);
         }
     }
 
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 14b593923b..2089b5232d 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -870,15 +870,19 @@ out:
             rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
         else
         {
+            bfn_t bfn = _bfn(gfn);
+
             if ( iommu_flags )
                 for ( i = 0; i < (1 << order); i++ )
                 {
-                    rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
+                    rc = iommu_map_page(d, bfn_add(bfn, i),
+                                        mfn_add(mfn, i), iommu_flags);
                     if ( unlikely(rc) )
                     {
                         while ( i-- )
                             /* If statement to satisfy __must_check. */
-                            if ( iommu_unmap_page(p2m->domain, gfn + i) )
+                            if ( iommu_unmap_page(p2m->domain,
+                                                  bfn_add(bfn, i)) )
                                 continue;
 
                         break;
@@ -887,7 +891,7 @@ out:
             else
                 for ( i = 0; i < (1 << order); i++ )
                 {
-                    ret = iommu_unmap_page(d, gfn + i);
+                    ret = iommu_unmap_page(d, bfn_add(bfn, i));
                     if ( !rc )
                         rc = ret;
                 }
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index b8c5d2ed26..a441af388a 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -687,29 +687,36 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
             if ( iommu_old_flags )
                 amd_iommu_flush_pages(p2m->domain, gfn, page_order);
         }
-        else if ( iommu_pte_flags )
-            for ( i = 0; i < (1UL << page_order); i++ )
-            {
-                rc = iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i,
-                                    iommu_pte_flags);
-                if ( unlikely(rc) )
+        else
+        {
+            bfn_t bfn = _bfn(gfn);
+
+            if ( iommu_pte_flags )
+                for ( i = 0; i < (1UL << page_order); i++ )
                 {
-                    while ( i-- )
-                        /* If statement to satisfy __must_check. */
-                        if ( iommu_unmap_page(p2m->domain, gfn + i) )
-                            continue;
+                    rc = iommu_map_page(p2m->domain, bfn_add(bfn, i),
+                                        mfn_add(mfn, i), iommu_pte_flags);
+                    if ( unlikely(rc) )
+                    {
+                        while ( i-- )
+                            /* If statement to satisfy __must_check. */
+                            if ( iommu_unmap_page(p2m->domain,
+                                                  bfn_add(bfn, i)) )
+                                continue;
 
-                    break;
+                        break;
+                    }
                 }
-            }
-        else
-            for ( i = 0; i < (1UL << page_order); i++ )
-            {
-                int ret = iommu_unmap_page(p2m->domain, gfn + i);
+            else
+                for ( i = 0; i < (1UL << page_order); i++ )
+                {
+                    int ret = iommu_unmap_page(p2m->domain,
+                                               bfn_add(bfn, i));
 
-                if ( !rc )
-                    rc = ret;
-            }
+                    if ( !rc )
+                        rc = ret;
+                }
+        }
     }
 
     /*
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 8e9fbb5a14..fbf67def50 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -715,9 +715,11 @@ p2m_remove_page(struct p2m_domain *p2m, unsigned long gfn_l, unsigned long mfn,
 
         if ( need_iommu(p2m->domain) )
         {
+            bfn_t bfn = _bfn(mfn);
+
             for ( i = 0; i < (1 << page_order); i++ )
             {
-                int ret = iommu_unmap_page(p2m->domain, mfn + i);
+                int ret = iommu_unmap_page(p2m->domain, bfn_add(bfn, i));
 
                 if ( !rc )
                     rc = ret;
@@ -774,16 +776,17 @@ guest_physmap_add_entry(struct domain *d, gfn_t gfn, mfn_t mfn,
     {
         if ( need_iommu(d) && t == p2m_ram_rw )
         {
+            bfn_t bfn = _bfn(mfn_x(mfn));
+
             for ( i = 0; i < (1 << page_order); i++ )
             {
-                rc = iommu_map_page(d, mfn_x(mfn_add(mfn, i)),
-                                    mfn_x(mfn_add(mfn, i)),
+                rc = iommu_map_page(d, bfn_add(bfn, i), mfn_add(mfn, i),
                                     IOMMUF_readable|IOMMUF_writable);
                 if ( rc != 0 )
                 {
                     while ( i-- > 0 )
                         /* If statement to satisfy __must_check. */
-                        if ( iommu_unmap_page(d, mfn_x(mfn_add(mfn, i))) )
+                        if ( iommu_unmap_page(d, bfn_add(bfn, i)) )
                             continue;
 
                     return rc;
@@ -1158,7 +1161,8 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l,
     {
         if ( !need_iommu(d) )
             return 0;
-        return iommu_map_page(d, gfn_l, gfn_l, IOMMUF_readable|IOMMUF_writable);
+        return iommu_map_page(d, _bfn(gfn_l), _mfn(gfn_l),
+                              IOMMUF_readable | IOMMUF_writable);
     }
 
     gfn_lock(p2m, gfn, 0);
@@ -1248,7 +1252,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l)
     {
         if ( !need_iommu(d) )
             return 0;
-        return iommu_unmap_page(d, gfn_l);
+        return iommu_unmap_page(d, _bfn(gfn_l));
     }
 
     gfn_lock(p2m, gfn, 0);
diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index cca4ae926e..cc58e4cef4 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -1429,13 +1429,14 @@ int memory_add(unsigned long spfn, unsigned long epfn, unsigned int pxm)
     if ( iommu_enabled && !iommu_passthrough && !need_iommu(hardware_domain) )
     {
         for ( i = spfn; i < epfn; i++ )
-            if ( iommu_map_page(hardware_domain, i, i, IOMMUF_readable|IOMMUF_writable) )
+            if ( iommu_map_page(hardware_domain, _bfn(i), _mfn(i),
+                                IOMMUF_readable | IOMMUF_writable) )
                 break;
         if ( i != epfn )
         {
             while (i-- > old_max)
                 /* If statement to satisfy __must_check. */
-                if ( iommu_unmap_page(hardware_domain, i) )
+                if ( iommu_unmap_page(hardware_domain, _bfn(i)) )
                     continue;
 
             goto destroy_m2p;
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index ad55cfa0ec..af41133953 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -1143,13 +1143,13 @@ map_grant_ref(
              !(old_pin & (GNTPIN_hstw_mask|GNTPIN_devw_mask)) )
         {
             if ( !(kind & MAPKIND_WRITE) )
-                err = iommu_map_page(ld, mfn_x(mfn), mfn_x(mfn),
-                                     IOMMUF_readable|IOMMUF_writable);
+                err = iommu_map_page(ld, _bfn(mfn_x(mfn)), mfn,
+                                     IOMMUF_readable | IOMMUF_writable);
         }
         else if ( act_pin && !old_pin )
         {
             if ( !kind )
-                err = iommu_map_page(ld, mfn_x(mfn), mfn_x(mfn),
+                err = iommu_map_page(ld, _bfn(mfn_x(mfn)), mfn,
                                      IOMMUF_readable);
         }
         if ( err )
@@ -1398,10 +1398,10 @@ unmap_common(
 
         kind = mapkind(lgt, rd, op->mfn);
         if ( !kind )
-            err = iommu_unmap_page(ld, mfn_x(op->mfn));
+            err = iommu_unmap_page(ld, _bfn(mfn_x(op->mfn)));
         else if ( !(kind & MAPKIND_WRITE) )
-            err = iommu_map_page(ld, mfn_x(op->mfn),
-                                 mfn_x(op->mfn), IOMMUF_readable);
+            err = iommu_map_page(ld, _bfn(mfn_x(op->mfn)), op->mfn,
+                                 IOMMUF_readable);
 
         double_gt_unlock(lgt, rgt);
 
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 996f94b103..8ba8921c79 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -834,11 +834,11 @@ int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap *xatp,
 
         this_cpu(iommu_dont_flush_iotlb) = 0;
 
-        ret = iommu_iotlb_flush(d, xatp->idx - done, done);
+        ret = iommu_iotlb_flush(d, _bfn(xatp->idx - done), done);
         if ( unlikely(ret) && rc >= 0 )
             rc = ret;
 
-        ret = iommu_iotlb_flush(d, xatp->gpfn - done, done);
+        ret = iommu_iotlb_flush(d, _bfn(xatp->gpfn - done), done);
         if ( unlikely(ret) && rc >= 0 )
             rc = ret;
     }
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index f88dad0177..42b53b15e9 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -255,7 +255,7 @@ void iommu_domain_destroy(struct domain *d)
     arch_iommu_domain_destroy(d);
 }
 
-int iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
+int iommu_map_page(struct domain *d, bfn_t bfn, mfn_t mfn,
                    unsigned int flags)
 {
     const struct domain_iommu *hd = dom_iommu(d);
@@ -264,13 +264,13 @@ int iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->map_page(d, bfn, mfn, flags);
+    rc = hd->platform_ops->map_page(d, bfn_x(bfn), mfn_x(mfn), flags);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
             printk(XENLOG_ERR
-                   "d%d: IOMMU mapping bfn %#lx to mfn %#lx failed: %d\n",
-                   d->domain_id, bfn, mfn, rc);
+                   "d%d: IOMMU mapping bfn %"PRI_bfn" to mfn %"PRI_mfn" failed: %d\n",
+                   d->domain_id, bfn_x(bfn), mfn_x(mfn), rc);
 
         if ( !is_hardware_domain(d) )
             domain_crash(d);
@@ -279,7 +279,7 @@ int iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
     return rc;
 }
 
-int iommu_unmap_page(struct domain *d, unsigned long bfn)
+int iommu_unmap_page(struct domain *d, bfn_t bfn)
 {
     const struct domain_iommu *hd = dom_iommu(d);
     int rc;
@@ -287,13 +287,13 @@ int iommu_unmap_page(struct domain *d, unsigned long bfn)
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->unmap_page(d, bfn);
+    rc = hd->platform_ops->unmap_page(d, bfn_x(bfn));
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
             printk(XENLOG_ERR
-                   "d%d: IOMMU unmapping bfn %#lx failed: %d\n",
-                   d->domain_id, bfn, rc);
+                   "d%d: IOMMU unmapping bfn %"PRI_bfn" failed: %d\n",
+                   d->domain_id, bfn_x(bfn), rc);
 
         if ( !is_hardware_domain(d) )
             domain_crash(d);
@@ -319,8 +319,7 @@ static void iommu_free_pagetables(unsigned long unused)
                             cpumask_cycle(smp_processor_id(), &cpu_online_map));
 }
 
-int iommu_iotlb_flush(struct domain *d, unsigned long bfn,
-                      unsigned int page_count)
+int iommu_iotlb_flush(struct domain *d, bfn_t bfn, unsigned int page_count)
 {
     const struct domain_iommu *hd = dom_iommu(d);
     int rc;
@@ -328,13 +327,13 @@ int iommu_iotlb_flush(struct domain *d, unsigned long bfn,
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush )
         return 0;
 
-    rc = hd->platform_ops->iotlb_flush(d, bfn, page_count);
+    rc = hd->platform_ops->iotlb_flush(d, bfn_x(bfn), page_count);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
             printk(XENLOG_ERR
-                   "d%d: IOMMU IOTLB flush failed: %d, bfn %#lx, page count %u\n",
-                   d->domain_id, rc, bfn, page_count);
+                   "d%d: IOMMU IOTLB flush failed: %d, bfn %"PRI_bfn", page count %u\n",
+                   d->domain_id, rc, bfn_x(bfn), page_count);
 
         if ( !is_hardware_domain(d) )
             domain_crash(d);
diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c b/xen/drivers/passthrough/vtd/x86/vtd.c
index 00a9891005..fb674cdc68 100644
--- a/xen/drivers/passthrough/vtd/x86/vtd.c
+++ b/xen/drivers/passthrough/vtd/x86/vtd.c
@@ -152,7 +152,8 @@ void __hwdom_init vtd_set_hwdom_mapping(struct domain *d)
              page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) )
             continue;
 
-        rc = iommu_map_page(d, pfn, pfn, IOMMUF_readable|IOMMUF_writable);
+        rc = iommu_map_page(d, _bfn(pfn), _mfn(pfn),
+			    IOMMUF_readable | IOMMUF_writable);
         if ( rc )
             printk(XENLOG_WARNING VTDPREFIX " d%d: IOMMU mapping failed: %d\n",
                    d->domain_id, rc);
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 40af7bd7c9..3d4f3e7d26 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -24,6 +24,7 @@
 #include <xen/spinlock.h>
 #include <xen/pci.h>
 #include <xen/typesafe.h>
+#include <xen/mm.h>
 #include <public/hvm/ioreq.h>
 #include <public/domctl.h>
 #include <asm/device.h>
@@ -42,6 +43,11 @@ TYPE_SAFE(uint64_t, bfn);
 #undef bfn_x
 #endif
 
+static inline bfn_t bfn_add(bfn_t bfn, unsigned long i)
+{
+    return _bfn(bfn_x(bfn) + i);
+}
+
 #define IOMMU_PAGE_SHIFT 12
 #define IOMMU_PAGE_SIZE  (_AC(1,L) << IOMMU_PAGE_SHIFT)
 #define IOMMU_PAGE_MASK  (~(IOMMU_PAGE_SIZE - 1))
@@ -86,9 +92,9 @@ void iommu_teardown(struct domain *d);
 #define IOMMUF_readable  (1u<<_IOMMUF_readable)
 #define _IOMMUF_writable 1
 #define IOMMUF_writable  (1u<<_IOMMUF_writable)
-int __must_check iommu_map_page(struct domain *d, unsigned long bfn,
-                                unsigned long mfn, unsigned int flags);
-int __must_check iommu_unmap_page(struct domain *d, unsigned long bfn);
+int __must_check iommu_map_page(struct domain *d, bfn_t bfn,
+                                mfn_t mfn, unsigned int flags);
+int __must_check iommu_unmap_page(struct domain *d, bfn_t bfn);
 
 enum iommu_feature
 {
@@ -211,7 +217,7 @@ int iommu_do_pci_domctl(struct xen_domctl *, struct domain *d,
 int iommu_do_domctl(struct xen_domctl *, struct domain *d,
                     XEN_GUEST_HANDLE_PARAM(xen_domctl_t));
 
-int __must_check iommu_iotlb_flush(struct domain *d, unsigned long bfn,
+int __must_check iommu_iotlb_flush(struct domain *d, bfn_t bfn,
                                    unsigned int page_count);
 int __must_check iommu_iotlb_flush_all(struct domain *d);
 
-- 
2.11.0



* [PATCH v6 03/14] iommu: push use of type-safe BFN and MFN into iommu_ops
  2018-08-23  9:46 [PATCH v6 00/14] paravirtual IOMMU interface Paul Durrant
  2018-08-23  9:46 ` [PATCH v6 01/14] iommu: introduce the concept of BFN Paul Durrant
  2018-08-23  9:46 ` [PATCH v6 02/14] iommu: make use of type-safe BFN and MFN in exported functions Paul Durrant
@ 2018-08-23  9:47 ` Paul Durrant
  2018-09-04 10:32   ` Jan Beulich
  2018-08-23  9:47 ` [PATCH v6 04/14] iommu: don't domain_crash() inside iommu_map/unmap_page() Paul Durrant
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-08-23  9:47 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Paul Durrant, Jan Beulich, Suravee Suthikulpanit,
	George Dunlap

This patch modifies the methods in struct iommu_ops to use type-safe BFN
and MFN. This follows on from the prior patch that modified the functions
exported in xen/iommu.h.
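
For illustration, the typesafe wrapper amounts to roughly the following
sketch (the real definitions come from the TYPE_SAFE() macro in
xen/typesafe.h, which can also reduce to a plain integer typedef
depending on build configuration, so the struct expansion shown here is
an approximation):

    #include <stdint.h>

    /* Wrapping the raw frame number in a struct makes accidental
     * bfn/mfn mix-ups a compile-time error rather than a silent bug. */
    typedef struct { uint64_t bfn; } bfn_t;

    static inline bfn_t _bfn(uint64_t n) { return (bfn_t){ n }; }
    static inline uint64_t bfn_x(bfn_t b) { return b.bfn; }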

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@citrix.com>

v6:
 - Re-base.

v3:
 - Remove some use of intermediate 'frame' variables.

v2:
 - Addressed comments from Jan.
 - Extend use of intermediate 'frame' variable to avoid directly
   encapsulating gfn values as bfns or vice versa.
---
 xen/drivers/passthrough/amd/iommu_map.c       | 46 ++++++++++++++++-----------
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |  2 +-
 xen/drivers/passthrough/arm/smmu.c            | 16 +++++-----
 xen/drivers/passthrough/iommu.c               |  9 +++---
 xen/drivers/passthrough/vtd/iommu.c           | 26 +++++++--------
 xen/drivers/passthrough/x86/iommu.c           |  2 +-
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |  8 ++---
 xen/include/xen/iommu.h                       | 13 +++++---
 8 files changed, 67 insertions(+), 55 deletions(-)

diff --git a/xen/drivers/passthrough/amd/iommu_map.c b/xen/drivers/passthrough/amd/iommu_map.c
index 4deab9cd2f..5a9a0af320 100644
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -631,7 +631,7 @@ static int update_paging_mode(struct domain *d, unsigned long bfn)
     return 0;
 }
 
-int amd_iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
+int amd_iommu_map_page(struct domain *d, bfn_t bfn, mfn_t mfn,
                        unsigned int flags)
 {
     bool_t need_flush = 0;
@@ -651,7 +651,8 @@ int amd_iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
     if ( rc )
     {
         spin_unlock(&hd->arch.mapping_lock);
-        AMD_IOMMU_DEBUG("Root table alloc failed, bfn = %lx\n", bfn);
+        AMD_IOMMU_DEBUG("Root table alloc failed, bfn = %"PRI_bfn"\n",
+                        bfn_x(bfn));
         domain_crash(d);
         return rc;
     }
@@ -660,25 +661,27 @@ int amd_iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
      * we might need a deeper page table for wider bfn now */
     if ( is_hvm_domain(d) )
     {
-        if ( update_paging_mode(d, bfn) )
+        if ( update_paging_mode(d, bfn_x(bfn)) )
         {
             spin_unlock(&hd->arch.mapping_lock);
-            AMD_IOMMU_DEBUG("Update page mode failed bfn = %lx\n", bfn);
+            AMD_IOMMU_DEBUG("Update page mode failed bfn = %"PRI_bfn"\n",
+                            bfn_x(bfn));
             domain_crash(d);
             return -EFAULT;
         }
     }
 
-    if ( iommu_pde_from_bfn(d, bfn, pt_mfn) || (pt_mfn[1] == 0) )
+    if ( iommu_pde_from_bfn(d, bfn_x(bfn), pt_mfn) || (pt_mfn[1] == 0) )
     {
         spin_unlock(&hd->arch.mapping_lock);
-        AMD_IOMMU_DEBUG("Invalid IO pagetable entry bfn = %lx\n", bfn);
+        AMD_IOMMU_DEBUG("Invalid IO pagetable entry bfn = %"PRI_bfn"\n",
+                        bfn_x(bfn));
         domain_crash(d);
         return -EFAULT;
     }
 
     /* Install 4k mapping first */
-    need_flush = set_iommu_pte_present(pt_mfn[1], bfn, mfn,
+    need_flush = set_iommu_pte_present(pt_mfn[1], bfn_x(bfn), mfn_x(mfn),
                                        IOMMU_PAGING_MODE_LEVEL_1,
                                        !!(flags & IOMMUF_writable),
                                        !!(flags & IOMMUF_readable));
@@ -690,7 +693,7 @@ int amd_iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
     /* 4K mapping for PV guests never changes, 
      * no need to flush if we trust non-present bits */
     if ( is_hvm_domain(d) )
-        amd_iommu_flush_pages(d, bfn, 0);
+        amd_iommu_flush_pages(d, bfn_x(bfn), 0);
 
     for ( merge_level = IOMMU_PAGING_MODE_LEVEL_2;
           merge_level <= hd->arch.paging_mode; merge_level++ )
@@ -698,15 +701,16 @@ int amd_iommu_map_page(struct domain *d, unsigned long bfn, unsigned long mfn,
         if ( pt_mfn[merge_level] == 0 )
             break;
         if ( !iommu_update_pde_count(d, pt_mfn[merge_level],
-                                     bfn, mfn, merge_level) )
+                                     bfn_x(bfn), mfn_x(mfn), merge_level) )
             break;
 
-        if ( iommu_merge_pages(d, pt_mfn[merge_level], bfn,
+        if ( iommu_merge_pages(d, pt_mfn[merge_level], bfn_x(bfn),
                                flags, merge_level) )
         {
             spin_unlock(&hd->arch.mapping_lock);
             AMD_IOMMU_DEBUG("Merge iommu page failed at level %d, "
-                            "bfn = %lx mfn = %lx\n", merge_level, bfn, mfn);
+                            "bfn = %"PRI_bfn" mfn = %"PRI_mfn"\n",
+                            merge_level, bfn_x(bfn), mfn_x(mfn));
             domain_crash(d);
             return -EFAULT;
         }
@@ -720,7 +724,7 @@ out:
     return 0;
 }
 
-int amd_iommu_unmap_page(struct domain *d, unsigned long bfn)
+int amd_iommu_unmap_page(struct domain *d, bfn_t bfn)
 {
     unsigned long pt_mfn[7];
     struct domain_iommu *hd = dom_iommu(d);
@@ -742,31 +746,33 @@ int amd_iommu_unmap_page(struct domain *d, unsigned long bfn)
      * we might need a deeper page table for larger bfn now */
     if ( is_hvm_domain(d) )
     {
-        int rc = update_paging_mode(d, bfn);
+        int rc = update_paging_mode(d, bfn_x(bfn));
 
         if ( rc )
         {
             spin_unlock(&hd->arch.mapping_lock);
-            AMD_IOMMU_DEBUG("Update page mode failed bfn = %lx\n", bfn);
+            AMD_IOMMU_DEBUG("Update page mode failed bfn = %"PRI_bfn"\n",
+                            bfn_x(bfn));
             if ( rc != -EADDRNOTAVAIL )
                 domain_crash(d);
             return rc;
         }
     }
 
-    if ( iommu_pde_from_bfn(d, bfn, pt_mfn) || (pt_mfn[1] == 0) )
+    if ( iommu_pde_from_bfn(d, bfn_x(bfn), pt_mfn) || (pt_mfn[1] == 0) )
     {
         spin_unlock(&hd->arch.mapping_lock);
-        AMD_IOMMU_DEBUG("Invalid IO pagetable entry bfn = %lx\n", bfn);
+        AMD_IOMMU_DEBUG("Invalid IO pagetable entry bfn = %"PRI_bfn"\n",
+                        bfn_x(bfn));
         domain_crash(d);
         return -EFAULT;
     }
 
     /* mark PTE as 'page not present' */
-    clear_iommu_pte_present(pt_mfn[1], bfn);
+    clear_iommu_pte_present(pt_mfn[1], bfn_x(bfn));
     spin_unlock(&hd->arch.mapping_lock);
 
-    amd_iommu_flush_pages(d, bfn, 0);
+    amd_iommu_flush_pages(d, bfn_x(bfn), 0);
 
     return 0;
 }
@@ -787,7 +793,9 @@ int amd_iommu_reserve_domain_unity_map(struct domain *domain,
     gfn = phys_addr >> PAGE_SHIFT;
     for ( i = 0; i < npages; i++ )
     {
-        rt = amd_iommu_map_page(domain, gfn +i, gfn +i, flags);
+        unsigned long frame = gfn + i;
+
+        rt = amd_iommu_map_page(domain, _bfn(frame), _mfn(frame), flags);
         if ( rt != 0 )
             return rt;
     }
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index d608631e6e..eea22c3d0d 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -271,7 +271,7 @@ static void __hwdom_init amd_iommu_hwdom_init(struct domain *d)
              */
             if ( mfn_valid(_mfn(pfn)) )
             {
-                int ret = amd_iommu_map_page(d, pfn, pfn,
+                int ret = amd_iommu_map_page(d, _bfn(pfn), _mfn(pfn),
                                              IOMMUF_readable|IOMMUF_writable);
 
                 if ( !rc )
diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
index 1e4d561b47..221b62a59c 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2550,8 +2550,7 @@ static int __must_check arm_smmu_iotlb_flush_all(struct domain *d)
 	return 0;
 }
 
-static int __must_check arm_smmu_iotlb_flush(struct domain *d,
-                                             unsigned long bfn,
+static int __must_check arm_smmu_iotlb_flush(struct domain *d, bfn_t bfn,
                                              unsigned int page_count)
 {
 	/* ARM SMMU v1 doesn't have flush by VMA and VMID */
@@ -2737,8 +2736,8 @@ static void arm_smmu_iommu_domain_teardown(struct domain *d)
 	xfree(xen_domain);
 }
 
-static int __must_check arm_smmu_map_page(struct domain *d, unsigned long bfn,
-			unsigned long mfn, unsigned int flags)
+static int __must_check arm_smmu_map_page(struct domain *d, bfn_t bfn,
+					  mfn_t mfn, unsigned int flags)
 {
 	p2m_type_t t;
 
@@ -2751,7 +2750,7 @@ static int __must_check arm_smmu_map_page(struct domain *d, unsigned long bfn,
 	 * function should only be used by gnttab code with gfn == mfn == bfn.
 	 */
 	BUG_ON(!is_domain_direct_mapped(d));
-	BUG_ON(mfn != bfn);
+	BUG_ON(mfn_x(mfn) != bfn_x(bfn));
 
 	/* We only support readable and writable flags */
 	if (!(flags & (IOMMUF_readable | IOMMUF_writable)))
@@ -2763,10 +2762,11 @@ static int __must_check arm_smmu_map_page(struct domain *d, unsigned long bfn,
 	 * The function guest_physmap_add_entry replaces the current mapping
 	 * if there is already one...
 	 */
-	return guest_physmap_add_entry(d, _gfn(bfn), _mfn(bfn), 0, t);
+	return guest_physmap_add_entry(d, _gfn(bfn_x(bfn)), _mfn(bfn_x(bfn)),
+				       0, t);
 }
 
-static int __must_check arm_smmu_unmap_page(struct domain *d, unsigned long bfn)
+static int __must_check arm_smmu_unmap_page(struct domain *d, bfn_t bfn)
 {
 	/*
 	 * This function should only be used by gnttab code when the domain
@@ -2775,7 +2775,7 @@ static int __must_check arm_smmu_unmap_page(struct domain *d, unsigned long bfn)
 	if ( !is_domain_direct_mapped(d) )
 		return -EINVAL;
 
-	return guest_physmap_remove_page(d, _gfn(bfn), _mfn(bfn), 0);
+	return guest_physmap_remove_page(d, _gfn(bfn_x(bfn)), _mfn(bfn_x(bfn)), 0);
 }
 
 static const struct iommu_ops arm_smmu_iommu_ops = {
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 42b53b15e9..d9ec667945 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -194,7 +194,8 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
                   == PGT_writable_page) )
                 mapping |= IOMMUF_writable;
 
-            ret = hd->platform_ops->map_page(d, bfn, mfn, mapping);
+            ret = hd->platform_ops->map_page(d, _bfn(bfn), _mfn(mfn),
+                                             mapping);
             if ( !rc )
                 rc = ret;
 
@@ -264,7 +265,7 @@ int iommu_map_page(struct domain *d, bfn_t bfn, mfn_t mfn,
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->map_page(d, bfn_x(bfn), mfn_x(mfn), flags);
+    rc = hd->platform_ops->map_page(d, bfn, mfn, flags);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
@@ -287,7 +288,7 @@ int iommu_unmap_page(struct domain *d, bfn_t bfn)
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    rc = hd->platform_ops->unmap_page(d, bfn_x(bfn));
+    rc = hd->platform_ops->unmap_page(d, bfn);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
@@ -327,7 +328,7 @@ int iommu_iotlb_flush(struct domain *d, bfn_t bfn, unsigned int page_count)
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush )
         return 0;
 
-    rc = hd->platform_ops->iotlb_flush(d, bfn_x(bfn), page_count);
+    rc = hd->platform_ops->iotlb_flush(d, bfn, page_count);
     if ( unlikely(rc) )
     {
         if ( !d->is_shutting_down && printk_ratelimit() )
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 48f62e0e8d..c9f50f04ad 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -584,8 +584,7 @@ static int __must_check iommu_flush_all(void)
     return rc;
 }
 
-static int __must_check iommu_flush_iotlb(struct domain *d,
-                                          unsigned long bfn,
+static int __must_check iommu_flush_iotlb(struct domain *d, bfn_t bfn,
                                           bool_t dma_old_pte_present,
                                           unsigned int page_count)
 {
@@ -612,12 +611,12 @@ static int __must_check iommu_flush_iotlb(struct domain *d,
         if ( iommu_domid == -1 )
             continue;
 
-        if ( page_count != 1 || bfn == bfn_x(INVALID_BFN) )
+        if ( page_count != 1 || bfn_eq(bfn, INVALID_BFN) )
             rc = iommu_flush_iotlb_dsi(iommu, iommu_domid,
                                        0, flush_dev_iotlb);
         else
             rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
-                                       __bfn_to_baddr(bfn),
+                                       bfn_to_baddr(bfn),
                                        PAGE_ORDER_4K,
                                        !dma_old_pte_present,
                                        flush_dev_iotlb);
@@ -633,7 +632,7 @@ static int __must_check iommu_flush_iotlb(struct domain *d,
 }
 
 static int __must_check iommu_flush_iotlb_pages(struct domain *d,
-                                                unsigned long bfn,
+                                                bfn_t bfn,
                                                 unsigned int page_count)
 {
     return iommu_flush_iotlb(d, bfn, 1, page_count);
@@ -641,7 +640,7 @@ static int __must_check iommu_flush_iotlb_pages(struct domain *d,
 
 static int __must_check iommu_flush_iotlb_all(struct domain *d)
 {
-    return iommu_flush_iotlb(d, bfn_x(INVALID_BFN), 0, 0);
+    return iommu_flush_iotlb(d, INVALID_BFN, 0, 0);
 }
 
 /* clear one page's page table */
@@ -676,7 +675,7 @@ static int __must_check dma_pte_clear_one(struct domain *domain, u64 addr)
     iommu_flush_cache_entry(pte, sizeof(struct dma_pte));
 
     if ( !this_cpu(iommu_dont_flush_iotlb) )
-        rc = iommu_flush_iotlb_pages(domain, addr >> PAGE_SHIFT_4K, 1);
+        rc = iommu_flush_iotlb_pages(domain, baddr_to_bfn(addr), 1);
 
     unmap_vtd_domain_page(page);
 
@@ -1767,8 +1766,7 @@ static void iommu_domain_teardown(struct domain *d)
 }
 
 static int __must_check intel_iommu_map_page(struct domain *d,
-                                             unsigned long bfn,
-                                             unsigned long mfn,
+                                             bfn_t bfn, mfn_t mfn,
                                              unsigned int flags)
 {
     struct domain_iommu *hd = dom_iommu(d);
@@ -1786,16 +1784,16 @@ static int __must_check intel_iommu_map_page(struct domain *d,
 
     spin_lock(&hd->arch.mapping_lock);
 
-    pg_maddr = addr_to_dma_page_maddr(d, __bfn_to_baddr(bfn), 1);
+    pg_maddr = addr_to_dma_page_maddr(d, bfn_to_baddr(bfn), 1);
     if ( pg_maddr == 0 )
     {
         spin_unlock(&hd->arch.mapping_lock);
         return -ENOMEM;
     }
     page = (struct dma_pte *)map_vtd_domain_page(pg_maddr);
-    pte = page + (bfn & LEVEL_MASK);
+    pte = page + (bfn_x(bfn) & LEVEL_MASK);
     old = *pte;
-    dma_set_pte_addr(new, (paddr_t)mfn << PAGE_SHIFT_4K);
+    dma_set_pte_addr(new, mfn_to_maddr(mfn));
     dma_set_pte_prot(new,
                      ((flags & IOMMUF_readable) ? DMA_PTE_READ  : 0) |
                      ((flags & IOMMUF_writable) ? DMA_PTE_WRITE : 0));
@@ -1823,13 +1821,13 @@ static int __must_check intel_iommu_map_page(struct domain *d,
 }
 
 static int __must_check intel_iommu_unmap_page(struct domain *d,
-                                               unsigned long bfn)
+                                               bfn_t bfn)
 {
     /* Do nothing if hardware domain and iommu supports pass thru. */
     if ( iommu_passthrough && is_hardware_domain(d) )
         return 0;
 
-    return dma_pte_clear_one(d, __bfn_to_baddr(bfn));
+    return dma_pte_clear_one(d, bfn_to_baddr(bfn));
 }
 
 int iommu_pte_flush(struct domain *d, uint64_t bfn, uint64_t *pte,
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 68182afd91..379882c690 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -65,7 +65,7 @@ int arch_iommu_populate_page_table(struct domain *d)
             {
                 ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
                 BUG_ON(SHARED_M2P(gfn));
-                rc = hd->platform_ops->map_page(d, gfn, mfn,
+                rc = hd->platform_ops->map_page(d, _bfn(gfn), _mfn(mfn),
                                                 IOMMUF_readable |
                                                 IOMMUF_writable);
             }
diff --git a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
index 99bc21c7b3..dce9ed6b83 100644
--- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
@@ -52,9 +52,9 @@ int amd_iommu_init(void);
 int amd_iommu_update_ivrs_mapping_acpi(void);
 
 /* mapping functions */
-int __must_check amd_iommu_map_page(struct domain *d, unsigned long gfn,
-                                    unsigned long mfn, unsigned int flags);
-int __must_check amd_iommu_unmap_page(struct domain *d, unsigned long gfn);
+int __must_check amd_iommu_map_page(struct domain *d, bfn_t bfn,
+                                    mfn_t mfn, unsigned int flags);
+int __must_check amd_iommu_unmap_page(struct domain *d, bfn_t bfn);
 u64 amd_iommu_get_next_table_from_pte(u32 *entry);
 int __must_check amd_iommu_alloc_root(struct domain_iommu *hd);
 int amd_iommu_reserve_domain_unity_map(struct domain *domain,
@@ -77,7 +77,7 @@ void iommu_dte_set_guest_cr3(u32 *dte, u16 dom_id, u64 gcr3,
 
 /* send cmd to iommu */
 void amd_iommu_flush_all_pages(struct domain *d);
-void amd_iommu_flush_pages(struct domain *d, unsigned long gfn,
+void amd_iommu_flush_pages(struct domain *d, unsigned long bfn,
                            unsigned int order);
 void amd_iommu_flush_iotlb(u8 devfn, const struct pci_dev *pdev,
                            uint64_t gaddr, unsigned int order);
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 3d4f3e7d26..6edc6e1a10 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -48,6 +48,11 @@ static inline bfn_t bfn_add(bfn_t bfn, unsigned long i)
     return _bfn(bfn_x(bfn) + i);
 }
 
+static inline bool_t bfn_eq(bfn_t x, bfn_t y)
+{
+    return bfn_x(x) == bfn_x(y);
+}
+
 #define IOMMU_PAGE_SHIFT 12
 #define IOMMU_PAGE_SIZE  (_AC(1,L) << IOMMU_PAGE_SHIFT)
 #define IOMMU_PAGE_MASK  (~(IOMMU_PAGE_SIZE - 1))
@@ -182,9 +187,9 @@ struct iommu_ops {
 #endif /* HAS_PCI */
 
     void (*teardown)(struct domain *d);
-    int __must_check (*map_page)(struct domain *d, unsigned long bfn,
-                                 unsigned long mfn, unsigned int flags);
-    int __must_check (*unmap_page)(struct domain *d, unsigned long bfn);
+    int __must_check (*map_page)(struct domain *d, bfn_t bfn, mfn_t mfn,
+                                 unsigned int flags);
+    int __must_check (*unmap_page)(struct domain *d, bfn_t bfn);
     void (*free_page_table)(struct page_info *);
 #ifdef CONFIG_X86
     void (*update_ire_from_apic)(unsigned int apic, unsigned int reg, unsigned int value);
@@ -195,7 +200,7 @@ struct iommu_ops {
     void (*resume)(void);
     void (*share_p2m)(struct domain *d);
     void (*crash_shutdown)(void);
-    int __must_check (*iotlb_flush)(struct domain *d, unsigned long bfn,
+    int __must_check (*iotlb_flush)(struct domain *d, bfn_t bfn,
                                     unsigned int page_count);
     int __must_check (*iotlb_flush_all)(struct domain *d);
     int (*get_reserved_device_memory)(iommu_grdm_t *, void *);
-- 
2.11.0



* [PATCH v6 04/14] iommu: don't domain_crash() inside iommu_map/unmap_page()
  2018-08-23  9:46 [PATCH v6 00/14] paravirtual IOMMU interface Paul Durrant
                   ` (2 preceding siblings ...)
  2018-08-23  9:47 ` [PATCH v6 03/14] iommu: push use of type-safe BFN and MFN into iommu_ops Paul Durrant
@ 2018-08-23  9:47 ` Paul Durrant
  2018-09-04 10:38   ` Jan Beulich
  2018-08-23  9:47 ` [PATCH v6 05/14] public / x86: introduce __HYPERCALL_iommu_op Paul Durrant
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-08-23  9:47 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Paul Durrant, Jan Beulich

This patch removes the implicit domain_crash() from iommu_map_page(),
iommu_unmap_page(), iommu_iotlb_flush() and iommu_iotlb_flush_all(),
turning them into straightforward wrappers that check the existence of
the relevant iommu_op and call through to it. This makes them usable by
the PV IOMMU code to be delivered in future patches.

This patch also adds a helper macro, domu_crash(), that invokes
domain_crash() only if the domain is not the hardware domain, and
modifies callers of the above functions to use it should an operation
fail.
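
As an illustrative sketch, a caller (inside a function where d, bfn, mfn
and rc are in scope) then follows this pattern, with flags and error
handling as in the hunks below:

    rc = iommu_map_page(d, bfn, mfn, IOMMUF_readable | IOMMUF_writable);
    if ( unlikely(rc) )
        domu_crash(d); /* crashes d only if it is not the hardware domain */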

NOTE: This patch includes one bit of clean-up in set_identity_p2m_entry()
      replacing use of p2m->domain with the domain pointer passed into the
      function.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tim Deegan <tim@xen.org>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>

v6:
 - Introduce domu_crash() (idea suggested by Kevin, name suggested by Jan)
   to crash non-hardware domains.
 - Dropped Wei's and George's R-b because of change.

v2:
 - New in v2.
---
 xen/arch/arm/p2m.c                  |  4 ++++
 xen/arch/x86/mm.c                   |  3 +++
 xen/arch/x86/mm/p2m-ept.c           |  3 +++
 xen/arch/x86/mm/p2m-pt.c            |  3 +++
 xen/arch/x86/mm/p2m.c               | 22 ++++++++++++++++++----
 xen/common/grant_table.c            |  4 ++++
 xen/common/memory.c                 |  3 +++
 xen/drivers/passthrough/iommu.c     | 12 ------------
 xen/drivers/passthrough/x86/iommu.c |  4 ++++
 xen/include/xen/sched.h             |  5 +++++
 10 files changed, 47 insertions(+), 16 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 072029dfbe..4a8cac5050 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -957,8 +957,12 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
 
     if ( need_iommu(p2m->domain) &&
          (lpae_is_valid(orig_pte) || lpae_is_valid(*entry)) )
+    {
         rc = iommu_iotlb_flush(p2m->domain, _bfn(gfn_x(sgfn)),
                                1UL << page_order);
+        if ( unlikely(rc) )
+            domu_crash(p2m->domain);
+    }
     else
         rc = 0;
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 9e9fb9421e..f598674f0c 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2760,6 +2760,9 @@ static int _get_page_type(struct page_info *page, unsigned long type,
                 iommu_ret = iommu_map_page(d, _bfn(mfn_x(mfn)), mfn,
                                            IOMMUF_readable |
                                            IOMMUF_writable);
+
+            if ( unlikely(iommu_ret) )
+                domu_crash(d);
         }
     }
 
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 2089b5232d..c9e1a7e288 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -895,6 +895,9 @@ out:
                     if ( !rc )
                         rc = ret;
                 }
+
+            if ( unlikely(rc) )
+                domu_crash(d);
         }
     }
 
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index a441af388a..3b8d184054 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -717,6 +717,9 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
                         rc = ret;
                 }
         }
+
+        if ( unlikely(rc) )
+            domu_crash(p2m->domain);
     }
 
     /*
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index fbf67def50..755329366e 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -724,6 +724,9 @@ p2m_remove_page(struct p2m_domain *p2m, unsigned long gfn_l, unsigned long mfn,
                 if ( !rc )
                     rc = ret;
             }
+
+            if ( unlikely(rc) )
+                domu_crash(p2m->domain);
         }
 
         return rc;
@@ -789,6 +792,7 @@ guest_physmap_add_entry(struct domain *d, gfn_t gfn, mfn_t mfn,
                         if ( iommu_unmap_page(d, bfn_add(bfn, i)) )
                             continue;
 
+                    domu_crash(d);
                     return rc;
                 }
             }
@@ -1157,12 +1161,17 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l,
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
     int ret;
 
-    if ( !paging_mode_translate(p2m->domain) )
+    if ( !paging_mode_translate(d) )
     {
         if ( !need_iommu(d) )
             return 0;
-        return iommu_map_page(d, _bfn(gfn_l), _mfn(gfn_l),
-                              IOMMUF_readable | IOMMUF_writable);
+
+        ret = iommu_map_page(d, _bfn(gfn_l), _mfn(gfn_l),
+                             IOMMUF_readable | IOMMUF_writable);
+        if ( unlikely(ret) )
+            domu_crash(d);
+
+        return ret;
     }
 
     gfn_lock(p2m, gfn, 0);
@@ -1252,7 +1261,12 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l)
     {
         if ( !need_iommu(d) )
             return 0;
-        return iommu_unmap_page(d, _bfn(gfn_l));
+
+        ret = iommu_unmap_page(d, _bfn(gfn_l));
+        if ( unlikely(ret) )
+            domu_crash(d);
+
+        return ret;
     }
 
     gfn_lock(p2m, gfn, 0);
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index af41133953..e694a4bf16 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -1154,6 +1154,7 @@ map_grant_ref(
         }
         if ( err )
         {
+            domu_crash(ld);
             double_gt_unlock(lgt, rgt);
             rc = GNTST_general_error;
             goto undo_out;
@@ -1406,7 +1407,10 @@ unmap_common(
         double_gt_unlock(lgt, rgt);
 
         if ( err )
+        {
+            domu_crash(ld);
             rc = GNTST_general_error;
+        }
     }
 
     /* If just unmapped a writable mapping, mark as dirtied */
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 8ba8921c79..2e4cd8cdfd 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -841,6 +841,9 @@ int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap *xatp,
         ret = iommu_iotlb_flush(d, _bfn(xatp->gpfn - done), done);
         if ( unlikely(ret) && rc >= 0 )
             rc = ret;
+
+        if ( unlikely(rc < 0) )
+            domu_crash(d);
     }
 #endif
 
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index d9ec667945..f75cab130d 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -272,9 +272,6 @@ int iommu_map_page(struct domain *d, bfn_t bfn, mfn_t mfn,
             printk(XENLOG_ERR
                    "d%d: IOMMU mapping bfn %"PRI_bfn" to mfn %"PRI_mfn" failed: %d\n",
                    d->domain_id, bfn_x(bfn), mfn_x(mfn), rc);
-
-        if ( !is_hardware_domain(d) )
-            domain_crash(d);
     }
 
     return rc;
@@ -295,9 +292,6 @@ int iommu_unmap_page(struct domain *d, bfn_t bfn)
             printk(XENLOG_ERR
                    "d%d: IOMMU unmapping bfn %"PRI_bfn" failed: %d\n",
                    d->domain_id, bfn_x(bfn), rc);
-
-        if ( !is_hardware_domain(d) )
-            domain_crash(d);
     }
 
     return rc;
@@ -335,9 +329,6 @@ int iommu_iotlb_flush(struct domain *d, bfn_t bfn, unsigned int page_count)
             printk(XENLOG_ERR
                    "d%d: IOMMU IOTLB flush failed: %d, bfn %"PRI_bfn", page count %u\n",
                    d->domain_id, rc, bfn_x(bfn), page_count);
-
-        if ( !is_hardware_domain(d) )
-            domain_crash(d);
     }
 
     return rc;
@@ -358,9 +349,6 @@ int iommu_iotlb_flush_all(struct domain *d)
             printk(XENLOG_ERR
                    "d%d: IOMMU IOTLB flush all failed: %d\n",
                    d->domain_id, rc);
-
-        if ( !is_hardware_domain(d) )
-            domain_crash(d);
     }
 
     return rc;
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 379882c690..09573722bd 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -104,7 +104,11 @@ int arch_iommu_populate_page_table(struct domain *d)
     this_cpu(iommu_dont_flush_iotlb) = 0;
 
     if ( !rc )
+    {
         rc = iommu_iotlb_flush_all(d);
+        if ( unlikely(rc) )
+            domain_crash(d);
+    }
 
     if ( rc && rc != -ERESTART )
         iommu_teardown(d);
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 51ceebe6cc..381eb6dc8c 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -616,6 +616,11 @@ void __domain_crash(struct domain *d);
     __domain_crash(d);                                                    \
 } while (0)
 
+#define domu_crash(d) do {                \
+    if ( !is_hardware_domain(d) )         \
+        domain_crash(d);                  \
+} while (false)
+
 /*
  * Called from assembly code, with an optional address to help indicate why
  * the crash occured.  If addr is 0, look up address from last extable
-- 
2.11.0



* [PATCH v6 05/14] public / x86: introduce __HYPERCALL_iommu_op
  2018-08-23  9:46 [PATCH v6 00/14] paravirtual IOMMU interface Paul Durrant
                   ` (3 preceding siblings ...)
  2018-08-23  9:47 ` [PATCH v6 04/14] iommu: don't domain_crash() inside iommu_map/unmap_page() Paul Durrant
@ 2018-08-23  9:47 ` Paul Durrant
  2018-09-04 11:50   ` Jan Beulich
  2018-09-07 10:52   ` Jan Beulich
  2018-08-23  9:47 ` [PATCH v6 06/14] iommu: track reserved ranges using a rangeset Paul Durrant
                   ` (8 subsequent siblings)
  13 siblings, 2 replies; 111+ messages in thread
From: Paul Durrant @ 2018-08-23  9:47 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Paul Durrant, Jan Beulich

This patch introduces the boilerplate for a new hypercall to allow a
domain to control IOMMU mappings for its own pages.

Whilst there is duplication of code between the native and compat entry
points, which appears ripe for some form of combination, I think it is
better to maintain the separation as-is because the compat entry point
will necessarily gain complexity in subsequent patches.
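
For illustration, a guest would invoke the hypercall along these lines
(a sketch only: the HYPERVISOR_iommu_op() stub and set_xen_guest_handle()
plumbing are assumed to exist in the guest environment, as they do for
other buffer-array hypercalls such as dm_op):

    xen_iommu_op_t op = {
        .op = 0, /* no op codes defined yet, so status will be -EOPNOTSUPP */
    };
    xen_iommu_op_buf_t buf = {
        .size = sizeof(op),
    };
    int rc;

    set_xen_guest_handle(buf.h, &op);

    rc = HYPERVISOR_iommu_op(1, &buf); /* nr_bufs == 1 */
    if ( !rc )
        rc = op.status; /* per-op status: 0 or negative errno */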

NOTE: This hypercall is only implemented for x86 and is currently
      restricted by XSM to dom0. Its scope can be expanded in future
      if need be.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

v6:
 - Move iommu_op.c to xen/common as suggested by Kevin, but only build for
   x86 (as suggested by Jan).

v3:
 - Push op code into XSM check.

v2:
 - Get rid of the can_control_iommu() function, leaving this patch as pure
   boilerplate.
 - Re-structure the hypercall to use a buffer array, similar to that used
   by __HYPERCALL_dm_op, to allow for future expansion of op structure
   without affecting binary compatibility.
 - Drop use of __ in public header guard.
---
 tools/flask/policy/modules/xen.if   |   1 +
 xen/arch/x86/hvm/hypercall.c        |   1 +
 xen/arch/x86/hypercall.c            |   1 +
 xen/arch/x86/pv/hypercall.c         |   1 +
 xen/common/Makefile                 |   1 +
 xen/common/iommu_op.c               | 184 ++++++++++++++++++++++++++++++++++++
 xen/include/Makefile                |   2 +
 xen/include/public/iommu_op.h       |  64 +++++++++++++
 xen/include/public/xen.h            |   1 +
 xen/include/xen/hypercall.h         |  12 +++
 xen/include/xlat.lst                |   2 +
 xen/include/xsm/dummy.h             |   6 ++
 xen/include/xsm/xsm.h               |   6 ++
 xen/xsm/dummy.c                     |   1 +
 xen/xsm/flask/hooks.c               |   6 ++
 xen/xsm/flask/policy/access_vectors |   2 +
 16 files changed, 291 insertions(+)
 create mode 100644 xen/common/iommu_op.c
 create mode 100644 xen/include/public/iommu_op.h

diff --git a/tools/flask/policy/modules/xen.if b/tools/flask/policy/modules/xen.if
index 61b0e76715..e3ae39c764 100644
--- a/tools/flask/policy/modules/xen.if
+++ b/tools/flask/policy/modules/xen.if
@@ -60,6 +60,7 @@ define(`create_domain_common', `
 	allow $1 $2:grant setup;
 	allow $1 $2:hvm { getparam hvmctl sethvmc
 			setparam nested altp2mhvm altp2mhvm_op dm };
+	allow $1 $2:resource control_iommu;
 ')
 
 # create_domain(priv, target)
diff --git a/xen/arch/x86/hvm/hypercall.c b/xen/arch/x86/hvm/hypercall.c
index 85eacd7d33..3574966827 100644
--- a/xen/arch/x86/hvm/hypercall.c
+++ b/xen/arch/x86/hvm/hypercall.c
@@ -137,6 +137,7 @@ static const hypercall_table_t hvm_hypercall_table[] = {
     COMPAT_CALL(mmuext_op),
     HYPERCALL(xenpmu_op),
     COMPAT_CALL(dm_op),
+    COMPAT_CALL(iommu_op),
     HYPERCALL(arch_1)
 };
 
diff --git a/xen/arch/x86/hypercall.c b/xen/arch/x86/hypercall.c
index 90e88c1d2c..045753e702 100644
--- a/xen/arch/x86/hypercall.c
+++ b/xen/arch/x86/hypercall.c
@@ -68,6 +68,7 @@ const hypercall_args_t hypercall_args_table[NR_hypercalls] =
     ARGS(xenpmu_op, 2),
     ARGS(dm_op, 3),
     ARGS(mca, 1),
+    ARGS(iommu_op, 2),
     ARGS(arch_1, 1),
 };
 
diff --git a/xen/arch/x86/pv/hypercall.c b/xen/arch/x86/pv/hypercall.c
index bbc3011d1a..d23f9af42f 100644
--- a/xen/arch/x86/pv/hypercall.c
+++ b/xen/arch/x86/pv/hypercall.c
@@ -80,6 +80,7 @@ const hypercall_table_t pv_hypercall_table[] = {
     HYPERCALL(xenpmu_op),
     COMPAT_CALL(dm_op),
     HYPERCALL(mca),
+    COMPAT_CALL(iommu_op),
     HYPERCALL(arch_1),
 };
 
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 6a05fffc7a..6341bdccee 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -13,6 +13,7 @@ obj-$(CONFIG_CRASH_DEBUG) += gdbstub.o
 obj-y += grant_table.o
 obj-y += guestcopy.o
 obj-bin-y += gunzip.init.o
+obj-$(CONFIG_X86) += iommu_op.o
 obj-y += irq.o
 obj-y += kernel.o
 obj-y += keyhandler.o
diff --git a/xen/common/iommu_op.c b/xen/common/iommu_op.c
new file mode 100644
index 0000000000..744c0fce27
--- /dev/null
+++ b/xen/common/iommu_op.c
@@ -0,0 +1,184 @@
+/******************************************************************************
+ * common/iommu_op.c
+ *
+ * Paravirtualised IOMMU functionality
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2018 Citrix Systems Inc
+ */
+
+#include <xen/event.h>
+#include <xen/guest_access.h>
+#include <xen/hypercall.h>
+
+static void iommu_op(xen_iommu_op_t *op)
+{
+    switch ( op->op )
+    {
+    default:
+        op->status = -EOPNOTSUPP;
+        break;
+    }
+}
+
+int do_one_iommu_op(xen_iommu_op_buf_t *buf)
+{
+    xen_iommu_op_t op;
+    int rc;
+
+    if ( buf->size < sizeof(op) )
+        return -EFAULT;
+
+    if ( copy_from_guest((void *)&op, buf->h, sizeof(op)) )
+        return -EFAULT;
+
+    if ( op.pad )
+        return -EINVAL;
+
+    rc = xsm_iommu_op(XSM_PRIV, current->domain, op.op);
+    if ( rc )
+        return rc;
+
+    iommu_op(&op);
+
+    if ( __copy_field_to_guest(guest_handle_cast(buf->h, xen_iommu_op_t),
+                               &op, status) )
+        return -EFAULT;
+
+    return 0;
+}
+
+long do_iommu_op(unsigned int nr_bufs,
+                 XEN_GUEST_HANDLE_PARAM(xen_iommu_op_buf_t) bufs)
+{
+    unsigned int i;
+    long rc = 0;
+
+    for ( i = 0; i < nr_bufs; i++ )
+    {
+        xen_iommu_op_buf_t buf;
+
+        if ( ((i & 0xff) == 0xff) && hypercall_preempt_check() )
+        {
+            rc = i;
+            break;
+        }
+
+        if ( copy_from_guest_offset(&buf, bufs, i, 1) )
+        {
+            rc = -EFAULT;
+            break;
+        }
+
+        rc = do_one_iommu_op(&buf);
+        if ( rc )
+            break;
+    }
+
+    if ( rc > 0 )
+    {
+        ASSERT(rc < nr_bufs);
+        nr_bufs -= rc;
+        guest_handle_add_offset(bufs, rc);
+
+        rc = hypercall_create_continuation(__HYPERVISOR_iommu_op,
+                                           "ih", nr_bufs, bufs);
+    }
+
+    return rc;
+}
+
+int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
+{
+    compat_iommu_op_t cmp;
+    xen_iommu_op_t nat;
+    int rc;
+
+    if ( buf->size < sizeof(cmp) )
+        return -EFAULT;
+
+    if ( copy_from_compat((void *)&cmp, buf->h, sizeof(cmp)) )
+        return -EFAULT;
+
+    if ( cmp.pad )
+        return -EINVAL;
+
+    rc = xsm_iommu_op(XSM_PRIV, current->domain, cmp.op);
+    if ( rc )
+        return rc;
+
+    XLAT_iommu_op(&nat, &cmp);
+
+    iommu_op(&nat);
+
+    XLAT_iommu_op(&cmp, &nat);
+
+    if ( __copy_field_to_compat(compat_handle_cast(buf->h,
+                                                   compat_iommu_op_t),
+                                &cmp, status) )
+        return -EFAULT;
+
+    return 0;
+}
+
+int compat_iommu_op(unsigned int nr_bufs,
+                    XEN_GUEST_HANDLE_PARAM(compat_iommu_op_buf_t) bufs)
+{
+    unsigned int i;
+    long rc = 0;
+
+    for ( i = 0; i < nr_bufs; i++ )
+    {
+        compat_iommu_op_buf_t buf;
+
+        if ( ((i & 0xff) == 0xff) && hypercall_preempt_check() )
+        {
+            rc = i;
+            break;
+        }
+
+        if ( copy_from_guest_offset(&buf, bufs, i, 1) )
+        {
+            rc = -EFAULT;
+            break;
+        }
+
+        rc = compat_one_iommu_op(&buf);
+        if ( rc )
+            break;
+    }
+
+    if ( rc > 0 )
+    {
+        ASSERT(rc < nr_bufs);
+        nr_bufs -= rc;
+        guest_handle_add_offset(bufs, rc);
+
+        rc = hypercall_create_continuation(__HYPERVISOR_iommu_op,
+                                           "ih", nr_bufs, bufs);
+    }
+
+    return rc;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/Makefile b/xen/include/Makefile
index df04182965..af54d8833f 100644
--- a/xen/include/Makefile
+++ b/xen/include/Makefile
@@ -11,6 +11,7 @@ headers-y := \
     compat/features.h \
     compat/grant_table.h \
     compat/kexec.h \
+    compat/iommu_op.h \
     compat/memory.h \
     compat/nmi.h \
     compat/physdev.h \
@@ -29,6 +30,7 @@ headers-$(CONFIG_X86)     += compat/arch-x86/xen-$(compat-arch-y).h
 headers-$(CONFIG_X86)     += compat/hvm/dm_op.h
 headers-$(CONFIG_X86)     += compat/hvm/hvm_op.h
 headers-$(CONFIG_X86)     += compat/hvm/hvm_vcpu.h
+headers-$(CONFIG_X86)     += compat/iommu_op.h
 headers-y                 += compat/arch-$(compat-arch-y).h compat/pmu.h compat/xlat.h
 headers-$(CONFIG_FLASK)   += compat/xsm/flask_op.h
 
diff --git a/xen/include/public/iommu_op.h b/xen/include/public/iommu_op.h
new file mode 100644
index 0000000000..c3b68f665a
--- /dev/null
+++ b/xen/include/public/iommu_op.h
@@ -0,0 +1,64 @@
+/*
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Copyright (C) 2018 Citrix Systems Inc
+ */
+
+#ifndef XEN_PUBLIC_IOMMU_OP_H
+#define XEN_PUBLIC_IOMMU_OP_H
+
+#include "xen.h"
+
+struct xen_iommu_op {
+    uint16_t op;    /* op type */
+    uint16_t pad;
+    int32_t status; /* op completion status: */
+                    /* 0 for success otherwise, negative errno */
+};
+typedef struct xen_iommu_op xen_iommu_op_t;
+DEFINE_XEN_GUEST_HANDLE(xen_iommu_op_t);
+
+struct xen_iommu_op_buf {
+    XEN_GUEST_HANDLE(void) h;
+    xen_ulong_t size;
+};
+typedef struct xen_iommu_op_buf xen_iommu_op_buf_t;
+DEFINE_XEN_GUEST_HANDLE(xen_iommu_op_buf_t);
+
+/* ` enum neg_errnoval
+ * ` HYPERVISOR_iommu_op(unsigned int nr_bufs,
+ * `                     xen_iommu_op_buf_t bufs[])
+ * `
+ *
+ * @nr_bufs is the number of buffers in the @bufs array.
+ * @bufs points to an array of buffers where each contains a struct
+ * xen_iommu_op.
+ */
+
+#endif /* XEN_PUBLIC_IOMMU_OP_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/public/xen.h b/xen/include/public/xen.h
index fb1df8f293..68b0968e7d 100644
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -121,6 +121,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_ulong_t);
 #define __HYPERVISOR_xc_reserved_op       39 /* reserved for XenClient */
 #define __HYPERVISOR_xenpmu_op            40
 #define __HYPERVISOR_dm_op                41
+#define __HYPERVISOR_iommu_op             42
 
 /* Architecture-specific hypercall definitions. */
 #define __HYPERVISOR_arch_0               48
diff --git a/xen/include/xen/hypercall.h b/xen/include/xen/hypercall.h
index cc99aea57d..2ebc999f4b 100644
--- a/xen/include/xen/hypercall.h
+++ b/xen/include/xen/hypercall.h
@@ -16,6 +16,7 @@
 #include <public/version.h>
 #include <public/pmu.h>
 #include <public/hvm/dm_op.h>
+#include <public/iommu_op.h>
 #include <asm/hypercall.h>
 #include <xsm/xsm.h>
 
@@ -148,6 +149,10 @@ do_dm_op(
     unsigned int nr_bufs,
     XEN_GUEST_HANDLE_PARAM(xen_dm_op_buf_t) bufs);
 
+extern long
+do_iommu_op(unsigned int nr_bufs,
+            XEN_GUEST_HANDLE_PARAM(xen_iommu_op_buf_t) bufs);
+
 #ifdef CONFIG_COMPAT
 
 extern int
@@ -205,6 +210,13 @@ compat_dm_op(
     unsigned int nr_bufs,
     XEN_GUEST_HANDLE_PARAM(void) bufs);
 
+#include <compat/iommu_op.h>
+
+DEFINE_XEN_GUEST_HANDLE(compat_iommu_op_buf_t);
+extern int
+compat_iommu_op(unsigned int nr_bufs,
+                XEN_GUEST_HANDLE_PARAM(compat_iommu_op_buf_t) bufs);
+
 #endif
 
 void arch_get_xen_caps(xen_capabilities_info_t *info);
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 527332054a..3b15c18c4e 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -77,6 +77,8 @@
 ?	vcpu_hvm_context		hvm/hvm_vcpu.h
 ?	vcpu_hvm_x86_32			hvm/hvm_vcpu.h
 ?	vcpu_hvm_x86_64			hvm/hvm_vcpu.h
+!	iommu_op			iommu_op.h
+!	iommu_op_buf			iommu_op.h
 ?	kexec_exec			kexec.h
 !	kexec_image			kexec.h
 !	kexec_range			kexec.h
diff --git a/xen/include/xsm/dummy.h b/xen/include/xsm/dummy.h
index b0ac1f66b3..34b786993d 100644
--- a/xen/include/xsm/dummy.h
+++ b/xen/include/xsm/dummy.h
@@ -715,6 +715,12 @@ static XSM_INLINE int xsm_dm_op(XSM_DEFAULT_ARG struct domain *d)
     return xsm_default_action(action, current->domain, d);
 }
 
+static XSM_INLINE int xsm_iommu_op(XSM_DEFAULT_ARG struct domain *d, unsigned int op)
+{
+    XSM_ASSERT_ACTION(XSM_PRIV);
+    return xsm_default_action(action, current->domain, d);
+}
+
 #endif /* CONFIG_X86 */
 
 #include <public/version.h>
diff --git a/xen/include/xsm/xsm.h b/xen/include/xsm/xsm.h
index 7636bcbb42..7d75d0076e 100644
--- a/xen/include/xsm/xsm.h
+++ b/xen/include/xsm/xsm.h
@@ -178,6 +178,7 @@ struct xsm_operations {
     int (*ioport_mapping) (struct domain *d, uint32_t s, uint32_t e, uint8_t allow);
     int (*pmu_op) (struct domain *d, unsigned int op);
     int (*dm_op) (struct domain *d);
+    int (*iommu_op) (struct domain *d, unsigned int op);
 #endif
     int (*xen_version) (uint32_t cmd);
     int (*domain_resource_map) (struct domain *d);
@@ -686,6 +687,11 @@ static inline int xsm_dm_op(xsm_default_t def, struct domain *d)
     return xsm_ops->dm_op(d);
 }
 
+static inline int xsm_iommu_op(xsm_default_t def, struct domain *d, unsigned int op)
+{
+    return xsm_ops->iommu_op(d, op);
+}
+
 #endif /* CONFIG_X86 */
 
 static inline int xsm_xen_version (xsm_default_t def, uint32_t op)
diff --git a/xen/xsm/dummy.c b/xen/xsm/dummy.c
index 3290d04527..8532b74b9a 100644
--- a/xen/xsm/dummy.c
+++ b/xen/xsm/dummy.c
@@ -155,6 +155,7 @@ void __init xsm_fixup_ops (struct xsm_operations *ops)
     set_to_dummy_if_null(ops, ioport_mapping);
     set_to_dummy_if_null(ops, pmu_op);
     set_to_dummy_if_null(ops, dm_op);
+    set_to_dummy_if_null(ops, iommu_op);
 #endif
     set_to_dummy_if_null(ops, xen_version);
     set_to_dummy_if_null(ops, domain_resource_map);
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index a4fbe62ac3..1c9be22112 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -1669,6 +1669,11 @@ static int flask_dm_op(struct domain *d)
     return current_has_perm(d, SECCLASS_HVM, HVM__DM);
 }
 
+static int flask_iommu_op(struct domain *d, unsigned int op)
+{
+    return current_has_perm(d, SECCLASS_RESOURCE, RESOURCE__CONTROL_IOMMU);
+}
+
 #endif /* CONFIG_X86 */
 
 static int flask_xen_version (uint32_t op)
@@ -1847,6 +1852,7 @@ static struct xsm_operations flask_ops = {
     .ioport_mapping = flask_ioport_mapping,
     .pmu_op = flask_pmu_op,
     .dm_op = flask_dm_op,
+    .iommu_op = flask_iommu_op,
 #endif
     .xen_version = flask_xen_version,
     .domain_resource_map = flask_domain_resource_map,
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index b768870f37..00db6a1cf7 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -471,6 +471,8 @@ class resource
 # checked for PHYSDEVOP_setup_gsi (target IRQ)
 # checked for PHYSDEVOP_pci_mmcfg_reserved (target xen_t)
     setup
+# checked for IOMMU_OP
+    control_iommu
 }
 
 # Class security describes the FLASK security server itself; these operations
-- 
2.11.0



* [PATCH v6 06/14] iommu: track reserved ranges using a rangeset
  2018-08-23  9:46 [PATCH v6 00/14] paravirtual IOMMU interface Paul Durrant
                   ` (4 preceding siblings ...)
  2018-08-23  9:47 ` [PATCH v6 05/14] public / x86: introduce __HYPERCALL_iommu_op Paul Durrant
@ 2018-08-23  9:47 ` Paul Durrant
  2018-09-07 10:40   ` Jan Beulich
  2018-08-23  9:47 ` [PATCH v6 07/14] x86: add iommu_op to query reserved ranges Paul Durrant
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-08-23  9:47 UTC (permalink / raw)
  To: xen-devel; +Cc: Kevin Tian, Paul Durrant, Jan Beulich

Ranges that should be considered reserved in the IOMMU are not necessarily
limited to RMRRs. If iommu_inclusive_mapping is set, then any frame number
falling within an E820 reserved region should also be considered as
reserved in the IOMMU.

This patch adds a rangeset to the domain_iommu structure that is then used
to track all reserved ranges. This will be needed by a subsequent patch
to test whether it is safe to modify a particular IOMMU entry.
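
Pieced together from the hunks below, the lifecycle of the new rangeset
is roughly:

    /* iommu_domain_init() */
    hd->reserved_ranges = rangeset_new(d, NULL, 0);

    /* rmrr_identity_mapping(): track an RMRR's frame range */
    err = rangeset_add_range(hd->reserved_ranges, base_pfn, end_pfn);

    /* vtd_set_hwdom_mapping(): track a single E820-reserved frame */
    rc = rangeset_add_singleton(dom_iommu(d)->reserved_ranges, pfn);

    /* iommu_domain_destroy() */
    rangeset_destroy(hd->reserved_ranges);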

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Kevin Tian <kevin.tian@intel.com>

v6:
 - Re-base.

v2:
 - New in v2.
---
 xen/drivers/passthrough/iommu.c       | 10 +++++++++-
 xen/drivers/passthrough/vtd/iommu.c   | 20 +++++++++++++-------
 xen/drivers/passthrough/vtd/x86/vtd.c | 15 ++++++++++++++-
 xen/include/xen/iommu.h               |  6 ++++++
 4 files changed, 42 insertions(+), 9 deletions(-)

diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index f75cab130d..9b63e9efee 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -147,6 +147,10 @@ int iommu_domain_init(struct domain *d)
     if ( !iommu_enabled )
         return 0;
 
+    hd->reserved_ranges = rangeset_new(d, NULL, 0);
+    if ( !hd->reserved_ranges )
+        return -ENOMEM;
+
     hd->platform_ops = iommu_get_ops();
     return hd->platform_ops->init(d);
 }
@@ -248,12 +252,16 @@ int iommu_construct(struct domain *d)
 
 void iommu_domain_destroy(struct domain *d)
 {
-    if ( !iommu_enabled || !dom_iommu(d)->platform_ops )
+    const struct domain_iommu *hd = dom_iommu(d);
+
+    if ( !iommu_enabled || !hd->platform_ops )
         return;
 
     iommu_teardown(d);
 
     arch_iommu_domain_destroy(d);
+
+    rangeset_destroy(hd->reserved_ranges);
 }
 
 int iommu_map_page(struct domain *d, bfn_t bfn, mfn_t mfn,
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index c9f50f04ad..282e227414 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1910,6 +1910,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
     unsigned long end_pfn = PAGE_ALIGN_4K(rmrr->end_address) >> PAGE_SHIFT_4K;
     struct mapped_rmrr *mrmrr;
     struct domain_iommu *hd = dom_iommu(d);
+    int err = 0;
 
     ASSERT(pcidevs_locked());
     ASSERT(rmrr->base_address < rmrr->end_address);
@@ -1923,8 +1924,6 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
         if ( mrmrr->base == rmrr->base_address &&
              mrmrr->end == rmrr->end_address )
         {
-            int ret = 0;
-
             if ( map )
             {
                 ++mrmrr->count;
@@ -1934,28 +1933,35 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
             if ( --mrmrr->count )
                 return 0;
 
-            while ( base_pfn < end_pfn )
+            err = rangeset_remove_range(hd->reserved_ranges,
+                                        base_pfn, end_pfn);
+            while ( !err && base_pfn < end_pfn )
             {
                 if ( clear_identity_p2m_entry(d, base_pfn) )
-                    ret = -ENXIO;
+                    err = -ENXIO;
+
                 base_pfn++;
             }
 
             list_del(&mrmrr->list);
             xfree(mrmrr);
-            return ret;
+            return err;
         }
     }
 
     if ( !map )
         return -ENOENT;
 
+    err = rangeset_add_range(hd->reserved_ranges, base_pfn, end_pfn);
+    if ( err )
+        return err;
+
     while ( base_pfn < end_pfn )
     {
-        int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag);
-
+        err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag);
         if ( err )
             return err;
+
         base_pfn++;
     }
 
diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c b/xen/drivers/passthrough/vtd/x86/vtd.c
index fb674cdc68..601d82a16e 100644
--- a/xen/drivers/passthrough/vtd/x86/vtd.c
+++ b/xen/drivers/passthrough/vtd/x86/vtd.c
@@ -154,8 +154,21 @@ void __hwdom_init vtd_set_hwdom_mapping(struct domain *d)
 
         rc = iommu_map_page(d, _bfn(pfn), _mfn(pfn),
                             IOMMUF_readable | IOMMUF_writable);
+
+        /*
+         * The only reason a reserved page would be mapped is that
+         * iommu_inclusive_mapping is set, in which case it needs to be
+         * marked as reserved in the IOMMU.
+         */
+        if ( !rc && page_is_ram_type(pfn, RAM_TYPE_RESERVED) )
+        {
+            ASSERT(iommu_inclusive_mapping);
+
+            rc = rangeset_add_singleton(dom_iommu(d)->reserved_ranges, pfn);
+        }
+
         if ( rc )
-            printk(XENLOG_WARNING VTDPREFIX " d%d: IOMMU mapping failed: %d\n",
+            printk(XENLOG_WARNING VTDPREFIX " d%d: IOMMU mapping/reservation failed: %d\n",
                    d->domain_id, rc);
 
         if (!(i & 0xfffff))
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 6edc6e1a10..8cdb530d49 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -122,6 +122,12 @@ struct domain_iommu {
 
     /* Features supported by the IOMMU */
     DECLARE_BITMAP(features, IOMMU_FEAT_count);
+
+    /*
+     * BFN ranges that are reserved in the domain IOMMU mappings and
+     * must not be modified after initialization.
+     */
+    struct rangeset *reserved_ranges;
 };
 
 #define dom_iommu(d)              (&(d)->iommu)
-- 
2.11.0



* [PATCH v6 07/14] x86: add iommu_op to query reserved ranges
  2018-08-23  9:46 [PATCH v6 00/14] paravirtual IOMMU interface Paul Durrant
                   ` (5 preceding siblings ...)
  2018-08-23  9:47 ` [PATCH v6 06/14] iommu: track reserved ranges using a rangeset Paul Durrant
@ 2018-08-23  9:47 ` Paul Durrant
  2018-09-07 11:01   ` Jan Beulich
  2018-08-23  9:47 ` [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops Paul Durrant
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-08-23  9:47 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Paul Durrant, Jan Beulich

This patch adds an iommu_op to allow the domain IOMMU reserved ranges to be
queried by the guest.

NOTE: The number of reserved ranges is determined by system firmware, in
      conjunction with Xen command line options, and is expected to be
      small. Thus, to avoid over-complicating the code, there is no
      pre-emption check within the operation.
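
For illustration, a guest might drive this op in two passes: first with
a NULL buffer to discover the number of ranges, then again with a
suitably sized array. This is only a sketch; the guest-side hypercall
wrapper (here called hypercall_iommu_op()) and the use of calloc() are
assumptions, not part of this series:

    xen_iommu_op_t op = { .op = XEN_IOMMUOP_query_reserved };
    xen_iommu_op_buf_t buf;
    xen_iommu_reserved_range_t *ranges;
    int rc;

    /* First pass: NULL buffer with nr_entries == 0 is explicitly allowed */
    set_xen_guest_handle(op.u.query_reserved.ranges, NULL);
    op.u.query_reserved.nr_entries = 0;

    set_xen_guest_handle(buf.h, &op);
    buf.size = sizeof(op);

    rc = hypercall_iommu_op(1, &buf); /* assumed wrapper */
    if ( rc || op.status )
        return;

    /* Second pass: nr_entries now holds the required count */
    ranges = calloc(op.u.query_reserved.nr_entries, sizeof(*ranges));
    set_xen_guest_handle(op.u.query_reserved.ranges, ranges);

    rc = hypercall_iommu_op(1, &buf);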

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

v4:
 - Make xen_bfn_t strictly 64 bits wide and drop associated compat
   translation.

v3:
 - Avoid speculation beyond array bounds check.

v2:
 - Re-implemented for v2 based on new rangeset.
---
 xen/common/iommu_op.c         | 164 ++++++++++++++++++++++++++++++++++++++++--
 xen/include/public/iommu_op.h |  39 ++++++++++
 xen/include/xlat.lst          |   2 +
 3 files changed, 199 insertions(+), 6 deletions(-)

diff --git a/xen/common/iommu_op.c b/xen/common/iommu_op.c
index 744c0fce27..bcfcd49102 100644
--- a/xen/common/iommu_op.c
+++ b/xen/common/iommu_op.c
@@ -22,11 +22,70 @@
 #include <xen/event.h>
 #include <xen/guest_access.h>
 #include <xen/hypercall.h>
+#include <xen/nospec.h>
+
+struct get_reserved_ctxt {
+    unsigned int max_entries;
+    unsigned int nr_entries;
+    XEN_GUEST_HANDLE(xen_iommu_reserved_range_t) ranges;
+};
+
+static int get_reserved(unsigned long s, unsigned long e, void *arg)
+{
+    struct get_reserved_ctxt *ctxt = arg;
+
+    if ( ctxt->nr_entries < ctxt->max_entries )
+    {
+        xen_iommu_reserved_range_t range = {
+            .start_bfn = s,
+            .nr_frames = e - s + 1,
+        };
+
+        if ( copy_to_guest_offset(ctxt->ranges, ctxt->nr_entries, &range,
+                                  1) )
+            return -EFAULT;
+    }
+
+    ctxt->nr_entries++;
+    return 0;
+}
+
+static int iommu_op_query_reserved(struct xen_iommu_op_query_reserved *op)
+{
+    struct domain *currd = current->domain;
+    struct domain_iommu *iommu = dom_iommu(currd);
+    struct get_reserved_ctxt ctxt = {
+        .max_entries = op->nr_entries,
+        .ranges = op->ranges,
+    };
+    int rc;
+
+    if ( op->pad )
+        return -EINVAL;
+
+    rc = rangeset_report_ranges(iommu->reserved_ranges, 0, ~0ul,
+                                get_reserved, &ctxt);
+    if ( rc )
+        return rc;
+
+    /* Pass back the actual number of reserved ranges */
+    op->nr_entries = ctxt.nr_entries;
+
+    if ( !guest_handle_is_null(ctxt.ranges) &&
+         ctxt.nr_entries > ctxt.max_entries )
+        return -ENOBUFS;
+
+    return 0;
+}
 
 static void iommu_op(xen_iommu_op_t *op)
 {
     switch ( op->op )
     {
+    case XEN_IOMMUOP_query_reserved:
+        op->status = iommu_op_query_reserved(&op->u.query_reserved);
+        break;
+
     default:
         op->status = -EOPNOTSUPP;
         break;
@@ -35,13 +94,20 @@ static void iommu_op(xen_iommu_op_t *op)
 
 int do_one_iommu_op(xen_iommu_op_buf_t *buf)
 {
-    xen_iommu_op_t op;
+    xen_iommu_op_t op = {};
+    size_t offset;
+    static const size_t op_size[] = {
+        [XEN_IOMMUOP_query_reserved] = sizeof(struct xen_iommu_op_query_reserved),
+    };
+    size_t size;
     int rc;
 
-    if ( buf->size < sizeof(op) )
+    offset = offsetof(struct xen_iommu_op, u);
+
+    if ( buf->size < offset )
         return -EFAULT;
 
-    if ( copy_from_guest((void *)&op, buf->h, sizeof(op)) )
+    if ( copy_from_guest((void *)&op, buf->h, offset) )
         return -EFAULT;
 
     if ( op.pad )
@@ -51,6 +117,16 @@ int do_one_iommu_op(xen_iommu_op_buf_t *buf)
     if ( rc )
         return rc;
 
+    if ( op.op >= ARRAY_SIZE(op_size) )
+        return -EOPNOTSUPP;
+
+    size = op_size[array_index_nospec(op.op, ARRAY_SIZE(op_size))];
+    if ( buf->size < offset + size )
+        return -EFAULT;
+
+    if ( copy_from_guest_offset((void *)&op.u, buf->h, offset, size) )
+        return -EFAULT;
+
     iommu_op(&op);
 
     if ( __copy_field_to_guest(guest_handle_cast(buf->h, xen_iommu_op_t),
@@ -100,16 +176,27 @@ long do_iommu_op(unsigned int nr_bufs,
     return rc;
 }
 
+CHECK_iommu_reserved_range;
+
 int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
 {
-    compat_iommu_op_t cmp;
+    compat_iommu_op_t cmp = {};
+    size_t offset;
+    static const size_t op_size[] = {
+        [XEN_IOMMUOP_query_reserved] = sizeof(struct compat_iommu_op_query_reserved),
+    };
+    size_t size;
     xen_iommu_op_t nat;
+    unsigned int u;
+    int32_t status;
     int rc;
 
-    if ( buf->size < sizeof(cmp) )
+    offset = offsetof(struct compat_iommu_op, u);
+
+    if ( buf->size < offset )
         return -EFAULT;
 
-    if ( copy_from_compat((void *)&cmp, buf->h, sizeof(cmp)) )
+    if ( copy_from_compat((void *)&cmp, buf->h, offset) )
         return -EFAULT;
 
     if ( cmp.pad )
@@ -119,17 +206,82 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
     if ( rc )
         return rc;
 
+    if ( cmp.op >= ARRAY_SIZE(op_size) )
+        return -EOPNOTSUPP;
+
+    size = op_size[array_index_nospec(cmp.op, ARRAY_SIZE(op_size))];
+    if ( buf->size < offset + size )
+        return -EFAULT;
+
+    if ( copy_from_compat_offset((void *)&cmp.u, buf->h, offset, size) )
+        return -EFAULT;
+
+    /*
+     * The xlat magic doesn't quite know how to handle the union so
+     * we need to fix things up here.
+     */
+#define XLAT_iommu_op_u_query_reserved XEN_IOMMUOP_query_reserved
+    u = cmp.op;
+
+#define XLAT_iommu_op_query_reserved_HNDL_ranges(_d_, _s_)            \
+    do                                                                \
+    {                                                                 \
+        if ( !compat_handle_is_null((_s_)->ranges) )                  \
+        {                                                             \
+            unsigned int *nr_entries = COMPAT_ARG_XLAT_VIRT_BASE;     \
+            xen_iommu_reserved_range_t *ranges =                      \
+                (void *)(nr_entries + 1);                             \
+                                                                      \
+            if ( sizeof(*nr_entries) +                                \
+                 (sizeof(*ranges) * (_s_)->nr_entries) >              \
+                 COMPAT_ARG_XLAT_SIZE )                               \
+                return -E2BIG;                                        \
+                                                                      \
+            *nr_entries = (_s_)->nr_entries;                          \
+            set_xen_guest_handle((_d_)->ranges, ranges);              \
+        }                                                             \
+        else                                                          \
+            set_xen_guest_handle((_d_)->ranges, NULL);                \
+    } while (false)
+
     XLAT_iommu_op(&nat, &cmp);
 
+#undef XLAT_iommu_op_query_reserved_HNDL_ranges
+
     iommu_op(&nat);
 
+    status = nat.status;
+
+#define XLAT_iommu_op_query_reserved_HNDL_ranges(_d_, _s_)               \
+    do                                                                   \
+    {                                                                    \
+        if ( !compat_handle_is_null((_d_)->ranges) )                     \
+        {                                                                \
+            unsigned int *nr_entries = COMPAT_ARG_XLAT_VIRT_BASE;        \
+            compat_iommu_reserved_range_t *ranges =                      \
+                (void *)(nr_entries + 1);                                \
+            unsigned int nr =                                            \
+                min_t(unsigned int, (_d_)->nr_entries, *nr_entries);     \
+                                                                         \
+            if ( __copy_to_compat_offset((_d_)->ranges, 0, ranges, nr) ) \
+                status = -EFAULT;                                        \
+        }                                                                \
+    } while (false)
+
     XLAT_iommu_op(&cmp, &nat);
 
+    /* status will have been modified if __copy_to_compat_offset() failed */
+    cmp.status = status;
+
+#undef XLAT_iommu_op_query_reserved_HNDL_ranges
+
     if ( __copy_field_to_compat(compat_handle_cast(buf->h,
                                                    compat_iommu_op_t),
                                 &cmp, status) )
         return -EFAULT;
 
+#undef XLAT_iommu_op_u_query_reserved
+
     return 0;
 }
 
diff --git a/xen/include/public/iommu_op.h b/xen/include/public/iommu_op.h
index c3b68f665a..ade404a877 100644
--- a/xen/include/public/iommu_op.h
+++ b/xen/include/public/iommu_op.h
@@ -25,11 +25,50 @@
 
 #include "xen.h"
 
+typedef uint64_t xen_bfn_t;
+
+/* Structure describing a single range reserved in the IOMMU */
+struct xen_iommu_reserved_range {
+    xen_bfn_t start_bfn;
+    uint32_t nr_frames;
+    uint32_t pad;
+};
+typedef struct xen_iommu_reserved_range xen_iommu_reserved_range_t;
+DEFINE_XEN_GUEST_HANDLE(xen_iommu_reserved_range_t);
+
+/*
+ * XEN_IOMMUOP_query_reserved: Query ranges reserved in the IOMMU.
+ */
+#define XEN_IOMMUOP_query_reserved 1
+
+struct xen_iommu_op_query_reserved {
+    /*
+     * IN/OUT - On entry this is the number of entries available
+     *          in the ranges array below.
+     *          On exit this is the actual number of reserved ranges.
+     */
+    uint32_t nr_entries;
+    uint32_t pad;
+    /*
+     * OUT - This array is populated with reserved ranges. If it is
+     *       not sufficiently large then available entries are populated,
+     *       but the op status code will be set to -ENOBUFS.
+     *       It is permissible to set this to NULL if nr_entries is also
+     *       set to zero. In this case, on exit, nr_entries will still be
+     *       set to the actual number of reserved ranges but the status
+     *       code will be set to zero.
+     */
+    XEN_GUEST_HANDLE(xen_iommu_reserved_range_t) ranges;
+};
+
 struct xen_iommu_op {
     uint16_t op;    /* op type */
     uint16_t pad;
     int32_t status; /* op completion status: */
                     /* 0 for success otherwise, negative errno */
+    union {
+        struct xen_iommu_op_query_reserved query_reserved;
+    } u;
 };
 typedef struct xen_iommu_op xen_iommu_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_iommu_op_t);
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 3b15c18c4e..d2f9b1034b 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -79,6 +79,8 @@
 ?	vcpu_hvm_x86_64			hvm/hvm_vcpu.h
 !	iommu_op			iommu_op.h
 !	iommu_op_buf			iommu_op.h
+!	iommu_op_query_reserved		iommu_op.h
+?	iommu_reserved_range		iommu_op.h
 ?	kexec_exec			kexec.h
 !	kexec_image			kexec.h
 !	kexec_range			kexec.h
-- 
2.11.0



* [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-08-23  9:46 [PATCH v6 00/14] paravirtual IOMMU interface Paul Durrant
                   ` (6 preceding siblings ...)
  2018-08-23  9:47 ` [PATCH v6 07/14] x86: add iommu_op to query reserved ranges Paul Durrant
@ 2018-08-23  9:47 ` Paul Durrant
  2018-09-07 11:11   ` Jan Beulich
  2018-08-23  9:47 ` [PATCH v6 09/14] mm / iommu: include need_iommu() test in iommu_use_hap_pt() Paul Durrant
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-08-23  9:47 UTC (permalink / raw)
  To: xen-devel; +Cc: Kevin Tian, Paul Durrant, George Dunlap, Jan Beulich

This patch adds a new method to the VT-d IOMMU implementation to find the
MFN currently mapped by the specified BFN, along with a wrapper function
in generic IOMMU code to call the implementation if it exists.

This functionality will be used by a subsequent patch.
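
As a usage sketch (the caller shown is hypothetical, not part of this
series), the wrapper follows the usual convention of returning 0 on
success with results in its out-parameters:

    mfn_t mfn;
    unsigned int flags;
    int rc = iommu_lookup_page(d, bfn, &mfn, &flags);

    if ( !rc )
        ; /* bfn maps to mfn; flags has IOMMUF_readable/IOMMUF_writable */
    else if ( rc == -ENOENT )
        ; /* no mapping currently present at bfn */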

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: George Dunlap <george.dunlap@citrix.com>

v3:
 - Addressed comments from George.

v2:
 - Addressed some comments from Jan.
---
 xen/drivers/passthrough/iommu.c     | 11 +++++++++++
 xen/drivers/passthrough/vtd/iommu.c | 34 ++++++++++++++++++++++++++++++++++
 xen/drivers/passthrough/vtd/iommu.h |  3 +++
 xen/include/xen/iommu.h             |  4 ++++
 4 files changed, 52 insertions(+)

diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 9b63e9efee..7808fa85d0 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -305,6 +305,17 @@ int iommu_unmap_page(struct domain *d, bfn_t bfn)
     return rc;
 }
 
+int iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
+                      unsigned int *flags)
+{
+    const struct domain_iommu *hd = dom_iommu(d);
+
+    if ( !iommu_enabled || !hd->platform_ops )
+        return -EOPNOTSUPP;
+
+    return hd->platform_ops->lookup_page(d, bfn, mfn, flags);
+}
+
 static void iommu_free_pagetables(unsigned long unused)
 {
     do {
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 282e227414..8cd3b59aa0 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1830,6 +1830,39 @@ static int __must_check intel_iommu_unmap_page(struct domain *d,
     return dma_pte_clear_one(d, bfn_to_baddr(bfn));
 }
 
+static int intel_iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
+                                   unsigned int *flags)
+{
+    struct domain_iommu *hd = dom_iommu(d);
+    struct dma_pte *page = NULL, *pte = NULL, val;
+    u64 pg_maddr;
+
+    spin_lock(&hd->arch.mapping_lock);
+
+    pg_maddr = addr_to_dma_page_maddr(d, bfn_to_baddr(bfn), 0);
+    if ( pg_maddr == 0 )
+    {
+        spin_unlock(&hd->arch.mapping_lock);
+        return -ENOENT;
+    }
+
+    page = map_vtd_domain_page(pg_maddr);
+    pte = page + (bfn_x(bfn) & LEVEL_MASK);
+    val = *pte;
+
+    unmap_vtd_domain_page(page);
+    spin_unlock(&hd->arch.mapping_lock);
+
+    if ( !dma_pte_present(val) )
+        return -ENOENT;
+
+    *mfn = maddr_to_mfn(dma_pte_addr(val));
+    *flags = dma_pte_read(val) ? IOMMUF_readable : 0;
+    *flags |= dma_pte_write(val) ? IOMMUF_writable : 0;
+
+    return 0;
+}
+
 int iommu_pte_flush(struct domain *d, uint64_t bfn, uint64_t *pte,
                     int order, int present)
 {
@@ -2661,6 +2694,7 @@ const struct iommu_ops intel_iommu_ops = {
     .teardown = iommu_domain_teardown,
     .map_page = intel_iommu_map_page,
     .unmap_page = intel_iommu_unmap_page,
+    .lookup_page = intel_iommu_lookup_page,
     .free_page_table = iommu_free_page_table,
     .reassign_device = reassign_device_ownership,
     .get_device_group_id = intel_iommu_group_id,
diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index 72c1a2e3cd..47bdfcb5ea 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -272,6 +272,9 @@ struct dma_pte {
 #define dma_set_pte_prot(p, prot) do { \
         (p).val = ((p).val & ~DMA_PTE_PROT) | ((prot) & DMA_PTE_PROT); \
     } while (0)
+#define dma_pte_prot(p) ((p).val & DMA_PTE_PROT)
+#define dma_pte_read(p) (dma_pte_prot(p) & DMA_PTE_READ)
+#define dma_pte_write(p) (dma_pte_prot(p) & DMA_PTE_WRITE)
 #define dma_pte_addr(p) ((p).val & PADDR_MASK & PAGE_MASK_4K)
 #define dma_set_pte_addr(p, addr) do {\
             (p).val |= ((addr) & PAGE_MASK_4K); } while (0)
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 8cdb530d49..cdd75acf62 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -100,6 +100,8 @@ void iommu_teardown(struct domain *d);
 int __must_check iommu_map_page(struct domain *d, bfn_t bfn,
                                 mfn_t mfn, unsigned int flags);
 int __must_check iommu_unmap_page(struct domain *d, bfn_t bfn);
+int __must_check iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
+                                   unsigned int *flags);
 
 enum iommu_feature
 {
@@ -196,6 +198,8 @@ struct iommu_ops {
     int __must_check (*map_page)(struct domain *d, bfn_t bfn, mfn_t mfn,
                                  unsigned int flags);
     int __must_check (*unmap_page)(struct domain *d, bfn_t bfn);
+    int __must_check (*lookup_page)(struct domain *d, bfn_t bfn, mfn_t *mfn,
+                                    unsigned int *flags);
     void (*free_page_table)(struct page_info *);
 #ifdef CONFIG_X86
     void (*update_ire_from_apic)(unsigned int apic, unsigned int reg, unsigned int value);
-- 
2.11.0



* [PATCH v6 09/14] mm / iommu: include need_iommu() test in iommu_use_hap_pt()
  2018-08-23  9:46 [PATCH v6 00/14] paravirtual IOMMU interface Paul Durrant
                   ` (7 preceding siblings ...)
  2018-08-23  9:47 ` [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops Paul Durrant
@ 2018-08-23  9:47 ` Paul Durrant
  2018-09-07 11:20   ` Jan Beulich
  2018-08-23  9:47 ` [PATCH v6 10/14] mm / iommu: split need_iommu() into has_iommu_pt() and need_iommu_pt_sync() Paul Durrant
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-08-23  9:47 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Jun Nakajima, George Dunlap, Andrew Cooper,
	Julien Grall, Paul Durrant, Jan Beulich

The name 'iommu_use_hap_pt' suggests that the P2M table is in use as the
domain's IOMMU pagetable which, prior to this patch, was not strictly true
since the macro did not test whether the domain actually has IOMMU
mappings.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
---
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>

v4:
 - New in v4.
---
 xen/arch/x86/mm/p2m-ept.c       | 6 +++---
 xen/arch/x86/mm/p2m-pt.c        | 6 +++---
 xen/arch/x86/mm/p2m.c           | 2 +-
 xen/drivers/passthrough/iommu.c | 2 +-
 xen/include/asm-arm/iommu.h     | 2 +-
 xen/include/asm-x86/iommu.h     | 5 +++--
 6 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index c9e1a7e288..9ce9abd2b1 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -863,12 +863,12 @@ out:
         ept_sync_domain(p2m);
 
     /* For host p2m, may need to change VT-d page table.*/
-    if ( rc == 0 && p2m_is_hostp2m(p2m) && need_iommu(d) &&
+    if ( rc == 0 && p2m_is_hostp2m(p2m) &&
          need_modify_vtd_table )
     {
-        if ( iommu_hap_pt_share )
+        if ( iommu_use_hap_pt(d) )
             rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
-        else
+        else if ( need_iommu(d) )
         {
             bfn_t bfn = _bfn(gfn);
 
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 3b8d184054..f265a325f6 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -677,8 +677,8 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
          && (gfn + (1UL << page_order) - 1 > p2m->max_mapped_pfn) )
         p2m->max_mapped_pfn = gfn + (1UL << page_order) - 1;
 
-    if ( iommu_enabled && need_iommu(p2m->domain) &&
-         (iommu_old_flags != iommu_pte_flags || old_mfn != mfn_x(mfn)) )
+    if ( iommu_enabled && (iommu_old_flags != iommu_pte_flags ||
+                           old_mfn != mfn_x(mfn)) )
     {
         ASSERT(rc == 0);
 
@@ -687,7 +687,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
             if ( iommu_old_flags )
                 amd_iommu_flush_pages(p2m->domain, gfn, page_order);
         }
-        else
+        else if ( need_iommu(p2m->domain) )
         {
             bfn_t bfn = _bfn(gfn);
 
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 755329366e..485f7583ba 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -2090,7 +2090,7 @@ static unsigned int mmio_order(const struct domain *d,
      * - exclude PV guests, should execution reach this code for such.
      * So be careful when altering this.
      */
-    if ( !need_iommu(d) || !iommu_use_hap_pt(d) ||
+    if ( !iommu_use_hap_pt(d) ||
          (start_fn & ((1UL << PAGE_ORDER_2M) - 1)) || !(nr >> PAGE_ORDER_2M) )
         return PAGE_ORDER_4K;
 
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 7808fa85d0..c262a59877 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -453,7 +453,7 @@ int iommu_do_domctl(
 
 void iommu_share_p2m_table(struct domain* d)
 {
-    if ( iommu_enabled && iommu_use_hap_pt(d) )
+    if ( iommu_use_hap_pt(d) )
         iommu_get_ops()->share_p2m(d);
 }
 
diff --git a/xen/include/asm-arm/iommu.h b/xen/include/asm-arm/iommu.h
index 57d9b1e14a..8d1506c6f7 100644
--- a/xen/include/asm-arm/iommu.h
+++ b/xen/include/asm-arm/iommu.h
@@ -21,7 +21,7 @@ struct arch_iommu
 };
 
 /* Always share P2M Table between the CPU and the IOMMU */
-#define iommu_use_hap_pt(d) (1)
+#define iommu_use_hap_pt(d) (need_iommu(d))
 
 const struct iommu_ops *iommu_get_ops(void);
 void __init iommu_set_ops(const struct iommu_ops *ops);
diff --git a/xen/include/asm-x86/iommu.h b/xen/include/asm-x86/iommu.h
index 14ad0489a6..0e7d66be54 100644
--- a/xen/include/asm-x86/iommu.h
+++ b/xen/include/asm-x86/iommu.h
@@ -77,8 +77,9 @@ static inline int iommu_hardware_setup(void)
     return -ENODEV;
 }
 
-/* Does this domain have a P2M table we can use as its IOMMU pagetable? */
-#define iommu_use_hap_pt(d) (hap_enabled(d) && iommu_hap_pt_share)
+/* Are we using the domain P2M table as its IOMMU pagetable? */
+#define iommu_use_hap_pt(d) \
+    (hap_enabled(d) && need_iommu(d) && iommu_hap_pt_share)
 
 void iommu_update_ire_from_apic(unsigned int apic, unsigned int reg, unsigned int value);
 unsigned int iommu_read_apic_from_ire(unsigned int apic, unsigned int reg);
-- 
2.11.0



* [PATCH v6 10/14] mm / iommu: split need_iommu() into has_iommu_pt() and need_iommu_pt_sync()
  2018-08-23  9:46 [PATCH v6 00/14] paravirtual IOMMU interface Paul Durrant
                   ` (8 preceding siblings ...)
  2018-08-23  9:47 ` [PATCH v6 09/14] mm / iommu: include need_iommu() test in iommu_use_hap_pt() Paul Durrant
@ 2018-08-23  9:47 ` Paul Durrant
  2018-08-23 11:10   ` Razvan Cojocaru
  2018-09-11 14:31   ` Jan Beulich
  2018-08-23  9:47 ` [PATCH v6 11/14] x86: add iommu_op to enable modification of IOMMU mappings Paul Durrant
                   ` (3 subsequent siblings)
  13 siblings, 2 replies; 111+ messages in thread
From: Paul Durrant @ 2018-08-23  9:47 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
	Razvan Cojocaru, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, Paul Durrant, Tamas K Lengyel,
	Jan Beulich, Brian Woods, Suravee Suthikulpanit

The name 'need_iommu()' is a little confusing as it suggests a domain needs
to use the IOMMU but that something might not yet be set up, when in fact
it represents a tri-state value (not a boolean as might be expected) where
-1 means 'IOMMU mappings are being set up' and 1 means 'IOMMU mappings have
been fully set up'.

Two different meanings are also inferred from the macro in various
places in the code:

- Some callers want to test whether a domain has IOMMU mappings at all
- Some callers want to test whether they need to synchronize the domain's
  P2M and IOMMU mappings

This patch replaces the 'need_iommu' tri-state value with a defined
enumeration and adds a boolean flag 'need_sync' to separate these meanings,
and places both of these in struct domain_iommu, rather than directly in
struct domain.

This patch also creates two new boolean macros:

- 'has_iommu_pt()' evaluates to true if a domain has IOMMU mappings, even
  if they are still under construction.
- 'need_iommu_pt_sync()' evaluates to true if a domain requires explicit
  synchronization of the P2M and IOMMU mappings.

All callers of need_iommu() are then modified to use the macro appropriate
to what they are trying to test.
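
As a hypothetical illustration of which macro a caller should now use:

    /* Refuse a feature that is incompatible with the mere existence of
     * IOMMU mappings, even ones still under construction... */
    if ( has_iommu_pt(d) )
        return -EXDEV;

    /* ...but only mirror a P2M change into the IOMMU when explicit
     * synchronization is actually required. */
    if ( need_iommu_pt_sync(d) )
        rc = iommu_map_page(d, bfn, mfn,
                            IOMMUF_readable | IOMMUF_writable);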

NOTE: The test of need_iommu(d) in the AMD IOMMU initialization code has
      been replaced with a test of iommu_dom0_strict since this more
      accurately reflects the meaning of the test and brings it into
      line with a similar test in the Intel VT-d code.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Tamas K Lengyel <tamas@tklengyel.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Brian Woods <brian.woods@amd.com>

v6:
 - Deal with need_iommu being tri-state.
 - Change the name of 'sync_iommu_pt' to 'need_iommu_pt_sync' and make
   sure it is set as soon as the mappings are under construction.
 - Not adding Razvan's A-b because of change.

v4:
 - New in v4.
---
 xen/arch/arm/p2m.c                          |  2 +-
 xen/arch/x86/hvm/mtrr.c                     |  5 ++--
 xen/arch/x86/mm.c                           |  2 +-
 xen/arch/x86/mm/mem_sharing.c               |  2 +-
 xen/arch/x86/mm/p2m-ept.c                   |  2 +-
 xen/arch/x86/mm/p2m-pt.c                    |  2 +-
 xen/arch/x86/mm/p2m.c                       |  8 +++---
 xen/arch/x86/mm/paging.c                    |  2 +-
 xen/arch/x86/x86_64/mm.c                    |  3 ++-
 xen/common/memory.c                         |  6 ++---
 xen/common/vm_event.c                       |  2 +-
 xen/drivers/passthrough/amd/pci_amd_iommu.c |  2 +-
 xen/drivers/passthrough/device_tree.c       | 21 ++++++++--------
 xen/drivers/passthrough/iommu.c             | 39 +++++++++++++++++++++--------
 xen/drivers/passthrough/pci.c               |  6 ++---
 xen/drivers/passthrough/x86/iommu.c         |  2 --
 xen/include/asm-arm/grant_table.h           |  2 +-
 xen/include/asm-arm/iommu.h                 |  2 +-
 xen/include/asm-x86/grant_table.h           |  2 +-
 xen/include/asm-x86/iommu.h                 |  2 +-
 xen/include/xen/iommu.h                     | 17 +++++++++++++
 xen/include/xen/sched.h                     |  9 +++----
 22 files changed, 87 insertions(+), 53 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 4a8cac5050..abd5920877 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -955,7 +955,7 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
     if ( lpae_is_valid(orig_pte) && entry->p2m.base != orig_pte.p2m.base )
         p2m_free_entry(p2m, orig_pte, level);
 
-    if ( need_iommu(p2m->domain) &&
+    if ( need_iommu_pt_sync(p2m->domain) &&
          (lpae_is_valid(orig_pte) || lpae_is_valid(*entry)) )
     {
         rc = iommu_iotlb_flush(p2m->domain, _bfn(gfn_x(sgfn)),
diff --git a/xen/arch/x86/hvm/mtrr.c b/xen/arch/x86/hvm/mtrr.c
index edfe5cd2b2..44c65ca717 100644
--- a/xen/arch/x86/hvm/mtrr.c
+++ b/xen/arch/x86/hvm/mtrr.c
@@ -793,7 +793,8 @@ HVM_REGISTER_SAVE_RESTORE(MTRR, hvm_save_mtrr_msr, hvm_load_mtrr_msr,
 
 void memory_type_changed(struct domain *d)
 {
-    if ( need_iommu(d) && d->vcpu && d->vcpu[0] )
+    if ( (need_iommu_pt_sync(d) || iommu_use_hap_pt(d)) &&
+         d->vcpu && d->vcpu[0] )
     {
         p2m_memory_type_changed(d);
         flush_all(FLUSH_CACHE);
@@ -841,7 +842,7 @@ int epte_get_entry_emt(struct domain *d, unsigned long gfn, mfn_t mfn,
         return MTRR_TYPE_UNCACHABLE;
     }
 
-    if ( !need_iommu(d) && !cache_flush_permitted(d) )
+    if ( !has_iommu_pt(d) && !cache_flush_permitted(d) )
     {
         *ipat = 1;
         return MTRR_TYPE_WRBACK;
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index f598674f0c..1b51ec7865 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2750,7 +2750,7 @@ static int _get_page_type(struct page_info *page, unsigned long type,
     {
         /* Special pages should not be accessible from devices. */
         struct domain *d = page_get_owner(page);
-        if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
+        if ( d && is_pv_domain(d) && unlikely(need_iommu_pt_sync(d)) )
         {
             mfn_t mfn = page_to_mfn(page);
 
diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index fad8a9df13..c54845275f 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -1610,7 +1610,7 @@ int mem_sharing_domctl(struct domain *d, struct xen_domctl_mem_sharing_op *mec)
         case XEN_DOMCTL_MEM_SHARING_CONTROL:
         {
             rc = 0;
-            if ( unlikely(need_iommu(d) && mec->u.enable) )
+            if ( unlikely(has_iommu_pt(d) && mec->u.enable) )
                 rc = -EXDEV;
             else
                 d->arch.hvm_domain.mem_sharing_enabled = mec->u.enable;
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 9ce9abd2b1..464bd5d805 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -868,7 +868,7 @@ out:
     {
         if ( iommu_use_hap_pt(d) )
             rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
-        else if ( need_iommu(d) )
+        else if ( need_iommu_pt_sync(d) )
         {
             bfn_t bfn = _bfn(gfn);
 
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index f265a325f6..4fe3437a97 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -687,7 +687,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, gfn_t gfn_, mfn_t mfn,
             if ( iommu_old_flags )
                 amd_iommu_flush_pages(p2m->domain, gfn, page_order);
         }
-        else if ( need_iommu(p2m->domain) )
+        else if ( need_iommu_pt_sync(p2m->domain) )
         {
             bfn_t bfn = _bfn(gfn);
 
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 485f7583ba..9a0b9e373b 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -713,7 +713,7 @@ p2m_remove_page(struct p2m_domain *p2m, unsigned long gfn_l, unsigned long mfn,
     {
         int rc = 0;
 
-        if ( need_iommu(p2m->domain) )
+        if ( need_iommu_pt_sync(p2m->domain) )
         {
             bfn_t bfn = _bfn(mfn);
 
@@ -777,7 +777,7 @@ guest_physmap_add_entry(struct domain *d, gfn_t gfn, mfn_t mfn,
 
     if ( !paging_mode_translate(d) )
     {
-        if ( need_iommu(d) && t == p2m_ram_rw )
+        if ( need_iommu_pt_sync(d) && t == p2m_ram_rw )
         {
             bfn_t bfn = _bfn(mfn_x(mfn));
 
@@ -1163,7 +1163,7 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l,
 
     if ( !paging_mode_translate(d) )
     {
-        if ( !need_iommu(d) )
+        if ( !need_iommu_pt_sync(d) )
             return 0;
 
         ret = iommu_map_page(d, _bfn(gfn_l), _mfn(gfn_l),
@@ -1259,7 +1259,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l)
 
     if ( !paging_mode_translate(d) )
     {
-        if ( !need_iommu(d) )
+        if ( !need_iommu_pt_sync(d) )
             return 0;
 
         ret = iommu_unmap_page(d, _bfn(gfn_l));
diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
index dcee496eb0..a4b545c905 100644
--- a/xen/arch/x86/mm/paging.c
+++ b/xen/arch/x86/mm/paging.c
@@ -213,7 +213,7 @@ int paging_log_dirty_enable(struct domain *d, bool_t log_global)
 {
     int ret;
 
-    if ( need_iommu(d) && log_global )
+    if ( has_iommu_pt(d) && log_global )
     {
         /*
          * Refuse to turn on global log-dirty mode
diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index cc58e4cef4..c937f81f0a 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -1426,7 +1426,8 @@ int memory_add(unsigned long spfn, unsigned long epfn, unsigned int pxm)
     if ( ret )
         goto destroy_m2p;
 
-    if ( iommu_enabled && !iommu_passthrough && !need_iommu(hardware_domain) )
+    if ( iommu_enabled && !iommu_passthrough &&
+         !need_iommu_pt_sync(hardware_domain) )
     {
         for ( i = spfn; i < epfn; i++ )
             if ( iommu_map_page(hardware_domain, _bfn(i), _mfn(i),
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 2e4cd8cdfd..c4671b457b 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -805,8 +805,8 @@ int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap *xatp,
     xatp->size -= start;
 
 #ifdef CONFIG_HAS_PASSTHROUGH
-    if ( need_iommu(d) )
-        this_cpu(iommu_dont_flush_iotlb) = 1;
+    if ( need_iommu_pt_sync(d) || iommu_use_hap_pt(d) )
+        this_cpu(iommu_dont_flush_iotlb) = 1;
 #endif
 
     while ( xatp->size > done )
@@ -828,7 +828,7 @@ int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap *xatp,
     }
 
 #ifdef CONFIG_HAS_PASSTHROUGH
-    if ( need_iommu(d) )
+    if ( need_iommu_pt_sync(d) || iommu_use_hap_pt(d) )
     {
         int ret;
 
diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
index 144ab81c86..ec3dfd1dae 100644
--- a/xen/common/vm_event.c
+++ b/xen/common/vm_event.c
@@ -644,7 +644,7 @@ int vm_event_domctl(struct domain *d, struct xen_domctl_vm_event_op *vec,
 
             /* No paging if iommu is used */
             rc = -EMLINK;
-            if ( unlikely(need_iommu(d)) )
+            if ( unlikely(has_iommu_pt(d)) )
                 break;
 
             rc = -EXDEV;
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index eea22c3d0d..752751b103 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -256,7 +256,7 @@ static void __hwdom_init amd_iommu_hwdom_init(struct domain *d)
     if ( allocate_domain_resources(dom_iommu(d)) )
         BUG();
 
-    if ( !iommu_passthrough && !need_iommu(d) )
+    if ( !iommu_passthrough && !iommu_dom0_strict )
     {
         int rc = 0;
 
diff --git a/xen/drivers/passthrough/device_tree.c b/xen/drivers/passthrough/device_tree.c
index 421f003438..b6eaae7283 100644
--- a/xen/drivers/passthrough/device_tree.c
+++ b/xen/drivers/passthrough/device_tree.c
@@ -40,17 +40,16 @@ int iommu_assign_dt_device(struct domain *d, struct dt_device_node *dev)
     if ( !list_empty(&dev->domain_list) )
         goto fail;
 
-    if ( need_iommu(d) <= 0 )
-    {
-        /*
-         * The hwdom is forced to use IOMMU for protecting assigned
-         * device. Therefore the IOMMU data is already set up.
-         */
-        ASSERT(!is_hardware_domain(d));
-        rc = iommu_construct(d);
-        if ( rc )
-            goto fail;
-    }
+    /*
+     * The hwdom is forced to use IOMMU for protecting assigned
+     * device. Therefore the IOMMU data is already set up.
+     */
+    ASSERT(!is_hardware_domain(d) ||
+           hd->status == IOMMU_STATUS_initialized);
+
+    rc = iommu_construct(d);
+    if ( rc )
+        goto fail;
 
     /* The flag field doesn't matter to DT device. */
     rc = hd->platform_ops->assign_device(d, 0, dt_to_dev(dev), 0);
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index c262a59877..848d6c7d69 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -171,7 +171,7 @@ static void __hwdom_init check_hwdom_reqs(struct domain *d)
 
 void __hwdom_init iommu_hwdom_init(struct domain *d)
 {
-    const struct domain_iommu *hd = dom_iommu(d);
+    struct domain_iommu *hd = dom_iommu(d);
 
     check_hwdom_reqs(d);
 
@@ -179,8 +179,10 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
         return;
 
     register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m table", 0);
-    d->need_iommu = !!iommu_dom0_strict;
-    if ( need_iommu(d) && !iommu_use_hap_pt(d) )
+
+    hd->status = IOMMU_STATUS_initializing;
+    hd->need_sync = !!iommu_dom0_strict && !iommu_use_hap_pt(d);
+    if ( need_iommu_pt_sync(d) )
     {
         struct page_info *page;
         unsigned int i = 0;
@@ -213,35 +215,51 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
     }
 
     hd->platform_ops->hwdom_init(d);
+
+    hd->status = IOMMU_STATUS_initialized;
 }
 
 void iommu_teardown(struct domain *d)
 {
-    const struct domain_iommu *hd = dom_iommu(d);
+    struct domain_iommu *hd = dom_iommu(d);
 
-    d->need_iommu = 0;
+    hd->status = IOMMU_STATUS_disabled;
     hd->platform_ops->teardown(d);
     tasklet_schedule(&iommu_pt_cleanup_tasklet);
 }
 
 int iommu_construct(struct domain *d)
 {
-    if ( need_iommu(d) > 0 )
+    struct domain_iommu *hd = dom_iommu(d);
+
+    if ( hd->status == IOMMU_STATUS_initialized )
         return 0;
 
-    if ( !iommu_use_hap_pt(d) )
+    if ( !iommu_use_hap_pt(d) && hd->status < IOMMU_STATUS_initialized )
     {
         int rc;
 
+        hd->status = IOMMU_STATUS_initializing;
+        hd->need_sync = true;
+
         rc = arch_iommu_populate_page_table(d);
         if ( rc )
+        {
+            if ( rc != -ERESTART )
+            {
+                hd->need_sync = false;
+                hd->status = IOMMU_STATUS_disabled;
+            }
+
             return rc;
+        }
     }
 
-    d->need_iommu = 1;
+    hd->status = IOMMU_STATUS_initialized;
+
     /*
      * There may be dirty cache lines when a device is assigned
-     * and before need_iommu(d) becoming true, this will cause
+     * and before has_iommu_pt(d) becoming true, this will cause
      * memory_type_changed lose effect if memory type changes.
      * Call memory_type_changed here to amend this.
      */
@@ -500,7 +518,8 @@ static void iommu_dump_p2m_table(unsigned char key)
     ops = iommu_get_ops();
     for_each_domain(d)
     {
-        if ( is_hardware_domain(d) || need_iommu(d) <= 0 )
+        if ( is_hardware_domain(d) ||
+             dom_iommu(d)->status < IOMMU_STATUS_initialized )
             continue;
 
         if ( iommu_use_hap_pt(d) )
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index d1adffa095..e10f0bba1c 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1416,7 +1416,7 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag)
 
     /* Prevent device assign if mem paging or mem sharing have been 
      * enabled for this domain */
-    if ( unlikely(!need_iommu(d) &&
+    if ( unlikely(!has_iommu_pt(d) &&
             (d->arch.hvm_domain.mem_sharing_enabled ||
              vm_event_check_ring(d->vm_event_paging) ||
              p2m_get_hostp2m(d)->global_logdirty)) )
@@ -1460,7 +1460,7 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag)
     }
 
  done:
-    if ( !has_arch_pdevs(d) && need_iommu(d) )
+    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
         iommu_teardown(d);
     pcidevs_unlock();
 
@@ -1510,7 +1510,7 @@ int deassign_device(struct domain *d, u16 seg, u8 bus, u8 devfn)
 
     pdev->fault.count = 0;
 
-    if ( !has_arch_pdevs(d) && need_iommu(d) )
+    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
         iommu_teardown(d);
 
     return ret;
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 09573722bd..1b3d2a2c8f 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -45,8 +45,6 @@ int arch_iommu_populate_page_table(struct domain *d)
     struct page_info *page;
     int rc = 0, n = 0;
 
-    d->need_iommu = -1;
-
     this_cpu(iommu_dont_flush_iotlb) = 1;
     spin_lock(&d->page_alloc_lock);
 
diff --git a/xen/include/asm-arm/grant_table.h b/xen/include/asm-arm/grant_table.h
index 5113b9156c..91c7681be9 100644
--- a/xen/include/asm-arm/grant_table.h
+++ b/xen/include/asm-arm/grant_table.h
@@ -92,7 +92,7 @@ static inline unsigned int gnttab_dom0_max(void)
     gfn_x(((i) >= nr_status_frames(t)) ? INVALID_GFN : (t)->arch.status_gfn[i])
 
 #define gnttab_need_iommu_mapping(d)                    \
-    (is_domain_direct_mapped(d) && need_iommu(d))
+    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
 
 #endif /* __ASM_GRANT_TABLE_H__ */
 /*
diff --git a/xen/include/asm-arm/iommu.h b/xen/include/asm-arm/iommu.h
index 8d1506c6f7..f6df32f860 100644
--- a/xen/include/asm-arm/iommu.h
+++ b/xen/include/asm-arm/iommu.h
@@ -21,7 +21,7 @@ struct arch_iommu
 };
 
 /* Always share P2M Table between the CPU and the IOMMU */
-#define iommu_use_hap_pt(d) (need_iommu(d))
+#define iommu_use_hap_pt(d) (has_iommu_pt(d))
 
 const struct iommu_ops *iommu_get_ops(void);
 void __init iommu_set_ops(const struct iommu_ops *ops);
diff --git a/xen/include/asm-x86/grant_table.h b/xen/include/asm-x86/grant_table.h
index 76ec5dda2c..26b5f7bc53 100644
--- a/xen/include/asm-x86/grant_table.h
+++ b/xen/include/asm-x86/grant_table.h
@@ -99,6 +99,6 @@ static inline void gnttab_clear_flag(unsigned int nr, uint16_t *st)
 #define gnttab_release_host_mappings(domain) ( paging_mode_external(domain) )
 
 #define gnttab_need_iommu_mapping(d)                \
-    (!paging_mode_translate(d) && need_iommu(d))
+    (!paging_mode_translate(d) && need_iommu_pt_sync(d))
 
 #endif /* __ASM_GRANT_TABLE_H__ */
diff --git a/xen/include/asm-x86/iommu.h b/xen/include/asm-x86/iommu.h
index 0e7d66be54..291859520a 100644
--- a/xen/include/asm-x86/iommu.h
+++ b/xen/include/asm-x86/iommu.h
@@ -79,7 +79,7 @@ static inline int iommu_hardware_setup(void)
 
 /* Are we using the domain P2M table as its IOMMU pagetable? */
 #define iommu_use_hap_pt(d) \
-    (hap_enabled(d) && need_iommu(d) && iommu_hap_pt_share)
+    (hap_enabled(d) && has_iommu_pt(d) && iommu_hap_pt_share)
 
 void iommu_update_ire_from_apic(unsigned int apic, unsigned int reg, unsigned int value);
 unsigned int iommu_read_apic_from_ire(unsigned int apic, unsigned int reg);
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index cdd75acf62..37052e033f 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -111,6 +111,13 @@ enum iommu_feature
 
 bool_t iommu_has_feature(struct domain *d, enum iommu_feature feature);
 
+enum iommu_status
+{
+    IOMMU_STATUS_disabled,
+    IOMMU_STATUS_initializing,
+    IOMMU_STATUS_initialized
+};
+
 struct domain_iommu {
     struct arch_iommu arch;
 
@@ -130,6 +137,16 @@ struct domain_iommu {
      * must not be modified after initialization.
      */
     struct rangeset *reserved_ranges;
+
+    /* Status of guest IOMMU mappings */
+    enum iommu_status status;
+
+    /*
+     * Does the guest require mappings to be synchronized, to maintain
+     * the default bfn == pfn map? (See the comment on bfn at the top
+     * of include/xen/mm.h.)
+     */
+    bool need_sync;
 };
 
 #define dom_iommu(d)              (&(d)->iommu)
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 381eb6dc8c..39bbd89f1f 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -371,9 +371,6 @@ struct domain
 
 #ifdef CONFIG_HAS_PASSTHROUGH
     struct domain_iommu iommu;
-
-    /* Does this guest need iommu mappings (-1 meaning "being set up")? */
-    s8               need_iommu;
 #endif
     /* is node-affinity automatically computed? */
     bool             auto_node_affinity;
@@ -889,9 +886,11 @@ void watchdog_domain_destroy(struct domain *d);
 #define is_pinned_vcpu(v) ((v)->domain->is_pinned || \
                            cpumask_weight((v)->cpu_hard_affinity) == 1)
 #ifdef CONFIG_HAS_PASSTHROUGH
-#define need_iommu(d)    ((d)->need_iommu)
+#define has_iommu_pt(d) (dom_iommu(d)->status != IOMMU_STATUS_disabled)
+#define need_iommu_pt_sync(d) (dom_iommu(d)->need_sync)
 #else
-#define need_iommu(d)    (0)
+#define has_iommu_pt(d) (0)
+#define need_iommu_pt_sync(d) (0)
 #endif
 
 static inline bool is_vcpu_online(const struct vcpu *v)
-- 
2.11.0



* [PATCH v6 11/14] x86: add iommu_op to enable modification of IOMMU mappings
  2018-08-23  9:46 [PATCH v6 00/14] paravirtual IOMMU interface Paul Durrant
                   ` (9 preceding siblings ...)
  2018-08-23  9:47 ` [PATCH v6 10/14] mm / iommu: split need_iommu() into has_iommu_pt() and need_iommu_pt_sync() Paul Durrant
@ 2018-08-23  9:47 ` Paul Durrant
  2018-09-11 14:48   ` Jan Beulich
  2018-08-23  9:47 ` [PATCH v6 12/14] memory: add get_paged_gfn() as a wrapper Paul Durrant
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-08-23  9:47 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Paul Durrant, Jan Beulich

This patch adds an iommu_op which checks whether it is possible or
safe for a domain to modify its own IOMMU mappings and, if so, creates
a rangeset to track modifications.

The op passes back a capabilities mask. There is only one bit currently
defined for this mask: XEN_IOMMU_CAP_per_device_mappings. This bit is
always left clear by the current implementation but may be set in future
if the implementation is enhanced to support per-device IOMMU mappings.
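
A minimal guest-side sketch of invoking the op (the hypercall wrapper
name hypercall_iommu_op() is an assumption, not part of this series):

    xen_iommu_op_t op = { .op = XEN_IOMMUOP_enable_modification };
    xen_iommu_op_buf_t buf;
    int rc;

    set_xen_guest_handle(buf.h, &op);
    buf.size = sizeof(op);

    rc = hypercall_iommu_op(1, &buf); /* assumed wrapper */
    if ( !rc && !op.status &&
         (op.u.enable_modification.cap & XEN_IOMMU_CAP_per_device_mappings) )
        ; /* would indicate per-device mappings; never set by this series */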

NOTE: The actual map and unmap operations are introduced by subsequent
      patches.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

v6:
 - Add struct xen_iommu_op_enable_modification to hold the capabilities
   mask.

v4:
 - Set sync_iommu_pt to false instead of need_iommu.

v2:
 - New in v2.
---
 xen/common/iommu_op.c           | 53 +++++++++++++++++++++++++++++++++++++++++
 xen/drivers/passthrough/iommu.c |  2 +-
 xen/drivers/passthrough/pci.c   |  4 ++--
 xen/include/public/iommu_op.h   | 20 ++++++++++++++++
 xen/include/xen/iommu.h         |  3 +++
 xen/include/xlat.lst            |  1 +
 6 files changed, 80 insertions(+), 3 deletions(-)

diff --git a/xen/common/iommu_op.c b/xen/common/iommu_op.c
index bcfcd49102..ccbb9b6340 100644
--- a/xen/common/iommu_op.c
+++ b/xen/common/iommu_op.c
@@ -78,6 +78,51 @@ static int iommu_op_query_reserved(struct xen_iommu_op_query_reserved *op)
     return 0;
 }
 
+static int iommu_op_enable_modification(
+    struct xen_iommu_op_enable_modification *op)
+{
+    struct domain *currd = current->domain;
+    struct domain_iommu *iommu = dom_iommu(currd);
+    const struct iommu_ops *ops = iommu->platform_ops;
+
+    if ( op->cap || op->pad )
+        return -EINVAL;
+
+    /*
+     * XEN_IOMMU_CAP_per_device_mappings is not supported yet so we can
+     * leave op->cap alone.
+     */
+
+    /* Has modification already been enabled? */
+    if ( iommu->iommu_op_ranges )
+        return 0;
+
+    /*
+     * The IOMMU mappings cannot be modified if:
+     * - the IOMMU is not enabled, or
+     * - the current domain is dom0 and translation is disabled, or
+     * - HAP is enabled and the IOMMU shares the mappings.
+     */
+    if ( !iommu_enabled ||
+         (is_hardware_domain(currd) && iommu_passthrough) ||
+         iommu_use_hap_pt(currd) )
+        return -EACCES;
+
+    /*
+     * The IOMMU implementation must provide the lookup method if
+     * modification of the mappings is to be supported.
+     */
+    if ( !ops->lookup_page )
+        return -EOPNOTSUPP;
+
+    iommu->iommu_op_ranges = rangeset_new(currd, NULL, 0);
+    if ( !iommu->iommu_op_ranges )
+        return -ENOMEM;
+
+    iommu->need_sync = false; /* Disable synchronization */
+    return 0;
+}
+
 static void iommu_op(xen_iommu_op_t *op)
 {
     switch ( op->op )
@@ -86,6 +131,10 @@ static void iommu_op(xen_iommu_op_t *op)
         op->status = iommu_op_query_reserved(&op->u.query_reserved);
         break;
 
+    case XEN_IOMMUOP_enable_modification:
+        op->status = iommu_op_enable_modification(&op->u.enable_modification);
+        break;
+
     default:
         op->status = -EOPNOTSUPP;
         break;
@@ -98,6 +147,7 @@ int do_one_iommu_op(xen_iommu_op_buf_t *buf)
     size_t offset;
     static const size_t op_size[] = {
         [XEN_IOMMUOP_query_reserved] = sizeof(struct xen_iommu_op_query_reserved),
+        [XEN_IOMMUOP_enable_modification] = sizeof(struct xen_iommu_op_enable_modification),
     };
     size_t size;
     int rc;
@@ -184,6 +234,7 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
     size_t offset;
     static const size_t op_size[] = {
         [XEN_IOMMUOP_query_reserved] = sizeof(struct compat_iommu_op_query_reserved),
+        [XEN_IOMMUOP_enable_modification] = sizeof(struct compat_iommu_op_enable_modification),
     };
     size_t size;
     xen_iommu_op_t nat;
@@ -221,6 +272,7 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
      * we need to fix things up here.
      */
 #define XLAT_iommu_op_u_query_reserved XEN_IOMMUOP_query_reserved
+#define XLAT_iommu_op_u_enable_modification XEN_IOMMUOP_enable_modification
     u = cmp.op;
 
 #define XLAT_iommu_op_query_reserved_HNDL_ranges(_d_, _s_)            \
@@ -280,6 +332,7 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
                                 &cmp, status) )
         return -EFAULT;
 
+#undef XLAT_iommu_op_u_enable_modification
 #undef XLAT_iommu_op_u_query_reserved
 
     return 0;
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 848d6c7d69..30b800c97b 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -26,7 +26,6 @@ static void iommu_dump_p2m_table(unsigned char key);
 
 unsigned int __read_mostly iommu_dev_iotlb_timeout = 1000;
 integer_param("iommu_dev_iotlb_timeout", iommu_dev_iotlb_timeout);
-
 /*
  * The 'iommu' parameter enables the IOMMU.  Optional comma separated
  * value may contain:
@@ -280,6 +279,7 @@ void iommu_domain_destroy(struct domain *d)
     arch_iommu_domain_destroy(d);
 
     rangeset_destroy(hd->reserved_ranges);
+    rangeset_destroy(hd->iommu_op_ranges);
 }
 
 int iommu_map_page(struct domain *d, bfn_t bfn, mfn_t mfn,
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index e10f0bba1c..91de8b2784 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1460,7 +1460,7 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag)
     }
 
  done:
-    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
+    if ( !has_arch_pdevs(d) && has_iommu_pt(d) && !hd->iommu_op_ranges )
         iommu_teardown(d);
     pcidevs_unlock();
 
@@ -1510,7 +1510,7 @@ int deassign_device(struct domain *d, u16 seg, u8 bus, u8 devfn)
 
     pdev->fault.count = 0;
 
-    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
+    if ( !has_arch_pdevs(d) && has_iommu_pt(d) && !hd->iommu_op_ranges )
         iommu_teardown(d);
 
     return ret;
diff --git a/xen/include/public/iommu_op.h b/xen/include/public/iommu_op.h
index ade404a877..9b98b5cf89 100644
--- a/xen/include/public/iommu_op.h
+++ b/xen/include/public/iommu_op.h
@@ -61,6 +61,25 @@ struct xen_iommu_op_query_reserved {
     XEN_GUEST_HANDLE(xen_iommu_reserved_range_t) ranges;
 };
 
+/*
+ * XEN_IOMMUOP_enable_modification: Enable operations that modify IOMMU
+ *                                  mappings.
+ */
+#define XEN_IOMMUOP_enable_modification 2
+
+struct xen_iommu_op_enable_modification {
+    /*
+     * OUT - On successful return this is set to the bitwise OR of capabilities
+     *       defined below. On entry this must be set to zero.
+     */
+    uint32_t cap;
+    uint32_t pad;
+
+    /* Does the implementation support per-device mappings? */
+#define _XEN_IOMMU_CAP_per_device_mappings 0
+#define XEN_IOMMU_CAP_per_device_mappings (1u << _XEN_IOMMU_CAP_per_device_mappings)
+};
+
 struct xen_iommu_op {
     uint16_t op;    /* op type */
     uint16_t pad;
@@ -68,6 +87,7 @@ struct xen_iommu_op {
                     /* 0 for success otherwise, negative errno */
     union {
         struct xen_iommu_op_query_reserved query_reserved;
+        struct xen_iommu_op_enable_modification enable_modification;
     } u;
 };
 typedef struct xen_iommu_op xen_iommu_op_t;
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 37052e033f..14186ae8e4 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -147,6 +147,9 @@ struct domain_iommu {
      * include/xen/mm.h).
      */
     bool need_sync;
+
+    /* Ranges under the control of PV IOMMU interface */
+    struct rangeset *iommu_op_ranges;
 };
 
 #define dom_iommu(d)              (&(d)->iommu)
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index d2f9b1034b..c1b27e0349 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -79,6 +79,7 @@
 ?	vcpu_hvm_x86_64			hvm/hvm_vcpu.h
 !	iommu_op			iommu_op.h
 !	iommu_op_buf			iommu_op.h
+!	iommu_op_enable_modification	iommu_op.h
 !	iommu_op_query_reserved		iommu_op.h
 ?	iommu_reserved_range		iommu_op.h
 ?	kexec_exec			kexec.h
-- 
2.11.0



* [PATCH v6 12/14] memory: add get_paged_gfn() as a wrapper...
  2018-08-23  9:46 [PATCH v6 00/14] paravirtual IOMMU interface Paul Durrant
                   ` (10 preceding siblings ...)
  2018-08-23  9:47 ` [PATCH v6 11/14] x86: add iommu_op to enable modification of IOMMU mappings Paul Durrant
@ 2018-08-23  9:47 ` Paul Durrant
  2018-08-23 10:24   ` Julien Grall
  2018-09-11 14:56   ` Jan Beulich
  2018-08-23  9:47 ` [PATCH v6 13/14] x86: add iommu_ops to modify and flush IOMMU mappings Paul Durrant
  2018-08-23  9:47 ` [PATCH v6 14/14] x86: extend the map and unmap iommu_ops to support grant references Paul Durrant
  13 siblings, 2 replies; 111+ messages in thread
From: Paul Durrant @ 2018-08-23  9:47 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Paul Durrant, Jan Beulich

...for some uses of get_page_from_gfn().

There are many occurrences of the following pattern in the code:

    q = <readonly look-up> ? P2M_ALLOC : P2M_UNSHARE;
    page = get_page_from_gfn(d, gfn, &p2mt, q);

    if ( p2m_is_paging(p2mt) )
    {
        if ( page )
            put_page(page);

        p2m_mem_paging_populate(d, gfn);
        return <-EAGAIN or equivalent>;
    }

    if ( (q & P2M_UNSHARE) && p2m_is_shared(p2mt) )
    {
        if ( page )
            put_page(page);

        return <-EAGAIN or equivalent>;
    }

    if ( !page )
        return <-EINVAL or equivalent>;

    if ( !p2m_is_ram(p2mt) ||
         (!<readonly look-up> && p2m_is_readonly(p2mt)) )
    {
        put_page(page);
        return <-EINVAL or equivalent>;
    }

There are some small differences in exactly how the occurrences are coded,
but the desired semantics are the same.

This patch introduces a new common implementation of this code in
get_paged_gfn() and then converts the various open-coded patterns into
calls to this new function.
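
A converted call site then reduces to a single call plus a switch that maps
the errno values onto the caller's own error codes. For example, mirroring
the hvmemul_acquire_page() conversion in the diff below:

    switch ( get_paged_gfn(current->domain, _gfn(gmfn), false, NULL, page) )
    {
    case -EAGAIN:    /* paged out, or unsharing still in progress */
        return X86EMUL_RETRY;
    case -EINVAL:    /* no usable RAM page at this GFN */
        return X86EMUL_UNHANDLEABLE;
    default:
        ASSERT_UNREACHABLE();
        /* fall through */
    case 0:          /* success: a page reference is now held */
        break;
    }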

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

v3:
 - Addressed comments from George.

v2:
 - New in v2.
---
 xen/arch/x86/hvm/emulate.c | 32 ++++++--------------------
 xen/arch/x86/hvm/hvm.c     | 16 ++-----------
 xen/common/grant_table.c   | 38 +++++++++----------------------
 xen/common/memory.c        | 56 +++++++++++++++++++++++++++++++++++++---------
 xen/include/asm-arm/p2m.h  |  3 +++
 xen/include/asm-x86/p2m.h  |  2 ++
 6 files changed, 71 insertions(+), 76 deletions(-)

diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c
index 20d1d5b88e..4d9c42e5ef 100644
--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -350,34 +350,16 @@ static int hvmemul_do_io_buffer(
 
 static int hvmemul_acquire_page(unsigned long gmfn, struct page_info **page)
 {
-    struct domain *curr_d = current->domain;
-    p2m_type_t p2mt;
-
-    *page = get_page_from_gfn(curr_d, gmfn, &p2mt, P2M_UNSHARE);
-
-    if ( *page == NULL )
-        return X86EMUL_UNHANDLEABLE;
-
-    if ( p2m_is_paging(p2mt) )
-    {
-        put_page(*page);
-        p2m_mem_paging_populate(curr_d, gmfn);
-        return X86EMUL_RETRY;
-    }
-
-    if ( p2m_is_shared(p2mt) )
+    switch ( get_paged_gfn(current->domain, _gfn(gmfn), false, NULL, page) )
     {
-        put_page(*page);
+    case -EAGAIN:
         return X86EMUL_RETRY;
-    }
-
-    /* This code should not be reached if the gmfn is not RAM */
-    if ( p2m_is_mmio(p2mt) )
-    {
-        domain_crash(curr_d);
-
-        put_page(*page);
+    case -EINVAL:
         return X86EMUL_UNHANDLEABLE;
+    default:
+        ASSERT_UNREACHABLE();
+    case 0:
+        break;
     }
 
     return X86EMUL_OKAY;
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 72c51faecb..03430e6f07 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2557,24 +2557,12 @@ static void *_hvm_map_guest_frame(unsigned long gfn, bool_t permanent,
                                   bool_t *writable)
 {
     void *map;
-    p2m_type_t p2mt;
     struct page_info *page;
     struct domain *d = current->domain;
+    p2m_type_t p2mt;
 
-    page = get_page_from_gfn(d, gfn, &p2mt,
-                             writable ? P2M_UNSHARE : P2M_ALLOC);
-    if ( (p2m_is_shared(p2mt) && writable) || !page )
-    {
-        if ( page )
-            put_page(page);
-        return NULL;
-    }
-    if ( p2m_is_paging(p2mt) )
-    {
-        put_page(page);
-        p2m_mem_paging_populate(d, gfn);
+    if ( get_paged_gfn(d, _gfn(gfn), !writable, &p2mt, &page) )
         return NULL;
-    }
 
     if ( writable )
     {
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index e694a4bf16..f3b2fad7a8 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -374,39 +374,23 @@ static int get_paged_frame(unsigned long gfn, mfn_t *mfn,
                            struct page_info **page, bool readonly,
                            struct domain *rd)
 {
-    int rc = GNTST_okay;
-    p2m_type_t p2mt;
-
-    *mfn = INVALID_MFN;
-    *page = get_page_from_gfn(rd, gfn, &p2mt,
-                              readonly ? P2M_ALLOC : P2M_UNSHARE);
-    if ( !*page )
-    {
-#ifdef P2M_SHARED_TYPES
-        if ( p2m_is_shared(p2mt) )
-            return GNTST_eagain;
-#endif
-#ifdef P2M_PAGES_TYPES
-        if ( p2m_is_paging(p2mt) )
-        {
-            p2m_mem_paging_populate(rd, gfn);
-            return GNTST_eagain;
-        }
-#endif
-        return GNTST_bad_page;
-    }
+    int rc;
 
-    if ( p2m_is_foreign(p2mt) )
+    rc = get_paged_gfn(rd, _gfn(gfn), readonly, NULL, page);
+    switch ( rc )
     {
-        put_page(*page);
-        *page = NULL;
-
+    case -EAGAIN:
+        return GNTST_eagain;
+    case -EINVAL:
         return GNTST_bad_page;
+    default:
+        ASSERT_UNREACHABLE();
+    case 0:
+        break;
     }
 
     *mfn = page_to_mfn(*page);
-
-    return rc;
+    return GNTST_okay;
 }
 
 static inline void
diff --git a/xen/common/memory.c b/xen/common/memory.c
index c4671b457b..7d8684f494 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -1625,37 +1625,73 @@ void destroy_ring_for_helper(
     }
 }
 
-int prepare_ring_for_helper(
-    struct domain *d, unsigned long gmfn, struct page_info **_page,
-    void **_va)
+/*
+ * Acquire a pointer to struct page_info for a specified domain and GFN,
+ * checking whether the page has been paged out, or needs unsharing.
+ * If the function succeeds then zero is returned and page_p is written
+ * with a pointer to the struct page_info with a reference taken. The
+ * caller is responsible for dropping the reference. If p2mt_p is non-NULL
+ * then it is also written with the P2M type of the page.
+ * If the function fails then an appropriate errno is returned and the
+ * values referenced by page_p and p2mt_p are undefined.
+ */
+int get_paged_gfn(struct domain *d, gfn_t gfn, bool readonly,
+                  p2m_type_t *p2mt_p, struct page_info **page_p)
 {
-    struct page_info *page;
+    p2m_query_t q = readonly ? P2M_ALLOC : P2M_UNSHARE;
     p2m_type_t p2mt;
-    void *va;
+    struct page_info *page;
 
-    page = get_page_from_gfn(d, gmfn, &p2mt, P2M_UNSHARE);
+    page = get_page_from_gfn(d, gfn_x(gfn), &p2mt, q);
 
 #ifdef CONFIG_HAS_MEM_PAGING
     if ( p2m_is_paging(p2mt) )
     {
         if ( page )
             put_page(page);
-        p2m_mem_paging_populate(d, gmfn);
-        return -ENOENT;
+
+        p2m_mem_paging_populate(d, gfn_x(gfn));
+        return -EAGAIN;
     }
 #endif
 #ifdef CONFIG_HAS_MEM_SHARING
-    if ( p2m_is_shared(p2mt) )
+    if ( (q & P2M_UNSHARE) && p2m_is_shared(p2mt) )
     {
         if ( page )
             put_page(page);
-        return -ENOENT;
+
+        return -EAGAIN;
     }
 #endif
 
     if ( !page )
         return -EINVAL;
 
+    if ( !p2m_is_ram(p2mt) || (!readonly && p2m_is_readonly(p2mt)) )
+    {
+        put_page(page);
+        return -EINVAL;
+    }
+
+    if ( p2mt_p )
+        *p2mt_p = p2mt;
+
+    *page_p = page;
+    return 0;
+}
+
+int prepare_ring_for_helper(
+    struct domain *d, unsigned long gmfn, struct page_info **_page,
+    void **_va)
+{
+    struct page_info *page;
+    void *va;
+    int rc;
+
+    rc = get_paged_gfn(d, _gfn(gmfn), false, NULL, &page);
+    if ( rc )
+        return (rc == -EAGAIN) ? -ENOENT : rc;
+
     if ( !get_page_type(page, PGT_writable_page) )
     {
         put_page(page);
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 8823707c17..a39a4faabd 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -303,6 +303,9 @@ static inline struct page_info *get_page_from_gfn(
     return page;
 }
 
+int get_paged_gfn(struct domain *d, gfn_t gfn, bool readonly,
+                  p2m_type_t *p2mt_p, struct page_info **page_p);
+
 int get_page_type(struct page_info *page, unsigned long type);
 bool is_iomem_page(mfn_t mfn);
 static inline int get_page_and_type(struct page_info *page,
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index d4b3cfcb6e..e890bcd3e1 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -492,6 +492,8 @@ static inline struct page_info *get_page_from_gfn(
     return mfn_valid(_mfn(gfn)) && get_page(page, d) ? page : NULL;
 }
 
+int get_paged_gfn(struct domain *d, gfn_t gfn, bool readonly,
+                  p2m_type_t *p2mt_p, struct page_info **page_p);
 
 /* General conversion function from mfn to gfn */
 static inline unsigned long mfn_to_gfn(struct domain *d, mfn_t mfn)
-- 
2.11.0



* [PATCH v6 13/14] x86: add iommu_ops to modify and flush IOMMU mappings
  2018-08-23  9:46 [PATCH v6 00/14] paravirtual IOMMU interface Paul Durrant
                   ` (11 preceding siblings ...)
  2018-08-23  9:47 ` [PATCH v6 12/14] memory: add get_paged_gfn() as a wrapper Paul Durrant
@ 2018-08-23  9:47 ` Paul Durrant
  2018-09-11 15:15   ` Jan Beulich
  2018-09-12  7:03   ` Jan Beulich
  2018-08-23  9:47 ` [PATCH v6 14/14] x86: extend the map and unmap iommu_ops to support grant references Paul Durrant
  13 siblings, 2 replies; 111+ messages in thread
From: Paul Durrant @ 2018-08-23  9:47 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Paul Durrant, Jan Beulich

This patch adds iommu_ops to add (map) or remove (unmap) frames in the
domain's IOMMU mappings, and an iommu_op to synchronize (flush) those
manipulations with the hardware.

Currently the flags value for each op must include the
XEN_IOMMUOP_map/unmap/flush_all flag as the implementation does not yet
support per-device mappings. The sbdf field of each hypercall is
accordingly ignored.

Mappings added by the map operation are tracked and only those mappings
may be removed by a subsequent unmap operation. Frames are specified by the
owning domain and GFN. It is, of course, permissible for a domain to map
its own frames using DOMID_SELF.

NOTE: The owning domain and GFN must also be specified in the unmap
      operation, as well as the BFN, so that they can be cross-checked
      with the existing mapping.
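
For illustration only (not part of the series): a guest that has already
issued XEN_IOMMUOP_enable_modification could map one of its own pages and
make it visible to devices roughly as follows. The structure, field and
flag names come from the public header below; xen_iommu_op() is an assumed
guest-side wrapper that marshals each op through a xen_iommu_op_buf_t and
issues the hypercall, and is not defined by this series.

    static int map_and_flush(xen_bfn_t bfn, xen_pfn_t gfn)
    {
        xen_iommu_op_t op = {
            .op = XEN_IOMMUOP_map,
            .u.map = {
                .domid = DOMID_SELF,
                .flags = XEN_IOMMUOP_map_all, /* per-device not yet supported */
                .bfn = bfn,
                .gfn = gfn,
            },
        };
        int rc = xen_iommu_op(&op, 1); /* assumed guest-side wrapper */

        if ( !rc )
            rc = op.status;
        if ( rc )
            return rc;

        /* The new mapping is not guaranteed visible to devices until flushed. */
        op = (xen_iommu_op_t){
            .op = XEN_IOMMUOP_flush,
            .u.flush.flags = XEN_IOMMUOP_flush_all,
        };
        rc = xen_iommu_op(&op, 1);

        return rc ? rc : op.status;
    }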

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

v6:
 - Add placeholder sbdf field and flag to control scope of map, unmap and
   flush.

v4:
 - Fixed logic inversion when checking return of iommu_unmap_page().

v3:
 - Add type pinning.

v2:
 - Heavily re-worked in v2, including explicit tracking of mappings.
   This avoids the need to clear non-reserved mappings from IOMMU
   at start of day, which would be prohibitively slow on a large host.
---
 xen/common/iommu_op.c         | 178 ++++++++++++++++++++++++++++++++++++++++++
 xen/include/public/iommu_op.h | 110 ++++++++++++++++++++++++++
 xen/include/xlat.lst          |   3 +
 3 files changed, 291 insertions(+)

diff --git a/xen/common/iommu_op.c b/xen/common/iommu_op.c
index ccbb9b6340..328522f245 100644
--- a/xen/common/iommu_op.c
+++ b/xen/common/iommu_op.c
@@ -123,6 +123,156 @@ static int iommu_op_enable_modification(
     return 0;
 }
 
+static int iommuop_map(struct xen_iommu_op_map *op)
+{
+    struct domain *d, *currd = current->domain;
+    struct domain_iommu *iommu = dom_iommu(currd);
+    bool readonly = op->flags & XEN_IOMMUOP_map_readonly;
+    bfn_t bfn = _bfn(op->bfn);
+    struct page_info *page;
+    unsigned int prot;
+    int rc, ignore;
+
+    if ( op->pad ||
+         (op->flags & ~(XEN_IOMMUOP_map_all |
+                        XEN_IOMMUOP_map_readonly)) )
+        return -EINVAL;
+
+    if ( !iommu->iommu_op_ranges )
+        return -EOPNOTSUPP;
+
+    /* Per-device mapping not yet supported */
+    if ( !(op->flags & XEN_IOMMUOP_map_all) )
+        return -EINVAL;
+
+    /* Check whether the specified BFN falls in a reserved region */
+    if ( rangeset_contains_singleton(iommu->reserved_ranges, bfn_x(bfn)) )
+        return -EINVAL;
+
+    d = rcu_lock_domain_by_any_id(op->domid);
+    if ( !d )
+        return -ESRCH;
+
+    rc = get_paged_gfn(d, _gfn(op->gfn), readonly, NULL, &page);
+    if ( rc )
+        goto unlock;
+
+    rc = -EINVAL;
+    if ( !readonly && !get_page_type(page, PGT_writable_page) )
+    {
+        put_page(page);
+        goto unlock;
+    }
+
+    prot = IOMMUF_readable;
+    if ( !readonly )
+        prot |= IOMMUF_writable;
+
+    rc = -EIO;
+    if ( iommu_map_page(currd, bfn, page_to_mfn(page), prot) )
+        goto release;
+
+    rc = rangeset_add_singleton(iommu->iommu_op_ranges, bfn_x(bfn));
+    if ( rc )
+        goto unmap;
+
+    rc = 0;
+    goto unlock; /* retain mapping and references */
+
+ unmap:
+    ignore = iommu_unmap_page(currd, bfn);
+
+ release:
+    if ( !readonly )
+        put_page_type(page);
+    put_page(page);
+
+ unlock:
+    rcu_unlock_domain(d);
+    return rc;
+}
+
+static int iommuop_unmap(struct xen_iommu_op_unmap *op)
+{
+    struct domain *d, *currd = current->domain;
+    struct domain_iommu *iommu = dom_iommu(currd);
+    bfn_t bfn = _bfn(op->bfn);
+    mfn_t mfn;
+    bool readonly;
+    unsigned int prot;
+    struct page_info *page;
+    int rc;
+
+    if ( op->pad ||
+         (op->flags & ~XEN_IOMMUOP_unmap_all) )
+        return -EINVAL;
+
+    if ( !iommu->iommu_op_ranges )
+        return -EOPNOTSUPP;
+
+    /* Per-device unmapping not yet supported */
+    if ( !(op->flags & XEN_IOMMUOP_unmap_all) )
+        return -EINVAL;
+
+    if ( !rangeset_contains_singleton(iommu->iommu_op_ranges, bfn_x(bfn)) ||
+         iommu_lookup_page(currd, bfn, &mfn, &prot) ||
+         !mfn_valid(mfn) )
+        return -ENOENT;
+
+    readonly = !(prot & IOMMUF_writable);
+
+    d = rcu_lock_domain_by_any_id(op->domid);
+    if ( !d )
+        return -ESRCH;
+
+    rc = get_paged_gfn(d, _gfn(op->gfn), !(prot & IOMMUF_writable), NULL,
+                       &page);
+    if ( rc )
+        goto unlock;
+
+    put_page(page); /* release extra reference just taken */
+
+    rc = -EINVAL;
+    if ( !mfn_eq(page_to_mfn(page), mfn) )
+        goto unlock;
+
+    /* release reference taken in map */
+    if ( !readonly )
+        put_page_type(page);
+    put_page(page);
+
+    rc = rangeset_remove_singleton(iommu->iommu_op_ranges, bfn_x(bfn));
+    if ( rc )
+        goto unlock;
+
+    if ( iommu_unmap_page(currd, bfn) )
+        rc = -EIO;
+
+ unlock:
+    rcu_unlock_domain(d);
+
+    return rc;
+}
+
+static int iommuop_flush(struct xen_iommu_op_flush *op)
+{
+    struct domain *currd = current->domain;
+    struct domain_iommu *iommu = dom_iommu(currd);
+
+    if ( op->pad0 || op->pad1 ||
+         (op->flags & ~XEN_IOMMUOP_flush_all) )
+        return -EINVAL;
+
+    if ( !iommu->iommu_op_ranges )
+        return -EOPNOTSUPP;
+
+    /* Per-device flushing not yet supported */
+    if ( !(op->flags & XEN_IOMMUOP_flush_all) )
+        return -EINVAL;
+
+    return iommu_iotlb_flush_all(currd) ? -EIO : 0;
+}
+
 static void iommu_op(xen_iommu_op_t *op)
 {
     switch ( op->op )
@@ -135,6 +285,22 @@ static void iommu_op(xen_iommu_op_t *op)
         op->status = iommu_op_enable_modification(&op->u.enable_modification);
         break;
 
+    case XEN_IOMMUOP_map:
+        this_cpu(iommu_dont_flush_iotlb) = 1;
+        op->status = iommuop_map(&op->u.map);
+        this_cpu(iommu_dont_flush_iotlb) = 0;
+        break;
+
+    case XEN_IOMMUOP_unmap:
+        this_cpu(iommu_dont_flush_iotlb) = 1;
+        op->status = iommuop_unmap(&op->u.unmap);
+        this_cpu(iommu_dont_flush_iotlb) = 0;
+        break;
+
+    case XEN_IOMMUOP_flush:
+        op->status = iommuop_flush(&op->u.flush);
+        break;
+
     default:
         op->status = -EOPNOTSUPP;
         break;
@@ -148,6 +314,9 @@ int do_one_iommu_op(xen_iommu_op_buf_t *buf)
     static const size_t op_size[] = {
         [XEN_IOMMUOP_query_reserved] = sizeof(struct xen_iommu_op_query_reserved),
         [XEN_IOMMUOP_enable_modification] = sizeof(struct xen_iommu_op_enable_modification),
+        [XEN_IOMMUOP_map] = sizeof(struct xen_iommu_op_map),
+        [XEN_IOMMUOP_unmap] = sizeof(struct xen_iommu_op_unmap),
+        [XEN_IOMMUOP_flush] = sizeof(struct xen_iommu_op_flush),
     };
     size_t size;
     int rc;
@@ -235,6 +404,9 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
     static const size_t op_size[] = {
         [XEN_IOMMUOP_query_reserved] = sizeof(struct compat_iommu_op_query_reserved),
         [XEN_IOMMUOP_enable_modification] = sizeof(struct compat_iommu_op_enable_modification),
+        [XEN_IOMMUOP_map] = sizeof(struct compat_iommu_op_map),
+        [XEN_IOMMUOP_unmap] = sizeof(struct compat_iommu_op_unmap),
+        [XEN_IOMMUOP_flush] = sizeof(struct compat_iommu_op_flush),
     };
     size_t size;
     xen_iommu_op_t nat;
@@ -273,6 +445,9 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
      */
 #define XLAT_iommu_op_u_query_reserved XEN_IOMMUOP_query_reserved
 #define XLAT_iommu_op_u_enable_modification XEN_IOMMUOP_enable_modification
+#define XLAT_iommu_op_u_map XEN_IOMMUOP_map
+#define XLAT_iommu_op_u_unmap XEN_IOMMUOP_unmap
+#define XLAT_iommu_op_u_flush XEN_IOMMUOP_flush
     u = cmp.op;
 
 #define XLAT_iommu_op_query_reserved_HNDL_ranges(_d_, _s_)            \
@@ -332,6 +507,9 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
                                 &cmp, status) )
         return -EFAULT;
 
+#undef XLAT_iommu_op_u_flush
+#undef XLAT_iommu_op_u_unmap
+#undef XLAT_iommu_op_u_map
 #undef XLAT_iommu_op_u_enable_modification
 #undef XLAT_iommu_op_u_query_reserved
 
diff --git a/xen/include/public/iommu_op.h b/xen/include/public/iommu_op.h
index 9b98b5cf89..e6c08f4bdd 100644
--- a/xen/include/public/iommu_op.h
+++ b/xen/include/public/iommu_op.h
@@ -80,6 +80,113 @@ struct xen_iommu_op_enable_modification {
 #define XEN_IOMMU_CAP_per_device_mappings (1u << _XEN_IOMMU_CAP_per_device_mappings)
 };
 
+/*
+ * XEN_IOMMUOP_map: Map a guest page in the IOMMU.
+ */
+#define XEN_IOMMUOP_map 3
+
+struct xen_iommu_op_map {
+    /* IN - The domid of the guest */
+    domid_t domid;
+    /*
+     * IN - flags controlling the mapping. This should be a bitwise OR of the
+     *      flags defined below.
+     */
+    uint16_t flags;
+
+    /*
+     * Should the mapping be created for all initiators?
+     *
+     * NOTE: This flag is currently required as the implementation does not yet
+     *       support per-device mappings.
+     */
+#define _XEN_IOMMUOP_map_all 0
+#define XEN_IOMMUOP_map_all (1 << (_XEN_IOMMUOP_map_all))
+
+    /* Should the mapping be read-only to the initiator? */
+#define _XEN_IOMMUOP_map_readonly 1
+#define XEN_IOMMUOP_map_readonly (1 << (_XEN_IOMMUOP_map_readonly))
+
+    uint32_t pad;
+    /*
+     * IN - Segment/Bus/Device/Function of the initiator.
+     *
+     * NOTE: This is ignored if XEN_IOMMUOP_map_all is set.
+     */
+    uint64_t sbdf;
+    /* IN - The IOMMU frame number which will hold the new mapping */
+    xen_bfn_t bfn;
+    /* IN - The guest frame number of the page to be mapped */
+    xen_pfn_t gfn;
+};
+
+/*
+ * XEN_IOMMUOP_unmap: Remove a mapping in the IOMMU.
+ */
+#define XEN_IOMMUOP_unmap 4
+
+struct xen_iommu_op_unmap {
+    /* IN - The domid of the guest */
+    domid_t domid;
+    /*
+     * IN - flags controlling the unmapping. This should be a bitwise OR of the
+     *      flags defined below.
+     */
+    uint16_t flags;
+
+    /*
+     * Should the mapping be destroyed for all initiators?
+     *
+     * NOTE: This flag is currently required as the implementation does not yet
+     *       support per-device mappings.
+     */
+#define _XEN_IOMMUOP_unmap_all 0
+#define XEN_IOMMUOP_unmap_all (1 << (_XEN_IOMMUOP_unmap_all))
+
+    uint32_t pad;
+    /*
+     * IN - Segment/Bus/Device/Function of the initiator.
+     *
+     * NOTE: This is ignored if XEN_IOMMUOP_unmap_all is set.
+     */
+    uint64_t sbdf;
+    /* IN - The IOMMU frame number which holds the mapping to be removed */
+    xen_bfn_t bfn;
+    /* IN - The guest frame number of the page that is mapped */
+    xen_pfn_t gfn;
+};
+
+/*
+ * XEN_IOMMUOP_flush: Flush the IOMMU TLB.
+ */
+#define XEN_IOMMUOP_flush 5
+
+struct xen_iommu_op_flush {
+    /*
+     * IN - flags controlling flushing. This should be a bitwise OR of the
+     *      flags defined below.
+     */
+    uint16_t flags;
+
+    /*
+     * Should the mappings be flushed for all initiators?
+     *
+     * NOTE: This flag is currently required as the implementation does not yet
+     *       support per-device mappings.
+     */
+#define _XEN_IOMMUOP_flush_all 0
+#define XEN_IOMMUOP_flush_all (1 << (_XEN_IOMMUOP_flush_all))
+
+    uint16_t pad0;
+    uint32_t pad1;
+    /*
+     * IN - Segment/Bus/Device/Function of the initiator.
+     *
+     * NOTE: This is ignored if XEN_IOMMUOP_flush_all is set.
+     */
+    uint64_t sbdf;
+};
+
 struct xen_iommu_op {
     uint16_t op;    /* op type */
     uint16_t pad;
@@ -88,6 +195,9 @@ struct xen_iommu_op {
     union {
         struct xen_iommu_op_query_reserved query_reserved;
         struct xen_iommu_op_enable_modification enable_modification;
+        struct xen_iommu_op_map map;
+        struct xen_iommu_op_unmap unmap;
+        struct xen_iommu_op_flush flush;
     } u;
 };
 typedef struct xen_iommu_op xen_iommu_op_t;
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index c1b27e0349..5ab4c72264 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -80,7 +80,10 @@
 !	iommu_op			iommu_op.h
 !	iommu_op_buf			iommu_op.h
 !	iommu_op_enable_modification	iommu_op.h
+!	iommu_op_flush			iommu_op.h
+!	iommu_op_map			iommu_op.h
 !	iommu_op_query_reserved		iommu_op.h
+!	iommu_op_unmap			iommu_op.h
 ?	iommu_reserved_range		iommu_op.h
 ?	kexec_exec			kexec.h
 !	kexec_image			kexec.h
-- 
2.11.0



* [PATCH v6 14/14] x86: extend the map and unmap iommu_ops to support grant references
  2018-08-23  9:46 [PATCH v6 00/14] paravirtual IOMMU interface Paul Durrant
                   ` (12 preceding siblings ...)
  2018-08-23  9:47 ` [PATCH v6 13/14] x86: add iommu_ops to modify and flush IOMMU mappings Paul Durrant
@ 2018-08-23  9:47 ` Paul Durrant
  2018-09-12 14:12   ` Jan Beulich
  13 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-08-23  9:47 UTC (permalink / raw)
  To: xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Julien Grall, Paul Durrant, Jan Beulich

This patch allows a domain to add or remove foreign frames from its
IOMMU mappings by grant reference as well as GFN. This is necessary,
for example, to support a PV network backend that needs to construct a
packet buffer that can be directly accessed by a NIC.
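
As a hedged sketch of that usage (xen_iommu_op() is again an assumed
guest-side wrapper, and frontend_domid, gref and bfn are stand-in
variables; the flag and union fields come from the header changes below),
the backend could map a frontend-granted page by grant reference rather
than by GFN:

    xen_iommu_op_t op = {
        .op = XEN_IOMMUOP_map,
        .u.map = {
            .domid = frontend_domid,     /* the granting domain */
            .flags = XEN_IOMMUOP_map_all |
                     XEN_IOMMUOP_map_gref,
            .bfn = bfn,                  /* address the NIC will use */
            .u.gref = gref,              /* grant ref from the shared ring */
        },
    };
    int rc = xen_iommu_op(&op, 1);

    if ( !rc )
        rc = op.status;

Note that the gfn field moves into a union with gref, so callers mapping
by GFN now set u.gfn instead.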

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wei.liu2@citrix.com>

v6:
 - Re-base.

v2:
 - New in v2.
---
 xen/common/grant_table.c      | 143 ++++++++++++++++++++++++++++++++++++++++++
 xen/common/iommu_op.c         |  83 ++++++++++++++++--------
 xen/include/public/iommu_op.h |  29 +++++++--
 xen/include/xen/grant_table.h |   7 +++
 4 files changed, 232 insertions(+), 30 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index f3b2fad7a8..01a95c05ed 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -3961,6 +3961,149 @@ int gnttab_get_status_frame(struct domain *d, unsigned long idx,
     return rc;
 }
 
+int
+acquire_gref_for_iommu(struct domain *d, grant_ref_t gref,
+                       bool readonly, mfn_t *mfn)
+{
+    struct domain *currd = current->domain;
+    struct grant_table *gt = d->grant_table;
+    grant_entry_header_t *shah;
+    struct active_grant_entry *act;
+    uint16_t *status;
+    int rc;
+
+    grant_read_lock(gt);
+
+    rc = -ENOENT;
+    if ( gref >= nr_grant_entries(gt) )
+        goto unlock;
+
+    act = active_entry_acquire(gt, gref);
+    shah = shared_entry_header(gt, gref);
+    status = ( gt->gt_version == 2 ) ?
+        &status_entry(gt, gref) :
+        &shah->flags;
+
+    rc = -EACCES;
+    if ( (shah->flags & GTF_type_mask) != GTF_permit_access ||
+         (shah->flags & GTF_sub_page) )
+        goto release;
+
+    rc = -ERANGE;
+    /*
+     * Refuse if the entry is pinned by another domain, or if any of the
+     * four 8-bit pin counts is at risk of overflowing (top bit set).
+     */
+    if ( act->pin && ((act->domid != currd->domain_id) ||
+                      (act->pin & 0x80808080U) != 0) )
+        goto release;
+
+    rc = -EINVAL;
+    if ( !act->pin ||
+         (!readonly && !(act->pin & GNTPIN_devw_mask)) )
+    {
+        if ( _set_status(gt->gt_version, currd->domain_id, readonly,
+                         0, shah, act, status) != GNTST_okay )
+            goto release;
+    }
+
+    if ( !act->pin )
+    {
+        gfn_t gfn = gt->gt_version == 1 ?
+            _gfn(shared_entry_v1(gt, gref).frame) :
+            _gfn(shared_entry_v2(gt, gref).full_page.frame);
+        struct page_info *page;
+
+        rc = get_paged_gfn(d, gfn, readonly, NULL, &page);
+        if ( rc )
+            goto clear;
+
+        act_set_gfn(act, gfn);
+        act->mfn = page_to_mfn(page);
+        act->domid = currd->domain_id;
+        act->start = 0;
+        act->length = PAGE_SIZE;
+        act->is_sub_page = false;
+        act->trans_domain = d;
+        act->trans_gref = gref;
+    }
+    else
+    {
+        ASSERT(mfn_valid(act->mfn));
+        if ( !get_page(mfn_to_page(act->mfn), d) )
+            goto clear;
+    }
+
+    rc = 0;
+    act->pin += readonly ? GNTPIN_devr_inc : GNTPIN_devw_inc;
+    *mfn = act->mfn;
+    goto release;
+
+ clear:
+    if ( !readonly && !(act->pin & GNTPIN_devw_mask) )
+        gnttab_clear_flag(_GTF_writing, status);
+
+    if ( !act->pin )
+        gnttab_clear_flag(_GTF_reading, status);
+
+ release:
+    active_entry_release(act);
+
+ unlock:
+    grant_read_unlock(gt);
+
+    return rc;
+}
+
+int
+release_gref_for_iommu(struct domain *d, grant_ref_t gref,
+                       bool readonly, mfn_t mfn)
+{
+    struct domain *currd = current->domain;
+    struct grant_table *gt = d->grant_table;
+    grant_entry_header_t *shah;
+    struct active_grant_entry *act;
+    uint16_t *status;
+    int rc;
+
+    grant_read_lock(gt);
+
+    rc = -ENOENT;
+    if ( gref >= nr_grant_entries(gt) )
+        goto unlock;
+
+    act = active_entry_acquire(gt, gref);
+    shah = shared_entry_header(gt, gref);
+    status = ( gt->gt_version == 2 ) ?
+        &status_entry(gt, gref) :
+        &shah->flags;
+
+    rc = -EINVAL;
+    if ( !act->pin || (act->domid != currd->domain_id) ||
+         !mfn_eq(act->mfn, mfn) )
+        goto release;
+
+    rc = 0;
+    if ( readonly )
+        act->pin -= GNTPIN_devr_inc;
+    else
+    {
+        gnttab_mark_dirty(d, mfn);
+
+        act->pin -= GNTPIN_devw_inc;
+        if ( !(act->pin & GNTPIN_devw_mask) )
+            gnttab_clear_flag(_GTF_writing, status);
+    }
+
+    if ( !act->pin )
+        gnttab_clear_flag(_GTF_reading, status);
+
+    put_page(mfn_to_page(mfn));
+
+ release:
+    active_entry_release(act);
+
+ unlock:
+    grant_read_unlock(gt);
+
+    return rc;
+}
+
 static void gnttab_usage_print(struct domain *rd)
 {
     int first = 1;
diff --git a/xen/common/iommu_op.c b/xen/common/iommu_op.c
index 328522f245..272f53298f 100644
--- a/xen/common/iommu_op.c
+++ b/xen/common/iommu_op.c
@@ -23,6 +23,7 @@
 #include <xen/guest_access.h>
 #include <xen/hypercall.h>
 #include <xen/nospec.h>
+#include <xen/grant_table.h>
 
 struct get_reserved_ctxt {
     unsigned int max_entries;
@@ -130,12 +131,14 @@ static int iommuop_map(struct xen_iommu_op_map *op)
     bool readonly = op->flags & XEN_IOMMUOP_map_readonly;
     bfn_t bfn = _bfn(op->bfn);
     struct page_info *page;
+    mfn_t mfn;
     unsigned int prot;
     int rc, ignore;
 
     if ( op->pad ||
          (op->flags & ~(XEN_IOMMUOP_map_all |
-                        XEN_IOMMUOP_map_readonly)) )
+                        XEN_IOMMUOP_map_readonly |
+                        XEN_IOMMUOP_map_gref)) )
         return -EINVAL;
 
     if ( !iommu->iommu_op_ranges )
@@ -153,15 +156,28 @@ static int iommuop_map(struct xen_iommu_op_map *op)
     if ( !d )
         return -ESRCH;
 
-    rc = get_paged_gfn(d, _gfn(op->gfn), readonly, NULL, &page);
-    if ( rc )
-        goto unlock;
+    if ( op->flags & XEN_IOMMUOP_map_gref )
+    {
+        rc = acquire_gref_for_iommu(d, op->u.gref, readonly, &mfn);
+        if ( rc )
+            goto unlock;
 
-    rc = -EINVAL;
-    if ( !readonly && !get_page_type(page, PGT_writable_page) )
+        page = mfn_to_page(mfn);
+    }
+    else
     {
-        put_page(page);
-        goto unlock;
+        rc = get_paged_gfn(d, _gfn(op->u.gfn), readonly, NULL, &page);
+        if ( rc )
+            goto unlock;
+
+        rc = -EINVAL;
+        if ( !readonly && !get_page_type(page, PGT_writable_page) )
+        {
+            put_page(page);
+            goto unlock;
+        }
+
+        mfn = page_to_mfn(page);
     }
 
     prot = IOMMUF_readable;
@@ -169,7 +185,7 @@ static int iommuop_map(struct xen_iommu_op_map *op)
         prot |= IOMMUF_writable;
 
     rc = -EIO;
-    if ( iommu_map_page(currd, bfn, page_to_mfn(page), prot) )
+    if ( iommu_map_page(currd, bfn, mfn, prot) )
         goto release;
 
     rc = rangeset_add_singleton(iommu->iommu_op_ranges, bfn_x(bfn));
@@ -183,9 +199,14 @@ static int iommuop_map(struct xen_iommu_op_map *op)
     ignore = iommu_unmap_page(currd, bfn);
 
  release:
-    if ( !readonly )
-        put_page_type(page);
-    put_page(page);
+    if ( op->flags & XEN_IOMMUOP_map_gref )
+        release_gref_for_iommu(d, op->u.gref, readonly, mfn);
+    else
+    {
+        if ( !readonly )
+            put_page_type(page);
+        put_page(page);
+    }
 
  unlock:
     rcu_unlock_domain(d);
@@ -200,11 +221,11 @@ static int iommuop_unmap(struct xen_iommu_op_unmap *op)
     mfn_t mfn;
     bool readonly;
     unsigned int prot;
-    struct page_info *page;
     int rc;
 
     if ( op->pad ||
-         (op->flags & ~XEN_IOMMUOP_unmap_all) )
+         (op->flags & ~(XEN_IOMMUOP_unmap_all |
+                        XEN_IOMMUOP_unmap_gref)) )
         return -EINVAL;
 
     if ( !iommu->iommu_op_ranges )
@@ -225,21 +246,31 @@ static int iommuop_unmap(struct xen_iommu_op_unmap *op)
     if ( !d )
         return -ESRCH;
 
-    rc = get_paged_gfn(d, _gfn(op->gfn), !(prot & IOMMUF_writable), NULL,
-                       &page);
-    if ( rc )
-        goto unlock;
+    if ( op->flags & XEN_IOMMUOP_unmap_gref )
+    {
+        rc = release_gref_for_iommu(d, op->u.gref, readonly, mfn);
+        if ( rc )
+            goto unlock;
+    }
+    else
+    {
+        struct page_info *page;
 
-    put_page(page); /* release extra reference just taken */
+        rc = get_paged_gfn(d, _gfn(op->u.gfn), readonly, NULL, &page);
+        if ( rc )
+            goto unlock;
 
-    rc = -EINVAL;
-    if ( !mfn_eq(page_to_mfn(page), mfn) )
-        goto unlock;
+        put_page(page); /* release extra reference just taken */
 
-    /* release reference taken in map */
-    if ( !readonly )
-        put_page_type(page);
-    put_page(page);
+        rc = -EINVAL;
+        if ( !mfn_eq(page_to_mfn(page), mfn) )
+            goto unlock;
+
+        /* release reference taken in map */
+        if ( !readonly )
+            put_page_type(page);
+        put_page(page);
+    }
 
     rc = rangeset_remove_singleton(iommu->iommu_op_ranges, bfn_x(bfn));
     if ( rc )
diff --git a/xen/include/public/iommu_op.h b/xen/include/public/iommu_op.h
index e6c08f4bdd..e3d702a8d0 100644
--- a/xen/include/public/iommu_op.h
+++ b/xen/include/public/iommu_op.h
@@ -24,6 +24,7 @@
 #define XEN_PUBLIC_IOMMU_OP_H
 
 #include "xen.h"
+#include "grant_table.h"
 
 typedef uint64_t xen_bfn_t;
 
@@ -107,6 +108,10 @@ struct xen_iommu_op_map {
 #define _XEN_IOMMUOP_map_readonly 1
 #define XEN_IOMMUOP_map_readonly (1 << (_XEN_IOMMUOP_map_readonly))
 
+    /* Is the memory specified by gfn or grant reference? */
+#define _XEN_IOMMUOP_map_gref 2
+#define XEN_IOMMUOP_map_gref (1 << (_XEN_IOMMUOP_map_gref))
+
     uint32_t pad;
     /*
      * IN - Segment/Bus/Device/Function of the initiator.
@@ -116,8 +121,14 @@ struct xen_iommu_op_map {
     uint64_t sbdf;
     /* IN - The IOMMU frame number which will hold the new mapping */
     xen_bfn_t bfn;
-    /* IN - The guest frame number of the page to be mapped */
-    xen_pfn_t gfn;
+    /*
+     * IN - The guest frame number or grant reference of the page to
+     * be mapped.
+     */
+    union {
+        xen_pfn_t gfn;
+        grant_ref_t gref;
+    } u;
 };
 
 /*
@@ -143,6 +154,10 @@ struct xen_iommu_op_unmap {
 #define _XEN_IOMMUOP_unmap_all 0
 #define XEN_IOMMUOP_unmap_all (1 << (_XEN_IOMMUOP_unmap_all))
 
+    /* Is the memory specified by gfn or grant reference? */
+#define _XEN_IOMMUOP_unmap_gref 1
+#define XEN_IOMMUOP_unmap_gref (1 << (_XEN_IOMMUOP_unmap_gref))
+
     uint32_t pad;
     /*
      * IN - Segment/Bus/Device/Function of the initiator.
@@ -152,8 +167,14 @@ struct xen_iommu_op_unmap {
     uint64_t sbdf;
     /* IN - The IOMMU frame number which holds the mapping to be removed */
     xen_bfn_t bfn;
-    /* IN - The guest frame number of the page that is mapped */
-    xen_pfn_t gfn;
+    /*
+     * IN - The guest frame number or grant reference of the page that
+     * is mapped.
+     */
+    union {
+        xen_pfn_t gfn;
+        grant_ref_t gref;
+    } u;
 };
 
 /*
diff --git a/xen/include/xen/grant_table.h b/xen/include/xen/grant_table.h
index c881414e5b..35afb27202 100644
--- a/xen/include/xen/grant_table.h
+++ b/xen/include/xen/grant_table.h
@@ -63,6 +63,13 @@ int gnttab_get_shared_frame(struct domain *d, unsigned long idx,
 int gnttab_get_status_frame(struct domain *d, unsigned long idx,
                             mfn_t *mfn);
 
+int
+acquire_gref_for_iommu(struct domain *d, grant_ref_t gref,
+                       bool readonly, mfn_t *mfn);
+int
+release_gref_for_iommu(struct domain *d, grant_ref_t gref,
+                       bool readonly, mfn_t mfn);
+
 unsigned int gnttab_dom0_frames(void);
 
 #endif /* __XEN_GRANT_TABLE_H__ */
-- 
2.11.0



* Re: [PATCH v6 12/14] memory: add get_paged_gfn() as a wrapper...
  2018-08-23  9:47 ` [PATCH v6 12/14] memory: add get_paged_gfn() as a wrapper Paul Durrant
@ 2018-08-23 10:24   ` Julien Grall
  2018-08-23 10:30     ` Paul Durrant
  2018-09-11 14:56   ` Jan Beulich
  1 sibling, 1 reply; 111+ messages in thread
From: Julien Grall @ 2018-08-23 10:24 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, Jan Beulich

Hi Paul,

On 08/23/2018 10:47 AM, Paul Durrant wrote:
> ...for some uses of get_page_from_gfn().
> 
> There are many occurences of the following pattern in the code:

NIT: s/occurences/occurrences/

[...]

> +int get_paged_gfn(struct domain *d, gfn_t gfn, bool readonly,
> +                  p2m_type_t *p2mt_p, struct page_info **page_p)
>   {
> -    struct page_info *page;
> +    p2m_query_t q = readonly ? P2M_ALLOC : P2M_UNSHARE;
>       p2m_type_t p2mt;
> -    void *va;
> +    struct page_info *page;
>   
> -    page = get_page_from_gfn(d, gmfn, &p2mt, P2M_UNSHARE);
> +    page = get_page_from_gfn(d, gfn_x(gfn), &p2mt, q);
>   
>   #ifdef CONFIG_HAS_MEM_PAGING
>       if ( p2m_is_paging(p2mt) )
>       {
>           if ( page )
>               put_page(page);
> -        p2m_mem_paging_populate(d, gmfn);
> -        return -ENOENT;
> +
> +        p2m_mem_paging_populate(d, gfn_x(gfn));
> +        return -EAGAIN;
>       }
>   #endif
>   #ifdef CONFIG_HAS_MEM_SHARING
> -    if ( p2m_is_shared(p2mt) )
> +    if ( (q & P2M_UNSHARE) && p2m_is_shared(p2mt) )
>       {
>           if ( page )
>               put_page(page);
> -        return -ENOENT;
> +
> +        return -EAGAIN;
>       }
>   #endif
>   
>       if ( !page )
>           return -EINVAL;
>   
> +    if ( !p2m_is_ram(p2mt) || (!readonly && p2m_is_readonly(p2mt)) )

p2m_is_readonly does not exist on Arm. Can you please make sure this 
code build on Arm?

Cheers,

-- 
Julien Grall


* Re: [PATCH v6 12/14] memory: add get_paged_gfn() as a wrapper...
  2018-08-23 10:24   ` Julien Grall
@ 2018-08-23 10:30     ` Paul Durrant
  0 siblings, 0 replies; 111+ messages in thread
From: Paul Durrant @ 2018-08-23 10:30 UTC (permalink / raw)
  To: 'Julien Grall', xen-devel
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Tim (Xen.org),
	George Dunlap, Jan Beulich, Ian Jackson

> -----Original Message-----
> From: Julien Grall [mailto:julien.grall@arm.com]
> Sent: 23 August 2018 11:25
> To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org
> Cc: Jan Beulich <jbeulich@suse.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; George Dunlap
> <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Konrad
> Rzeszutek Wilk <konrad.wilk@oracle.com>; Stefano Stabellini
> <sstabellini@kernel.org>; Tim (Xen.org) <tim@xen.org>; Wei Liu
> <wei.liu2@citrix.com>
> Subject: Re: [PATCH v6 12/14] memory: add get_paged_gfn() as a wrapper...
> 
> Hi Paul,
> 
> On 08/23/2018 10:47 AM, Paul Durrant wrote:
> > ...for some uses of get_page_from_gfn().
> >
> > There are many occurences of the following pattern in the code:
> 
> NIT: s/occurences/occurrences/
> 
> [...]
> 
> > +int get_paged_gfn(struct domain *d, gfn_t gfn, bool readonly,
> > +                  p2m_type_t *p2mt_p, struct page_info **page_p)
> >   {
> > -    struct page_info *page;
> > +    p2m_query_t q = readonly ? P2M_ALLOC : P2M_UNSHARE;
> >       p2m_type_t p2mt;
> > -    void *va;
> > +    struct page_info *page;
> >
> > -    page = get_page_from_gfn(d, gmfn, &p2mt, P2M_UNSHARE);
> > +    page = get_page_from_gfn(d, gfn_x(gfn), &p2mt, q);
> >
> >   #ifdef CONFIG_HAS_MEM_PAGING
> >       if ( p2m_is_paging(p2mt) )
> >       {
> >           if ( page )
> >               put_page(page);
> > -        p2m_mem_paging_populate(d, gmfn);
> > -        return -ENOENT;
> > +
> > +        p2m_mem_paging_populate(d, gfn_x(gfn));
> > +        return -EAGAIN;
> >       }
> >   #endif
> >   #ifdef CONFIG_HAS_MEM_SHARING
> > -    if ( p2m_is_shared(p2mt) )
> > +    if ( (q & P2M_UNSHARE) && p2m_is_shared(p2mt) )
> >       {
> >           if ( page )
> >               put_page(page);
> > -        return -ENOENT;
> > +
> > +        return -EAGAIN;
> >       }
> >   #endif
> >
> >       if ( !page )
> >           return -EINVAL;
> >
> > +    if ( !p2m_is_ram(p2mt) || (!readonly && p2m_is_readonly(p2mt)) )
> 
> p2m_is_readonly does not exist on Arm. Can you please make sure this
> code build on Arm?
> 

Ok. Will do.

  Paul

> Cheers,
> 
> --
> Julien Grall

* Re: [PATCH v6 10/14] mm / iommu: split need_iommu() into has_iommu_pt() and need_iommu_pt_sync()
  2018-08-23  9:47 ` [PATCH v6 10/14] mm / iommu: split need_iommu() into has_iommu_pt() and need_iommu_pt_sync() Paul Durrant
@ 2018-08-23 11:10   ` Razvan Cojocaru
  2018-09-11 14:31   ` Jan Beulich
  1 sibling, 0 replies; 111+ messages in thread
From: Razvan Cojocaru @ 2018-08-23 11:10 UTC (permalink / raw)
  To: Paul Durrant, xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jan Beulich,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, Tamas K Lengyel, Jun Nakajima, Brian Woods,
	Suravee Suthikulpanit

On 8/23/18 12:47 PM, Paul Durrant wrote:
> The name 'need_iommu()' is a little confusing as it suggests a domain needs
> to use the IOMMU but something might not be set up yet, when in fact it
> represents a tri-state value (not a boolean as might be expected) where
> -1 means 'IOMMU mappings being set up' and 1 means 'IOMMU mappings have
> been fully set up'.
> 
> Two different meanings are also inferred from the macro in various
> places in the code:
> 
> - Some callers want to test whether a domain has IOMMU mappings at all
> - Some callers want to test whether they need to synchronize the domain's
>   P2M and IOMMU mappings
> 
> This patch replaces the 'need_iommu' tri-state value with a defined
> enumeration and adds a boolean flag 'need_sync' to separate these meanings,
> and places both of these in struct domain_iommu, rather than directly in
> struct domain.
> This patch also creates two new boolean macros:
> 
> - 'has_iommu_pt()' evaluates to true if a domain has IOMMU mappings, even
>   if they are still under construction.
> - 'need_iommu_pt_sync()' evaluates to true if a domain requires explicit
>   synchronization of the P2M and IOMMU mappings.
> 
> All callers of need_iommu() are then modified to use the macro appropriate
> to what they are trying to test.
> 
> NOTE: The test of need_iommu(d) in the AMD IOMMU initialization code has
>       been replaced with a test of iommu_dom0_strict since this more
>       accurately reflects the meaning of the test and brings it into
>       line with a similar test in the Intel VT-d code.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>

For the vm_event part:
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>


Thanks,
Razvan
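
A minimal sketch of the split the quoted commit message describes (the
enumeration name and its values are assumptions; need_sync is the flag
the series adds to struct domain_iommu):

    enum iommu_status
    {
        IOMMU_STATUS_disabled,     /* no IOMMU page tables */
        IOMMU_STATUS_initializing, /* mappings being set up (was -1) */
        IOMMU_STATUS_initialized,  /* mappings fully set up (was 1) */
    };

    /* true if the domain has IOMMU mappings, even if incomplete */
    #define has_iommu_pt(d) (dom_iommu(d)->status != IOMMU_STATUS_disabled)

    /* true if P2M changes must be explicitly mirrored to the IOMMU */
    #define need_iommu_pt_sync(d) (dom_iommu(d)->need_sync)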


* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-08-23  9:46 ` [PATCH v6 01/14] iommu: introduce the concept of BFN Paul Durrant
@ 2018-08-30 15:59   ` Jan Beulich
  2018-09-03  8:23     ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-08-30 15:59 UTC (permalink / raw)
  To: Paul Durrant
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Kevin Tian,
	Suravee Suthikulpanit

>>> On 23.08.18 at 11:46, <paul.durrant@citrix.com> wrote:
> --- a/xen/include/xen/mm.h
> +++ b/xen/include/xen/mm.h
> @@ -26,6 +26,11 @@
>   *   A linear idea of a guest physical address space. For an auto-translated
>   *   guest, pfn == gfn while for a non-translated guest, pfn != gfn.
>   *
> + * bfn: Bus Frame Number (definitions in include/xen/iommu.h)
> + *   The linear frame numbers of IOMMU address space. All initiators for (i.e.
> + *   all devices assigned to) a guest share a single IOMMU address space and,
> + *   by default, Xen will ensure bfn == pfn.

The code changes are purely mechanical and hence fine, but I have
to admit I continue to struggle with the "bus" part in the name here:
I don't think it is any less ambiguous than GFN, because which bus
are you thinking about here? The (virtual) one as seen by the guest,
aiui. The physical (host) one would be at least as natural to be
indexed by such typed/named variables. I'd somehow like it to be
made explicit in the name whose view these represent. GBFN?
VBFN?

Jan




* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-08-30 15:59   ` Jan Beulich
@ 2018-09-03  8:23     ` Paul Durrant
  2018-09-03 11:46       ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-03  8:23 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Kevin Tian,
	Suravee Suthikulpanit

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 30 August 2018 17:00
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Julien Grall
> <julien.grall@arm.com>; Kevin Tian <kevin.tian@intel.com>; Stefano
> Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> devel@lists.xenproject.org>
> Subject: Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
> 
> >>> On 23.08.18 at 11:46, <paul.durrant@citrix.com> wrote:
> > --- a/xen/include/xen/mm.h
> > +++ b/xen/include/xen/mm.h
> > @@ -26,6 +26,11 @@
> >   *   A linear idea of a guest physical address space. For an auto-translated
> >   *   guest, pfn == gfn while for a non-translated guest, pfn != gfn.
> >   *
> > + * bfn: Bus Frame Number (definitions in include/xen/iommu.h)
> > + *   The linear frame numbers of IOMMU address space. All initiators for
> (i.e.
> > + *   all devices assigned to) a guest share a single IOMMU address space
> and,
> > + *   by default, Xen will ensure bfn == pfn.
> 
> The code changes are purely mechanical and hence fine, but I have
> to admit I continue to struggle with the "bus" part in the name here:
> I don't think it is any less ambiguous than GFN, because which bus
> are you thinking about here? The (virtual) one as seen by the guest,
> aiui. The physical (host) one would be at least as natural to be
> indexed by such typed/named variables. I'd somehow like it to be
> made explicit in the name whose view these represent. GBFN?
> VBFN?
> 

Well, it always refers to whatever physical bus is on the other side of
the IOMMU from the core, so it's the view of whatever peripheral devices
are located on that bus. As Kevin said, each device can have its own
address space, and the fact that we always use a global per-VM space is
an implementation detail, so DFN for 'device frame number' or IOFN
(since IOVA is a reasonably widely used term) might be more future-proof?

> Jan
> 



* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-09-03  8:23     ` Paul Durrant
@ 2018-09-03 11:46       ` Jan Beulich
  2018-09-04  6:48         ` Tian, Kevin
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-03 11:46 UTC (permalink / raw)
  To: Paul Durrant
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Kevin Tian,
	Suravee Suthikulpanit

>>> On 03.09.18 at 10:23, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 30 August 2018 17:00
>> 
>> >>> On 23.08.18 at 11:46, <paul.durrant@citrix.com> wrote:
>> > --- a/xen/include/xen/mm.h
>> > +++ b/xen/include/xen/mm.h
>> > @@ -26,6 +26,11 @@
>> >   *   A linear idea of a guest physical address space. For an auto-translated
>> >   *   guest, pfn == gfn while for a non-translated guest, pfn != gfn.
>> >   *
>> > + * bfn: Bus Frame Number (definitions in include/xen/iommu.h)
>> > + *   The linear frame numbers of IOMMU address space. All initiators for
>> (i.e.
>> > + *   all devices assigned to) a guest share a single IOMMU address space
>> and,
>> > + *   by default, Xen will ensure bfn == pfn.
>> 
>> The code changes are purely mechanical and hence fine, but I have
>> to admit I continue to struggle with the "bus" part in the name here:
>> I don't think it is any less ambiguous than GFN, because which bus
>> are you thinking about here? The (virtual) one as seen by the guest,
>> aiui. The physical (host) one would be at least as natural to be
>> indexed by such typed/named variables. I'd somehow like it to be
>> made explicit in the name whose view these represent. GBFN?
>> VBFN?
> 
> Well, it always refers to whatever physical bus is on the other side of the 
> IOMMU from the core so it's the view of whatever peripheral devices are 
> located on that bus. As Kevin said each device can have its own address space 
> and the fact we always use a global per-VM space is an implementation detail 
> so DFN for 'device frame number' or IOFN (since IOVA is reasonably widely 
> used term) might be more future-proof?

I don't think any name alone would make things future-proof. Considering
the per-device address space case, a frame number would always need
to come in a tuple paired with a device identifier. Nevertheless I think
either of the names you suggest would be a slight improvement, perhaps
DFN even more than IOFN, as that includes whose view this is. But I'd
certainly appreciate opinions of others, so you don't again end up doing
something that I've asked for, just to subsequently undo it again.

Jan




* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-09-03 11:46       ` Jan Beulich
@ 2018-09-04  6:48         ` Tian, Kevin
  2018-09-04  8:32           ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Tian, Kevin @ 2018-09-04  6:48 UTC (permalink / raw)
  To: Jan Beulich, Paul Durrant
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Suravee Suthikulpanit

> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Monday, September 3, 2018 7:47 PM
> 
> >>> On 03.09.18 at 10:23, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 30 August 2018 17:00
> >>
> >> >>> On 23.08.18 at 11:46, <paul.durrant@citrix.com> wrote:
> >> > --- a/xen/include/xen/mm.h
> >> > +++ b/xen/include/xen/mm.h
> >> > @@ -26,6 +26,11 @@
> >> >   *   A linear idea of a guest physical address space. For an auto-
> translated
> >> >   *   guest, pfn == gfn while for a non-translated guest, pfn != gfn.
> >> >   *
> >> > + * bfn: Bus Frame Number (definitions in include/xen/iommu.h)
> >> > + *   The linear frame numbers of IOMMU address space. All initiators
> for
> >> (i.e.
> >> > + *   all devices assigned to) a guest share a single IOMMU address
> space
> >> and,
> >> > + *   by default, Xen will ensure bfn == pfn.
> >>
> >> The code changes are purely mechanical and hence fine, but I have
> >> to admit I continue to struggle with the "bus" part in the name here:
> >> I don't think it is any less ambiguous than GFN, because which bus
> >> are you thinking about here? The (virtual) one as seen by the guest,
> >> aiui. The physical (host) one would be at least as natural to be
> >> indexed by such typed/named variables. I'd somehow like it to be
> >> made explicit in the name whose view these represent. GBFN?
> >> VBFN?
> >
> > Well, it always refers to whatever physical bus is on the other side of the
> > IOMMU from the core so it's the view of whatever peripheral devices are
> > located on that bus. As Kevin said each device can have its own address
> space
> > and the fact we always use a global per-VM space is an implementation
> detail
> > so DFN for 'device frame number' or IOFN (since IOVA is reasonably
> widely
> > used term) might be more future-proof?
> 
> I don't think any name alone would make things future-proof. Considering
> the per-device address space case, a frame number would always need
> to come in a tuple paired with a device identifier. Nevertheless I think
> either of the names you suggest would be a slight improvement, perhaps
> DFN even more than IOFN, as that includes whose view this is. But I'd
> certainly appreciate opinions of others, so you don't again end up doing
> something that I've asked for,just to subsequently undo it again.
> 

Bus address is commonly used alongside physical/virtual address to
represent the different views between devices and the CPU. From that
angle I think BFN is a clear term in this context. BTW, it is not
necessary to differentiate GBFN and MBFN since there is only one BFN
view per device.

DFN (device frame number) makes me think about something on the
device. What does a device address mean? Even DMA frame number
(thus DMA address) sounds clearer with the same DFN notation.

IOFN/IOVA is less preferable to me, as in most cases it refers to
virtual addresses explicitly managed by the device driver for use in
I/O requests; it does not fit the case where iofn == pfn.

So I prefer staying with BFN. :-)

Thanks
Kevin



* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-09-04  6:48         ` Tian, Kevin
@ 2018-09-04  8:32           ` Jan Beulich
  2018-09-04  8:37             ` Tian, Kevin
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-04  8:32 UTC (permalink / raw)
  To: Kevin Tian
  Cc: xen-devel, Julien Grall, Paul Durrant, Stefano Stabellini,
	Suravee Suthikulpanit

>>> On 04.09.18 at 08:48, <kevin.tian@intel.com> wrote:
>>  From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Monday, September 3, 2018 7:47 PM
>> 
>> >>> On 03.09.18 at 10:23, <Paul.Durrant@citrix.com> wrote:
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: 30 August 2018 17:00
>> >>
>> >> >>> On 23.08.18 at 11:46, <paul.durrant@citrix.com> wrote:
>> >> > --- a/xen/include/xen/mm.h
>> >> > +++ b/xen/include/xen/mm.h
>> >> > @@ -26,6 +26,11 @@
>> >> >   *   A linear idea of a guest physical address space. For an auto-
>> translated
>> >> >   *   guest, pfn == gfn while for a non-translated guest, pfn != gfn.
>> >> >   *
>> >> > + * bfn: Bus Frame Number (definitions in include/xen/iommu.h)
>> >> > + *   The linear frame numbers of IOMMU address space. All initiators
>> for
>> >> (i.e.
>> >> > + *   all devices assigned to) a guest share a single IOMMU address
>> space
>> >> and,
>> >> > + *   by default, Xen will ensure bfn == pfn.
>> >>
>> >> The code changes are purely mechanical and hence fine, but I have
>> >> to admit I continue to struggle with the "bus" part in the name here:
>> >> I don't think it is any less ambiguous than GFN, because which bus
>> >> are you thinking about here? The (virtual) one as seen by the guest,
>> >> aiui. The physical (host) one would be at least as natural to be
>> >> indexed by such typed/named variables. I'd somehow like it to be
>> >> made explicit in the name whose view these represent. GBFN?
>> >> VBFN?
>> >
>> > Well, it always refers to whatever physical bus is on the other side of the
>> > IOMMU from the core so it's the view of whatever peripheral devices are
>> > located on that bus. As Kevin said each device can have its own address
>> space
>> > and the fact we always use a global per-VM space is an implementation
>> detail
>> > so DFN for 'device frame number' or IOFN (since IOVA is reasonably
>> widely
>> > used term) might be more future-proof?
>> 
>> I don't think any name alone would make things future-proof. Considering
>> the per-device address space case, a frame number would always need
>> to come in a tuple paired with a device identifier. Nevertheless I think
>> either of the names you suggest would be a slight improvement, perhaps
>> DFN even more than IOFN, as that includes whose view this is. But I'd
>> certainly appreciate opinions of others, so you don't again end up doing
>> something that I've asked for, just to subsequently undo it again.
>> 
> 
> bus address is commonly used along with physical/virtual address, to
> represent different views between devices and CPU. From that angle
> I think BFN is a clear term in this context. btw it is not necessary to
> differentiate GBFN and MBFN since there is only one BFN view per 
> device.

Sure, but you neglect the presence of one or more IOMMUs when you say
"between devices and CPU". The addresses prior to and after IOMMU
translation are distinct, and while the one before the translation
matches the device view, the one after translation does not
necessarily match the CPU view. Hence there are two "bus" frame
numbers here - one representing the device view, and the other
representing the IOMMU (output) view.

Jan




* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-09-04  8:32           ` Jan Beulich
@ 2018-09-04  8:37             ` Tian, Kevin
  2018-09-04  8:47               ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Tian, Kevin @ 2018-09-04  8:37 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Julien Grall, Paul Durrant, Stefano Stabellini,
	Suravee Suthikulpanit

> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Tuesday, September 4, 2018 4:33 PM
> >
> > bus address is commonly used along with physical/virtual address, to
> > represent different views between devices and CPU. From that angle
> > I think BFN is a clear term in this context. btw it is not necessary to
> > differentiate GBFN and MBFN since there is only one BFN view per
> > device.
> 
> Sure, but you neglect the presence of one or more IOMMUs when
> you say "between devices and CPU". There addresses prior to and
> after IOMMU translation are distinct, and while the one before the
> translation matches the device view, the one after translation does
> not necessarily match the CPU view. Hence there are two "bus"
> frame numbers here - one representing the device view, and the
> other representing the IOMMU (output) view.
> 

I didn't get that. The output address from the IOMMU is the one sent
to the memory controller, the same as the one sent from the CPU.

Thanks
Kevin


* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-09-04  8:37             ` Tian, Kevin
@ 2018-09-04  8:47               ` Jan Beulich
  2018-09-04  8:49                 ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-04  8:47 UTC (permalink / raw)
  To: Kevin Tian
  Cc: xen-devel, Julien Grall, Paul Durrant, Stefano Stabellini,
	Suravee Suthikulpanit

>>> On 04.09.18 at 10:37, <kevin.tian@intel.com> wrote:
>>  From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Tuesday, September 4, 2018 4:33 PM
>> >
>> > bus address is commonly used along with physical/virtual address, to
>> > represent different views between devices and CPU. From that angle
>> > I think BFN is a clear term in this context. btw it is not necessary to
>> > differentiate GBFN and MBFN since there is only one BFN view per
>> > device.
>> 
>> Sure, but you neglect the presence of one or more IOMMUs when
>> you say "between devices and CPU". There addresses prior to and
>> after IOMMU translation are distinct, and while the one before the
>> translation matches the device view, the one after translation does
>> not necessarily match the CPU view. Hence there are two "bus"
>> frame numbers here - one representing the device view, and the
>> other representing the IOMMU (output) view.
>> 
> 
> I didn't get. the output address from IOMMU is the one sent to
> memory controller, same as the one sent from CPU.

That's on present x86 systems, but aiui not in the general case. The
terminology to be used in Xen should fit the general case though.

Jan




* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-09-04  8:47               ` Jan Beulich
@ 2018-09-04  8:49                 ` Paul Durrant
  2018-09-04  9:08                   ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-04  8:49 UTC (permalink / raw)
  To: 'Jan Beulich', Kevin Tian
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Suravee Suthikulpanit

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 04 September 2018 09:47
> To: Kevin Tian <kevin.tian@intel.com>
> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Julien Grall
> <julien.grall@arm.com>; Paul Durrant <Paul.Durrant@citrix.com>; Stefano
> Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> devel@lists.xenproject.org>
> Subject: Re: [Xen-devel] [PATCH v6 01/14] iommu: introduce the concept of
> BFN...
> 
> >>> On 04.09.18 at 10:37, <kevin.tian@intel.com> wrote:
> >>  From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: Tuesday, September 4, 2018 4:33 PM
> >> >
> >> > bus address is commonly used along with physical/virtual address, to
> >> > represent different views between devices and CPU. From that angle
> >> > I think BFN is a clear term in this context. btw it is not necessary to
> >> > differentiate GBFN and MBFN since there is only one BFN view per
> >> > device.
> >>
> >> Sure, but you neglect the presence of one or more IOMMUs when
> >> you say "between devices and CPU". There addresses prior to and
> >> after IOMMU translation are distinct, and while the one before the
> >> translation matches the device view, the one after translation does
> >> not necessarily match the CPU view. Hence there are two "bus"
> >> frame numbers here - one representing the device view, and the
> >> other representing the IOMMU (output) view.
> >>
> >
> > I didn't get. the output address from IOMMU is the one sent to
> > memory controller, same as the one sent from CPU.
> 
> That's on present x86 systems, but aiui not in the general case. The
> terminology to be used in Xen should fit the general case though.

So your concern is cascaded IOMMUs?

  Paul

> 
> Jan
> 



* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-09-04  8:49                 ` Paul Durrant
@ 2018-09-04  9:08                   ` Jan Beulich
  2018-09-05  0:42                     ` Tian, Kevin
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-04  9:08 UTC (permalink / raw)
  To: Paul Durrant
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Kevin Tian,
	Suravee Suthikulpanit

>>> On 04.09.18 at 10:49, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 04 September 2018 09:47
>> To: Kevin Tian <kevin.tian@intel.com>
>> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Julien Grall
>> <julien.grall@arm.com>; Paul Durrant <Paul.Durrant@citrix.com>; Stefano
>> Stabellini <sstabellini@kernel.org>; xen-devel <xen-
>> devel@lists.xenproject.org>
>> Subject: Re: [Xen-devel] [PATCH v6 01/14] iommu: introduce the concept of
>> BFN...
>> 
>> >>> On 04.09.18 at 10:37, <kevin.tian@intel.com> wrote:
>> >>  From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: Tuesday, September 4, 2018 4:33 PM
>> >> >
>> >> > bus address is commonly used along with physical/virtual address, to
>> >> > represent different views between devices and CPU. From that angle
>> >> > I think BFN is a clear term in this context. btw it is not necessary to
>> >> > differentiate GBFN and MBFN since there is only one BFN view per
>> >> > device.
>> >>
>> >> Sure, but you neglect the presence of one or more IOMMUs when
>> >> you say "between devices and CPU". There addresses prior to and
>> >> after IOMMU translation are distinct, and while the one before the
>> >> translation matches the device view, the one after translation does
>> >> not necessarily match the CPU view. Hence there are two "bus"
>> >> frame numbers here - one representing the device view, and the
>> >> other representing the IOMMU (output) view.
>> >>
>> >
>> > I didn't get. the output address from IOMMU is the one sent to
>> > memory controller, same as the one sent from CPU.
>> 
>> That's on present x86 systems, but aiui not in the general case. The
>> terminology to be used in Xen should fit the general case though.
> 
> So your concern is cascaded IOMMUs?

Not primarily. My concern is systems with an I/O address space
(behind the IOMMU) distinct from the CPU address space. IIRC at least
Alpha is/was that way.

Jan




* Re: [PATCH v6 02/14] iommu: make use of type-safe BFN and MFN in exported functions
  2018-08-23  9:46 ` [PATCH v6 02/14] iommu: make use of type-safe BFN and MFN in exported functions Paul Durrant
@ 2018-09-04 10:29   ` Jan Beulich
  0 siblings, 0 replies; 111+ messages in thread
From: Jan Beulich @ 2018-09-04 10:29 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, Jun Nakajima, xen-devel

>>> On 23.08.18 at 11:46, <paul.durrant@citrix.com> wrote:
> This patch modifies the declaration of the entry points to the IOMMU
> sub-system to use bfn_t and mfn_t in place of unsigned long. A subsequent
> patch will similarly modify the methods in the iommu_ops structure.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Reviewed-by: Wei Liu <wei.liu2@citrix.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
(obviously subject to possible renaming as per discussion on patch 1)
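
[For reference, "type-safe" here means the usual Xen wrapper pattern;
a minimal sketch of what such a bfn_t amounts to, assuming the same
shape as the existing mfn_t/gfn_t - the patch's exact definitions may
differ:]

/* Roughly what a TYPE_SAFE(unsigned long, bfn) style wrapper expands
 * to (sketch only): */
typedef struct { unsigned long bfn; } bfn_t;

static inline bfn_t _bfn(unsigned long n) { return (bfn_t){ .bfn = n }; }
static inline unsigned long bfn_x(bfn_t b) { return b.bfn; }

With that, the compiler rejects accidental mixing of bfn_t, gfn_t and
mfn_t arguments rather than silently accepting any unsigned long.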




* Re: [PATCH v6 03/14] iommu: push use of type-safe BFN and MFN into iommu_ops
  2018-08-23  9:47 ` [PATCH v6 03/14] iommu: push use of type-safe BFN and MFN into iommu_ops Paul Durrant
@ 2018-09-04 10:32   ` Jan Beulich
  0 siblings, 0 replies; 111+ messages in thread
From: Jan Beulich @ 2018-09-04 10:32 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Andrew Cooper, george.dunlap, Suravee Suthikulpanit, xen-devel

>>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> This patch modifies the methods in struct iommu_ops to use type-safe BFN
> and MFN. This follows on from the prior patch that modified the functions
> exported in xen/iommu.h.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> Reviewed-by: Wei Liu <wei.liu2@citrix.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
(same note as for patch 2)
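
[Sketch of the shape of this change, with illustrative signatures -
the actual method list and parameters are those in the patch:]

struct iommu_ops {
    /* Typed frame numbers replace bare unsigned longs here. */
    int (*map_page)(struct domain *d, bfn_t bfn, mfn_t mfn,
                    unsigned int flags);
    int (*unmap_page)(struct domain *d, bfn_t bfn);
    /* ... remaining methods unchanged ... */
};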




* Re: [PATCH v6 04/14] iommu: don't domain_crash() inside iommu_map/unmap_page()
  2018-08-23  9:47 ` [PATCH v6 04/14] iommu: don't domain_crash() inside iommu_map/unmap_page() Paul Durrant
@ 2018-09-04 10:38   ` Jan Beulich
  2018-09-04 10:39     ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-04 10:38 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall,
	Jun Nakajima, xen-devel

>>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -1154,6 +1154,7 @@ map_grant_ref(
>          }
>          if ( err )
>          {
> +            domu_crash(ld);
>              double_gt_unlock(lgt, rgt);

You crash the domain with both locks held here, but ...

> @@ -1406,7 +1407,10 @@ unmap_common(
>          double_gt_unlock(lgt, rgt);
>  
>          if ( err )
> +        {
> +            domu_crash(ld);
>              rc = GNTST_general_error;
> +        }

... outside of the locked region here. I think the latter is fine, and
hence the former should be changed.
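
I.e. something like this in map_grant_ref() (sketch, untested;
surrounding context abridged):

        if ( err )
        {
            double_gt_unlock(lgt, rgt); /* drop both locks first ... */
            domu_crash(ld);             /* ... then crash outside them */
            /* ... */
        }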

With that
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan




* Re: [PATCH v6 04/14] iommu: don't domain_crash() inside iommu_map/unmap_page()
  2018-09-04 10:38   ` Jan Beulich
@ 2018-09-04 10:39     ` Paul Durrant
  0 siblings, 0 replies; 111+ messages in thread
From: Paul Durrant @ 2018-09-04 10:39 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Andrew Cooper,
	Tim (Xen.org),
	George Dunlap, Julien Grall, Jun Nakajima, xen-devel,
	Ian Jackson

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 04 September 2018 11:38
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Wei Liu <wei.liu2@citrix.com>; George
> Dunlap <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>;
> Jun Nakajima <jun.nakajima@intel.com>; Kevin Tian
> <kevin.tian@intel.com>; Stefano Stabellini <sstabellini@kernel.org>; xen-
> devel <xen-devel@lists.xenproject.org>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Tim (Xen.org) <tim@xen.org>
> Subject: Re: [PATCH v6 04/14] iommu: don't domain_crash() inside
> iommu_map/unmap_page()
> 
> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> > --- a/xen/common/grant_table.c
> > +++ b/xen/common/grant_table.c
> > @@ -1154,6 +1154,7 @@ map_grant_ref(
> >          }
> >          if ( err )
> >          {
> > +            domu_crash(ld);
> >              double_gt_unlock(lgt, rgt);
> 
> You crash the domain with both locks held here, but ...
> 
> > @@ -1406,7 +1407,10 @@ unmap_common(
> >          double_gt_unlock(lgt, rgt);
> >
> >          if ( err )
> > +        {
> > +            domu_crash(ld);
> >              rc = GNTST_general_error;
> > +        }
> 
> ... outside of the locked region here. I think the latter is fine, and
> hence the former should be changed.

Sure. Will do.

> 
> With that
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 

Thanks,

  Paul

> Jan
> 



* Re: [PATCH v6 05/14] public / x86: introduce __HYPERCALL_iommu_op
  2018-08-23  9:47 ` [PATCH v6 05/14] public / x86: introduce __HYPERCALL_iommu_op Paul Durrant
@ 2018-09-04 11:50   ` Jan Beulich
  2018-09-04 12:23     ` Paul Durrant
  2018-09-07 10:52   ` Jan Beulich
  1 sibling, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-04 11:50 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, George Dunlap, Andrew Cooper,
	Ian Jackson, Tim Deegan, xen-devel

>>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> --- /dev/null
> +++ b/xen/common/iommu_op.c
> @@ -0,0 +1,184 @@
> +/******************************************************************************
> + * x86/iommu_op.c

Oops?

> +int do_one_iommu_op(xen_iommu_op_buf_t *buf)
> +{
> +    xen_iommu_op_t op;
> +    int rc;
> +
> +    if ( buf->size < sizeof(op) )
> +        return -EFAULT;
> +
> +    if ( copy_from_guest((void *)&op, buf->h, sizeof(op)) )

This cast could be avoided if you made ...

> +        return -EFAULT;
> +
> +    if ( op.pad )
> +        return -EINVAL;
> +
> +    rc = xsm_iommu_op(XSM_PRIV, current->domain, op.op);
> +    if ( rc )
> +        return rc;
> +
> +    iommu_op(&op);
> +
> +    if ( __copy_field_to_guest(guest_handle_cast(buf->h, xen_iommu_op_t),

... this cast the initializer of a local variable of suitable handle
type (same on the compat path then).
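
I.e. something like this (sketch, untested; the exact handle type
depends on what guest_handle_cast() yields for buf->h):

    XEN_GUEST_HANDLE_PARAM(xen_iommu_op_t) h =
        guest_handle_cast(buf->h, xen_iommu_op_t);

    if ( copy_from_guest(&op, h, 1) ) /* no (void *) cast needed */
        return -EFAULT;

    /* ... */

    if ( __copy_field_to_guest(h, &op, status) )
        return -EFAULT;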

> +int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
> +{
> +    compat_iommu_op_t cmp;
> +    xen_iommu_op_t nat;
> +    int rc;
> +
> +    if ( buf->size < sizeof(cmp) )
> +        return -EFAULT;
> +
> +    if ( copy_from_compat((void *)&cmp, buf->h, sizeof(cmp)) )
> +        return -EFAULT;
> +
> +    if ( cmp.pad )
> +        return -EINVAL;
> +
> +    rc = xsm_iommu_op(XSM_PRIV, current->domain, cmp.op);
> +    if ( rc )
> +        return rc;
> +
> +    XLAT_iommu_op(&nat, &cmp);
> +
> +    iommu_op(&nat);
> +
> +    XLAT_iommu_op(&cmp, &nat);
> +
> +    if ( __copy_field_to_compat(compat_handle_cast(buf->h,
> +                                                   compat_iommu_op_t),
> +                                &cmp, status) )

Since you're only after the status field, perhaps better to avoid the
full-blown reverse XLAT_iommu_op() and copy just that one field?

> --- a/xen/include/Makefile
> +++ b/xen/include/Makefile
> @@ -11,6 +11,7 @@ headers-y := \
>      compat/features.h \
>      compat/grant_table.h \
>      compat/kexec.h \
> +    compat/iommu_op.h \
>      compat/memory.h \
>      compat/nmi.h \
>      compat/physdev.h \

I guess this is just an off-by-one wrt sorting?

> @@ -29,6 +30,7 @@ headers-$(CONFIG_X86)     += compat/arch-x86/xen-$(compat-arch-y).h
>  headers-$(CONFIG_X86)     += compat/hvm/dm_op.h
>  headers-$(CONFIG_X86)     += compat/hvm/hvm_op.h
>  headers-$(CONFIG_X86)     += compat/hvm/hvm_vcpu.h
> +headers-$(CONFIG_X86)     += compat/iommu_op.h

Did you forget to remove this when adding the entry above?

Jan




* Re: [PATCH v6 05/14] public / x86: introduce __HYPERCALL_iommu_op
  2018-09-04 11:50   ` Jan Beulich
@ 2018-09-04 12:23     ` Paul Durrant
  2018-09-04 12:55       ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-04 12:23 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Tim (Xen.org),
	George Dunlap, Ian Jackson, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 04 September 2018 12:50
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Wei Liu
> <wei.liu2@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Ian
> Jackson <Ian.Jackson@citrix.com>; Stefano Stabellini
> <sstabellini@kernel.org>; xen-devel <xen-devel@lists.xenproject.org>;
> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Tim (Xen.org)
> <tim@xen.org>
> Subject: Re: [PATCH v6 05/14] public / x86: introduce
> __HYPERCALL_iommu_op
> 
> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> > --- /dev/null
> > +++ b/xen/common/iommu_op.c
> > @@ -0,0 +1,184 @@
> >
> +/*********************************************************
> *********************
> > + * x86/iommu_op.c
> 
> Oops?
> 

Yep. Missed that in the move to common.

> > +int do_one_iommu_op(xen_iommu_op_buf_t *buf)
> > +{
> > +    xen_iommu_op_t op;
> > +    int rc;
> > +
> > +    if ( buf->size < sizeof(op) )
> > +        return -EFAULT;
> > +
> > +    if ( copy_from_guest((void *)&op, buf->h, sizeof(op)) )
> 
> This cast could be avoided if you made ...
> 
> > +        return -EFAULT;
> > +
> > +    if ( op.pad )
> > +        return -EINVAL;
> > +
> > +    rc = xsm_iommu_op(XSM_PRIV, current->domain, op.op);
> > +    if ( rc )
> > +        return rc;
> > +
> > +    iommu_op(&op);
> > +
> > +    if ( __copy_field_to_guest(guest_handle_cast(buf->h,
> xen_iommu_op_t),
> 
> ... this cast the initializer of a local variable of suitable handle
> type (same on the compat path then).

Ok. I'll look at that.

> 
> > +int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
> > +{
> > +    compat_iommu_op_t cmp;
> > +    xen_iommu_op_t nat;
> > +    int rc;
> > +
> > +    if ( buf->size < sizeof(cmp) )
> > +        return -EFAULT;
> > +
> > +    if ( copy_from_compat((void *)&cmp, buf->h, sizeof(cmp)) )
> > +        return -EFAULT;
> > +
> > +    if ( cmp.pad )
> > +        return -EINVAL;
> > +
> > +    rc = xsm_iommu_op(XSM_PRIV, current->domain, cmp.op);
> > +    if ( rc )
> > +        return rc;
> > +
> > +    XLAT_iommu_op(&nat, &cmp);
> > +
> > +    iommu_op(&nat);
> > +
> > +    XLAT_iommu_op(&cmp, &nat);
> > +
> > +    if ( __copy_field_to_compat(compat_handle_cast(buf->h,
> > +                                                   compat_iommu_op_t),
> > +                                &cmp, status) )
> 
> Since you're only after the status field, perhaps better to avoid the
> full-blown reverse XLAT_iommu_op() and copy just that one field?
> 

I kind of like the fact that the two calls mirror each other so I'd prefer to keep it.

> > --- a/xen/include/Makefile
> > +++ b/xen/include/Makefile
> > @@ -11,6 +11,7 @@ headers-y := \
> >      compat/features.h \
> >      compat/grant_table.h \
> >      compat/kexec.h \
> > +    compat/iommu_op.h \
> >      compat/memory.h \
> >      compat/nmi.h \
> >      compat/physdev.h \
> 
> I guess this is just an off-by-one wrt sorting?
> 

Yep. I'll move.

> > @@ -29,6 +30,7 @@ headers-$(CONFIG_X86)     += compat/arch-x86/xen-
> $(compat-arch-y).h
> >  headers-$(CONFIG_X86)     += compat/hvm/dm_op.h
> >  headers-$(CONFIG_X86)     += compat/hvm/hvm_op.h
> >  headers-$(CONFIG_X86)     += compat/hvm/hvm_vcpu.h
> > +headers-$(CONFIG_X86)     += compat/iommu_op.h
> 
> Did you forget to remove this when adding the entry above?
> 

Yes, it should have gone.

  Paul

> Jan
> 



* Re: [PATCH v6 05/14] public / x86: introduce __HYPERCALL_iommu_op
  2018-09-04 12:23     ` Paul Durrant
@ 2018-09-04 12:55       ` Jan Beulich
  2018-09-04 13:17         ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-04 12:55 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Tim Deegan,
	george.dunlap, Ian Jackson, xen-devel

>>> On 04.09.18 at 14:23, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 04 September 2018 12:50
>> 
>> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
>> > +int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
>> > +{
>> > +    compat_iommu_op_t cmp;
>> > +    xen_iommu_op_t nat;
>> > +    int rc;
>> > +
>> > +    if ( buf->size < sizeof(cmp) )
>> > +        return -EFAULT;
>> > +
>> > +    if ( copy_from_compat((void *)&cmp, buf->h, sizeof(cmp)) )
>> > +        return -EFAULT;
>> > +
>> > +    if ( cmp.pad )
>> > +        return -EINVAL;
>> > +
>> > +    rc = xsm_iommu_op(XSM_PRIV, current->domain, cmp.op);
>> > +    if ( rc )
>> > +        return rc;
>> > +
>> > +    XLAT_iommu_op(&nat, &cmp);
>> > +
>> > +    iommu_op(&nat);
>> > +
>> > +    XLAT_iommu_op(&cmp, &nat);
>> > +
>> > +    if ( __copy_field_to_compat(compat_handle_cast(buf->h,
>> > +                                                   compat_iommu_op_t),
>> > +                                &cmp, status) )
>> 
>> Since you're only after the status field, perhaps better to avoid the
>> full-blown reverse XLAT_iommu_op() and copy just that one field?
>> 
> 
> I kind of like the fact that the two calls mirror each other so I'd prefer 
> to keep it.

Would you mind looking at the generated code (once you have a few
sub-ops in place)? If the compiler manages to remove most of the
cruft, I'd be fine keeping it as is. If, however, a whole lot of extra
code gets generated, I'd really like to ask you to use the shorter
form.

Jan




* Re: [PATCH v6 05/14] public / x86: introduce __HYPERCALL_iommu_op
  2018-09-04 12:55       ` Jan Beulich
@ 2018-09-04 13:17         ` Paul Durrant
  0 siblings, 0 replies; 111+ messages in thread
From: Paul Durrant @ 2018-09-04 13:17 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Stefano Stabellini, Wei Liu, Andrew Cooper, Tim (Xen.org),
	George Dunlap, Ian Jackson, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 04 September 2018 13:55
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; George Dunlap
> <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Wei Liu
> <wei.liu2@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>; xen-
> devel <xen-devel@lists.xenproject.org>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Tim (Xen.org) <tim@xen.org>
> Subject: RE: [PATCH v6 05/14] public / x86: introduce
> __HYPERCALL_iommu_op
> 
> >>> On 04.09.18 at 14:23, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 04 September 2018 12:50
> >>
> >> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> >> > +int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
> >> > +{
> >> > +    compat_iommu_op_t cmp;
> >> > +    xen_iommu_op_t nat;
> >> > +    int rc;
> >> > +
> >> > +    if ( buf->size < sizeof(cmp) )
> >> > +        return -EFAULT;
> >> > +
> >> > +    if ( copy_from_compat((void *)&cmp, buf->h, sizeof(cmp)) )
> >> > +        return -EFAULT;
> >> > +
> >> > +    if ( cmp.pad )
> >> > +        return -EINVAL;
> >> > +
> >> > +    rc = xsm_iommu_op(XSM_PRIV, current->domain, cmp.op);
> >> > +    if ( rc )
> >> > +        return rc;
> >> > +
> >> > +    XLAT_iommu_op(&nat, &cmp);
> >> > +
> >> > +    iommu_op(&nat);
> >> > +
> >> > +    XLAT_iommu_op(&cmp, &nat);
> >> > +
> >> > +    if ( __copy_field_to_compat(compat_handle_cast(buf->h,
> >> > +                                                   compat_iommu_op_t),
> >> > +                                &cmp, status) )
> >>
> >> Since you're only after the status field, perhaps better to avoid the
> >> full-blown reverse XLAT_iommu_op() and copy just that one field?
> >>
> >
> > I kind of like the fact that the two calls mirror each other so I'd prefer
> > to keep it.
> 
> Would you mind looking at the generated code (once you have a few
> sub-ops in place)? If the compiler manages to remove most of the
> cruft, I'd be fine keeping it as is. If, however, a whole lot of extra
> code gets generated, I'd really like to ask to use the shorter form.
> 

I checked... it's not wonderfully compact (because of the nested
switches, I guess), so I'll cherry-pick the status field and add a
comment.
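
Something along these lines, presumably (sketch; the final code may
differ):

    XLAT_iommu_op(&nat, &cmp);

    iommu_op(&nat);

    /* Avoid the full reverse XLAT_iommu_op(): status is the only
     * field that needs to propagate back to the guest. */
    cmp.status = nat.status;

    if ( __copy_field_to_compat(compat_handle_cast(buf->h,
                                                   compat_iommu_op_t),
                                &cmp, status) )
        return -EFAULT;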

  Paul

> Jan
> 



* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-09-04  9:08                   ` Jan Beulich
@ 2018-09-05  0:42                     ` Tian, Kevin
  2018-09-05  6:48                       ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Tian, Kevin @ 2018-09-05  0:42 UTC (permalink / raw)
  To: Jan Beulich, Paul Durrant
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Suravee Suthikulpanit

> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Tuesday, September 4, 2018 5:08 PM
> 
> >>> On 04.09.18 at 10:49, <Paul.Durrant@citrix.com> wrote:
> >>  -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 04 September 2018 09:47
> >> To: Kevin Tian <kevin.tian@intel.com>
> >> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Julien
> Grall
> >> <julien.grall@arm.com>; Paul Durrant <Paul.Durrant@citrix.com>;
> Stefano
> >> Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> >> devel@lists.xenproject.org>
> >> Subject: Re: [Xen-devel] [PATCH v6 01/14] iommu: introduce the concept
> of
> >> BFN...
> >>
> >> >>> On 04.09.18 at 10:37, <kevin.tian@intel.com> wrote:
> >> >>  From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> Sent: Tuesday, September 4, 2018 4:33 PM
> >> >> >
> >> >> > bus address is commonly used along with physical/virtual address,
> to
> >> >> > represent different views between devices and CPU. From that
> angle
> >> >> > I think BFN is a clear term in this context. btw it is not necessary to
> >> >> > differentiate GBFN and MBFN since there is only one BFN view per
> >> >> > device.
> >> >>
> >> >> Sure, but you neglect the presence of one or more IOMMUs when
> >> >> you say "between devices and CPU". There addresses prior to and
> >> >> after IOMMU translation are distinct, and while the one before the
> >> >> translation matches the device view, the one after translation does
> >> >> not necessarily match the CPU view. Hence there are two "bus"
> >> >> frame numbers here - one representing the device view, and the
> >> >> other representing the IOMMU (output) view.
> >> >>
> >> >
> >> > I didn't get. the output address from IOMMU is the one sent to
> >> > memory controller, same as the one sent from CPU.
> >>
> >> That's on present x86 systems, but aiui not in the general case. The
> >> terminology to be used in Xen should fit the general case though.
> >
> > So your concern is cascaded IOMMUs?
> 
> Not primarily. My concern are systems with an I/O address space
> (behind the IOMMU) distinct from the CPU address space. Iirc at
> least Alpha is/was that way.
> 

Then Paul, please document clearly that this bus address refers to
the input side of the IOMMU. :-)

Thanks
Kevin


* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-09-05  0:42                     ` Tian, Kevin
@ 2018-09-05  6:48                       ` Jan Beulich
  2018-09-05  6:56                         ` Tian, Kevin
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-05  6:48 UTC (permalink / raw)
  To: Kevin Tian
  Cc: xen-devel, Julien Grall, Paul Durrant, Stefano Stabellini,
	Suravee Suthikulpanit

>>> On 05.09.18 at 02:42, <kevin.tian@intel.com> wrote:
>>  From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Tuesday, September 4, 2018 5:08 PM
>> 
>> >>> On 04.09.18 at 10:49, <Paul.Durrant@citrix.com> wrote:
>> >>  -----Original Message-----
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: 04 September 2018 09:47
>> >> To: Kevin Tian <kevin.tian@intel.com>
>> >> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Julien
>> Grall
>> >> <julien.grall@arm.com>; Paul Durrant <Paul.Durrant@citrix.com>;
>> Stefano
>> >> Stabellini <sstabellini@kernel.org>; xen-devel <xen-
>> >> devel@lists.xenproject.org>
>> >> Subject: Re: [Xen-devel] [PATCH v6 01/14] iommu: introduce the concept
>> of
>> >> BFN...
>> >>
>> >> >>> On 04.09.18 at 10:37, <kevin.tian@intel.com> wrote:
>> >> >>  From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> >> Sent: Tuesday, September 4, 2018 4:33 PM
>> >> >> >
>> >> >> > bus address is commonly used along with physical/virtual address,
>> to
>> >> >> > represent different views between devices and CPU. From that
>> angle
>> >> >> > I think BFN is a clear term in this context. btw it is not necessary to
>> >> >> > differentiate GBFN and MBFN since there is only one BFN view per
>> >> >> > device.
>> >> >>
>> >> >> Sure, but you neglect the presence of one or more IOMMUs when
>> >> >> you say "between devices and CPU". There addresses prior to and
>> >> >> after IOMMU translation are distinct, and while the one before the
>> >> >> translation matches the device view, the one after translation does
>> >> >> not necessarily match the CPU view. Hence there are two "bus"
>> >> >> frame numbers here - one representing the device view, and the
>> >> >> other representing the IOMMU (output) view.
>> >> >>
>> >> >
>> >> > I didn't get. the output address from IOMMU is the one sent to
>> >> > memory controller, same as the one sent from CPU.
>> >>
>> >> That's on present x86 systems, but aiui not in the general case. The
>> >> terminology to be used in Xen should fit the general case though.
>> >
>> > So your concern is cascaded IOMMUs?
>> 
>> Not primarily. My concern are systems with an I/O address space
>> (behind the IOMMU) distinct from the CPU address space. Iirc at
>> least Alpha is/was that way.
>> 
> 
> Then Paul please documents clearly that this bus address refers to
> the input side of IOMMU. :-)

But when reading code you can't always go back to look at the one
place where its meaning is documented. Hence my desire for a name
which properly conveys the meaning.

Jan




* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-09-05  6:48                       ` Jan Beulich
@ 2018-09-05  6:56                         ` Tian, Kevin
  2018-09-05  7:11                           ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Tian, Kevin @ 2018-09-05  6:56 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Julien Grall, Paul Durrant, Stefano Stabellini,
	Suravee Suthikulpanit

> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Wednesday, September 5, 2018 2:49 PM
> 
> >>> On 05.09.18 at 02:42, <kevin.tian@intel.com> wrote:
> >>  From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: Tuesday, September 4, 2018 5:08 PM
> >>
> >> >>> On 04.09.18 at 10:49, <Paul.Durrant@citrix.com> wrote:
> >> >>  -----Original Message-----
> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> Sent: 04 September 2018 09:47
> >> >> To: Kevin Tian <kevin.tian@intel.com>
> >> >> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Julien
> >> Grall
> >> >> <julien.grall@arm.com>; Paul Durrant <Paul.Durrant@citrix.com>;
> >> Stefano
> >> >> Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> >> >> devel@lists.xenproject.org>
> >> >> Subject: Re: [Xen-devel] [PATCH v6 01/14] iommu: introduce the
> concept
> >> of
> >> >> BFN...
> >> >>
> >> >> >>> On 04.09.18 at 10:37, <kevin.tian@intel.com> wrote:
> >> >> >>  From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> >> Sent: Tuesday, September 4, 2018 4:33 PM
> >> >> >> >
> >> >> >> > bus address is commonly used along with physical/virtual
> address,
> >> to
> >> >> >> > represent different views between devices and CPU. From that
> >> angle
> >> >> >> > I think BFN is a clear term in this context. btw it is not necessary
> to
> >> >> >> > differentiate GBFN and MBFN since there is only one BFN view
> per
> >> >> >> > device.
> >> >> >>
> >> >> >> Sure, but you neglect the presence of one or more IOMMUs when
> >> >> >> you say "between devices and CPU". There addresses prior to and
> >> >> >> after IOMMU translation are distinct, and while the one before the
> >> >> >> translation matches the device view, the one after translation does
> >> >> >> not necessarily match the CPU view. Hence there are two "bus"
> >> >> >> frame numbers here - one representing the device view, and the
> >> >> >> other representing the IOMMU (output) view.
> >> >> >>
> >> >> >
> >> >> > I didn't get. the output address from IOMMU is the one sent to
> >> >> > memory controller, same as the one sent from CPU.
> >> >>
> >> >> That's on present x86 systems, but aiui not in the general case. The
> >> >> terminology to be used in Xen should fit the general case though.
> >> >
> >> > So your concern is cascaded IOMMUs?
> >>
> >> Not primarily. My concern are systems with an I/O address space
> >> (behind the IOMMU) distinct from the CPU address space. Iirc at
> >> least Alpha is/was that way.
> >>
> >
> > Then Paul please documents clearly that this bus address refers to
> > the input side of IOMMU. :-)
> 
> But when reading code you can't always go back to look at the one
> place where its meaning is documented. Hence my desire for a name
> which properly conveys the meaning.
> 

Then possibly go back to DFN, but take 'D' as DMA instead of device?

Thanks
Kevin 


* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-09-05  6:56                         ` Tian, Kevin
@ 2018-09-05  7:11                           ` Jan Beulich
  2018-09-05  9:13                             ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-05  7:11 UTC (permalink / raw)
  To: Kevin Tian
  Cc: xen-devel, Julien Grall, Paul Durrant, Stefano Stabellini,
	Suravee Suthikulpanit

>>> On 05.09.18 at 08:56, <kevin.tian@intel.com> wrote:
>>  From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Wednesday, September 5, 2018 2:49 PM
>> 
>> >>> On 05.09.18 at 02:42, <kevin.tian@intel.com> wrote:
>> >>  From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: Tuesday, September 4, 2018 5:08 PM
>> >>
>> >> >>> On 04.09.18 at 10:49, <Paul.Durrant@citrix.com> wrote:
>> >> >>  -----Original Message-----
>> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> >> Sent: 04 September 2018 09:47
>> >> >> To: Kevin Tian <kevin.tian@intel.com>
>> >> >> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Julien
>> >> Grall
>> >> >> <julien.grall@arm.com>; Paul Durrant <Paul.Durrant@citrix.com>;
>> >> Stefano
>> >> >> Stabellini <sstabellini@kernel.org>; xen-devel <xen-
>> >> >> devel@lists.xenproject.org>
>> >> >> Subject: Re: [Xen-devel] [PATCH v6 01/14] iommu: introduce the
>> concept
>> >> of
>> >> >> BFN...
>> >> >>
>> >> >> >>> On 04.09.18 at 10:37, <kevin.tian@intel.com> wrote:
>> >> >> >>  From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> >> >> Sent: Tuesday, September 4, 2018 4:33 PM
>> >> >> >> >
>> >> >> >> > bus address is commonly used along with physical/virtual
>> address,
>> >> to
>> >> >> >> > represent different views between devices and CPU. From that
>> >> angle
>> >> >> >> > I think BFN is a clear term in this context. btw it is not necessary
>> to
>> >> >> >> > differentiate GBFN and MBFN since there is only one BFN view
>> per
>> >> >> >> > device.
>> >> >> >>
>> >> >> >> Sure, but you neglect the presence of one or more IOMMUs when
>> >> >> >> you say "between devices and CPU". There addresses prior to and
>> >> >> >> after IOMMU translation are distinct, and while the one before the
>> >> >> >> translation matches the device view, the one after translation does
>> >> >> >> not necessarily match the CPU view. Hence there are two "bus"
>> >> >> >> frame numbers here - one representing the device view, and the
>> >> >> >> other representing the IOMMU (output) view.
>> >> >> >>
>> >> >> >
>> >> >> > I didn't get. the output address from IOMMU is the one sent to
>> >> >> > memory controller, same as the one sent from CPU.
>> >> >>
>> >> >> That's on present x86 systems, but aiui not in the general case. The
>> >> >> terminology to be used in Xen should fit the general case though.
>> >> >
>> >> > So your concern is cascaded IOMMUs?
>> >>
>> >> Not primarily. My concern are systems with an I/O address space
>> >> (behind the IOMMU) distinct from the CPU address space. Iirc at
>> >> least Alpha is/was that way.
>> >>
>> >
>> > Then Paul please documents clearly that this bus address refers to
>> > the input side of IOMMU. :-)
>> 
>> But when reading code you can't always go back to look at the one
>> place where its meaning is documented. Hence my desire for a name
>> which properly conveys the meaning.
>> 
> 
> Then possibly go back to DFN, but take 'D' as DMA instead of device?

How would "DMA" be any better than "bus"? Whose view it is then still
is unclear.

Jan




* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-09-05  7:11                           ` Jan Beulich
@ 2018-09-05  9:13                             ` Paul Durrant
  2018-09-05  9:38                               ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-05  9:13 UTC (permalink / raw)
  To: 'Jan Beulich', Kevin Tian
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Suravee Suthikulpanit

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 05 September 2018 08:12
> To: Kevin Tian <kevin.tian@intel.com>
> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Julien Grall
> <julien.grall@arm.com>; Paul Durrant <Paul.Durrant@citrix.com>; Stefano
> Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> devel@lists.xenproject.org>
> Subject: RE: [Xen-devel] [PATCH v6 01/14] iommu: introduce the concept of
> BFN...
> 
> >>> On 05.09.18 at 08:56, <kevin.tian@intel.com> wrote:
> >>  From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: Wednesday, September 5, 2018 2:49 PM
> >>
> >> >>> On 05.09.18 at 02:42, <kevin.tian@intel.com> wrote:
> >> >>  From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> Sent: Tuesday, September 4, 2018 5:08 PM
> >> >>
> >> >> >>> On 04.09.18 at 10:49, <Paul.Durrant@citrix.com> wrote:
> >> >> >>  -----Original Message-----
> >> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> >> Sent: 04 September 2018 09:47
> >> >> >> To: Kevin Tian <kevin.tian@intel.com>
> >> >> >> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>;
> Julien
> >> >> Grall
> >> >> >> <julien.grall@arm.com>; Paul Durrant <Paul.Durrant@citrix.com>;
> >> >> Stefano
> >> >> >> Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> >> >> >> devel@lists.xenproject.org>
> >> >> >> Subject: Re: [Xen-devel] [PATCH v6 01/14] iommu: introduce the
> >> concept
> >> >> of
> >> >> >> BFN...
> >> >> >>
> >> >> >> >>> On 04.09.18 at 10:37, <kevin.tian@intel.com> wrote:
> >> >> >> >>  From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> >> >> Sent: Tuesday, September 4, 2018 4:33 PM
> >> >> >> >> >
> >> >> >> >> > bus address is commonly used along with physical/virtual
> >> address,
> >> >> to
> >> >> >> >> > represent different views between devices and CPU. From
> that
> >> >> angle
> >> >> >> >> > I think BFN is a clear term in this context. btw it is not
> necessary
> >> to
> >> >> >> >> > differentiate GBFN and MBFN since there is only one BFN view
> >> per
> >> >> >> >> > device.
> >> >> >> >>
> >> >> >> >> Sure, but you neglect the presence of one or more IOMMUs
> when
> >> >> >> >> you say "between devices and CPU". There addresses prior to
> and
> >> >> >> >> after IOMMU translation are distinct, and while the one before
> the
> >> >> >> >> translation matches the device view, the one after translation
> does
> >> >> >> >> not necessarily match the CPU view. Hence there are two "bus"
> >> >> >> >> frame numbers here - one representing the device view, and
> the
> >> >> >> >> other representing the IOMMU (output) view.
> >> >> >> >>
> >> >> >> >
> >> >> >> > I didn't get. the output address from IOMMU is the one sent to
> >> >> >> > memory controller, same as the one sent from CPU.
> >> >> >>
> >> >> >> That's on present x86 systems, but aiui not in the general case. The
> >> >> >> terminology to be used in Xen should fit the general case though.
> >> >> >
> >> >> > So your concern is cascaded IOMMUs?
> >> >>
> >> >> Not primarily. My concern are systems with an I/O address space
> >> >> (behind the IOMMU) distinct from the CPU address space. Iirc at
> >> >> least Alpha is/was that way.
> >> >>
> >> >
> >> > Then Paul please documents clearly that this bus address refers to
> >> > the input side of IOMMU. :-)
> >>
> >> But when reading code you can't always go back to look at the one
> >> place where its meaning is documented. Hence my desire for a name
> >> which properly conveys the meaning.
> >>
> >
> > Then possibly go back to DFN, but take 'D' as DMA instead of device?
> 
> How would "DMA" be any better than "bus"? Whose view it is then still
> is unclear.
> 

Personally I think 'bus address' is a commonly enough used term for
addresses used by devices for DMA. Indeed we already have
'dev_bus_addr' in the grant map and unmap hypercalls, so baddr and bfn
seem like ok terms to me. It's also not impossible to rename these
later if they prove problematic.

  Paul

> Jan
> 



* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-09-05  9:13                             ` Paul Durrant
@ 2018-09-05  9:38                               ` Jan Beulich
  2018-09-06 10:36                                 ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-05  9:38 UTC (permalink / raw)
  To: Paul Durrant
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Kevin Tian,
	Suravee Suthikulpanit

>>> On 05.09.18 at 11:13, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 05 September 2018 08:12
>> To: Kevin Tian <kevin.tian@intel.com>
>> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Julien Grall
>> <julien.grall@arm.com>; Paul Durrant <Paul.Durrant@citrix.com>; Stefano
>> Stabellini <sstabellini@kernel.org>; xen-devel <xen-
>> devel@lists.xenproject.org>
>> Subject: RE: [Xen-devel] [PATCH v6 01/14] iommu: introduce the concept of
>> BFN...
>> 
>> >>> On 05.09.18 at 08:56, <kevin.tian@intel.com> wrote:
>> >>  From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: Wednesday, September 5, 2018 2:49 PM
>> >>
>> >> >>> On 05.09.18 at 02:42, <kevin.tian@intel.com> wrote:
>> >> >>  From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> >> Sent: Tuesday, September 4, 2018 5:08 PM
>> >> >>
>> >> >> >>> On 04.09.18 at 10:49, <Paul.Durrant@citrix.com> wrote:
>> >> >> >>  -----Original Message-----
>> >> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> >> >> Sent: 04 September 2018 09:47
>> >> >> >> To: Kevin Tian <kevin.tian@intel.com>
>> >> >> >> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>;
>> Julien
>> >> >> Grall
>> >> >> >> <julien.grall@arm.com>; Paul Durrant <Paul.Durrant@citrix.com>;
>> >> >> Stefano
>> >> >> >> Stabellini <sstabellini@kernel.org>; xen-devel <xen-
>> >> >> >> devel@lists.xenproject.org>
>> >> >> >> Subject: Re: [Xen-devel] [PATCH v6 01/14] iommu: introduce the
>> >> concept
>> >> >> of
>> >> >> >> BFN...
>> >> >> >>
>> >> >> >> >>> On 04.09.18 at 10:37, <kevin.tian@intel.com> wrote:
>> >> >> >> >>  From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> >> >> >> Sent: Tuesday, September 4, 2018 4:33 PM
>> >> >> >> >> >
>> >> >> >> >> > bus address is commonly used along with physical/virtual
>> >> address,
>> >> >> to
>> >> >> >> >> > represent different views between devices and CPU. From
>> that
>> >> >> angle
>> >> >> >> >> > I think BFN is a clear term in this context. btw it is not
>> necessary
>> >> to
>> >> >> >> >> > differentiate GBFN and MBFN since there is only one BFN view
>> >> per
>> >> >> >> >> > device.
>> >> >> >> >>
>> >> >> >> >> Sure, but you neglect the presence of one or more IOMMUs
>> when
>> >> >> >> >> you say "between devices and CPU". There addresses prior to
>> and
>> >> >> >> >> after IOMMU translation are distinct, and while the one before
>> the
>> >> >> >> >> translation matches the device view, the one after translation
>> does
>> >> >> >> >> not necessarily match the CPU view. Hence there are two "bus"
>> >> >> >> >> frame numbers here - one representing the device view, and
>> the
>> >> >> >> >> other representing the IOMMU (output) view.
>> >> >> >> >>
>> >> >> >> >
>> >> >> >> > I didn't get. the output address from IOMMU is the one sent to
>> >> >> >> > memory controller, same as the one sent from CPU.
>> >> >> >>
>> >> >> >> That's on present x86 systems, but aiui not in the general case. The
>> >> >> >> terminology to be used in Xen should fit the general case though.
>> >> >> >
>> >> >> > So your concern is cascaded IOMMUs?
>> >> >>
>> >> >> Not primarily. My concern are systems with an I/O address space
>> >> >> (behind the IOMMU) distinct from the CPU address space. Iirc at
>> >> >> least Alpha is/was that way.
>> >> >>
>> >> >
>> >> > Then Paul please documents clearly that this bus address refers to
>> >> > the input side of IOMMU. :-)
>> >>
>> >> But when reading code you can't always go back to look at the one
>> >> place where its meaning is documented. Hence my desire for a name
>> >> which properly conveys the meaning.
>> >>
>> >
>> > Then possibly go back to DFN, but take 'D' as DMA instead of device?
>> 
>> How would "DMA" be any better than "bus"? Whose view it is then still
>> is unclear.
>> 
> 
> Personally I think 'bus address' is commonly enough used term for addresses 
> used by devices for DMA. Indeed we have already 'dev_bus_addr' in the grant 
> map and unmap hypercalls. So baddr and bfn seem like ok terms to me. It's 
> also not impossible to rename these later if they prove problematic.

But that's the point - the names are problematic (to me): I
permanently have to remind myself that they do _not_ refer to the
addresses as seen when accessing memory, but to the ones going _into_
the IOMMU. The confusion (on my part) arises every time I see a
mixture of gfn, bfn, and mfn in the same patch, perhaps including some
1:1-ness assumptions between pairs of them.

Take these two hunks as example (mixing in some pfn as well):

@@ -436,7 +436,7 @@ static int iommu_merge_pages(struct domain *d, unsigned long pt_mfn,
  * {Re, un}mapping super page frames causes re-allocation of io
  * page tables.
  */
-static int iommu_pde_from_gfn(struct domain *d, unsigned long pfn, 
+static int iommu_pde_from_bfn(struct domain *d, unsigned long pfn,
                               unsigned long pt_mfn[])
 {
     u64 *pde, *next_table_vaddr;
@@ -477,11 +477,11 @@ static int iommu_pde_from_gfn(struct domain *d, unsigned long pfn,
              next_table_mfn != 0 )
         {
             int i;
-            unsigned long mfn, gfn;
+            unsigned long mfn, bfn;
             unsigned int page_sz;
 
             page_sz = 1 << (PTE_PER_TABLE_SHIFT * (next_level - 1));
-            gfn =  pfn & ~((1 << (PTE_PER_TABLE_SHIFT * next_level)) - 1);
+            bfn =  pfn & ~((1 << (PTE_PER_TABLE_SHIFT * next_level)) - 1);
             mfn = next_table_mfn;
 
             /* allocate lower level page table */

Jan



* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-09-05  9:38                               ` Jan Beulich
@ 2018-09-06 10:36                                 ` Paul Durrant
  2018-09-06 13:13                                   ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-06 10:36 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Kevin Tian,
	Suravee Suthikulpanit

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 05 September 2018 10:39
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Julien Grall
> <julien.grall@arm.com>; Kevin Tian <kevin.tian@intel.com>; Stefano
> Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> devel@lists.xenproject.org>
> Subject: RE: [Xen-devel] [PATCH v6 01/14] iommu: introduce the concept of
> BFN...
> 
> >>> On 05.09.18 at 11:13, <Paul.Durrant@citrix.com> wrote:
> >>  -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 05 September 2018 08:12
> >> To: Kevin Tian <kevin.tian@intel.com>
> >> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Julien Grall
> >> <julien.grall@arm.com>; Paul Durrant <Paul.Durrant@citrix.com>;
> Stefano
> >> Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> >> devel@lists.xenproject.org>
> >> Subject: RE: [Xen-devel] [PATCH v6 01/14] iommu: introduce the concept
> of
> >> BFN...
> >>
> >> >>> On 05.09.18 at 08:56, <kevin.tian@intel.com> wrote:
> >> >>  From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> Sent: Wednesday, September 5, 2018 2:49 PM
> >> >>
> >> >> >>> On 05.09.18 at 02:42, <kevin.tian@intel.com> wrote:
> >> >> >>  From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> >> Sent: Tuesday, September 4, 2018 5:08 PM
> >> >> >>
> >> >> >> >>> On 04.09.18 at 10:49, <Paul.Durrant@citrix.com> wrote:
> >> >> >> >>  -----Original Message-----
> >> >> >> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> >> >> Sent: 04 September 2018 09:47
> >> >> >> >> To: Kevin Tian <kevin.tian@intel.com>
> >> >> >> >> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>;
> >> Julien
> >> >> >> Grall
> >> >> >> >> <julien.grall@arm.com>; Paul Durrant
> <Paul.Durrant@citrix.com>;
> >> >> >> Stefano
> >> >> >> >> Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> >> >> >> >> devel@lists.xenproject.org>
> >> >> >> >> Subject: Re: [Xen-devel] [PATCH v6 01/14] iommu: introduce the
> >> >> concept
> >> >> >> of
> >> >> >> >> BFN...
> >> >> >> >>
> >> >> >> >> >>> On 04.09.18 at 10:37, <kevin.tian@intel.com> wrote:
> >> >> >> >> >>  From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >> >> >> >> Sent: Tuesday, September 4, 2018 4:33 PM
> >> >> >> >> >> >
> >> >> >> >> >> > bus address is commonly used along with physical/virtual
> >> >> address,
> >> >> >> to
> >> >> >> >> >> > represent different views between devices and CPU. From
> >> that
> >> >> >> angle
> >> >> >> >> >> > I think BFN is a clear term in this context. btw it is not
> >> necessary
> >> >> to
> >> >> >> >> >> > differentiate GBFN and MBFN since there is only one BFN
> view
> >> >> per
> >> >> >> >> >> > device.
> >> >> >> >> >>
> >> >> >> >> >> Sure, but you neglect the presence of one or more IOMMUs
> >> when
> >> >> >> >> >> you say "between devices and CPU". There addresses prior
> to
> >> and
> >> >> >> >> >> after IOMMU translation are distinct, and while the one
> before
> >> the
> >> >> >> >> >> translation matches the device view, the one after translation
> >> does
> >> >> >> >> >> not necessarily match the CPU view. Hence there are two
> "bus"
> >> >> >> >> >> frame numbers here - one representing the device view, and
> >> the
> >> >> >> >> >> other representing the IOMMU (output) view.
> >> >> >> >> >>
> >> >> >> >> >
> >> >> >> >> > I didn't get. the output address from IOMMU is the one sent
> to
> >> >> >> >> > memory controller, same as the one sent from CPU.
> >> >> >> >>
> >> >> >> >> That's on present x86 systems, but aiui not in the general case.
> The
> >> >> >> >> terminology to be used in Xen should fit the general case
> though.
> >> >> >> >
> >> >> >> > So your concern is cascaded IOMMUs?
> >> >> >>
> >> >> >> Not primarily. My concern are systems with an I/O address space
> >> >> >> (behind the IOMMU) distinct from the CPU address space. Iirc at
> >> >> >> least Alpha is/was that way.
> >> >> >>
> >> >> >
> >> >> > Then Paul please documents clearly that this bus address refers to
> >> >> > the input side of IOMMU. :-)
> >> >>
> >> >> But when reading code you can't always go back to look at the one
> >> >> place where its meaning is documented. Hence my desire for a name
> >> >> which properly conveys the meaning.
> >> >>
> >> >
> >> > Then possibly go back to DFN, but take 'D' as DMA instead of device?
> >>
> >> How would "DMA" be any better than "bus"? Whose view it is then still
> >> is unclear.
> >>
> >
> > Personally I think 'bus address' is commonly enough used term for
> addresses
> > used by devices for DMA. Indeed we have already 'dev_bus_addr' in the
> grant
> > map and unmap hypercalls. So baddr and bfn seem like ok terms to me. It's
> > also not impossible to rename these later if they prove problematic.
> 
> But that's the point - the names are problematic (to me): I permanently
> have to remind myself that they do _not_ refer to the addresses as
> seen when accessing memory, but the ones going _into_ the IOMMU.

Ok. Could we agree on 'IOFN' then? I think 'iova' and 'io address' are also reasonably widely used terms to refer to addresses from a device's PoV. I'd really like to unblock these early patches.

> The confusion (on my part) arises every time I see a mixture of gfn, bfn,
> and mfn in the same patch, perhaps including some 1:1-ness assumptions
> between pairs of them.
> 
> Take these two hunks as example (mixing in some pfn as well):
> 
> @@ -436,7 +436,7 @@ static int iommu_merge_pages(struct domain *d,
> unsigned long pt_mfn,
>   * {Re, un}mapping super page frames causes re-allocation of io
>   * page tables.
>   */
> -static int iommu_pde_from_gfn(struct domain *d, unsigned long pfn,
> +static int iommu_pde_from_bfn(struct domain *d, unsigned long pfn,
>                                unsigned long pt_mfn[])
>  {
>      u64 *pde, *next_table_vaddr;
> @@ -477,11 +477,11 @@ static int iommu_pde_from_gfn(struct domain *d,
> unsigned long pfn,
>               next_table_mfn != 0 )
>          {
>              int i;
> -            unsigned long mfn, gfn;
> +            unsigned long mfn, bfn;
>              unsigned int page_sz;
> 
>              page_sz = 1 << (PTE_PER_TABLE_SHIFT * (next_level - 1));
> -            gfn =  pfn & ~((1 << (PTE_PER_TABLE_SHIFT * next_level)) - 1);
> +            bfn =  pfn & ~((1 << (PTE_PER_TABLE_SHIFT * next_level)) - 1);

This is not wonderful code, agreed. In this particular case it looks like I may be able to just rename the pfn argument to iofn (assuming we go with that name) and lose the stack variable, if that helps.

  Paul

>              mfn = next_table_mfn;
> 
>              /* allocate lower level page table */
> 
> Jan



* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-09-06 10:36                                 ` Paul Durrant
@ 2018-09-06 13:13                                   ` Jan Beulich
  2018-09-06 14:54                                     ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-06 13:13 UTC (permalink / raw)
  To: Paul Durrant
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Kevin Tian,
	Suravee Suthikulpanit

>>> On 06.09.18 at 12:36, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 05 September 2018 10:39
>> 
>> >>> On 05.09.18 at 11:13, <Paul.Durrant@citrix.com> wrote:
>> > Personally I think 'bus address' is commonly enough used term for
>> addresses
>> > used by devices for DMA. Indeed we have already 'dev_bus_addr' in the
>> grant
>> > map and unmap hypercalls. So baddr and bfn seem like ok terms to me. It's
>> > also not impossible to rename these later if they prove problematic.
>> 
>> But that's the point - the names are problematic (to me): I permanently
>> have to remind myself that they do _not_ refer to the addresses as
>> seen when accessing memory, but the ones going _into_ the IOMMU.
> 
> Ok. Could we agree on 'IOFN' then? I think 'iova' and 'io address' are also 
> reasonably widely used terms to refer to address from a device's PoV. I'd 
> really like to unblock these early patches.

Hmm, earlier I had indicated I'd prefer DFN (as this makes clear whose
view we are talking about). Kevin seemed to prefer DFN too, just with
a different association for D (which, as said, I consider unhelpful). So
is there a particular reason you're now suggesting IOFN nevertheless?

>> The confusion (on my part) arises every time I see a mixture of gfn, bfn,
>> and mfn in the same patch, perhaps including some 1:1-ness assumptions
>> between pairs of them.
>> 
>> Take these two hunks as example (mixing in some pfn as well):
>> 
>> @@ -436,7 +436,7 @@ static int iommu_merge_pages(struct domain *d,
>> unsigned long pt_mfn,
>>   * {Re, un}mapping super page frames causes re-allocation of io
>>   * page tables.
>>   */
>> -static int iommu_pde_from_gfn(struct domain *d, unsigned long pfn,
>> +static int iommu_pde_from_bfn(struct domain *d, unsigned long pfn,
>>                                unsigned long pt_mfn[])
>>  {
>>      u64 *pde, *next_table_vaddr;
>> @@ -477,11 +477,11 @@ static int iommu_pde_from_gfn(struct domain *d,
>> unsigned long pfn,
>>               next_table_mfn != 0 )
>>          {
>>              int i;
>> -            unsigned long mfn, gfn;
>> +            unsigned long mfn, bfn;
>>              unsigned int page_sz;
>> 
>>              page_sz = 1 << (PTE_PER_TABLE_SHIFT * (next_level - 1));
>> -            gfn =  pfn & ~((1 << (PTE_PER_TABLE_SHIFT * next_level)) - 1);
>> +            bfn =  pfn & ~((1 << (PTE_PER_TABLE_SHIFT * next_level)) - 1);
> 
> This is not wonderful code, agreed. In this particular case it looks like I 
> may be able to just rename the pfn argument to iofn (assuming we go with that 
> name) and lose the stack variable, if that helps.

Renaming the parameter will likely help, I agree. Getting rid of the
local variable, otoh, I'm not sure is going to work (you need to retain
the function parameter's original value for the next iteration of the
outer loop).
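
For illustration, a minimal sketch of the rename being discussed (this is not code from the posted series; the 'dfn' naming anticipates the decision reached later in the thread, and 'masked_dfn' is an invented name):

    static int iommu_pde_from_dfn(struct domain *d, unsigned long dfn,
                                  unsigned long pt_mfn[])
    {
        /* ... as posted ... */
        unsigned long mfn, masked_dfn;
        unsigned int page_sz;

        page_sz = 1 << (PTE_PER_TABLE_SHIFT * (next_level - 1));
        /* Mask a copy, so the caller-supplied dfn survives into the
         * next iteration of the outer loop. */
        masked_dfn = dfn & ~((1 << (PTE_PER_TABLE_SHIFT * next_level)) - 1);
        mfn = next_table_mfn;
        /* ... as posted ... */
    }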




* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-09-06 13:13                                   ` Jan Beulich
@ 2018-09-06 14:54                                     ` Paul Durrant
  2018-09-07  1:47                                       ` Tian, Kevin
  0 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-06 14:54 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Kevin Tian,
	Suravee Suthikulpanit

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 06 September 2018 14:13
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Julien Grall
> <julien.grall@arm.com>; Kevin Tian <kevin.tian@intel.com>; Stefano
> Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> devel@lists.xenproject.org>
> Subject: RE: [Xen-devel] [PATCH v6 01/14] iommu: introduce the concept of
> BFN...
> 
> >>> On 06.09.18 at 12:36, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 05 September 2018 10:39
> >>
> >> >>> On 05.09.18 at 11:13, <Paul.Durrant@citrix.com> wrote:
> >> > Personally I think 'bus address' is commonly enough used term for
> >> addresses
> >> > used by devices for DMA. Indeed we have already 'dev_bus_addr' in
> the
> >> grant
> >> > map and unmap hypercalls. So baddr and bfn seem like ok terms to me.
> It's
> >> > also not impossible to rename these later if they prove problematic.
> >>
> >> But that's the point - the names are problematic (to me): I permanently
> >> have to remind myself that they do _not_ refer to the addresses as
> >> seen when accessing memory, but the ones going _into_ the IOMMU.
> >
> > Ok. Could we agree on 'IOFN' then? I think 'iova' and 'io address' are also
> > reasonably widely used terms to refer to address from a device's PoV. I'd
> > really like to unblock these early patches.
> 
> Hmm, earlier I had indicated I'd prefer DFN (as this make clear whose
> view we are talking about). Kevin seemed to prefer DFN too, just with
> a different association for D (which, as said, I consider unhelpful). So
> is there a particular reason you're now suggesting IOFN nevertheless?

It was the ambiguity and lack of agreement over the 'D' that made me think that the other alternative would be better.
Kevin, would you be ok with 'IOFN'?

> 
> >> The confusion (on my part) arises every time I see a mixture of gfn, bfn,
> >> and mfn in the same patch, perhaps including some 1:1-ness assumptions
> >> between pairs of them.
> >>
> >> Take these two hunks as example (mixing in some pfn as well):
> >>
> >> @@ -436,7 +436,7 @@ static int iommu_merge_pages(struct domain *d,
> >> unsigned long pt_mfn,
> >>   * {Re, un}mapping super page frames causes re-allocation of io
> >>   * page tables.
> >>   */
> >> -static int iommu_pde_from_gfn(struct domain *d, unsigned long pfn,
> >> +static int iommu_pde_from_bfn(struct domain *d, unsigned long pfn,
> >>                                unsigned long pt_mfn[])
> >>  {
> >>      u64 *pde, *next_table_vaddr;
> >> @@ -477,11 +477,11 @@ static int iommu_pde_from_gfn(struct domain
> *d,
> >> unsigned long pfn,
> >>               next_table_mfn != 0 )
> >>          {
> >>              int i;
> >> -            unsigned long mfn, gfn;
> >> +            unsigned long mfn, bfn;
> >>              unsigned int page_sz;
> >>
> >>              page_sz = 1 << (PTE_PER_TABLE_SHIFT * (next_level - 1));
> >> -            gfn =  pfn & ~((1 << (PTE_PER_TABLE_SHIFT * next_level)) - 1);
> >> +            bfn =  pfn & ~((1 << (PTE_PER_TABLE_SHIFT * next_level)) - 1);
> >
> > This is not wonderful code, agreed. In this particular case it looks like I
> > may be able to just rename the pfn argument to iofn (assuming we go with
> that
> > name) and lose the stack variable, if that helps.
> 
> Renaming the parameter will likely help, I agree. Getting rid of the
> local variable, otoh, I'm not sure is going to work (you need to retain
> the function parameter's original value for the next iteration of the
> outer loop).

Oh, yes I missed that. I'll see what I can do once we have agreement on a name.

  Paul

> 



* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-09-06 14:54                                     ` Paul Durrant
@ 2018-09-07  1:47                                       ` Tian, Kevin
  2018-09-07  6:24                                         ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Tian, Kevin @ 2018-09-07  1:47 UTC (permalink / raw)
  To: Paul Durrant, 'Jan Beulich'
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Suravee Suthikulpanit

> From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> Sent: Thursday, September 6, 2018 10:54 PM
> 
> > -----Original Message-----
> > From: Jan Beulich [mailto:JBeulich@suse.com]
> > Sent: 06 September 2018 14:13
> > To: Paul Durrant <Paul.Durrant@citrix.com>
> > Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Julien Grall
> > <julien.grall@arm.com>; Kevin Tian <kevin.tian@intel.com>; Stefano
> > Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> > devel@lists.xenproject.org>
> > Subject: RE: [Xen-devel] [PATCH v6 01/14] iommu: introduce the concept
> of
> > BFN...
> >
> > >>> On 06.09.18 at 12:36, <Paul.Durrant@citrix.com> wrote:
> > >> From: Jan Beulich [mailto:JBeulich@suse.com]
> > >> Sent: 05 September 2018 10:39
> > >>
> > >> >>> On 05.09.18 at 11:13, <Paul.Durrant@citrix.com> wrote:
> > >> > Personally I think 'bus address' is commonly enough used term for
> > >> addresses
> > >> > used by devices for DMA. Indeed we have already 'dev_bus_addr' in
> > the
> > >> grant
> > >> > map and unmap hypercalls. So baddr and bfn seem like ok terms to
> me.
> > It's
> > >> > also not impossible to rename these later if they prove problematic.
> > >>
> > >> But that's the point - the names are problematic (to me): I
> permanently
> > >> have to remind myself that they do _not_ refer to the addresses as
> > >> seen when accessing memory, but the ones going _into_ the IOMMU.
> > >
> > > Ok. Could we agree on 'IOFN' then? I think 'iova' and 'io address' are
> also
> > > reasonably widely used terms to refer to address from a device's PoV.
> I'd
> > > really like to unblock these early patches.
> >
> > Hmm, earlier I had indicated I'd prefer DFN (as this make clear whose
> > view we are talking about). Kevin seemed to prefer DFN too, just with
> > a different association for D (which, as said, I consider unhelpful). So
> > is there a particular reason you're now suggesting IOFN nevertheless?
> 
> It was the ambiguity and lack of agreement over the 'D' that made me think
> that the other alternative would be better.
> Kevin, would you be ok with 'IOFN'?
> 

My problem with DFN is that, when combining D with address, "device
address" is not very clear to me, while interpreting D as DMA is also
not that clear from Jan's point of view.

I didn't see a perfect candidate that avoids all ambiguity - at this
point I feel IOFN/IOVA may not be the best name, but it wins for me
considering that it is a widely used term in other places (e.g. in the
VT-d spec, in the Linux vfio/iommu drivers, etc.).

Thanks
Kevin


* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-09-07  1:47                                       ` Tian, Kevin
@ 2018-09-07  6:24                                         ` Jan Beulich
  2018-09-07  8:13                                           ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-07  6:24 UTC (permalink / raw)
  To: Paul Durrant, Kevin Tian
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Suravee Suthikulpanit

>>> On 07.09.18 at 03:47, <kevin.tian@intel.com> wrote:
>>  From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
>> Sent: Thursday, September 6, 2018 10:54 PM
>> 
>> > -----Original Message-----
>> > From: Jan Beulich [mailto:JBeulich@suse.com]
>> > Sent: 06 September 2018 14:13
>> > To: Paul Durrant <Paul.Durrant@citrix.com>
>> > Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Julien Grall
>> > <julien.grall@arm.com>; Kevin Tian <kevin.tian@intel.com>; Stefano
>> > Stabellini <sstabellini@kernel.org>; xen-devel <xen-
>> > devel@lists.xenproject.org>
>> > Subject: RE: [Xen-devel] [PATCH v6 01/14] iommu: introduce the concept
>> of
>> > BFN...
>> >
>> > >>> On 06.09.18 at 12:36, <Paul.Durrant@citrix.com> wrote:
>> > >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> > >> Sent: 05 September 2018 10:39
>> > >>
>> > >> >>> On 05.09.18 at 11:13, <Paul.Durrant@citrix.com> wrote:
>> > >> > Personally I think 'bus address' is commonly enough used term for
>> > >> addresses
>> > >> > used by devices for DMA. Indeed we have already 'dev_bus_addr' in
>> > the
>> > >> grant
>> > >> > map and unmap hypercalls. So baddr and bfn seem like ok terms to
>> me.
>> > It's
>> > >> > also not impossible to rename these later if they prove problematic.
>> > >>
>> > >> But that's the point - the names are problematic (to me): I
>> permanently
>> > >> have to remind myself that they do _not_ refer to the addresses as
>> > >> seen when accessing memory, but the ones going _into_ the IOMMU.
>> > >
>> > > Ok. Could we agree on 'IOFN' then? I think 'iova' and 'io address' are
>> also
>> > > reasonably widely used terms to refer to address from a device's PoV.
>> I'd
>> > > really like to unblock these early patches.
>> >
>> > Hmm, earlier I had indicated I'd prefer DFN (as this make clear whose
>> > view we are talking about). Kevin seemed to prefer DFN too, just with
>> > a different association for D (which, as said, I consider unhelpful). So
>> > is there a particular reason you're now suggesting IOFN nevertheless?
>> 
>> It was the ambiguity and lack of agreement over the 'D' that made me think
>> that the other alternative would be better.
>> Kevin, would you be ok with 'IOFN'?
>> 
> 
> My problem with DFN is when combining D with address then "device 
> address" is not very clear to me while interpreting D as DMA is also
> not that clear from Jan's point.

What about making its description mention both possible interpretations?

Jan




* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-09-07  6:24                                         ` Jan Beulich
@ 2018-09-07  8:13                                           ` Paul Durrant
  2018-09-07  8:16                                             ` Tian, Kevin
  0 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-07  8:13 UTC (permalink / raw)
  To: 'Jan Beulich', Kevin Tian
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Suravee Suthikulpanit

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 07 September 2018 07:24
> To: Paul Durrant <Paul.Durrant@citrix.com>; Kevin Tian
> <kevin.tian@intel.com>
> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Julien Grall
> <julien.grall@arm.com>; Stefano Stabellini <sstabellini@kernel.org>; xen-
> devel <xen-devel@lists.xenproject.org>
> Subject: RE: [Xen-devel] [PATCH v6 01/14] iommu: introduce the concept of
> BFN...
> 
> >>> On 07.09.18 at 03:47, <kevin.tian@intel.com> wrote:
> >>  From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> >> Sent: Thursday, September 6, 2018 10:54 PM
> >>
> >> > -----Original Message-----
> >> > From: Jan Beulich [mailto:JBeulich@suse.com]
> >> > Sent: 06 September 2018 14:13
> >> > To: Paul Durrant <Paul.Durrant@citrix.com>
> >> > Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Julien
> Grall
> >> > <julien.grall@arm.com>; Kevin Tian <kevin.tian@intel.com>; Stefano
> >> > Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> >> > devel@lists.xenproject.org>
> >> > Subject: RE: [Xen-devel] [PATCH v6 01/14] iommu: introduce the
> concept
> >> of
> >> > BFN...
> >> >
> >> > >>> On 06.09.18 at 12:36, <Paul.Durrant@citrix.com> wrote:
> >> > >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> > >> Sent: 05 September 2018 10:39
> >> > >>
> >> > >> >>> On 05.09.18 at 11:13, <Paul.Durrant@citrix.com> wrote:
> >> > >> > Personally I think 'bus address' is commonly enough used term for
> >> > >> addresses
> >> > >> > used by devices for DMA. Indeed we have already 'dev_bus_addr'
> in
> >> > the
> >> > >> grant
> >> > >> > map and unmap hypercalls. So baddr and bfn seem like ok terms to
> >> me.
> >> > It's
> >> > >> > also not impossible to rename these later if they prove
> problematic.
> >> > >>
> >> > >> But that's the point - the names are problematic (to me): I
> >> permanently
> >> > >> have to remind myself that they do _not_ refer to the addresses as
> >> > >> seen when accessing memory, but the ones going _into_ the
> IOMMU.
> >> > >
> >> > > Ok. Could we agree on 'IOFN' then? I think 'iova' and 'io address' are
> >> also
> >> > > reasonably widely used terms to refer to address from a device's PoV.
> >> I'd
> >> > > really like to unblock these early patches.
> >> >
> >> > Hmm, earlier I had indicated I'd prefer DFN (as this make clear whose
> >> > view we are talking about). Kevin seemed to prefer DFN too, just with
> >> > a different association for D (which, as said, I consider unhelpful). So
> >> > is there a particular reason you're now suggesting IOFN nevertheless?
> >>
> >> It was the ambiguity and lack of agreement over the 'D' that made me
> think
> >> that the other alternative would be better.
> >> Kevin, would you be ok with 'IOFN'?
> >>
> >
> > My problem with DFN is when combining D with address then "device
> > address" is not very clear to me while interpreting D as DMA is also
> > not that clear from Jan's point.
> 
> What about making its description mention both possible interpretations?
> 

I'm ok with DFN plus supporting text. Kevin, are you ok with that?

  Paul

  

> Jan
> 



* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-09-07  8:13                                           ` Paul Durrant
@ 2018-09-07  8:16                                             ` Tian, Kevin
  2018-09-07  8:25                                               ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Tian, Kevin @ 2018-09-07  8:16 UTC (permalink / raw)
  To: Paul Durrant, 'Jan Beulich'
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Suravee Suthikulpanit

> From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> Sent: Friday, September 7, 2018 4:13 PM
> 
> > -----Original Message-----
> > From: Jan Beulich [mailto:JBeulich@suse.com]
> > Sent: 07 September 2018 07:24
> > To: Paul Durrant <Paul.Durrant@citrix.com>; Kevin Tian
> > <kevin.tian@intel.com>
> > Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Julien Grall
> > <julien.grall@arm.com>; Stefano Stabellini <sstabellini@kernel.org>; xen-
> > devel <xen-devel@lists.xenproject.org>
> > Subject: RE: [Xen-devel] [PATCH v6 01/14] iommu: introduce the concept
> of
> > BFN...
> >
> > >>> On 07.09.18 at 03:47, <kevin.tian@intel.com> wrote:
> > >>  From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> > >> Sent: Thursday, September 6, 2018 10:54 PM
> > >>
> > >> > -----Original Message-----
> > >> > From: Jan Beulich [mailto:JBeulich@suse.com]
> > >> > Sent: 06 September 2018 14:13
> > >> > To: Paul Durrant <Paul.Durrant@citrix.com>
> > >> > Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Julien
> > Grall
> > >> > <julien.grall@arm.com>; Kevin Tian <kevin.tian@intel.com>; Stefano
> > >> > Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> > >> > devel@lists.xenproject.org>
> > >> > Subject: RE: [Xen-devel] [PATCH v6 01/14] iommu: introduce the
> > concept
> > >> of
> > >> > BFN...
> > >> >
> > >> > >>> On 06.09.18 at 12:36, <Paul.Durrant@citrix.com> wrote:
> > >> > >> From: Jan Beulich [mailto:JBeulich@suse.com]
> > >> > >> Sent: 05 September 2018 10:39
> > >> > >>
> > >> > >> >>> On 05.09.18 at 11:13, <Paul.Durrant@citrix.com> wrote:
> > >> > >> > Personally I think 'bus address' is commonly enough used term
> for
> > >> > >> addresses
> > >> > >> > used by devices for DMA. Indeed we have already
> 'dev_bus_addr'
> > in
> > >> > the
> > >> > >> grant
> > >> > >> > map and unmap hypercalls. So baddr and bfn seem like ok
> terms to
> > >> me.
> > >> > It's
> > >> > >> > also not impossible to rename these later if they prove
> > problematic.
> > >> > >>
> > >> > >> But that's the point - the names are problematic (to me): I
> > >> permanently
> > >> > >> have to remind myself that they do _not_ refer to the addresses
> as
> > >> > >> seen when accessing memory, but the ones going _into_ the
> > IOMMU.
> > >> > >
> > >> > > Ok. Could we agree on 'IOFN' then? I think 'iova' and 'io address'
> are
> > >> also
> > >> > > reasonably widely used terms to refer to address from a device's
> PoV.
> > >> I'd
> > >> > > really like to unblock these early patches.
> > >> >
> > >> > Hmm, earlier I had indicated I'd prefer DFN (as this make clear
> whose
> > >> > view we are talking about). Kevin seemed to prefer DFN too, just with
> > >> > a different association for D (which, as said, I consider unhelpful). So
> > >> > is there a particular reason you're now suggesting IOFN nevertheless?
> > >>
> > >> It was the ambiguity and lack of agreement over the 'D' that made me
> > think
> > >> that the other alternative would be better.
> > >> Kevin, would you be ok with 'IOFN'?
> > >>
> > >
> > > My problem with DFN is when combining D with address then "device
> > > address" is not very clear to me while interpreting D as DMA is also
> > > not that clear from Jan's point.
> >
> > What about making its description mention both possible interpretations?
> >
> 
> I'm ok with DFN plus supporting text. Kevin, are you ok with that?
> 

sure


* Re: [PATCH v6 01/14] iommu: introduce the concept of BFN...
  2018-09-07  8:16                                             ` Tian, Kevin
@ 2018-09-07  8:25                                               ` Paul Durrant
  0 siblings, 0 replies; 111+ messages in thread
From: Paul Durrant @ 2018-09-07  8:25 UTC (permalink / raw)
  To: Kevin Tian, 'Jan Beulich'
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Suravee Suthikulpanit

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: 07 September 2018 09:17
> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich'
> <JBeulich@suse.com>
> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Julien Grall
> <julien.grall@arm.com>; Stefano Stabellini <sstabellini@kernel.org>; xen-
> devel <xen-devel@lists.xenproject.org>
> Subject: RE: [Xen-devel] [PATCH v6 01/14] iommu: introduce the concept of
> BFN...
> 
> > From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> > Sent: Friday, September 7, 2018 4:13 PM
> >
> > > -----Original Message-----
> > > From: Jan Beulich [mailto:JBeulich@suse.com]
> > > Sent: 07 September 2018 07:24
> > > To: Paul Durrant <Paul.Durrant@citrix.com>; Kevin Tian
> > > <kevin.tian@intel.com>
> > > Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Julien Grall
> > > <julien.grall@arm.com>; Stefano Stabellini <sstabellini@kernel.org>;
> xen-
> > > devel <xen-devel@lists.xenproject.org>
> > > Subject: RE: [Xen-devel] [PATCH v6 01/14] iommu: introduce the concept
> > of
> > > BFN...
> > >
> > > >>> On 07.09.18 at 03:47, <kevin.tian@intel.com> wrote:
> > > >>  From: Paul Durrant [mailto:Paul.Durrant@citrix.com]
> > > >> Sent: Thursday, September 6, 2018 10:54 PM
> > > >>
> > > >> > -----Original Message-----
> > > >> > From: Jan Beulich [mailto:JBeulich@suse.com]
> > > >> > Sent: 06 September 2018 14:13
> > > >> > To: Paul Durrant <Paul.Durrant@citrix.com>
> > > >> > Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Julien
> > > Grall
> > > >> > <julien.grall@arm.com>; Kevin Tian <kevin.tian@intel.com>;
> Stefano
> > > >> > Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> > > >> > devel@lists.xenproject.org>
> > > >> > Subject: RE: [Xen-devel] [PATCH v6 01/14] iommu: introduce the
> > > concept
> > > >> of
> > > >> > BFN...
> > > >> >
> > > >> > >>> On 06.09.18 at 12:36, <Paul.Durrant@citrix.com> wrote:
> > > >> > >> From: Jan Beulich [mailto:JBeulich@suse.com]
> > > >> > >> Sent: 05 September 2018 10:39
> > > >> > >>
> > > >> > >> >>> On 05.09.18 at 11:13, <Paul.Durrant@citrix.com> wrote:
> > > >> > >> > Personally I think 'bus address' is commonly enough used term
> > for
> > > >> > >> addresses
> > > >> > >> > used by devices for DMA. Indeed we have already
> > 'dev_bus_addr'
> > > in
> > > >> > the
> > > >> > >> grant
> > > >> > >> > map and unmap hypercalls. So baddr and bfn seem like ok
> > terms to
> > > >> me.
> > > >> > It's
> > > >> > >> > also not impossible to rename these later if they prove
> > > problematic.
> > > >> > >>
> > > >> > >> But that's the point - the names are problematic (to me): I
> > > >> permanently
> > > >> > >> have to remind myself that they do _not_ refer to the addresses
> > as
> > > >> > >> seen when accessing memory, but the ones going _into_ the
> > > IOMMU.
> > > >> > >
> > > >> > > Ok. Could we agree on 'IOFN' then? I think 'iova' and 'io address'
> > are
> > > >> also
> > > >> > > reasonably widely used terms to refer to address from a device's
> > PoV.
> > > >> I'd
> > > >> > > really like to unblock these early patches.
> > > >> >
> > > >> > Hmm, earlier I had indicated I'd prefer DFN (as this make clear
> > whose
> > > >> > view we are talking about). Kevin seemed to prefer DFN too, just
> with
> > > >> > a different association for D (which, as said, I consider unhelpful). So
> > > >> > is there a particular reason you're now suggesting IOFN
> nevertheless?
> > > >>
> > > >> It was the ambiguity and lack of agreement over the 'D' that made me
> > > think
> > > >> that the other alternative would be better.
> > > >> Kevin, would you be ok with 'IOFN'?
> > > >>
> > > >
> > > > My problem with DFN is when combining D with address then "device
> > > > address" is not very clear to me while interpreting D as DMA is also
> > > > not that clear from Jan's point.
> > >
> > > What about making its description mention both possible
> interpretations?
> > >
> >
> > I'm ok with DFN plus supporting text. Kevin, are you ok with that?
> >
> 
> sure

Ok. Decision made. I will re-work the patches.

  Paul


* Re: [PATCH v6 06/14] iommu: track reserved ranges using a rangeset
  2018-08-23  9:47 ` [PATCH v6 06/14] iommu: track reserved ranges using a rangeset Paul Durrant
@ 2018-09-07 10:40   ` Jan Beulich
  2018-09-11  9:28     ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-07 10:40 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Kevin Tian

>>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> @@ -248,12 +252,16 @@ int iommu_construct(struct domain *d)
>  
>  void iommu_domain_destroy(struct domain *d)
>  {
> -    if ( !iommu_enabled || !dom_iommu(d)->platform_ops )
> +    const struct domain_iommu *hd = dom_iommu(d);
> +
> +    if ( !iommu_enabled || !hd->platform_ops )
>          return;
>  
>      iommu_teardown(d);
>  
>      arch_iommu_domain_destroy(d);
> +
> +    rangeset_destroy(hd->reserved_ranges);

For idempotency reasons perhaps better to store NULL after
the call?
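
A minimal sketch of that suggestion (note the local pointer would have to lose its const qualifier for the store to compile):

    void iommu_domain_destroy(struct domain *d)
    {
        struct domain_iommu *hd = dom_iommu(d);

        if ( !iommu_enabled || !hd->platform_ops )
            return;

        iommu_teardown(d);

        arch_iommu_domain_destroy(d);

        rangeset_destroy(hd->reserved_ranges);
        /* Store NULL so that a repeated call is harmless. */
        hd->reserved_ranges = NULL;
    }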

> --- a/xen/drivers/passthrough/vtd/x86/vtd.c
> +++ b/xen/drivers/passthrough/vtd/x86/vtd.c
> @@ -154,8 +154,21 @@ void __hwdom_init vtd_set_hwdom_mapping(struct domain *d)
>  
>          rc = iommu_map_page(d, _bfn(pfn), _mfn(pfn),
>  			    IOMMUF_readable | IOMMUF_writable);
> +
> +        /*
> +         * The only reason a reserved page would be mapped is that
> +         * iommu_inclusive_mapping is set, in which case it needs to be
> +         * marked as reserved in the IOMMU.
> +         */
> +        if ( !rc && page_is_ram_type(pfn, RAM_TYPE_RESERVED) )
> +        {
> +            ASSERT(iommu_inclusive_mapping);
> +
> +            rc = rangeset_add_singleton(dom_iommu(d)->reserved_ranges, pfn);
> +        }

Why would this be restricted to the E820 reserved type? I think this
should cover everything that gets mapped with iommu_inclusive_mapping
set, but not mapped with the flag clear.

Jan




* Re: [PATCH v6 05/14] public / x86: introduce __HYPERCALL_iommu_op
  2018-08-23  9:47 ` [PATCH v6 05/14] public / x86: introduce __HYPERCALL_iommu_op Paul Durrant
  2018-09-04 11:50   ` Jan Beulich
@ 2018-09-07 10:52   ` Jan Beulich
  1 sibling, 0 replies; 111+ messages in thread
From: Jan Beulich @ 2018-09-07 10:52 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, xen-devel

>>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> --- a/xen/common/Makefile
> +++ b/xen/common/Makefile
> @@ -13,6 +13,7 @@ obj-$(CONFIG_CRASH_DEBUG) += gdbstub.o
>  obj-y += grant_table.o
>  obj-y += guestcopy.o
>  obj-bin-y += gunzip.init.o
> +obj-$(CONFIG_X86) += iommu_op.o

Btw - irrespective of this I think you'd better ...

> +int compat_one_iommu_op(compat_iommu_op_buf_t *buf)

... have #ifdef CONFIG_COMPAT above here.
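
For illustration, the guard being asked for would amount to (placement assumed):

    #ifdef CONFIG_COMPAT

    int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
    {
        /* ... as posted ... */
    }

    /* ... remaining compat handling ... */

    #endif /* CONFIG_COMPAT */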

Jan




* Re: [PATCH v6 07/14] x86: add iommu_op to query reserved ranges
  2018-08-23  9:47 ` [PATCH v6 07/14] x86: add iommu_op to query reserved ranges Paul Durrant
@ 2018-09-07 11:01   ` Jan Beulich
  2018-09-11  9:34     ` Paul Durrant
  2018-09-13  6:11     ` Tian, Kevin
  0 siblings, 2 replies; 111+ messages in thread
From: Jan Beulich @ 2018-09-07 11:01 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, xen-devel

>>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> This patch adds an iommu_op to allow the domain IOMMU reserved ranges to be
> queried by the guest.
> 
> NOTE: The number of reserved ranges is determined by system firmware, in
>       conjunction with Xen command line options, and is expected to be
>       small. Thus, to avoid over-complicating the code, there is no
>       pre-emption check within the operation.

Hmm, RMRRs reportedly can cover a fair part of (the entire?) frame
buffer of a graphics device.

> @@ -100,16 +176,27 @@ long do_iommu_op(unsigned int nr_bufs,
>      return rc;
>  }
>  
> +CHECK_iommu_reserved_range;
> +
>  int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
>  {
> -    compat_iommu_op_t cmp;
> +    compat_iommu_op_t cmp = {};
> +    size_t offset;
> +    static const size_t op_size[] = {
> +        [XEN_IOMMUOP_query_reserved] = sizeof(struct compat_iommu_op_query_reserved),
> +    };
> +    size_t size;
>      xen_iommu_op_t nat;
> +    unsigned int u;
> +    int32_t status;
>      int rc;
>  
> -    if ( buf->size < sizeof(cmp) )
> +    offset = offsetof(struct compat_iommu_op, u);
> +
> +    if ( buf->size < offset )
>          return -EFAULT;

For some reason I notice this only now and here - -EFAULT isn't
really appropriately describing the error condition here. -ENODATA
or -ENOBUFS perhaps?

> @@ -119,17 +206,82 @@ int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
>      if ( rc )
>          return rc;
>  
> +    if ( cmp.op >= ARRAY_SIZE(op_size) )
> +        return -EOPNOTSUPP;
> +
> +    size = op_size[array_index_nospec(cmp.op, ARRAY_SIZE(op_size))];
> +    if ( buf->size < offset + size )
> +        return -EFAULT;
> +
> +    if ( copy_from_compat_offset((void *)&cmp.u, buf->h, offset, size) )
> +        return -EFAULT;
> +
> +    /*
> +     * The xlat magic doesn't quite know how to handle the union so
> +     * we need to fix things up here.
> +     */
> +#define XLAT_iommu_op_u_query_reserved XEN_IOMMUOP_query_reserved
> +    u = cmp.op;
> +
> +#define XLAT_iommu_op_query_reserved_HNDL_ranges(_d_, _s_)            \
> +    do                                                                \
> +    {                                                                 \
> +        if ( !compat_handle_is_null((_s_)->ranges) )                  \
> +        {                                                             \
> +            unsigned int *nr_entries = COMPAT_ARG_XLAT_VIRT_BASE;     \

uint32_t (see below) or perhaps even better typeof().

> +            xen_iommu_reserved_range_t *ranges =                      \
> +                (void *)(nr_entries + 1);                             \
> +                                                                      \
> +            if ( sizeof(*nr_entries) +                                \
> +                 (sizeof(*ranges) * (_s_)->nr_entries) >              \
> +                 COMPAT_ARG_XLAT_SIZE )                               \
> +                return -E2BIG;                                        \
> +                                                                      \
> +            *nr_entries = (_s_)->nr_entries;                          \
> +            set_xen_guest_handle((_d_)->ranges, ranges);              \
> +        }                                                             \
> +        else                                                          \
> +            set_xen_guest_handle((_d_)->ranges, NULL);                \
> +    } while (false)
> +
>      XLAT_iommu_op(&nat, &cmp);
>  
> +#undef XLAT_iommu_op_query_reserved_HNDL_ranges
> +
>      iommu_op(&nat);
>  
> +    status = nat.status;
> +
> +#define XLAT_iommu_op_query_reserved_HNDL_ranges(_d_, _s_)               \
> +    do                                                                   \
> +    {                                                                    \
> +        if ( !compat_handle_is_null((_d_)->ranges) )                     \
> +        {                                                                \
> +            unsigned int *nr_entries = COMPAT_ARG_XLAT_VIRT_BASE;        \
> +            compat_iommu_reserved_range_t *ranges =                      \
> +                (void *)(nr_entries + 1);                                \
> +            unsigned int nr =                                            \
> +                min_t(unsigned int, (_d_)->nr_entries, *nr_entries);     \
> +                                                                         \
> +            if ( __copy_to_compat_offset((_d_)->ranges, 0, ranges, nr) ) \
> +                status = -EFAULT;                                        \
> +        }                                                                \
> +    } while (false)
> +
>      XLAT_iommu_op(&cmp, &nat);
>  
> +    /* status will have been modified if __copy_to_compat_offset() failed */
> +    cmp.status = status;
> +
> +#undef XLAT_iommu_op_query_reserved_HNDL_ranges
> +
>      if ( __copy_field_to_compat(compat_handle_cast(buf->h,
>                                                     compat_iommu_op_t),
>                                  &cmp, status) )
>          return -EFAULT;
>  
> +#undef XLAT_iommu_op_u_query_reserved
> +
>      return 0;
>  }

It's somewhat more evident here, but I think even on the native path
the nr_entries field doesn't get copied back, despite it being an OUT field.
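
A hypothetical fix-up on the compat side, mirroring the existing copy-back of status (the union field path is an assumption):

    /* Sketch only: also propagate the updated (OUT) nr_entries back. */
    if ( __copy_field_to_compat(compat_handle_cast(buf->h,
                                                   compat_iommu_op_t),
                                &cmp, u.query_reserved.nr_entries) )
        return -EFAULT;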

> --- a/xen/include/public/iommu_op.h
> +++ b/xen/include/public/iommu_op.h
> @@ -25,11 +25,50 @@
>  
>  #include "xen.h"
>  
> +typedef uint64_t xen_bfn_t;
> +
> +/* Structure describing a single range reserved in the IOMMU */
> +struct xen_iommu_reserved_range {
> +    xen_bfn_t start_bfn;
> +    unsigned int nr_frames;
> +    unsigned int pad;

uint32_t

> +};
> +typedef struct xen_iommu_reserved_range xen_iommu_reserved_range_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_iommu_reserved_range_t);
> +
> +/*
> + * XEN_IOMMUOP_query_reserved: Query ranges reserved in the IOMMU.
> + */

Single line comment please.

> +#define XEN_IOMMUOP_query_reserved 1
> +
> +struct xen_iommu_op_query_reserved {
> +    /*
> +     * IN/OUT - On entry this is the number of entries available
> +     *          in the ranges array below.
> +     *          On exit this is the actual number of reserved ranges.
> +     */
> +    unsigned int nr_entries;
> +    unsigned int pad;

uint32_t
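
With the fixed-width types applied throughout, the public structures would look like this (sketch):

    struct xen_iommu_reserved_range {
        xen_bfn_t start_bfn;
        uint32_t nr_frames;
        uint32_t pad;
    };

    struct xen_iommu_op_query_reserved {
        /*
         * IN/OUT - On entry this is the number of entries available
         *          in the ranges array below.
         *          On exit this is the actual number of reserved ranges.
         */
        uint32_t nr_entries;
        uint32_t pad;
        /* ... remaining fields as in the posted header ... */
    };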

Jan



* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-08-23  9:47 ` [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops Paul Durrant
@ 2018-09-07 11:11   ` Jan Beulich
  2018-09-07 12:36     ` Paul Durrant
  2018-09-12  8:31     ` Paul Durrant
  0 siblings, 2 replies; 111+ messages in thread
From: Jan Beulich @ 2018-09-07 11:11 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Kevin Tian, george.dunlap

>>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> This patch adds a new method to the VT-d IOMMU implementation to find the
> MFN currently mapped by the specified BFN along with a wrapper function in
> generic IOMMU code to call the implementation if it exists.

For this to go in, I think the AMD side of it wants to also be implemented.

> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -305,6 +305,17 @@ int iommu_unmap_page(struct domain *d, bfn_t bfn)
>      return rc;
>  }
>  
> +int iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
> +                      unsigned int *flags)
> +{
> +    const struct domain_iommu *hd = dom_iommu(d);
> +
> +    if ( !iommu_enabled || !hd->platform_ops )
> +        return -EOPNOTSUPP;
> +
> +    return hd->platform_ops->lookup_page(d, bfn, mfn, flags);
> +}

Shouldn't this be restricted to PV guests? HVM ones aren't supposed
to know about MFNs.
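
For illustration, such a restriction in the wrapper might look like this (sketch; whether is_pv_domain() is the right predicate, and whether the check belongs here rather than in the hypercall handler, is part of the question):

    int iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
                          unsigned int *flags)
    {
        const struct domain_iommu *hd = dom_iommu(d);

        if ( !iommu_enabled || !hd->platform_ops )
            return -EOPNOTSUPP;

        /* HVM guests are not supposed to know about MFNs. */
        if ( !is_pv_domain(d) )
            return -EOPNOTSUPP;

        return hd->platform_ops->lookup_page(d, bfn, mfn, flags);
    }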

> +static int intel_iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
> +                                   unsigned int *flags)
> +{
> +    struct domain_iommu *hd = dom_iommu(d);
> +    struct dma_pte *page = NULL, *pte = NULL, val;

Pointless initializers. I also question the usefulness of "pte":

> +    u64 pg_maddr;
> +
> +    spin_lock(&hd->arch.mapping_lock);
> +
> +    pg_maddr = addr_to_dma_page_maddr(d, bfn_to_baddr(bfn), 0);
> +    if ( pg_maddr == 0 )
> +    {
> +        spin_unlock(&hd->arch.mapping_lock);
> +        return -ENOMEM;
> +    }
> +
> +    page = map_vtd_domain_page(pg_maddr);
> +    pte = page + (bfn_x(bfn) & LEVEL_MASK);
> +    val = *pte;

    val = page[bfn_x(bfn) & LEVEL_MASK];
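
Putting both remarks together, the opening of the function might become (sketch, not the posted code):

    static int intel_iommu_lookup_page(struct domain *d, bfn_t bfn,
                                       mfn_t *mfn, unsigned int *flags)
    {
        struct domain_iommu *hd = dom_iommu(d);
        struct dma_pte *page, val;    /* no initializers needed */
        u64 pg_maddr;

        spin_lock(&hd->arch.mapping_lock);

        pg_maddr = addr_to_dma_page_maddr(d, bfn_to_baddr(bfn), 0);
        if ( pg_maddr == 0 )
        {
            spin_unlock(&hd->arch.mapping_lock);
            return -ENOMEM;
        }

        page = map_vtd_domain_page(pg_maddr);
        /* Index the table directly rather than via a separate pte. */
        val = page[bfn_x(bfn) & LEVEL_MASK];
        /* ... remainder as posted ... */
    }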

Jan




* Re: [PATCH v6 09/14] mm / iommu: include need_iommu() test in iommu_use_hap_pt()
  2018-08-23  9:47 ` [PATCH v6 09/14] mm / iommu: include need_iommu() test in iommu_use_hap_pt() Paul Durrant
@ 2018-09-07 11:20   ` Jan Beulich
  2018-09-11  9:39     ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-07 11:20 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, George Dunlap, Andrew Cooper, Julien Grall,
	Jun Nakajima, xen-devel

>>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> The name 'iommu_use_hap_pt' suggests that that P2M table is in use as the
> domain's IOMMU pagetable which, prior to this patch, is not strictly true
> since the macro did not test whether the domain actually has IOMMU
> mappings.

Hmm, I would never have inferred "has IOMMU mappings" from this
variable name. To me it has always been "use HAP page tables for
the IOMMU if an IOMMU is in use". The code change looks sane, but
I'm not sure it is a clear improvement. Hence I wonder whether you
have a need for this change in subsequent patches that goes
beyond what you say above.
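
For reference, the change under discussion amounts to something like this (sketch; the exact definition in the posted patch may differ):

    /* Use HAP page tables for the IOMMU only if the domain actually
     * needs IOMMU mappings. */
    #define iommu_use_hap_pt(d) \
        (hap_enabled(d) && need_iommu(d) && iommu_hap_pt_share)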

Jan




* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-07 11:11   ` Jan Beulich
@ 2018-09-07 12:36     ` Paul Durrant
  2018-09-07 14:56       ` Jan Beulich
  2018-09-12  8:31     ` Paul Durrant
  1 sibling, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-07 12:36 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: xen-devel, Kevin Tian, George Dunlap

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 07 September 2018 12:11
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: George Dunlap <George.Dunlap@citrix.com>; Kevin Tian
> <kevin.tian@intel.com>; xen-devel <xen-devel@lists.xenproject.org>
> Subject: Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
> 
> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> > This patch adds a new method to the VT-d IOMMU implementation to find
> the
> > MFN currently mapped by the specified BFN along with a wrapper function
> in
> > generic IOMMU code to call the implementation if it exists.
> 
> For this to go in, I think the AMD side of it wants to also be implemented.

Why? It can be done later. Nothing existing is going to break if it is not implemented.

  Paul

> 
> > --- a/xen/drivers/passthrough/iommu.c
> > +++ b/xen/drivers/passthrough/iommu.c
> > @@ -305,6 +305,17 @@ int iommu_unmap_page(struct domain *d, bfn_t
> bfn)
> >      return rc;
> >  }
> >
> > +int iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
> > +                      unsigned int *flags)
> > +{
> > +    const struct domain_iommu *hd = dom_iommu(d);
> > +
> > +    if ( !iommu_enabled || !hd->platform_ops )
> > +        return -EOPNOTSUPP;
> > +
> > +    return hd->platform_ops->lookup_page(d, bfn, mfn, flags);
> > +}
> 
> Shouldn't this be restricted to PV guests? HVM ones aren't supposed
> to know about MFNs.
> 
> > +static int intel_iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t
> *mfn,
> > +                                   unsigned int *flags)
> > +{
> > +    struct domain_iommu *hd = dom_iommu(d);
> > +    struct dma_pte *page = NULL, *pte = NULL, val;
> 
> Pointless initializers. I also question the usefulness of "pte":
> 
> > +    u64 pg_maddr;
> > +
> > +    spin_lock(&hd->arch.mapping_lock);
> > +
> > +    pg_maddr = addr_to_dma_page_maddr(d, bfn_to_baddr(bfn), 0);
> > +    if ( pg_maddr == 0 )
> > +    {
> > +        spin_unlock(&hd->arch.mapping_lock);
> > +        return -ENOMEM;
> > +    }
> > +
> > +    page = map_vtd_domain_page(pg_maddr);
> > +    pte = page + (bfn_x(bfn) & LEVEL_MASK);
> > +    val = *pte;
> 
>     val = page[bfn_x(bfn) & LEVEL_MASK];
> 
> Jan
> 



* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-07 12:36     ` Paul Durrant
@ 2018-09-07 14:56       ` Jan Beulich
  2018-09-07 15:24         ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-07 14:56 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Kevin Tian, george.dunlap

>>> On 07.09.18 at 14:36, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 07 September 2018 12:11
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: George Dunlap <George.Dunlap@citrix.com>; Kevin Tian
>> <kevin.tian@intel.com>; xen-devel <xen-devel@lists.xenproject.org>
>> Subject: Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
>> 
>> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
>> > This patch adds a new method to the VT-d IOMMU implementation to find
>> the
>> > MFN currently mapped by the specified BFN along with a wrapper function
>> in
>> > generic IOMMU code to call the implementation if it exists.
>> 
>> For this to go in, I think the AMD side of it wants to also be implemented.
> 
> Why? It can be done later. Nothing existing is going to break if it is not 
> implemented.

If it were something terribly difficult to implement, I'd
probably agree. But I don't really like introducing PV-IOMMU
for Intel only (and hence once again making AMD a second-class
citizen). It would be a different matter if you had the
implementation ready, but the maintainer(s) didn't respond...

Jan




* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-07 14:56       ` Jan Beulich
@ 2018-09-07 15:24         ` Paul Durrant
  2018-09-07 15:52           ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-07 15:24 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: xen-devel, Kevin Tian, George Dunlap

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 07 September 2018 15:56
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: George Dunlap <George.Dunlap@citrix.com>; Kevin Tian
> <kevin.tian@intel.com>; xen-devel <xen-devel@lists.xenproject.org>
> Subject: RE: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
> 
> >>> On 07.09.18 at 14:36, <Paul.Durrant@citrix.com> wrote:
> >>  -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 07 September 2018 12:11
> >> To: Paul Durrant <Paul.Durrant@citrix.com>
> >> Cc: George Dunlap <George.Dunlap@citrix.com>; Kevin Tian
> >> <kevin.tian@intel.com>; xen-devel <xen-devel@lists.xenproject.org>
> >> Subject: Re: [PATCH v6 08/14] vtd: add lookup_page method to
> iommu_ops
> >>
> >> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> >> > This patch adds a new method to the VT-d IOMMU implementation to
> find
> >> the
> >> > MFN currently mapped by the specified BFN along with a wrapper
> function
> >> in
> >> > generic IOMMU code to call the implementation if it exists.
> >>
> >> For this to go in, I think the AMD side of it wants to also be implemented.
> >
> > Why? It can be done later. Nothing existing is going to break if it is not
> > implemented.
> 
> If it was something that's terribly difficult to implement, I'd
> probably agree. But introducing PV IOMMU for Intel only (and
> hence once again making AMD a second class citizen) I don't
> really like. Another thing would be if you had the implementation
> ready, but the maintainer(s) don't respond...
> 

It's all time though. The fact is that, in XenServer, we've never had PV-IOMMU for AMD. It would be wonderful to have it for AMD too and indeed I may find the time to do it, but is that really a reason to block integration of these patches when no actual regression will be caused by the lack of AMD support?

  Paul

> Jan
> 



* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-07 15:24         ` Paul Durrant
@ 2018-09-07 15:52           ` Jan Beulich
  0 siblings, 0 replies; 111+ messages in thread
From: Jan Beulich @ 2018-09-07 15:52 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Kevin Tian, george.dunlap

>>> On 07.09.18 at 17:24, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 07 September 2018 15:56
>> 
>> >>> On 07.09.18 at 14:36, <Paul.Durrant@citrix.com> wrote:
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: 07 September 2018 12:11
>> >>
>> >> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
>> >> > This patch adds a new method to the VT-d IOMMU implementation to
>> find
>> >> the
>> >> > MFN currently mapped by the specified BFN along with a wrapper
>> function
>> >> in
>> >> > generic IOMMU code to call the implementation if it exists.
>> >>
>> >> For this to go in, I think the AMD side of it wants to also be implemented.
>> >
>> > Why? It can be done later. Nothing existing is going to break if it is not
>> > implemented.
>> 
>> If it was something that's terribly difficult to implement, I'd
>> probably agree. But introducing PV IOMMU for Intel only (and
>> hence once again making AMD a second class citizen) I don't
>> really like. Another thing would be if you had the implementation
>> ready, but the maintainer(s) don't respond...
>> 
> 
> It's all time though. The fact is that, in XenServer, we've never had 
> PV-IOMMU for AMD. It would be wonderful to have it for AMD too and indeed I 
> may find the time to do it, but is that really a reason to block integration 
> of these patches when no actual regression will be caused by of the lack of 
> AMD support?

First of all - I don't mean to block anything. I deliberately said
"I think it wants to be", not "It needs to be". But beyond that
my second-class citizen statement stands - it's no different
from e.g. introducing VMX support but not SVM support. That also
would not result in any regression, but still would not be very
desirable.

Jan




* Re: [PATCH v6 06/14] iommu: track reserved ranges using a rangeset
  2018-09-07 10:40   ` Jan Beulich
@ 2018-09-11  9:28     ` Paul Durrant
  0 siblings, 0 replies; 111+ messages in thread
From: Paul Durrant @ 2018-09-11  9:28 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: xen-devel, Kevin Tian

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 07 September 2018 11:40
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Kevin Tian <kevin.tian@intel.com>; xen-devel <xen-
> devel@lists.xenproject.org>
> Subject: Re: [PATCH v6 06/14] iommu: track reserved ranges using a rangeset
> 
> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> > @@ -248,12 +252,16 @@ int iommu_construct(struct domain *d)
> >
> >  void iommu_domain_destroy(struct domain *d)
> >  {
> > -    if ( !iommu_enabled || !dom_iommu(d)->platform_ops )
> > +    const struct domain_iommu *hd = dom_iommu(d);
> > +
> > +    if ( !iommu_enabled || !hd->platform_ops )
> >          return;
> >
> >      iommu_teardown(d);
> >
> >      arch_iommu_domain_destroy(d);
> > +
> > +    rangeset_destroy(hd->reserved_ranges);
> 
> For idempotency reasons perhaps better to store NULL after
> the call?

Ok.

> 
> > --- a/xen/drivers/passthrough/vtd/x86/vtd.c
> > +++ b/xen/drivers/passthrough/vtd/x86/vtd.c
> > @@ -154,8 +154,21 @@ void __hwdom_init
> vtd_set_hwdom_mapping(struct domain *d)
> >
> >          rc = iommu_map_page(d, _bfn(pfn), _mfn(pfn),
> >  			    IOMMUF_readable | IOMMUF_writable);
> > +
> > +        /*
> > +         * The only reason a reserved page would be mapped is that
> > +         * iommu_inclusive_mapping is set, in which case it needs to be
> > +         * marked as reserved in the IOMMU.
> > +         */
> > +        if ( !rc && page_is_ram_type(pfn, RAM_TYPE_RESERVED) )
> > +        {
> > +            ASSERT(iommu_inclusive_mapping);
> > +
> > +            rc = rangeset_add_singleton(dom_iommu(d)->reserved_ranges,
> pfn);
> > +        }
> 
> Why would this be restricted to the E820 reserved type? I think this
> should cover everything that gets mapped with iommu_inclusive_mapping
> set, but not mapped with the flag clear.
> 

When PV-IOMMU is enabled it clearly needs to be able to modify the domain's memory mappings (otherwise what's the point of the API), so I could change this rangeset to include everything mapped for the domain except ranges explicitly identified as RAM.
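
i.e. something along these lines (sketch; the exact predicate is an assumption):

    /*
     * Sketch only: treat everything mapped purely because of
     * iommu_inclusive_mapping - i.e. anything not identified as
     * ordinary RAM - as reserved.
     */
    if ( !rc && !page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) )
    {
        ASSERT(iommu_inclusive_mapping);

        rc = rangeset_add_singleton(dom_iommu(d)->reserved_ranges, pfn);
    }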

  Paul

> Jan
> 



* Re: [PATCH v6 07/14] x86: add iommu_op to query reserved ranges
  2018-09-07 11:01   ` Jan Beulich
@ 2018-09-11  9:34     ` Paul Durrant
  2018-09-11  9:43       ` Jan Beulich
  2018-09-13  6:11     ` Tian, Kevin
  1 sibling, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-11  9:34 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Ian Jackson, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 07 September 2018 12:01
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Wei Liu
> <wei.liu2@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Ian
> Jackson <Ian.Jackson@citrix.com>; Stefano Stabellini
> <sstabellini@kernel.org>; xen-devel <xen-devel@lists.xenproject.org>;
> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Tim (Xen.org)
> <tim@xen.org>
> Subject: Re: [PATCH v6 07/14] x86: add iommu_op to query reserved ranges
> 
> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> > This patch adds an iommu_op to allow the domain IOMMU reserved
> ranges to be
> > queried by the guest.
> >
> > NOTE: The number of reserved ranges is determined by system firmware,
> in
> >       conjunction with Xen command line options, and is expected to be
> >       small. Thus, to avoid over-complicating the code, there is no
> >       pre-emption check within the operation.
> 
> Hmm, RMRRs reportedly can cover a fair part of (the entire?) frame
> buffer of a graphics device.

That would still be phrased as a single range though, right?

> 
> > @@ -100,16 +176,27 @@ long do_iommu_op(unsigned int nr_bufs,
> >      return rc;
> >  }
> >
> > +CHECK_iommu_reserved_range;
> > +
> >  int compat_one_iommu_op(compat_iommu_op_buf_t *buf)
> >  {
> > -    compat_iommu_op_t cmp;
> > +    compat_iommu_op_t cmp = {};
> > +    size_t offset;
> > +    static const size_t op_size[] = {
> > +        [XEN_IOMMUOP_query_reserved] = sizeof(struct
> compat_iommu_op_query_reserved),
> > +    };
> > +    size_t size;
> >      xen_iommu_op_t nat;
> > +    unsigned int u;
> > +    int32_t status;
> >      int rc;
> >
> > -    if ( buf->size < sizeof(cmp) )
> > +    offset = offsetof(struct compat_iommu_op, u);
> > +
> > +    if ( buf->size < offset )
> >          return -EFAULT;
> 
> For some reason I notice this only now and here - -EFAULT isn't
> really appropriately describing the error condition here. -ENODATA
> or -ENOBUFS perhaps?

Ok. ENODATA seems best.
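
I.e. simply (sketch):

    if ( buf->size < offset )
        return -ENODATA;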

> 
> > @@ -119,17 +206,82 @@ int
> compat_one_iommu_op(compat_iommu_op_buf_t *buf)
> >      if ( rc )
> >          return rc;
> >
> > +    if ( cmp.op >= ARRAY_SIZE(op_size) )
> > +        return -EOPNOTSUPP;
> > +
> > +    size = op_size[array_index_nospec(cmp.op, ARRAY_SIZE(op_size))];
> > +    if ( buf->size < offset + size )
> > +        return -EFAULT;
> > +
> > +    if ( copy_from_compat_offset((void *)&cmp.u, buf->h, offset, size) )
> > +        return -EFAULT;
> > +
> > +    /*
> > +     * The xlat magic doesn't quite know how to handle the union so
> > +     * we need to fix things up here.
> > +     */
> > +#define XLAT_iommu_op_u_query_reserved
> XEN_IOMMUOP_query_reserved
> > +    u = cmp.op;
> > +
> > +#define XLAT_iommu_op_query_reserved_HNDL_ranges(_d_, _s_)
> \
> > +    do                                                                \
> > +    {                                                                 \
> > +        if ( !compat_handle_is_null((_s_)->ranges) )                  \
> > +        {                                                             \
> > +            unsigned int *nr_entries = COMPAT_ARG_XLAT_VIRT_BASE;     \
> 
> uint32_t (see below) or perhaps even better typeof().
> 

Ok.
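
E.g. (a sketch, inside the HNDL macro):

    typeof((_s_)->nr_entries) *nr_entries = COMPAT_ARG_XLAT_VIRT_BASE;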

> > +            xen_iommu_reserved_range_t *ranges =                      \
> > +                (void *)(nr_entries + 1);                             \
> > +                                                                      \
> > +            if ( sizeof(*nr_entries) +                                \
> > +                 (sizeof(*ranges) * (_s_)->nr_entries) >              \
> > +                 COMPAT_ARG_XLAT_SIZE )                               \
> > +                return -E2BIG;                                        \
> > +                                                                      \
> > +            *nr_entries = (_s_)->nr_entries;                          \
> > +            set_xen_guest_handle((_d_)->ranges, ranges);              \
> > +        }                                                             \
> > +        else                                                          \
> > +            set_xen_guest_handle((_d_)->ranges, NULL);                \
> > +    } while (false)
> > +
> >      XLAT_iommu_op(&nat, &cmp);
> >
> > +#undef XLAT_iommu_op_query_reserved_HNDL_ranges
> > +
> >      iommu_op(&nat);
> >
> > +    status = nat.status;
> > +
> > +#define XLAT_iommu_op_query_reserved_HNDL_ranges(_d_, _s_)
> \
> > +    do                                                                   \
> > +    {                                                                    \
> > +        if ( !compat_handle_is_null((_d_)->ranges) )                     \
> > +        {                                                                \
> > +            unsigned int *nr_entries = COMPAT_ARG_XLAT_VIRT_BASE;        \
> > +            compat_iommu_reserved_range_t *ranges =                      \
> > +                (void *)(nr_entries + 1);                                \
> > +            unsigned int nr =                                            \
> > +                min_t(unsigned int, (_d_)->nr_entries, *nr_entries);     \
> > +                                                                         \
> > +            if ( __copy_to_compat_offset((_d_)->ranges, 0, ranges, nr) ) \
> > +                status = -EFAULT;                                        \
> > +        }                                                                \
> > +    } while (false)
> > +
> >      XLAT_iommu_op(&cmp, &nat);
> >
> > +    /* status will have been modified if __copy_to_compat_offset() failed
> */
> > +    cmp.status = status;
> > +
> > +#undef XLAT_iommu_op_query_reserved_HNDL_ranges
> > +
> >      if ( __copy_field_to_compat(compat_handle_cast(buf->h,
> >                                                     compat_iommu_op_t),
> >                                  &cmp, status) )
> >          return -EFAULT;
> >
> > +#undef XLAT_iommu_op_u_query_reserved
> > +
> >      return 0;
> >  }
> 
> It's somehow more evident here, but I think even on the native path
> the nr_entries field doesn't get copied back despite it being an OUT.

Indeed. It would be so much simpler just to copy back the entire buf to avoid the need to special-case this for each op.
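
For the compat path the copy-back might look something like this (just a
sketch - I'm assuming the union member is named 'query_reserved', and
the copy would of course need to be conditional on the op):

    if ( __copy_field_to_compat(compat_handle_cast(buf->h,
                                                   compat_iommu_op_t),
                                &cmp, u.query_reserved.nr_entries) )
        return -EFAULT;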

> 
> > --- a/xen/include/public/iommu_op.h
> > +++ b/xen/include/public/iommu_op.h
> > @@ -25,11 +25,50 @@
> >
> >  #include "xen.h"
> >
> > +typedef uint64_t xen_bfn_t;
> > +
> > +/* Structure describing a single range reserved in the IOMMU */
> > +struct xen_iommu_reserved_range {
> > +    xen_bfn_t start_bfn;
> > +    unsigned int nr_frames;
> > +    unsigned int pad;
> 
> uint32_t

Ok.

> 
> > +};
> > +typedef struct xen_iommu_reserved_range
> xen_iommu_reserved_range_t;
> > +DEFINE_XEN_GUEST_HANDLE(xen_iommu_reserved_range_t);
> > +
> > +/*
> > + * XEN_IOMMUOP_query_reserved: Query ranges reserved in the
> IOMMU.
> > + */
> 
> Single line comment please.
> 

Ok.

> > +#define XEN_IOMMUOP_query_reserved 1
> > +
> > +struct xen_iommu_op_query_reserved {
> > +    /*
> > +     * IN/OUT - On entry this is the number of entries available
> > +     *          in the ranges array below.
> > +     *          On exit this is the actual number of reserved ranges.
> > +     */
> > +    unsigned int nr_entries;
> > +    unsigned int pad;
> 
> uint32_t
> 

Ok.
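
For reference, the adjusted declarations would then presumably be
(sketch):

    struct xen_iommu_reserved_range {
        xen_bfn_t start_bfn;
        uint32_t nr_frames;
        uint32_t pad;
    };

    struct xen_iommu_op_query_reserved {
        uint32_t nr_entries;
        uint32_t pad;
        XEN_GUEST_HANDLE(xen_iommu_reserved_range_t) ranges;
    };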

  Paul

> Jan



^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 09/14] mm / iommu: include need_iommu() test in iommu_use_hap_pt()
  2018-09-07 11:20   ` Jan Beulich
@ 2018-09-11  9:39     ` Paul Durrant
  2018-09-11  9:47       ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-11  9:39 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Stefano Stabellini, Andrew Cooper, George Dunlap, Julien Grall,
	Jun Nakajima, xen-devel

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 07 September 2018 12:20
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; George Dunlap
> <George.Dunlap@citrix.com>; Jun Nakajima <jun.nakajima@intel.com>;
> Stefano Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> devel@lists.xenproject.org>
> Subject: Re: [PATCH v6 09/14] mm / iommu: include need_iommu() test in
> iommu_use_hap_pt()
> 
> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> > The name 'iommu_use_hap_pt' suggests that that P2M table is in use as
> the
> > domain's IOMMU pagetable which, prior to this patch, is not strictly true
> > since the macro did not test whether the domain actually has IOMMU
> > mappings.
> 
> Hmm, I would never have implied "has IOMMU mappings" from this
> variable name. To me it has always been "use HAP page tables for
> IOMMU if an IOMMU is in use". The code change looks sane, but
> I'm not sure it is a clear improvement. Hence I wonder whether you
> have a need for this change in subsequent patches which goes
> beyond what you say above.
> 

I could take it out but it would mean a non-trivial rebase, and to me - if true - the name still implies the IOMMU is in use for the domain, so I'd like to keep the change.

  Paul

> Jan
> 



^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 07/14] x86: add iommu_op to query reserved ranges
  2018-09-11  9:34     ` Paul Durrant
@ 2018-09-11  9:43       ` Jan Beulich
  0 siblings, 0 replies; 111+ messages in thread
From: Jan Beulich @ 2018-09-11  9:43 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim Deegan, george.dunlap, Ian Jackson, xen-devel

>>> On 11.09.18 at 11:34, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 07 September 2018 12:01
>> 
>> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
>> > This patch adds an iommu_op to allow the domain IOMMU reserved
>> ranges to be
>> > queried by the guest.
>> >
>> > NOTE: The number of reserved ranges is determined by system firmware,
>> in
>> >       conjunction with Xen command line options, and is expected to be
>> >       small. Thus, to avoid over-complicating the code, there is no
>> >       pre-emption check within the operation.
>> 
>> Hmm, RMRRs reportedly can cover a fair part of (the entire?) frame
>> buffer of a graphics device.
> 
> That would still be phrased as a single range though, right?

Ah, yes, indeed.

>> >      iommu_op(&nat);
>> >
>> > +    status = nat.status;
>> > +
>> > +#define XLAT_iommu_op_query_reserved_HNDL_ranges(_d_, _s_)
>> \
>> > +    do                                                                   \
>> > +    {                                                                    \
>> > +        if ( !compat_handle_is_null((_d_)->ranges) )                     \
>> > +        {                                                                \
>> > +            unsigned int *nr_entries = COMPAT_ARG_XLAT_VIRT_BASE;        \
>> > +            compat_iommu_reserved_range_t *ranges =                      \
>> > +                (void *)(nr_entries + 1);                                \
>> > +            unsigned int nr =                                            \
>> > +                min_t(unsigned int, (_d_)->nr_entries, *nr_entries);     \
>> > +                                                                         \
>> > +            if ( __copy_to_compat_offset((_d_)->ranges, 0, ranges, nr) ) \
>> > +                status = -EFAULT;                                        \
>> > +        }                                                                \
>> > +    } while (false)
>> > +
>> >      XLAT_iommu_op(&cmp, &nat);
>> >
>> > +    /* status will have been modified if __copy_to_compat_offset() failed
>> */
>> > +    cmp.status = status;
>> > +
>> > +#undef XLAT_iommu_op_query_reserved_HNDL_ranges
>> > +
>> >      if ( __copy_field_to_compat(compat_handle_cast(buf->h,
>> >                                                     compat_iommu_op_t),
>> >                                  &cmp, status) )
>> >          return -EFAULT;
>> >
>> > +#undef XLAT_iommu_op_u_query_reserved
>> > +
>> >      return 0;
>> >  }
>> 
>> It's somehow more evident here, but I think even on the native path
>> the nr_entries field doesn't get copied back despite it being an OUT.
> 
> Indeed. It would be so much simpler just to copy back the entire buf to 
> avoid the need to special-case this for each op.

In case of such complications I view doing so as perfectly legitimate. It's
just that in cases where copying a single field is obviously easy and
sufficient, I suggest/request that only that one field be copied.


Jan




^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 09/14] mm / iommu: include need_iommu() test in iommu_use_hap_pt()
  2018-09-11  9:39     ` Paul Durrant
@ 2018-09-11  9:47       ` Jan Beulich
  2018-09-13  6:23         ` Tian, Kevin
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-11  9:47 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Kevin Tian, Stefano Stabellini, Suravee Suthikulpanit,
	Andrew Cooper, george.dunlap, Julien Grall, Jun Nakajima,
	xen-devel

>>> On 11.09.18 at 11:39, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 07 September 2018 12:20
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
>> <Andrew.Cooper3@citrix.com>; George Dunlap
>> <George.Dunlap@citrix.com>; Jun Nakajima <jun.nakajima@intel.com>;
>> Stefano Stabellini <sstabellini@kernel.org>; xen-devel <xen-
>> devel@lists.xenproject.org>
>> Subject: Re: [PATCH v6 09/14] mm / iommu: include need_iommu() test in
>> iommu_use_hap_pt()
>> 
>> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
>> > The name 'iommu_use_hap_pt' suggests that that P2M table is in use as
>> the
>> > domain's IOMMU pagetable which, prior to this patch, is not strictly true
>> > since the macro did not test whether the domain actually has IOMMU
>> > mappings.
>> 
>> Hmm, I would never have implied "has IOMMU mappings" from this
>> variable name. To me it has always been "use HAP page tables for
>> IOMMU if an IOMMU is in use". The code change looks sane, but
>> I'm not sure it is a clear improvement. Hence I wonder whether you
>> have a need for this change in subsequent patches which goes
>> beyond what you say above.
>> 
> 
> I could take it out but it would mean a non-trivial rebase, and to me - if 
> true - the name still implies the IOMMU is in use for the domain, so I'd like 
> to keep the change.

Let's broaden the Cc list a little - perhaps we can get further opinions this
way.

Jan




^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 10/14] mm / iommu: split need_iommu() into has_iommu_pt() and need_iommu_pt_sync()
  2018-08-23  9:47 ` [PATCH v6 10/14] mm / iommu: split need_iommu() into has_iommu_pt() and need_iommu_pt_sync() Paul Durrant
  2018-08-23 11:10   ` Razvan Cojocaru
@ 2018-09-11 14:31   ` Jan Beulich
  2018-09-11 15:40     ` Paul Durrant
  1 sibling, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-11 14:31 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
	Razvan Cojocaru, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall,
	Tamas K Lengyel, Suravee Suthikulpanit, xen-devel, Brian Woods

>>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> --- a/xen/arch/x86/hvm/mtrr.c
> +++ b/xen/arch/x86/hvm/mtrr.c
> @@ -793,7 +793,8 @@ HVM_REGISTER_SAVE_RESTORE(MTRR, hvm_save_mtrr_msr, hvm_load_mtrr_msr,
>  
>  void memory_type_changed(struct domain *d)
>  {
> -    if ( need_iommu(d) && d->vcpu && d->vcpu[0] )
> +    if ( (need_iommu_pt_sync(d) || iommu_use_hap_pt(d)) &&
> +         d->vcpu && d->vcpu[0] )
>      {
>          p2m_memory_type_changed(d);
>          flush_all(FLUSH_CACHE);

How does need_iommu_pt_sync() come into play here? To me,
has_iommu_pt() would seem to be a better match. The question
here solely is (iirc) whether memory types take any effect in the
first place (or else we can skip adjusting them), which in turn
solely depends on whether there are any pass-through devices
in the domain. This is along the lines of ...

> @@ -841,7 +842,7 @@ int epte_get_entry_emt(struct domain *d, unsigned long gfn, mfn_t mfn,
>          return MTRR_TYPE_UNCACHABLE;
>      }
>  
> -    if ( !need_iommu(d) && !cache_flush_permitted(d) )
> +    if ( !has_iommu_pt(d) && !cache_flush_permitted(d) )
>      {
>          *ipat = 1;
>          return MTRR_TYPE_WRBACK;

... this.

> --- a/xen/arch/x86/x86_64/mm.c
> +++ b/xen/arch/x86/x86_64/mm.c
> @@ -1426,7 +1426,8 @@ int memory_add(unsigned long spfn, unsigned long epfn, unsigned int pxm)
>      if ( ret )
>          goto destroy_m2p;
>  
> -    if ( iommu_enabled && !iommu_passthrough && !need_iommu(hardware_domain) )
> +    if ( iommu_enabled && !iommu_passthrough &&
> +         !need_iommu_pt_sync(hardware_domain) )
>      {
>          for ( i = spfn; i < epfn; i++ )
>              if ( iommu_map_page(hardware_domain, _bfn(i), _mfn(i),

I'm confused - the condition you change looks to be inverted. Wouldn't
we better fix this?

And then I again wonder whether you've chosen the right predicate:
Where would the equivalent mappings come from in the opposite case?

> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -805,8 +805,8 @@ int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap *xatp,
>      xatp->size -= start;
>  
>  #ifdef CONFIG_HAS_PASSTHROUGH
> -    if ( need_iommu(d) )
> -        this_cpu(iommu_dont_flush_iotlb) = 1;
> +    if ( need_iommu_pt_sync(d) || iommu_use_hap_pt(d) )
> +       this_cpu(iommu_dont_flush_iotlb) = 1;
>  #endif

Rather than making the conditional more complicated, perhaps
simply drop it (and move the reset-to-false code out of ...

> @@ -828,7 +828,7 @@ int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap *xatp,
>      }
>  
>  #ifdef CONFIG_HAS_PASSTHROUGH
> -    if ( need_iommu(d) )
> +    if ( need_iommu_pt_sync(d) || iommu_use_hap_pt(d) )
>      {
>          int ret;

... this if())?

Also it looks to me as if here you've got confused by the meaning
you've assigned to need_iommu_pt_sync(): According to the
description, it is about sync-ing of page tables. Here, however,
we're checking whether to flush TLBs.

> @@ -179,8 +179,10 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
>          return;
>  
>      register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m table", 0);
> -    d->need_iommu = !!iommu_dom0_strict;
> -    if ( need_iommu(d) && !iommu_use_hap_pt(d) )
> +
> +    hd->status = IOMMU_STATUS_initializing;
> +    hd->need_sync = !!iommu_dom0_strict && !iommu_use_hap_pt(d);

Pointless !! afaict.

>  int iommu_construct(struct domain *d)
>  {
> -    if ( need_iommu(d) > 0 )
> +    struct domain_iommu *hd = dom_iommu(d);
> +
> +    if ( hd->status == IOMMU_STATUS_initialized )
>          return 0;
>  
> -    if ( !iommu_use_hap_pt(d) )
> +    if ( !iommu_use_hap_pt(d) && hd->status < IOMMU_STATUS_initialized )

Isn't the addition here redundant with the earlier if()?

> @@ -889,9 +886,11 @@ void watchdog_domain_destroy(struct domain *d);
>  #define is_pinned_vcpu(v) ((v)->domain->is_pinned || \
>                             cpumask_weight((v)->cpu_hard_affinity) == 1)
>  #ifdef CONFIG_HAS_PASSTHROUGH
> -#define need_iommu(d)    ((d)->need_iommu)
> +#define has_iommu_pt(d) (dom_iommu(d)->status != IOMMU_STATUS_disabled)
> +#define need_iommu_pt_sync(d) (dom_iommu(d)->need_sync)
>  #else
> -#define need_iommu(d)    (0)
> +#define has_iommu_pt(d) (0)
> +#define need_iommu_pt_sync(d) (0)

Please use false here (and no need for the parentheses).

Jan



^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 11/14] x86: add iommu_op to enable modification of IOMMU mappings
  2018-08-23  9:47 ` [PATCH v6 11/14] x86: add iommu_op to enable modification of IOMMU mappings Paul Durrant
@ 2018-09-11 14:48   ` Jan Beulich
  2018-09-11 15:52     ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-11 14:48 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

>>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> This patch adds an iommu_op which checks whether it is possible or
> safe for a domain to modify its own IOMMU mappings and, if so, creates
> a rangeset to track modifications.

Now this can surely grow pretty big?

> --- a/xen/common/iommu_op.c
> +++ b/xen/common/iommu_op.c
> @@ -78,6 +78,51 @@ static int iommu_op_query_reserved(struct xen_iommu_op_query_reserved *op)
>      return 0;
>  }
>  
> +static int iommu_op_enable_modification(
> +    struct xen_iommu_op_enable_modification *op)
> +{
> +    struct domain *currd = current->domain;
> +    struct domain_iommu *iommu = dom_iommu(currd);
> +    const struct iommu_ops *ops = iommu->platform_ops;
> +
> +    if ( op->cap || op->pad )
> +        return -EINVAL;
> +
> +    /*
> +     * XEN_IOMMU_CAP_per_device_mappings is not supported yet so we can
> +     * leave op->cap alone.
> +     */
> +
> +    /* Has modification already been enabled? */
> +    if ( iommu->iommu_op_ranges )
> +        return 0;

I don't recall there being any synchronization around the check
here until ...

> +    /*
> +     * The IOMMU mappings cannot be modified if:
> +     * - the IOMMU is not enabled or,
> > +     * - the current domain is dom0 and translation is disabled or,
> +     * - HAP is enabled and the IOMMU shares the mappings.
> +     */
> +    if ( !iommu_enabled ||
> +         (is_hardware_domain(currd) && iommu_passthrough) ||
> +         iommu_use_hap_pt(currd) )
> +        return -EACCES;
> +
> +    /*
> +     * The IOMMU implementation must provide the lookup method if
> +     * modification of the mappings is to be supported.
> +     */
> +    if ( !ops->lookup_page )
> +        return -EOPNOTSUPP;
> +
> +    iommu->iommu_op_ranges = rangeset_new(currd, NULL, 0);

... the assignment here. In which case this is a (multi-vCPU) guest
controllable leak.

> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -26,7 +26,6 @@ static void iommu_dump_p2m_table(unsigned char key);
>  
>  unsigned int __read_mostly iommu_dev_iotlb_timeout = 1000;
>  integer_param("iommu_dev_iotlb_timeout", iommu_dev_iotlb_timeout);
> -
>  /*
>   * The 'iommu' parameter enables the IOMMU.  Optional comma separated
>   * value may contain:

Stray change?

> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -1460,7 +1460,7 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag)
>      }
>  
>   done:
> -    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
> +    if ( !has_arch_pdevs(d) && has_iommu_pt(d) && !hd->iommu_op_ranges )
>          iommu_teardown(d);
>      pcidevs_unlock();
>  
> @@ -1510,7 +1510,7 @@ int deassign_device(struct domain *d, u16 seg, u8 bus, u8 devfn)
>  
>      pdev->fault.count = 0;
>  
> -    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
> +    if ( !has_arch_pdevs(d) && has_iommu_pt(d) && !hd->iommu_op_ranges )
>          iommu_teardown(d);

These additions are pretty un-obvious, and hence at least need
comments. But I'm also unclear about the intended behavior: For
a guest not meaning to play with its mappings, why would you
keep the tables around (and the memory uselessly allocated)?

> --- a/xen/include/public/iommu_op.h
> +++ b/xen/include/public/iommu_op.h
> @@ -61,6 +61,25 @@ struct xen_iommu_op_query_reserved {
>      XEN_GUEST_HANDLE(xen_iommu_reserved_range_t) ranges;
>  };
>  
> +/*
> + * XEN_IOMMUOP_enable_modification: Enable operations that modify IOMMU
> + *                                  mappings.
> + */
> +#define XEN_IOMMUOP_enable_modification 2
> +
> +struct xen_iommu_op_enable_modification {
> +    /*
> +     * OUT - On successful return this is set to the bitwise OR of capabilities
> +     *       defined below. On entry this must be set to zero.
> +     */
> +    unsigned int cap;
> +    unsigned int pad;

uint32_t

> --- a/xen/include/xlat.lst
> +++ b/xen/include/xlat.lst
> @@ -79,6 +79,7 @@
>  ?	vcpu_hvm_x86_64			hvm/hvm_vcpu.h
>  !	iommu_op			iommu_op.h
>  !	iommu_op_buf			iommu_op.h
> +!	iommu_op_enable_modification	iommu_op.h

The structure above looks to be 32-/64-bit agnostic. Why is this !
instead of ? ?

Jan



^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 12/14] memory: add get_paged_gfn() as a wrapper...
  2018-08-23  9:47 ` [PATCH v6 12/14] memory: add get_paged_gfn() as a wrapper Paul Durrant
  2018-08-23 10:24   ` Julien Grall
@ 2018-09-11 14:56   ` Jan Beulich
  2018-09-12  9:10     ` Paul Durrant
  1 sibling, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-11 14:56 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

>>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> ...for some uses of get_page_from_gfn().
> 
> There are many occurences of the following pattern in the code:
> 
>     q = <readonly look-up> ? P2M_ALLOC : P2M_UNSHARE;

Especially with this UNSHARE in mind - is "paged" in the helper
function's name really suitable? Since we (I think) already have
get_gfn(), how about try_get_gfn()?

> --- a/xen/arch/x86/hvm/emulate.c
> +++ b/xen/arch/x86/hvm/emulate.c
> @@ -350,34 +350,16 @@ static int hvmemul_do_io_buffer(
>  
>  static int hvmemul_acquire_page(unsigned long gmfn, struct page_info **page)
>  {
> -    struct domain *curr_d = current->domain;
> -    p2m_type_t p2mt;
> -
> -    *page = get_page_from_gfn(curr_d, gmfn, &p2mt, P2M_UNSHARE);
> -
> -    if ( *page == NULL )
> -        return X86EMUL_UNHANDLEABLE;
> -
> -    if ( p2m_is_paging(p2mt) )
> -    {
> -        put_page(*page);
> -        p2m_mem_paging_populate(curr_d, gmfn);
> -        return X86EMUL_RETRY;
> -    }
> -
> -    if ( p2m_is_shared(p2mt) )
> +    switch ( get_paged_gfn(current->domain, _gfn(gmfn), false, NULL, page) )
>      {
> -        put_page(*page);
> +    case -EAGAIN:
>          return X86EMUL_RETRY;
> -    }
> -
> -    /* This code should not be reached if the gmfn is not RAM */
> -    if ( p2m_is_mmio(p2mt) )
> -    {
> -        domain_crash(curr_d);
> -
> -        put_page(*page);
> +    case -EINVAL:
>          return X86EMUL_UNHANDLEABLE;
> +    default:
> +        ASSERT_UNREACHABLE();
> +    case 0:

I think you'd better have "default:" fall through to "case -EINVAL".
Similarly elsewhere.
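
I.e. (sketch):

    switch ( get_paged_gfn(current->domain, _gfn(gmfn), false, NULL, page) )
    {
    case -EAGAIN:
        return X86EMUL_RETRY;

    default:
        ASSERT_UNREACHABLE();
        /* fall through */
    case -EINVAL:
        return X86EMUL_UNHANDLEABLE;

    case 0:
        break;
    }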

> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2557,24 +2557,12 @@ static void *_hvm_map_guest_frame(unsigned long gfn, bool_t permanent,
>                                    bool_t *writable)
>  {
>      void *map;
> -    p2m_type_t p2mt;
>      struct page_info *page;
>      struct domain *d = current->domain;
> +    p2m_type_t p2mt;

???

Jan




^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 13/14] x86: add iommu_ops to modify and flush IOMMU mappings
  2018-08-23  9:47 ` [PATCH v6 13/14] x86: add iommu_ops to modify and flush IOMMU mappings Paul Durrant
@ 2018-09-11 15:15   ` Jan Beulich
  2018-09-12  7:03   ` Jan Beulich
  1 sibling, 0 replies; 111+ messages in thread
From: Jan Beulich @ 2018-09-11 15:15 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

>>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> --- a/xen/common/iommu_op.c
> +++ b/xen/common/iommu_op.c
> @@ -123,6 +123,156 @@ static int iommu_op_enable_modification(
>      return 0;
>  }
>  
> +static int iommuop_map(struct xen_iommu_op_map *op)
> +{
> +    struct domain *d, *currd = current->domain;
> +    struct domain_iommu *iommu = dom_iommu(currd);

const (also in unmap)?

> +static int iommuop_unmap(struct xen_iommu_op_unmap *op)
> +{
> +    struct domain *d, *currd = current->domain;
> +    struct domain_iommu *iommu = dom_iommu(currd);
> +    bfn_t bfn = _bfn(op->bfn);
> +    mfn_t mfn;
> +    bool readonly;
> +    unsigned int prot;
> +    struct page_info *page;
> +    int rc;
> +
> +    if ( op->pad ||
> +         (op->flags & ~XEN_IOMMUOP_unmap_all) )
> +        return -EINVAL;
> +
> +    if ( !iommu->iommu_op_ranges )
> +        return -EOPNOTSUPP;
> +
> +    /* Per-device unmapping not yet supported */
> +    if ( !(op->flags & XEN_IOMMUOP_unmap_all) )
> +        return -EINVAL;
> +
> +    if ( !rangeset_contains_singleton(iommu->iommu_op_ranges, bfn_x(bfn)) ||

Again the question about a malicious multi-vCPU guest trying to
utilize the gap between the check here and ...

> +         iommu_lookup_page(currd, bfn, &mfn, &prot) ||
> +         !mfn_valid(mfn) )
> +        return -ENOENT;
> +
> +    readonly = !(prot & IOMMUF_writable);
> +
> +    d = rcu_lock_domain_by_any_id(op->domid);
> +    if ( !d )
> +        return -ESRCH;
> +
> +    rc = get_paged_gfn(d, _gfn(op->gfn), !(prot & IOMMUF_writable), NULL,
> +                       &page);
> +    if ( rc )
> +        goto unlock;
> +
> +    put_page(page); /* release extra reference just taken */
> +
> +    rc = -EINVAL;
> +    if ( !mfn_eq(page_to_mfn(page), mfn) )
> +        goto unlock;
> +
> +    /* release reference taken in map */
> +    if ( !readonly )
> +        put_page_type(page);
> +    put_page(page);
> +
> +    rc = rangeset_remove_singleton(iommu->iommu_op_ranges, bfn_x(bfn));

... the actual removal here.

> +    if ( rc )
> +        goto unlock;

You've already put the page ref - failure to remove needs to be fatal
to the guest, or you'd need to re-obtain refs.
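
One way of making it fatal (sketch):

    rc = rangeset_remove_singleton(iommu->iommu_op_ranges, bfn_x(bfn));
    if ( rc )
    {
        domain_crash(currd);
        goto unlock;
    }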

> +    if ( iommu_unmap_page(currd, bfn) )
> +        rc = -EIO;

Similarly here: All records of the page having a mapping are gone.

> @@ -135,6 +285,22 @@ static void iommu_op(xen_iommu_op_t *op)
>          op->status = iommu_op_enable_modification(&op->u.enable_modification);
>          break;
>  
> +    case XEN_IOMMUOP_map:
> +        this_cpu(iommu_dont_flush_iotlb) = 1;
> +        op->status = iommuop_map(&op->u.map);
> +        this_cpu(iommu_dont_flush_iotlb) = 0;
> +        break;
> +
> +    case XEN_IOMMUOP_unmap:
> +        this_cpu(iommu_dont_flush_iotlb) = 1;
> +        op->status = iommuop_unmap(&op->u.unmap);
> +        this_cpu(iommu_dont_flush_iotlb) = 0;
> +        break;

How is this flush suppression secure?

> --- a/xen/include/public/iommu_op.h
> +++ b/xen/include/public/iommu_op.h
> @@ -80,6 +80,113 @@ struct xen_iommu_op_enable_modification {
>  #define XEN_IOMMU_CAP_per_device_mappings (1u << _XEN_IOMMU_CAP_per_device_mappings)
>  };
>  
> +/*
> + * XEN_IOMMUOP_map: Map a guest page in the IOMMU.
> + */

Single line comment please (also below).

> +#define XEN_IOMMUOP_map 3
> +
> +struct xen_iommu_op_map {
> +    /* IN - The domid of the guest */
> +    domid_t domid;
> +    /*
> +     * IN - flags controlling the mapping. This should be a bitwise OR of the
> +     *      flags defined below.
> +     */
> +    uint16_t flags;
> +
> +    /*
> +     * Should the mapping be created for all initiators?
> +     *
> +     * NOTE: This flag is currently required as the implementation does not yet
> +     *       support pre-device mappings.
> +     */
> +#define _XEN_IOMMUOP_map_all 0
> +#define XEN_IOMMUOP_map_all (1 << (_XEN_IOMMUOP_map_all))

Stray extra parentheses (also further down).

> +/*
> + * XEN_IOMMUOP_flush: Flush the IOMMU TLB.
> + */
> +#define XEN_IOMMUOP_flush 5
> +
> +struct xen_iommu_op_flush {
> +    /*
> +     * IN - flags controlling flushing. This should be a bitwise OR of the
> +     *      flags defined below.
> +     */
> +    uint16_t flags;
> +
> +    /*
> +     * Should the mappings flushed for all initiators?
> +     *
> +     * NOTE: This flag is currently required as the implementation does not yet
> +     *       support pre-device mappings.
> +     */
> +#define _XEN_IOMMUOP_flush_all 0
> +#define XEN_IOMMUOP_flush_all (1 << (_XEN_IOMMUOP_flush_all))
> +
> +    uint16_t pad0;
> +    uint32_t pad1;
> +    /*
> +     * IN - Segment/Bus/Device/Function of the initiator.
> +     *
> +     * NOTE: This is ignored if XEN_IOMMUOP_flush_all is set.
> +     */
> +    uint64_t sbdf;
> +};

No means to flush a single mapping, or a sub-range of the entire
address space?
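
One conceivable extension (purely illustrative - the field layout would
need discussion):

    struct xen_iommu_op_flush {
        uint16_t flags;
        uint16_t pad;
        /* IN - Number of frames to flush (zero means flush everything). */
        uint32_t nr_frames;
        /* IN - BFN of the first frame to flush (ignored if nr_frames is 0). */
        xen_bfn_t bfn;
        /* IN - Segment/Bus/Device/Function of the initiator. */
        uint64_t sbdf;
    };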

> --- a/xen/include/xlat.lst
> +++ b/xen/include/xlat.lst
> @@ -80,7 +80,10 @@
>  !	iommu_op			iommu_op.h
>  !	iommu_op_buf			iommu_op.h
>  !	iommu_op_enable_modification	iommu_op.h
> +!	iommu_op_flush			iommu_op.h

As long as there's no bfn_t in there, this could again be ? instead of ! ?

Jan



^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 10/14] mm / iommu: split need_iommu() into has_iommu_pt() and need_iommu_pt_sync()
  2018-09-11 14:31   ` Jan Beulich
@ 2018-09-11 15:40     ` Paul Durrant
  2018-09-12  6:45       ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-11 15:40 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Suravee Suthikulpanit,
	Razvan Cojocaru, Konrad Rzeszutek Wilk, Andrew Cooper,
	Tim (Xen.org),
	George Dunlap, Julien Grall, Tamas K Lengyel, Jun Nakajima,
	xen-devel, Ian Jackson, Brian Woods

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xenproject.org] On Behalf
> Of Jan Beulich
> Sent: 11 September 2018 15:31
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Kevin Tian <kevin.tian@intel.com>; Stefano Stabellini
> <sstabellini@kernel.org>; Wei Liu <wei.liu2@citrix.com>; Jun Nakajima
> <jun.nakajima@intel.com>; Razvan Cojocaru <rcojocaru@bitdefender.com>;
> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; George Dunlap
> <George.Dunlap@citrix.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Tim
> (Xen.org) <tim@xen.org>; Julien Grall <julien.grall@arm.com>; Tamas K
> Lengyel <tamas@tklengyel.com>; Suravee Suthikulpanit
> <suravee.suthikulpanit@amd.com>; xen-devel <xen-
> devel@lists.xenproject.org>; Brian Woods <brian.woods@amd.com>
> Subject: Re: [Xen-devel] [PATCH v6 10/14] mm / iommu: split need_iommu()
> into has_iommu_pt() and need_iommu_pt_sync()
> 
> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> > --- a/xen/arch/x86/hvm/mtrr.c
> > +++ b/xen/arch/x86/hvm/mtrr.c
> > @@ -793,7 +793,8 @@ HVM_REGISTER_SAVE_RESTORE(MTRR,
> hvm_save_mtrr_msr, hvm_load_mtrr_msr,
> >
> >  void memory_type_changed(struct domain *d)
> >  {
> > -    if ( need_iommu(d) && d->vcpu && d->vcpu[0] )
> > +    if ( (need_iommu_pt_sync(d) || iommu_use_hap_pt(d)) &&
> > +         d->vcpu && d->vcpu[0] )
> >      {
> >          p2m_memory_type_changed(d);
> >          flush_all(FLUSH_CACHE);
> 
> How does need_iommu_pt_sync() come into play here? To me,
> has_iommu_pt() would seem to be a better match.

Ok.

> The question
> here solely is (iirc) whether memory types take any effect in the
> first place (or else we can skip adjusting them), which in turn
> solely depends on whether there are any pass-through devices
> in the domain. This is along the lines of ...
> 
> > @@ -841,7 +842,7 @@ int epte_get_entry_emt(struct domain *d,
> unsigned long gfn, mfn_t mfn,
> >          return MTRR_TYPE_UNCACHABLE;
> >      }
> >
> > -    if ( !need_iommu(d) && !cache_flush_permitted(d) )
> > +    if ( !has_iommu_pt(d) && !cache_flush_permitted(d) )
> >      {
> >          *ipat = 1;
> >          return MTRR_TYPE_WRBACK;
> 
> ... this.
> 
> > --- a/xen/arch/x86/x86_64/mm.c
> > +++ b/xen/arch/x86/x86_64/mm.c
> > @@ -1426,7 +1426,8 @@ int memory_add(unsigned long spfn, unsigned
> long epfn, unsigned int pxm)
> >      if ( ret )
> >          goto destroy_m2p;
> >
> > -    if ( iommu_enabled && !iommu_passthrough &&
> !need_iommu(hardware_domain) )
> > +    if ( iommu_enabled && !iommu_passthrough &&
> > +         !need_iommu_pt_sync(hardware_domain) )
> >      {
> >          for ( i = spfn; i < epfn; i++ )
> >              if ( iommu_map_page(hardware_domain, _bfn(i), _mfn(i),
> 
> I'm confused - the condition you change looks to be inverted. Wouldn't
> we better fix this?

I don't think it is inverted. I think this is to add new hotplugged memory to the 1:1 map in the case that dom0 is not in strict mode. I could be wrong. George?

> 
> And then I again wonder whether you've chosen the right predicate:
> Where would the equivalent mappings come from in the opposite case?
> 

If dom0 is in strict mode then I assume that the synchronization is handled when the calls are made to add memory into the p2m (which IIRC happens even for PV guests). My aim for this patch is to avoid any visible functional change.

> > --- a/xen/common/memory.c
> > +++ b/xen/common/memory.c
> > @@ -805,8 +805,8 @@ int xenmem_add_to_physmap(struct domain *d,
> struct xen_add_to_physmap *xatp,
> >      xatp->size -= start;
> >
> >  #ifdef CONFIG_HAS_PASSTHROUGH
> > -    if ( need_iommu(d) )
> > -        this_cpu(iommu_dont_flush_iotlb) = 1;
> > +    if ( need_iommu_pt_sync(d) || iommu_use_hap_pt(d) )
> > +       this_cpu(iommu_dont_flush_iotlb) = 1;
> >  #endif
> 
> Rather than making the conditional more complicated, perhaps
> simply drop it (and move the reset-to-false code out of ...
> 
> > @@ -828,7 +828,7 @@ int xenmem_add_to_physmap(struct domain *d,
> struct xen_add_to_physmap *xatp,
> >      }
> >
> >  #ifdef CONFIG_HAS_PASSTHROUGH
> > -    if ( need_iommu(d) )
> > +    if ( need_iommu_pt_sync(d) || iommu_use_hap_pt(d) )
> >      {
> >          int ret;
> 
> ... this if())?
> 
> Also it looks to me as if here you've got confused by the meaning
> you've assigned to need_iommu_pt_sync(): According to the
> description, it is about sync-ing of page tables. Here, however,
> we're checking whether to flush TLBs.
> 

Yes, I may be confused here but it would seem to me that flushing the IOTLB would be necessary even in the case where the page tables are shared. I'll check the logic again.

> > @@ -179,8 +179,10 @@ void __hwdom_init iommu_hwdom_init(struct
> domain *d)
> >          return;
> >
> >      register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu
> p2m table", 0);
> > -    d->need_iommu = !!iommu_dom0_strict;
> > -    if ( need_iommu(d) && !iommu_use_hap_pt(d) )
> > +
> > +    hd->status = IOMMU_STATUS_initializing;
> > +    hd->need_sync = !!iommu_dom0_strict && !iommu_use_hap_pt(d);
> 
> Pointless !! afaict.
> 

Ok, I'll drop it.
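
I.e. (sketch):

    hd->need_sync = iommu_dom0_strict && !iommu_use_hap_pt(d);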

> >  int iommu_construct(struct domain *d)
> >  {
> > -    if ( need_iommu(d) > 0 )
> > +    struct domain_iommu *hd = dom_iommu(d);
> > +
> > +    if ( hd->status == IOMMU_STATUS_initialized )
> >          return 0;
> >
> > -    if ( !iommu_use_hap_pt(d) )
> > +    if ( !iommu_use_hap_pt(d) && hd->status <
> IOMMU_STATUS_initialized )
> 
> Isn't the addition here redundant with the earlier if()?
> 

Yes, it does appear to be.
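
I'll reduce it to just (sketch):

    if ( hd->status == IOMMU_STATUS_initialized )
        return 0;

    if ( !iommu_use_hap_pt(d) )
        ...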

> > @@ -889,9 +886,11 @@ void watchdog_domain_destroy(struct domain
> *d);
> >  #define is_pinned_vcpu(v) ((v)->domain->is_pinned || \
> >                             cpumask_weight((v)->cpu_hard_affinity) == 1)
> >  #ifdef CONFIG_HAS_PASSTHROUGH
> > -#define need_iommu(d)    ((d)->need_iommu)
> > +#define has_iommu_pt(d) (dom_iommu(d)->status !=
> IOMMU_STATUS_disabled)
> > +#define need_iommu_pt_sync(d) (dom_iommu(d)->need_sync)
> >  #else
> > -#define need_iommu(d)    (0)
> > +#define has_iommu_pt(d) (0)
> > +#define need_iommu_pt_sync(d) (0)
> 
> Please use false here (and no need for the parentheses).
> 

Ok.
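
I.e. (sketch):

    #define has_iommu_pt(d) false
    #define need_iommu_pt_sync(d) false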

  Paul

> Jan
> 
> 

^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 11/14] x86: add iommu_op to enable modification of IOMMU mappings
  2018-09-11 14:48   ` Jan Beulich
@ 2018-09-11 15:52     ` Paul Durrant
  2018-09-12  6:53       ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-11 15:52 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, xen-devel, Ian Jackson

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 11 September 2018 15:48
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Wei Liu <wei.liu2@citrix.com>; George
> Dunlap <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>;
> Stefano Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> devel@lists.xenproject.org>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Tim (Xen.org) <tim@xen.org>
> Subject: Re: [PATCH v6 11/14] x86: add iommu_op to enable modification of
> IOMMU mappings
> 
> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> > This patch adds an iommu_op which checks whether it is possible or
> > safe for a domain to modify its own IOMMU mappings and, if so, creates
> > a rangeset to track modifications.
> 
> Now this can surely grow pretty big?
> 

It could but I don't see any good way round it. The alternative is to shoot all RAM mappings out of the IOMMU when enabling PV IOMMU and determine the validity of an unmap op only on the basis of whether the mapping exists in the IOMMU. Is that option preferable?

> > --- a/xen/common/iommu_op.c
> > +++ b/xen/common/iommu_op.c
> > @@ -78,6 +78,51 @@ static int iommu_op_query_reserved(struct
> xen_iommu_op_query_reserved *op)
> >      return 0;
> >  }
> >
> > +static int iommu_op_enable_modification(
> > +    struct xen_iommu_op_enable_modification *op)
> > +{
> > +    struct domain *currd = current->domain;
> > +    struct domain_iommu *iommu = dom_iommu(currd);
> > +    const struct iommu_ops *ops = iommu->platform_ops;
> > +
> > +    if ( op->cap || op->pad )
> > +        return -EINVAL;
> > +
> > +    /*
> > +     * XEN_IOMMU_CAP_per_device_mappings is not supported yet so we
> can
> > +     * leave op->cap alone.
> > +     */
> > +
> > +    /* Has modification already been enabled? */
> > +    if ( iommu->iommu_op_ranges )
> > +        return 0;
> 
> I don't recall there being any synchronization around the check
> here until ...
> 
> > +    /*
> > +     * The IOMMU mappings cannot be modified if:
> > +     * - the IOMMU is not enabled or,
> > +     * - the current domain is dom0 and translation is disabled or,
> > +     * - HAP is enabled and the IOMMU shares the mappings.
> > +     */
> > +    if ( !iommu_enabled ||
> > +         (is_hardware_domain(currd) && iommu_passthrough) ||
> > +         iommu_use_hap_pt(currd) )
> > +        return -EACCES;
> > +
> > +    /*
> > +     * The IOMMU implementation must provide the lookup method if
> > +     * modification of the mappings is to be supported.
> > +     */
> > +    if ( !ops->lookup_page )
> > +        return -EOPNOTSUPP;
> > +
> > +    iommu->iommu_op_ranges = rangeset_new(currd, NULL, 0);
> 
> ... the assignment here. In which case this is a (multi-vCPU) guest
> controllable leak.

True, yes. This would need locking.
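
Something along these lines, perhaps (a sketch only - 'lock' stands for
whatever suitable lock gets added to struct domain_iommu):

    int rc = 0;

    spin_lock(&iommu->lock);

    if ( !iommu->iommu_op_ranges )
    {
        iommu->iommu_op_ranges = rangeset_new(currd, NULL, 0);
        if ( !iommu->iommu_op_ranges )
            rc = -ENOMEM;
    }

    spin_unlock(&iommu->lock);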

> 
> > --- a/xen/drivers/passthrough/iommu.c
> > +++ b/xen/drivers/passthrough/iommu.c
> > @@ -26,7 +26,6 @@ static void iommu_dump_p2m_table(unsigned char
> key);
> >
> >  unsigned int __read_mostly iommu_dev_iotlb_timeout = 1000;
> >  integer_param("iommu_dev_iotlb_timeout", iommu_dev_iotlb_timeout);
> > -
> >  /*
> >   * The 'iommu' parameter enables the IOMMU.  Optional comma
> separated
> >   * value may contain:
> 
> Stray change?

Yes.

> 
> > --- a/xen/drivers/passthrough/pci.c
> > +++ b/xen/drivers/passthrough/pci.c
> > @@ -1460,7 +1460,7 @@ static int assign_device(struct domain *d, u16 seg,
> u8 bus, u8 devfn, u32 flag)
> >      }
> >
> >   done:
> > -    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
> > +    if ( !has_arch_pdevs(d) && has_iommu_pt(d) && !hd-
> >iommu_op_ranges )
> >          iommu_teardown(d);
> >      pcidevs_unlock();
> >
> > @@ -1510,7 +1510,7 @@ int deassign_device(struct domain *d, u16 seg, u8
> bus, u8 devfn)
> >
> >      pdev->fault.count = 0;
> >
> > -    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
> > +    if ( !has_arch_pdevs(d) && has_iommu_pt(d) && !hd-
> >iommu_op_ranges )
> >          iommu_teardown(d);
> 
> These additions are pretty un-obvious, and hence at least need
> comments. But I'm also unclear about the intended behavior: For
> a guest not meaning to play with its mappings, why would you
> keep the tables around (and the memory uselessly allocated)?
> 

No. If the guest has not enabled PV-IOMMU then hd->iommu_op_ranges would be NULL and so the tables will get torn down; otherwise the guest is in control and Xen should leave the tables alone.

> > --- a/xen/include/public/iommu_op.h
> > +++ b/xen/include/public/iommu_op.h
> > @@ -61,6 +61,25 @@ struct xen_iommu_op_query_reserved {
> >      XEN_GUEST_HANDLE(xen_iommu_reserved_range_t) ranges;
> >  };
> >
> > +/*
> > + * XEN_IOMMUOP_enable_modification: Enable operations that modify
> IOMMU
> > + *                                  mappings.
> > + */
> > +#define XEN_IOMMUOP_enable_modification 2
> > +
> > +struct xen_iommu_op_enable_modification {
> > +    /*
> > +     * OUT - On successful return this is set to the bitwise OR of capabilities
> > +     *       defined below. On entry this must be set to zero.
> > +     */
> > +    unsigned int cap;
> > +    unsigned int pad;
> 
> uint32_t
> 

Ok.

> > --- a/xen/include/xlat.lst
> > +++ b/xen/include/xlat.lst
> > @@ -79,6 +79,7 @@
> >  ?	vcpu_hvm_x86_64			hvm/hvm_vcpu.h
> >  !	iommu_op			iommu_op.h
> >  !	iommu_op_buf			iommu_op.h
> > +!	iommu_op_enable_modification	iommu_op.h
> 
> The structure above looks to be 32-/64-bit agnostic. Why is this !
> instead of ? ?

IIRC I investigated this. I think the reason is that the XLAT macro won't do the necessary copy-back of the struct in the compat path and so it messed up getting the status field out. Since you wanted me to adjust the compat code to avoid unnecessary copying (in an earlier patch) I can probably change these to ?.

  Paul

> 



> Jan



^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 10/14] mm / iommu: split need_iommu() into has_iommu_pt() and need_iommu_pt_sync()
  2018-09-11 15:40     ` Paul Durrant
@ 2018-09-12  6:45       ` Jan Beulich
  2018-09-12  8:07         ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-12  6:45 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
	Razvan Cojocaru, Konrad Rzeszutek Wilk, Andrew Cooper,
	Tim Deegan, george.dunlap, Julien Grall, Tamas K Lengyel,
	Suravee Suthikulpanit, xen-devel, Ian Jackson, Brian Woods

>>> On 11.09.18 at 17:40, <Paul.Durrant@citrix.com> wrote:
>> From: Xen-devel [mailto:xen-devel-bounces@lists.xenproject.org] On Behalf
>> Of Jan Beulich
>> Sent: 11 September 2018 15:31
>> 
>> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
>> > --- a/xen/arch/x86/x86_64/mm.c
>> > +++ b/xen/arch/x86/x86_64/mm.c
>> > @@ -1426,7 +1426,8 @@ int memory_add(unsigned long spfn, unsigned
>> long epfn, unsigned int pxm)
>> >      if ( ret )
>> >          goto destroy_m2p;
>> >
>> > -    if ( iommu_enabled && !iommu_passthrough &&
>> !need_iommu(hardware_domain) )
>> > +    if ( iommu_enabled && !iommu_passthrough &&
>> > +         !need_iommu_pt_sync(hardware_domain) )
>> >      {
>> >          for ( i = spfn; i < epfn; i++ )
>> >              if ( iommu_map_page(hardware_domain, _bfn(i), _mfn(i),
>> 
>> I'm confused - the condition you change looks to be inverted. Wouldn't
>> we better fix this?
> 
> I don't think it is inverted. I think this is to add new hotplugged memory 
> to the 1:1 map in the case that dom0 is not in strict mode. I could be wrong. 

Oh, I think you're right. It is just rather confusing to see an
iommu_map_page() call qualified by !need_iommu(). But that's
as confusing (to me) as the setup logic for Dom0's page tables.

>> And then I again wonder whether you've chosen the right predicate:
>> Where would the equivalent mappings come from in the opposite case?
> 
> If dom0 is in strict mode then I assume that the synchronization is handled 
> when the calls are made to add memory into the p2m (which IIRC happens even 
> for PV guests).

Right you are.

> My aim for this patch is to avoid any visible functional  change.

Sure - I didn't mean anything here (if at all) to be done in this patch
(or perhaps even series), I've merely noticed this as an apparent
oddity (which if I were right would perhaps better have been fixed
before your transformations).

>> > --- a/xen/common/memory.c
>> > +++ b/xen/common/memory.c
>> > @@ -805,8 +805,8 @@ int xenmem_add_to_physmap(struct domain *d,
>> struct xen_add_to_physmap *xatp,
>> >      xatp->size -= start;
>> >
>> >  #ifdef CONFIG_HAS_PASSTHROUGH
>> > -    if ( need_iommu(d) )
>> > -        this_cpu(iommu_dont_flush_iotlb) = 1;
>> > +    if ( need_iommu_pt_sync(d) || iommu_use_hap_pt(d) )
>> > +       this_cpu(iommu_dont_flush_iotlb) = 1;
>> >  #endif
>> 
>> Rather than making the conditional more complicated, perhaps
>> simply drop it (and move the reset-to-false code out of ...
>> 
>> > @@ -828,7 +828,7 @@ int xenmem_add_to_physmap(struct domain *d,
>> struct xen_add_to_physmap *xatp,
>> >      }
>> >
>> >  #ifdef CONFIG_HAS_PASSTHROUGH
>> > -    if ( need_iommu(d) )
>> > +    if ( need_iommu_pt_sync(d) || iommu_use_hap_pt(d) )
>> >      {
>> >          int ret;
>> 
>> ... this if())?
>> 
>> Also it looks to me as if here you've got confused by the meaning
>> you've assigned to need_iommu_pt_sync(): According to the
>> description, it is about sync-ing of page tables. Here, however,
>> we're checking whether to flush TLBs.
> 
> Yes, I may be confused here but it would seem to me that flushing the IOTLB 
> would be necessary even in the case where the page tables are shared. I'll 
> check the logic again.

Flushing is necessary always, and my comment didn't go in that
direction. What I was trying to point out is that the value of
iommu_dont_flush_iotlb doesn't matter when no flushing
happens anyway. I.e. setting it to true unconditionally should
not have any bad effect (but the non-strict-mode-Dom0 case
may need double checking, albeit even in that case suppressing
individual page flushing would be desirable, in which case - if
needed - the second if() might need adjustment, independent
of the change you're doing here).
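
I.e. the first hunk would simply become (sketch):

    #ifdef CONFIG_HAS_PASSTHROUGH
        this_cpu(iommu_dont_flush_iotlb) = 1;
    #endif

with the reset to zero then done unconditionally after the second if().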

Jan




^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 11/14] x86: add iommu_op to enable modification of IOMMU mappings
  2018-09-11 15:52     ` Paul Durrant
@ 2018-09-12  6:53       ` Jan Beulich
  2018-09-12  8:04         ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-12  6:53 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim Deegan, george.dunlap, Julien Grall,
	xen-devel, Ian Jackson

>>> On 11.09.18 at 17:52, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 11 September 2018 15:48
>> 
>> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
>> > This patch adds an iommu_op which checks whether it is possible or
>> > safe for a domain to modify its own IOMMU mappings and, if so, creates
>> > a rangeset to track modifications.
>> 
>> Now this can surely grow pretty big?
> 
> It could but I don't see any good way round it. The alternative is to shoot 
> all RAM mappings out of the IOMMU when enabling PV IOMMU and determine the 
> validity of an unmap op only on the basis of whether the mapping exists in 
> the IOMMU. Is that option preferable?

I'm not sure - it very much depends on the downsides. An obvious
one is that dropping all RAM mappings is going to take quite some
time for a big domain.

>> > --- a/xen/drivers/passthrough/pci.c
>> > +++ b/xen/drivers/passthrough/pci.c
>> > @@ -1460,7 +1460,7 @@ static int assign_device(struct domain *d, u16 seg,
>> u8 bus, u8 devfn, u32 flag)
>> >      }
>> >
>> >   done:
>> > -    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
>> > +    if ( !has_arch_pdevs(d) && has_iommu_pt(d) && !hd-
>> >iommu_op_ranges )
>> >          iommu_teardown(d);
>> >      pcidevs_unlock();
>> >
>> > @@ -1510,7 +1510,7 @@ int deassign_device(struct domain *d, u16 seg, u8
>> bus, u8 devfn)
>> >
>> >      pdev->fault.count = 0;
>> >
>> > -    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
>> > +    if ( !has_arch_pdevs(d) && has_iommu_pt(d) && !hd-
>> >iommu_op_ranges )
>> >          iommu_teardown(d);
>> 
>> These additions are pretty un-obvious, and hence at least need
>> comments. But I'm also unclear about the intended behavior: For
>> a guest not meaning to play with its mappings, why would you
>> keep the tables around (and the memory uselessly allocated)?
>> 
> 
> No. If the guest has not enabled PV-IOMMU then hd->iommu_op_ranges would be 
> NULL and so the tables will get torn down; otherwise the guest is in control 
> and Xen should leave the tables alone.

Oh, I've misread hd->iommu_op_ranges (taking it for a check for the
required op that you check for in iommu_op_enable_modification()).

>> > --- a/xen/include/xlat.lst
>> > +++ b/xen/include/xlat.lst
>> > @@ -79,6 +79,7 @@
>> >  ?	vcpu_hvm_x86_64			hvm/hvm_vcpu.h
>> >  !	iommu_op			iommu_op.h
>> >  !	iommu_op_buf			iommu_op.h
>> > +!	iommu_op_enable_modification	iommu_op.h
>> 
>> The structure above looks to be 32-/64-bit agnostic. Why is this !
>> instead of ? ?
> 
> IIRC I investigated this. I think the reason is that the XLAT macro won't do 
> the necessary copy-back of the struct in the compat path and so it messed up 
> getting the status field out.

I'd be surprised, as I think we use such a mix of ! and ? elsewhere
too. I can't exclude that that's only in no-copy-back scenarios, though.

> Since you wanted me to adjust the compat code 
> to avoid unnecessary copying (in an earlier patch) I can probably change 
> these to ?.

Perhaps I'm missing the connection, so I don't know for sure what
to reply here - most likely the answer is "yes".

Jan




^ permalink raw reply	[flat|nested] 111+ messages in thread

* Re: [PATCH v6 13/14] x86: add iommu_ops to modify and flush IOMMU mappings
  2018-08-23  9:47 ` [PATCH v6 13/14] x86: add iommu_ops to modify and flush IOMMU mappings Paul Durrant
  2018-09-11 15:15   ` Jan Beulich
@ 2018-09-12  7:03   ` Jan Beulich
  2018-09-12  8:02     ` Paul Durrant
  1 sibling, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-12  7:03 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

>>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> +static int iommuop_map(struct xen_iommu_op_map *op)
> +{
> +    struct domain *d, *currd = current->domain;
> +    struct domain_iommu *iommu = dom_iommu(currd);
> +    bool readonly = op->flags & XEN_IOMMUOP_map_readonly;
> +    bfn_t bfn = _bfn(op->bfn);
> +    struct page_info *page;
> +    unsigned int prot;
> +    int rc, ignore;
> +
> +    if ( op->pad ||
> +         (op->flags & ~(XEN_IOMMUOP_map_all |
> +                        XEN_IOMMUOP_map_readonly)) )
> +        return -EINVAL;
> +
> +    if ( !iommu->iommu_op_ranges )
> +        return -EOPNOTSUPP;
> +
> +    /* Per-device mapping not yet supported */
> +    if ( !(op->flags & XEN_IOMMUOP_map_all) )
> +        return -EINVAL;
> +
> +    /* Check whether the specified BFN falls in a reserved region */
> +    if ( rangeset_contains_singleton(iommu->reserved_ranges, bfn_x(bfn)) )
> +        return -EINVAL;
> +
> +    d = rcu_lock_domain_by_any_id(op->domid);
> +    if ( !d )
> +        return -ESRCH;
> +
> +    rc = get_paged_gfn(d, _gfn(op->gfn), readonly, NULL, &page);
> +    if ( rc )
> +        goto unlock;
> +
> +    rc = -EINVAL;
> +    if ( !readonly && !get_page_type(page, PGT_writable_page) )
> +    {
> +        put_page(page);
> +        goto unlock;
> +    }
> +
> +    prot = IOMMUF_readable;
> +    if ( !readonly )
> +        prot |= IOMMUF_writable;
> +
> +    rc = -EIO;
> +    if ( iommu_map_page(currd, bfn, page_to_mfn(page), prot) )
> +        goto release;
> +
> +    rc = rangeset_add_singleton(iommu->iommu_op_ranges, bfn_x(bfn));
> +    if ( rc )
> +        goto unmap;

When a mapping is requested for the same BFN that a prior mapping
was already established for, the page refs of that prior mapping get
leaked here. I don't think you want to require an intermediate unmap,
so checking the rangeset first is not an option. Hence I think you
need to look up the translation anyway, which may mean that the
rangeset's usefulness is quite limited (relevant with the additional
context of my question regarding it perhaps requiring a pretty much
unbounded amount of memory).

In order to avoid shooting down all pre-existing RAM mappings - is
there no way the page table entries could be marked to identify
their origin?

I also have another more general concern: Allowing the guest to
manipulate its IOMMU page tables means that it can deliberately
shatter large pages, growing the overall memory footprint of the
domain. I'm hesitant to say this, but I'm afraid that resource
tracking of such "behind the scenes" allocations might be a
necessary prereq for the PV IOMMU work.

Jan




* Re: [PATCH v6 13/14] x86: add iommu_ops to modify and flush IOMMU mappings
  2018-09-12  7:03   ` Jan Beulich
@ 2018-09-12  8:02     ` Paul Durrant
  2018-09-12  8:27       ` Jan Beulich
  2018-09-13  6:41       ` Tian, Kevin
  0 siblings, 2 replies; 111+ messages in thread
From: Paul Durrant @ 2018-09-12  8:02 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, xen-devel, Ian Jackson

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 12 September 2018 08:04
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Wei Liu <wei.liu2@citrix.com>; George
> Dunlap <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>;
> Stefano Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> devel@lists.xenproject.org>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Tim (Xen.org) <tim@xen.org>
> Subject: Re: [PATCH v6 13/14] x86: add iommu_ops to modify and flush
> IOMMU mappings
> 
> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> > +static int iommuop_map(struct xen_iommu_op_map *op)
> > +{
> > +    struct domain *d, *currd = current->domain;
> > +    struct domain_iommu *iommu = dom_iommu(currd);
> > +    bool readonly = op->flags & XEN_IOMMUOP_map_readonly;
> > +    bfn_t bfn = _bfn(op->bfn);
> > +    struct page_info *page;
> > +    unsigned int prot;
> > +    int rc, ignore;
> > +
> > +    if ( op->pad ||
> > +         (op->flags & ~(XEN_IOMMUOP_map_all |
> > +                        XEN_IOMMUOP_map_readonly)) )
> > +        return -EINVAL;
> > +
> > +    if ( !iommu->iommu_op_ranges )
> > +        return -EOPNOTSUPP;
> > +
> > +    /* Per-device mapping not yet supported */
> > +    if ( !(op->flags & XEN_IOMMUOP_map_all) )
> > +        return -EINVAL;
> > +
> > +    /* Check whether the specified BFN falls in a reserved region */
> > +    if ( rangeset_contains_singleton(iommu->reserved_ranges,
> bfn_x(bfn)) )
> > +        return -EINVAL;
> > +
> > +    d = rcu_lock_domain_by_any_id(op->domid);
> > +    if ( !d )
> > +        return -ESRCH;
> > +
> > +    rc = get_paged_gfn(d, _gfn(op->gfn), readonly, NULL, &page);
> > +    if ( rc )
> > +        goto unlock;
> > +
> > +    rc = -EINVAL;
> > +    if ( !readonly && !get_page_type(page, PGT_writable_page) )
> > +    {
> > +        put_page(page);
> > +        goto unlock;
> > +    }
> > +
> > +    prot = IOMMUF_readable;
> > +    if ( !readonly )
> > +        prot |= IOMMUF_writable;
> > +
> > +    rc = -EIO;
> > +    if ( iommu_map_page(currd, bfn, page_to_mfn(page), prot) )
> > +        goto release;
> > +
> > +    rc = rangeset_add_singleton(iommu->iommu_op_ranges, bfn_x(bfn));
> > +    if ( rc )
> > +        goto unmap;
> 
> When a mapping is requested for the same BFN that a prior mapping
> was already established for, the page refs of that prior mapping get
> leaked here. I don't think you want to require an intermediate unmap,
> so checking the rangeset first is not an option. Hence I think you
> need to look up the translation anyway, which may mean that the
> rangeset's usefulness is quite limited (relevant with the additional
> context of my question regarding it perhaps requiring a pretty much
> unbounded amount of memory).
> 

Yes, that's a good point. I could do a lookup to check whether the B[D]FN is already there though. Agreed that the memory is unbounded unless the number of ranges is limited, for which there is already a facility. It is not ideal though.
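
Concretely, the map path could use the new lookup method before taking any
references, e.g. (sketch only, reusing names from this series; the -EEXIST
choice is illustrative):

    mfn_t mfn;
    unsigned int flags;

    /* Reject re-mapping of an already-mapped B[D]FN so that the page
       references taken later cannot be leaked by a second map op. */
    if ( !iommu_lookup_page(currd, bfn, &mfn, &flags) )
        return -EEXIST;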

> In order to avoid shooting down all pre-existing RAM mappings - is
> there no way the page table entries could be marked to identify
> their origin?
> 

I don't know whether that is possible; I'll have to find specs for Intel and AMD IOMMUs and see if they have PTE bits available for such a use.

> I also have another more general concern: Allowing the guest to
> manipulate its IOMMU page tables means that it can deliberately
> shatter large pages, growing the overall memory footprint of the
> domain. I'm hesitant to say this, but I'm afraid that resource
> tracking of such "behind the scenes" allocations might be a
> necessary prereq for the PV IOMMU work.
> 

Remember that PV-IOMMU is only available for dom0 as it stands (and that is the only use-case that XenServer currently has) so I think that, whilst the concern is valid, there is no real danger in putting the code in without such tracking. Such work can be deferred until PV-IOMMU is made available to de-privileged guests... if that facility is needed.

  Paul

> Jan
> 



* Re: [PATCH v6 11/14] x86: add iommu_op to enable modification of IOMMU mappings
  2018-09-12  6:53       ` Jan Beulich
@ 2018-09-12  8:04         ` Paul Durrant
  0 siblings, 0 replies; 111+ messages in thread
From: Paul Durrant @ 2018-09-12  8:04 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, xen-devel, Ian Jackson

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 12 September 2018 07:54
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; George Dunlap
> <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Wei Liu
> <wei.liu2@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>; xen-
> devel <xen-devel@lists.xenproject.org>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Tim (Xen.org) <tim@xen.org>
> Subject: RE: [PATCH v6 11/14] x86: add iommu_op to enable modification of
> IOMMU mappings
> 
> >>> On 11.09.18 at 17:52, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 11 September 2018 15:48
> >>
> >> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> >> > This patch adds an iommu_op which checks whether it is possible or
> >> > safe for a domain to modify its own IOMMU mappings and, if so, creates
> >> > a rangeset to track modifications.
> >>
> >> Now this can surely grow pretty big?
> >
> > It could but I don't see any good way round it. The alternative is to shoot
> > all RAM mappings out of the IOMMU when enabling PV IOMMU and
> determine the
> > validity of an unmap op only on the basis of whether the mapping exists in
> > the IOMMU. Is that option preferable?
> 
> I'm not sure - it very much depends on the downsides. An obvious
> one is that dropping all RAM mappings is going to take quite some
> time for a big domain.
> 
> >> > --- a/xen/drivers/passthrough/pci.c
> >> > +++ b/xen/drivers/passthrough/pci.c
> >> > @@ -1460,7 +1460,7 @@ static int assign_device(struct domain *d, u16
> seg,
> >> u8 bus, u8 devfn, u32 flag)
> >> >      }
> >> >
> >> >   done:
> >> > -    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
> >> > +    if ( !has_arch_pdevs(d) && has_iommu_pt(d) && !hd-
> >> >iommu_op_ranges )
> >> >          iommu_teardown(d);
> >> >      pcidevs_unlock();
> >> >
> >> > @@ -1510,7 +1510,7 @@ int deassign_device(struct domain *d, u16 seg,
> u8
> >> bus, u8 devfn)
> >> >
> >> >      pdev->fault.count = 0;
> >> >
> >> > -    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
> >> > +    if ( !has_arch_pdevs(d) && has_iommu_pt(d) && !hd-
> >> >iommu_op_ranges )
> >> >          iommu_teardown(d);
> >>
> >> These additions are pretty un-obvious, and hence at least need
> >> comments. But I'm also unclear about the intended behavior: For
> >> a guest not meaning to play with its mappings, why would you
> >> keep the tables around (and the memory uselessly allocated)?
> >>
> >
> > No. If the guest has not enabled PV-IOMMU then hd->iommu_op_ranges
> would be
> > NULL and so the tables will get torn down; otherwise the guest is in control
> > and so Xen should leave the tables alone.
> 
> Oh, I've misread hd->iommu_op_ranges (taking it for a check for the
> required op that you check for in iommu_op_enable_modification()).
> 
> >> > --- a/xen/include/xlat.lst
> >> > +++ b/xen/include/xlat.lst
> >> > @@ -79,6 +79,7 @@
> >> >  ?	vcpu_hvm_x86_64			hvm/hvm_vcpu.h
> >> >  !	iommu_op			iommu_op.h
> >> >  !	iommu_op_buf			iommu_op.h
> >> > +!	iommu_op_enable_modification	iommu_op.h
> >>
> >> The structure above looks to be 32-/64-bit agnostic. Why is this !
> >> instead of ? ?
> >
> > IIRC I investigated this. I think the reason is that the XLAT macro won't do
> > the necessary copy-back of the struct in the compat path and so it messed
> up
> > getting the status field out.
> 
> I'd be surprised, as I think we use such a mix of ! and ? elsewhere
> too. I can't exclude that this is in no-copy-back scenarios only, though.
> 
> > Since you wanted me to adjust the compat code
> > to avoid unnecessary copying (in an earlier patch) I can probably change
> > these to ?.
> 
> Perhaps I'm missing the connection, so I don't know for sure what
> to reply here - most likely the answer is "yes".
> 

It wasn't a question... perhaps I should have quoted that '?' :-) My point was that I'm going to have to re-investigate the copy-back anyway so I'll try to go for check rather than xlat where I can.

  Paul

> Jan
> 



* Re: [PATCH v6 10/14] mm / iommu: split need_iommu() into has_iommu_pt() and need_iommu_pt_sync()
  2018-09-12  6:45       ` Jan Beulich
@ 2018-09-12  8:07         ` Paul Durrant
  0 siblings, 0 replies; 111+ messages in thread
From: Paul Durrant @ 2018-09-12  8:07 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
	Razvan Cojocaru, Konrad Rzeszutek Wilk, Andrew Cooper,
	Tim (Xen.org),
	George Dunlap, Julien Grall, Tamas K Lengyel,
	Suravee Suthikulpanit, xen-devel, Ian Jackson, Brian Woods

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 12 September 2018 07:45
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Brian Woods <brian.woods@amd.com>; Suravee Suthikulpanit
> <suravee.suthikulpanit@amd.com>; Julien Grall <julien.grall@arm.com>;
> Razvan Cojocaru <rcojocaru@bitdefender.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; George Dunlap
> <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Wei Liu
> <wei.liu2@citrix.com>; Jun Nakajima <jun.nakajima@intel.com>; Kevin Tian
> <kevin.tian@intel.com>; Stefano Stabellini <sstabellini@kernel.org>; xen-
> devel <xen-devel@lists.xenproject.org>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Tamas K Lengyel <tamas@tklengyel.com>; Tim
> (Xen.org) <tim@xen.org>
> Subject: RE: [Xen-devel] [PATCH v6 10/14] mm / iommu: split need_iommu()
> into has_iommu_pt() and need_iommu_pt_sync()
> 
> >>> On 11.09.18 at 17:40, <Paul.Durrant@citrix.com> wrote:
> >> From: Xen-devel [mailto:xen-devel-bounces@lists.xenproject.org] On
> Behalf
> >> Of Jan Beulich
> >> Sent: 11 September 2018 15:31
> >>
> >> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> >> > --- a/xen/arch/x86/x86_64/mm.c
> >> > +++ b/xen/arch/x86/x86_64/mm.c
> >> > @@ -1426,7 +1426,8 @@ int memory_add(unsigned long spfn, unsigned
> >> long epfn, unsigned int pxm)
> >> >      if ( ret )
> >> >          goto destroy_m2p;
> >> >
> >> > -    if ( iommu_enabled && !iommu_passthrough &&
> >> !need_iommu(hardware_domain) )
> >> > +    if ( iommu_enabled && !iommu_passthrough &&
> >> > +         !need_iommu_pt_sync(hardware_domain) )
> >> >      {
> >> >          for ( i = spfn; i < epfn; i++ )
> >> >              if ( iommu_map_page(hardware_domain, _bfn(i), _mfn(i),
> >>
> >> I'm confused - the condition you change looks to be inverted. Wouldn't
> >> we better fix this?
> >
> > I don't think it is inverted. I think this is to add new hotplugged memory
> > to the 1:1 map in the case that dom0 is not in strict mode. I could be wrong.
> 
> Oh, I think you're right. It is just rather confusing to see an
> iommu_map_page() call qualified by !need_iommu(). But that's
> as confusing (to me) as the setup logic for Dom0's page tables.
> 

I think it's generally confusing. I'll stick a comment in to explain.
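
For reference, the comment I have in mind is along these lines (sketch):

    /*
     * If dom0 is not in strict mode then its IOMMU mappings are not
     * kept in sync by the P2M update paths, so newly hotplugged
     * memory must be added to the 1:1 IOMMU map here.
     */
    if ( iommu_enabled && !iommu_passthrough &&
         !need_iommu_pt_sync(hardware_domain) )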

> >> And then I again wonder whether you've chosen the right predicate:
> >> Where would the equivalent mappings come from in the opposite case?
> >
> > If dom0 is in strict mode then I assume that the synchronization is handled
> > when the calls are made to add memory into the p2m (which IIRC happens
> even
> > for PV guests).
> 
> Right you are.
> 
> > My aim for this patch is to avoid any visible functional  change.
> 
> Sure - I didn't mean anything here (if at all) to be done in this patch
> (or perhaps even series), I've merely noticed this as an apparent
> oddity (which if I were right would perhaps better have been fixed
> before your transformations).
> 
> >> > --- a/xen/common/memory.c
> >> > +++ b/xen/common/memory.c
> >> > @@ -805,8 +805,8 @@ int xenmem_add_to_physmap(struct domain
> *d,
> >> struct xen_add_to_physmap *xatp,
> >> >      xatp->size -= start;
> >> >
> >> >  #ifdef CONFIG_HAS_PASSTHROUGH
> >> > -    if ( need_iommu(d) )
> >> > -        this_cpu(iommu_dont_flush_iotlb) = 1;
> >> > +    if ( need_iommu_pt_sync(d) || iommu_use_hap_pt(d) )
> >> > +       this_cpu(iommu_dont_flush_iotlb) = 1;
> >> >  #endif
> >>
> >> Rather than making the conditional more complicated, perhaps
> >> simply drop it (and move the reset-to-false code out of ...
> >>
> >> > @@ -828,7 +828,7 @@ int xenmem_add_to_physmap(struct domain
> *d,
> >> struct xen_add_to_physmap *xatp,
> >> >      }
> >> >
> >> >  #ifdef CONFIG_HAS_PASSTHROUGH
> >> > -    if ( need_iommu(d) )
> >> > +    if ( need_iommu_pt_sync(d) || iommu_use_hap_pt(d) )
> >> >      {
> >> >          int ret;
> >>
> >> ... this if())?
> >>
> >> Also it looks to me as if here you've got confused by the meaning
> >> you've assigned to need_iommu_pt_sync(): According to the
> >> description, it is about sync-ing of page tables. Here, however,
> >> we're checking whether to flush TLBs.
> >
> > Yes, I may be confused here but it would seem to me that flushing the
> IOTLB
> > would be necessary even in the case where the page tables are shared. I'll
> > check the logic again.
> 
> Flushing is necessary always, and my comment didn't go in that
> direction. What I was trying to point out is that the value of
> iommu_dont_flush_iotlb doesn't matter when no flushing
> happens anyway. I.e. setting it to true unconditionally should
> not have any bad effect (but the non-strict-mode-Dom0 case
> may need double checking, albeit even in that case suppressing
> individual page flushing would be desirable, in which case - if
> needed - the second if() might need adjustment, independent
> of the change you're doing here).
> 

Ok. I'll see if this needs correction and put a pre-requisite patch in if need be.
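
If it does pan out, the simplification would be along these lines (sketch of
the suggestion, untested):

    #ifdef CONFIG_HAS_PASSTHROUGH
        /*
         * Suppress per-page IOTLB flushing unconditionally; the value
         * is harmless when no flushing would happen anyway, and a
         * single flush is issued once the whole range has been
         * processed. (The non-strict-mode dom0 case still needs
         * double-checking.)
         */
        this_cpu(iommu_dont_flush_iotlb) = 1;
    #endif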

  Paul

> Jan
> 



* Re: [PATCH v6 13/14] x86: add iommu_ops to modify and flush IOMMU mappings
  2018-09-12  8:02     ` Paul Durrant
@ 2018-09-12  8:27       ` Jan Beulich
  2018-09-13  6:41       ` Tian, Kevin
  1 sibling, 0 replies; 111+ messages in thread
From: Jan Beulich @ 2018-09-12  8:27 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim Deegan, george.dunlap, Julien Grall,
	xen-devel, Ian Jackson

>>> On 12.09.18 at 10:02, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 12 September 2018 08:04
>> 
>> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
>> > +static int iommuop_map(struct xen_iommu_op_map *op)
>> > +{
>> > +    struct domain *d, *currd = current->domain;
>> > +    struct domain_iommu *iommu = dom_iommu(currd);
>> > +    bool readonly = op->flags & XEN_IOMMUOP_map_readonly;
>> > +    bfn_t bfn = _bfn(op->bfn);
>> > +    struct page_info *page;
>> > +    unsigned int prot;
>> > +    int rc, ignore;
>> > +
>> > +    if ( op->pad ||
>> > +         (op->flags & ~(XEN_IOMMUOP_map_all |
>> > +                        XEN_IOMMUOP_map_readonly)) )
>> > +        return -EINVAL;
>> > +
>> > +    if ( !iommu->iommu_op_ranges )
>> > +        return -EOPNOTSUPP;
>> > +
>> > +    /* Per-device mapping not yet supported */
>> > +    if ( !(op->flags & XEN_IOMMUOP_map_all) )
>> > +        return -EINVAL;
>> > +
>> > +    /* Check whether the specified BFN falls in a reserved region */
>> > +    if ( rangeset_contains_singleton(iommu->reserved_ranges,
>> bfn_x(bfn)) )
>> > +        return -EINVAL;
>> > +
>> > +    d = rcu_lock_domain_by_any_id(op->domid);
>> > +    if ( !d )
>> > +        return -ESRCH;
>> > +
>> > +    rc = get_paged_gfn(d, _gfn(op->gfn), readonly, NULL, &page);
>> > +    if ( rc )
>> > +        goto unlock;
>> > +
>> > +    rc = -EINVAL;
>> > +    if ( !readonly && !get_page_type(page, PGT_writable_page) )
>> > +    {
>> > +        put_page(page);
>> > +        goto unlock;
>> > +    }
>> > +
>> > +    prot = IOMMUF_readable;
>> > +    if ( !readonly )
>> > +        prot |= IOMMUF_writable;
>> > +
>> > +    rc = -EIO;
>> > +    if ( iommu_map_page(currd, bfn, page_to_mfn(page), prot) )
>> > +        goto release;
>> > +
>> > +    rc = rangeset_add_singleton(iommu->iommu_op_ranges, bfn_x(bfn));
>> > +    if ( rc )
>> > +        goto unmap;
>> 
>> When a mapping is requested for the same BFN that a prior mapping
>> was already established for, the page refs of that prior mapping get
>> leaked here. I don't think you want to require an intermediate unmap,
>> so checking the rangeset first is not an option. Hence I think you
>> need to look up the translation anyway, which may mean that the
>> rangeset's usefulness is quite limited (relevant with the additional
>> context of my question regarding it perhaps requiring a pretty much
>> unbounded amount of memory).
>> 
> 
> Yes, that's a good point. I could do a lookup to check whether the B[D]FN is 
> already there though. Agreed that the memory is unbounded unless the number of 
> ranges is limited, for which there is already a facility. It is not ideal 
> though.
> 
>> In order to avoid shooting down all pre-existing RAM mappings - is
>> there no way the page table entries could be marked to identify
>> their origin?
>> 
> 
> I don't know whether that is possible; I'll have to find specs for Intel and 
> AMD IOMMUs and see if they have PTE bits available for such a use.

I seem to vaguely recall the AMD side lacking any suitable bits.

>> I also have another more general concern: Allowing the guest to
>> manipulate its IOMMU page tables means that it can deliberately
>> shatter large pages, growing the overall memory footprint of the
>> domain. I'm hesitant to say this, but I'm afraid that resource
>> tracking of such "behind the scenes" allocations might be a
>> necessary prereq for the PV IOMMU work.
> 
> Remember that PV-IOMMU is only available for dom0 as it stands (and that is 
> the only use-case that XenServer currently has) so I think that, whilst the 
> concern is valid, there is no real danger in putting the code in without such 
> tracking. Such work can be deferred until PV-IOMMU is made available to 
> de-privileged guests... if that facility is needed.

Good point, but perhaps worth a prominent fixme note at a suitable
place in code, like iirc Roger has done in a number of places for his
vPCI work (which eventually we will want to extend to support DomU
too).
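
E.g. something like this at the top of the enable op (sketch; wording
illustrative):

    /*
     * FIXME: allowing a guest to manipulate its IOMMU mappings can
     * shatter large pages and hence grow the domain's memory
     * footprint, and such "behind the scenes" allocations are not yet
     * resource-tracked. Until they are, this interface must remain
     * restricted to the hardware domain.
     */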

Jan



* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-07 11:11   ` Jan Beulich
  2018-09-07 12:36     ` Paul Durrant
@ 2018-09-12  8:31     ` Paul Durrant
  2018-09-12  8:43       ` Jan Beulich
  1 sibling, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-12  8:31 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: xen-devel, Kevin Tian, George Dunlap

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 07 September 2018 12:11
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: George Dunlap <George.Dunlap@citrix.com>; Kevin Tian
> <kevin.tian@intel.com>; xen-devel <xen-devel@lists.xenproject.org>
> Subject: Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
> 
> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> > This patch adds a new method to the VT-d IOMMU implementation to find
> the
> > MFN currently mapped by the specified BFN along with a wrapper function
> in
> > generic IOMMU code to call the implementation if it exists.
> 
> For this to go in, I think the AMD side of it wants to also be implemented.
> 
> > --- a/xen/drivers/passthrough/iommu.c
> > +++ b/xen/drivers/passthrough/iommu.c
> > @@ -305,6 +305,17 @@ int iommu_unmap_page(struct domain *d, bfn_t
> bfn)
> >      return rc;
> >  }
> >
> > +int iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
> > +                      unsigned int *flags)
> > +{
> > +    const struct domain_iommu *hd = dom_iommu(d);
> > +
> > +    if ( !iommu_enabled || !hd->platform_ops )
> > +        return -EOPNOTSUPP;
> > +
> > +    return hd->platform_ops->lookup_page(d, bfn, mfn, flags);
> > +}
> 
> Shouldn't this be restricted to PV guests? HVM ones aren't supposed
> to know about MFNs.

Agreed, but I think this is the wrong level to be applying such a check: iommu_map_page() is supplied an MFN regardless of whether the domain is PV or HVM, so I think it is reasonable for a lookup function to work in terms of MFNs.

> 
> > +static int intel_iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t
> *mfn,
> > +                                   unsigned int *flags)
> > +{
> > +    struct domain_iommu *hd = dom_iommu(d);
> > +    struct dma_pte *page = NULL, *pte = NULL, val;
> 
> Pointless initializers. I also question the usefulness of "pte":
> 

Yes, I'll tidy up the initializers.

> > +    u64 pg_maddr;
> > +
> > +    spin_lock(&hd->arch.mapping_lock);
> > +
> > +    pg_maddr = addr_to_dma_page_maddr(d, bfn_to_baddr(bfn), 0);
> > +    if ( pg_maddr == 0 )
> > +    {
> > +        spin_unlock(&hd->arch.mapping_lock);
> > +        return -ENOMEM;
> > +    }
> > +
> > +    page = map_vtd_domain_page(pg_maddr);
> > +    pte = page + (bfn_x(bfn) & LEVEL_MASK);
> > +    val = *pte;
> 
>     val = page[bfn_x(bfn) & LEVEL_MASK];
> 

Yes, that's neater.
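
With both of those folded in, the function would end up looking something
like this (sketch only; the dma_pte_read()/dma_pte_write() flag helpers are
assumed to exist alongside the other dma_pte_* accessors in the VT-d code):

    static int intel_iommu_lookup_page(struct domain *d, bfn_t bfn,
                                       mfn_t *mfn, unsigned int *flags)
    {
        struct domain_iommu *hd = dom_iommu(d);
        struct dma_pte *page, val;
        u64 pg_maddr;

        spin_lock(&hd->arch.mapping_lock);

        pg_maddr = addr_to_dma_page_maddr(d, bfn_to_baddr(bfn), 0);
        if ( pg_maddr == 0 )
        {
            spin_unlock(&hd->arch.mapping_lock);
            return -ENOMEM;
        }

        page = map_vtd_domain_page(pg_maddr);
        val = page[bfn_x(bfn) & LEVEL_MASK]; /* no separate pte pointer */

        unmap_vtd_domain_page(page);
        spin_unlock(&hd->arch.mapping_lock);

        if ( !dma_pte_present(val) )
            return -ENOENT;

        *mfn = maddr_to_mfn(dma_pte_addr(val));
        *flags = dma_pte_read(val) ? IOMMUF_readable : 0;
        if ( dma_pte_write(val) )
            *flags |= IOMMUF_writable;

        return 0;
    }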

  Paul

> Jan
> 



* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-12  8:31     ` Paul Durrant
@ 2018-09-12  8:43       ` Jan Beulich
  2018-09-12  8:45         ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-12  8:43 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Kevin Tian, george.dunlap

>>> On 12.09.18 at 10:31, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 07 September 2018 12:11
>> 
>> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
>> > This patch adds a new method to the VT-d IOMMU implementation to find
>> the
>> > MFN currently mapped by the specified BFN along with a wrapper function
>> in
>> > generic IOMMU code to call the implementation if it exists.
>> 
>> For this to go in, I think the AMD side of it wants to also be implemented.
>> 
>> > --- a/xen/drivers/passthrough/iommu.c
>> > +++ b/xen/drivers/passthrough/iommu.c
>> > @@ -305,6 +305,17 @@ int iommu_unmap_page(struct domain *d, bfn_t
>> bfn)
>> >      return rc;
>> >  }
>> >
>> > +int iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
>> > +                      unsigned int *flags)
>> > +{
>> > +    const struct domain_iommu *hd = dom_iommu(d);
>> > +
>> > +    if ( !iommu_enabled || !hd->platform_ops )
>> > +        return -EOPNOTSUPP;
>> > +
>> > +    return hd->platform_ops->lookup_page(d, bfn, mfn, flags);
>> > +}
>> 
>> Shouldn't this be restricted to PV guests? HVM ones aren't supposed
>> to know about MFNs.
> 
> Agreed, but I think this is the wrong level to be applying such a check: 
> iommu_map_page() is supplied an MFN regardless of whether the domain is PV or 
> HVM, so I think it is reasonable for a lookup function to work in terms of 
> MFNs.

I don't mind much where the check sits, but ASSERT(!is_hvm_domain()),
if placed here, should not trigger.
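
I.e. (sketch):

    int iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
                          unsigned int *flags)
    {
        const struct domain_iommu *hd = dom_iommu(d);

        /* MFNs should not be handed back for a translated guest. */
        ASSERT(!is_hvm_domain(d));

        if ( !iommu_enabled || !hd->platform_ops )
            return -EOPNOTSUPP;

        return hd->platform_ops->lookup_page(d, bfn, mfn, flags);
    }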

Jan




* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-12  8:43       ` Jan Beulich
@ 2018-09-12  8:45         ` Paul Durrant
  2018-09-12  8:51           ` Paul Durrant
  2018-09-12  8:59           ` Jan Beulich
  0 siblings, 2 replies; 111+ messages in thread
From: Paul Durrant @ 2018-09-12  8:45 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: xen-devel, Kevin Tian, George Dunlap

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 12 September 2018 09:44
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: George Dunlap <George.Dunlap@citrix.com>; Kevin Tian
> <kevin.tian@intel.com>; xen-devel <xen-devel@lists.xenproject.org>
> Subject: RE: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
> 
> >>> On 12.09.18 at 10:31, <Paul.Durrant@citrix.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 07 September 2018 12:11
> >>
> >> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> >> > This patch adds a new method to the VT-d IOMMU implementation to
> find
> >> the
> >> > MFN currently mapped by the specified BFN along with a wrapper
> function
> >> in
> >> > generic IOMMU code to call the implementation if it exists.
> >>
> >> For this to go in, I think the AMD side of it wants to also be implemented.
> >>
> >> > --- a/xen/drivers/passthrough/iommu.c
> >> > +++ b/xen/drivers/passthrough/iommu.c
> >> > @@ -305,6 +305,17 @@ int iommu_unmap_page(struct domain *d,
> bfn_t
> >> bfn)
> >> >      return rc;
> >> >  }
> >> >
> >> > +int iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
> >> > +                      unsigned int *flags)
> >> > +{
> >> > +    const struct domain_iommu *hd = dom_iommu(d);
> >> > +
> >> > +    if ( !iommu_enabled || !hd->platform_ops )
> >> > +        return -EOPNOTSUPP;
> >> > +
> >> > +    return hd->platform_ops->lookup_page(d, bfn, mfn, flags);
> >> > +}
> >>
> >> Shouldn't this be restricted to PV guests? HVM ones aren't supposed
> >> to know about MFNs.
> >
> > Agreed, but I think this is the wrong level to be applying such a check:
> > iommu_map_page() is supplied an MFN regardless of whether the domain
> is PV or
> > HVM, so I think it is reasonable for a lookup function to work in terms of
> > MFNs.
> 
> I don't mind much where the check sits, but ASSERT(!is_hvm_domain()),
> if placed here, should not trigger.
> 

It will though. I'm going to need to use this function for HVM guests after having done a GFN lookup.

  Paul

> Jan
> 



* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-12  8:45         ` Paul Durrant
@ 2018-09-12  8:51           ` Paul Durrant
  2018-09-12  8:53             ` Paul Durrant
  2018-09-12  8:59           ` Jan Beulich
  1 sibling, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-12  8:51 UTC (permalink / raw)
  To: Paul Durrant, 'Jan Beulich'; +Cc: xen-devel, Kevin Tian, George Dunlap

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xenproject.org] On Behalf
> Of Paul Durrant
> Sent: 12 September 2018 09:45
> To: 'Jan Beulich' <JBeulich@suse.com>
> Cc: xen-devel <xen-devel@lists.xenproject.org>; Kevin Tian
> <kevin.tian@intel.com>; George Dunlap <George.Dunlap@citrix.com>
> Subject: Re: [Xen-devel] [PATCH v6 08/14] vtd: add lookup_page method to
> iommu_ops
> 
> > -----Original Message-----
> > From: Jan Beulich [mailto:JBeulich@suse.com]
> > Sent: 12 September 2018 09:44
> > To: Paul Durrant <Paul.Durrant@citrix.com>
> > Cc: George Dunlap <George.Dunlap@citrix.com>; Kevin Tian
> > <kevin.tian@intel.com>; xen-devel <xen-devel@lists.xenproject.org>
> > Subject: RE: [PATCH v6 08/14] vtd: add lookup_page method to
> iommu_ops
> >
> > >>> On 12.09.18 at 10:31, <Paul.Durrant@citrix.com> wrote:
> > >> From: Jan Beulich [mailto:JBeulich@suse.com]
> > >> Sent: 07 September 2018 12:11
> > >>
> > >> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> > >> > This patch adds a new method to the VT-d IOMMU implementation to
> > find
> > >> the
> > >> > MFN currently mapped by the specified BFN along with a wrapper
> > function
> > >> in
> > >> > generic IOMMU code to call the implementation if it exists.
> > >>
> > >> For this to go in, I think the AMD side of it wants to also be
> implemented.
> > >>
> > >> > --- a/xen/drivers/passthrough/iommu.c
> > >> > +++ b/xen/drivers/passthrough/iommu.c
> > >> > @@ -305,6 +305,17 @@ int iommu_unmap_page(struct domain *d,
> > bfn_t
> > >> bfn)
> > >> >      return rc;
> > >> >  }
> > >> >
> > >> > +int iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
> > >> > +                      unsigned int *flags)
> > >> > +{
> > >> > +    const struct domain_iommu *hd = dom_iommu(d);
> > >> > +
> > >> > +    if ( !iommu_enabled || !hd->platform_ops )
> > >> > +        return -EOPNOTSUPP;
> > >> > +
> > >> > +    return hd->platform_ops->lookup_page(d, bfn, mfn, flags);
> > >> > +}
> > >>
> > >> Shouldn't this be restricted to PV guests? HVM ones aren't supposed
> > >> to know about MFNs.
> > >
> > > Agreed, but I think this is the wrong level to be applying such a check:
> > > iommu_map_page() is supplied an MFN regardless of whether the
> domain
> > is PV or
> > > HVM, so I think it is reasonable for a lookup function to work in terms of
> > > MFNs.
> >
> > I don't mind much where the check sits, but ASSERT(!is_hvm_domain()),
> > if placed here, should not trigger.
> >
> 
> It will though. I'm going to need to use this function for HVM guests after
> having done a GFN lookup.

Sorry... I'm getting confused myself now. It won't fire because in my case the domain here will always be PV (because it is not the domain owning the GFN). I still think this is the wrong level for such a check though, but I'll put in the ASSERT.

  Paul

> 
>   Paul
> 
> > Jan
> >
> 
> 

* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-12  8:51           ` Paul Durrant
@ 2018-09-12  8:53             ` Paul Durrant
  2018-09-12  9:03               ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-12  8:53 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: xen-devel, Kevin Tian, George Dunlap

> -----Original Message-----
> From: Paul Durrant
> Sent: 12 September 2018 09:52
> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich'
> <JBeulich@suse.com>
> Cc: xen-devel <xen-devel@lists.xenproject.org>; Kevin Tian
> <kevin.tian@intel.com>; George Dunlap <George.Dunlap@citrix.com>
> Subject: RE: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
> 
> > -----Original Message-----
> > From: Xen-devel [mailto:xen-devel-bounces@lists.xenproject.org] On
> Behalf
> > Of Paul Durrant
> > Sent: 12 September 2018 09:45
> > To: 'Jan Beulich' <JBeulich@suse.com>
> > Cc: xen-devel <xen-devel@lists.xenproject.org>; Kevin Tian
> > <kevin.tian@intel.com>; George Dunlap <George.Dunlap@citrix.com>
> > Subject: Re: [Xen-devel] [PATCH v6 08/14] vtd: add lookup_page method
> to
> > iommu_ops
> >
> > > -----Original Message-----
> > > From: Jan Beulich [mailto:JBeulich@suse.com]
> > > Sent: 12 September 2018 09:44
> > > To: Paul Durrant <Paul.Durrant@citrix.com>
> > > Cc: George Dunlap <George.Dunlap@citrix.com>; Kevin Tian
> > > <kevin.tian@intel.com>; xen-devel <xen-devel@lists.xenproject.org>
> > > Subject: RE: [PATCH v6 08/14] vtd: add lookup_page method to
> > iommu_ops
> > >
> > > >>> On 12.09.18 at 10:31, <Paul.Durrant@citrix.com> wrote:
> > > >> From: Jan Beulich [mailto:JBeulich@suse.com]
> > > >> Sent: 07 September 2018 12:11
> > > >>
> > > >> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> > > >> > This patch adds a new method to the VT-d IOMMU implementation
> to
> > > find
> > > >> the
> > > >> > MFN currently mapped by the specified BFN along with a wrapper
> > > function
> > > >> in
> > > >> > generic IOMMU code to call the implementation if it exists.
> > > >>
> > > >> For this to go in, I think the AMD side of it wants to also be
> > implemented.
> > > >>
> > > >> > --- a/xen/drivers/passthrough/iommu.c
> > > >> > +++ b/xen/drivers/passthrough/iommu.c
> > > >> > @@ -305,6 +305,17 @@ int iommu_unmap_page(struct domain *d,
> > > bfn_t
> > > >> bfn)
> > > >> >      return rc;
> > > >> >  }
> > > >> >
> > > >> > +int iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
> > > >> > +                      unsigned int *flags)
> > > >> > +{
> > > >> > +    const struct domain_iommu *hd = dom_iommu(d);
> > > >> > +
> > > >> > +    if ( !iommu_enabled || !hd->platform_ops )
> > > >> > +        return -EOPNOTSUPP;
> > > >> > +
> > > >> > +    return hd->platform_ops->lookup_page(d, bfn, mfn, flags);
> > > >> > +}
> > > >>
> > > >> Shouldn't this be restricted to PV guests? HVM ones aren't supposed
> > > >> to know about MFNs.
> > > >
> > > > Agreed, but I think this is the wrong level to be applying such a check:
> > > > iommu_map_page() is supplied an MFN regardless of whether the
> > domain
> > > is PV or
> > > > HVM, so I think it is reasonable for a lookup function to work in terms of
> > > > MFNs.
> > >
> > > I don't mind much where the check sits, but ASSERT(!is_hvm_domain()),
> > > if placed here, should not trigger.
> > >
> >
> > It will though. I'm going to need to use this function for HVM guests after
> > having done a GFN lookup.
> 
> Sorry... I'm getting confused myself now. It won't fire because in my case the
> domain here will always be PV (because it is not the domain owning the
> GFN). I still think this is the wrong level for such a check though, but I'll put in
> the ASSERT.
> 

Actually, no, I still don't think the ASSERT is correct. Why should we rule out HVM guests being able to use PV-IOMMU?

  Paul

>   Paul
> 
> >
> >   Paul
> >
> > > Jan
> > >
> >
> >

* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-12  8:45         ` Paul Durrant
  2018-09-12  8:51           ` Paul Durrant
@ 2018-09-12  8:59           ` Jan Beulich
  1 sibling, 0 replies; 111+ messages in thread
From: Jan Beulich @ 2018-09-12  8:59 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Kevin Tian, george.dunlap

>>> On 12.09.18 at 10:45, <Paul.Durrant@citrix.com> wrote:
>>  -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 12 September 2018 09:44
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: George Dunlap <George.Dunlap@citrix.com>; Kevin Tian
>> <kevin.tian@intel.com>; xen-devel <xen-devel@lists.xenproject.org>
>> Subject: RE: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
>> 
>> >>> On 12.09.18 at 10:31, <Paul.Durrant@citrix.com> wrote:
>> >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >> Sent: 07 September 2018 12:11
>> >>
>> >> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
>> >> > This patch adds a new method to the VT-d IOMMU implementation to
>> find
>> >> the
>> >> > MFN currently mapped by the specified BFN along with a wrapper
>> function
>> >> in
>> >> > generic IOMMU code to call the implementation if it exists.
>> >>
>> >> For this to go in, I think the AMD side of it wants to also be implemented.
>> >>
>> >> > --- a/xen/drivers/passthrough/iommu.c
>> >> > +++ b/xen/drivers/passthrough/iommu.c
>> >> > @@ -305,6 +305,17 @@ int iommu_unmap_page(struct domain *d,
>> bfn_t
>> >> bfn)
>> >> >      return rc;
>> >> >  }
>> >> >
>> >> > +int iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
>> >> > +                      unsigned int *flags)
>> >> > +{
>> >> > +    const struct domain_iommu *hd = dom_iommu(d);
>> >> > +
>> >> > +    if ( !iommu_enabled || !hd->platform_ops )
>> >> > +        return -EOPNOTSUPP;
>> >> > +
>> >> > +    return hd->platform_ops->lookup_page(d, bfn, mfn, flags);
>> >> > +}
>> >>
>> >> Shouldn't this be restricted to PV guests? HVM ones aren't supposed
>> >> to know about MFNs.
>> >
>> > Agreed, but I think this is the wrong level to be applying such a check:
>> > iommu_map_page() is supplied an MFN regardless of whether the domain
>> is PV or
>> > HVM, so I think it is reasonable for a lookup function to work in terms of
>> > MFNs.
>> 
>> I don't mind much where the check sits, but ASSERT(!is_hvm_domain()),
>> if placed here, should not trigger.
>> 
> 
> It will though. I'm going to need to use this function for HVM guests after 
> having done a GFN lookup.

That's the subject domain then, not the calling one.

Jan




* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-12  8:53             ` Paul Durrant
@ 2018-09-12  9:03               ` Jan Beulich
  2018-09-12  9:05                 ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-12  9:03 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Kevin Tian, george.dunlap

>>> On 12.09.18 at 10:53, <Paul.Durrant@citrix.com> wrote:
>> From: Paul Durrant
>> Sent: 12 September 2018 09:52
>> 
>> > From: Xen-devel [mailto:xen-devel-bounces@lists.xenproject.org] On
>> Behalf
>> > Of Paul Durrant
>> > Sent: 12 September 2018 09:45
>> >
>> > > From: Jan Beulich [mailto:JBeulich@suse.com]
>> > > Sent: 12 September 2018 09:44
>> > >
>> > > >>> On 12.09.18 at 10:31, <Paul.Durrant@citrix.com> wrote:
>> > > >> From: Jan Beulich [mailto:JBeulich@suse.com]
>> > > >> Sent: 07 September 2018 12:11
>> > > >>
>> > > >> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
>> > > >> > This patch adds a new method to the VT-d IOMMU implementation
>> to
>> > > find
>> > > >> the
>> > > >> > MFN currently mapped by the specified BFN along with a wrapper
>> > > function
>> > > >> in
>> > > >> > generic IOMMU code to call the implementation if it exists.
>> > > >>
>> > > >> For this to go in, I think the AMD side of it wants to also be
>> > implemented.
>> > > >>
>> > > >> > --- a/xen/drivers/passthrough/iommu.c
>> > > >> > +++ b/xen/drivers/passthrough/iommu.c
>> > > >> > @@ -305,6 +305,17 @@ int iommu_unmap_page(struct domain *d,
>> > > bfn_t
>> > > >> bfn)
>> > > >> >      return rc;
>> > > >> >  }
>> > > >> >
>> > > >> > +int iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t *mfn,
>> > > >> > +                      unsigned int *flags)
>> > > >> > +{
>> > > >> > +    const struct domain_iommu *hd = dom_iommu(d);
>> > > >> > +
>> > > >> > +    if ( !iommu_enabled || !hd->platform_ops )
>> > > >> > +        return -EOPNOTSUPP;
>> > > >> > +
>> > > >> > +    return hd->platform_ops->lookup_page(d, bfn, mfn, flags);
>> > > >> > +}
>> > > >>
>> > > >> Shouldn't this be restricted to PV guests? HVM ones aren't supposed
>> > > >> to know about MFNs.
>> > > >
>> > > > Agreed, but I think this is the wrong level to be applying such a check:
>> > > > iommu_map_page() is supplied an MFN regardless of whether the
>> > domain
>> > > is PV or
>> > > > HVM, so I think it is reasonable for a lookup function to work in terms of
>> > > > MFNs.
>> > >
>> > > I don't mind much where the check sits, but ASSERT(!is_hvm_domain()),
>> > > if placed here, should not trigger.
>> > >
>> >
>> > It will though. I'm going to need to use this function for HVM guests after
>> > having done a GFN lookup.
>> 
>> Sorry... I'm getting confused myself now. It won't fire because in my case the
>> domain here will always be PV (because it is not the domain owning the
>> GFN). I still think this is the wrong level for such a check though, but I'll put in
>> the ASSERT.

And what would guarantee the ASSERT() to not trigger? As said, I'm
fine with this being enforced at a different level, but it needs to be
enforced somewhere.

> Actually, no I still don't think the ASSERT is correct. Why should we rule 
> out HVM guests being able to use PV-IOMMU?

A HVM guest using the PV IOMMU is quite fine, but it shouldn't talk to
it in terms of MFNs.

Jan




* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-12  9:03               ` Jan Beulich
@ 2018-09-12  9:05                 ` Paul Durrant
  2018-09-12  9:12                   ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-12  9:05 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: xen-devel, Kevin Tian, George Dunlap

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 12 September 2018 10:03
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: George Dunlap <George.Dunlap@citrix.com>; Kevin Tian
> <kevin.tian@intel.com>; xen-devel <xen-devel@lists.xenproject.org>
> Subject: RE: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
> 
> >>> On 12.09.18 at 10:53, <Paul.Durrant@citrix.com> wrote:
> >> From: Paul Durrant
> >> Sent: 12 September 2018 09:52
> >>
> >> > From: Xen-devel [mailto:xen-devel-bounces@lists.xenproject.org] On
> >> Behalf
> >> > Of Paul Durrant
> >> > Sent: 12 September 2018 09:45
> >> >
> >> > > From: Jan Beulich [mailto:JBeulich@suse.com]
> >> > > Sent: 12 September 2018 09:44
> >> > >
> >> > > >>> On 12.09.18 at 10:31, <Paul.Durrant@citrix.com> wrote:
> >> > > >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> > > >> Sent: 07 September 2018 12:11
> >> > > >>
> >> > > >> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> >> > > >> > This patch adds a new method to the VT-d IOMMU
> implementation
> >> to
> >> > > find
> >> > > >> the
> >> > > >> > MFN currently mapped by the specified BFN along with a
> wrapper
> >> > > function
> >> > > >> in
> >> > > >> > generic IOMMU code to call the implementation if it exists.
> >> > > >>
> >> > > >> For this to go in, I think the AMD side of it wants to also be
> >> > implemented.
> >> > > >>
> >> > > >> > --- a/xen/drivers/passthrough/iommu.c
> >> > > >> > +++ b/xen/drivers/passthrough/iommu.c
> >> > > >> > @@ -305,6 +305,17 @@ int iommu_unmap_page(struct domain
> *d,
> >> > > bfn_t
> >> > > >> bfn)
> >> > > >> >      return rc;
> >> > > >> >  }
> >> > > >> >
> >> > > >> > +int iommu_lookup_page(struct domain *d, bfn_t bfn, mfn_t
> *mfn,
> >> > > >> > +                      unsigned int *flags)
> >> > > >> > +{
> >> > > >> > +    const struct domain_iommu *hd = dom_iommu(d);
> >> > > >> > +
> >> > > >> > +    if ( !iommu_enabled || !hd->platform_ops )
> >> > > >> > +        return -EOPNOTSUPP;
> >> > > >> > +
> >> > > >> > +    return hd->platform_ops->lookup_page(d, bfn, mfn, flags);
> >> > > >> > +}
> >> > > >>
> >> > > >> Shouldn't this be restricted to PV guests? HVM ones aren't
> supposed
> >> > > >> to know about MFNs.
> >> > > >
> >> > > > Agreed, but I think this is the wrong level to be applying such a
> check:
> >> > > > iommu_map_page() is supplied an MFN regardless of whether the
> >> > domain
> >> > > is PV or
> >> > > > HVM, so I think it is reasonable for a lookup function to work in
> terms of
> >> > > > MFNs.
> >> > >
> >> > > I don't mind much where the check sits, but
> ASSERT(!is_hvm_domain()),
> >> > > if placed here, should not trigger.
> >> > >
> >> >
> >> > It will though. I'm going to need to use this function for HVM guests
> after
> >> > having done a GFN lookup.
> >>
> >> Sorry... I'm getting confused myself now. It won't fire because in my case
> the
> >> domain here will always be PV (because it is not the domain owning
> the
> >> GFN). I still think this is the wrong level for such a check though, but I'll put
> in
> >> the ASSERT.
> 
> And what would guarantee the ASSERT() to not trigger? As said, I'm
> fine with this being enforced at a different level, but it needs to be
> enforced somewhere.
> 
> > Actually, no I still don't think the ASSERT is correct. Why should we rule
> > out HVM guests being able to use PV-IOMMU?
> 
> A HVM guest using the PV IOMMU is quite fine, but it shouldn't talk to
> it in terms of MFNs.
> 

Well, it has to talk MFNs at some level, surely? The output of the IOMMU is not subject to EPT/NPT, right?

  Paul

> Jan
> 



* Re: [PATCH v6 12/14] memory: add get_paged_gfn() as a wrapper...
  2018-09-11 14:56   ` Jan Beulich
@ 2018-09-12  9:10     ` Paul Durrant
  2018-09-12  9:15       ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-12  9:10 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, xen-devel, Ian Jackson

> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@lists.xenproject.org] On Behalf
> Of Jan Beulich
> Sent: 11 September 2018 15:56
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>; Wei Liu
> <wei.liu2@citrix.com>; Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>;
> George Dunlap <George.Dunlap@citrix.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Tim
> (Xen.org) <tim@xen.org>; Julien Grall <julien.grall@arm.com>; xen-devel
> <xen-devel@lists.xenproject.org>
> Subject: Re: [Xen-devel] [PATCH v6 12/14] memory: add get_paged_gfn() as
> a wrapper...
> 
> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> > ...for some uses of get_page_from_gfn().
> >
> > There are many occurrences of the following pattern in the code:
> >
> >     q = <readonly look-up> ? P2M_ALLOC : P2M_UNSHARE;
> 
> Especially with this UNSHARE in mind - is "paged" in the helper
> function's name really suitable? Since we (I think) already have
> get_gfn(), how about try_get_gfn()?

That name may be a little misleading since it suggests a close functional relationship with get_gfn() whereas it does more than that. How about try_get_page_from_gfn()?
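
I.e. a prototype along these lines (sketch only; the parameter list mirrors
what get_paged_gfn() takes today, and the exact types are illustrative):

    int try_get_page_from_gfn(struct domain *d, gfn_t gfn, bool readonly,
                              p2m_type_t *p2mt_p, struct page_info **page_p);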

> 
> > --- a/xen/arch/x86/hvm/emulate.c
> > +++ b/xen/arch/x86/hvm/emulate.c
> > @@ -350,34 +350,16 @@ static int hvmemul_do_io_buffer(
> >
> >  static int hvmemul_acquire_page(unsigned long gmfn, struct page_info
> **page)
> >  {
> > -    struct domain *curr_d = current->domain;
> > -    p2m_type_t p2mt;
> > -
> > -    *page = get_page_from_gfn(curr_d, gmfn, &p2mt, P2M_UNSHARE);
> > -
> > -    if ( *page == NULL )
> > -        return X86EMUL_UNHANDLEABLE;
> > -
> > -    if ( p2m_is_paging(p2mt) )
> > -    {
> > -        put_page(*page);
> > -        p2m_mem_paging_populate(curr_d, gmfn);
> > -        return X86EMUL_RETRY;
> > -    }
> > -
> > -    if ( p2m_is_shared(p2mt) )
> > +    switch ( get_paged_gfn(current->domain, _gfn(gmfn), false, NULL,
> page) )
> >      {
> > -        put_page(*page);
> > +    case -EAGAIN:
> >          return X86EMUL_RETRY;
> > -    }
> > -
> > -    /* This code should not be reached if the gmfn is not RAM */
> > -    if ( p2m_is_mmio(p2mt) )
> > -    {
> > -        domain_crash(curr_d);
> > -
> > -        put_page(*page);
> > +    case -EINVAL:
> >          return X86EMUL_UNHANDLEABLE;
> > +    default:
> > +        ASSERT_UNREACHABLE();
> > +    case 0:
> 
> I think you'd better have "default:" fall through to "case -EINVAL".
> Similarly elsewhere.

Ok. I'll keep the ASSERT_UNREACHABLE() though.

  Paul

> 
> > --- a/xen/arch/x86/hvm/hvm.c
> > +++ b/xen/arch/x86/hvm/hvm.c
> > @@ -2557,24 +2557,12 @@ static void *_hvm_map_guest_frame(unsigned
> long gfn, bool_t permanent,
> >                                    bool_t *writable)
> >  {
> >      void *map;
> > -    p2m_type_t p2mt;
> >      struct page_info *page;
> >      struct domain *d = current->domain;
> > +    p2m_type_t p2mt;
> 
> ???
> 
> Jan
> 
> 
> 

* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-12  9:05                 ` Paul Durrant
@ 2018-09-12  9:12                   ` Jan Beulich
  2018-09-12  9:15                     ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-12  9:12 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Kevin Tian, george.dunlap

>>> On 12.09.18 at 11:05, <Paul.Durrant@citrix.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: 12 September 2018 10:03
>> 
>> A HVM guest using the PV IOMMU is quite fine, but it shouldn't talk to
>> it in terms of MFNs.
>> 
> 
> Well, it has to talk MFNs at some level, surely? The output of the IOMMU is 
> not subject to EPT/NPT, right?

Yes to the second question, but no to the first: The GFN -> MFN translation
should still be done inside Xen in the HVM case, imo (in the course of
manufacturing the PTE).
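
That is, in fact, what the map path in this series already does for the
subject domain (condensed from iommuop_map() for illustration):

    /* The caller supplies a GFN; the GFN -> MFN translation happens
       inside Xen while the mapping is constructed. */
    rc = get_paged_gfn(d, _gfn(op->gfn), readonly, NULL, &page);
    if ( rc )
        goto unlock;

    rc = iommu_map_page(currd, bfn, page_to_mfn(page), prot);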

Jan




* Re: [PATCH v6 12/14] memory: add get_paged_gfn() as a wrapper...
  2018-09-12  9:10     ` Paul Durrant
@ 2018-09-12  9:15       ` Jan Beulich
  2018-09-12 10:01         ` George Dunlap
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-12  9:15 UTC (permalink / raw)
  To: george.dunlap, Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim Deegan, Julien Grall, Ian Jackson, xen-devel

>>> On 12.09.18 at 11:10, <Paul.Durrant@citrix.com> wrote:
>> From: Xen-devel [mailto:xen-devel-bounces@lists.xenproject.org] On Behalf
>> Of Jan Beulich
>> Sent: 11 September 2018 15:56
>> 
>> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
>> > ...for some uses of get_page_from_gfn().
>> >
>> > There are many occurrences of the following pattern in the code:
>> >
>> >     q = <readonly look-up> ? P2M_ALLOC : P2M_UNSHARE;
>> 
>> Especially with this UNSHARE in mind - is "paged" in the helper
>> function's name really suitable? Since we (I think) already have
>> get_gfn(), how about try_get_gfn()?
> 
> That name may be a little misleading since it suggests a close functional 
> relationship with get_gfn() whereas it does more than that. How about 
> try_get_page_from_gfn()?

Fine with me; George?

>> > --- a/xen/arch/x86/hvm/emulate.c
>> > +++ b/xen/arch/x86/hvm/emulate.c
>> > @@ -350,34 +350,16 @@ static int hvmemul_do_io_buffer(
>> >
>> >  static int hvmemul_acquire_page(unsigned long gmfn, struct page_info
>> **page)
>> >  {
>> > -    struct domain *curr_d = current->domain;
>> > -    p2m_type_t p2mt;
>> > -
>> > -    *page = get_page_from_gfn(curr_d, gmfn, &p2mt, P2M_UNSHARE);
>> > -
>> > -    if ( *page == NULL )
>> > -        return X86EMUL_UNHANDLEABLE;
>> > -
>> > -    if ( p2m_is_paging(p2mt) )
>> > -    {
>> > -        put_page(*page);
>> > -        p2m_mem_paging_populate(curr_d, gmfn);
>> > -        return X86EMUL_RETRY;
>> > -    }
>> > -
>> > -    if ( p2m_is_shared(p2mt) )
>> > +    switch ( get_paged_gfn(current->domain, _gfn(gmfn), false, NULL, page) )
>> >      {
>> > -        put_page(*page);
>> > +    case -EAGAIN:
>> >          return X86EMUL_RETRY;
>> > -    }
>> > -
>> > -    /* This code should not be reached if the gmfn is not RAM */
>> > -    if ( p2m_is_mmio(p2mt) )
>> > -    {
>> > -        domain_crash(curr_d);
>> > -
>> > -        put_page(*page);
>> > +    case -EINVAL:
>> >          return X86EMUL_UNHANDLEABLE;
>> > +    default:
>> > +        ASSERT_UNREACHABLE();
>> > +    case 0:
>> 
>> I think you'd better have "default:" fall through to "case -EINVAL".
>> Similarly elsewhere.
> 
> Ok. I'll keep the ASSERT_UNREACHABLE() though.

That's what I was implying by saying "fall through" - otherwise
"case -EINVAL:" could as well have gone away.

Jan
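
P.S. In code, the final shape would then be something like this (just a
sketch, reusing the names from the hunk quoted above):

    switch ( get_paged_gfn(current->domain, _gfn(gmfn), false, NULL, page) )
    {
    case -EAGAIN:
        return X86EMUL_RETRY;

    default:
        ASSERT_UNREACHABLE();
        /* fall through */
    case -EINVAL:
        return X86EMUL_UNHANDLEABLE;

    case 0:
        break;
    }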




* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-12  9:12                   ` Jan Beulich
@ 2018-09-12  9:15                     ` Paul Durrant
  2018-09-12  9:21                       ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-12  9:15 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: xen-devel, Kevin Tian, George Dunlap

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 12 September 2018 10:13
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: George Dunlap <George.Dunlap@citrix.com>; Kevin Tian
> <kevin.tian@intel.com>; xen-devel <xen-devel@lists.xenproject.org>
> Subject: RE: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
> 
> >>> On 12.09.18 at 11:05, <Paul.Durrant@citrix.com> wrote:
> >>
> >> A HVM guest using the PV IOMMU is quite fine, but it shouldn't talk to
> >> it in terms of MFNs.
> >>
> >
> > Well, it has to talk MFNs at some level, surely? The output of the IOMMU is
> > not subject to EPT/NPT, right?
> 
> Yes to the second question, but no to the first: The GFN -> MFN translation
> should still be done inside Xen in the HVM case, imo (in the course of
> manufacturing the PTE).

Indeed. This function is very much internal to Xen (it's simply an abstraction on top of a vendor implementation), so why should it not work in terms of MFNs? There's no hypercall that allows this to be blindly called by an HVM guest.

  Paul


* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-12  9:15                     ` Paul Durrant
@ 2018-09-12  9:21                       ` Jan Beulich
  2018-09-12  9:30                         ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-12  9:21 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Kevin Tian, george.dunlap

>>> On 12.09.18 at 11:15, <Paul.Durrant@citrix.com> wrote:
>> 
>> Yes to the second question, but no to the first: The GFN -> MFN
>> translation should still be done inside Xen in the HVM case, imo (in
>> the course of manufacturing the PTE).
> 
> Indeed. This function is very much internal to Xen (it's simply an 
> abstraction on top of a vendor implementation), so why should it not work in 
> terms of MFNs?

Because "MFN" is a concept a HVM guest does not know about, or is not
supposed to know about. The only time (part of) it might
legitimately (have to) know is when it comes to managing the host
(including any guests), i.e. in the tool stack of a PVH Dom0.

Jan




* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-12  9:21                       ` Jan Beulich
@ 2018-09-12  9:30                         ` Paul Durrant
  2018-09-12 10:07                           ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-12  9:30 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: xen-devel, Kevin Tian, George Dunlap

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 12 September 2018 10:21
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: George Dunlap <George.Dunlap@citrix.com>; Kevin Tian
> <kevin.tian@intel.com>; xen-devel <xen-devel@lists.xenproject.org>
> Subject: RE: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
> 
> >>> On 12.09.18 at 11:15, <Paul.Durrant@citrix.com> wrote:
> >>
> >> Yes to the second question, but no to the first: The GFN -> MFN
> >> translation should still be done inside Xen in the HVM case, imo (in
> >> the course of manufacturing the PTE).
> >
> > Indeed. This function is very much internal to Xen (it's simply an
> > abstraction on top of a vendor implementation), so why should it not work
> > in terms of MFNs?
> 
> Because "MFN" is a concept a HVM guest does not know about, or is not
> supposed to know about. The only time (part of) it might
> legitimately (have to) know is when it comes to managing the host
> (including any guests), i.e. in the tool stack of a PVH Dom0.

Ok. So consider validating a PV-IOMMU unmap request from an HVM guest. It passes in a DFN and a GFN belonging to itself. Now Xen needs to figure out whether that DFN actually maps to the GFN. It can look up the MFN backing the GFN (from the p2m). How does Xen now validate it if it cannot look up what MFN is actually present in the PTE referenced by the DFN?

  Paul


* Re: [PATCH v6 12/14] memory: add get_paged_gfn() as a wrapper...
  2018-09-12  9:15       ` Jan Beulich
@ 2018-09-12 10:01         ` George Dunlap
  2018-09-12 10:08           ` Paul Durrant
  2018-09-12 10:10           ` Jan Beulich
  0 siblings, 2 replies; 111+ messages in thread
From: George Dunlap @ 2018-09-12 10:01 UTC (permalink / raw)
  To: Jan Beulich, Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim Deegan, Julien Grall, Ian Jackson, xen-devel

On 09/12/2018 10:15 AM, Jan Beulich wrote:
>>>> On 12.09.18 at 11:10, <Paul.Durrant@citrix.com> wrote:
>>>
>>>>>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
>>>> ...for some uses of get_page_from_gfn().
>>>>
>>>> There are many occurrences of the following pattern in the code:
>>>>
>>>>     q = <readonly look-up> ? P2M_ALLOC : P2M_UNSHARE;
>>>
>>> Especially with this UNSHARE in mind - is "paged" in the helper
>>> function's name really suitable? Since we (I think) already have
>>> get_gfn(), how about try_get_gfn()?
>>
>> That name may be a little misleading since it suggests a close functional 
>> relationship with get_gfn() whereas it does more than that. How about 
>> try_get_page_from_gfn()?
> 
> Fine with me; George?

At the risk of bike shedding... "try" to me means only pass/fail, with no
side effects, and with no permissions checks.  What about
"check_and_get_page_from_gfn()"?

I'd prefer 'check' but if anyone objects I'd rather just go with 'try'
and get things in -- the code is a definite improvement.

 -George


* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-12  9:30                         ` Paul Durrant
@ 2018-09-12 10:07                           ` Jan Beulich
  2018-09-12 10:09                             ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-12 10:07 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Kevin Tian, george.dunlap

>>> On 12.09.18 at 11:30, <Paul.Durrant@citrix.com> wrote:
>> 
>> Because "MFN" is a concept a HVM guest does not know about, or is not
>> supposed to know about. The only time (part of) it might
>> legitimately (have to) know is when it comes to managing the host
>> (including any guests), i.e. in the tool stack of a PVH Dom0.
> 
> Ok. So consider validating a PV-IOMMU unmap request from an HVM guest. It 
> passes in a DFN and a GFN belonging to itself. Now Xen needs to figure out
> whether that DFN actually maps to the GFN. It can look up the MFN backing the
> GFN (from the p2m). How does Xen now validate it if it cannot look up what MFN
> is actually present in the PTE referenced by the DFN?

I'm afraid I don't understand: The passed in GFN gets translated
to an MFN using a p2m lookup. The passed in DFN (which aiui ought
to match the GFN anyway on x86) gets translated to an MFN using
an IOMMU page table lookup. The resulting two MFNs have to
match for the request to be valid.
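
In rough code (illustrative only - p2m_lookup_mfn() is a made-up helper
name here, unlike iommu_lookup_page()):

    mfn_t pmfn, imfn;

    if ( p2m_lookup_mfn(d, gfn, &pmfn) ||      /* GFN -> MFN via p2m */
         iommu_lookup_page(d, dfn, &imfn) ||   /* DFN -> MFN via IOMMU PT */
         !mfn_eq(pmfn, imfn) )
        return -EINVAL;                        /* MFNs differ: reject */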

Jan




* Re: [PATCH v6 12/14] memory: add get_paged_gfn() as a wrapper...
  2018-09-12 10:01         ` George Dunlap
@ 2018-09-12 10:08           ` Paul Durrant
  2018-09-12 10:10           ` Jan Beulich
  1 sibling, 0 replies; 111+ messages in thread
From: Paul Durrant @ 2018-09-12 10:08 UTC (permalink / raw)
  To: George Dunlap, Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	Julien Grall, Ian Jackson, xen-devel

> -----Original Message-----
> From: George Dunlap [mailto:george.dunlap@citrix.com]
> Sent: 12 September 2018 11:02
> To: Jan Beulich <JBeulich@suse.com>; Paul Durrant
> <Paul.Durrant@citrix.com>
> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Wei
> Liu <wei.liu2@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>; xen-
> devel <xen-devel@lists.xenproject.org>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Tim (Xen.org) <tim@xen.org>
> Subject: Re: [Xen-devel] [PATCH v6 12/14] memory: add get_paged_gfn() as
> a wrapper...
> 
> At the risk of bike shedding... "try" to me means only pass/fail, with no
> side effects, and with no permissions checks.  What about
> "check_and_get_page_from_gfn()"?
> 
> I'd prefer 'check' but if anyone objects I'd rather just go with 'try'
> and get things in -- the code is a definite improvement.
> 

Jan?

  Paul


* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-12 10:07                           ` Jan Beulich
@ 2018-09-12 10:09                             ` Paul Durrant
  2018-09-12 12:15                               ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-12 10:09 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: xen-devel, Kevin Tian, George Dunlap

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 12 September 2018 11:08
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: George Dunlap <George.Dunlap@citrix.com>; Kevin Tian
> <kevin.tian@intel.com>; xen-devel <xen-devel@lists.xenproject.org>
> Subject: RE: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
> 
> I'm afraid I don't understand: The passed in GFN gets translated
> to an MFN using a p2m lookup. The passed in DFN (which aiui ought
> to match the GFN anyway on x86) gets translated to an MFN using
> an IOMMU page table lookup. The resulting two MFNs have to
> match for the request to be valid.
> 

Quite. So how does that work if iommu_lookup_page() is ASSERTing that the domain in question is not HVM?

  Paul


* Re: [PATCH v6 12/14] memory: add get_paged_gfn() as a wrapper...
  2018-09-12 10:01         ` George Dunlap
  2018-09-12 10:08           ` Paul Durrant
@ 2018-09-12 10:10           ` Jan Beulich
  1 sibling, 0 replies; 111+ messages in thread
From: Jan Beulich @ 2018-09-12 10:10 UTC (permalink / raw)
  To: george.dunlap
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim Deegan, Julien Grall, Paul Durrant, xen-devel,
	Ian Jackson

>>> On 12.09.18 at 12:01, <george.dunlap@citrix.com> wrote:
> At the risk of bike shedding... "try" to me means only pass/fail, with no
> side effects, and with no permissions checks.  What about
> "check_and_get_page_from_gfn()"?
> 
> I'd prefer 'check' but if anyone objects I'd rather just go with 'try'
> and get things in -- the code is a definite improvement.

I'm fine with "check", and indeed I wasn't really happy about the
earlier proposed "try".

Jan




* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-12 10:09                             ` Paul Durrant
@ 2018-09-12 12:15                               ` Jan Beulich
  2018-09-12 12:22                                 ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-12 12:15 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Kevin Tian, george.dunlap

>>> On 12.09.18 at 12:09, <Paul.Durrant@citrix.com> wrote:
>> 
>> I'm afraid I don't understand: The passed in GFN gets translated
>> to an MFN using a p2m lookup. The passed in DFN (which aiui ought
>> to match the GFN anyway on x86) gets translated to an MFN using
>> an IOMMU page table lookup. The resulting two MFNs have to
>> match for the request to be valid.
>> 
> 
> Quite. So how does that work if iommu_lookup_page() is ASSERTing that the 
> domain in question is not HVM?

Well, as soon as the function doesn't hand back MFNs anymore to
HVM callers, no such assertion would be needed anymore either.

Jan




* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-12 12:15                               ` Jan Beulich
@ 2018-09-12 12:22                                 ` Paul Durrant
  2018-09-12 12:39                                   ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-12 12:22 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: xen-devel, Kevin Tian, George Dunlap

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 12 September 2018 13:15
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: George Dunlap <George.Dunlap@citrix.com>; Kevin Tian
> <kevin.tian@intel.com>; xen-devel <xen-devel@lists.xenproject.org>
> Subject: RE: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
> 
> >>> On 12.09.18 at 12:09, <Paul.Durrant@citrix.com> wrote:
> >
> > Quite. So how does that work if iommu_lookup_page() is ASSERTing that
> > the domain in question is not HVM?
> 
> Well, as soon as the function doesn't hand back MFNs anymore to
> HVM callers, no such assertion would be needed anymore either.
> 

So you'd prefer I add an ASSERTion that I'm going to remove as soon as I add a caller of the function?

  Paul


* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-12 12:22                                 ` Paul Durrant
@ 2018-09-12 12:39                                   ` Jan Beulich
  2018-09-12 12:53                                     ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-12 12:39 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Kevin Tian, george.dunlap

>>> On 12.09.18 at 14:22, <Paul.Durrant@citrix.com> wrote:
>> Well, as soon as the function doesn't hand back MFNs anymore to
>> HVM callers, no such assertion would be needed anymore either.
> 
> So you'd prefer I add an ASSERTion that I'm going to remove as soon as I add 
> a caller of the function?

No. I guess I'm increasingly confused: The function at present returns
MFNs. Hence it must not be called by a HVM guest. Either you assert
that the calling domain isn't HVM, or you make the function return GFNs
for HVM domains (which then is a no-op due to gfn == dfn here, at
least for now).
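
In code terms the alternatives would be roughly (sketch only;
gfn_from_dfn() is a made-up placeholder name):

    /* Alternative 1: restrict the function to non-HVM callers. */
    ASSERT(!is_hvm_domain(d));

    /* Alternative 2: for HVM, hand back the GFN rather than the MFN
     * (a no-op translation while gfn == dfn holds). */
    if ( is_hvm_domain(d) )
        return gfn_from_dfn(dfn);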

Jan



* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-12 12:39                                   ` Jan Beulich
@ 2018-09-12 12:53                                     ` Paul Durrant
  2018-09-12 13:19                                       ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-12 12:53 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: xen-devel, Kevin Tian, George Dunlap

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 12 September 2018 13:39
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: George Dunlap <George.Dunlap@citrix.com>; Kevin Tian
> <kevin.tian@intel.com>; xen-devel <xen-devel@lists.xenproject.org>
> Subject: RE: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
> 
> >> Well, as soon as the function doesn't hand back MFNs anymore to
> >> HVM callers, no such assertion would be needed anymore either.
> >
> > So you'd prefer I add an ASSERTion that I'm going to remove as soon as I
> > add a caller of the function?
> 
> No. I guess I'm increasingly confused: The function at present returns
> MFNs. Hence it must not be called by a HVM guest.

That's the part I don't get. What do you mean by 'called by a HVM guest'? I completely agree that MFN values must not be exposed to an HVM guest, so there is no way the output of this function should ever be exposed through a hypercall, and I'm not proposing that ever be done.

> Either you assert
> that the calling domain isn't HVM, or you make the function return GFNs
> for HVM domains (which then is a no-op due to gfn == dfn here, at
> least for now).
> 

The function will never return its results to a guest, PV or HVM, so I really don't see the concern. It's a low level function, for Xen's internal use only. It's essentially the equivalent of a p2m lookup function and there's no way we'd ever expose the results of such a lookup to the guest either.

  Paul


* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-12 12:53                                     ` Paul Durrant
@ 2018-09-12 13:19                                       ` Jan Beulich
  2018-09-12 13:25                                         ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-12 13:19 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Kevin Tian, george.dunlap

>>> On 12.09.18 at 14:53, <Paul.Durrant@citrix.com> wrote:
> The function will never return its results to a guest, PV or HVM, so I 
> really don't see the concern. It's a low level function, for Xen's internal 
> use only. It's essentially the equivalent of a p2m lookup function and 
> there's no way we'd ever expose the results of such a lookup to the guest 
> either.

Oh, that was utter confusion on my part then, and I'm sorry for all
the noise. I've been misled by the titles of this patch and patches
13 and 14, all of which have "iommu_ops" as parts of them.

Jan




* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-12 13:19                                       ` Jan Beulich
@ 2018-09-12 13:25                                         ` Paul Durrant
  2018-09-12 13:39                                           ` Jan Beulich
  0 siblings, 1 reply; 111+ messages in thread
From: Paul Durrant @ 2018-09-12 13:25 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: xen-devel, Kevin Tian, George Dunlap

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 12 September 2018 14:20
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: George Dunlap <George.Dunlap@citrix.com>; Kevin Tian
> <kevin.tian@intel.com>; xen-devel <xen-devel@lists.xenproject.org>
> Subject: RE: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
> 
> >>> On 12.09.18 at 14:53, <Paul.Durrant@citrix.com> wrote:
> > The function will never return its results to a guest, PV or HVM, so I
> > really don't see the concern. It's a low level function, for Xen's internal
> > use only. It's essentially the equivalent of a p2m lookup function and
> > there's no way we'd ever expose the results of such a lookup to the guest
> > either.
> 
> Oh, that was utter confusion on my part then, and I'm sorry for all
> the noise. I've been misled by the titles of this patch and patches
> 13 and 14, all of which have "iommu_ops" as parts of them.
> 

I recently noticed this name overloading so sorry for leading you astray. I will seriously consider renaming the hypercall when I re-work the later patches.

  Paul


* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-12 13:25                                         ` Paul Durrant
@ 2018-09-12 13:39                                           ` Jan Beulich
  2018-09-12 13:43                                             ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-12 13:39 UTC (permalink / raw)
  To: Paul Durrant; +Cc: xen-devel, Kevin Tian, george.dunlap

>>> On 12.09.18 at 15:25, <Paul.Durrant@citrix.com> wrote:
>> Oh, that was utter confusion on my part then, and I'm sorry for all
>> the noise. I've been misled by the titles of this patch and patches
>> 13 and 14, all of which have "iommu_ops" as parts of them.
>> 
> 
> I recently noticed this name overloading so sorry for leading you astray. I 
> will seriously consider renaming the hypercall when I re-work the later 
> patches.

The hypercall name is fine, I think. Perhaps the distinction
could already be made more clear by using iommu_ops for the
internal interface(s) (matching their structure name) and
iommu_op / iommu_op-s for the hypercall based operations,
seeing that the structure is named xen_iommu_op?
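
I.e. roughly (abridged, illustrative declarations only):

    /* Xen-internal, per-vendor method table: */
    struct iommu_ops {
        int (*lookup_page)(struct domain *d, bfn_t bfn, mfn_t *mfn);
        /* ... further methods ... */
    };

    /* Hypercall interface structure, in the public header: */
    struct xen_iommu_op {
        uint16_t op;
        uint16_t flags;
        int32_t status;
        /* ... op-specific union ... */
    };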

Jan




* Re: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
  2018-09-12 13:39                                           ` Jan Beulich
@ 2018-09-12 13:43                                             ` Paul Durrant
  0 siblings, 0 replies; 111+ messages in thread
From: Paul Durrant @ 2018-09-12 13:43 UTC (permalink / raw)
  To: 'Jan Beulich'; +Cc: xen-devel, Kevin Tian, George Dunlap

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 12 September 2018 14:39
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: George Dunlap <George.Dunlap@citrix.com>; Kevin Tian
> <kevin.tian@intel.com>; xen-devel <xen-devel@lists.xenproject.org>
> Subject: RE: [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops
> 
> The hypercall name is fine, I think. Perhaps the distinction
> could already be made more clear by using iommu_ops for the
> internal interface(s) (matching their structure name) and
> iommu_op / iommu_op-s for the hypercall based operations,
> seeing that the structure is named xen_iommu_op?
> 

Ok. I'll leave the filenames as iommu_op.[ch] but I'll try to refer consistently to xen_iommu_op in comments and descriptions relating to the hypercalls and iommu_ops for the internal abstraction layer.

  Paul


* Re: [PATCH v6 14/14] x86: extend the map and unmap iommu_ops to support grant references
  2018-08-23  9:47 ` [PATCH v6 14/14] x86: extend the map and unmap iommu_ops to support grant references Paul Durrant
@ 2018-09-12 14:12   ` Jan Beulich
  2018-09-12 16:28     ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Jan Beulich @ 2018-09-12 14:12 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan,
	Julien Grall, xen-devel

>>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> +int
> +acquire_gref_for_iommu(struct domain *d, grant_ref_t gref,
> +                       bool readonly, mfn_t *mfn)
> +{
> +    struct domain *currd = current->domain;
> +    struct grant_table *gt = d->grant_table;
> +    grant_entry_header_t *shah;
> +    struct active_grant_entry *act;
> +    uint16_t *status;
> +    int rc;
> +
> +    grant_read_lock(gt);
> +
> +    rc = -ENOENT;
> +    if ( gref > nr_grant_entries(gt) )

>= (also in the release counterpart)
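
I.e.:

    if ( gref >= nr_grant_entries(gt) )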

> +        goto unlock;
> +
> +    act = active_entry_acquire(gt, gref);
> +    shah = shared_entry_header(gt, gref);
> +    status = ( gt->gt_version == 2 ) ?

Stray blanks. Further down in a similar construct you even omit the
parentheses, which you could as well do here too. Again also below.

> +        &status_entry(gt, gref) :
> +        &shah->flags;

The whole thing does not fit on a line?
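
I.e., line length permitting:

    status = gt->gt_version == 2 ? &status_entry(gt, gref) : &shah->flags;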

> +    rc = -EACCES;
> +    if ( (shah->flags & GTF_type_mask) != GTF_permit_access ||
> +         (shah->flags & GTF_sub_page) )
> +        goto release;

So transitive grants are okay despite there being no special
handling anywhere in the function?

> +    rc = -ERANGE;
> +    if ( act->pin && ((act->domid != currd->domain_id) ||
> +                      (act->pin & 0x80808080U) != 0) )

You want to check only two of the four top bits, as you only add in
GNTPIN_dev{r,w}_inc below.
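
I.e. something like this (sketch; it assumes each GNTPIN_dev?_inc is the
low bit of an 8-bit count, so shifting by 7 isolates that count's top bit):

    if ( act->pin && (act->domid != currd->domain_id ||
                      (act->pin & ((GNTPIN_devr_inc | GNTPIN_devw_inc) << 7))) )
        goto release;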

> +        goto release;
> +
> +    rc = -EINVAL;
> +    if ( !act->pin ||
> +         (!readonly && !(act->pin & GNTPIN_devw_mask)) ) {
> +        if ( _set_status(gt->gt_version, currd->domain_id, readonly,
> +                         0, shah, act, status) != GNTST_okay )
> +            goto release;
> +    }

Please combine the two if()-s.
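
E.g. (sketch, relying on && short-circuiting to keep the behavior
unchanged):

    rc = -EINVAL;
    if ( (!act->pin ||
          (!readonly && !(act->pin & GNTPIN_devw_mask))) &&
         _set_status(gt->gt_version, currd->domain_id, readonly,
                     0, shah, act, status) != GNTST_okay )
        goto release;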

> +    if ( !act->pin )
> +    {
> +        gfn_t gfn = gt->gt_version == 1 ?
> +            _gfn(shared_entry_v1(gt, gref).frame) :
> +            _gfn(shared_entry_v2(gt, gref).full_page.frame);
> +        struct page_info *page;
> +
> +        rc =  get_paged_gfn(d, gfn, readonly, NULL, &page);
> +        if ( rc )
> +            goto clear;
> +
> +        act_set_gfn(act, gfn);
> +        act->mfn = page_to_mfn(page);
> +        act->domid = currd->domain_id;
> +        act->start = 0;
> +        act->length = PAGE_SIZE;
> +        act->is_sub_page = false;
> +        act->trans_domain = d;
> +        act->trans_gref = gref;
> +    }
> +    else
> +    {
> +        ASSERT(mfn_valid(act->mfn));
> +        if ( !get_page(mfn_to_page(act->mfn), d) )
> +            goto clear;
> +    }

Don't you also need to acquire a write ref here if !readonly?

Jan




* Re: [PATCH v6 14/14] x86: extend the map and unmap iommu_ops to support grant references
  2018-09-12 14:12   ` Jan Beulich
@ 2018-09-12 16:28     ` Paul Durrant
  0 siblings, 0 replies; 111+ messages in thread
From: Paul Durrant @ 2018-09-12 16:28 UTC (permalink / raw)
  To: 'Jan Beulich'
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, xen-devel, Ian Jackson

> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: 12 September 2018 15:13
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; Wei Liu <wei.liu2@citrix.com>; George
> Dunlap <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>;
> Stefano Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> devel@lists.xenproject.org>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Tim (Xen.org) <tim@xen.org>
> Subject: Re: [PATCH v6 14/14] x86: extend the map and unmap iommu_ops
> to support grant references
> 
> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> > +int
> > +acquire_gref_for_iommu(struct domain *d, grant_ref_t gref,
> > +                       bool readonly, mfn_t *mfn)
> > +{
> > +    struct domain *currd = current->domain;
> > +    struct grant_table *gt = d->grant_table;
> > +    grant_entry_header_t *shah;
> > +    struct active_grant_entry *act;
> > +    uint16_t *status;
> > +    int rc;
> > +
> > +    grant_read_lock(gt);
> > +
> > +    rc = -ENOENT;
> > +    if ( gref > nr_grant_entries(gt) )
> 
> >= (also in the release counterpart)
> 

Yes, good point.

> > +        goto unlock;
> > +
> > +    act = active_entry_acquire(gt, gref);
> > +    shah = shared_entry_header(gt, gref);
> > +    status = ( gt->gt_version == 2 ) ?
> 
> Stray blanks. Further down in a similar construct you even omit the
> parentheses, which you could as well do here too. Again also below.

Ok.

> 
> > +        &status_entry(gt, gref) :
> > +        &shah->flags;
> 
> The whole thing does not fit on a line?

I'll check.

> 
> > +    rc = -EACCES;
> > +    if ( (shah->flags & GTF_type_mask) != GTF_permit_access ||
> > +         (shah->flags & GTF_sub_page) )
> > +        goto release;
> 
> So transitive grants are okay despite there being no special
> handling anywhere in the function?
> 

No, they do need to be avoided.

> > +    rc = -ERANGE;
> > +    if ( act->pin && ((act->domid != currd->domain_id) ||
> > +                      (act->pin & 0x80808080U) != 0) )
> 
> You want to check only two of the four top bits, as you only add in
> GNTPIN_dev{r,w}_inc below.
> 

True.

> > +        goto release;
> > +
> > +    rc = -EINVAL;
> > +    if ( !act->pin ||
> > +         (!readonly && !(act->pin & GNTPIN_devw_mask)) ) {
> > +        if ( _set_status(gt->gt_version, currd->domain_id, readonly,
> > +                         0, shah, act, status) != GNTST_okay )
> > +            goto release;
> > +    }
> 
> Please combine the two if()-s.
> 

Ok.

> > +    if ( !act->pin )
> > +    {
> > +        gfn_t gfn = gt->gt_version == 1 ?
> > +            _gfn(shared_entry_v1(gt, gref).frame) :
> > +            _gfn(shared_entry_v2(gt, gref).full_page.frame);
> > +        struct page_info *page;
> > +
> > +        rc =  get_paged_gfn(d, gfn, readonly, NULL, &page);
> > +        if ( rc )
> > +            goto clear;
> > +
> > +        act_set_gfn(act, gfn);
> > +        act->mfn = page_to_mfn(page);
> > +        act->domid = currd->domain_id;
> > +        act->start = 0;
> > +        act->length = PAGE_SIZE;
> > +        act->is_sub_page = false;
> > +        act->trans_domain = d;
> > +        act->trans_gref = gref;
> > +    }
> > +    else
> > +    {
> > +        ASSERT(mfn_valid(act->mfn));
> > +        if ( !get_page(mfn_to_page(act->mfn), d) )
> > +            goto clear;
> > +    }
> 
> Don't you also need to acquire a write ref here if !readonly?
> 

Yes, perhaps I should do a get_page_type() in iommuop_map() regardless of
whether the page comes from a gfn or a gref lookup.
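
Something along these lines, perhaps (a rough sketch; 'page', 'readonly',
'owner' and the failure label are assumed from the surrounding
iommuop_map() code):

    /* 'owner' is the domain owning the page (currd for a gfn lookup,
     * the granting domain for a gref lookup). */
    if ( !get_page(page, owner) )
        goto fail;

    /* Also take a writable type reference when the mapping is to be
     * writable. */
    if ( !readonly && !get_page_type(page, PGT_writable_page) )
    {
        put_page(page);
        goto fail;
    }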

  Paul

> Jan
> 



* Re: [PATCH v6 07/14] x86: add iommu_op to query reserved ranges
  2018-09-07 11:01   ` Jan Beulich
  2018-09-11  9:34     ` Paul Durrant
@ 2018-09-13  6:11     ` Tian, Kevin
  1 sibling, 0 replies; 111+ messages in thread
From: Tian, Kevin @ 2018-09-13  6:11 UTC (permalink / raw)
  To: Jan Beulich, Paul Durrant
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Tim Deegan, xen-devel

> From: Jan Beulich
> Sent: Friday, September 7, 2018 7:01 PM
> 
> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> > This patch adds an iommu_op to allow the domain IOMMU reserved ranges
> > to be queried by the guest.
> >
> > NOTE: The number of reserved ranges is determined by system firmware, in
> >       conjunction with Xen command line options, and is expected to be
> >       small. Thus, to avoid over-complicating the code, there is no
> >       pre-emption check within the operation.
> 
> Hmm, RMRRs reportedly can cover a fair part of (the entire?) frame
> buffer of a graphics device.
> 

But it is still just one range.

* Re: [PATCH v6 09/14] mm / iommu: include need_iommu() test in iommu_use_hap_pt()
  2018-09-11  9:47       ` Jan Beulich
@ 2018-09-13  6:23         ` Tian, Kevin
  2018-09-13  8:34           ` Paul Durrant
  0 siblings, 1 reply; 111+ messages in thread
From: Tian, Kevin @ 2018-09-13  6:23 UTC (permalink / raw)
  To: Jan Beulich, Paul Durrant
  Cc: Stefano Stabellini, Suravee Suthikulpanit, Andrew Cooper,
	george.dunlap, Julien Grall, Nakajima, Jun, xen-devel

> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Tuesday, September 11, 2018 5:48 PM
> 
> >>> On 11.09.18 at 11:39, <Paul.Durrant@citrix.com> wrote:
> >>  -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> Sent: 07 September 2018 12:20
> >> To: Paul Durrant <Paul.Durrant@citrix.com>
> >> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
> >> <Andrew.Cooper3@citrix.com>; George Dunlap
> >> <George.Dunlap@citrix.com>; Jun Nakajima <jun.nakajima@intel.com>;
> >> Stefano Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> >> devel@lists.xenproject.org>
> >> Subject: Re: [PATCH v6 09/14] mm / iommu: include need_iommu() test in
> >> iommu_use_hap_pt()
> >>
> >> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> >> > The name 'iommu_use_hap_pt' suggests that the P2M table is in use
> >> > as the domain's IOMMU pagetable which, prior to this patch, is not
> >> > strictly true since the macro did not test whether the domain
> >> > actually has IOMMU mappings.
> >>
> >> Hmm, I would never have implied "has IOMMU mappings" from this
> >> variable name. To me it has always been "use HAP page tables for
> >> IOMMU if an IOMMU is in use". The code change looks sane, but
> >> I'm not sure it is a clear improvement. Hence I wonder whether you
> >> have a need for this change in subsequent patches which goes
> >> beyond what you say above.
> >>
> >
> > I could take it out but it would mean a non-trivial rebase, and to me - if
> > true - the name still implies the IOMMU is in use for the domain so
> > I'd like to keep the change.
> 
> Let's broaden the Cc list a little - perhaps we can get further opinions this
> way.
> 

The previous definition checks hap_enabled:

-#define iommu_use_hap_pt(d) (hap_enabled(d) && iommu_hap_pt_share)

If we follow Jan's reading, the name should mean "use HAP page tables
for the IOMMU if an IOMMU is in use and HAP is enabled", and then
hap_enabled should be moved out too.

Including the IOMMU check in the macro looks more consistent.
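
I.e. roughly this (a sketch based on the patch title; need_iommu() being
the pre-series predicate):

#define iommu_use_hap_pt(d) \
    (hap_enabled(d) && need_iommu(d) && iommu_hap_pt_share)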

Thanks
Kevin


* Re: [PATCH v6 13/14] x86: add iommu_ops to modify and flush IOMMU mappings
  2018-09-12  8:02     ` Paul Durrant
  2018-09-12  8:27       ` Jan Beulich
@ 2018-09-13  6:41       ` Tian, Kevin
  2018-09-13  8:32         ` Paul Durrant
  2018-09-13  8:49         ` Jan Beulich
  1 sibling, 2 replies; 111+ messages in thread
From: Tian, Kevin @ 2018-09-13  6:41 UTC (permalink / raw)
  To: Paul Durrant, 'Jan Beulich'
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Ian Jackson, xen-devel

> From: Paul Durrant
> Sent: Wednesday, September 12, 2018 4:02 PM
> 
> 
> > In order to avoid shooting down all pre-existing RAM mappings - is
> > there no way the page table entries could be marked to identify
> > their origin?
> >
> 
> I don't know whether that is possible; I'll have to find specs for Intel and
> AMD IOMMUs and see if they have PTE bits available for such a use.

There are ignored bits available in the PTEs.
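
For illustration only (the bit position here is an assumption, not taken
from either spec; both VT-d and AMD-Vi define PTE bits that the hardware
ignores):

    /* Tag entries created via the PV-IOMMU interface. */
    #define IOMMU_PTE_PV_ORIGIN (1ul << 62)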

> 
> > I also have another more general concern: Allowing the guest to
> > manipulate its IOMMU page tables means that it can deliberately
> > shatter large pages, growing the overall memory footprint of the
> > domain. I'm hesitant to say this, but I'm afraid that resource
> > tracking of such "behind the scenes" allocations might be a
> > necessary prereq for the PV IOMMU work.
> >
> 
> Remember that PV-IOMMU is only available for dom0 as it stands (and that
> is the only use-case that XenServer currently has) so I think that, whilst the
> concern is valid, there is no real danger in putting the code in without
> such tracking. Such work can be deferred until PV-IOMMU is made available
> to de-privileged guests... if that facility is needed.
> 

I don't get why this is PV-IOMMU specific. A guest can always manipulate
its own CPU page tables to shatter large pages too...

Thanks
Kevin

* Re: [PATCH v6 13/14] x86: add iommu_ops to modify and flush IOMMU mappings
  2018-09-13  6:41       ` Tian, Kevin
@ 2018-09-13  8:32         ` Paul Durrant
  2018-09-13  8:49         ` Jan Beulich
  1 sibling, 0 replies; 111+ messages in thread
From: Paul Durrant @ 2018-09-13  8:32 UTC (permalink / raw)
  To: Kevin Tian, 'Jan Beulich'
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim (Xen.org),
	George Dunlap, Julien Grall, Ian Jackson, xen-devel

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: 13 September 2018 07:41
> To: Paul Durrant <Paul.Durrant@citrix.com>; 'Jan Beulich'
> <JBeulich@suse.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>; Wei Liu
> <wei.liu2@citrix.com>; Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>;
> Andrew Cooper <Andrew.Cooper3@citrix.com>; Tim (Xen.org)
> <tim@xen.org>; George Dunlap <George.Dunlap@citrix.com>; Julien Grall
> <julien.grall@arm.com>; xen-devel <xen-devel@lists.xenproject.org>; Ian
> Jackson <Ian.Jackson@citrix.com>
> Subject: RE: [PATCH v6 13/14] x86: add iommu_ops to modify and flush
> IOMMU mappings
> 
> > From: Paul Durrant
> > Sent: Wednesday, September 12, 2018 4:02 PM
> >
> >
> > > In order to avoid shooting down all pre-existing RAM mappings - is
> > > there no way the page table entries could be marked to identify
> > > their origin?
> > >
> >
> > I don't know whether that is possible; I'll have to find specs for Intel and
> > AMD IOMMUs and see if they have PTE bits available for such a use.
> 
> There are ignored bits available in the PTEs.
> 
> >
> > > I also have another more general concern: Allowing the guest to
> > > manipulate its IOMMU page tables means that it can deliberately
> > > shatter large pages, growing the overall memory footprint of the
> > > domain. I'm hesitant to say this, but I'm afraid that resource
> > > tracking of such "behind the scenes" allocations might be a
> > > necessary prereq for the PV IOMMU work.
> > >
> >
> > Remember that PV-IOMMU is only available for dom0 as it stands (and that
> > is the only use-case that XenServer currently has) so I think that, whilst the
> > concern is valid, there is no real danger in putting the code in without
> > such tracking. Such work can be deferred until PV-IOMMU is made available
> > to de-privileged guests... if that facility is needed.
> >
> 
> I don't get why this is PV-IOMMU specific. A guest can always manipulate
> its own CPU page tables to shatter large pages too...
> 

At the moment that is true. I guess Jan doesn't want to introduce another
way for a guest to cause Xen to consume large amounts of memory.

  Paul

> Thanks
> Kevin

* Re: [PATCH v6 09/14] mm / iommu: include need_iommu() test in iommu_use_hap_pt()
  2018-09-13  6:23         ` Tian, Kevin
@ 2018-09-13  8:34           ` Paul Durrant
  0 siblings, 0 replies; 111+ messages in thread
From: Paul Durrant @ 2018-09-13  8:34 UTC (permalink / raw)
  To: Kevin Tian, Jan Beulich
  Cc: Stefano Stabellini, Suravee Suthikulpanit, Andrew Cooper,
	George Dunlap, Julien Grall, Nakajima, Jun, xen-devel

> -----Original Message-----
> From: Tian, Kevin [mailto:kevin.tian@intel.com]
> Sent: 13 September 2018 07:24
> To: Jan Beulich <JBeulich@suse.com>; Paul Durrant
> <Paul.Durrant@citrix.com>
> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>; Julien Grall
> <julien.grall@arm.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>;
> George Dunlap <George.Dunlap@citrix.com>; Nakajima, Jun
> <jun.nakajima@intel.com>; Stefano Stabellini <sstabellini@kernel.org>; xen-
> devel <xen-devel@lists.xenproject.org>
> Subject: RE: [PATCH v6 09/14] mm / iommu: include need_iommu() test in
> iommu_use_hap_pt()
> 
> > From: Jan Beulich [mailto:JBeulich@suse.com]
> > Sent: Tuesday, September 11, 2018 5:48 PM
> >
> > >>> On 11.09.18 at 11:39, <Paul.Durrant@citrix.com> wrote:
> > >>  -----Original Message-----
> > >> From: Jan Beulich [mailto:JBeulich@suse.com]
> > >> Sent: 07 September 2018 12:20
> > >> To: Paul Durrant <Paul.Durrant@citrix.com>
> > >> Cc: Julien Grall <julien.grall@arm.com>; Andrew Cooper
> > >> <Andrew.Cooper3@citrix.com>; George Dunlap
> > >> <George.Dunlap@citrix.com>; Jun Nakajima
> <jun.nakajima@intel.com>;
> > >> Stefano Stabellini <sstabellini@kernel.org>; xen-devel <xen-
> > >> devel@lists.xenproject.org>
> > >> Subject: Re: [PATCH v6 09/14] mm / iommu: include need_iommu() test in
> > >> iommu_use_hap_pt()
> > >>
> > >> >>> On 23.08.18 at 11:47, <paul.durrant@citrix.com> wrote:
> > >> > The name 'iommu_use_hap_pt' suggests that the P2M table is in use
> > >> > as the domain's IOMMU pagetable which, prior to this patch, is not
> > >> > strictly true since the macro did not test whether the domain
> > >> > actually has IOMMU mappings.
> > >>
> > >> Hmm, I would never have implied "has IOMMU mappings" from this
> > >> variable name. To me it has always been "use HAP page tables for
> > >> IOMMU if an IOMMU is in use". The code change looks sane, but
> > >> I'm not sure it is a clear improvement. Hence I wonder whether you
> > >> have a need for this change in subsequent patches which goes
> > >> beyond what you say above.
> > >>
> > >
> > > I could take it out but it would mean a non-trivial rebase, and to me - if
> > > true - the name still implies the IOMMU is in use for the domain so
> > > I'd like to keep the change.
> >
> > Let's broaden the Cc list a little - perhaps we can get further opinions this
> > way.
> >
> 
> The previous definition checks hap_enabled:
> 
> -#define iommu_use_hap_pt(d) (hap_enabled(d) && iommu_hap_pt_share)
> 
> If we follow Jan's reading, the name should mean "use HAP page tables
> for the IOMMU if an IOMMU is in use and HAP is enabled", and then
> hap_enabled should be moved out too.
> 
> Including the IOMMU check in the macro looks more consistent.
> 

Thanks. I'll keep this patch.

  Paul

> Thanks
> Kevin


* Re: [PATCH v6 13/14] x86: add iommu_ops to modify and flush IOMMU mappings
  2018-09-13  6:41       ` Tian, Kevin
  2018-09-13  8:32         ` Paul Durrant
@ 2018-09-13  8:49         ` Jan Beulich
  1 sibling, 0 replies; 111+ messages in thread
From: Jan Beulich @ 2018-09-13  8:49 UTC (permalink / raw)
  To: Kevin Tian
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	Andrew Cooper, Tim Deegan, george.dunlap, Julien Grall,
	Paul Durrant, xen-devel, Ian Jackson

>>> On 13.09.18 at 08:41, <kevin.tian@intel.com> wrote:
>>  From: Paul Durrant
>> Sent: Wednesday, September 12, 2018 4:02 PM
>> 
>> > I also have another more general concern: Allowing the guest to
>> > manipulate its IOMMU page tables means that it can deliberately
>> > shatter large pages, growing the overall memory footprint of the
>> > domain. I'm hesitant to say this, but I'm afraid that resource
>> > tracking of such "behind the scenes" allocations might be a
>> > necessary prereq for the PV IOMMU work.
>> >
>> 
>> Remember that PV-IOMMU is only available for dom0 as it stands (and that
>> is the only use-case that XenServer currently has) so I think that, whilst the
>> concern is valid, there is no real danger in putting the code in without
>> such tracking. Such work can be deferred until PV-IOMMU is made available
>> to de-privileged guests... if that facility is needed.
> 
> I don't get why this is PV-IOMMU specific. A guest can always manipulate
> its own CPU page tables to shatter large pages too...

Hmm, good point. I keep forgetting that we allow guests to fiddle with
their own p2m.

Jan




end of thread, other threads:[~2018-09-13  8:49 UTC | newest]

Thread overview: 111+ messages
2018-08-23  9:46 [PATCH v6 00/14] paravirtual IOMMU interface Paul Durrant
2018-08-23  9:46 ` [PATCH v6 01/14] iommu: introduce the concept of BFN Paul Durrant
2018-08-30 15:59   ` Jan Beulich
2018-09-03  8:23     ` Paul Durrant
2018-09-03 11:46       ` Jan Beulich
2018-09-04  6:48         ` Tian, Kevin
2018-09-04  8:32           ` Jan Beulich
2018-09-04  8:37             ` Tian, Kevin
2018-09-04  8:47               ` Jan Beulich
2018-09-04  8:49                 ` Paul Durrant
2018-09-04  9:08                   ` Jan Beulich
2018-09-05  0:42                     ` Tian, Kevin
2018-09-05  6:48                       ` Jan Beulich
2018-09-05  6:56                         ` Tian, Kevin
2018-09-05  7:11                           ` Jan Beulich
2018-09-05  9:13                             ` Paul Durrant
2018-09-05  9:38                               ` Jan Beulich
2018-09-06 10:36                                 ` Paul Durrant
2018-09-06 13:13                                   ` Jan Beulich
2018-09-06 14:54                                     ` Paul Durrant
2018-09-07  1:47                                       ` Tian, Kevin
2018-09-07  6:24                                         ` Jan Beulich
2018-09-07  8:13                                           ` Paul Durrant
2018-09-07  8:16                                             ` Tian, Kevin
2018-09-07  8:25                                               ` Paul Durrant
2018-08-23  9:46 ` [PATCH v6 02/14] iommu: make use of type-safe BFN and MFN in exported functions Paul Durrant
2018-09-04 10:29   ` Jan Beulich
2018-08-23  9:47 ` [PATCH v6 03/14] iommu: push use of type-safe BFN and MFN into iommu_ops Paul Durrant
2018-09-04 10:32   ` Jan Beulich
2018-08-23  9:47 ` [PATCH v6 04/14] iommu: don't domain_crash() inside iommu_map/unmap_page() Paul Durrant
2018-09-04 10:38   ` Jan Beulich
2018-09-04 10:39     ` Paul Durrant
2018-08-23  9:47 ` [PATCH v6 05/14] public / x86: introduce __HYPERCALL_iommu_op Paul Durrant
2018-09-04 11:50   ` Jan Beulich
2018-09-04 12:23     ` Paul Durrant
2018-09-04 12:55       ` Jan Beulich
2018-09-04 13:17         ` Paul Durrant
2018-09-07 10:52   ` Jan Beulich
2018-08-23  9:47 ` [PATCH v6 06/14] iommu: track reserved ranges using a rangeset Paul Durrant
2018-09-07 10:40   ` Jan Beulich
2018-09-11  9:28     ` Paul Durrant
2018-08-23  9:47 ` [PATCH v6 07/14] x86: add iommu_op to query reserved ranges Paul Durrant
2018-09-07 11:01   ` Jan Beulich
2018-09-11  9:34     ` Paul Durrant
2018-09-11  9:43       ` Jan Beulich
2018-09-13  6:11     ` Tian, Kevin
2018-08-23  9:47 ` [PATCH v6 08/14] vtd: add lookup_page method to iommu_ops Paul Durrant
2018-09-07 11:11   ` Jan Beulich
2018-09-07 12:36     ` Paul Durrant
2018-09-07 14:56       ` Jan Beulich
2018-09-07 15:24         ` Paul Durrant
2018-09-07 15:52           ` Jan Beulich
2018-09-12  8:31     ` Paul Durrant
2018-09-12  8:43       ` Jan Beulich
2018-09-12  8:45         ` Paul Durrant
2018-09-12  8:51           ` Paul Durrant
2018-09-12  8:53             ` Paul Durrant
2018-09-12  9:03               ` Jan Beulich
2018-09-12  9:05                 ` Paul Durrant
2018-09-12  9:12                   ` Jan Beulich
2018-09-12  9:15                     ` Paul Durrant
2018-09-12  9:21                       ` Jan Beulich
2018-09-12  9:30                         ` Paul Durrant
2018-09-12 10:07                           ` Jan Beulich
2018-09-12 10:09                             ` Paul Durrant
2018-09-12 12:15                               ` Jan Beulich
2018-09-12 12:22                                 ` Paul Durrant
2018-09-12 12:39                                   ` Jan Beulich
2018-09-12 12:53                                     ` Paul Durrant
2018-09-12 13:19                                       ` Jan Beulich
2018-09-12 13:25                                         ` Paul Durrant
2018-09-12 13:39                                           ` Jan Beulich
2018-09-12 13:43                                             ` Paul Durrant
2018-09-12  8:59           ` Jan Beulich
2018-08-23  9:47 ` [PATCH v6 09/14] mm / iommu: include need_iommu() test in iommu_use_hap_pt() Paul Durrant
2018-09-07 11:20   ` Jan Beulich
2018-09-11  9:39     ` Paul Durrant
2018-09-11  9:47       ` Jan Beulich
2018-09-13  6:23         ` Tian, Kevin
2018-09-13  8:34           ` Paul Durrant
2018-08-23  9:47 ` [PATCH v6 10/14] mm / iommu: split need_iommu() into has_iommu_pt() and need_iommu_pt_sync() Paul Durrant
2018-08-23 11:10   ` Razvan Cojocaru
2018-09-11 14:31   ` Jan Beulich
2018-09-11 15:40     ` Paul Durrant
2018-09-12  6:45       ` Jan Beulich
2018-09-12  8:07         ` Paul Durrant
2018-08-23  9:47 ` [PATCH v6 11/14] x86: add iommu_op to enable modification of IOMMU mappings Paul Durrant
2018-09-11 14:48   ` Jan Beulich
2018-09-11 15:52     ` Paul Durrant
2018-09-12  6:53       ` Jan Beulich
2018-09-12  8:04         ` Paul Durrant
2018-08-23  9:47 ` [PATCH v6 12/14] memory: add get_paged_gfn() as a wrapper Paul Durrant
2018-08-23 10:24   ` Julien Grall
2018-08-23 10:30     ` Paul Durrant
2018-09-11 14:56   ` Jan Beulich
2018-09-12  9:10     ` Paul Durrant
2018-09-12  9:15       ` Jan Beulich
2018-09-12 10:01         ` George Dunlap
2018-09-12 10:08           ` Paul Durrant
2018-09-12 10:10           ` Jan Beulich
2018-08-23  9:47 ` [PATCH v6 13/14] x86: add iommu_ops to modify and flush IOMMU mappings Paul Durrant
2018-09-11 15:15   ` Jan Beulich
2018-09-12  7:03   ` Jan Beulich
2018-09-12  8:02     ` Paul Durrant
2018-09-12  8:27       ` Jan Beulich
2018-09-13  6:41       ` Tian, Kevin
2018-09-13  8:32         ` Paul Durrant
2018-09-13  8:49         ` Jan Beulich
2018-08-23  9:47 ` [PATCH v6 14/14] x86: extend the map and unmap iommu_ops to support grant references Paul Durrant
2018-09-12 14:12   ` Jan Beulich
2018-09-12 16:28     ` Paul Durrant
