* [PATCH v3 00/10] Check VT-d Device-TLB flush error
@ 2016-04-29  9:25 Quan Xu
  2016-04-29  9:25 ` [PATCH v3 01/10] vt-d: fix the IOMMU flush issue Quan Xu
                   ` (9 more replies)
  0 siblings, 10 replies; 39+ messages in thread
From: Quan Xu @ 2016-04-29  9:25 UTC (permalink / raw)
  To: xen-devel; +Cc: dario.faggioli, Quan Xu

This patch set is a prerequisite for the series 'VT-d Device-TLB flush issue'.

When an IOMMU Device-TLB flush times out, Xen currently calls panic(). However,
the existing panic() is going to be eliminated, so we must propagate the IOMMU
Device-TLB flush error up the call trees.

This patch set is also based on the discussion of 'abstract model of IOMMU unmaping/mapping failures'

In this v3:
    - flush on a best-effort basis when an error occurs.
    - add __must_check annotations.


Quan Xu (10):
  vt-d: fix the IOMMU flush issue
  IOMMU: handle IOMMU mapping and unmapping failures
  IOMMU/MMU: enhance the call trees of IOMMU unmapping and mapping
  vt-d: propagate IOMMU Device-TLB flush error up to IOMMU unmapping.
  vt-d: propagate IOMMU Device-TLB flush error up to IOMMU mapping.
  propagate IOMMU Device-TLB flush error up to iommu_iotlb_flush{,_all}
    (top level ones).
  IOMMU: propagate IOMMU Device-TLB flush error up to
    iommu_iotlb_flush{,_all} (leaf ones).
  vt-d/ept: propagate IOMMU Device-TLB flush error up to EPT update.
  IOMMU: propagate IOMMU Device-TLB flush error up to IOMMU suspending
  vt-d: propagate error up to ME phantom function mapping and unmapping

 xen/arch/arm/p2m.c                            |   5 +-
 xen/arch/x86/acpi/power.c                     |  15 ++-
 xen/arch/x86/mm.c                             |  13 +-
 xen/arch/x86/mm/p2m-ept.c                     |  30 ++++-
 xen/arch/x86/mm/p2m-pt.c                      |  24 +++-
 xen/arch/x86/mm/p2m.c                         |  11 +-
 xen/common/memory.c                           |  14 ++-
 xen/drivers/passthrough/amd/iommu_init.c      |   9 +-
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |   2 +-
 xen/drivers/passthrough/arm/smmu.c            |  12 +-
 xen/drivers/passthrough/iommu.c               |  64 ++++++++--
 xen/drivers/passthrough/vtd/extern.h          |   3 +-
 xen/drivers/passthrough/vtd/iommu.c           | 168 ++++++++++++++++++--------
 xen/drivers/passthrough/vtd/quirks.c          |  28 +++--
 xen/drivers/passthrough/x86/iommu.c           |   5 +-
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |   2 +-
 xen/include/asm-x86/iommu.h                   |   3 +-
 xen/include/xen/iommu.h                       |  13 +-
 18 files changed, 313 insertions(+), 108 deletions(-)

-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


* [PATCH v3 01/10] vt-d: fix the IOMMU flush issue
  2016-04-29  9:25 [PATCH v3 00/10] Check VT-d Device-TLB flush error Quan Xu
@ 2016-04-29  9:25 ` Quan Xu
  2016-05-04  1:26   ` Tian, Kevin
  2016-04-29  9:25 ` [PATCH v3 02/10] IOMMU: handle IOMMU mapping and unmapping failures Quan Xu
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 39+ messages in thread
From: Quan Xu @ 2016-04-29  9:25 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Quan Xu, Andrew Cooper, dario.faggioli,
	Jan Beulich, Feng Wu

The return value from the IOMMU flush interfaces may be positive; this
indicates that the caller needs to flush the write buffer, and is not a
failure.

When the return value is positive, this patch fixes the flush issue as
follows:
  - call iommu_flush_write_buffer() to flush the write buffer.
  - return zero.

Signed-off-by: Quan Xu <quan.xu@intel.com>

CC: Kevin Tian <kevin.tian@intel.com>
CC: Feng Wu <feng.wu@intel.com>
CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/drivers/passthrough/vtd/iommu.c | 94 +++++++++++++++++++++++++------------
 xen/include/asm-x86/iommu.h         |  2 +-
 2 files changed, 64 insertions(+), 32 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 5ad25dc..6e2e43a 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -558,14 +558,16 @@ static void iommu_flush_all(void)
     }
 }
 
-static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
-        int dma_old_pte_present, unsigned int page_count)
+static int iommu_flush_iotlb(struct domain *d, unsigned long gfn,
+                             bool_t dma_old_pte_present,
+                             unsigned int page_count)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
     struct acpi_drhd_unit *drhd;
     struct iommu *iommu;
     int flush_dev_iotlb;
     int iommu_domid;
+    int rc, ret = 0;
 
     /*
      * No need pcideves_lock here because we have flush
@@ -584,29 +586,35 @@ static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
             continue;
 
         if ( page_count != 1 || gfn == INVALID_GFN )
-        {
-            if ( iommu_flush_iotlb_dsi(iommu, iommu_domid,
-                        0, flush_dev_iotlb) )
-                iommu_flush_write_buffer(iommu);
-        }
+            rc = iommu_flush_iotlb_dsi(iommu, iommu_domid,
+                                       0, flush_dev_iotlb);
         else
+            rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
+                                       (paddr_t)gfn << PAGE_SHIFT_4K,
+                                       PAGE_ORDER_4K,
+                                       !dma_old_pte_present,
+                                       flush_dev_iotlb);
+
+        if ( rc > 0 )
         {
-            if ( iommu_flush_iotlb_psi(iommu, iommu_domid,
-                        (paddr_t)gfn << PAGE_SHIFT_4K, PAGE_ORDER_4K,
-                        !dma_old_pte_present, flush_dev_iotlb) )
-                iommu_flush_write_buffer(iommu);
+            iommu_flush_write_buffer(iommu);
+            ret = 0;
         }
+        else if ( rc < 0 )
+            ret = rc;
     }
+
+    return ret;
 }
 
 static void intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count)
 {
-    __intel_iommu_iotlb_flush(d, gfn, 1, page_count);
+    iommu_flush_iotlb(d, gfn, 1, page_count);
 }
 
 static void intel_iommu_iotlb_flush_all(struct domain *d)
 {
-    __intel_iommu_iotlb_flush(d, INVALID_GFN, 0, 0);
+    iommu_flush_iotlb(d, INVALID_GFN, 0, 0);
 }
 
 /* clear one page's page table */
@@ -640,7 +648,7 @@ static void dma_pte_clear_one(struct domain *domain, u64 addr)
     iommu_flush_cache_entry(pte, sizeof(struct dma_pte));
 
     if ( !this_cpu(iommu_dont_flush_iotlb) )
-        __intel_iommu_iotlb_flush(domain, addr >> PAGE_SHIFT_4K, 1, 1);
+        iommu_flush_iotlb(domain, addr >> PAGE_SHIFT_4K, 1, 1);
 
     unmap_vtd_domain_page(page);
 }
@@ -1281,6 +1289,7 @@ int domain_context_mapping_one(
     u64 maddr, pgd_maddr;
     u16 seg = iommu->intel->drhd->segment;
     int agaw;
+    int rc;
 
     ASSERT(pcidevs_locked());
     spin_lock(&iommu->lock);
@@ -1394,13 +1403,19 @@ int domain_context_mapping_one(
     spin_unlock(&iommu->lock);
 
     /* Context entry was previously non-present (with domid 0). */
-    if ( iommu_flush_context_device(iommu, 0, (((u16)bus) << 8) | devfn,
-                                    DMA_CCMD_MASK_NOBIT, 1) )
-        iommu_flush_write_buffer(iommu);
-    else
+    rc = iommu_flush_context_device(iommu, 0, (((u16)bus) << 8) | devfn,
+                                    DMA_CCMD_MASK_NOBIT, 1);
+
+    if ( !rc )
     {
         int flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0;
-        iommu_flush_iotlb_dsi(iommu, 0, 1, flush_dev_iotlb);
+        rc = iommu_flush_iotlb_dsi(iommu, 0, 1, flush_dev_iotlb);
+    }
+
+    if ( rc > 0 )
+    {
+        iommu_flush_write_buffer(iommu);
+        rc = 0;
     }
 
     set_bit(iommu->index, &hd->arch.iommu_bitmap);
@@ -1410,7 +1425,7 @@ int domain_context_mapping_one(
     if ( !seg )
         me_wifi_quirk(domain, bus, devfn, MAP_ME_PHANTOM_FUNC);
 
-    return 0;
+    return rc;
 }
 
 static int domain_context_mapping(
@@ -1505,6 +1520,7 @@ int domain_context_unmap_one(
     struct context_entry *context, *context_entries;
     u64 maddr;
     int iommu_domid;
+    int rc;
 
     ASSERT(pcidevs_locked());
     spin_lock(&iommu->lock);
@@ -1532,14 +1548,20 @@ int domain_context_unmap_one(
         return -EINVAL;
     }
 
-    if ( iommu_flush_context_device(iommu, iommu_domid,
+    rc = iommu_flush_context_device(iommu, iommu_domid,
                                     (((u16)bus) << 8) | devfn,
-                                    DMA_CCMD_MASK_NOBIT, 0) )
-        iommu_flush_write_buffer(iommu);
-    else
+                                    DMA_CCMD_MASK_NOBIT, 0);
+
+    if ( !rc )
     {
         int flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0;
-        iommu_flush_iotlb_dsi(iommu, iommu_domid, 0, flush_dev_iotlb);
+        rc = iommu_flush_iotlb_dsi(iommu, iommu_domid, 0, flush_dev_iotlb);
+    }
+
+    if ( rc > 0 )
+    {
+        iommu_flush_write_buffer(iommu);
+        rc = 0;
     }
 
     spin_unlock(&iommu->lock);
@@ -1548,7 +1570,7 @@ int domain_context_unmap_one(
     if ( !iommu->intel->drhd->segment )
         me_wifi_quirk(domain, bus, devfn, UNMAP_ME_PHANTOM_FUNC);
 
-    return 0;
+    return rc;
 }
 
 static int domain_context_unmap(
@@ -1738,7 +1760,7 @@ static int intel_iommu_map_page(
     unmap_vtd_domain_page(page);
 
     if ( !this_cpu(iommu_dont_flush_iotlb) )
-        __intel_iommu_iotlb_flush(d, gfn, dma_pte_present(old), 1);
+        iommu_flush_iotlb(d, gfn, dma_pte_present(old), 1);
 
     return 0;
 }
@@ -1754,14 +1776,15 @@ static int intel_iommu_unmap_page(struct domain *d, unsigned long gfn)
     return 0;
 }
 
-void iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
-                     int order, int present)
+int iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
+                    int order, bool_t present)
 {
     struct acpi_drhd_unit *drhd;
     struct iommu *iommu = NULL;
     struct hvm_iommu *hd = domain_hvm_iommu(d);
     int flush_dev_iotlb;
     int iommu_domid;
+    int rc, ret = 0;
 
     iommu_flush_cache_entry(pte, sizeof(struct dma_pte));
 
@@ -1775,11 +1798,20 @@ void iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
         iommu_domid= domain_iommu_domid(d, iommu);
         if ( iommu_domid == -1 )
             continue;
-        if ( iommu_flush_iotlb_psi(iommu, iommu_domid,
+
+        rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
                                    (paddr_t)gfn << PAGE_SHIFT_4K,
-                                   order, !present, flush_dev_iotlb) )
+                                   order, !present, flush_dev_iotlb);
+        if ( rc > 0 )
+        {
             iommu_flush_write_buffer(iommu);
+            ret = 0;
+        }
+        else if ( rc < 0 )
+            ret = rc;
     }
+
+    return ret;
 }
 
 static int __init vtd_ept_page_compatible(struct iommu *iommu)
diff --git a/xen/include/asm-x86/iommu.h b/xen/include/asm-x86/iommu.h
index f22b3a5..1ae0c5a 100644
--- a/xen/include/asm-x86/iommu.h
+++ b/xen/include/asm-x86/iommu.h
@@ -26,7 +26,7 @@ int iommu_setup_hpet_msi(struct msi_desc *);
 
 /* While VT-d specific, this must get declared in a generic header. */
 int adjust_vtd_irq_affinities(void);
-void iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte, int order, int present);
+int iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte, int order, bool_t present);
 bool_t iommu_supports_eim(void);
 int iommu_enable_x2apic_IR(void);
 void iommu_disable_x2apic_IR(void);
-- 
1.9.1



* [PATCH v3 02/10] IOMMU: handle IOMMU mapping and unmapping failures
  2016-04-29  9:25 [PATCH v3 00/10] Check VT-d Device-TLB flush error Quan Xu
  2016-04-29  9:25 ` [PATCH v3 01/10] vt-d: fix the IOMMU flush issue Quan Xu
@ 2016-04-29  9:25 ` Quan Xu
  2016-05-04  1:28   ` Tian, Kevin
  2016-04-29  9:25 ` [PATCH v3 03/10] IOMMU/MMU: enhance the call trees of IOMMU unmapping and mapping Quan Xu
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 39+ messages in thread
From: Quan Xu @ 2016-04-29  9:25 UTC (permalink / raw)
  To: xen-devel; +Cc: dario.faggioli, Jan Beulich, Quan Xu

Treat IOMMU mapping and unmapping failures as fatal to the domain
(with the exception of the hardware domain).

If IOMMU mapping or unmapping fails, crash the domain (with the
exception of the hardware domain) and propagate the error up the
call trees.

Signed-off-by: Quan Xu <quan.xu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>

CC: Jan Beulich <jbeulich@suse.com>
---
 xen/drivers/passthrough/iommu.c | 30 ++++++++++++++++++++++++++++--
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index b64676f..a0003ac 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -243,21 +243,47 @@ int iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
                    unsigned int flags)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
+    int rc;
 
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    return hd->platform_ops->map_page(d, gfn, mfn, flags);
+    rc = hd->platform_ops->map_page(d, gfn, mfn, flags);
+
+    if ( rc )
+    {
+        if ( is_hardware_domain(d) )
+            printk(XENLOG_ERR
+                   "iommu_map_page: IOMMU mapping gfn %#lx mfn %#lx failed for dom%d.",
+                   gfn, mfn, d->domain_id);
+        else
+            domain_crash(d);
+    }
+
+    return rc;
 }
 
 int iommu_unmap_page(struct domain *d, unsigned long gfn)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
+    int rc;
 
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    return hd->platform_ops->unmap_page(d, gfn);
+    rc = hd->platform_ops->unmap_page(d, gfn);
+
+    if ( rc )
+    {
+        if ( is_hardware_domain(d) )
+            printk(XENLOG_ERR
+                   "iommu_unmap_page: IOMMU unmapping gfn %#lx failed for dom%d.",
+                   gfn, d->domain_id);
+        else
+            domain_crash(d);
+    }
+
+    return rc;
 }
 
 static void iommu_free_pagetables(unsigned long unused)
-- 
1.9.1



* [PATCH v3 03/10] IOMMU/MMU: enhance the call trees of IOMMU unmapping and mapping
  2016-04-29  9:25 [PATCH v3 00/10] Check VT-d Device-TLB flush error Quan Xu
  2016-04-29  9:25 ` [PATCH v3 01/10] vt-d: fix the IOMMU flush issue Quan Xu
  2016-04-29  9:25 ` [PATCH v3 02/10] IOMMU: handle IOMMU mapping and unmapping failures Quan Xu
@ 2016-04-29  9:25 ` Quan Xu
  2016-05-04  1:45   ` Tian, Kevin
  2016-05-04 13:48   ` George Dunlap
  2016-04-29  9:25 ` [PATCH v3 04/10] vt-d: propagate IOMMU Device-TLB flush error up to IOMMU unmapping Quan Xu
                   ` (6 subsequent siblings)
  9 siblings, 2 replies; 39+ messages in thread
From: Quan Xu @ 2016-04-29  9:25 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, George Dunlap,
	Andrew Cooper, dario.faggioli, Jun Nakajima, Quan Xu

When IOMMU mapping fails, we issue a best-effort rollback: stop the
IOMMU mapping, unmap the previously established IOMMU mappings, and
then report the error up the call trees. When rollback is not feasible
for the hardware domain (in the early initialization phase, or as a
trade-off against complexity), we proceed on a best-effort basis and
only log an error message.

IOMMU unmapping should continue despite an error, in an attempt to do
best-effort cleanup.

Signed-off-by: Quan Xu <quan.xu@intel.com>

CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: George Dunlap <george.dunlap@eu.citrix.com>
---
 xen/arch/x86/mm.c               | 13 ++++++++-----
 xen/arch/x86/mm/p2m-ept.c       | 27 +++++++++++++++++++++++++--
 xen/arch/x86/mm/p2m-pt.c        | 24 ++++++++++++++++++++----
 xen/arch/x86/mm/p2m.c           | 11 +++++++++--
 xen/drivers/passthrough/iommu.c | 14 +++++++++++++-
 5 files changed, 75 insertions(+), 14 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index a42097f..427097d 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2467,7 +2467,7 @@ static int __get_page_type(struct page_info *page, unsigned long type,
                            int preemptible)
 {
     unsigned long nx, x, y = page->u.inuse.type_info;
-    int rc = 0;
+    int rc = 0, ret = 0;
 
     ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2)));
 
@@ -2578,11 +2578,11 @@ static int __get_page_type(struct page_info *page, unsigned long type,
         if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
         {
             if ( (x & PGT_type_mask) == PGT_writable_page )
-                iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
+                ret = iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
             else if ( type == PGT_writable_page )
-                iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
-                               page_to_mfn(page),
-                               IOMMUF_readable|IOMMUF_writable);
+                ret = iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
+                                     page_to_mfn(page),
+                                     IOMMUF_readable|IOMMUF_writable);
         }
     }
 
@@ -2599,6 +2599,9 @@ static int __get_page_type(struct page_info *page, unsigned long type,
     if ( (x & PGT_partial) && !(nx & PGT_partial) )
         put_page(page);
 
+    if ( !rc )
+        rc = ret;
+
     return rc;
 }
 
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 1ed5b47..df87944 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -821,6 +821,8 @@ out:
     if ( needs_sync )
         ept_sync_domain(p2m);
 
+    ret = 0;
+
     /* For host p2m, may need to change VT-d page table.*/
     if ( rc == 0 && p2m_is_hostp2m(p2m) && need_iommu(d) &&
          need_modify_vtd_table )
@@ -831,11 +833,29 @@ out:
         {
             if ( iommu_flags )
                 for ( i = 0; i < (1 << order); i++ )
-                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
+                {
+                    rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
+
+                    if ( !ret && unlikely(rc) )
+                    {
+                        while ( i-- )
+                            iommu_unmap_page(d, gfn + i);
+
+                        ret = rc;
+                        break;
+                    }
+                }
             else
                 for ( i = 0; i < (1 << order); i++ )
-                    iommu_unmap_page(d, gfn + i);
+                {
+                    rc = iommu_unmap_page(d, gfn + i);
+
+                    if ( !ret && unlikely(rc) )
+                        ret = rc;
+                }
         }
+
+        rc = 0;
     }
 
     unmap_domain_page(table);
@@ -850,6 +870,9 @@ out:
     if ( rc == 0 && p2m_is_hostp2m(p2m) )
         p2m_altp2m_propagate_change(d, _gfn(gfn), mfn, order, p2mt, p2ma);
 
+    if ( !rc )
+        rc = ret;
+
     return rc;
 }
 
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 3d80612..9f5539e 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -498,7 +498,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
     l1_pgentry_t intermediate_entry = l1e_empty();
     l2_pgentry_t l2e_content;
     l3_pgentry_t l3e_content;
-    int rc;
+    int rc, ret;
     unsigned int iommu_pte_flags = p2m_get_iommu_flags(p2mt);
     /*
      * old_mfn and iommu_old_flags control possible flush/update needs on the
@@ -680,11 +680,27 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         }
         else if ( iommu_pte_flags )
             for ( i = 0; i < (1UL << page_order); i++ )
-                iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i,
-                               iommu_pte_flags);
+            {
+                ret = iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i,
+                                    iommu_pte_flags);
+
+                if ( !rc && unlikely(ret) )
+                {
+                    while ( i-- )
+                        iommu_unmap_page(p2m->domain, gfn + i);
+
+                    rc = ret;
+                    break;
+                }
+            }
         else
             for ( i = 0; i < (1UL << page_order); i++ )
-                iommu_unmap_page(p2m->domain, gfn + i);
+            {
+                ret = iommu_unmap_page(p2m->domain, gfn + i);
+
+                if ( !rc && unlikely(ret) )
+                    rc = ret;
+            }
     }
 
     /*
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 6eef2f3..6a9bba1 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -636,13 +636,20 @@ p2m_remove_page(struct p2m_domain *p2m, unsigned long gfn, unsigned long mfn,
     mfn_t mfn_return;
     p2m_type_t t;
     p2m_access_t a;
+    int rc = 0, ret;
 
     if ( !paging_mode_translate(p2m->domain) )
     {
         if ( need_iommu(p2m->domain) )
             for ( i = 0; i < (1 << page_order); i++ )
-                iommu_unmap_page(p2m->domain, mfn + i);
-        return 0;
+            {
+               ret = iommu_unmap_page(p2m->domain, mfn + i);
+
+               if ( !rc && unlikely(ret) )
+                   rc = ret;
+            }
+
+        return rc;
     }
 
     ASSERT(gfn_locked_by_me(p2m, gfn));
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index a0003ac..d74433d 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -172,6 +172,8 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
     {
         struct page_info *page;
         unsigned int i = 0;
+        int ret, rc = 0;
+
         page_list_for_each ( page, &d->page_list )
         {
             unsigned long mfn = page_to_mfn(page);
@@ -182,10 +184,20 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
                  ((page->u.inuse.type_info & PGT_type_mask)
                   == PGT_writable_page) )
                 mapping |= IOMMUF_writable;
-            hd->platform_ops->map_page(d, gfn, mfn, mapping);
+
+            ret = hd->platform_ops->map_page(d, gfn, mfn, mapping);
+
+            if ( unlikely(ret) )
+                rc = ret;
+
             if ( !(i++ & 0xfffff) )
                 process_pending_softirqs();
         }
+
+        if ( rc )
+            printk(XENLOG_WARNING
+                   "iommu_hwdom_init: IOMMU mapping failed for dom%d.",
+                   d->domain_id);
     }
 
     return hd->platform_ops->hwdom_init(d);
-- 
1.9.1



* [PATCH v3 04/10] vt-d: propagate IOMMU Device-TLB flush error up to IOMMU unmapping.
  2016-04-29  9:25 [PATCH v3 00/10] Check VT-d Device-TLB flush error Quan Xu
                   ` (2 preceding siblings ...)
  2016-04-29  9:25 ` [PATCH v3 03/10] IOMMU/MMU: enhance the call trees of IOMMU unmapping and mapping Quan Xu
@ 2016-04-29  9:25 ` Quan Xu
  2016-04-29  9:25 ` [PATCH v3 05/10] vt-d: propagate IOMMU Device-TLB flush error up to IOMMU mapping Quan Xu
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 39+ messages in thread
From: Quan Xu @ 2016-04-29  9:25 UTC (permalink / raw)
  To: xen-devel; +Cc: dario.faggioli, Jan Beulich, Feng Wu, Kevin Tian, Quan Xu

Propagate the IOMMU Device-TLB flush error up to IOMMU unmapping.

Signed-off-by: Quan Xu <quan.xu@intel.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>

CC: Kevin Tian <kevin.tian@intel.com>
CC: Feng Wu <feng.wu@intel.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 xen/drivers/passthrough/vtd/iommu.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 6e2e43a..e908046 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -618,11 +618,12 @@ static void intel_iommu_iotlb_flush_all(struct domain *d)
 }
 
 /* clear one page's page table */
-static void dma_pte_clear_one(struct domain *domain, u64 addr)
+static int __must_check dma_pte_clear_one(struct domain *domain, u64 addr)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(domain);
     struct dma_pte *page = NULL, *pte = NULL;
     u64 pg_maddr;
+    int rc = 0;
 
     spin_lock(&hd->arch.mapping_lock);
     /* get last level pte */
@@ -630,7 +631,7 @@ static void dma_pte_clear_one(struct domain *domain, u64 addr)
     if ( pg_maddr == 0 )
     {
         spin_unlock(&hd->arch.mapping_lock);
-        return;
+        return 0;
     }
 
     page = (struct dma_pte *)map_vtd_domain_page(pg_maddr);
@@ -640,7 +641,7 @@ static void dma_pte_clear_one(struct domain *domain, u64 addr)
     {
         spin_unlock(&hd->arch.mapping_lock);
         unmap_vtd_domain_page(page);
-        return;
+        return 0;
     }
 
     dma_clear_pte(*pte);
@@ -648,9 +649,11 @@ static void dma_pte_clear_one(struct domain *domain, u64 addr)
     iommu_flush_cache_entry(pte, sizeof(struct dma_pte));
 
     if ( !this_cpu(iommu_dont_flush_iotlb) )
-        iommu_flush_iotlb(domain, addr >> PAGE_SHIFT_4K, 1, 1);
+        rc = iommu_flush_iotlb(domain, addr >> PAGE_SHIFT_4K, 1, 1);
 
     unmap_vtd_domain_page(page);
+
+    return rc;
 }
 
 static void iommu_free_pagetable(u64 pt_maddr, int level)
@@ -1771,9 +1774,7 @@ static int intel_iommu_unmap_page(struct domain *d, unsigned long gfn)
     if ( iommu_passthrough && is_hardware_domain(d) )
         return 0;
 
-    dma_pte_clear_one(d, (paddr_t)gfn << PAGE_SHIFT_4K);
-
-    return 0;
+    return dma_pte_clear_one(d, (paddr_t)gfn << PAGE_SHIFT_4K);
 }
 
 int iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
-- 
1.9.1



* [PATCH v3 05/10] vt-d: propagate IOMMU Device-TLB flush error up to IOMMU mapping.
  2016-04-29  9:25 [PATCH v3 00/10] Check VT-d Device-TLB flush error Quan Xu
                   ` (3 preceding siblings ...)
  2016-04-29  9:25 ` [PATCH v3 04/10] vt-d: propagate IOMMU Device-TLB flush error up to IOMMU unmapping Quan Xu
@ 2016-04-29  9:25 ` Quan Xu
  2016-04-29  9:25 ` [PATCH v3 06/10] propagate IOMMU Device-TLB flush error up to iommu_iotlb_flush{, _all} (top level ones) Quan Xu
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 39+ messages in thread
From: Quan Xu @ 2016-04-29  9:25 UTC (permalink / raw)
  To: xen-devel; +Cc: dario.faggioli, Jan Beulich, Feng Wu, Kevin Tian, Quan Xu

Propagate the IOMMU Device-TLB flush error up to IOMMU mapping.

Signed-off-by: Quan Xu <quan.xu@intel.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>

CC: Kevin Tian <kevin.tian@intel.com>
CC: Feng Wu <feng.wu@intel.com>
CC: Jan Beulich <jbeulich@suse.com>
---
 xen/drivers/passthrough/vtd/iommu.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index e908046..802de02 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1721,6 +1721,7 @@ static int intel_iommu_map_page(
     struct hvm_iommu *hd = domain_hvm_iommu(d);
     struct dma_pte *page = NULL, *pte = NULL, old, new = { 0 };
     u64 pg_maddr;
+    int rc = 0;
 
     /* Do nothing if VT-d shares EPT page table */
     if ( iommu_use_hap_pt(d) )
@@ -1763,9 +1764,9 @@ static int intel_iommu_map_page(
     unmap_vtd_domain_page(page);
 
     if ( !this_cpu(iommu_dont_flush_iotlb) )
-        iommu_flush_iotlb(d, gfn, dma_pte_present(old), 1);
+        rc = iommu_flush_iotlb(d, gfn, dma_pte_present(old), 1);
 
-    return 0;
+    return rc;
 }
 
 static int intel_iommu_unmap_page(struct domain *d, unsigned long gfn)
-- 
1.9.1



* [PATCH v3 06/10] propagate IOMMU Device-TLB flush error up to iommu_iotlb_flush{, _all} (top level ones).
  2016-04-29  9:25 [PATCH v3 00/10] Check VT-d Device-TLB flush error Quan Xu
                   ` (4 preceding siblings ...)
  2016-04-29  9:25 ` [PATCH v3 05/10] vt-d: propagate IOMMU Device-TLB flush error up to IOMMU mapping Quan Xu
@ 2016-04-29  9:25 ` Quan Xu
  2016-04-29  9:25 ` [PATCH v3 07/10] IOMMU: propagate IOMMU Device-TLB flush error up to iommu_iotlb_flush{, _all} (leaf ones) Quan Xu
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 39+ messages in thread
From: Quan Xu @ 2016-04-29  9:25 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Quan Xu, dario.faggioli,
	Julien Grall, Jan Beulich

Propagate the IOMMU Device-TLB flush error up to iommu_iotlb_flush{,_all}.

This patch fixes the top-level callers; the leaf implementations are
fixed by the next patch.

Signed-off-by: Quan Xu <quan.xu@intel.com>

CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@arm.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Kevin Tian <kevin.tian@intel.com>
---
 xen/arch/arm/p2m.c                  |  5 ++++-
 xen/common/memory.c                 | 14 ++++++++++++--
 xen/drivers/passthrough/iommu.c     | 13 +++++++++----
 xen/drivers/passthrough/x86/iommu.c |  5 +++--
 xen/include/xen/iommu.h             |  5 +++--
 5 files changed, 31 insertions(+), 11 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index db21433..63df5b1 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1178,7 +1178,10 @@ out:
     if ( flush )
     {
         flush_tlb_domain(d);
-        iommu_iotlb_flush(d, sgfn, egfn - sgfn);
+        ret = iommu_iotlb_flush(d, sgfn, egfn - sgfn);
+
+        if ( !rc && unlikely(ret) )
+            rc = ret;
     }
 
     while ( (pg = page_list_remove_head(&free_pages)) )
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 644f81a..21b8098 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -677,9 +677,19 @@ static int xenmem_add_to_physmap(struct domain *d,
 #ifdef CONFIG_HAS_PASSTHROUGH
     if ( need_iommu(d) )
     {
+        int ret;
+
         this_cpu(iommu_dont_flush_iotlb) = 0;
-        iommu_iotlb_flush(d, xatp->idx - done, done);
-        iommu_iotlb_flush(d, xatp->gpfn - done, done);
+
+        ret = iommu_iotlb_flush(d, xatp->idx - done, done);
+
+        if ( rc >= 0 && unlikely(ret) )
+            rc = ret;
+
+        ret = iommu_iotlb_flush(d, xatp->gpfn - done, done);
+
+        if ( rc >= 0 && unlikely(ret) )
+            rc = ret;
     }
 #endif
 
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index d74433d..b380638 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -315,24 +315,29 @@ static void iommu_free_pagetables(unsigned long unused)
                             cpumask_cycle(smp_processor_id(), &cpu_online_map));
 }
 
-void iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count)
+int __must_check iommu_iotlb_flush(struct domain *d, unsigned long gfn,
+                                   unsigned int page_count)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
 
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush )
-        return;
+        return 0;
 
     hd->platform_ops->iotlb_flush(d, gfn, page_count);
+
+    return 0;
 }
 
-void iommu_iotlb_flush_all(struct domain *d)
+int __must_check iommu_iotlb_flush_all(struct domain *d)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
 
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush_all )
-        return;
+        return 0;
 
     hd->platform_ops->iotlb_flush_all(d);
+
+    return 0;
 }
 
 int __init iommu_setup(void)
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 8cbb655..c2c1ee3 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -104,8 +104,9 @@ int arch_iommu_populate_page_table(struct domain *d)
     this_cpu(iommu_dont_flush_iotlb) = 0;
 
     if ( !rc )
-        iommu_iotlb_flush_all(d);
-    else if ( rc != -ERESTART )
+        rc = iommu_iotlb_flush_all(d);
+
+    if ( rc && rc != -ERESTART )
         iommu_teardown(d);
 
     return rc;
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 8217cb7..0f4c18d 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -182,8 +182,9 @@ int iommu_do_pci_domctl(struct xen_domctl *, struct domain *d,
 int iommu_do_domctl(struct xen_domctl *, struct domain *d,
                     XEN_GUEST_HANDLE_PARAM(xen_domctl_t));
 
-void iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count);
-void iommu_iotlb_flush_all(struct domain *d);
+int __must_check iommu_iotlb_flush(struct domain *d, unsigned long gfn,
+                                   unsigned int page_count);
+int __must_check iommu_iotlb_flush_all(struct domain *d);
 
 /*
  * The purpose of the iommu_dont_flush_iotlb optional cpu flag is to
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH v3 07/10] IOMMU: propagate IOMMU Device-TLB flush error up to iommu_iotlb_flush{, _all} (leaf ones).
  2016-04-29  9:25 [PATCH v3 00/10] Check VT-d Device-TLB flush error Quan Xu
                   ` (5 preceding siblings ...)
  2016-04-29  9:25 ` [PATCH v3 06/10] propagate IOMMU Device-TLB flush error up to iommu_iotlb_flush{, _all} (top level ones) Quan Xu
@ 2016-04-29  9:25 ` Quan Xu
  2016-05-04  1:54   ` Tian, Kevin
  2016-04-29  9:25 ` [PATCH v3 08/10] vt-d/ept: propagate IOMMU Device-TLB flush error up to EPT update Quan Xu
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 39+ messages in thread
From: Quan Xu @ 2016-04-29  9:25 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Feng Wu, Quan Xu, dario.faggioli,
	Julien Grall, Jan Beulich

Propagate the IOMMU Device-TLB flush error up to iommu_iotlb_flush{,_all}.

This patch fixes the leaf ones (the per-vendor iotlb_flush callbacks).

Signed-off-by: Quan Xu <quan.xu@intel.com>

CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@arm.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: Feng Wu <feng.wu@intel.com>
---
 xen/drivers/passthrough/arm/smmu.c  | 12 +++++++-----
 xen/drivers/passthrough/iommu.c     |  8 ++------
 xen/drivers/passthrough/vtd/iommu.c |  9 +++++----
 xen/include/xen/iommu.h             |  4 ++--
 4 files changed, 16 insertions(+), 17 deletions(-)

diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
index 62da087..c83030b 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2540,7 +2540,7 @@ static int force_stage = 2;
  */
 static u32 platform_features = ARM_SMMU_FEAT_COHERENT_WALK;
 
-static void arm_smmu_iotlb_flush_all(struct domain *d)
+static int arm_smmu_iotlb_flush_all(struct domain *d)
 {
 	struct arm_smmu_xen_domain *smmu_domain = domain_hvm_iommu(d)->arch.priv;
 	struct iommu_domain *cfg;
@@ -2557,13 +2557,15 @@ static void arm_smmu_iotlb_flush_all(struct domain *d)
 		arm_smmu_tlb_inv_context(cfg->priv);
 	}
 	spin_unlock(&smmu_domain->lock);
+
+	return 0;
 }
 
-static void arm_smmu_iotlb_flush(struct domain *d, unsigned long gfn,
-                                 unsigned int page_count)
+static int arm_smmu_iotlb_flush(struct domain *d, unsigned long gfn,
+                                unsigned int page_count)
 {
-    /* ARM SMMU v1 doesn't have flush by VMA and VMID */
-    arm_smmu_iotlb_flush_all(d);
+	/* ARM SMMU v1 doesn't have flush by VMA and VMID */
+	return arm_smmu_iotlb_flush_all(d);
 }
 
 static struct iommu_domain *arm_smmu_get_domain(struct domain *d,
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index b380638..839852f 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -323,9 +323,7 @@ int __must_check iommu_iotlb_flush(struct domain *d, unsigned long gfn,
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush )
         return 0;
 
-    hd->platform_ops->iotlb_flush(d, gfn, page_count);
-
-    return 0;
+    return hd->platform_ops->iotlb_flush(d, gfn, page_count);
 }
 
 int __must_check iommu_iotlb_flush_all(struct domain *d)
@@ -335,9 +333,7 @@ int __must_check iommu_iotlb_flush_all(struct domain *d)
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush_all )
         return 0;
 
-    hd->platform_ops->iotlb_flush_all(d);
-
-    return 0;
+    return hd->platform_ops->iotlb_flush_all(d);
 }
 
 int __init iommu_setup(void)
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 802de02..e2cb1e1 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -607,14 +607,15 @@ static int iommu_flush_iotlb(struct domain *d, unsigned long gfn,
     return ret;
 }
 
-static void intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count)
+static int intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
+                                   unsigned int page_count)
 {
-    iommu_flush_iotlb(d, gfn, 1, page_count);
+    return iommu_flush_iotlb(d, gfn, 1, page_count);
 }
 
-static void intel_iommu_iotlb_flush_all(struct domain *d)
+static int intel_iommu_iotlb_flush_all(struct domain *d)
 {
-    iommu_flush_iotlb(d, INVALID_GFN, 0, 0);
+    return iommu_flush_iotlb(d, INVALID_GFN, 0, 0);
 }
 
 /* clear one page's page table */
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 0f4c18d..fe52428 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -161,8 +161,8 @@ struct iommu_ops {
     void (*resume)(void);
     void (*share_p2m)(struct domain *d);
     void (*crash_shutdown)(void);
-    void (*iotlb_flush)(struct domain *d, unsigned long gfn, unsigned int page_count);
-    void (*iotlb_flush_all)(struct domain *d);
+    int (*iotlb_flush)(struct domain *d, unsigned long gfn, unsigned int page_count);
+    int (*iotlb_flush_all)(struct domain *d);
     int (*get_reserved_device_memory)(iommu_grdm_t *, void *);
     void (*dump_p2m_table)(struct domain *d);
 };
-- 
1.9.1




* [PATCH v3 08/10] vt-d/ept: propagate IOMMU Device-TLB flush error up to EPT update.
  2016-04-29  9:25 [PATCH v3 00/10] Check VT-d Device-TLB flush error Quan Xu
                   ` (6 preceding siblings ...)
  2016-04-29  9:25 ` [PATCH v3 07/10] IOMMU: propagate IOMMU Device-TLB flush error up to iommu_iotlb_flush{, _all} (leaf ones) Quan Xu
@ 2016-04-29  9:25 ` Quan Xu
  2016-05-04  1:56   ` Tian, Kevin
  2016-04-29  9:25 ` [PATCH v3 09/10] IOMMU: propagate IOMMU Device-TLB flush error up to IOMMU suspending Quan Xu
  2016-04-29  9:25 ` [PATCH v3 10/10] vt-d: propagate error up to ME phantom function mapping and unmapping Quan Xu
  9 siblings, 1 reply; 39+ messages in thread
From: Quan Xu @ 2016-04-29  9:25 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Feng Wu, Jun Nakajima, George Dunlap, Andrew Cooper,
	dario.faggioli, Jan Beulich, Quan Xu

Propagate the IOMMU Device-TLB flush error up to the ept_set_entry(),
when VT-d shares EPT page table.

Signed-off-by: Quan Xu <quan.xu@intel.com>

CC: Jun Nakajima <jun.nakajima@intel.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: George Dunlap <george.dunlap@eu.citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Feng Wu <feng.wu@intel.com>
---
 xen/arch/x86/mm/p2m-ept.c           | 3 ++-
 xen/drivers/passthrough/vtd/iommu.c | 4 ++--
 xen/include/asm-x86/iommu.h         | 3 ++-
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index df87944..d31650f 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -828,7 +828,8 @@ out:
          need_modify_vtd_table )
     {
         if ( iommu_hap_pt_share )
-            iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
+            ret = iommu_pte_flush(d, gfn, &ept_entry->epte,
+                                  order, vtd_pte_present);
         else
         {
             if ( iommu_flags )
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index e2cb1e1..bf8c9c1 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1779,8 +1779,8 @@ static int intel_iommu_unmap_page(struct domain *d, unsigned long gfn)
     return dma_pte_clear_one(d, (paddr_t)gfn << PAGE_SHIFT_4K);
 }
 
-int iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
-                    int order, bool_t present)
+int __must_check iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
+                                 int order, bool_t present)
 {
     struct acpi_drhd_unit *drhd;
     struct iommu *iommu = NULL;
diff --git a/xen/include/asm-x86/iommu.h b/xen/include/asm-x86/iommu.h
index 1ae0c5a..fbfb1a4 100644
--- a/xen/include/asm-x86/iommu.h
+++ b/xen/include/asm-x86/iommu.h
@@ -26,7 +26,8 @@ int iommu_setup_hpet_msi(struct msi_desc *);
 
 /* While VT-d specific, this must get declared in a generic header. */
 int adjust_vtd_irq_affinities(void);
-int iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte, int order, bool_t present);
+int __must_check iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
+                                 int order, bool_t present);
 bool_t iommu_supports_eim(void);
 int iommu_enable_x2apic_IR(void);
 void iommu_disable_x2apic_IR(void);
-- 
1.9.1




* [PATCH v3 09/10] IOMMU: propagate IOMMU Device-TLB flush error up to IOMMU suspending
  2016-04-29  9:25 [PATCH v3 00/10] Check VT-d Device-TLB flush error Quan Xu
                   ` (7 preceding siblings ...)
  2016-04-29  9:25 ` [PATCH v3 08/10] vt-d/ept: propagate IOMMU Device-TLB flush error up to EPT update Quan Xu
@ 2016-04-29  9:25 ` Quan Xu
  2016-05-04  1:59   ` Tian, Kevin
  2016-04-29  9:25 ` [PATCH v3 10/10] vt-d: propagate error up to ME phantom function mapping and unmapping Quan Xu
  9 siblings, 1 reply; 39+ messages in thread
From: Quan Xu @ 2016-04-29  9:25 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Keir Fraser, Quan Xu,
	Liu Jinsong, dario.faggioli, Julien Grall, Jan Beulich,
	Andrew Cooper, Feng Wu, Suravee Suthikulpanit

Propagate the IOMMU Device-TLB flush error up to IOMMU suspending.

Signed-off-by: Quan Xu <quan.xu@intel.com>

CC: Jan Beulich <jbeulich@suse.com>
CC: Liu Jinsong <jinsong.liu@alibaba-inc.com>
CC: Keir Fraser <keir@xen.org>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien.grall@arm.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: Feng Wu <feng.wu@intel.com>
---
 xen/arch/x86/acpi/power.c                     | 15 ++++++++++-
 xen/drivers/passthrough/amd/iommu_init.c      |  9 ++++++-
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |  2 +-
 xen/drivers/passthrough/iommu.c               |  7 +++--
 xen/drivers/passthrough/vtd/iommu.c           | 39 ++++++++++++++++++++-------
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |  2 +-
 xen/include/xen/iommu.h                       |  4 +--
 7 files changed, 61 insertions(+), 17 deletions(-)

diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
index 2885e31..9097333 100644
--- a/xen/arch/x86/acpi/power.c
+++ b/xen/arch/x86/acpi/power.c
@@ -45,6 +45,8 @@ void do_suspend_lowlevel(void);
 
 static int device_power_down(void)
 {
+    int err;
+
     console_suspend();
 
     time_suspend();
@@ -53,11 +55,22 @@ static int device_power_down(void)
 
     ioapic_suspend();
 
-    iommu_suspend();
+    err = iommu_suspend();
+
+    if ( err )
+        goto iommu_suspend_error;
 
     lapic_suspend();
 
     return 0;
+
+ iommu_suspend_error:
+    ioapic_resume();
+    i8259A_resume();
+    time_resume();
+    console_resume();
+
+    return err;
 }
 
 static void device_power_up(void)
diff --git a/xen/drivers/passthrough/amd/iommu_init.c b/xen/drivers/passthrough/amd/iommu_init.c
index 4536106..02588aa 100644
--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -1339,12 +1339,19 @@ static void invalidate_all_devices(void)
     iterate_ivrs_mappings(_invalidate_all_devices);
 }
 
-void amd_iommu_suspend(void)
+int amd_iommu_suspend(void)
 {
     struct amd_iommu *iommu;
 
     for_each_amd_iommu ( iommu )
         disable_iommu(iommu);
+
+    return 0;
+}
+
+void amd_iommu_crash_shutdown(void)
+{
+    amd_iommu_suspend();
 }
 
 void amd_iommu_resume(void)
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index c8ee8dc..fb8e816 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -628,6 +628,6 @@ const struct iommu_ops amd_iommu_ops = {
     .suspend = amd_iommu_suspend,
     .resume = amd_iommu_resume,
     .share_p2m = amd_iommu_share_p2m,
-    .crash_shutdown = amd_iommu_suspend,
+    .crash_shutdown = amd_iommu_crash_shutdown,
     .dump_p2m_table = amd_dump_p2m_table,
 };
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 839852f..c565c66 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -407,11 +407,14 @@ int iommu_do_domctl(
     return ret;
 }
 
-void iommu_suspend()
+int __must_check iommu_suspend()
 {
     const struct iommu_ops *ops = iommu_get_ops();
+
     if ( iommu_enabled )
-        ops->suspend();
+        return ops->suspend();
+
+    return 0;
 }
 
 void iommu_share_p2m_table(struct domain* d)
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index bf8c9c1..cf847ec 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -542,11 +542,12 @@ static int iommu_flush_iotlb_psi(
     return status;
 }
 
-static void iommu_flush_all(void)
+static int __must_check iommu_flush_all(void)
 {
     struct acpi_drhd_unit *drhd;
     struct iommu *iommu;
     int flush_dev_iotlb;
+    int rc = 0, ret;
 
     flush_all_cache();
     for_each_drhd_unit ( drhd )
@@ -554,8 +555,13 @@ static void iommu_flush_all(void)
         iommu = drhd->iommu;
         iommu_flush_context_global(iommu, 0);
         flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0;
-        iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb);
+        ret = iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb);
+
+        if ( ret )
+            rc = ret;
     }
+
+    return rc;
 }
 
 static int iommu_flush_iotlb(struct domain *d, unsigned long gfn,
@@ -1272,7 +1278,9 @@ static void __hwdom_init intel_iommu_hwdom_init(struct domain *d)
     setup_hwdom_pci_devices(d, setup_hwdom_device);
     setup_hwdom_rmrr(d);
 
-    iommu_flush_all();
+    if ( iommu_flush_all() )
+        printk(XENLOG_WARNING VTDPREFIX
+               " intel_iommu_hwdom_init: IOMMU flush all failed.\n");
 
     for_each_drhd_unit ( drhd )
     {
@@ -2134,8 +2142,8 @@ static int init_vtd_hw(void)
             return -EIO;
         }
     }
-    iommu_flush_all();
-    return 0;
+
+    return iommu_flush_all();
 }
 
 static void __hwdom_init setup_hwdom_rmrr(struct domain *d)
@@ -2424,16 +2432,25 @@ static int intel_iommu_group_id(u16 seg, u8 bus, u8 devfn)
 }
 
 static u32 iommu_state[MAX_IOMMUS][MAX_IOMMU_REGS];
-static void vtd_suspend(void)
+
+static int vtd_suspend(void)
 {
     struct acpi_drhd_unit *drhd;
     struct iommu *iommu;
     u32    i;
+    int rc;
 
     if ( !iommu_enabled )
-        return;
+        return 0;
 
-    iommu_flush_all();
+    rc = iommu_flush_all();
+
+    if ( rc )
+    {
+        printk(XENLOG_WARNING VTDPREFIX
+               " vtd_suspend: IOMMU flush all failed.\n");
+        return rc;
+    }
 
     for_each_drhd_unit ( drhd )
     {
@@ -2462,6 +2479,8 @@ static void vtd_suspend(void)
         if ( !iommu_intremap && iommu_qinval )
             disable_qinval(iommu);
     }
+
+    return 0;
 }
 
 static void vtd_crash_shutdown(void)
@@ -2472,7 +2491,9 @@ static void vtd_crash_shutdown(void)
     if ( !iommu_enabled )
         return;
 
-    iommu_flush_all();
+    if ( iommu_flush_all() )
+        printk(XENLOG_WARNING VTDPREFIX
+               " vtd_crash_shutdown: IOMMU flush all failed.\n");
 
     for_each_drhd_unit ( drhd )
     {
diff --git a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
index 9c51172..f540fc8 100644
--- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
@@ -119,7 +119,7 @@ extern unsigned long *shared_intremap_inuse;
 
 /* power management support */
 void amd_iommu_resume(void);
-void amd_iommu_suspend(void);
+int amd_iommu_suspend(void);
 void amd_iommu_crash_shutdown(void);
 
 /* guest iommu support */
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index fe52428..dfa52ef 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -157,7 +157,7 @@ struct iommu_ops {
     unsigned int (*read_apic_from_ire)(unsigned int apic, unsigned int reg);
     int (*setup_hpet_msi)(struct msi_desc *);
 #endif /* CONFIG_X86 */
-    void (*suspend)(void);
+    int (*suspend)(void);
     void (*resume)(void);
     void (*share_p2m)(struct domain *d);
     void (*crash_shutdown)(void);
@@ -167,7 +167,7 @@ struct iommu_ops {
     void (*dump_p2m_table)(struct domain *d);
 };
 
-void iommu_suspend(void);
+int __must_check iommu_suspend(void);
 void iommu_resume(void);
 void iommu_crash_shutdown(void);
 int iommu_get_reserved_device_memory(iommu_grdm_t *, void *);
-- 
1.9.1




* [PATCH v3 10/10] vt-d: propagate error up to ME phantom function mapping and unmapping
  2016-04-29  9:25 [PATCH v3 00/10] Check VT-d Device-TLB flush error Quan Xu
                   ` (8 preceding siblings ...)
  2016-04-29  9:25 ` [PATCH v3 09/10] IOMMU: propagate IOMMU Device-TLB flush error up to IOMMU suspending Quan Xu
@ 2016-04-29  9:25 ` Quan Xu
  2016-05-04  2:02   ` Tian, Kevin
  9 siblings, 1 reply; 39+ messages in thread
From: Quan Xu @ 2016-04-29  9:25 UTC (permalink / raw)
  To: xen-devel; +Cc: Kevin Tian, dario.faggioli, Feng Wu, Jan Beulich, Quan Xu

Propagate the IOMMU Device-TLB flush error up to ME phantom function
mapping and unmapping.

Signed-off-by: Quan Xu <quan.xu@intel.com>

CC: Jan Beulich <jbeulich@suse.com>
CC: Kevin Tian <kevin.tian@intel.com>
CC: Feng Wu <feng.wu@intel.com>
---
 xen/drivers/passthrough/vtd/extern.h |  3 ++-
 xen/drivers/passthrough/vtd/iommu.c  | 18 ++++++++++++++----
 xen/drivers/passthrough/vtd/quirks.c | 28 ++++++++++++++++++----------
 3 files changed, 34 insertions(+), 15 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/extern.h b/xen/drivers/passthrough/vtd/extern.h
index cbe0286..6772839 100644
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -91,7 +91,8 @@ int is_igd_vt_enabled_quirk(void);
 void platform_quirks_init(void);
 void vtd_ops_preamble_quirk(struct iommu* iommu);
 void vtd_ops_postamble_quirk(struct iommu* iommu);
-void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map);
+int __must_check me_wifi_quirk(struct domain *domain,
+                               u8 bus, u8 devfn, int map);
 void pci_vtd_quirk(const struct pci_dev *);
 bool_t platform_supports_intremap(void);
 bool_t platform_supports_x2apic(void);
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index cf847ec..bfee299 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1301,7 +1301,7 @@ int domain_context_mapping_one(
     u64 maddr, pgd_maddr;
     u16 seg = iommu->intel->drhd->segment;
     int agaw;
-    int rc;
+    int rc, ret;
 
     ASSERT(pcidevs_locked());
     spin_lock(&iommu->lock);
@@ -1435,7 +1435,12 @@ int domain_context_mapping_one(
     unmap_vtd_domain_page(context_entries);
 
     if ( !seg )
-        me_wifi_quirk(domain, bus, devfn, MAP_ME_PHANTOM_FUNC);
+    {
+        ret = me_wifi_quirk(domain, bus, devfn, MAP_ME_PHANTOM_FUNC);
+
+        if ( !rc && unlikely(ret) )
+            rc = ret;
+    }
 
     return rc;
 }
@@ -1532,7 +1537,7 @@ int domain_context_unmap_one(
     struct context_entry *context, *context_entries;
     u64 maddr;
     int iommu_domid;
-    int rc;
+    int rc, ret;
 
     ASSERT(pcidevs_locked());
     spin_lock(&iommu->lock);
@@ -1580,7 +1585,12 @@ int domain_context_unmap_one(
     unmap_vtd_domain_page(context_entries);
 
     if ( !iommu->intel->drhd->segment )
-        me_wifi_quirk(domain, bus, devfn, UNMAP_ME_PHANTOM_FUNC);
+    {
+        ret = me_wifi_quirk(domain, bus, devfn, UNMAP_ME_PHANTOM_FUNC);
+
+        if ( !rc && unlikely(ret) )
+            rc = ret;
+    }
 
     return rc;
 }
diff --git a/xen/drivers/passthrough/vtd/quirks.c b/xen/drivers/passthrough/vtd/quirks.c
index 49df41d..bc17d44 100644
--- a/xen/drivers/passthrough/vtd/quirks.c
+++ b/xen/drivers/passthrough/vtd/quirks.c
@@ -332,10 +332,12 @@ void __init platform_quirks_init(void)
  * assigning Intel integrated wifi device to a guest.
  */
 
-static void map_me_phantom_function(struct domain *domain, u32 dev, int map)
+static int __must_check map_me_phantom_function(struct domain *domain,
+                                                u32 dev, int map)
 {
     struct acpi_drhd_unit *drhd;
     struct pci_dev *pdev;
+    int rc;
 
     /* find ME VT-d engine base on a real ME device */
     pdev = pci_get_pdev(0, 0, PCI_DEVFN(dev, 0));
@@ -343,23 +345,27 @@ static void map_me_phantom_function(struct domain *domain, u32 dev, int map)
 
     /* map or unmap ME phantom function */
     if ( map )
-        domain_context_mapping_one(domain, drhd->iommu, 0,
-                                   PCI_DEVFN(dev, 7), NULL);
+        rc = domain_context_mapping_one(domain, drhd->iommu, 0,
+                                        PCI_DEVFN(dev, 7), NULL);
     else
-        domain_context_unmap_one(domain, drhd->iommu, 0,
-                                 PCI_DEVFN(dev, 7));
+        rc = domain_context_unmap_one(domain, drhd->iommu, 0,
+                                      PCI_DEVFN(dev, 7));
+
+    return rc;
 }
 
-void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map)
+int __must_check me_wifi_quirk(struct domain *domain,
+                               u8 bus, u8 devfn, int map)
 {
     u32 id;
+    int rc = 0;
 
     id = pci_conf_read32(0, 0, 0, 0, 0);
     if ( IS_CTG(id) )
     {
         /* quit if ME does not exist */
         if ( pci_conf_read32(0, 0, 3, 0, 0) == 0xffffffff )
-            return;
+            return 0;
 
         /* if device is WLAN device, map ME phantom device 0:3.7 */
         id = pci_conf_read32(0, bus, PCI_SLOT(devfn), PCI_FUNC(devfn), 0);
@@ -373,7 +379,7 @@ void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map)
             case 0x423b8086:
             case 0x423c8086:
             case 0x423d8086:
-                map_me_phantom_function(domain, 3, map);
+                rc = map_me_phantom_function(domain, 3, map);
                 break;
             default:
                 break;
@@ -383,7 +389,7 @@ void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map)
     {
         /* quit if ME does not exist */
         if ( pci_conf_read32(0, 0, 22, 0, 0) == 0xffffffff )
-            return;
+            return 0;
 
         /* if device is WLAN device, map ME phantom device 0:22.7 */
         id = pci_conf_read32(0, bus, PCI_SLOT(devfn), PCI_FUNC(devfn), 0);
@@ -399,12 +405,14 @@ void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map)
             case 0x42388086:        /* Puma Peak */
             case 0x422b8086:
             case 0x422c8086:
-                map_me_phantom_function(domain, 22, map);
+                rc = map_me_phantom_function(domain, 22, map);
                 break;
             default:
                 break;
         }
     }
+
+    return rc;
 }
 
 void pci_vtd_quirk(const struct pci_dev *pdev)
-- 
1.9.1




* Re: [PATCH v3 01/10] vt-d: fix the IOMMU flush issue
  2016-04-29  9:25 ` [PATCH v3 01/10] vt-d: fix the IOMMU flush issue Quan Xu
@ 2016-05-04  1:26   ` Tian, Kevin
  2016-05-04 12:09     ` Xu, Quan
  0 siblings, 1 reply; 39+ messages in thread
From: Tian, Kevin @ 2016-05-04  1:26 UTC (permalink / raw)
  To: Xu, Quan, xen-devel
  Cc: Keir Fraser, dario.faggioli, Wu, Feng, Jan Beulich, Andrew Cooper

> From: Xu, Quan
> Sent: Friday, April 29, 2016 5:25 PM
> 
> The return value from the IOMMU flush interfaces may be positive, which
> indicates the caller needs to flush the cache; it is not a failure.
> 
> When the return value is positive, this patch fixes the flush issue
> as follows:
>   - call iommu_flush_write_buffer() to flush the cache.
>   - return zero.
> 
> Signed-off-by: Quan Xu <quan.xu@intel.com>
> 
> CC: Kevin Tian <kevin.tian@intel.com>
> CC: Feng Wu <feng.wu@intel.com>
> CC: Keir Fraser <keir@xen.org>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
>  xen/drivers/passthrough/vtd/iommu.c | 94
> +++++++++++++++++++++++++------------
>  xen/include/asm-x86/iommu.h         |  2 +-
>  2 files changed, 64 insertions(+), 32 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/vtd/iommu.c
> b/xen/drivers/passthrough/vtd/iommu.c
> index 5ad25dc..6e2e43a 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -558,14 +558,16 @@ static void iommu_flush_all(void)
>      }
>  }
> 
> -static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
> -        int dma_old_pte_present, unsigned int page_count)
> +static int iommu_flush_iotlb(struct domain *d, unsigned long gfn,
> +                             bool_t dma_old_pte_present,
> +                             unsigned int page_count)
>  {
>      struct hvm_iommu *hd = domain_hvm_iommu(d);
>      struct acpi_drhd_unit *drhd;
>      struct iommu *iommu;
>      int flush_dev_iotlb;
>      int iommu_domid;
> +    int rc, ret = 0;
> 
>      /*
>       * No need pcideves_lock here because we have flush
> @@ -584,29 +586,35 @@ static void __intel_iommu_iotlb_flush(struct domain *d,
> unsigned long gfn,
>              continue;
> 
>          if ( page_count != 1 || gfn == INVALID_GFN )
> -        {
> -            if ( iommu_flush_iotlb_dsi(iommu, iommu_domid,
> -                        0, flush_dev_iotlb) )
> -                iommu_flush_write_buffer(iommu);
> -        }
> +            rc = iommu_flush_iotlb_dsi(iommu, iommu_domid,
> +                                       0, flush_dev_iotlb);
>          else
> +            rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
> +                                       (paddr_t)gfn << PAGE_SHIFT_4K,
> +                                       PAGE_ORDER_4K,
> +                                       !dma_old_pte_present,
> +                                       flush_dev_iotlb);
> +
> +        if ( rc > 0 )
>          {
> -            if ( iommu_flush_iotlb_psi(iommu, iommu_domid,
> -                        (paddr_t)gfn << PAGE_SHIFT_4K, PAGE_ORDER_4K,
> -                        !dma_old_pte_present, flush_dev_iotlb) )
> -                iommu_flush_write_buffer(iommu);
> +            iommu_flush_write_buffer(iommu);
> +            ret = 0;

You don't need 'ret' here. Just change 'rc' to 0.

>          }
> +        else if ( rc < 0 )
> +            ret = rc;

Then this change is not required.

>      }
> +
> +    return ret;

Then return 'rc' instead.

>  }
> 
>  static void intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int
> page_count)

could we remove the "intel_" prefix completely? You can rename this
one to iommu_flush_iotlb_page...

>  {
> -    __intel_iommu_iotlb_flush(d, gfn, 1, page_count);
> +    iommu_flush_iotlb(d, gfn, 1, page_count);
>  }
> 
>  static void intel_iommu_iotlb_flush_all(struct domain *d)

and this one to just iommu_flush_iotlb_all.

>  {
> -    __intel_iommu_iotlb_flush(d, INVALID_GFN, 0, 0);
> +    iommu_flush_iotlb(d, INVALID_GFN, 0, 0);
>  }
> 


Thanks
Kevin



* Re: [PATCH v3 02/10] IOMMU: handle IOMMU mapping and unmapping failures
  2016-04-29  9:25 ` [PATCH v3 02/10] IOMMU: handle IOMMU mapping and unmapping failures Quan Xu
@ 2016-05-04  1:28   ` Tian, Kevin
  2016-05-04 11:49     ` Xu, Quan
  0 siblings, 1 reply; 39+ messages in thread
From: Tian, Kevin @ 2016-05-04  1:28 UTC (permalink / raw)
  To: xen-devel; +Cc: dario.faggioli, Xu, Quan, Jan Beulich

> From: Quan Xu
> Sent: Friday, April 29, 2016 5:25 PM
> 
> Treat IOMMU mapping and unmapping failures as fatal to the domain
> (with the exception of the hardware domain).
> 
> If IOMMU mapping or unmapping fails, crash the domain (with the
> exception of the hardware domain) and propagate the error up the
> call trees.
> 
> Signed-off-by: Quan Xu <quan.xu@intel.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> 
> CC: Jan Beulich <jbeulich@suse.com>
> ---
>  xen/drivers/passthrough/iommu.c | 30 ++++++++++++++++++++++++++++--
>  1 file changed, 28 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
> index b64676f..a0003ac 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -243,21 +243,47 @@ int iommu_map_page(struct domain *d, unsigned long gfn,
> unsigned long mfn,
>                     unsigned int flags)
>  {
>      struct hvm_iommu *hd = domain_hvm_iommu(d);
> +    int rc;
> 
>      if ( !iommu_enabled || !hd->platform_ops )
>          return 0;
> 
> -    return hd->platform_ops->map_page(d, gfn, mfn, flags);
> +    rc = hd->platform_ops->map_page(d, gfn, mfn, flags);
> +
> +    if ( rc )
> +    {
> +        if ( is_hardware_domain(d) )
> +            printk(XENLOG_ERR
> +                   "iommu_map_page: IOMMU mapping gfn %#lx mfn %#lx failed for
> dom%d.",
> +                   gfn, mfn, d->domain_id);
> +        else
> +            domain_crash(d);
> +    }

It makes sense to print error messages for all domains, and
then selectively crash the domain:

printk(XENLOG_ERR ...);
if ( !is_hardware_domain(d) )
	domain_crash(d);

Thanks
Kevin

* Re: [PATCH v3 03/10] IOMMU/MMU: enhance the call trees of IOMMU unmapping and mapping
  2016-04-29  9:25 ` [PATCH v3 03/10] IOMMU/MMU: enhance the call trees of IOMMU unmapping and mapping Quan Xu
@ 2016-05-04  1:45   ` Tian, Kevin
  2016-05-04  8:40     ` Jan Beulich
  2016-05-04 13:48   ` George Dunlap
  1 sibling, 1 reply; 39+ messages in thread
From: Tian, Kevin @ 2016-05-04  1:45 UTC (permalink / raw)
  To: Xu, Quan, xen-devel
  Cc: Keir Fraser, Jan Beulich, George Dunlap, Andrew Cooper,
	dario.faggioli, Nakajima, Jun

> From: Xu, Quan
> Sent: Friday, April 29, 2016 5:25 PM
> 
> When IOMMU mapping fails, we issue a best-effort rollback: stop further
> IOMMU mapping, unmap the previously established IOMMU mappings, and then
> report the error up the call trees. When rollback is not feasible for the
> hardware domain (in the early initialization phase, or as a trade-off
> against complexity), we proceed on a best-effort basis, only logging an
> error message.
> 
> IOMMU unmapping should perhaps continue despite an error, in an attempt
> to do best-effort cleanup.
> 
> Signed-off-by: Quan Xu <quan.xu@intel.com>
> 
> CC: Keir Fraser <keir@xen.org>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Jun Nakajima <jun.nakajima@intel.com>
> CC: Kevin Tian <kevin.tian@intel.com>
> CC: George Dunlap <george.dunlap@eu.citrix.com>
> ---
>  xen/arch/x86/mm.c               | 13 ++++++++-----
>  xen/arch/x86/mm/p2m-ept.c       | 27 +++++++++++++++++++++++++--
>  xen/arch/x86/mm/p2m-pt.c        | 24 ++++++++++++++++++++----
>  xen/arch/x86/mm/p2m.c           | 11 +++++++++--
>  xen/drivers/passthrough/iommu.c | 14 +++++++++++++-
>  5 files changed, 75 insertions(+), 14 deletions(-)
> 
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index a42097f..427097d 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -2467,7 +2467,7 @@ static int __get_page_type(struct page_info *page, unsigned
> long type,
>                             int preemptible)
>  {
>      unsigned long nx, x, y = page->u.inuse.type_info;
> -    int rc = 0;
> +    int rc = 0, ret = 0;
> 
>      ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2)));
> 
> @@ -2578,11 +2578,11 @@ static int __get_page_type(struct page_info *page, unsigned
> long type,
>          if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
>          {
>              if ( (x & PGT_type_mask) == PGT_writable_page )
> -                iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
> +                ret = iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
>              else if ( type == PGT_writable_page )
> -                iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
> -                               page_to_mfn(page),
> -                               IOMMUF_readable|IOMMUF_writable);
> +                ret = iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
> +                                     page_to_mfn(page),
> +                                     IOMMUF_readable|IOMMUF_writable);
>          }
>      }
> 
> @@ -2599,6 +2599,9 @@ static int __get_page_type(struct page_info *page, unsigned
> long type,
>      if ( (x & PGT_partial) && !(nx & PGT_partial) )
>          put_page(page);
> 
> +    if ( !rc )
> +        rc = ret;
> +
>      return rc;
>  }

I know there were quite a few discussions before around the above change
(sorry, I don't remember all of them). Based on my mental picture, we should
return the error where it first occurs. However, the above change favors an
error from the later "rc = alloc_page_type" over an earlier
iommu_map/unmap_page error. Is that what we want? If there is a reason that
we cannot return immediately upon an iommu_map/unmap error, it's more
reasonable to reverse the above check, like below:

	if ( !ret )
		ret = rc;

	return ret;

> 
> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> index 1ed5b47..df87944 100644
> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -821,6 +821,8 @@ out:
>      if ( needs_sync )
>          ept_sync_domain(p2m);
> 
> +    ret = 0;
> +
>      /* For host p2m, may need to change VT-d page table.*/
>      if ( rc == 0 && p2m_is_hostp2m(p2m) && need_iommu(d) &&
>           need_modify_vtd_table )
> @@ -831,11 +833,29 @@ out:
>          {
>              if ( iommu_flags )
>                  for ( i = 0; i < (1 << order); i++ )
> -                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
> +                {
> +                    rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
> +
> +                    if ( !ret && unlikely(rc) )

I think you should move the check of ret before iommu_map_page, since we
should stop mapping upon any error (including one from the best-effort
unmap side).

> +                    {
> +                        while ( i-- )
> +                            iommu_unmap_page(d, gfn + i);
> +
> +                        ret = rc;
> +                        break;
> +                    }
> +                }
>              else
>                  for ( i = 0; i < (1 << order); i++ )
> -                    iommu_unmap_page(d, gfn + i);
> +                {
> +                    rc = iommu_unmap_page(d, gfn + i);
> +
> +                    if ( !ret && unlikely(rc) )
> +                        ret = rc;
> +                }
>          }
> +
> +        rc = 0;
>      }
> 
Thanks
Kevin


* Re: [PATCH v3 07/10] IOMMU: propagate IOMMU Device-TLB flush error up to iommu_iotlb_flush{, _all} (leaf ones).
  2016-04-29  9:25 ` [PATCH v3 07/10] IOMMU: propagate IOMMU Device-TLB flush error up to iommu_iotlb_flush{, _all} (leaf ones) Quan Xu
@ 2016-05-04  1:54   ` Tian, Kevin
  0 siblings, 0 replies; 39+ messages in thread
From: Tian, Kevin @ 2016-05-04  1:54 UTC (permalink / raw)
  To: Xu, Quan, xen-devel
  Cc: dario.faggioli, Stefano Stabellini, Wu, Feng, Julien Grall, Jan Beulich

> From: Xu, Quan
> Sent: Friday, April 29, 2016 5:25 PM
> 
> Propagate the IOMMU Device-TLB flush error up to the iommu_iotlb_flush{,_all}.
> 
> This patch fixes the leaf ones.
> 
> Signed-off-by: Quan Xu <quan.xu@intel.com>
> 
> CC: Stefano Stabellini <sstabellini@kernel.org>
> CC: Julien Grall <julien.grall@arm.com>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Kevin Tian <kevin.tian@intel.com>
> CC: Feng Wu <feng.wu@intel.com>

Acked-by: Kevin Tian <kevin.tian@intel.com>


* Re: [PATCH v3 08/10] vt-d/ept: propagate IOMMU Device-TLB flush error up to EPT update.
  2016-04-29  9:25 ` [PATCH v3 08/10] vt-d/ept: propagate IOMMU Device-TLB flush error up to EPT update Quan Xu
@ 2016-05-04  1:56   ` Tian, Kevin
  0 siblings, 0 replies; 39+ messages in thread
From: Tian, Kevin @ 2016-05-04  1:56 UTC (permalink / raw)
  To: Xu, Quan, xen-devel
  Cc: Wu, Feng, Nakajima, Jun, George Dunlap, Andrew Cooper,
	dario.faggioli, Jan Beulich

> From: Xu, Quan
> Sent: Friday, April 29, 2016 5:25 PM
> 
> Propagate the IOMMU Device-TLB flush error up to the ept_set_entry(),
> when VT-d shares EPT page table.
> 
> Signed-off-by: Quan Xu <quan.xu@intel.com>
> 

Acked-by: Kevin Tian <kevin.tian@intel.com>


* Re: [PATCH v3 09/10] IOMMU: propagate IOMMU Device-TLB flush error up to IOMMU suspending
  2016-04-29  9:25 ` [PATCH v3 09/10] IOMMU: propagate IOMMU Device-TLB flush error up to IOMMU suspending Quan Xu
@ 2016-05-04  1:59   ` Tian, Kevin
  2016-05-04  2:14     ` Xu, Quan
  0 siblings, 1 reply; 39+ messages in thread
From: Tian, Kevin @ 2016-05-04  1:59 UTC (permalink / raw)
  To: Xu, Quan, xen-devel
  Cc: Stefano Stabellini, Keir Fraser, Jan Beulich, Liu Jinsong,
	dario.faggioli, Julien Grall, Suravee Suthikulpanit,
	Andrew Cooper, Wu, Feng

> From: Xu, Quan
> Sent: Friday, April 29, 2016 5:25 PM
> diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
> index 2885e31..9097333 100644
> --- a/xen/arch/x86/acpi/power.c
> +++ b/xen/arch/x86/acpi/power.c
> @@ -45,6 +45,8 @@ void do_suspend_lowlevel(void);
> 
>  static int device_power_down(void)
>  {
> +    int err;
> +
>      console_suspend();
> 
>      time_suspend();
> @@ -53,11 +55,22 @@ static int device_power_down(void)
> 
>      ioapic_suspend();
> 
> -    iommu_suspend();
> +    err = iommu_suspend();
> +
> +    if ( err )
> +        goto iommu_suspend_error;
> 
>      lapic_suspend();
> 
>      return 0;
> +
> + iommu_suspend_error:
> +    ioapic_resume();
> +    i8259A_resume();
> +    time_resume();
> +    console_resume();
> +
> +    return err;
>  }

Jan had a comment that it would be better to reuse device_power_up()...
looks like there is no change in this version.



* Re: [PATCH v3 10/10] vt-d: propagate error up to ME phantom function mapping and unmapping
  2016-04-29  9:25 ` [PATCH v3 10/10] vt-d: propagate error up to ME phantom function mapping and unmapping Quan Xu
@ 2016-05-04  2:02   ` Tian, Kevin
  2016-05-04  2:19     ` Xu, Quan
  0 siblings, 1 reply; 39+ messages in thread
From: Tian, Kevin @ 2016-05-04  2:02 UTC (permalink / raw)
  To: Xu, Quan, xen-devel; +Cc: dario.faggioli, Wu, Feng, Jan Beulich

> From: Xu, Quan
> Sent: Friday, April 29, 2016 5:25 PM
> diff --git a/xen/drivers/passthrough/vtd/iommu.c
> b/xen/drivers/passthrough/vtd/iommu.c
> index cf847ec..bfee299 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -1301,7 +1301,7 @@ int domain_context_mapping_one(
>      u64 maddr, pgd_maddr;
>      u16 seg = iommu->intel->drhd->segment;
>      int agaw;
> -    int rc;
> +    int rc, ret;
> 
>      ASSERT(pcidevs_locked());
>      spin_lock(&iommu->lock);
> @@ -1435,7 +1435,12 @@ int domain_context_mapping_one(
>      unmap_vtd_domain_page(context_entries);
> 
>      if ( !seg )
> -        me_wifi_quirk(domain, bus, devfn, MAP_ME_PHANTOM_FUNC);
> +    {
> +        ret = me_wifi_quirk(domain, bus, devfn, MAP_ME_PHANTOM_FUNC);
> +
> +        if ( !rc && unlikely(ret) )
> +            rc = ret;
> +    }

Similarly, to keep it simple, there is no need to check ret again.

> 
>      return rc;
>  }


Thanks
Kevin


* Re: [PATCH v3 09/10] IOMMU: propagate IOMMU Device-TLB flush error up to IOMMU suspending
  2016-05-04  1:59   ` Tian, Kevin
@ 2016-05-04  2:14     ` Xu, Quan
  2016-05-04  8:42       ` Jan Beulich
  0 siblings, 1 reply; 39+ messages in thread
From: Xu, Quan @ 2016-05-04  2:14 UTC (permalink / raw)
  To: Tian, Kevin, xen-devel
  Cc: Stefano Stabellini, Keir Fraser, Jan Beulich, Liu Jinsong,
	dario.faggioli, Julien Grall, Suravee Suthikulpanit,
	Andrew Cooper, Wu, Feng

On May 04, 2016 10:00 AM, Tian, Kevin <kevin.tian@intel.com> wrote:
> > From: Xu, Quan
> > Sent: Friday, April 29, 2016 5:25 PM
> > diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
> > index 2885e31..9097333 100644
> > --- a/xen/arch/x86/acpi/power.c
> > +++ b/xen/arch/x86/acpi/power.c
> > @@ -45,6 +45,8 @@ void do_suspend_lowlevel(void);
> >
> >  static int device_power_down(void)
> >  {
> > +    int err;
> > +
> >      console_suspend();
> >
> >      time_suspend();
> > @@ -53,11 +55,22 @@ static int device_power_down(void)
> >
> >      ioapic_suspend();
> >
> > -    iommu_suspend();
> > +    err = iommu_suspend();
> > +
> > +    if ( err )
> > +        goto iommu_suspend_error;
> >
> >      lapic_suspend();
> >
> >      return 0;
> > +
> > + iommu_suspend_error:
> > +    ioapic_resume();
> > +    i8259A_resume();
> > +    time_resume();
> > +    console_resume();
> > +
> > +    return err;
> >  }
> 
> Jan had comment to better reuse device_power_up... looks no change in this
> version.

Yes, IIUC, this may be an optimization, but not a must.
We can discuss this in detail in this version.

Quan


* Re: [PATCH v3 10/10] vt-d: propagate error up to ME phantom function mapping and unmapping
  2016-05-04  2:02   ` Tian, Kevin
@ 2016-05-04  2:19     ` Xu, Quan
  0 siblings, 0 replies; 39+ messages in thread
From: Xu, Quan @ 2016-05-04  2:19 UTC (permalink / raw)
  To: Tian, Kevin, xen-devel; +Cc: dario.faggioli, Wu, Feng, Jan Beulich

On May 04, 2016 10:02 AM, Tian, Kevin <kevin.tian@intel.com> wrote:
> > From: Xu, Quan
> > Sent: Friday, April 29, 2016 5:25 PM
> > diff --git a/xen/drivers/passthrough/vtd/iommu.c
> > b/xen/drivers/passthrough/vtd/iommu.c
> > index cf847ec..bfee299 100644
> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > @@ -1301,7 +1301,7 @@ int domain_context_mapping_one(
> >      u64 maddr, pgd_maddr;
> >      u16 seg = iommu->intel->drhd->segment;
> >      int agaw;
> > -    int rc;
> > +    int rc, ret;
> >
> >      ASSERT(pcidevs_locked());
> >      spin_lock(&iommu->lock);
> > @@ -1435,7 +1435,12 @@ int domain_context_mapping_one(
> >      unmap_vtd_domain_page(context_entries);
> >
> >      if ( !seg )
> > -        me_wifi_quirk(domain, bus, devfn, MAP_ME_PHANTOM_FUNC);
> > +    {
> > +        ret = me_wifi_quirk(domain, bus, devfn, MAP_ME_PHANTOM_FUNC);
> > +
> > +        if ( !rc && unlikely(ret) )
> > +            rc = ret;
> > +    }
> 
> similarly to make it simple no need to check ret again.

Agreed, I also hesitated to add this check. I'll drop the similar ret checks in this patch set.

Quan

> 
> >
> >      return rc;
> >  }



* Re: [PATCH v3 03/10] IOMMU/MMU: enhance the call trees of IOMMU unmapping and mapping
  2016-05-04  1:45   ` Tian, Kevin
@ 2016-05-04  8:40     ` Jan Beulich
  2016-05-04  9:13       ` Xu, Quan
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2016-05-04  8:40 UTC (permalink / raw)
  To: Kevin Tian
  Cc: Keir Fraser, Jun Nakajima, George Dunlap, Andrew Cooper,
	dario.faggioli, xen-devel, Quan Xu

>>> On 04.05.16 at 03:45, <kevin.tian@intel.com> wrote:
>>  From: Xu, Quan
>> Sent: Friday, April 29, 2016 5:25 PM
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -2467,7 +2467,7 @@ static int __get_page_type(struct page_info *page, unsigned
>> long type,
>>                             int preemptible)
>>  {
>>      unsigned long nx, x, y = page->u.inuse.type_info;
>> -    int rc = 0;
>> +    int rc = 0, ret = 0;
>> 
>>      ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2)));
>> 
>> @@ -2578,11 +2578,11 @@ static int __get_page_type(struct page_info *page, unsigned
>> long type,
>>          if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
>>          {
>>              if ( (x & PGT_type_mask) == PGT_writable_page )
>> -                iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
>> +                ret = iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
>>              else if ( type == PGT_writable_page )
>> -                iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
>> -                               page_to_mfn(page),
>> -                               IOMMUF_readable|IOMMUF_writable);
>> +                ret = iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
>> +                                     page_to_mfn(page),
>> +                                     IOMMUF_readable|IOMMUF_writable);
>>          }
>>      }
>> 
>> @@ -2599,6 +2599,9 @@ static int __get_page_type(struct page_info *page, unsigned
>> long type,
>>      if ( (x & PGT_partial) && !(nx & PGT_partial) )
>>          put_page(page);
>> 
>> +    if ( !rc )
>> +        rc = ret;
>> +
>>      return rc;
>>  }
> 
> I know there were quite some discussions before around above change (sorry I
> didn't remember all of them). Just based on mental picture we should return
> error where it firstly occurs. However above change looks favoring errors in
> later "rc = alloc_page_type" over earlier iommu_map/unmap_page error. Is it
> what we want?

Yes, as that's the primary operation here.

> If there is a reason that we cannot return immediately upon
> iommu_map/unmap,

Since for Dom0 we don't call domain_crash(), we must not bypass
alloc_page_type() here. And even for DomU it would seem at least
fragile if we did - we had better not alter the refcounting behavior.

>> --- a/xen/arch/x86/mm/p2m-ept.c
>> +++ b/xen/arch/x86/mm/p2m-ept.c
>> @@ -821,6 +821,8 @@ out:
>>      if ( needs_sync )
>>          ept_sync_domain(p2m);
>> 
>> +    ret = 0;
>> +
>>      /* For host p2m, may need to change VT-d page table.*/
>>      if ( rc == 0 && p2m_is_hostp2m(p2m) && need_iommu(d) &&
>>           need_modify_vtd_table )
>> @@ -831,11 +833,29 @@ out:
>>          {
>>              if ( iommu_flags )
>>                  for ( i = 0; i < (1 << order); i++ )
>> -                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
>> +                {
>> +                    rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
>> +
>> +                    if ( !ret && unlikely(rc) )
> 
> I think you should move check of ret before iommu_map_page, since we
> should stop map against any error (either from best-effort unmap error side).

Considering that ret gets set to zero ahead of the loop, plus ...

>> +                    {
>> +                        while ( i-- )
>> +                            iommu_unmap_page(d, gfn + i);
>> +
>> +                        ret = rc;
>> +                        break;

... this, it looks to me as if the checking of ret above is simply
unnecessary.

Jan



* Re: [PATCH v3 09/10] IOMMU: propagate IOMMU Device-TLB flush error up to IOMMU suspending
  2016-05-04  2:14     ` Xu, Quan
@ 2016-05-04  8:42       ` Jan Beulich
  2016-05-04  8:59         ` Xu, Quan
  2016-05-05 10:18         ` Xu, Quan
  0 siblings, 2 replies; 39+ messages in thread
From: Jan Beulich @ 2016-05-04  8:42 UTC (permalink / raw)
  To: Quan Xu
  Cc: Kevin Tian, Stefano Stabellini, Feng Wu, Liu Jinsong,
	dario.faggioli, xen-devel, Julien Grall, SuraveeSuthikulpanit,
	Andrew Cooper, Keir Fraser

>>> On 04.05.16 at 04:14, <quan.xu@intel.com> wrote:
> On May 04, 2016 10:00 AM, Tian, Kevin <kevin.tian@intel.com> wrote:
>> > From: Xu, Quan
>> > Sent: Friday, April 29, 2016 5:25 PM
>> > diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
>> > index 2885e31..9097333 100644
>> > --- a/xen/arch/x86/acpi/power.c
>> > +++ b/xen/arch/x86/acpi/power.c
>> > @@ -45,6 +45,8 @@ void do_suspend_lowlevel(void);
>> >
>> >  static int device_power_down(void)
>> >  {
>> > +    int err;
>> > +
>> >      console_suspend();
>> >
>> >      time_suspend();
>> > @@ -53,11 +55,22 @@ static int device_power_down(void)
>> >
>> >      ioapic_suspend();
>> >
>> > -    iommu_suspend();
>> > +    err = iommu_suspend();
>> > +
>> > +    if ( err )
>> > +        goto iommu_suspend_error;
>> >
>> >      lapic_suspend();
>> >
>> >      return 0;
>> > +
>> > + iommu_suspend_error:
>> > +    ioapic_resume();
>> > +    i8259A_resume();
>> > +    time_resume();
>> > +    console_resume();
>> > +
>> > +    return err;
>> >  }
>> 
>> Jan had comment to better reuse device_power_up... looks no change in this
>> version.
> 
> Yes, IIUC, this may be an optimization, but not a must.
> We can discuss this in detail in this version.

As an optimization it would indeed be quite pointless here. My
request was more for maintainability: by re-using the function, future
changes don't need to go to two places, and hence there's no risk of
one copy being forgotten.

Jan



* Re: [PATCH v3 09/10] IOMMU: propagate IOMMU Device-TLB flush error up to IOMMU suspending
  2016-05-04  8:42       ` Jan Beulich
@ 2016-05-04  8:59         ` Xu, Quan
  2016-05-05 10:18         ` Xu, Quan
  1 sibling, 0 replies; 39+ messages in thread
From: Xu, Quan @ 2016-05-04  8:59 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Stefano Stabellini, Wu, Feng, Liu Jinsong,
	dario.faggioli, xen-devel, Julien Grall, SuraveeSuthikulpanit,
	Andrew Cooper, Keir Fraser

On May 04, 2016 4:43 PM, Jan Beulich <JBeulich@suse.com> wrote:
> >>> On 04.05.16 at 04:14, <quan.xu@intel.com> wrote:
> > On May 04, 2016 10:00 AM, Tian, Kevin <kevin.tian@intel.com> wrote:
> >> > From: Xu, Quan
> >> > Sent: Friday, April 29, 2016 5:25 PM diff --git
> >> > a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c index
> >> > 2885e31..9097333 100644
> >> > --- a/xen/arch/x86/acpi/power.c
> >> > +++ b/xen/arch/x86/acpi/power.c
> >> > @@ -45,6 +45,8 @@ void do_suspend_lowlevel(void);
> >> >
> >> >  static int device_power_down(void)  {
> >> > +    int err;
> >> > +
> >> >      console_suspend();
> >> >
> >> >      time_suspend();
> >> > @@ -53,11 +55,22 @@ static int device_power_down(void)
> >> >
> >> >      ioapic_suspend();
> >> >
> >> > -    iommu_suspend();
> >> > +    err = iommu_suspend();
> >> > +
> >> > +    if ( err )
> >> > +        goto iommu_suspend_error;
> >> >
> >> >      lapic_suspend();
> >> >
> >> >      return 0;
> >> > +
> >> > + iommu_suspend_error:
> >> > +    ioapic_resume();
> >> > +    i8259A_resume();
> >> > +    time_resume();
> >> > +    console_resume();
> >> > +
> >> > +    return err;
> >> >  }
> >>
> >> Jan had comment to better reuse device_power_up... looks no change in
> >> this version.
> >
> > Yes, IIUC, this may be an optimization, but not a must.
> > We can discuss this in detail in this version.
> 
> As an optimization it would indeed be quite pointless here. My request was
> more for maintainability: By re-using the function future changes don't need
> to go to two places, and hence there's no risk of one of them getting
> forgotten.
> 

Got it, I will fix it in the next version (v4).

Quan


* Re: [PATCH v3 03/10] IOMMU/MMU: enhance the call trees of IOMMU unmapping and mapping
  2016-05-04  8:40     ` Jan Beulich
@ 2016-05-04  9:13       ` Xu, Quan
  2016-05-04  9:28         ` Jan Beulich
  0 siblings, 1 reply; 39+ messages in thread
From: Xu, Quan @ 2016-05-04  9:13 UTC (permalink / raw)
  To: Jan Beulich, Tian, Kevin
  Cc: Keir Fraser, George Dunlap, Andrew Cooper, dario.faggioli,
	xen-devel, Nakajima, Jun

On May 04, 2016 4:41 PM, Jan Beulich <JBeulich@suse.com> wrote:
> >>> On 04.05.16 at 03:45, <kevin.tian@intel.com> wrote:
> >>  From: Xu, Quan
> >> Sent: Friday, April 29, 2016 5:25 PM
> >> --- a/xen/arch/x86/mm.c
> >> +++ b/xen/arch/x86/mm.c
> >> @@ -2467,7 +2467,7 @@ static int __get_page_type(struct page_info
> >> *page, unsigned long type,
> >>                             int preemptible)  {
> >>      unsigned long nx, x, y = page->u.inuse.type_info;
> >> -    int rc = 0;
> >> +    int rc = 0, ret = 0;
> >>
> >>      ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2)));
> >>
> >> @@ -2578,11 +2578,11 @@ static int __get_page_type(struct page_info
> >> *page, unsigned long type,
> >>          if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
> >>          {
> >>              if ( (x & PGT_type_mask) == PGT_writable_page )
> >> -                iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
> >> +                ret = iommu_unmap_page(d, mfn_to_gmfn(d,
> >> + page_to_mfn(page)));
> >>              else if ( type == PGT_writable_page )
> >> -                iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
> >> -                               page_to_mfn(page),
> >> -                               IOMMUF_readable|IOMMUF_writable);
> >> +                ret = iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
> >> +                                     page_to_mfn(page),
> >> +
> >> + IOMMUF_readable|IOMMUF_writable);
> >>          }
> >>      }
> >>
> >> @@ -2599,6 +2599,9 @@ static int __get_page_type(struct page_info
> >> *page, unsigned long type,
> >>      if ( (x & PGT_partial) && !(nx & PGT_partial) )
> >>          put_page(page);
> >>
> >> +    if ( !rc )
> >> +        rc = ret;
> >> +
> >>      return rc;
> >>  }
> >
> > I know there were quite some discussions before around above change
> > (sorry I didn't remember all of them). Just based on mental picture we
> > should return error where it firstly occurs. However above change
> > looks favoring errors in later "rc = alloc_page_type" over earlier
> > iommu_map/unmap_page error. Is it what we want?
> 
> Yes, as that's the primary operation here.
> 
> > If there is a reason that we cannot return immediately upon
> > iommu_map/unmap,
> 
> Since for Dom0 we don't call domain_crash(), we must not bypass
> alloc_page_type() here. And even for DomU it would seem at least fragile if we
> did - we better don't alter the refcounting behavior.
> 

I'm a little bit confused.
To confirm: for this __get_page_type(), can I leave my modification as is?

> >> --- a/xen/arch/x86/mm/p2m-ept.c
> >> +++ b/xen/arch/x86/mm/p2m-ept.c
> >> @@ -821,6 +821,8 @@ out:
> >>      if ( needs_sync )
> >>          ept_sync_domain(p2m);
> >>
> >> +    ret = 0;
> >> +
> >>      /* For host p2m, may need to change VT-d page table.*/
> >>      if ( rc == 0 && p2m_is_hostp2m(p2m) && need_iommu(d) &&
> >>           need_modify_vtd_table )
> >> @@ -831,11 +833,29 @@ out:
> >>          {
> >>              if ( iommu_flags )
> >>                  for ( i = 0; i < (1 << order); i++ )
> >> -                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
> >> +                {
> >> +                    rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i,
> >> + iommu_flags);
> >> +
> >> +                    if ( !ret && unlikely(rc) )
> >
> > I think you should move check of ret before iommu_map_page, since we
> > should stop map against any error (either from best-effort unmap error
> side).
> 
> Considering ret getting set to zero ahead of the loop plus ...
> 
> >> +                    {
> >> +                        while ( i-- )
> >> +                            iommu_unmap_page(d, gfn + i);
> >> +
> >> +                        ret = rc;
> >> +                        break;
> 
> ... this, it looks to me as if the checking of ret above is simply unnecessary.
> 

Makes sense. I'll drop the ret check.

Quan




* Re: [PATCH v3 03/10] IOMMU/MMU: enhance the call trees of IOMMU unmapping and mapping
  2016-05-04  9:13       ` Xu, Quan
@ 2016-05-04  9:28         ` Jan Beulich
  2016-05-05  0:38           ` Tian, Kevin
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2016-05-04  9:28 UTC (permalink / raw)
  To: Quan Xu
  Cc: Kevin Tian, Keir Fraser, George Dunlap, Andrew Cooper,
	dario.faggioli, xen-devel, Jun Nakajima

>>> On 04.05.16 at 11:13, <quan.xu@intel.com> wrote:
> On May 04, 2016 4:41 PM, Jan Beulich <JBeulich@suse.com> wrote:
>> >>> On 04.05.16 at 03:45, <kevin.tian@intel.com> wrote:
>> >>  From: Xu, Quan
>> >> Sent: Friday, April 29, 2016 5:25 PM
>> >> --- a/xen/arch/x86/mm.c
>> >> +++ b/xen/arch/x86/mm.c
>> >> @@ -2467,7 +2467,7 @@ static int __get_page_type(struct page_info
>> >> *page, unsigned long type,
>> >>                             int preemptible)  {
>> >>      unsigned long nx, x, y = page->u.inuse.type_info;
>> >> -    int rc = 0;
>> >> +    int rc = 0, ret = 0;
>> >>
>> >>      ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2)));
>> >>
>> >> @@ -2578,11 +2578,11 @@ static int __get_page_type(struct page_info
>> >> *page, unsigned long type,
>> >>          if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
>> >>          {
>> >>              if ( (x & PGT_type_mask) == PGT_writable_page )
>> >> -                iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
>> >> +                ret = iommu_unmap_page(d, mfn_to_gmfn(d,
>> >> + page_to_mfn(page)));
>> >>              else if ( type == PGT_writable_page )
>> >> -                iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
>> >> -                               page_to_mfn(page),
>> >> -                               IOMMUF_readable|IOMMUF_writable);
>> >> +                ret = iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
>> >> +                                     page_to_mfn(page),
>> >> +
>> >> + IOMMUF_readable|IOMMUF_writable);
>> >>          }
>> >>      }
>> >>
>> >> @@ -2599,6 +2599,9 @@ static int __get_page_type(struct page_info
>> >> *page, unsigned long type,
>> >>      if ( (x & PGT_partial) && !(nx & PGT_partial) )
>> >>          put_page(page);
>> >>
>> >> +    if ( !rc )
>> >> +        rc = ret;
>> >> +
>> >>      return rc;
>> >>  }
>> >
>> > I know there were quite some discussions before around above change
>> > (sorry I didn't remember all of them). Just based on mental picture we
>> > should return error where it firstly occurs. However above change
>> > looks favoring errors in later "rc = alloc_page_type" over earlier
>> > iommu_map/unmap_page error. Is it what we want?
>> 
>> Yes, as that's the primary operation here.
>> 
>> > If there is a reason that we cannot return immediately upon
>> > iommu_map/unmap,
>> 
>> Since for Dom0 we don't call domain_crash(), we must not bypass
>> alloc_page_type() here. And even for DomU it would seem at least fragile if we
>> did - we better don't alter the refcounting behavior.
>> 
> 
> I'm a little bit confused.
> To confirm: for this __get_page_type(), can I leave my modification as is?

Yes. I merely tried to explain the reason to Kevin.

Jan



* Re: [PATCH v3 02/10] IOMMU: handle IOMMU mapping and unmapping failures
  2016-05-04  1:28   ` Tian, Kevin
@ 2016-05-04 11:49     ` Xu, Quan
  2016-05-04 13:44       ` Jan Beulich
  0 siblings, 1 reply; 39+ messages in thread
From: Xu, Quan @ 2016-05-04 11:49 UTC (permalink / raw)
  To: Tian, Kevin, xen-devel, Jan Beulich; +Cc: dario.faggioli

On May 04, 2016 9:29 AM, Tian, Kevin <kevin.tian@intel.com> wrote:
> > From: Quan Xu
> > Sent: Friday, April 29, 2016 5:25 PM
> >
> > Treat IOMMU mapping and unmapping failures as fatal to the domain
> > (with the exception of the hardware domain).
> >
> > If IOMMU mapping and unmapping failed, crash the domain (with the
> > exception of the hardware domain) and propagate the error up to the
> > call trees.
> >
> > Signed-off-by: Quan Xu <quan.xu@intel.com>
> > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> >
> > CC: Jan Beulich <jbeulich@suse.com>
> > ---
> >  xen/drivers/passthrough/iommu.c | 30 ++++++++++++++++++++++++++++--
> >  1 file changed, 28 insertions(+), 2 deletions(-)
> >
> > diff --git a/xen/drivers/passthrough/iommu.c
> > b/xen/drivers/passthrough/iommu.c index b64676f..a0003ac 100644
> > --- a/xen/drivers/passthrough/iommu.c
> > +++ b/xen/drivers/passthrough/iommu.c
> > @@ -243,21 +243,47 @@ int iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
> >                     unsigned int flags)  {
> >      struct hvm_iommu *hd = domain_hvm_iommu(d);
> > +    int rc;
> >
> >      if ( !iommu_enabled || !hd->platform_ops )
> >          return 0;
> >
> > -    return hd->platform_ops->map_page(d, gfn, mfn, flags);
> > +    rc = hd->platform_ops->map_page(d, gfn, mfn, flags);
> > +
> > +    if ( rc )
> > +    {
> > +        if ( is_hardware_domain(d) )
> > +            printk(XENLOG_ERR
> > +                   "iommu_map_page: IOMMU mapping gfn %#lx mfn %#lx failed for dom%d.",
> > +                   gfn, mfn, d->domain_id);
> > +        else
> > +            domain_crash(d);
> > +    }
> 
> It makes sense to print error messages for all domains, and then selectively
> crash domain:
> 
> printk(XENLOG_ERR ...);
> if ( !is_hardware_domain(d) )
> 	domain_crash(d);
> 

But Jan said,
..
    if ( is_hardware_domain() )
        printk();
    else
        domain_crash();
..
is the better approach.


I remain neutral on this point. I think this is not a technical problem, but a matter of preference.
The difference comes down to preventing spamming of the log.

For iommu_map_page(), I think Kevin's suggestion is better: it gives much more information about the crashed domain,
and it also won't spam the log, as we stop mapping on any error.

For iommu_unmap_page(), IOMMU unmapping should perhaps continue despite an error, in an attempt
to do best-effort cleanup. In that case, Kevin's suggestion may spam the log.


Quan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 01/10] vt-d: fix the IOMMU flush issue
  2016-05-04  1:26   ` Tian, Kevin
@ 2016-05-04 12:09     ` Xu, Quan
  2016-05-04 13:51       ` Jan Beulich
  0 siblings, 1 reply; 39+ messages in thread
From: Xu, Quan @ 2016-05-04 12:09 UTC (permalink / raw)
  To: Tian, Kevin, xen-devel
  Cc: Keir Fraser, dario.faggioli, Wu, Feng, Jan Beulich, Andrew Cooper

On May 04, 2016 9:26 AM, Tian, Kevin <kevin.tian@intel.com> wrote:
> > From: Xu, Quan
> > Sent: Friday, April 29, 2016 5:25 PM
> >
> > The propagation value from IOMMU flush interfaces may be positive,
> > which indicates that callers need to flush the cache; it is not a failure.
> >
> > When the propagation value is positive, this patch fixes this flush
> > issue as follows:
> >   - call iommu_flush_write_buffer() to flush cache.
> >   - return zero.
> >
> > Signed-off-by: Quan Xu <quan.xu@intel.com>
> >
> > CC: Kevin Tian <kevin.tian@intel.com>
> > CC: Feng Wu <feng.wu@intel.com>
> > CC: Keir Fraser <keir@xen.org>
> > CC: Jan Beulich <jbeulich@suse.com>
> > CC: Andrew Cooper <andrew.cooper3@citrix.com>
> > ---
> >  xen/drivers/passthrough/vtd/iommu.c | 94 +++++++++++++++++++++++++------------
> >  xen/include/asm-x86/iommu.h         |  2 +-
> >  2 files changed, 64 insertions(+), 32 deletions(-)
> >
> > diff --git a/xen/drivers/passthrough/vtd/iommu.c
> > b/xen/drivers/passthrough/vtd/iommu.c
> > index 5ad25dc..6e2e43a 100644
> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > @@ -558,14 +558,16 @@ static void iommu_flush_all(void)
> >      }
> >  }
> >
> > -static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
> > -        int dma_old_pte_present, unsigned int page_count)
> > +static int iommu_flush_iotlb(struct domain *d, unsigned long gfn,
> > +                             bool_t dma_old_pte_present,
> > +                             unsigned int page_count)
> >  {
> >      struct hvm_iommu *hd = domain_hvm_iommu(d);
> >      struct acpi_drhd_unit *drhd;
> >      struct iommu *iommu;
> >      int flush_dev_iotlb;
> >      int iommu_domid;
> > +    int rc, ret = 0;
> >
> >      /*
> >       * No need pcideves_lock here because we have flush
> > @@ -584,29 +586,35 @@ static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
> >              continue;
> >
> >          if ( page_count != 1 || gfn == INVALID_GFN )
> > -        {
> > -            if ( iommu_flush_iotlb_dsi(iommu, iommu_domid,
> > -                        0, flush_dev_iotlb) )
> > -                iommu_flush_write_buffer(iommu);
> > -        }
> > +            rc = iommu_flush_iotlb_dsi(iommu, iommu_domid,
> > +                                       0, flush_dev_iotlb);
> >          else
> > +            rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
> > +                                       (paddr_t)gfn << PAGE_SHIFT_4K,
> > +                                       PAGE_ORDER_4K,
> > +                                       !dma_old_pte_present,
> > +                                       flush_dev_iotlb);
> > +
> > +        if ( rc > 0 )
> >          {
> > -            if ( iommu_flush_iotlb_psi(iommu, iommu_domid,
> > -                        (paddr_t)gfn << PAGE_SHIFT_4K, PAGE_ORDER_4K,
> > -                        !dma_old_pte_present, flush_dev_iotlb) )
> > -                iommu_flush_write_buffer(iommu);
> > +            iommu_flush_write_buffer(iommu);
> > +            ret = 0;
> 
> You don't need 'ret' here. Just change 'rc' to 0.
>

Agreed, this is really a good idea.
 
> >          }
> > +        else if ( rc < 0 )
> > +            ret = rc;
> 
> Then this change is not required
> 

Ditto.

> >      }
> > +
> > +    return ret;
> 
> Then return 'rc' instead.
> 

Ditto.

> >  }
> >
> >  static void intel_iommu_iotlb_flush(struct domain *d, unsigned long
> > gfn, unsigned int
> > page_count)
> 
> could we remove "Intel_" prefix completely? You can rename this one to
> iommu_flush_iotlb_page...
>

Sure, I am OK with that.
Though I wonder why we should remove the "intel_" prefix.
In xen/drivers/passthrough/vtd/iommu.c, most functions begin with "intel_", as they are Intel-specific.
In xen/drivers/passthrough/iommu.c, most functions begin with "iommu_", as they are common code.

Quan


 
> >  {
> > -    __intel_iommu_iotlb_flush(d, gfn, 1, page_count);
> > +    iommu_flush_iotlb(d, gfn, 1, page_count);
> >  }
> >
> >  static void intel_iommu_iotlb_flush_all(struct domain *d)
> 
> and this one just iommu_flush_iotlb_all
> 
> >  {
> > -    __intel_iommu_iotlb_flush(d, INVALID_GFN, 0, 0);
> > +    iommu_flush_iotlb(d, INVALID_GFN, 0, 0);
> >  }
> >
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 02/10] IOMMU: handle IOMMU mapping and unmapping failures
  2016-05-04 11:49     ` Xu, Quan
@ 2016-05-04 13:44       ` Jan Beulich
  2016-05-05  1:47         ` Xu, Quan
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2016-05-04 13:44 UTC (permalink / raw)
  To: Kevin Tian, Quan Xu; +Cc: dario.faggioli, xen-devel

>>> On 04.05.16 at 13:49, <quan.xu@intel.com> wrote:
> On May 04, 2016 9:29 AM, Tian, Kevin <kevin.tian@intel.com> wrote:
>> > From: Quan Xu
>> > Sent: Friday, April 29, 2016 5:25 PM
>> >
>> > Treat IOMMU mapping and unmapping failures as fatal to the domain
>> > (with the exception of the hardware domain).
>> >
>> > If IOMMU mapping and unmapping failed, crash the domain (with the
>> > exception of the hardware domain) and propagate the error up to the
>> > call trees.
>> >
>> > Signed-off-by: Quan Xu <quan.xu@intel.com>
>> > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
>> >
>> > CC: Jan Beulich <jbeulich@suse.com>
>> > ---
>> >  xen/drivers/passthrough/iommu.c | 30 ++++++++++++++++++++++++++++--
>> >  1 file changed, 28 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/xen/drivers/passthrough/iommu.c
>> > b/xen/drivers/passthrough/iommu.c index b64676f..a0003ac 100644
>> > --- a/xen/drivers/passthrough/iommu.c
>> > +++ b/xen/drivers/passthrough/iommu.c
>> > @@ -243,21 +243,47 @@ int iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
>> >                     unsigned int flags)  {
>> >      struct hvm_iommu *hd = domain_hvm_iommu(d);
>> > +    int rc;
>> >
>> >      if ( !iommu_enabled || !hd->platform_ops )
>> >          return 0;
>> >
>> > -    return hd->platform_ops->map_page(d, gfn, mfn, flags);
>> > +    rc = hd->platform_ops->map_page(d, gfn, mfn, flags);
>> > +
>> > +    if ( rc )
>> > +    {
>> > +        if ( is_hardware_domain(d) )
>> > +            printk(XENLOG_ERR
>> > +                   "iommu_map_page: IOMMU mapping gfn %#lx mfn %#lx failed for dom%d.",
>> > +                   gfn, mfn, d->domain_id);
>> > +        else
>> > +            domain_crash(d);
>> > +    }
>> 
>> It makes sense to print error messages for all domains, and then selectively
>> crash domain:
>> 
>> printk(XENLOG_ERR ...);
>> if ( !is_hardware_domain(d) )
>> 	domain_crash(d);
>> 
> 
> But Jan said,
> ..
>     if ( is_hardware_domain() )
>         printk();
>     else
>         domain_crash();
> ..
> is the better approach.

Not exactly. All I complained about was that the Dom0 case went
completely silently.

> I remain neutral for this point. I think this is not a technical problem, 
> but a matter of preference.

Indeed.

> This gap is subject to preventing spamming the log. 
> 
> For iommu_map_page(), I think Kevin's suggestion is better, much more 
> information for the crash domain,
> And also won't spam the log as we stop mapping against any error.
> 
> For iommu_unmap_page(),IOMMU unmapping should perhaps continue despite an 
> error, in an attempt
> to do best effort cleanup. Then, Kevin's suggestion may spam the log.

But I've always been saying that for multiple successive operations
you shouldn't issue one message each.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 03/10] IOMMU/MMU: enhance the call trees of IOMMU unmapping and mapping
  2016-04-29  9:25 ` [PATCH v3 03/10] IOMMU/MMU: enhance the call trees of IOMMU unmapping and mapping Quan Xu
  2016-05-04  1:45   ` Tian, Kevin
@ 2016-05-04 13:48   ` George Dunlap
  2016-05-05  6:53     ` Xu, Quan
  1 sibling, 1 reply; 39+ messages in thread
From: George Dunlap @ 2016-05-04 13:48 UTC (permalink / raw)
  To: Quan Xu
  Cc: Kevin Tian, Keir Fraser, Jun Nakajima, Andrew Cooper,
	Dario Faggioli, xen-devel, Jan Beulich

On Fri, Apr 29, 2016 at 10:25 AM, Quan Xu <quan.xu@intel.com> wrote:
> When IOMMU mapping fails, we issue a best effort rollback, stopping
> IOMMU mapping, unmapping the previous IOMMU maps and then reporting the
> error up to the call trees. When rollback is not feasible (in early
> initialization phase or trade-off of complexity) for the hardware domain,
> we do things on a best effort basis, only throwing out an error message.
>
> IOMMU unmapping should perhaps continue despite an error, in an attempt
> to do best effort cleanup.
>
> Signed-off-by: Quan Xu <quan.xu@intel.com>
>
> CC: Keir Fraser <keir@xen.org>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> CC: Jun Nakajima <jun.nakajima@intel.com>
> CC: Kevin Tian <kevin.tian@intel.com>
> CC: George Dunlap <george.dunlap@eu.citrix.com>
> ---
>  xen/arch/x86/mm.c               | 13 ++++++++-----
>  xen/arch/x86/mm/p2m-ept.c       | 27 +++++++++++++++++++++++++--
>  xen/arch/x86/mm/p2m-pt.c        | 24 ++++++++++++++++++++----
>  xen/arch/x86/mm/p2m.c           | 11 +++++++++--
>  xen/drivers/passthrough/iommu.c | 14 +++++++++++++-
>  5 files changed, 75 insertions(+), 14 deletions(-)
>
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index a42097f..427097d 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -2467,7 +2467,7 @@ static int __get_page_type(struct page_info *page, unsigned long type,
>                             int preemptible)
>  {
>      unsigned long nx, x, y = page->u.inuse.type_info;
> -    int rc = 0;
> +    int rc = 0, ret = 0;
>
>      ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2)));
>
> @@ -2578,11 +2578,11 @@ static int __get_page_type(struct page_info *page, unsigned long type,
>          if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
>          {
>              if ( (x & PGT_type_mask) == PGT_writable_page )
> -                iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
> +                ret = iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
>              else if ( type == PGT_writable_page )
> -                iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
> -                               page_to_mfn(page),
> -                               IOMMUF_readable|IOMMUF_writable);
> +                ret = iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
> +                                     page_to_mfn(page),
> +                                     IOMMUF_readable|IOMMUF_writable);
>          }
>      }
>
> @@ -2599,6 +2599,9 @@ static int __get_page_type(struct page_info *page, unsigned long type,
>      if ( (x & PGT_partial) && !(nx & PGT_partial) )
>          put_page(page);
>
> +    if ( !rc )
> +        rc = ret;
> +
>      return rc;
>  }
>
> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> index 1ed5b47..df87944 100644
> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -821,6 +821,8 @@ out:
>      if ( needs_sync )
>          ept_sync_domain(p2m);
>
> +    ret = 0;
> +
>      /* For host p2m, may need to change VT-d page table.*/
>      if ( rc == 0 && p2m_is_hostp2m(p2m) && need_iommu(d) &&
>           need_modify_vtd_table )
> @@ -831,11 +833,29 @@ out:
>          {
>              if ( iommu_flags )
>                  for ( i = 0; i < (1 << order); i++ )
> -                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
> +                {
> +                    rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
> +
> +                    if ( !ret && unlikely(rc) )
> +                    {
> +                        while ( i-- )
> +                            iommu_unmap_page(d, gfn + i);
> +
> +                        ret = rc;
> +                        break;
> +                    }
> +                }
>              else
>                  for ( i = 0; i < (1 << order); i++ )
> -                    iommu_unmap_page(d, gfn + i);
> +                {
> +                    rc = iommu_unmap_page(d, gfn + i);
> +
> +                    if ( !ret && unlikely(rc) )
> +                        ret = rc;
> +                }
>          }
> +
> +        rc = 0;
>      }

So the reason for doing this thing with setting ret=0, then using rc,
then setting rc to zero, is to make sure that the change is propagated
to the altp2m if necessary?

I'm not a fan of this sort of "implied signalling".  The check at the
altp2m conditional is meant to say, "everything went fine, go ahead
and propagate the change".  But with your changes, that's not what we
want anymore -- we want, "If the change actually made it into the
hostp2m, propagate it to the altp2m."

As such, I think it would be a lot clearer to just make a new boolean
variable, maybe "entry_written", which we initialize to false and then
set to true when the entry is actually written; and then change the
condition re the altp2m to "if ( entry_written && p2m_is_hostp2m(..)
)".

Then I think you could make the ret / rc use mirror what you do
elsewhere, where ret is used to store the return value of the function
call, and it's propagated to rc only if rc is not already set.

> @@ -680,11 +680,27 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
>          }
>          else if ( iommu_pte_flags )
>              for ( i = 0; i < (1UL << page_order); i++ )
> -                iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i,
> -                               iommu_pte_flags);
> +            {
> +                ret = iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i,
> +                                    iommu_pte_flags);
> +
> +                if ( !rc && unlikely(ret) )
> +                {
> +                    while ( i-- )
> +                        iommu_unmap_page(p2m->domain, gfn + i);
> +
> +                    rc = ret;
> +                    break;
> +                }

Hmm, is this conditional here right?  What the code actually says to
do is to always map pages, but to only unmap them on an error if there
have been no other errors in the function so far.

As it happens, at the moment rc cannot be non-zero here since any time
it's non-zero the code jumps down to the 'out' label, skipping this.
But if that ever changed, this would fail to unmap when it probably
should.

It seems like this should be:

if ( unlikely(ret) )
{
  while ( i-- )
    iommu_unmap_page(p2m->domain, gfn + i);
  if ( !rc )
    rc = ret;
  break;
}

Or if you want to assume that rc will never be zero, then put an
ASSERT() just before the for loop here.

Everything else looks good to me.  Thanks again for your work on this.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 01/10] vt-d: fix the IOMMU flush issue
  2016-05-04 12:09     ` Xu, Quan
@ 2016-05-04 13:51       ` Jan Beulich
  2016-05-04 15:03         ` Xu, Quan
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2016-05-04 13:51 UTC (permalink / raw)
  To: Quan Xu
  Cc: Kevin Tian, Keir Fraser, Andrew Cooper, dario.faggioli,
	xen-devel, Feng Wu

>>> On 04.05.16 at 14:09, <quan.xu@intel.com> wrote:
> On May 04, 2016 9:26 AM, Tian, Kevin <kevin.tian@intel.com> wrote:
>> > From: Xu, Quan
>> > Sent: Friday, April 29, 2016 5:25 PM
>> >  static void intel_iommu_iotlb_flush(struct domain *d, unsigned long
>> > gfn, unsigned int
>> > page_count)
>> 
>> could we remove "Intel_" prefix completely? You can rename this one to
>> iommu_flush_iotlb_page...
>>
> 
> Sure, I am ok.
> I wonder why to remove "Intel_" prefix.
> In this xen/drivers/passthrough/vtd/iommu.c file, most of functions are 
> beginning with "intel_" as intel specific.
> In xen/drivers/passthrough/iommu.c file, most of functions are beginning 
> with 'iommu_' as common part.

For non-static functions I'd suggest to keep the prefix (or use
vtd_ alternatively). For static functions there's no need to
disambiguate them, and hence an intel_ prefix is just mostly
useless baggage.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 01/10] vt-d: fix the IOMMU flush issue
  2016-05-04 13:51       ` Jan Beulich
@ 2016-05-04 15:03         ` Xu, Quan
  0 siblings, 0 replies; 39+ messages in thread
From: Xu, Quan @ 2016-05-04 15:03 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Keir Fraser, Andrew Cooper, dario.faggioli,
	xen-devel, Wu, Feng

On May 04, 2016 9:52 PM, Jan Beulich <JBeulich@suse.com> wrote:
> >>> On 04.05.16 at 14:09, <quan.xu@intel.com> wrote:
> > On May 04, 2016 9:26 AM, Tian, Kevin <kevin.tian@intel.com> wrote:
> >> > From: Xu, Quan
> >> > Sent: Friday, April 29, 2016 5:25 PM
> >> >  static void intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count)
> >>
> >> could we remove "Intel_" prefix completely? You can rename this one
> >> to iommu_flush_iotlb_page...
> >>
> >
> > Sure, I am ok.
> > I wonder why to remove "Intel_" prefix.
> > In this xen/drivers/passthrough/vtd/iommu.c file, most of functions
> > are beginning with "intel_" as intel specific.
> > In xen/drivers/passthrough/iommu.c file, most of functions are
> > beginning with 'iommu_' as common part.
> 
> For non-static functions I'd suggest to keep the prefix (or use vtd_
> alternatively). For static functions there's no need to disambiguate them, and
> hence an intel_ prefix is just mostly useless baggage.
> 

Thanks. The above two are both static functions, so I will follow Kevin's suggestion.
Quan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 03/10] IOMMU/MMU: enhance the call trees of IOMMU unmapping and mapping
  2016-05-04  9:28         ` Jan Beulich
@ 2016-05-05  0:38           ` Tian, Kevin
  0 siblings, 0 replies; 39+ messages in thread
From: Tian, Kevin @ 2016-05-05  0:38 UTC (permalink / raw)
  To: Jan Beulich, Xu, Quan
  Cc: Keir Fraser, George Dunlap, Andrew Cooper, dario.faggioli,
	xen-devel, Nakajima, Jun

> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Wednesday, May 04, 2016 5:29 PM
> 
> >>> On 04.05.16 at 11:13, <quan.xu@intel.com> wrote:
> > On May 04, 2016 4:41 PM, Jan Beulich <JBeulich@suse.com> wrote:
> >> >>> On 04.05.16 at 03:45, <kevin.tian@intel.com> wrote:
> >> >>  From: Xu, Quan
> >> >> Sent: Friday, April 29, 2016 5:25 PM
> >> >> --- a/xen/arch/x86/mm.c
> >> >> +++ b/xen/arch/x86/mm.c
> >> >> @@ -2467,7 +2467,7 @@ static int __get_page_type(struct page_info
> >> >> *page, unsigned long type,
> >> >>                             int preemptible)  {
> >> >>      unsigned long nx, x, y = page->u.inuse.type_info;
> >> >> -    int rc = 0;
> >> >> +    int rc = 0, ret = 0;
> >> >>
> >> >>      ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2)));
> >> >>
> >> >> @@ -2578,11 +2578,11 @@ static int __get_page_type(struct page_info
> >> >> *page, unsigned long type,
> >> >>          if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
> >> >>          {
> >> >>              if ( (x & PGT_type_mask) == PGT_writable_page )
> >> >> -                iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
> >> >> +                ret = iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
> >> >>              else if ( type == PGT_writable_page )
> >> >> -                iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
> >> >> -                               page_to_mfn(page),
> >> >> -                               IOMMUF_readable|IOMMUF_writable);
> >> >> +                ret = iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
> >> >> +                                     page_to_mfn(page),
> >> >> +                                     IOMMUF_readable|IOMMUF_writable);
> >> >>          }
> >> >>      }
> >> >>
> >> >> @@ -2599,6 +2599,9 @@ static int __get_page_type(struct page_info
> >> >> *page, unsigned long type,
> >> >>      if ( (x & PGT_partial) && !(nx & PGT_partial) )
> >> >>          put_page(page);
> >> >>
> >> >> +    if ( !rc )
> >> >> +        rc = ret;
> >> >> +
> >> >>      return rc;
> >> >>  }
> >> >
> >> > I know there were quite some discussions before around above change
> >> > (sorry I didn't remember all of them). Just based on mental picture we
> >> > should return error where it firstly occurs. However above change
> >> > looks favoring errors in later "rc = alloc_page_type" over earlier
> >> > iommu_map/unmap_page error. Is it what we want?
> >>
> >> Yes, as that's the primary operation here.
> >>
> >> > If there is a reason that we cannot return immediately upon
> >> > iommu_map/unmap,
> >>
> >> Since for Dom0 we don't call domain_crash(), we must not bypass
> >> alloc_page_type() here. And even for DomU it would seem at least fragile if we
> >> did - we better don't alter the refcounting behavior.
> >>
> >
> > A little bit confused.
> > Check it,  for this __get_page_type(), can I leave my modification as is?
> 
> Yes. I merely tried to explain the reason to Kevin.
> 

Understand now.

Thanks
Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 02/10] IOMMU: handle IOMMU mapping and unmapping failures
  2016-05-04 13:44       ` Jan Beulich
@ 2016-05-05  1:47         ` Xu, Quan
  0 siblings, 0 replies; 39+ messages in thread
From: Xu, Quan @ 2016-05-05  1:47 UTC (permalink / raw)
  To: Jan Beulich, Tian, Kevin; +Cc: dario.faggioli, xen-devel

On May 04, 2016 9:44 PM, Jan Beulich <JBeulich@suse.com> wrote:
> >>> On 04.05.16 at 13:49, <quan.xu@intel.com> wrote:
> > On May 04, 2016 9:29 AM, Tian, Kevin <kevin.tian@intel.com> wrote:
> >> > From: Quan Xu
> >> > Sent: Friday, April 29, 2016 5:25 PM
> >> >
> >> > Treat IOMMU mapping and unmapping failures as fatal to the domain
> >> > (with the exception of the hardware domain).
> >> >
> >> > If IOMMU mapping and unmapping failed, crash the domain (with the
> >> > exception of the hardware domain) and propagate the error up to the
> >> > call trees.
> >> >
> >> > Signed-off-by: Quan Xu <quan.xu@intel.com>
> >> > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> >> >
> >> > CC: Jan Beulich <jbeulich@suse.com>
> >> > ---
> >> >  xen/drivers/passthrough/iommu.c | 30 ++++++++++++++++++++++++++++--
> >> >  1 file changed, 28 insertions(+), 2 deletions(-)
> >> >
> >> > diff --git a/xen/drivers/passthrough/iommu.c
> >> > b/xen/drivers/passthrough/iommu.c index b64676f..a0003ac 100644
> >> > --- a/xen/drivers/passthrough/iommu.c
> >> > +++ b/xen/drivers/passthrough/iommu.c
> >> > @@ -243,21 +243,47 @@ int iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
> >> >                     unsigned int flags)  {
> >> >      struct hvm_iommu *hd = domain_hvm_iommu(d);
> >> > +    int rc;
> >> >
> >> >      if ( !iommu_enabled || !hd->platform_ops )
> >> >          return 0;
> >> >
> >> > -    return hd->platform_ops->map_page(d, gfn, mfn, flags);
> >> > +    rc = hd->platform_ops->map_page(d, gfn, mfn, flags);
> >> > +
> >> > +    if ( rc )
> >> > +    {
> >> > +        if ( is_hardware_domain(d) )
> >> > +            printk(XENLOG_ERR
> >> > +                   "iommu_map_page: IOMMU mapping gfn %#lx mfn %#lx failed for dom%d.",
> >> > +                   gfn, mfn, d->domain_id);
> >> > +        else
> >> > +            domain_crash(d);
> >> > +    }
> >>
> >> It makes sense to print error messages for all domains, and then
> >> selectively crash domain:
> >>
> >> printk(XENLOG_ERR ...);
> >> if ( !is_hardware_domain(d) )
> >> 	domain_crash(d);
> >>
> >
> > But Jan said,
> > ..
> >     if ( is_hardware_domain() )
> >         printk();
> >     else
> >         domain_crash();
> > ..
> > is the better approach.
> 
> Not exactly. All I complained about was that the Dom0 case went completely
> silently.
> 
> > I remain neutral for this point. I think this is not a technical
> > problem, but a matter of preference.
> 
> Indeed.
>

If there are no other suggestions, I'll take Kevin's approach as the final conclusion for both iommu_map_page() and iommu_unmap_page().
 

Quan

> > This gap is subject to preventing spamming the log.
> >
> > For iommu_map_page(), I think Kevin's suggestion is better, much more
> > information for the crash domain, And also won't spam the log as we
> > stop mapping against any error.
> >
> > For iommu_unmap_page(), IOMMU unmapping should perhaps continue despite
> > an error, in an attempt to do best effort cleanup. Then, Kevin's
> > suggestion may spam the log.
> 
> But I've always been saying that for multiple successive operations you
> shouldn't issue one message each.


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 03/10] IOMMU/MMU: enhance the call trees of IOMMU unmapping and mapping
  2016-05-04 13:48   ` George Dunlap
@ 2016-05-05  6:53     ` Xu, Quan
  2016-05-05 11:02       ` George Dunlap
  0 siblings, 1 reply; 39+ messages in thread
From: Xu, Quan @ 2016-05-05  6:53 UTC (permalink / raw)
  To: George Dunlap, Tian, Kevin, Jan Beulich
  Cc: Andrew Cooper, Dario Faggioli, Keir Fraser, Nakajima, Jun, xen-devel

On May 04, 2016 9:49 PM, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
> On Fri, Apr 29, 2016 at 10:25 AM, Quan Xu <quan.xu@intel.com> wrote:
> > When IOMMU mapping fails, we issue a best effort rollback,
> > stopping IOMMU mapping, unmapping the previous IOMMU maps and then
> > reporting the error up to the call trees. When rollback is not
> > feasible (in early initialization phase or trade-off of complexity)
> > for the hardware domain, we do things on a best effort basis, only
> > throwing out an error message.
> >
> > IOMMU unmapping should perhaps continue despite an error, in an
> > attempt to do best effort cleanup.
> >
> > Signed-off-by: Quan Xu <quan.xu@intel.com>
> >
> > CC: Keir Fraser <keir@xen.org>
> > CC: Jan Beulich <jbeulich@suse.com>
> > CC: Andrew Cooper <andrew.cooper3@citrix.com>
> > CC: Jun Nakajima <jun.nakajima@intel.com>
> > CC: Kevin Tian <kevin.tian@intel.com>
> > CC: George Dunlap <george.dunlap@eu.citrix.com>
> > ---
> >  xen/arch/x86/mm.c               | 13 ++++++++-----
> >  xen/arch/x86/mm/p2m-ept.c       | 27 +++++++++++++++++++++++++--
> >  xen/arch/x86/mm/p2m-pt.c        | 24 ++++++++++++++++++++----
> >  xen/arch/x86/mm/p2m.c           | 11 +++++++++--
> >  xen/drivers/passthrough/iommu.c | 14 +++++++++++++-
> >  5 files changed, 75 insertions(+), 14 deletions(-)
> >
> > diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index
> > a42097f..427097d 100644
> > --- a/xen/arch/x86/mm.c
> > +++ b/xen/arch/x86/mm.c
> > @@ -2467,7 +2467,7 @@ static int __get_page_type(struct page_info
> *page, unsigned long type,
> >                             int preemptible)  {
> >      unsigned long nx, x, y = page->u.inuse.type_info;
> > -    int rc = 0;
> > +    int rc = 0, ret = 0;
> >
> >      ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2)));
> >
> > @@ -2578,11 +2578,11 @@ static int __get_page_type(struct page_info
> *page, unsigned long type,
> >          if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
> >          {
> >              if ( (x & PGT_type_mask) == PGT_writable_page )
> > -                iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
> > +                ret = iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
> >              else if ( type == PGT_writable_page )
> > -                iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
> > -                               page_to_mfn(page),
> > -                               IOMMUF_readable|IOMMUF_writable);
> > +                ret = iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
> > +                                     page_to_mfn(page),
> > +                                     IOMMUF_readable|IOMMUF_writable);
> >          }
> >      }
> >
> > @@ -2599,6 +2599,9 @@ static int __get_page_type(struct page_info
> *page, unsigned long type,
> >      if ( (x & PGT_partial) && !(nx & PGT_partial) )
> >          put_page(page);
> >
> > +    if ( !rc )
> > +        rc = ret;
> > +
> >      return rc;
> >  }
> >
> > diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> > index 1ed5b47..df87944 100644
> > --- a/xen/arch/x86/mm/p2m-ept.c
> > +++ b/xen/arch/x86/mm/p2m-ept.c
> > @@ -821,6 +821,8 @@ out:
> >      if ( needs_sync )
> >          ept_sync_domain(p2m);
> >
> > +    ret = 0;
> > +
> >      /* For host p2m, may need to change VT-d page table.*/
> >      if ( rc == 0 && p2m_is_hostp2m(p2m) && need_iommu(d) &&
> >           need_modify_vtd_table )
> > @@ -831,11 +833,29 @@ out:
> >          {
> >              if ( iommu_flags )
> >                  for ( i = 0; i < (1 << order); i++ )
> > -                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
> > +                {
> > +                    rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
> > +
> > +                    if ( !ret && unlikely(rc) )
> > +                    {
> > +                        while ( i-- )
> > +                            iommu_unmap_page(d, gfn + i);
> > +
> > +                        ret = rc;
> > +                        break;
> > +                    }
> > +                }
> >              else
> >                  for ( i = 0; i < (1 << order); i++ )
> > -                    iommu_unmap_page(d, gfn + i);
> > +                {
> > +                    rc = iommu_unmap_page(d, gfn + i);
> > +
> > +                    if ( !ret && unlikely(rc) )
> > +                        ret = rc;
> > +                }
> >          }
> > +
> > +        rc = 0;
> >      }
> 
> So the reason for doing this thing with setting ret=0, then using rc,
> then setting rc to zero, is to make sure that the change is propagated
> to the altp2m if necessary?
> 
> I'm not a fan of this sort of "implied signalling".  The check at the
> altp2m conditional is meant to say, "everything went fine, go ahead
> and propagate the change".  But with your changes, that's not what we
> want anymore -- we want, "If the change actually made it into the
> hostp2m, propagate it to the altp2m."
> 
> As such, I think it would be a lot clearer to just make a new boolean
> variable, maybe "entry_written", which we initialize to false and then
> set to true when the entry is actually written; and then change the
> condition re the altp2m to "if ( entry_written && p2m_is_hostp2m(..)
> )".
> 
> Then I think you could make the ret / rc use mirror what you do
> elsewhere, where ret is used to store the return value of the function
> call, and it's propagated to rc only if rc is not already set.
> 

George,

Thanks for your detailed explanation. This seems to be another matter of preference.
Of course, I'd follow your suggestion in the p2m field, though I'd like to hear the other maintainers' opinions as well
(especially Kevin's and Jan's, as they spend a lot of time reviewing for me).

One small difference: IMO an int 'entry_written' is better, as we also need to propagate 'entry_written' up to the callers,
i.e. the errors returned from atomic_write_ept_entry() are '-EINVAL', '-ESRCH' and '-EBUSY'. The following is my modification (feel free to correct me):



--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -666,7 +666,7 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
     ept_entry_t *table, *ept_entry = NULL;
     unsigned long gfn_remainder = gfn;
     unsigned int i, target = order / EPT_TABLE_ORDER;
-    int ret, rc = 0;
+    int ret, rc = 0, entry_written = 0;
     bool_t direct_mmio = (p2mt == p2m_mmio_direct);
     uint8_t ipat = 0;
     bool_t need_modify_vtd_table = 1;
@@ -763,8 +763,8 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,

         /* now install the newly split ept sub-tree */
         /* NB: please make sure domian is paused and no in-fly VT-d DMA. */
-        rc = atomic_write_ept_entry(ept_entry, split_ept_entry, i);
-        ASSERT(rc == 0);
+        entry_written = atomic_write_ept_entry(ept_entry, split_ept_entry, i);
+        ASSERT(entry_written == 0);

         /* then move to the level we want to make real changes */
         for ( ; i > target; i-- )
@@ -809,9 +809,12 @@ ept_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         new_entry.suppress_ve = is_epte_valid(&old_entry) ?
                                     old_entry.suppress_ve : 1;

-    rc = atomic_write_ept_entry(ept_entry, new_entry, target);
-    if ( unlikely(rc) )
+    entry_written = atomic_write_ept_entry(ept_entry, new_entry, target);
+    if ( unlikely(entry_written) )
+    {
         old_entry.epte = 0;
+        rc = entry_written;
+    }
     else if ( p2mt != p2m_invalid &&
               (gfn + (1UL << order) - 1 > p2m->max_mapped_pfn) )
         /* Track the highest gfn for which we have ever had a valid mapping */
@@ -822,7 +825,7 @@ out:
         ept_sync_domain(p2m);

     /* For host p2m, may need to change VT-d page table.*/
-    if ( rc == 0 && p2m_is_hostp2m(p2m) && need_iommu(d) &&
+    if ( entry_written == 0 && p2m_is_hostp2m(p2m) && need_iommu(d) &&
          need_modify_vtd_table )
     {
         if ( iommu_hap_pt_share )
@@ -831,10 +834,28 @@ out:
         {
             if ( iommu_flags )
                 for ( i = 0; i < (1 << order); i++ )
-                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
+                {
+                    ret = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
+
+                    if ( unlikely(ret) )
+                    {
+                        while ( i-- )
+                            iommu_unmap_page(p2m->domain, gfn + i);
+
+                        if ( !rc )
+                            rc = ret;
+
+                        break;
+                    }
+                }
             else
                 for ( i = 0; i < (1 << order); i++ )
-                    iommu_unmap_page(d, gfn + i);
+                {
+                    ret = iommu_unmap_page(d, gfn + i);
+
+                    if ( !rc && unlikely(ret) )
+                        rc = ret;
+                }
         }
     }
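For illustration, the rollback loop above can be reduced to a standalone sketch. The stub map/unmap helpers and the 'fail_at' knob below are made up purely to exercise the logic; they are not the real iommu_map_page()/iommu_unmap_page():

```c
#include <assert.h>
#include <errno.h>

/* Test knob: which index fails to map, -1 for none (illustrative only). */
static int fail_at = -1;
static int mapped[16];

/* Stub standing in for iommu_map_page(). */
static int map_page(unsigned long gfn)
{
    if ( (int)gfn == fail_at )
        return -EIO;
    mapped[gfn] = 1;
    return 0;
}

/* Stub standing in for iommu_unmap_page(). */
static int unmap_page(unsigned long gfn)
{
    mapped[gfn] = 0;
    return 0;
}

/* Best-effort rollback: on the first mapping failure, unmap every
 * page mapped so far, record the first error, and stop. */
static int map_range(unsigned long gfn, unsigned int count)
{
    int rc = 0, ret;
    unsigned int i;

    for ( i = 0; i < count; i++ )
    {
        ret = map_page(gfn + i);
        if ( ret )
        {
            while ( i-- )
                unmap_page(gfn + i);
            if ( !rc )
                rc = ret;
            break;
        }
    }

    return rc;
}
```

On failure the caller sees the first error and no partial mappings are left behind.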





> > @@ -680,11 +680,27 @@ p2m_pt_set_entry(struct p2m_domain *p2m,
> unsigned long gfn, mfn_t mfn,
> >          }
> >          else if ( iommu_pte_flags )
> >              for ( i = 0; i < (1UL << page_order); i++ )
> > -                iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i,
> > -                               iommu_pte_flags);
> > +            {
> > +                ret = iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i,
> > +                                    iommu_pte_flags);
> > +
> > +                if ( !rc && unlikely(ret) )
> > +                {
> > +                    while ( i-- )
> > +                        iommu_unmap_page(p2m->domain, gfn + i);
> > +
> > +                    rc = ret;
> > +                    break;
> > +                }
> 
> Hmm, is this conditional here right?  What the code actually says to
> do is to always map pages, but to only unmap them on an error if there
> have been no other errors in the function so far.
> 
> As it happens, at the moment rc cannot be non-zero here since any time
> it's non-zero the code jumps down to the 'out' label, skipping this.
> But if that ever changed, this would fail to unmap when it probably
> should.
> 
> It seems like this be:
> 
> if ( unlikely(ret) )
> {
>   while ( i-- )
>     iommu_unmap_page(p2m->domain, gfn + i);
>   if ( !rc )
>     rc = ret;
>   break;
> }
> 

Looks good. Thanks.

> Or if you want to assume that rc will never be zero, then put an
> ASSERT() just before the for loop here.
> 
> Everything else looks good to me.  Thanks again for your work on this.
> 

Quan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


* Re: [PATCH v3 09/10] IOMMU: propagate IOMMU Device-TLB flush error up to IOMMU suspending
  2016-05-04  8:42       ` Jan Beulich
  2016-05-04  8:59         ` Xu, Quan
@ 2016-05-05 10:18         ` Xu, Quan
  2016-05-06  7:02           ` Jan Beulich
  1 sibling, 1 reply; 39+ messages in thread
From: Xu, Quan @ 2016-05-05 10:18 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Stefano Stabellini, Wu, Feng, Liu Jinsong,
	dario.faggioli, xen-devel, Julien Grall, SuraveeSuthikulpanit,
	Andrew Cooper, Keir Fraser

On May 04, 2016 4:43 PM, Jan Beulich <JBeulich@suse.com> wrote:
> >>> On 04.05.16 at 04:14, <quan.xu@intel.com> wrote:
> > On May 04, 2016 10:00 AM, Tian, Kevin <kevin.tian@intel.com> wrote:
> >> > From: Xu, Quan
> >> > Sent: Friday, April 29, 2016 5:25 PM diff --git
> >> > a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c index
> >> > 2885e31..9097333 100644
> >> > --- a/xen/arch/x86/acpi/power.c
> >> > +++ b/xen/arch/x86/acpi/power.c
> >> > @@ -45,6 +45,8 @@ void do_suspend_lowlevel(void);
> >> >
> >> >  static int device_power_down(void)  {
> >> > +    int err;
> >> > +
> >> >      console_suspend();
> >> >
> >> >      time_suspend();
> >> > @@ -53,11 +55,22 @@ static int device_power_down(void)
> >> >
> >> >      ioapic_suspend();
> >> >
> >> > -    iommu_suspend();
> >> > +    err = iommu_suspend();
> >> > +
> >> > +    if ( err )
> >> > +        goto iommu_suspend_error;
> >> >
> >> >      lapic_suspend();
> >> >
> >> >      return 0;
> >> > +
> >> > + iommu_suspend_error:
> >> > +    ioapic_resume();
> >> > +    i8259A_resume();
> >> > +    time_resume();
> >> > +    console_resume();
> >> > +
> >> > +    return err;
> >> >  }
> >>
> >> Jan had comment to better reuse device_power_up... looks no change in
> >> this version.
> >
> > Yes, IIUC, this may be an optimization, but not a must.
> > We can discuss this in detail in this version.
> 
> As an optimization it would indeed be quite pointless here. My request was
> more for maintainability: By re-using the function future changes don't need
> to go to two places, and hence there's no risk of one of them getting
> forgotten.
> 

Jan,
What about the following:


--- a/xen/arch/x86/acpi/power.c
+++ b/xen/arch/x86/acpi/power.c
@@ -43,6 +43,17 @@ struct acpi_sleep_info acpi_sinfo;

 void do_suspend_lowlevel(void);

+enum dev_power_type {
+    TYPE_ALL,
+    TYPE_CONSOLE,
+    TYPE_TIME,
+    TYPE_I8259A,
+    TYPE_IOAPIC,
+    TYPE_IOMMU,
+    TYPE_LAPIC,
+    TYPE_UNKNOWN
+};
+
 static int device_power_down(void)
 {
     console_suspend();
@@ -53,26 +64,35 @@ static int device_power_down(void)

     ioapic_suspend();

-    iommu_suspend();
+    if ( iommu_suspend() )
+        return TYPE_IOMMU;

     lapic_suspend();

     return 0;
 }

-static void device_power_up(void)
+static void device_power_up(enum dev_power_type type)
 {
-    lapic_resume();
-
-    iommu_resume();
-
-    ioapic_resume();
-
-    i8259A_resume();
-
-    time_resume();
-
-    console_resume();
+    switch ( type ) {
+    case TYPE_ALL:
+    case TYPE_LAPIC:
+        lapic_resume();
+    case TYPE_IOMMU:
+        iommu_resume();
+    case TYPE_IOAPIC:
+        ioapic_resume();
+    case TYPE_I8259A:
+        i8259A_resume();
+    case TYPE_TIME:
+        time_resume();
+    case TYPE_CONSOLE:
+        console_resume();
+        break;
+    default:
+        BUG();
+        break;
+    }
 }

 static void freeze_domains(void)
@@ -169,6 +189,7 @@ static int enter_state(u32 state)
     {
         printk(XENLOG_ERR "Some devices failed to power down.");
         system_state = SYS_STATE_resume;
+        device_power_up(error);
         goto done;
     }

@@ -196,7 +217,7 @@ static int enter_state(u32 state)
     write_cr4(cr4 & ~X86_CR4_MCE);
     write_efer(read_efer());

-    device_power_up();
+    device_power_up(TYPE_ALL);

     mcheck_init(&boot_cpu_data, 0);
     write_cr4(cr4);
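If a suspend step fails part-way through, passing its enum value resumes everything from that stage downward. Below is a standalone sketch of the fallthrough cascade, with a test array in place of the real *_resume() calls (illustrative only, not the actual patch):

```c
#include <assert.h>
#include <string.h>

/* Same enum as in the patch above. */
enum dev_power_type {
    TYPE_ALL,
    TYPE_CONSOLE,
    TYPE_TIME,
    TYPE_I8259A,
    TYPE_IOAPIC,
    TYPE_IOMMU,
    TYPE_LAPIC,
    TYPE_UNKNOWN
};

/* Records which resume steps ran, in place of the real *_resume() calls. */
static int resumed[TYPE_UNKNOWN];

static void device_power_up(enum dev_power_type type)
{
    switch ( type )
    {
    case TYPE_ALL:
    case TYPE_LAPIC:
        resumed[TYPE_LAPIC] = 1;
        /* fall through */
    case TYPE_IOMMU:
        resumed[TYPE_IOMMU] = 1;
        /* fall through */
    case TYPE_IOAPIC:
        resumed[TYPE_IOAPIC] = 1;
        /* fall through */
    case TYPE_I8259A:
        resumed[TYPE_I8259A] = 1;
        /* fall through */
    case TYPE_TIME:
        resumed[TYPE_TIME] = 1;
        /* fall through */
    case TYPE_CONSOLE:
        resumed[TYPE_CONSOLE] = 1;
        break;
    default:
        assert(0); /* BUG() in the real code */
    }
}
```

So device_power_up(TYPE_IOMMU) resumes the IOMMU stage and everything below it (IO-APIC, i8259A, time, console) while skipping the LAPIC, which had not yet been suspended when iommu_suspend() failed.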

--
Quan





* Re: [PATCH v3 03/10] IOMMU/MMU: enhance the call trees of IOMMU unmapping and mapping
  2016-05-05  6:53     ` Xu, Quan
@ 2016-05-05 11:02       ` George Dunlap
  2016-05-06  2:40         ` Xu, Quan
  0 siblings, 1 reply; 39+ messages in thread
From: George Dunlap @ 2016-05-05 11:02 UTC (permalink / raw)
  To: Xu, Quan, George Dunlap, Tian, Kevin, Jan Beulich
  Cc: Andrew Cooper, Dario Faggioli, Keir Fraser, Nakajima, Jun, xen-devel

On 05/05/16 07:53, Xu, Quan wrote:
> On May 04, 2016 9:49 PM, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
>> On Fri, Apr 29, 2016 at 10:25 AM, Quan Xu <quan.xu@intel.com> wrote:
>>> When IOMMU mapping is failed, we issue a best effort rollback,
>>> stopping IOMMU mapping, unmapping the previous IOMMU maps and
>> then
>>> reporting the error up to the call trees. When rollback is not
>>> feasible (in early initialization phase or trade-off of complexity)
>>> for the hardware domain, we do things on a best effort basis, only throwing
>> out an error message.
>>>
>>> IOMMU unmapping should perhaps continue despite an error, in an
>>> attempt to do best effort cleanup.
>>>
>>> Signed-off-by: Quan Xu <quan.xu@intel.com>
>>>
>>> CC: Keir Fraser <keir@xen.org>
>>> CC: Jan Beulich <jbeulich@suse.com>
>>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
>>> CC: Jun Nakajima <jun.nakajima@intel.com>
>>> CC: Kevin Tian <kevin.tian@intel.com>
>>> CC: George Dunlap <george.dunlap@eu.citrix.com>
>>> ---
>>>  xen/arch/x86/mm.c               | 13 ++++++++-----
>>>  xen/arch/x86/mm/p2m-ept.c       | 27 +++++++++++++++++++++++++--
>>>  xen/arch/x86/mm/p2m-pt.c        | 24 ++++++++++++++++++++----
>>>  xen/arch/x86/mm/p2m.c           | 11 +++++++++--
>>>  xen/drivers/passthrough/iommu.c | 14 +++++++++++++-
>>>  5 files changed, 75 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index
>>> a42097f..427097d 100644
>>> --- a/xen/arch/x86/mm.c
>>> +++ b/xen/arch/x86/mm.c
>>> @@ -2467,7 +2467,7 @@ static int __get_page_type(struct page_info
>> *page, unsigned long type,
>>>                             int preemptible)  {
>>>      unsigned long nx, x, y = page->u.inuse.type_info;
>>> -    int rc = 0;
>>> +    int rc = 0, ret = 0;
>>>
>>>      ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2)));
>>>
>>> @@ -2578,11 +2578,11 @@ static int __get_page_type(struct page_info
>> *page, unsigned long type,
>>>          if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
>>>          {
>>>              if ( (x & PGT_type_mask) == PGT_writable_page )
>>> -                iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
>>> +                ret = iommu_unmap_page(d, mfn_to_gmfn(d,
>> page_to_mfn(page)));
>>>              else if ( type == PGT_writable_page )
>>> -                iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
>>> -                               page_to_mfn(page),
>>> -                               IOMMUF_readable|IOMMUF_writable);
>>> +                ret = iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
>>> +                                     page_to_mfn(page),
>>> +                                     IOMMUF_readable|IOMMUF_writable);
>>>          }
>>>      }
>>>
>>> @@ -2599,6 +2599,9 @@ static int __get_page_type(struct page_info
>> *page, unsigned long type,
>>>      if ( (x & PGT_partial) && !(nx & PGT_partial) )
>>>          put_page(page);
>>>
>>> +    if ( !rc )
>>> +        rc = ret;
>>> +
>>>      return rc;
>>>  }
>>>
>>> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
>>> index 1ed5b47..df87944 100644
>>> --- a/xen/arch/x86/mm/p2m-ept.c
>>> +++ b/xen/arch/x86/mm/p2m-ept.c
>>> @@ -821,6 +821,8 @@ out:
>>>      if ( needs_sync )
>>>          ept_sync_domain(p2m);
>>>
>>> +    ret = 0;
>>> +
>>>      /* For host p2m, may need to change VT-d page table.*/
>>>      if ( rc == 0 && p2m_is_hostp2m(p2m) && need_iommu(d) &&
>>>           need_modify_vtd_table )
>>> @@ -831,11 +833,29 @@ out:
>>>          {
>>>              if ( iommu_flags )
>>>                  for ( i = 0; i < (1 << order); i++ )
>>> -                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
>>> +                {
>>> +                    rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
>>> +
>>> +                    if ( !ret && unlikely(rc) )
>>> +                    {
>>> +                        while ( i-- )
>>> +                            iommu_unmap_page(d, gfn + i);
>>> +
>>> +                        ret = rc;
>>> +                        break;
>>> +                    }
>>> +                }
>>>              else
>>>                  for ( i = 0; i < (1 << order); i++ )
>>> -                    iommu_unmap_page(d, gfn + i);
>>> +                {
>>> +                    rc = iommu_unmap_page(d, gfn + i);
>>> +
>>> +                    if ( !ret && unlikely(rc) )
>>> +                        ret = rc;
>>> +                }
>>>          }
>>> +
>>> +        rc = 0;
>>>      }
>>
>> So the reason for doing this thing with setting ret=0, then using rc,
>> then setting rc to zero, is to make sure that the change is propagated
>> to the altp2m if necessary?
>>
>> I'm not a fan of this sort of "implied signalling".  The check at the
>> altp2m conditional is meant to say, "everything went fine, go ahead
>> and propagate the change".  But with your changes, that's not what we
>> want anymore -- we want, "If the change actually made it into the
>> hostp2m, propagate it to the altp2m."
>>
>> As such, I think it would be a lot clearer to just make a new boolean
>> variable, maybe "entry_written", which we initialize to false and then
>> set to true when the entry is actually written; and then change the
>> condition re the altp2m to "if ( entry_written && p2m_is_hostp2m(..)
>> )".
>>
>> Then I think you could make the ret / rc use mirror what you do
>> elsewhere, where ret is used to store the return value of the function
>> call, and it's propagated to rc only if rc is not already set.
>>
> 
> George,
> 
> Thanks for your detailed explanation. This seems an another matter of preference.
> Of course, I'd follow your suggestion in p2m  field, while I need to hear the other maintainers' options as well
> (especially when it comes from Kevin/Jan, as they do spend a lot of time for me).
> 
> A little bit of difference here, IMO, an int 'entry_written' is much better, as we also need  to propagate the 'entry_written' up to callers,
> i.e. the errors returned from atomic_write_ept_entry() are '-EINVAL', '-ESRCH' and '-EBUSY'. Then, the follow is my modification (feel free to correct me):

But it doesn't look like you're actually taking advantage of the fact
that entry_written is non-boolean -- other than immediately copying the
value into rc, all the places you're only checking if it's zero or not
(where "zero" in this case actually means "the entry was written").

The code you had before was, as far as I can tell, functionally correct.
 The point of introducing the extra variable is to decrease cognitive
load on people coming along to modify the code later.  To do that, you
want to maximize the things you do that are within expectation, and
minimize the things you do outside of that.

The name "entry_written" sounds like a boolean; developers will expect
"0 / false" to mean "not written" and "1/true" to mean "entry written".
 Furthermore, rc looks like a return value; if coders see "rc =
atomic_write_ept_entry()" they immediately have a framework for what's
going on; whereas if they see "entry_written = [...]; if
(entry_written) { ... rc = entry_written; }", it takes some thinking to
figure out.  Not much, but every little bit of unnecessary thinking adds up.

My suggestion remains to:
1. Declare entry_written as a bool, initialized to false
2. In both atomic_write_ept_entry() calls, use rc to collect the return
value
3. In the second one (the one which may fail), add an else {
entry_written = true; }
4. In the conditional deciding whether to update the iommu, use
"entry_written", I think.  At this point rc == 0 and entry_written =
true will be identical, but if we ever insert something in between, we
want the iommu update to be attempted any time the entry is successfully
written.
5. Use "if ( entry_written && ...)" when deciding whether to propagate
the change to the altp2m.
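Reduced to a standalone sketch, the pattern looks like the following. The stub write function and the flags are made up purely for illustration; this is not the actual Xen code:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Test knob and state flags, invented for the sketch. */
static int write_should_fail;
static int entry_value;
static int iommu_updated;
static int altp2m_propagated;

/* Stub standing in for atomic_write_ept_entry(). */
static int write_entry(int v)
{
    if ( write_should_fail )
        return -EBUSY;
    entry_value = v;
    return 0;
}

/* rc carries the error code up to the caller; a separate boolean
 * records whether the entry actually made it into the table. */
static int set_entry(int v)
{
    int rc;
    bool entry_written = false;

    rc = write_entry(v);
    if ( !rc )
        entry_written = true;

    /* Update the IOMMU only if the entry was written... */
    if ( entry_written )
        iommu_updated = 1;

    /* ...and likewise propagate the change to the altp2m. */
    if ( entry_written )
        altp2m_propagated = 1;

    return rc;
}
```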

Thanks,
 -George



* Re: [PATCH v3 03/10] IOMMU/MMU: enhance the call trees of IOMMU unmapping and mapping
  2016-05-05 11:02       ` George Dunlap
@ 2016-05-06  2:40         ` Xu, Quan
  0 siblings, 0 replies; 39+ messages in thread
From: Xu, Quan @ 2016-05-06  2:40 UTC (permalink / raw)
  To: George Dunlap, George Dunlap
  Cc: Tian, Kevin, Keir Fraser, Jan Beulich, Andrew Cooper,
	Dario Faggioli, xen-devel, Nakajima, Jun

On May 05, 2016 7:03 PM, George Dunlap <george.dunlap@citrix.com> wrote:
> On 05/05/16 07:53, Xu, Quan wrote:
> > On May 04, 2016 9:49 PM, George Dunlap <George.Dunlap@eu.citrix.com>
> wrote:
> >> On Fri, Apr 29, 2016 at 10:25 AM, Quan Xu <quan.xu@intel.com> wrote:
> >>> When IOMMU mapping is failed, we issue a best effort rollback,
> >>> stopping IOMMU mapping, unmapping the previous IOMMU maps and
> >> then
> >>> reporting the error up to the call trees. When rollback is not
> >>> feasible (in early initialization phase or trade-off of complexity)
> >>> for the hardware domain, we do things on a best effort basis, only
> >>> throwing
> >> out an error message.
> >>>
> >>> IOMMU unmapping should perhaps continue despite an error, in an
> >>> attempt to do best effort cleanup.
> >>>
> >>> Signed-off-by: Quan Xu <quan.xu@intel.com>
> >>>
> >>> CC: Keir Fraser <keir@xen.org>
> >>> CC: Jan Beulich <jbeulich@suse.com>
> >>> CC: Andrew Cooper <andrew.cooper3@citrix.com>
> >>> CC: Jun Nakajima <jun.nakajima@intel.com>
> >>> CC: Kevin Tian <kevin.tian@intel.com>
> >>> CC: George Dunlap <george.dunlap@eu.citrix.com>
> >>> ---
> >>>  xen/arch/x86/mm.c               | 13 ++++++++-----
> >>>  xen/arch/x86/mm/p2m-ept.c       | 27 +++++++++++++++++++++++++--
> >>>  xen/arch/x86/mm/p2m-pt.c        | 24 ++++++++++++++++++++----
> >>>  xen/arch/x86/mm/p2m.c           | 11 +++++++++--
> >>>  xen/drivers/passthrough/iommu.c | 14 +++++++++++++-
> >>>  5 files changed, 75 insertions(+), 14 deletions(-)
> >>>
> >>> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index
> >>> a42097f..427097d 100644
> >>> --- a/xen/arch/x86/mm.c
> >>> +++ b/xen/arch/x86/mm.c
> >>> @@ -2467,7 +2467,7 @@ static int __get_page_type(struct page_info
> >> *page, unsigned long type,
> >>>                             int preemptible)  {
> >>>      unsigned long nx, x, y = page->u.inuse.type_info;
> >>> -    int rc = 0;
> >>> +    int rc = 0, ret = 0;
> >>>
> >>>      ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2)));
> >>>
> >>> @@ -2578,11 +2578,11 @@ static int __get_page_type(struct page_info
> >> *page, unsigned long type,
> >>>          if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
> >>>          {
> >>>              if ( (x & PGT_type_mask) == PGT_writable_page )
> >>> -                iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
> >>> +                ret = iommu_unmap_page(d, mfn_to_gmfn(d,
> >> page_to_mfn(page)));
> >>>              else if ( type == PGT_writable_page )
> >>> -                iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
> >>> -                               page_to_mfn(page),
> >>> -                               IOMMUF_readable|IOMMUF_writable);
> >>> +                ret = iommu_map_page(d, mfn_to_gmfn(d,
> page_to_mfn(page)),
> >>> +                                     page_to_mfn(page),
> >>> +
> >>> + IOMMUF_readable|IOMMUF_writable);
> >>>          }
> >>>      }
> >>>
> >>> @@ -2599,6 +2599,9 @@ static int __get_page_type(struct page_info
> >> *page, unsigned long type,
> >>>      if ( (x & PGT_partial) && !(nx & PGT_partial) )
> >>>          put_page(page);
> >>>
> >>> +    if ( !rc )
> >>> +        rc = ret;
> >>> +
> >>>      return rc;
> >>>  }
> >>>
> >>> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> >>> index 1ed5b47..df87944 100644
> >>> --- a/xen/arch/x86/mm/p2m-ept.c
> >>> +++ b/xen/arch/x86/mm/p2m-ept.c
> >>> @@ -821,6 +821,8 @@ out:
> >>>      if ( needs_sync )
> >>>          ept_sync_domain(p2m);
> >>>
> >>> +    ret = 0;
> >>> +
> >>>      /* For host p2m, may need to change VT-d page table.*/
> >>>      if ( rc == 0 && p2m_is_hostp2m(p2m) && need_iommu(d) &&
> >>>           need_modify_vtd_table )
> >>> @@ -831,11 +833,29 @@ out:
> >>>          {
> >>>              if ( iommu_flags )
> >>>                  for ( i = 0; i < (1 << order); i++ )
> >>> -                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
> >>> +                {
> >>> +                    rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i,
> >>> + iommu_flags);
> >>> +
> >>> +                    if ( !ret && unlikely(rc) )
> >>> +                    {
> >>> +                        while ( i-- )
> >>> +                            iommu_unmap_page(d, gfn + i);
> >>> +
> >>> +                        ret = rc;
> >>> +                        break;
> >>> +                    }
> >>> +                }
> >>>              else
> >>>                  for ( i = 0; i < (1 << order); i++ )
> >>> -                    iommu_unmap_page(d, gfn + i);
> >>> +                {
> >>> +                    rc = iommu_unmap_page(d, gfn + i);
> >>> +
> >>> +                    if ( !ret && unlikely(rc) )
> >>> +                        ret = rc;
> >>> +                }
> >>>          }
> >>> +
> >>> +        rc = 0;
> >>>      }
> >>
> >> So the reason for doing this thing with setting ret=0, then using rc,
> >> then setting rc to zero, is to make sure that the change is
> >> propagated to the altp2m if necessary?
> >>
> >> I'm not a fan of this sort of "implied signalling".  The check at the
> >> altp2m conditional is meant to say, "everything went fine, go ahead
> >> and propagate the change".  But with your changes, that's not what we
> >> want anymore -- we want, "If the change actually made it into the
> >> hostp2m, propagate it to the altp2m."
> >>
> >> As such, I think it would be a lot clearer to just make a new boolean
> >> variable, maybe "entry_written", which we initialize to false and
> >> then set to true when the entry is actually written; and then change
> >> the condition re the altp2m to "if ( entry_written &&
> >> p2m_is_hostp2m(..) )".
> >>
> >> Then I think you could make the ret / rc use mirror what you do
> >> elsewhere, where ret is used to store the return value of the
> >> function call, and it's propagated to rc only if rc is not already set.
> >>
> >
> > George,
> >
> > Thanks for your detailed explanation. This seems an another matter of
> preference.
> > Of course, I'd follow your suggestion in p2m  field, while I need to
> > hear the other maintainers' options as well (especially when it comes from
> Kevin/Jan, as they do spend a lot of time for me).
> >
> > A little bit of difference here, IMO, an int 'entry_written' is much
> > better, as we also need  to propagate the 'entry_written' up to callers, i.e. the
> errors returned from atomic_write_ept_entry() are '-EINVAL', '-ESRCH' and '-
> EBUSY'. Then, the follow is my modification (feel free to correct me):
> 
> But it doesn't look like you're actually taking advantage of the fact that
> entry_written is non-boolean -- other than immediately copying the value into
> rc, all the places you're only checking if it's zero or not (where "zero" in this
> case actually means "the entry was written").
> 
> The code you had before was, as far as I can tell, functionally correct.
>  The point of introducing the extra variable is to decrease cognitive load on
> people coming along to modify the code later.  To do that, you want to
> maximize the things you do that are within expectation, and minimize the
> things you do outside of that.
> 
> The name "entry_written" sounds like a boolean; developers will expect
> "0 / false" to mean "not written" and "1/true" to mean "entry written".
>  Furthermore, rc looks like a return value; if coders see "rc =
> atomic_write_ept_entry()" they immediately have a framework for what's
> going on; whereas if they see "entry_written == [...]; if
> (entry_written) { ... rc= entry_written; }", it takes some thinking to figure out.
> Not much, but every little bit of unnecessary thinking adds up.
> 
> My suggestion remains to:
> 1. Declare entry_written as a bool, initialized to false
> 2. In both atomic_write_ept_entry() calls, use rc to collect the return value
> 3. In the second one (the one which may fail), add an else { entry_written = true; }
> 4. In the conditional deciding whether to update the iommu, use "entry_written", I think.  At this point rc == 0 and entry_written = true will be identical, but if we ever insert something in between, we want the iommu update to be attempted any time the entry is successfully written.
> 5. Use "if ( entry_written && ...)" when deciding whether to propagate the change to the altp2m.
> 

George,
Thanks. It is very clear now, and I will follow it in the next v4.

Quan


* Re: [PATCH v3 09/10] IOMMU: propagate IOMMU Device-TLB flush error up to IOMMU suspending
  2016-05-05 10:18         ` Xu, Quan
@ 2016-05-06  7:02           ` Jan Beulich
  2016-05-06  7:20             ` Xu, Quan
  0 siblings, 1 reply; 39+ messages in thread
From: Jan Beulich @ 2016-05-06  7:02 UTC (permalink / raw)
  To: Quan Xu
  Cc: Kevin Tian, Stefano Stabellini, Feng Wu, Liu Jinsong,
	dario.faggioli, xen-devel, Julien Grall, SuraveeSuthikulpanit,
	Andrew Cooper, Keir Fraser

>>> On 05.05.16 at 12:18, <quan.xu@intel.com> wrote:
> What about this one as below:

If done properly (coding style, comments to silence Coverity,
consistency throughout not just your change, but its effects on
the rest of the code in that file), that's certainly an option.

Jan

> --- a/xen/arch/x86/acpi/power.c
> +++ b/xen/arch/x86/acpi/power.c
> @@ -43,6 +43,17 @@ struct acpi_sleep_info acpi_sinfo;
> 
>  void do_suspend_lowlevel(void);
> 
> +enum dev_power_type {
> +    TYPE_ALL,
> +    TYPE_CONSOLE,
> +    TYPE_TIME,
> +    TYPE_I8259A,
> +    TYPE_IOAPIC,
> +    TYPE_IOMMU,
> +    TYPE_LAPIC,
> +    TYPE_UNKNOWN
> +};
> +
>  static int device_power_down(void)
>  {
>      console_suspend();
> @@ -53,26 +64,35 @@ static int device_power_down(void)
> 
>      ioapic_suspend();
> 
> -    iommu_suspend();
> +    if ( iommu_suspend() )
> +        return TYPE_IOMMU;
> 
>      lapic_suspend();
> 
>      return 0;
>  }
> 
> -static void device_power_up(void)
> +static void device_power_up(enum dev_power_type type)
>  {
> -    lapic_resume();
> -
> -    iommu_resume();
> -
> -    ioapic_resume();
> -
> -    i8259A_resume();
> -
> -    time_resume();
> -
> -    console_resume();
> +    switch ( type ) {
> +    case TYPE_ALL:
> +    case TYPE_LAPIC:
> +        lapic_resume();
> +    case TYPE_IOMMU:
> +        iommu_resume();
> +    case TYPE_IOAPIC:
> +        ioapic_resume();
> +    case TYPE_I8259A:
> +        i8259A_resume();
> +    case TYPE_TIME:
> +        time_resume();
> +    case TYPE_CONSOLE:
> +        console_resume();
> +        break;
> +    default:
> +        BUG();
> +        break;
> +    }
>  }
> 
>  static void freeze_domains(void)
> @@ -169,6 +189,7 @@ static int enter_state(u32 state)
>      {
>          printk(XENLOG_ERR "Some devices failed to power down.");
>          system_state = SYS_STATE_resume;
> +        device_power_up(error);
>          goto done;
>      }
> 
> @@ -196,7 +217,7 @@ static int enter_state(u32 state)
>      write_cr4(cr4 & ~X86_CR4_MCE);
>      write_efer(read_efer());
> 
> -    device_power_up();
> +    device_power_up(TYPE_ALL);
> 
>      mcheck_init(&boot_cpu_data, 0);
>      write_cr4(cr4);
> 
> --
> Quan




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH v3 09/10] IOMMU: propagate IOMMU Device-TLB flush error up to IOMMU suspending
  2016-05-06  7:02           ` Jan Beulich
@ 2016-05-06  7:20             ` Xu, Quan
  0 siblings, 0 replies; 39+ messages in thread
From: Xu, Quan @ 2016-05-06  7:20 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Stefano Stabellini, Wu, Feng, Liu Jinsong,
	dario.faggioli, xen-devel, Julien Grall, SuraveeSuthikulpanit,
	Andrew Cooper, Keir Fraser

On May 06, 2016 3:03 PM, Jan Beulich <JBeulich@suse.com> wrote:
> >>> On 05.05.16 at 12:18, <quan.xu@intel.com> wrote:
> > What about this one as below:
> 
> If done properly (coding style, comments to silence Coverity, consistency
> throughout not just your change, but its effects on the rest of the code in that
> file), that's certainly an option.
> 

Thanks. I'll try it in the next revision (v4).

-Quan



end of thread, other threads:[~2016-05-06  7:20 UTC | newest]

Thread overview: 39+ messages
2016-04-29  9:25 [PATCH v3 00/10] Check VT-d Device-TLB flush error Quan Xu
2016-04-29  9:25 ` [PATCH v3 01/10] vt-d: fix the IOMMU flush issue Quan Xu
2016-05-04  1:26   ` Tian, Kevin
2016-05-04 12:09     ` Xu, Quan
2016-05-04 13:51       ` Jan Beulich
2016-05-04 15:03         ` Xu, Quan
2016-04-29  9:25 ` [PATCH v3 02/10] IOMMU: handle IOMMU mapping and unmapping failures Quan Xu
2016-05-04  1:28   ` Tian, Kevin
2016-05-04 11:49     ` Xu, Quan
2016-05-04 13:44       ` Jan Beulich
2016-05-05  1:47         ` Xu, Quan
2016-04-29  9:25 ` [PATCH v3 03/10] IOMMU/MMU: enhance the call trees of IOMMU unmapping and mapping Quan Xu
2016-05-04  1:45   ` Tian, Kevin
2016-05-04  8:40     ` Jan Beulich
2016-05-04  9:13       ` Xu, Quan
2016-05-04  9:28         ` Jan Beulich
2016-05-05  0:38           ` Tian, Kevin
2016-05-04 13:48   ` George Dunlap
2016-05-05  6:53     ` Xu, Quan
2016-05-05 11:02       ` George Dunlap
2016-05-06  2:40         ` Xu, Quan
2016-04-29  9:25 ` [PATCH v3 04/10] vt-d: propagate IOMMU Device-TLB flush error up to IOMMU unmapping Quan Xu
2016-04-29  9:25 ` [PATCH v3 05/10] vt-d: propagate IOMMU Device-TLB flush error up to IOMMU mapping Quan Xu
2016-04-29  9:25 ` [PATCH v3 06/10] propagate IOMMU Device-TLB flush error up to iommu_iotlb_flush{, _all} (top level ones) Quan Xu
2016-04-29  9:25 ` [PATCH v3 07/10] IOMMU: propagate IOMMU Device-TLB flush error up to iommu_iotlb_flush{, _all} (leaf ones) Quan Xu
2016-05-04  1:54   ` Tian, Kevin
2016-04-29  9:25 ` [PATCH v3 08/10] vt-d/ept: propagate IOMMU Device-TLB flush error up to EPT update Quan Xu
2016-05-04  1:56   ` Tian, Kevin
2016-04-29  9:25 ` [PATCH v3 09/10] IOMMU: propagate IOMMU Device-TLB flush error up to IOMMU suspending Quan Xu
2016-05-04  1:59   ` Tian, Kevin
2016-05-04  2:14     ` Xu, Quan
2016-05-04  8:42       ` Jan Beulich
2016-05-04  8:59         ` Xu, Quan
2016-05-05 10:18         ` Xu, Quan
2016-05-06  7:02           ` Jan Beulich
2016-05-06  7:20             ` Xu, Quan
2016-04-29  9:25 ` [PATCH v3 10/10] vt-d: propagate error up to ME phantom function mapping and unmapping Quan Xu
2016-05-04  2:02   ` Tian, Kevin
2016-05-04  2:19     ` Xu, Quan
