* [PATCH v5 0/7]  VT-d Device-TLB flush issue
@ 2016-02-05 10:18 Quan Xu
  2016-02-05 10:18 ` [PATCH v5 1/7] VT-d: Check VT-d Device-TLB flush error(IOMMU part) Quan Xu
                   ` (6 more replies)
  0 siblings, 7 replies; 31+ messages in thread
From: Quan Xu @ 2016-02-05 10:18 UTC (permalink / raw)
  To: jbeulich, kevin.tian
  Cc: feng.wu, george.dunlap, andrew.cooper3, tim, xen-devel, Quan Xu

This patch series fixes the current timeout concern and also allows limited ATS support:

1. Check VT-d Device-TLB flush error.
   This patch checks for all kinds of errors, propagating them all the way up the call trees of the VT-d Device-TLB flush.

2. Reduce the spin timeout to 1ms, which can be changed at boot time with 'vtd_qi_timeout'.
   For example:
           multiboot /boot/xen.gz ats=1 vtd_qi_timeout=100

3. Pass down a flag indicating whether the lock is being held.
 
4. Fix the VT-d Device-TLB flush timeout issue.
   If a Device-TLB flush times out, we hide the target ATS device and crash the domain owning it.
   If the impacted domain is the hardware domain, we just print a warning.
   The hidden device is then disallowed from being assigned to any domain.
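The recovery policy of item 4 can be summarised in a minimal sketch (illustrative only; the struct, the `is_hwdom` flag and the function name are stand-ins, not the series' actual Xen API):

```c
/* Outcome of a timed-out Device-TLB flush, per the policy above. */
struct flush_timeout_action {
    int hide_device;    /* device is hidden from further assignment */
    int crash_domain;   /* domain owning the ATS device is crashed  */
    int warn_only;      /* hardware domain: warn instead of crash   */
};

/* is_hwdom: non-zero if the impacted domain is the hardware domain. */
static struct flush_timeout_action
handle_device_tlb_timeout(int is_hwdom)
{
    struct flush_timeout_action act = { .hide_device = 1 };

    if ( is_hwdom )
        act.warn_only = 1;      /* just throw out a warning */
    else
        act.crash_domain = 1;   /* crash the owning domain  */

    return act;
}
```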

---- 

 * DMAR_OPERATION_TIMEOUT should also be reduced to a low number of milliseconds.
   As Kevin Tian mentioned in 'Revisit VT-d asynchronous flush issue', we have also confirmed with the hardware team
   that 1ms is large enough for an IOMMU internal flush, so DMAR_OPERATION_TIMEOUT can be changed from 1000 ms to 1 ms.

   IOMMU_WAIT_OP() is only used for VT-d register reads/writes, and it also contains a panic. Whether and how to
   remove this panic needs further discussion in the next patch set.
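For context, an IOMMU_WAIT_OP()-style wait is the classic poll-until-timeout pattern sketched below (illustrative only; the real macro polls VT-d registers and currently panics on expiry, and all names here are hypothetical):

```c
/* Poll a status location until it reaches 'expected', giving up after
 * 'max_polls' iterations. Returns 0 on success, -1 on timeout; the
 * caller then decides whether to panic (current behaviour) or to
 * recover (the direction this series moves in). */
static int spin_for_status(const volatile int *status, int expected,
                           unsigned long max_polls)
{
    while ( max_polls-- )
    {
        if ( *status == expected )
            return 0;
        /* real code would pause/cpu_relax() between polls */
    }
    return -1;
}
```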

 * The coming patch set will fix IOTLB/Context/IETC flush timeouts.

Changes in v5:
  * Split 'check VT-d Device-TLB flush error' into an MMU part and an IOMMU part. (P1-P2)
  * Add a new standalone entry for the new 'vtd_qi_timeout' option in docs/misc/xen-command-line.markdown. (P3)
  * Change the option name from 'iommu_qi_timeout_ms' to 'vtd_qi_timeout'. (P3)
  * Pass down a flag indicating whether the lock is being held. (P4-P6)
  * Fix multiple return points where this can be trivially avoided. (P7)
  * Enhance the printed message. (P7)
  * Enhance the comments. (P7)
  * Consult the bitmap along with the domain ID array. (P7)



Quan Xu (7):
  VT-d: Check VT-d Device-TLB flush error(IOMMU part).
  VT-d: Check VT-d Device-TLB flush error(MMU part).
  VT-d: Reduce spin timeout to 1ms, which can be boot-time changed.
  VT-d: Refactor iommu_ops .iotlb_flush() and iotlb_flush_all()
  VT-d: Refactor iommu_ops .map_page() and unmap_page()
  VT-d: Refactor iommu_flush .iotlb()
  VT-d: Fix vt-d Device-TLB flush timeout issue.

 docs/misc/xen-command-line.markdown           |   7 ++
 xen/arch/arm/p2m.c                            |   2 +-
 xen/arch/x86/acpi/power.c                     |   6 +-
 xen/arch/x86/crash.c                          |   3 +-
 xen/arch/x86/domain_build.c                   |   5 +-
 xen/arch/x86/mm.c                             |  15 ++-
 xen/arch/x86/mm/p2m-ept.c                     |  15 ++-
 xen/arch/x86/mm/p2m-pt.c                      |  14 ++-
 xen/arch/x86/mm/p2m.c                         |  24 ++--
 xen/arch/x86/x86_64/mm.c                      |   5 +-
 xen/common/domain.c                           |   2 +-
 xen/common/grant_table.c                      |  16 ++-
 xen/common/memory.c                           |   5 +-
 xen/drivers/passthrough/amd/iommu_init.c      |   4 +-
 xen/drivers/passthrough/amd/iommu_map.c       |   7 +-
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |   7 +-
 xen/drivers/passthrough/arm/smmu.c            |  15 ++-
 xen/drivers/passthrough/iommu.c               |  47 ++++---
 xen/drivers/passthrough/pci.c                 |   2 +-
 xen/drivers/passthrough/vtd/extern.h          |  12 +-
 xen/drivers/passthrough/vtd/iommu.c           | 173 +++++++++++++++++---------
 xen/drivers/passthrough/vtd/iommu.h           |   3 +-
 xen/drivers/passthrough/vtd/qinval.c          |  93 +++++++++++++-
 xen/drivers/passthrough/vtd/quirks.c          |  26 ++--
 xen/drivers/passthrough/vtd/x86/ats.c         |  14 ++-
 xen/drivers/passthrough/vtd/x86/vtd.c         |  18 ++-
 xen/drivers/passthrough/x86/iommu.c           |   9 +-
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |   8 +-
 xen/include/asm-x86/iommu.h                   |   3 +-
 xen/include/asm-x86/p2m.h                     |   6 +-
 xen/include/xen/iommu.h                       |  37 +++---
 31 files changed, 429 insertions(+), 174 deletions(-)

-- 
1.9.1


* [PATCH v5 1/7] VT-d: Check VT-d Device-TLB flush error(IOMMU part).
  2016-02-05 10:18 [PATCH v5 0/7] VT-d Device-TLB flush issue Quan Xu
@ 2016-02-05 10:18 ` Quan Xu
  2016-02-10 17:01   ` Jan Beulich
  2016-02-05 10:18 ` [PATCH v5 2/7] VT-d: Check VT-d Device-TLB flush error(MMU part) Quan Xu
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 31+ messages in thread
From: Quan Xu @ 2016-02-05 10:18 UTC (permalink / raw)
  To: jbeulich, kevin.tian
  Cc: feng.wu, george.dunlap, andrew.cooper3, tim, xen-devel, Quan Xu

This patch checks for all kinds of errors, propagating them all
the way up the call trees of the VT-d Device-TLB flush (IOMMU part).
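The mechanical shape of the conversion is the same throughout: callbacks that used to return void now return an error code, and callers propagate the first failure instead of swallowing it. A minimal sketch of that pattern (function names and the error value are illustrative, not the actual Xen ones):

```c
/* Before: 'void flush_one(...)' with errors silently dropped.
 * After: every level returns an rc and stops at the first failure. */
static int flush_one(int unit_fails)
{
    return unit_fails ? -1 /* stand-in for -EIO */ : 0;
}

static int flush_all(const int *units, int nr_units)
{
    int rc = 0;

    for ( int i = 0; i < nr_units; i++ )
    {
        rc = flush_one(units[i]);
        if ( rc )
            break;          /* propagate instead of ignoring */
    }

    return rc;
}
```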

Signed-off-by: Quan Xu <quan.xu@intel.com>
---
 xen/drivers/passthrough/amd/iommu_init.c      |   4 +-
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |   4 +-
 xen/drivers/passthrough/arm/smmu.c            |  13 +--
 xen/drivers/passthrough/iommu.c               |  37 +++++---
 xen/drivers/passthrough/vtd/extern.h          |   4 +-
 xen/drivers/passthrough/vtd/iommu.c           | 125 ++++++++++++++++----------
 xen/drivers/passthrough/vtd/qinval.c          |   2 +-
 xen/drivers/passthrough/vtd/quirks.c          |  26 +++---
 xen/drivers/passthrough/vtd/x86/vtd.c         |  17 +++-
 xen/drivers/passthrough/x86/iommu.c           |   6 +-
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |   4 +-
 xen/include/asm-x86/iommu.h                   |   2 +-
 xen/include/xen/iommu.h                       |  20 ++---
 13 files changed, 166 insertions(+), 98 deletions(-)

diff --git a/xen/drivers/passthrough/amd/iommu_init.c b/xen/drivers/passthrough/amd/iommu_init.c
index d90a2d2..ec47e22 100644
--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -1340,12 +1340,14 @@ static void invalidate_all_devices(void)
     iterate_ivrs_mappings(_invalidate_all_devices);
 }
 
-void amd_iommu_suspend(void)
+int amd_iommu_suspend(void)
 {
     struct amd_iommu *iommu;
 
     for_each_amd_iommu ( iommu )
         disable_iommu(iommu);
+
+    return 0;
 }
 
 void amd_iommu_resume(void)
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index c1c0b6b..449de13 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -280,7 +280,7 @@ static int amd_iommu_domain_init(struct domain *d)
     return 0;
 }
 
-static void __hwdom_init amd_iommu_hwdom_init(struct domain *d)
+static int __hwdom_init amd_iommu_hwdom_init(struct domain *d)
 {
     unsigned long i; 
     const struct amd_iommu *iommu;
@@ -312,6 +312,8 @@ static void __hwdom_init amd_iommu_hwdom_init(struct domain *d)
             BUG();
 
     setup_hwdom_pci_devices(d, amd_iommu_setup_hwdom_device);
+
+    return 0;
 }
 
 void amd_iommu_disable_domain_device(struct domain *domain,
diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
index bb08827..155b7f3 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2544,7 +2544,7 @@ static int force_stage = 2;
  */
 static u32 platform_features = ARM_SMMU_FEAT_COHERENT_WALK;
 
-static void arm_smmu_iotlb_flush_all(struct domain *d)
+static int arm_smmu_iotlb_flush_all(struct domain *d)
 {
 	struct arm_smmu_xen_domain *smmu_domain = domain_hvm_iommu(d)->arch.priv;
 	struct iommu_domain *cfg;
@@ -2561,13 +2561,15 @@ static void arm_smmu_iotlb_flush_all(struct domain *d)
 		arm_smmu_tlb_inv_context(cfg->priv);
 	}
 	spin_unlock(&smmu_domain->lock);
+
+        return 0;
 }
 
-static void arm_smmu_iotlb_flush(struct domain *d, unsigned long gfn,
-                                 unsigned int page_count)
+static int arm_smmu_iotlb_flush(struct domain *d, unsigned long gfn,
+                                unsigned int page_count)
 {
     /* ARM SMMU v1 doesn't have flush by VMA and VMID */
-    arm_smmu_iotlb_flush_all(d);
+    return arm_smmu_iotlb_flush_all(d);
 }
 
 static struct iommu_domain *arm_smmu_get_domain(struct domain *d,
@@ -2737,8 +2739,9 @@ static int arm_smmu_iommu_domain_init(struct domain *d)
 	return 0;
 }
 
-static void __hwdom_init arm_smmu_iommu_hwdom_init(struct domain *d)
+static int __hwdom_init arm_smmu_iommu_hwdom_init(struct domain *d)
 {
+        return 0;
 }
 
 static void arm_smmu_iommu_domain_teardown(struct domain *d)
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index d5137733..cdf8e9a 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -146,14 +146,15 @@ static void __hwdom_init check_hwdom_reqs(struct domain *d)
     iommu_dom0_strict = 1;
 }
 
-void __hwdom_init iommu_hwdom_init(struct domain *d)
+int __hwdom_init iommu_hwdom_init(struct domain *d)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
+    int rc = 0;
 
     check_hwdom_reqs(d);
 
     if ( !iommu_enabled )
-        return;
+        return 0;
 
     register_keyhandler('o', &iommu_p2m_table);
     d->need_iommu = !!iommu_dom0_strict;
@@ -171,7 +172,10 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
                  ((page->u.inuse.type_info & PGT_type_mask)
                   == PGT_writable_page) )
                 mapping |= IOMMUF_writable;
-            hd->platform_ops->map_page(d, gfn, mfn, mapping);
+            rc = hd->platform_ops->map_page(d, gfn, mfn, mapping);
+            if ( rc )
+                return rc;
+
             if ( !(i++ & 0xfffff) )
                 process_pending_softirqs();
         }
@@ -266,24 +270,24 @@ static void iommu_free_pagetables(unsigned long unused)
                             cpumask_cycle(smp_processor_id(), &cpu_online_map));
 }
 
-void iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count)
+int iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
 
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush )
-        return;
+        return 0;
 
-    hd->platform_ops->iotlb_flush(d, gfn, page_count);
+    return hd->platform_ops->iotlb_flush(d, gfn, page_count);
 }
 
-void iommu_iotlb_flush_all(struct domain *d)
+int iommu_iotlb_flush_all(struct domain *d)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
 
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush_all )
-        return;
+        return 0;
 
-    hd->platform_ops->iotlb_flush_all(d);
+    return hd->platform_ops->iotlb_flush_all(d);
 }
 
 int __init iommu_setup(void)
@@ -354,11 +358,14 @@ int iommu_do_domctl(
     return ret;
 }
 
-void iommu_suspend()
+int iommu_suspend()
 {
     const struct iommu_ops *ops = iommu_get_ops();
+
     if ( iommu_enabled )
-        ops->suspend();
+        return ops->suspend();
+
+    return 0;
 }
 
 void iommu_share_p2m_table(struct domain* d)
@@ -369,12 +376,16 @@ void iommu_share_p2m_table(struct domain* d)
         ops->share_p2m(d);
 }
 
-void iommu_crash_shutdown(void)
+int iommu_crash_shutdown(void)
 {
     const struct iommu_ops *ops = iommu_get_ops();
+
     if ( iommu_enabled )
-        ops->crash_shutdown();
+        return ops->crash_shutdown();
+
     iommu_enabled = iommu_intremap = 0;
+
+    return 0;
 }
 
 int iommu_get_reserved_device_memory(iommu_grdm_t *func, void *ctxt)
diff --git a/xen/drivers/passthrough/vtd/extern.h b/xen/drivers/passthrough/vtd/extern.h
index 8acf889..ec9c513 100644
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -91,11 +91,11 @@ int is_igd_vt_enabled_quirk(void);
 void platform_quirks_init(void);
 void vtd_ops_preamble_quirk(struct iommu* iommu);
 void vtd_ops_postamble_quirk(struct iommu* iommu);
-void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map);
+int me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map);
 void pci_vtd_quirk(const struct pci_dev *);
 int platform_supports_intremap(void);
 int platform_supports_x2apic(void);
 
-void vtd_set_hwdom_mapping(struct domain *d);
+int vtd_set_hwdom_mapping(struct domain *d);
 
 #endif // _VTD_EXTERN_H_
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index dd13865..a780632 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -542,7 +542,7 @@ static int iommu_flush_iotlb_psi(
     return status;
 }
 
-static void iommu_flush_all(void)
+static int iommu_flush_all(void)
 {
     struct acpi_drhd_unit *drhd;
     struct iommu *iommu;
@@ -554,11 +554,13 @@ static void iommu_flush_all(void)
         iommu = drhd->iommu;
         iommu_flush_context_global(iommu, 0);
         flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0;
-        iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb);
+        return iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb);
     }
+
+    return 0;
 }
 
-static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
+static int __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
         int dma_old_pte_present, unsigned int page_count)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
@@ -566,6 +568,7 @@ static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
     struct iommu *iommu;
     int flush_dev_iotlb;
     int iommu_domid;
+    int rc = 0;
 
     /*
      * No need pcideves_lock here because we have flush
@@ -584,37 +587,37 @@ static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
             continue;
 
         if ( page_count > 1 || gfn == -1 )
-        {
-            if ( iommu_flush_iotlb_dsi(iommu, iommu_domid,
-                        0, flush_dev_iotlb) )
-                iommu_flush_write_buffer(iommu);
-        }
+            rc = iommu_flush_iotlb_dsi(iommu, iommu_domid,
+                        0, flush_dev_iotlb);
         else
-        {
-            if ( iommu_flush_iotlb_psi(iommu, iommu_domid,
+            rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
                         (paddr_t)gfn << PAGE_SHIFT_4K, 0,
-                        !dma_old_pte_present, flush_dev_iotlb) )
-                iommu_flush_write_buffer(iommu);
-        }
+                        !dma_old_pte_present, flush_dev_iotlb);
+
+        if ( rc )
+            iommu_flush_write_buffer(iommu);
     }
+
+    return rc;
 }
 
-static void intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count)
+static int intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count)
 {
-    __intel_iommu_iotlb_flush(d, gfn, 1, page_count);
+    return __intel_iommu_iotlb_flush(d, gfn, 1, page_count);
 }
 
-static void intel_iommu_iotlb_flush_all(struct domain *d)
+static int intel_iommu_iotlb_flush_all(struct domain *d)
 {
-    __intel_iommu_iotlb_flush(d, 0, 0, 0);
+    return __intel_iommu_iotlb_flush(d, 0, 0, 0);
 }
 
 /* clear one page's page table */
-static void dma_pte_clear_one(struct domain *domain, u64 addr)
+static int dma_pte_clear_one(struct domain *domain, u64 addr)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(domain);
     struct dma_pte *page = NULL, *pte = NULL;
     u64 pg_maddr;
+    int rc = 0;
 
     spin_lock(&hd->arch.mapping_lock);
     /* get last level pte */
@@ -622,7 +625,7 @@ static void dma_pte_clear_one(struct domain *domain, u64 addr)
     if ( pg_maddr == 0 )
     {
         spin_unlock(&hd->arch.mapping_lock);
-        return;
+        return -ENOMEM;
     }
 
     page = (struct dma_pte *)map_vtd_domain_page(pg_maddr);
@@ -632,7 +635,7 @@ static void dma_pte_clear_one(struct domain *domain, u64 addr)
     {
         spin_unlock(&hd->arch.mapping_lock);
         unmap_vtd_domain_page(page);
-        return;
+        return 0;
     }
 
     dma_clear_pte(*pte);
@@ -640,9 +643,11 @@ static void dma_pte_clear_one(struct domain *domain, u64 addr)
     iommu_flush_cache_entry(pte, sizeof(struct dma_pte));
 
     if ( !this_cpu(iommu_dont_flush_iotlb) )
-        __intel_iommu_iotlb_flush(domain, addr >> PAGE_SHIFT_4K, 1, 1);
+        rc = __intel_iommu_iotlb_flush(domain, addr >> PAGE_SHIFT_4K, 1, 1);
 
     unmap_vtd_domain_page(page);
+
+    return rc;
 }
 
 static void iommu_free_pagetable(u64 pt_maddr, int level)
@@ -1251,20 +1256,25 @@ static int intel_iommu_domain_init(struct domain *d)
     return 0;
 }
 
-static void __hwdom_init intel_iommu_hwdom_init(struct domain *d)
+static int __hwdom_init intel_iommu_hwdom_init(struct domain *d)
 {
     struct acpi_drhd_unit *drhd;
+    int rc;
 
     if ( !iommu_passthrough && !need_iommu(d) )
     {
         /* Set up 1:1 page table for hardware domain. */
-        vtd_set_hwdom_mapping(d);
+        rc = vtd_set_hwdom_mapping(d);
+        if ( rc )
+            return rc;
     }
 
     setup_hwdom_pci_devices(d, setup_hwdom_device);
     setup_hwdom_rmrr(d);
 
-    iommu_flush_all();
+    rc = iommu_flush_all();
+    if ( rc )
+        return rc;
 
     for_each_drhd_unit ( drhd )
     {
@@ -1273,6 +1283,8 @@ static void __hwdom_init intel_iommu_hwdom_init(struct domain *d)
             BUG();
         iommu_enable_translation(drhd);
     }
+
+    return 0;
 }
 
 int domain_context_mapping_one(
@@ -1285,6 +1297,7 @@ int domain_context_mapping_one(
     u64 maddr, pgd_maddr;
     u16 seg = iommu->intel->drhd->segment;
     int agaw;
+    int rc = 0;
 
     ASSERT(spin_is_locked(&pcidevs_lock));
     spin_lock(&iommu->lock);
@@ -1404,17 +1417,18 @@ int domain_context_mapping_one(
     else
     {
         int flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0;
-        iommu_flush_iotlb_dsi(iommu, 0, 1, flush_dev_iotlb);
+
+        rc = iommu_flush_iotlb_dsi(iommu, 0, 1, flush_dev_iotlb);
     }
 
     set_bit(iommu->index, &hd->arch.iommu_bitmap);
 
     unmap_vtd_domain_page(context_entries);
 
-    if ( !seg )
-        me_wifi_quirk(domain, bus, devfn, MAP_ME_PHANTOM_FUNC);
+    if ( !rc && !seg )
+        rc = me_wifi_quirk(domain, bus, devfn, MAP_ME_PHANTOM_FUNC);
 
-    return 0;
+    return rc;
 }
 
 static int domain_context_mapping(
@@ -1509,6 +1523,7 @@ int domain_context_unmap_one(
     struct context_entry *context, *context_entries;
     u64 maddr;
     int iommu_domid;
+    int rc = 0;
 
     ASSERT(spin_is_locked(&pcidevs_lock));
     spin_lock(&iommu->lock);
@@ -1543,16 +1558,16 @@ int domain_context_unmap_one(
     else
     {
         int flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0;
-        iommu_flush_iotlb_dsi(iommu, iommu_domid, 0, flush_dev_iotlb);
+        rc = iommu_flush_iotlb_dsi(iommu, iommu_domid, 0, flush_dev_iotlb);
     }
 
     spin_unlock(&iommu->lock);
     unmap_vtd_domain_page(context_entries);
 
-    if ( !iommu->intel->drhd->segment )
-        me_wifi_quirk(domain, bus, devfn, UNMAP_ME_PHANTOM_FUNC);
+    if ( !rc && !iommu->intel->drhd->segment )
+        rc = me_wifi_quirk(domain, bus, devfn, UNMAP_ME_PHANTOM_FUNC);
 
-    return 0;
+    return rc;
 }
 
 static int domain_context_unmap(
@@ -1742,7 +1757,9 @@ static int intel_iommu_map_page(
     unmap_vtd_domain_page(page);
 
     if ( !this_cpu(iommu_dont_flush_iotlb) )
-        __intel_iommu_iotlb_flush(d, gfn, dma_pte_present(old), 1);
+    {
+        return __intel_iommu_iotlb_flush(d, gfn, dma_pte_present(old), 1);
+    }
 
     return 0;
 }
@@ -1753,19 +1770,18 @@ static int intel_iommu_unmap_page(struct domain *d, unsigned long gfn)
     if ( iommu_passthrough && is_hardware_domain(d) )
         return 0;
 
-    dma_pte_clear_one(d, (paddr_t)gfn << PAGE_SHIFT_4K);
-
-    return 0;
+    return dma_pte_clear_one(d, (paddr_t)gfn << PAGE_SHIFT_4K);
 }
 
-void iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
-                     int order, int present)
+int iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
+                    int order, int present)
 {
     struct acpi_drhd_unit *drhd;
     struct iommu *iommu = NULL;
     struct hvm_iommu *hd = domain_hvm_iommu(d);
     int flush_dev_iotlb;
     int iommu_domid;
+    int rc = 0;
 
     iommu_flush_cache_entry(pte, sizeof(struct dma_pte));
 
@@ -1779,11 +1795,14 @@ void iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
         iommu_domid= domain_iommu_domid(d, iommu);
         if ( iommu_domid == -1 )
             continue;
-        if ( iommu_flush_iotlb_psi(iommu, iommu_domid,
+        rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
                                    (paddr_t)gfn << PAGE_SHIFT_4K,
-                                   order, !present, flush_dev_iotlb) )
+                                   order, !present, flush_dev_iotlb);
+        if ( rc )
             iommu_flush_write_buffer(iommu);
     }
+
+    return rc;
 }
 
 static int __init vtd_ept_page_compatible(struct iommu *iommu)
@@ -2103,8 +2122,8 @@ static int init_vtd_hw(void)
             return -EIO;
         }
     }
-    iommu_flush_all();
-    return 0;
+
+    return iommu_flush_all();
 }
 
 static void __hwdom_init setup_hwdom_rmrr(struct domain *d)
@@ -2372,16 +2391,19 @@ static int intel_iommu_group_id(u16 seg, u8 bus, u8 devfn)
 }
 
 static u32 iommu_state[MAX_IOMMUS][MAX_IOMMU_REGS];
-static void vtd_suspend(void)
+static int vtd_suspend(void)
 {
     struct acpi_drhd_unit *drhd;
     struct iommu *iommu;
+    int rc;
     u32    i;
 
     if ( !iommu_enabled )
-        return;
+        return 0;
 
-    iommu_flush_all();
+    rc = iommu_flush_all();
+    if ( rc )
+        return rc;
 
     for_each_drhd_unit ( drhd )
     {
@@ -2410,17 +2432,22 @@ static void vtd_suspend(void)
         if ( !iommu_intremap && iommu_qinval )
             disable_qinval(iommu);
     }
+
+    return 0;
 }
 
-static void vtd_crash_shutdown(void)
+static int vtd_crash_shutdown(void)
 {
     struct acpi_drhd_unit *drhd;
     struct iommu *iommu;
+    int rc;
 
     if ( !iommu_enabled )
-        return;
+        return 0;
 
-    iommu_flush_all();
+    rc = iommu_flush_all();
+    if ( rc )
+        return rc;
 
     for_each_drhd_unit ( drhd )
     {
@@ -2429,6 +2456,8 @@ static void vtd_crash_shutdown(void)
         disable_intremap(drhd->iommu);
         disable_qinval(drhd->iommu);
     }
+
+    return 0;
 }
 
 static void vtd_resume(void)
diff --git a/xen/drivers/passthrough/vtd/qinval.c b/xen/drivers/passthrough/vtd/qinval.c
index b81b0bd..946e812 100644
--- a/xen/drivers/passthrough/vtd/qinval.c
+++ b/xen/drivers/passthrough/vtd/qinval.c
@@ -324,7 +324,7 @@ static int flush_iotlb_qi(
     if ( flush_non_present_entry )
     {
         if ( !cap_caching_mode(iommu->cap) )
-            return 1;
+            return 0;
         else
             did = 0;
     }
diff --git a/xen/drivers/passthrough/vtd/quirks.c b/xen/drivers/passthrough/vtd/quirks.c
index 1888843..d9aea7e 100644
--- a/xen/drivers/passthrough/vtd/quirks.c
+++ b/xen/drivers/passthrough/vtd/quirks.c
@@ -332,10 +332,11 @@ void __init platform_quirks_init(void)
  * assigning Intel integrated wifi device to a guest.
  */
 
-static void map_me_phantom_function(struct domain *domain, u32 dev, int map)
+static int map_me_phantom_function(struct domain *domain, u32 dev, int map)
 {
     struct acpi_drhd_unit *drhd;
     struct pci_dev *pdev;
+    int rc = 0;
 
     /* find ME VT-d engine base on a real ME device */
     pdev = pci_get_pdev(0, 0, PCI_DEVFN(dev, 0));
@@ -343,23 +344,26 @@ static void map_me_phantom_function(struct domain *domain, u32 dev, int map)
 
     /* map or unmap ME phantom function */
     if ( map )
-        domain_context_mapping_one(domain, drhd->iommu, 0,
-                                   PCI_DEVFN(dev, 7), NULL);
+        rc = domain_context_mapping_one(domain, drhd->iommu, 0,
+                                        PCI_DEVFN(dev, 7), NULL);
     else
-        domain_context_unmap_one(domain, drhd->iommu, 0,
-                                 PCI_DEVFN(dev, 7));
+        rc = domain_context_unmap_one(domain, drhd->iommu, 0,
+                                      PCI_DEVFN(dev, 7));
+
+    return rc;
 }
 
-void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map)
+int me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map)
 {
     u32 id;
+    int rc = 0;
 
     id = pci_conf_read32(0, 0, 0, 0, 0);
     if ( IS_CTG(id) )
     {
         /* quit if ME does not exist */
         if ( pci_conf_read32(0, 0, 3, 0, 0) == 0xffffffff )
-            return;
+            return -ENOENT;
 
         /* if device is WLAN device, map ME phantom device 0:3.7 */
         id = pci_conf_read32(0, bus, PCI_SLOT(devfn), PCI_FUNC(devfn), 0);
@@ -373,7 +377,7 @@ void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map)
             case 0x423b8086:
             case 0x423c8086:
             case 0x423d8086:
-                map_me_phantom_function(domain, 3, map);
+                rc = map_me_phantom_function(domain, 3, map);
                 break;
             default:
                 break;
@@ -383,7 +387,7 @@ void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map)
     {
         /* quit if ME does not exist */
         if ( pci_conf_read32(0, 0, 22, 0, 0) == 0xffffffff )
-            return;
+            return -ENOENT;
 
         /* if device is WLAN device, map ME phantom device 0:22.7 */
         id = pci_conf_read32(0, bus, PCI_SLOT(devfn), PCI_FUNC(devfn), 0);
@@ -399,12 +403,14 @@ void me_wifi_quirk(struct domain *domain, u8 bus, u8 devfn, int map)
             case 0x42388086:        /* Puma Peak */
             case 0x422b8086:
             case 0x422c8086:
-                map_me_phantom_function(domain, 22, map);
+                rc = map_me_phantom_function(domain, 22, map);
                 break;
             default:
                 break;
         }
     }
+
+    return rc;
 }
 
 void pci_vtd_quirk(const struct pci_dev *pdev)
diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c b/xen/drivers/passthrough/vtd/x86/vtd.c
index c0d6aab..a19177c 100644
--- a/xen/drivers/passthrough/vtd/x86/vtd.c
+++ b/xen/drivers/passthrough/vtd/x86/vtd.c
@@ -108,9 +108,10 @@ void hvm_dpci_isairq_eoi(struct domain *d, unsigned int isairq)
     spin_unlock(&d->event_lock);
 }
 
-void __hwdom_init vtd_set_hwdom_mapping(struct domain *d)
+int __hwdom_init vtd_set_hwdom_mapping(struct domain *d)
 {
     unsigned long i, j, tmp, top;
+    int rc = 0;
 
     BUG_ON(!is_hardware_domain(d));
 
@@ -140,11 +141,21 @@ void __hwdom_init vtd_set_hwdom_mapping(struct domain *d)
 
         tmp = 1 << (PAGE_SHIFT - PAGE_SHIFT_4K);
         for ( j = 0; j < tmp; j++ )
-            iommu_map_page(d, pfn * tmp + j, pfn * tmp + j,
-                           IOMMUF_readable|IOMMUF_writable);
+        {
+            rc = iommu_map_page(d, pfn * tmp + j, pfn * tmp + j,
+                                IOMMUF_readable|IOMMUF_writable);
+            if ( rc )
+            {
+                while ( j-- > 0 )
+                    iommu_unmap_page(d, pfn * tmp + j);
+                break;
+            }
+        }
 
         if (!(i & (0xfffff >> (PAGE_SHIFT - PAGE_SHIFT_4K))))
             process_pending_softirqs();
     }
+
+    return rc;
 }
 
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 8cbb655..6674fb0 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -104,7 +104,11 @@ int arch_iommu_populate_page_table(struct domain *d)
     this_cpu(iommu_dont_flush_iotlb) = 0;
 
     if ( !rc )
-        iommu_iotlb_flush_all(d);
+    {
+        rc = iommu_iotlb_flush_all(d);
+        if ( rc )
+            return rc;
+    }
     else if ( rc != -ERESTART )
         iommu_teardown(d);
 
diff --git a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
index 9c51172..4691f9b 100644
--- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
@@ -119,8 +119,8 @@ extern unsigned long *shared_intremap_inuse;
 
 /* power management support */
 void amd_iommu_resume(void);
-void amd_iommu_suspend(void);
-void amd_iommu_crash_shutdown(void);
+int amd_iommu_suspend(void);
+int amd_iommu_crash_shutdown(void);
 
 /* guest iommu support */
 void amd_iommu_send_guest_cmd(struct amd_iommu *iommu, u32 cmd[]);
diff --git a/xen/include/asm-x86/iommu.h b/xen/include/asm-x86/iommu.h
index 29203d7..cf2a269 100644
--- a/xen/include/asm-x86/iommu.h
+++ b/xen/include/asm-x86/iommu.h
@@ -26,7 +26,7 @@ int iommu_setup_hpet_msi(struct msi_desc *);
 
 /* While VT-d specific, this must get declared in a generic header. */
 int adjust_vtd_irq_affinities(void);
-void iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte, int order, int present);
+int iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte, int order, int present);
 int iommu_supports_eim(void);
 int iommu_enable_x2apic_IR(void);
 void iommu_disable_x2apic_IR(void);
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 8f3a20e..f5b6f7e 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -55,7 +55,7 @@ int iommu_add_device(struct pci_dev *pdev);
 int iommu_enable_device(struct pci_dev *pdev);
 int iommu_remove_device(struct pci_dev *pdev);
 int iommu_domain_init(struct domain *d);
-void iommu_hwdom_init(struct domain *d);
+int iommu_hwdom_init(struct domain *d);
 void iommu_domain_destroy(struct domain *d);
 int deassign_device(struct domain *d, u16 seg, u8 bus, u8 devfn);
 
@@ -134,7 +134,7 @@ typedef int iommu_grdm_t(xen_pfn_t start, xen_ulong_t nr, u32 id, void *ctxt);
 
 struct iommu_ops {
     int (*init)(struct domain *d);
-    void (*hwdom_init)(struct domain *d);
+    int (*hwdom_init)(struct domain *d);
     int (*add_device)(u8 devfn, device_t *dev);
     int (*enable_device)(device_t *dev);
     int (*remove_device)(u8 devfn, device_t *dev);
@@ -157,19 +157,19 @@ struct iommu_ops {
     unsigned int (*read_apic_from_ire)(unsigned int apic, unsigned int reg);
     int (*setup_hpet_msi)(struct msi_desc *);
 #endif /* CONFIG_X86 */
-    void (*suspend)(void);
+    int (*suspend)(void);
     void (*resume)(void);
     void (*share_p2m)(struct domain *d);
-    void (*crash_shutdown)(void);
-    void (*iotlb_flush)(struct domain *d, unsigned long gfn, unsigned int page_count);
-    void (*iotlb_flush_all)(struct domain *d);
+    int (*crash_shutdown)(void);
+    int (*iotlb_flush)(struct domain *d, unsigned long gfn, unsigned int page_count);
+    int (*iotlb_flush_all)(struct domain *d);
     int (*get_reserved_device_memory)(iommu_grdm_t *, void *);
     void (*dump_p2m_table)(struct domain *d);
 };
 
-void iommu_suspend(void);
+int iommu_suspend(void);
 void iommu_resume(void);
-void iommu_crash_shutdown(void);
+int iommu_crash_shutdown(void);
 int iommu_get_reserved_device_memory(iommu_grdm_t *, void *);
 
 void iommu_share_p2m_table(struct domain *d);
@@ -182,8 +182,8 @@ int iommu_do_pci_domctl(struct xen_domctl *, struct domain *d,
 int iommu_do_domctl(struct xen_domctl *, struct domain *d,
                     XEN_GUEST_HANDLE_PARAM(xen_domctl_t));
 
-void iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count);
-void iommu_iotlb_flush_all(struct domain *d);
+int iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count);
+int iommu_iotlb_flush_all(struct domain *d);
 
 /*
  * The purpose of the iommu_dont_flush_iotlb optional cpu flag is to
-- 
1.9.1


* [PATCH v5 2/7] VT-d: Check VT-d Device-TLB flush error(MMU part).
  2016-02-05 10:18 [PATCH v5 0/7] VT-d Device-TLB flush issue Quan Xu
  2016-02-05 10:18 ` [PATCH v5 1/7] VT-d: Check VT-d Device-TLB flush error(IOMMU part) Quan Xu
@ 2016-02-05 10:18 ` Quan Xu
  2016-02-17 14:02   ` Jan Beulich
  2016-02-05 10:18 ` [PATCH v5 3/7] VT-d: Reduce spin timeout to 1ms, which can be boot-time changed Quan Xu
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 31+ messages in thread
From: Quan Xu @ 2016-02-05 10:18 UTC (permalink / raw)
  To: jbeulich, kevin.tian
  Cc: feng.wu, george.dunlap, andrew.cooper3, tim, xen-devel, Quan Xu

This patch checks for all kinds of error, all the way up
the call trees of the VT-d Device-TLB flush (MMU part).

Signed-off-by: Quan Xu <quan.xu@intel.com>
---
 xen/arch/x86/acpi/power.c   |  6 +++++-
 xen/arch/x86/crash.c        |  3 ++-
 xen/arch/x86/domain_build.c |  5 ++++-
 xen/arch/x86/mm.c           | 10 +++++++---
 xen/arch/x86/mm/p2m-ept.c   | 11 +++++++++--
 xen/arch/x86/mm/p2m-pt.c    | 11 +++++++++--
 xen/common/domain.c         |  2 +-
 xen/common/grant_table.c    |  5 +++--
 xen/common/memory.c         |  5 +++--
 9 files changed, 43 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
index f41f0de..ff397c3 100644
--- a/xen/arch/x86/acpi/power.c
+++ b/xen/arch/x86/acpi/power.c
@@ -45,6 +45,8 @@ void do_suspend_lowlevel(void);
 
 static int device_power_down(void)
 {
+    int rc;
+
     console_suspend();
 
     time_suspend();
@@ -53,7 +55,9 @@ static int device_power_down(void)
 
     ioapic_suspend();
 
-    iommu_suspend();
+    rc = iommu_suspend();
+    if ( rc )
+        return rc;
 
     lapic_suspend();
 
diff --git a/xen/arch/x86/crash.c b/xen/arch/x86/crash.c
index 888a214..59e1af6 100644
--- a/xen/arch/x86/crash.c
+++ b/xen/arch/x86/crash.c
@@ -170,7 +170,8 @@ static void nmi_shootdown_cpus(void)
 
     /* Crash shutdown any IOMMU functionality as the crashdump kernel is not
      * happy when booting if interrupt/dma remapping is still enabled */
-    iommu_crash_shutdown();
+    if ( iommu_crash_shutdown() )
+        printk("Failed to shut down IOMMU.\n");
 
     __stop_this_cpu();
 
diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
index bca6fe7..d10321a 100644
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -1627,7 +1627,10 @@ int __init construct_dom0(
     }
 
     if ( d->domain_id == hardware_domid )
-        iommu_hwdom_init(d);
+    {
+        rc = iommu_hwdom_init(d);
+        BUG_ON(rc != 0);
+    }
 
     return 0;
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 202ff76..1e50b94 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2445,9 +2445,13 @@ static int __get_page_type(struct page_info *page, unsigned long type,
             if ( (x & PGT_type_mask) == PGT_writable_page )
                 iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
             else if ( type == PGT_writable_page )
-                iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
-                               page_to_mfn(page),
-                               IOMMUF_readable|IOMMUF_writable);
+            {
+                rc = iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
+                                    page_to_mfn(page),
+                                    IOMMUF_readable|IOMMUF_writable);
+                if ( rc )
+                    iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
+            }
         }
     }
 
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 9860c6c..9e1f5c6 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -829,12 +829,19 @@ out:
          need_modify_vtd_table )
     {
         if ( iommu_hap_pt_share )
-            iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
+            rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
         else
         {
             if ( iommu_flags )
                 for ( i = 0; i < (1 << order); i++ )
-                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
+                {
+                    rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
+                    if ( rc )
+                    {
+                        while ( i-- > 0 )
+                            iommu_unmap_page(d, gfn + i);
+                    }
+                }
             else
                 for ( i = 0; i < (1 << order); i++ )
                     iommu_unmap_page(d, gfn + i);
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 709920a..942a11c 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -675,8 +675,15 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
         }
         else if ( iommu_pte_flags )
             for ( i = 0; i < (1UL << page_order); i++ )
-                iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i,
-                               iommu_pte_flags);
+            {
+                rc = iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i,
+                                    iommu_pte_flags);
+                if ( rc )
+                {
+                    while ( i-- > 0 )
+                        iommu_unmap_page(p2m->domain, gfn + i);
+                }
+            }
         else
             for ( i = 0; i < (1UL << page_order); i++ )
                 iommu_unmap_page(p2m->domain, gfn + i);
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 1b9fcfc..577eb3d 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -228,7 +228,7 @@ static int late_hwdom_init(struct domain *d)
 
     rcu_unlock_domain(dom0);
 
-    iommu_hwdom_init(d);
+    rv = iommu_hwdom_init(d);
 
     return rv;
 #else
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 2b449d5..1b9bd05 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -919,8 +919,9 @@ __gnttab_map_grant_ref(
             {
                 nr_gets++;
                 (void)get_page(pg, rd);
-                if ( !(op->flags & GNTMAP_readonly) )
-                    get_page_type(pg, PGT_writable_page);
+                if ( !(op->flags & GNTMAP_readonly) &&
+                     get_page_type(pg, PGT_writable_page) )
+                        goto could_not_pin;
             }
         }
     }
diff --git a/xen/common/memory.c b/xen/common/memory.c
index b541f4a1..c228d9f 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -631,8 +631,9 @@ static int xenmem_add_to_physmap(struct domain *d,
     if ( need_iommu(d) )
     {
         this_cpu(iommu_dont_flush_iotlb) = 0;
-        iommu_iotlb_flush(d, xatp->idx - done, done);
-        iommu_iotlb_flush(d, xatp->gpfn - done, done);
+        rc = iommu_iotlb_flush(d, xatp->idx - done, done);
+        if ( !rc )
+            rc = iommu_iotlb_flush(d, xatp->gpfn - done, done);
     }
 #endif
 
-- 
1.9.1


* [PATCH v5 3/7] VT-d: Reduce spin timeout to 1ms, which can be boot-time changed.
  2016-02-05 10:18 [PATCH v5 0/7] VT-d Device-TLB flush issue Quan Xu
  2016-02-05 10:18 ` [PATCH v5 1/7] VT-d: Check VT-d Device-TLB flush error(IOMMU part) Quan Xu
  2016-02-05 10:18 ` [PATCH v5 2/7] VT-d: Check VT-d Device-TLB flush error(MMU part) Quan Xu
@ 2016-02-05 10:18 ` Quan Xu
  2016-02-17 14:06   ` Jan Beulich
  2016-02-05 10:18 ` [PATCH v5 4/7] VT-d: Refactor iommu_ops .iotlb_flush() and iotlb_flush_all() Quan Xu
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 31+ messages in thread
From: Quan Xu @ 2016-02-05 10:18 UTC (permalink / raw)
  To: jbeulich, kevin.tian
  Cc: feng.wu, george.dunlap, andrew.cooper3, tim, xen-devel, Quan Xu

Signed-off-by: Quan Xu <quan.xu@intel.com>
---
 docs/misc/xen-command-line.markdown  |  7 +++++++
 xen/drivers/passthrough/vtd/qinval.c | 11 +++++++++--
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index a565c1b..6ed5cd8 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -968,6 +968,13 @@ Use this to work around firmware issues providing correct RMRR entries. Rather
 than only mapping RAM pages for IOMMU accesses for Dom0, with this option all
 pages not marked as unusable in the E820 table will get a mapping established.
 
+### vtd\_qi\_timeout (VT-d)
+> `= <integer>`
+
+> Default: `1`
+
+>> Specify the timeout of the VT-d Queued Invalidation in ms.
+
 ### irq\_ratelimit
 > `= <integer>`
 
diff --git a/xen/drivers/passthrough/vtd/qinval.c b/xen/drivers/passthrough/vtd/qinval.c
index 946e812..f9e752b 100644
--- a/xen/drivers/passthrough/vtd/qinval.c
+++ b/xen/drivers/passthrough/vtd/qinval.c
@@ -28,6 +28,11 @@
 #include "vtd.h"
 #include "extern.h"
 
+static unsigned int __read_mostly vtd_qi_timeout = 1;
+integer_param("vtd_qi_timeout", vtd_qi_timeout);
+
+#define IOMMU_QI_TIMEOUT (vtd_qi_timeout * MILLISECS(1))
+
 static void print_qi_regs(struct iommu *iommu)
 {
     u64 val;
@@ -167,10 +172,12 @@ static int queue_invalidate_wait(struct iommu *iommu,
         start_time = NOW();
         while ( poll_slot != QINVAL_STAT_DONE )
         {
-            if ( NOW() > (start_time + DMAR_OPERATION_TIMEOUT) )
+            if ( NOW() > (start_time + IOMMU_QI_TIMEOUT) )
             {
                 print_qi_regs(iommu);
-                panic("queue invalidate wait descriptor was not executed");
+                dprintk(XENLOG_WARNING VTDPREFIX,
+                        "Queue invalidate wait descriptor timed out.\n");
+                return -ETIMEDOUT;
             }
             cpu_relax();
         }
-- 
1.9.1


* [PATCH v5 4/7] VT-d: Refactor iommu_ops .iotlb_flush() and iotlb_flush_all()
  2016-02-05 10:18 [PATCH v5 0/7] VT-d Device-TLB flush issue Quan Xu
                   ` (2 preceding siblings ...)
  2016-02-05 10:18 ` [PATCH v5 3/7] VT-d: Reduce spin timeout to 1ms, which can be boot-time changed Quan Xu
@ 2016-02-05 10:18 ` Quan Xu
  2016-02-17 14:19   ` Jan Beulich
  2016-02-05 10:18 ` [PATCH v5 5/7] VT-d: Refactor iommu_ops .map_page() and unmap_page() Quan Xu
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 31+ messages in thread
From: Quan Xu @ 2016-02-05 10:18 UTC (permalink / raw)
  To: jbeulich, kevin.tian
  Cc: feng.wu, george.dunlap, andrew.cooper3, tim, xen-devel, Quan Xu

to pass down a flag indicating whether the lock is being held.

Signed-off-by: Quan Xu <quan.xu@intel.com>
---
 xen/arch/arm/p2m.c                  |  2 +-
 xen/common/memory.c                 |  4 ++--
 xen/drivers/passthrough/iommu.c     |  9 +++++----
 xen/drivers/passthrough/vtd/iommu.c |  5 +++--
 xen/drivers/passthrough/x86/iommu.c |  2 +-
 xen/include/xen/iommu.h             | 17 +++++++++++++----
 6 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index e396c40..6eec959 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1100,7 +1100,7 @@ tlbflush:
     if ( flush )
     {
         flush_tlb_domain(d);
-        iommu_iotlb_flush(d, sgfn, egfn - sgfn);
+        iommu_iotlb_flush(d, sgfn, egfn - sgfn, NONE_LOCK);
     }
 
 out:
diff --git a/xen/common/memory.c b/xen/common/memory.c
index c228d9f..e68c3dd 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -631,9 +631,9 @@ static int xenmem_add_to_physmap(struct domain *d,
     if ( need_iommu(d) )
     {
         this_cpu(iommu_dont_flush_iotlb) = 0;
-        rc = iommu_iotlb_flush(d, xatp->idx - done, done);
+        rc = iommu_iotlb_flush(d, xatp->idx - done, done, NONE_LOCK);
         if ( !rc )
-            rc = iommu_iotlb_flush(d, xatp->gpfn - done, done);
+            rc = iommu_iotlb_flush(d, xatp->gpfn - done, done, NONE_LOCK);
     }
 #endif
 
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index cdf8e9a..ebd6d47 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -270,24 +270,25 @@ static void iommu_free_pagetables(unsigned long unused)
                             cpumask_cycle(smp_processor_id(), &cpu_online_map));
 }
 
-int iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count)
+int iommu_iotlb_flush(struct domain *d, unsigned long gfn,
+                      unsigned int page_count, unsigned int lock)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
 
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush )
         return 0;
 
-    return hd->platform_ops->iotlb_flush(d, gfn, page_count);
+    return hd->platform_ops->iotlb_flush(d, gfn, page_count, lock);
 }
 
-int iommu_iotlb_flush_all(struct domain *d)
+int iommu_iotlb_flush_all(struct domain *d, unsigned int lock)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
 
     if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush_all )
         return 0;
 
-    return hd->platform_ops->iotlb_flush_all(d);
+    return hd->platform_ops->iotlb_flush_all(d, lock);
 }
 
 int __init iommu_setup(void)
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index a780632..e8cbfdb 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -601,12 +601,13 @@ static int __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
     return rc;
 }
 
-static int intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count)
+static int intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
+                                   unsigned int page_count, unsigned int lock)
 {
     return __intel_iommu_iotlb_flush(d, gfn, 1, page_count);
 }
 
-static int intel_iommu_iotlb_flush_all(struct domain *d)
+static int intel_iommu_iotlb_flush_all(struct domain *d, unsigned int lock)
 {
     return __intel_iommu_iotlb_flush(d, 0, 0, 0);
 }
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 6674fb0..4bbf5f8 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -105,7 +105,7 @@ int arch_iommu_populate_page_table(struct domain *d)
 
     if ( !rc )
     {
-        rc = iommu_iotlb_flush_all(d);
+        rc = iommu_iotlb_flush_all(d, PCIDEVS_LOCK);
         if ( rc )
             return rc;
     }
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index f5b6f7e..f58e9d6 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -131,6 +131,13 @@ struct page_info;
  * callback pair.
  */
 typedef int iommu_grdm_t(xen_pfn_t start, xen_ulong_t nr, u32 id, void *ctxt);
+/*
+ * A flag indicates whether the lock is being held.
+ * NONE_LOCK - no lock is being held.
+ * PCIDEVS_LOCK - pcidevs_lock is being held.
+ */
+#define NONE_LOCK 0
+#define PCIDEVS_LOCK 1
 
 struct iommu_ops {
     int (*init)(struct domain *d);
@@ -161,8 +168,9 @@ struct iommu_ops {
     void (*resume)(void);
     void (*share_p2m)(struct domain *d);
     int (*crash_shutdown)(void);
-    int (*iotlb_flush)(struct domain *d, unsigned long gfn, unsigned int page_count);
-    int (*iotlb_flush_all)(struct domain *d);
+    int (*iotlb_flush)(struct domain *d, unsigned long gfn, unsigned int page_count,
+                       unsigned int lock);
+    int (*iotlb_flush_all)(struct domain *d, unsigned int lock);
     int (*get_reserved_device_memory)(iommu_grdm_t *, void *);
     void (*dump_p2m_table)(struct domain *d);
 };
@@ -182,8 +190,9 @@ int iommu_do_pci_domctl(struct xen_domctl *, struct domain *d,
 int iommu_do_domctl(struct xen_domctl *, struct domain *d,
                     XEN_GUEST_HANDLE_PARAM(xen_domctl_t));
 
-int iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count);
-int iommu_iotlb_flush_all(struct domain *d);
+int iommu_iotlb_flush(struct domain *d, unsigned long gfn, unsigned int page_count,
+                      unsigned int lock);
+int iommu_iotlb_flush_all(struct domain *d, unsigned int lock);
 
 /*
  * The purpose of the iommu_dont_flush_iotlb optional cpu flag is to
-- 
1.9.1


* [PATCH v5 5/7] VT-d: Refactor iommu_ops .map_page() and unmap_page()
  2016-02-05 10:18 [PATCH v5 0/7] VT-d Device-TLB flush issue Quan Xu
                   ` (3 preceding siblings ...)
  2016-02-05 10:18 ` [PATCH v5 4/7] VT-d: Refactor iommu_ops .iotlb_flush() and iotlb_flush_all() Quan Xu
@ 2016-02-05 10:18 ` Quan Xu
  2016-02-17 14:23   ` Jan Beulich
  2016-02-05 10:18 ` [PATCH v5 6/7] VT-d: Refactor iommu_flush .iotlb() Quan Xu
  2016-02-05 10:18 ` [PATCH v5 7/7] VT-d: Fix vt-d Device-TLB flush timeout issue Quan Xu
  6 siblings, 1 reply; 31+ messages in thread
From: Quan Xu @ 2016-02-05 10:18 UTC (permalink / raw)
  To: jbeulich, kevin.tian
  Cc: feng.wu, george.dunlap, andrew.cooper3, tim, xen-devel, Quan Xu

to pass down a flag indicating whether the lock is being held,
and to check errors all the way up the call trees.

Signed-off-by: Quan Xu <quan.xu@intel.com>
---
 xen/arch/x86/mm.c                             |  9 ++++++---
 xen/arch/x86/mm/p2m-ept.c                     |  7 ++++---
 xen/arch/x86/mm/p2m-pt.c                      |  7 ++++---
 xen/arch/x86/mm/p2m.c                         | 24 +++++++++++++++---------
 xen/arch/x86/x86_64/mm.c                      |  5 +++--
 xen/common/grant_table.c                      | 11 +++++++----
 xen/drivers/passthrough/amd/iommu_map.c       |  7 ++++---
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |  3 ++-
 xen/drivers/passthrough/arm/smmu.c            |  2 +-
 xen/drivers/passthrough/iommu.c               | 11 ++++++-----
 xen/drivers/passthrough/vtd/iommu.c           | 10 ++++++----
 xen/drivers/passthrough/vtd/x86/vtd.c         |  5 +++--
 xen/drivers/passthrough/x86/iommu.c           |  3 ++-
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |  4 ++--
 xen/include/asm-x86/p2m.h                     |  6 ++++--
 xen/include/xen/iommu.h                       |  8 ++++----
 16 files changed, 73 insertions(+), 49 deletions(-)

diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 1e50b94..f9030e5 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2443,14 +2443,17 @@ static int __get_page_type(struct page_info *page, unsigned long type,
         if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
         {
             if ( (x & PGT_type_mask) == PGT_writable_page )
-                iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
+                iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
+                                 NONE_LOCK);
             else if ( type == PGT_writable_page )
             {
                 rc = iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
                                     page_to_mfn(page),
-                                    IOMMUF_readable|IOMMUF_writable);
+                                    IOMMUF_readable|IOMMUF_writable,
+                                    NONE_LOCK);
                 if ( rc )
-                    iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
+                    iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
+                                     NONE_LOCK);
             }
         }
     }
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 9e1f5c6..ecf7e67 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -835,16 +835,17 @@ out:
             if ( iommu_flags )
                 for ( i = 0; i < (1 << order); i++ )
                 {
-                    rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
+                    rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i,
+                                        iommu_flags, NONE_LOCK);
                     if ( rc )
                     {
                         while ( i-- > 0 )
-                            iommu_unmap_page(d, gfn + i);
+                            iommu_unmap_page(d, gfn + i, NONE_LOCK);
                     }
                 }
             else
                 for ( i = 0; i < (1 << order); i++ )
-                    iommu_unmap_page(d, gfn + i);
+                    iommu_unmap_page(d, gfn + i, NONE_LOCK);
         }
     }
 
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 942a11c..e73c0e8 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -677,16 +677,17 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
             for ( i = 0; i < (1UL << page_order); i++ )
             {
                 rc = iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i,
-                                    iommu_pte_flags);
+                                    iommu_pte_flags, NONE_LOCK);
                 if ( rc )
                 {
                     while ( i-- > 0 )
-                        iommu_unmap_page(p2m->domain, gfn + i);
+                        iommu_unmap_page(p2m->domain, gfn + i,
+                                         NONE_LOCK);
                 }
             }
         else
             for ( i = 0; i < (1UL << page_order); i++ )
-                iommu_unmap_page(p2m->domain, gfn + i);
+                iommu_unmap_page(p2m->domain, gfn + i, NONE_LOCK);
     }
 
     /*
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index c6b883d..76748d4 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -610,7 +610,7 @@ p2m_remove_page(struct p2m_domain *p2m, unsigned long gfn, unsigned long mfn,
     {
         if ( need_iommu(p2m->domain) )
             for ( i = 0; i < (1 << page_order); i++ )
-                iommu_unmap_page(p2m->domain, mfn + i);
+                iommu_unmap_page(p2m->domain, mfn + i, NONE_LOCK);
         return 0;
     }
 
@@ -662,12 +662,13 @@ guest_physmap_add_entry(struct domain *d, unsigned long gfn,
         {
             for ( i = 0; i < (1 << page_order); i++ )
             {
-                rc = iommu_map_page(
-                    d, mfn + i, mfn + i, IOMMUF_readable|IOMMUF_writable);
+                rc = iommu_map_page(d, mfn + i, mfn + i,
+                                    IOMMUF_readable|IOMMUF_writable,
+                                    NONE_LOCK);
                 if ( rc != 0 )
                 {
                     while ( i-- > 0 )
-                        iommu_unmap_page(d, mfn + i);
+                        iommu_unmap_page(d, mfn + i, NONE_LOCK);
                     return rc;
                 }
             }
@@ -948,7 +949,8 @@ int set_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn,
 }
 
 int set_identity_p2m_entry(struct domain *d, unsigned long gfn,
-                           p2m_access_t p2ma, unsigned int flag)
+                           p2m_access_t p2ma, unsigned int flag,
+                           unsigned int lock)
 {
     p2m_type_t p2mt;
     p2m_access_t a;
@@ -960,7 +962,8 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn,
     {
         if ( !need_iommu(d) )
             return 0;
-        return iommu_map_page(d, gfn, gfn, IOMMUF_readable|IOMMUF_writable);
+        return iommu_map_page(d, gfn, gfn, IOMMUF_readable|IOMMUF_writable,
+                              lock);
     }
 
     gfn_lock(p2m, gfn, 0);
@@ -979,7 +982,9 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn,
          * RMRRs are correctly mapped with IOMMU.
          */
         if ( is_hardware_domain(d) && !iommu_use_hap_pt(d) )
-            ret = iommu_map_page(d, gfn, gfn, IOMMUF_readable|IOMMUF_writable);
+            ret = iommu_map_page(d, gfn, gfn,
+                                 IOMMUF_readable|IOMMUF_writable,
+                                 lock);
     }
     else
     {
@@ -1032,7 +1037,8 @@ int clear_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn)
     return rc;
 }
 
-int clear_identity_p2m_entry(struct domain *d, unsigned long gfn)
+int clear_identity_p2m_entry(struct domain *d, unsigned long gfn,
+                             unsigned int lock)
 {
     p2m_type_t p2mt;
     p2m_access_t a;
@@ -1044,7 +1050,7 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn)
     {
         if ( !need_iommu(d) )
             return 0;
-        return iommu_unmap_page(d, gfn);
+        return iommu_unmap_page(d, gfn, lock);
     }
 
     gfn_lock(p2m, gfn, 0);
diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index d918002..ebd6fad 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -1433,12 +1433,13 @@ int memory_add(unsigned long spfn, unsigned long epfn, unsigned int pxm)
     if ( iommu_enabled && !iommu_passthrough && !need_iommu(hardware_domain) )
     {
         for ( i = spfn; i < epfn; i++ )
-            if ( iommu_map_page(hardware_domain, i, i, IOMMUF_readable|IOMMUF_writable) )
+            if ( iommu_map_page(hardware_domain, i, i,
+                 IOMMUF_readable|IOMMUF_writable, NONE_LOCK) )
                 break;
         if ( i != epfn )
         {
             while (i-- > old_max)
-                iommu_unmap_page(hardware_domain, i);
+                iommu_unmap_page(hardware_domain, i, NONE_LOCK);
             goto destroy_m2p;
         }
     }
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 1b9bd05..8bc233e 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -953,12 +953,14 @@ __gnttab_map_grant_ref(
         {
             if ( !(kind & MAPKIND_WRITE) )
                 err = iommu_map_page(ld, frame, frame,
-                                     IOMMUF_readable|IOMMUF_writable);
+                                     IOMMUF_readable|IOMMUF_writable,
+                                     NONE_LOCK);
         }
         else if ( act_pin && !old_pin )
         {
             if ( !kind )
-                err = iommu_map_page(ld, frame, frame, IOMMUF_readable);
+                err = iommu_map_page(ld, frame, frame,
+                                     IOMMUF_readable, NONE_LOCK);
         }
         if ( err )
         {
@@ -1178,9 +1180,10 @@ __gnttab_unmap_common(
 
         kind = mapkind(lgt, rd, op->frame);
         if ( !kind )
-            err = iommu_unmap_page(ld, op->frame);
+            err = iommu_unmap_page(ld, op->frame, NONE_LOCK);
         else if ( !(kind & MAPKIND_WRITE) )
-            err = iommu_map_page(ld, op->frame, op->frame, IOMMUF_readable);
+            err = iommu_map_page(ld, op->frame, op->frame,
+                                 IOMMUF_readable, NONE_LOCK);
 
         double_gt_unlock(lgt, rgt);
 
diff --git a/xen/drivers/passthrough/amd/iommu_map.c b/xen/drivers/passthrough/amd/iommu_map.c
index 78862c9..523feec 100644
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -634,7 +634,7 @@ static int update_paging_mode(struct domain *d, unsigned long gfn)
 }
 
 int amd_iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
-                       unsigned int flags)
+                       unsigned int flags, unsigned int lock)
 {
     bool_t need_flush = 0;
     struct hvm_iommu *hd = domain_hvm_iommu(d);
@@ -714,7 +714,7 @@ out:
     return 0;
 }
 
-int amd_iommu_unmap_page(struct domain *d, unsigned long gfn)
+int amd_iommu_unmap_page(struct domain *d, unsigned long gfn, unsigned int lock)
 {
     unsigned long pt_mfn[7];
     struct hvm_iommu *hd = domain_hvm_iommu(d);
@@ -777,7 +777,8 @@ int amd_iommu_reserve_domain_unity_map(struct domain *domain,
     gfn = phys_addr >> PAGE_SHIFT;
     for ( i = 0; i < npages; i++ )
     {
-        rt = amd_iommu_map_page(domain, gfn +i, gfn +i, flags);
+        rt = amd_iommu_map_page(domain, gfn +i, gfn +i, flags,
+                                PCIDEVS_LOCK);
         if ( rt != 0 )
             return rt;
     }
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index 449de13..1655dd9 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -298,7 +298,8 @@ static int __hwdom_init amd_iommu_hwdom_init(struct domain *d)
              */
             if ( mfn_valid(pfn) )
                 amd_iommu_map_page(d, pfn, pfn, 
-                                   IOMMUF_readable|IOMMUF_writable);
+                                   IOMMUF_readable|IOMMUF_writable,
+                                   PCIDEVS_LOCK);
 
             if ( !(i & 0xfffff) )
                 process_pending_softirqs();
diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
index 155b7f3..f89ee1b 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2781,7 +2781,7 @@ static int arm_smmu_map_page(struct domain *d, unsigned long gfn,
 	return guest_physmap_add_entry(d, gfn, mfn, 0, t);
 }
 
-static int arm_smmu_unmap_page(struct domain *d, unsigned long gfn)
+static int arm_smmu_unmap_page(struct domain *d, unsigned long gfn, unsigned int lock)
 {
 	/*
 	 * This function should only be used by gnttab code when the domain
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index ebd6d47..8b5c0a3 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -172,7 +172,8 @@ int __hwdom_init iommu_hwdom_init(struct domain *d)
                  ((page->u.inuse.type_info & PGT_type_mask)
                   == PGT_writable_page) )
                 mapping |= IOMMUF_writable;
-            rc = hd->platform_ops->map_page(d, gfn, mfn, mapping);
+            rc = hd->platform_ops->map_page(d, gfn, mfn, mapping,
+                                            PCIDEVS_LOCK);
             if ( rc )
                 return rc;
 
@@ -233,24 +234,24 @@ void iommu_domain_destroy(struct domain *d)
 }
 
 int iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
-                   unsigned int flags)
+                   unsigned int flags, unsigned int lock)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
 
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    return hd->platform_ops->map_page(d, gfn, mfn, flags);
+    return hd->platform_ops->map_page(d, gfn, mfn, flags, lock);
 }
 
-int iommu_unmap_page(struct domain *d, unsigned long gfn)
+int iommu_unmap_page(struct domain *d, unsigned long gfn, unsigned int lock)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
 
     if ( !iommu_enabled || !hd->platform_ops )
         return 0;
 
-    return hd->platform_ops->unmap_page(d, gfn);
+    return hd->platform_ops->unmap_page(d, gfn, lock);
 }
 
 static void iommu_free_pagetables(unsigned long unused)
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index e8cbfdb..6696b16 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1711,7 +1711,7 @@ static void iommu_domain_teardown(struct domain *d)
 
 static int intel_iommu_map_page(
     struct domain *d, unsigned long gfn, unsigned long mfn,
-    unsigned int flags)
+    unsigned int flags, unsigned int lock)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
     struct dma_pte *page = NULL, *pte = NULL, old, new = { 0 };
@@ -1765,7 +1765,8 @@ static int intel_iommu_map_page(
     return 0;
 }
 
-static int intel_iommu_unmap_page(struct domain *d, unsigned long gfn)
+static int intel_iommu_unmap_page(struct domain *d, unsigned long gfn,
+                                  unsigned int lock)
 {
     /* Do nothing if hardware domain and iommu supports pass thru. */
     if ( iommu_passthrough && is_hardware_domain(d) )
@@ -1865,7 +1866,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
 
             while ( base_pfn < end_pfn )
             {
-                if ( clear_identity_p2m_entry(d, base_pfn) )
+                if ( clear_identity_p2m_entry(d, base_pfn, PCIDEVS_LOCK) )
                     ret = -ENXIO;
                 base_pfn++;
             }
@@ -1881,7 +1882,8 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
 
     while ( base_pfn < end_pfn )
     {
-        int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag);
+        int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag,
+                                         PCIDEVS_LOCK);
 
         if ( err )
             return err;
diff --git a/xen/drivers/passthrough/vtd/x86/vtd.c b/xen/drivers/passthrough/vtd/x86/vtd.c
index a19177c..0d4aea7 100644
--- a/xen/drivers/passthrough/vtd/x86/vtd.c
+++ b/xen/drivers/passthrough/vtd/x86/vtd.c
@@ -143,11 +143,12 @@ int __hwdom_init vtd_set_hwdom_mapping(struct domain *d)
         for ( j = 0; j < tmp; j++ )
         {
             rc = iommu_map_page(d, pfn * tmp + j, pfn * tmp + j,
-                                IOMMUF_readable|IOMMUF_writable);
+                                IOMMUF_readable|IOMMUF_writable,
+                                NONE_LOCK);
             if ( rc )
             {
                 while ( j-- > 0 )
-                    iommu_unmap_page(d, pfn * tmp + j);
+                    iommu_unmap_page(d, pfn * tmp + j, NONE_LOCK);
                 break;
             }
         }
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 4bbf5f8..9267a54 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -67,7 +67,8 @@ int arch_iommu_populate_page_table(struct domain *d)
                 BUG_ON(SHARED_M2P(gfn));
                 rc = hd->platform_ops->map_page(d, gfn, mfn,
                                                 IOMMUF_readable |
-                                                IOMMUF_writable);
+                                                IOMMUF_writable,
+                                                PCIDEVS_LOCK);
             }
             if ( rc )
             {
diff --git a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
index 4691f9b..9fc678a 100644
--- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
@@ -52,8 +52,8 @@ int amd_iommu_update_ivrs_mapping_acpi(void);
 
 /* mapping functions */
 int amd_iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
-                       unsigned int flags);
-int amd_iommu_unmap_page(struct domain *d, unsigned long gfn);
+                       unsigned int flags, unsigned int lock);
+int amd_iommu_unmap_page(struct domain *d, unsigned long gfn, unsigned int lock);
 u64 amd_iommu_get_next_table_from_pte(u32 *entry);
 int amd_iommu_reserve_domain_unity_map(struct domain *domain,
                                        u64 phys_addr, unsigned long size,
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 5e99ac6..3c12c95 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -562,8 +562,10 @@ int clear_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn);
 
 /* Set identity addresses in the p2m table (for pass-through) */
 int set_identity_p2m_entry(struct domain *d, unsigned long gfn,
-                           p2m_access_t p2ma, unsigned int flag);
-int clear_identity_p2m_entry(struct domain *d, unsigned long gfn);
+                           p2m_access_t p2ma, unsigned int flag,
+                           unsigned int lock);
+int clear_identity_p2m_entry(struct domain *d, unsigned long gfn,
+                             unsigned int lock);
 
 /* Add foreign mapping to the guest's p2m table. */
 int p2m_add_foreign(struct domain *tdom, unsigned long fgfn,
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index f58e9d6..27e0e23 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -75,8 +75,8 @@ void iommu_teardown(struct domain *d);
 #define _IOMMUF_writable 1
 #define IOMMUF_writable  (1u<<_IOMMUF_writable)
 int iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
-                   unsigned int flags);
-int iommu_unmap_page(struct domain *d, unsigned long gfn);
+                   unsigned int flags, unsigned int lock);
+int iommu_unmap_page(struct domain *d, unsigned long gfn, unsigned int lock);
 
 enum iommu_feature
 {
@@ -156,8 +156,8 @@ struct iommu_ops {
 
     void (*teardown)(struct domain *d);
     int (*map_page)(struct domain *d, unsigned long gfn, unsigned long mfn,
-                    unsigned int flags);
-    int (*unmap_page)(struct domain *d, unsigned long gfn);
+                    unsigned int flags, unsigned int lock);
+    int (*unmap_page)(struct domain *d, unsigned long gfn, unsigned int lock);
     void (*free_page_table)(struct page_info *);
 #ifdef CONFIG_X86
     void (*update_ire_from_apic)(unsigned int apic, unsigned int reg, unsigned int value);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v5 6/7] VT-d: Refactor iommu_flush .iotlb()
  2016-02-05 10:18 [PATCH v5 0/7] VT-d Device-TLB flush issue Quan Xu
                   ` (4 preceding siblings ...)
  2016-02-05 10:18 ` [PATCH v5 5/7] VT-d: Refactor iommu_ops .map_page() and unmap_page() Quan Xu
@ 2016-02-05 10:18 ` Quan Xu
  2016-02-05 10:18 ` [PATCH v5 7/7] VT-d: Fix vt-d Device-TLB flush timeout issue Quan Xu
  6 siblings, 0 replies; 31+ messages in thread
From: Quan Xu @ 2016-02-05 10:18 UTC (permalink / raw)
  To: jbeulich, kevin.tian
  Cc: feng.wu, george.dunlap, andrew.cooper3, tim, xen-devel, Quan Xu

to pass down a flag indicating whether the pcidevs lock is being held,
and to check errors all the way up the call trees.

Signed-off-by: Quan Xu <quan.xu@intel.com>
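[Editorial note: the lock-flag plumbing added throughout this patch follows a common pattern: the caller tells the callee whether it already holds a lock, so the callee only acquires it when needed. The sketch below is illustrative only; `NONE_LOCK`/`PCIDEVS_LOCK` match the series, but `lock_depth` and `do_locked_work()` are stand-ins, not Xen code.]

```c
#include <assert.h>

#define NONE_LOCK    0u
#define PCIDEVS_LOCK 1u

/* Stand-in for the pcidevs spinlock: >0 means "held". */
static int lock_depth;

/*
 * Acquire the lock only when the caller signals it is not already
 * held, mirroring the 'unsigned int lock' parameter added to
 * iommu_map_page()/iommu_unmap_page() in this series.  Returns 1
 * if the lock was held exactly once inside the critical section.
 */
static int do_locked_work(unsigned int lock)
{
    int balanced;

    if ( !(lock & PCIDEVS_LOCK) )
        lock_depth++;

    balanced = (lock_depth == 1);   /* critical section */

    if ( !(lock & PCIDEVS_LOCK) )
        lock_depth--;

    return balanced;
}
```

Either way the lock is held exactly once inside, and never double-acquired.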
---
 xen/arch/x86/mm/p2m-ept.c            |  3 +-
 xen/drivers/passthrough/vtd/iommu.c  | 73 ++++++++++++++++++++++--------------
 xen/drivers/passthrough/vtd/iommu.h  |  3 +-
 xen/drivers/passthrough/vtd/qinval.c |  3 +-
 xen/include/asm-x86/iommu.h          |  3 +-
 5 files changed, 53 insertions(+), 32 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index ecf7e67..4c9bdfe 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -829,7 +829,8 @@ out:
          need_modify_vtd_table )
     {
         if ( iommu_hap_pt_share )
-            rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
+            rc = iommu_pte_flush(d, gfn, &ept_entry->epte, order,
+                                 vtd_pte_present, NONE_LOCK);
         else
         {
             if ( iommu_flags )
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 6696b16..8d54c01 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -410,7 +410,8 @@ static int iommu_flush_context_device(
 /* return value determine if we need a write buffer flush */
 static int flush_iotlb_reg(void *_iommu, u16 did,
                            u64 addr, unsigned int size_order, u64 type,
-                           int flush_non_present_entry, int flush_dev_iotlb)
+                           int flush_non_present_entry, int flush_dev_iotlb,
+                           unsigned int lock)
 {
     struct iommu *iommu = (struct iommu *) _iommu;
     int tlb_offset = ecap_iotlb_offset(iommu->ecap);
@@ -475,7 +476,8 @@ static int flush_iotlb_reg(void *_iommu, u16 did,
 }
 
 static int iommu_flush_iotlb_global(struct iommu *iommu,
-    int flush_non_present_entry, int flush_dev_iotlb)
+    int flush_non_present_entry, int flush_dev_iotlb,
+    unsigned int lock)
 {
     struct iommu_flush *flush = iommu_get_flush(iommu);
     int status;
@@ -484,7 +486,8 @@ static int iommu_flush_iotlb_global(struct iommu *iommu,
     vtd_ops_preamble_quirk(iommu);
 
     status = flush->iotlb(iommu, 0, 0, 0, DMA_TLB_GLOBAL_FLUSH,
-                        flush_non_present_entry, flush_dev_iotlb);
+                          flush_non_present_entry, flush_dev_iotlb,
+                          lock);
 
     /* undo platform specific errata workarounds */
     vtd_ops_postamble_quirk(iommu);
@@ -493,7 +496,8 @@ static int iommu_flush_iotlb_global(struct iommu *iommu,
 }
 
 static int iommu_flush_iotlb_dsi(struct iommu *iommu, u16 did,
-    int flush_non_present_entry, int flush_dev_iotlb)
+    int flush_non_present_entry, int flush_dev_iotlb,
+    unsigned int lock)
 {
     struct iommu_flush *flush = iommu_get_flush(iommu);
     int status;
@@ -502,7 +506,8 @@ static int iommu_flush_iotlb_dsi(struct iommu *iommu, u16 did,
     vtd_ops_preamble_quirk(iommu);
 
     status =  flush->iotlb(iommu, did, 0, 0, DMA_TLB_DSI_FLUSH,
-                        flush_non_present_entry, flush_dev_iotlb);
+                           flush_non_present_entry, flush_dev_iotlb,
+                           lock);
 
     /* undo platform specific errata workarounds */
     vtd_ops_postamble_quirk(iommu);
@@ -512,7 +517,8 @@ static int iommu_flush_iotlb_dsi(struct iommu *iommu, u16 did,
 
 static int iommu_flush_iotlb_psi(
     struct iommu *iommu, u16 did, u64 addr, unsigned int order,
-    int flush_non_present_entry, int flush_dev_iotlb)
+    int flush_non_present_entry, int flush_dev_iotlb,
+    unsigned int lock)
 {
     struct iommu_flush *flush = iommu_get_flush(iommu);
     int status;
@@ -521,11 +527,13 @@ static int iommu_flush_iotlb_psi(
 
     /* Fallback to domain selective flush if no PSI support */
     if ( !cap_pgsel_inv(iommu->cap) )
-        return iommu_flush_iotlb_dsi(iommu, did, flush_non_present_entry, flush_dev_iotlb);
+        return iommu_flush_iotlb_dsi(iommu, did, flush_non_present_entry,
+                                     flush_dev_iotlb, lock);
 
     /* Fallback to domain selective flush if size is too big */
     if ( order > cap_max_amask_val(iommu->cap) )
-        return iommu_flush_iotlb_dsi(iommu, did, flush_non_present_entry, flush_dev_iotlb);
+        return iommu_flush_iotlb_dsi(iommu, did, flush_non_present_entry,
+                                     flush_dev_iotlb, lock);
 
     addr >>= PAGE_SHIFT_4K + order;
     addr <<= PAGE_SHIFT_4K + order;
@@ -534,7 +542,8 @@ static int iommu_flush_iotlb_psi(
     vtd_ops_preamble_quirk(iommu);
 
     status = flush->iotlb(iommu, did, addr, order, DMA_TLB_PSI_FLUSH,
-                        flush_non_present_entry, flush_dev_iotlb);
+                          flush_non_present_entry, flush_dev_iotlb,
+                          lock);
 
     /* undo platform specific errata workarounds */
     vtd_ops_postamble_quirk(iommu);
@@ -542,7 +551,7 @@ static int iommu_flush_iotlb_psi(
     return status;
 }
 
-static int iommu_flush_all(void)
+static int iommu_flush_all(unsigned int lock)
 {
     struct acpi_drhd_unit *drhd;
     struct iommu *iommu;
@@ -554,14 +563,17 @@ static int iommu_flush_all(void)
         iommu = drhd->iommu;
         iommu_flush_context_global(iommu, 0);
         flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0;
-        return iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb);
+        return iommu_flush_iotlb_global(iommu, 0, flush_dev_iotlb,
+                                        lock);
     }
 
     return 0;
 }
 
 static int __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
-        int dma_old_pte_present, unsigned int page_count)
+                                     int dma_old_pte_present,
+                                     unsigned int page_count,
+                                     unsigned int lock)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
     struct acpi_drhd_unit *drhd;
@@ -588,11 +600,11 @@ static int __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
 
         if ( page_count > 1 || gfn == -1 )
             rc = iommu_flush_iotlb_dsi(iommu, iommu_domid,
-                        0, flush_dev_iotlb);
+                        0, flush_dev_iotlb, lock);
         else
             rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
                         (paddr_t)gfn << PAGE_SHIFT_4K, 0,
-                        !dma_old_pte_present, flush_dev_iotlb);
+                        !dma_old_pte_present, flush_dev_iotlb, lock);
 
         if ( rc )
             iommu_flush_write_buffer(iommu);
@@ -604,16 +616,17 @@ static int __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
 static int intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
                                    unsigned int page_count, unsigned int lock)
 {
-    return __intel_iommu_iotlb_flush(d, gfn, 1, page_count);
+    return __intel_iommu_iotlb_flush(d, gfn, 1, page_count, lock);
 }
 
 static int intel_iommu_iotlb_flush_all(struct domain *d, unsigned int lock)
 {
-    return __intel_iommu_iotlb_flush(d, 0, 0, 0);
+    return __intel_iommu_iotlb_flush(d, 0, 0, 0, lock);
 }
 
 /* clear one page's page table */
-static int dma_pte_clear_one(struct domain *domain, u64 addr)
+static int dma_pte_clear_one(struct domain *domain, u64 addr,
+                             unsigned int lock)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(domain);
     struct dma_pte *page = NULL, *pte = NULL;
@@ -644,7 +657,8 @@ static int dma_pte_clear_one(struct domain *domain, u64 addr)
     iommu_flush_cache_entry(pte, sizeof(struct dma_pte));
 
     if ( !this_cpu(iommu_dont_flush_iotlb) )
-        rc = __intel_iommu_iotlb_flush(domain, addr >> PAGE_SHIFT_4K, 1, 1);
+        rc = __intel_iommu_iotlb_flush(domain, addr >> PAGE_SHIFT_4K,
+                                       1, 1, lock);
 
     unmap_vtd_domain_page(page);
 
@@ -1273,7 +1287,7 @@ static int __hwdom_init intel_iommu_hwdom_init(struct domain *d)
     setup_hwdom_pci_devices(d, setup_hwdom_device);
     setup_hwdom_rmrr(d);
 
-    rc = iommu_flush_all();
+    rc = iommu_flush_all(NONE_LOCK);
     if ( rc )
         return rc;
 
@@ -1419,7 +1433,8 @@ int domain_context_mapping_one(
     {
         int flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0;
 
-        rc = iommu_flush_iotlb_dsi(iommu, 0, 1, flush_dev_iotlb);
+        rc = iommu_flush_iotlb_dsi(iommu, 0, 1, flush_dev_iotlb,
+                                   PCIDEVS_LOCK);
     }
 
     set_bit(iommu->index, &hd->arch.iommu_bitmap);
@@ -1559,7 +1574,8 @@ int domain_context_unmap_one(
     else
     {
         int flush_dev_iotlb = find_ats_dev_drhd(iommu) ? 1 : 0;
-        rc = iommu_flush_iotlb_dsi(iommu, iommu_domid, 0, flush_dev_iotlb);
+        rc = iommu_flush_iotlb_dsi(iommu, iommu_domid, 0, flush_dev_iotlb,
+                                   PCIDEVS_LOCK);
     }
 
     spin_unlock(&iommu->lock);
@@ -1759,7 +1775,8 @@ static int intel_iommu_map_page(
 
     if ( !this_cpu(iommu_dont_flush_iotlb) )
     {
-        return __intel_iommu_iotlb_flush(d, gfn, dma_pte_present(old), 1);
+        return __intel_iommu_iotlb_flush(d, gfn, dma_pte_present(old),
+                                         1, lock);
     }
 
     return 0;
@@ -1772,11 +1789,11 @@ static int intel_iommu_unmap_page(struct domain *d, unsigned long gfn,
     if ( iommu_passthrough && is_hardware_domain(d) )
         return 0;
 
-    return dma_pte_clear_one(d, (paddr_t)gfn << PAGE_SHIFT_4K);
+    return dma_pte_clear_one(d, (paddr_t)gfn << PAGE_SHIFT_4K, lock);
 }
 
 int iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
-                    int order, int present)
+                    int order, int present, unsigned int lock)
 {
     struct acpi_drhd_unit *drhd;
     struct iommu *iommu = NULL;
@@ -1799,7 +1816,7 @@ int iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte,
             continue;
         rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
                                    (paddr_t)gfn << PAGE_SHIFT_4K,
-                                   order, !present, flush_dev_iotlb);
+                                   order, !present, flush_dev_iotlb, lock);
         if ( rc )
             iommu_flush_write_buffer(iommu);
     }
@@ -2126,7 +2143,7 @@ static int init_vtd_hw(void)
         }
     }
 
-    return iommu_flush_all();
+    return iommu_flush_all(NONE_LOCK);
 }
 
 static void __hwdom_init setup_hwdom_rmrr(struct domain *d)
@@ -2404,7 +2421,7 @@ static int vtd_suspend(void)
     if ( !iommu_enabled )
         return 0;
 
-    rc = iommu_flush_all();
+    rc = iommu_flush_all(NONE_LOCK);
     if ( rc )
         return rc;
 
@@ -2448,7 +2465,7 @@ static int vtd_crash_shutdown(void)
     if ( !iommu_enabled )
         return 0;
 
-    rc = iommu_flush_all();
+    rc = iommu_flush_all(NONE_LOCK);
     if ( rc )
         return rc;
 
diff --git a/xen/drivers/passthrough/vtd/iommu.h b/xen/drivers/passthrough/vtd/iommu.h
index ac71ed1..0d9d3c4 100644
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -494,7 +494,8 @@ struct iommu_flush {
     int (*context)(void *iommu, u16 did, u16 source_id,
                    u8 function_mask, u64 type, int non_present_entry_flush);
     int (*iotlb)(void *iommu, u16 did, u64 addr, unsigned int size_order,
-                 u64 type, int flush_non_present_entry, int flush_dev_iotlb);
+                 u64 type, int flush_non_present_entry, int flush_dev_iotlb,
+                 unsigned int lock);
 };
 
 struct intel_iommu {
diff --git a/xen/drivers/passthrough/vtd/qinval.c b/xen/drivers/passthrough/vtd/qinval.c
index f9e752b..f2e7ffb 100644
--- a/xen/drivers/passthrough/vtd/qinval.c
+++ b/xen/drivers/passthrough/vtd/qinval.c
@@ -315,7 +315,8 @@ static int flush_context_qi(
 static int flush_iotlb_qi(
     void *_iommu, u16 did,
     u64 addr, unsigned int size_order, u64 type,
-    int flush_non_present_entry, int flush_dev_iotlb)
+    int flush_non_present_entry, int flush_dev_iotlb,
+    unsigned int lock)
 {
     u8 dr = 0, dw = 0;
     int ret = 0;
diff --git a/xen/include/asm-x86/iommu.h b/xen/include/asm-x86/iommu.h
index cf2a269..f9499e0 100644
--- a/xen/include/asm-x86/iommu.h
+++ b/xen/include/asm-x86/iommu.h
@@ -26,7 +26,8 @@ int iommu_setup_hpet_msi(struct msi_desc *);
 
 /* While VT-d specific, this must get declared in a generic header. */
 int adjust_vtd_irq_affinities(void);
-int iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte, int order, int present);
+int iommu_pte_flush(struct domain *d, u64 gfn, u64 *pte, int order,
+                    int present, unsigned int lock);
 int iommu_supports_eim(void);
 int iommu_enable_x2apic_IR(void);
 void iommu_disable_x2apic_IR(void);
-- 
1.9.1


* [PATCH v5 7/7] VT-d: Fix vt-d Device-TLB flush timeout issue.
  2016-02-05 10:18 [PATCH v5 0/7] VT-d Device-TLB flush issue Quan Xu
                   ` (5 preceding siblings ...)
  2016-02-05 10:18 ` [PATCH v5 6/7] VT-d: Refactor iommu_flush .iotlb() Quan Xu
@ 2016-02-05 10:18 ` Quan Xu
  2016-02-17 14:40   ` Jan Beulich
  6 siblings, 1 reply; 31+ messages in thread
From: Quan Xu @ 2016-02-05 10:18 UTC (permalink / raw)
  To: jbeulich, kevin.tian
  Cc: feng.wu, george.dunlap, andrew.cooper3, tim, xen-devel, Quan Xu

If a Device-TLB flush times out, we hide the target ATS
device and crash the domain owning this ATS device.

If the impacted domain is the hardware domain, we just print a warning.

The hidden device is disallowed from being assigned to any
domain afterwards.

Signed-off-by: Quan Xu <quan.xu@intel.com>
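[Editorial note: the timeout policy above reduces to a small decision function. The sketch below is an illustrative reconstruction, not the patch's actual code; the enum and function names are hypothetical.]

```c
#include <assert.h>
#include <stdbool.h>

enum timeout_action { HIDE_AND_CRASH, HIDE_AND_WARN };

/*
 * On Device-TLB flush timeout: always hide the ATS device; crash
 * the owning domain unless it is the hardware domain, which only
 * gets a warning (crashing the hardware domain would take the
 * whole host down with it).
 */
static enum timeout_action on_flush_timeout(bool is_hardware_domain)
{
    return is_hardware_domain ? HIDE_AND_WARN : HIDE_AND_CRASH;
}
```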
---
 xen/drivers/passthrough/pci.c         |  2 +-
 xen/drivers/passthrough/vtd/extern.h  |  8 +++-
 xen/drivers/passthrough/vtd/qinval.c  | 77 ++++++++++++++++++++++++++++++++++-
 xen/drivers/passthrough/vtd/x86/ats.c | 14 ++++++-
 4 files changed, 95 insertions(+), 6 deletions(-)

diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 27b3ca7..2d7dc59 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -407,7 +407,7 @@ static void _pci_hide_device(struct pci_dev *pdev)
     list_add(&pdev->domain_list, &dom_xen->arch.pdev_list);
 }
 
-int __init pci_hide_device(int bus, int devfn)
+int pci_hide_device(int bus, int devfn)
 {
     struct pci_dev *pdev;
     int rc = -ENOMEM;
diff --git a/xen/drivers/passthrough/vtd/extern.h b/xen/drivers/passthrough/vtd/extern.h
index ec9c513..a129460 100644
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -56,8 +56,12 @@ struct acpi_drhd_unit * find_ats_dev_drhd(struct iommu *iommu);
 
 int ats_device(const struct pci_dev *, const struct acpi_drhd_unit *);
 
-int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
-                         u64 addr, unsigned int size_order, u64 type);
+int dev_invalidate_iotlb(struct iommu *iommu, u16 did, u64 addr,
+                         unsigned int size_order, u64 type,
+                         unsigned int lock);
+int dev_invalidate_iotlb_sync(struct iommu *iommu, u16 did,
+                              u16 seg, u8 bus, u8 devfn,
+                              unsigned int lock);
 
 int qinval_device_iotlb(struct iommu *iommu,
                         u32 max_invs_pend, u16 sid, u16 size, u64 addr);
diff --git a/xen/drivers/passthrough/vtd/qinval.c b/xen/drivers/passthrough/vtd/qinval.c
index f2e7ffb..76046a7 100644
--- a/xen/drivers/passthrough/vtd/qinval.c
+++ b/xen/drivers/passthrough/vtd/qinval.c
@@ -229,6 +229,69 @@ int qinval_device_iotlb(struct iommu *iommu,
     return 0;
 }
 
+static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did,
+                                         u16 seg, u8 bus, u8 devfn,
+                                         unsigned int lock)
+{
+    struct domain *d = NULL;
+    struct pci_dev *pdev;
+
+    if ( test_bit(did, iommu->domid_bitmap) )
+        d = rcu_lock_domain_by_id(iommu->domid_map[did]);
+
+    if ( d == NULL )
+        return;
+
+    for_each_pdev(d, pdev)
+    {
+        if ( (pdev->seg == seg) &&
+             (pdev->bus == bus) &&
+             (pdev->devfn == devfn) )
+        {
+            ASSERT ( pdev->domain );
+            list_del(&pdev->domain_list);
+            pdev->domain = NULL;
+
+            if ( !(lock & PCIDEVS_LOCK) )
+                spin_lock(&pcidevs_lock);
+
+            if ( pci_hide_device(bus, devfn) )
+            {
+                printk(XENLOG_ERR
+                       "IOMMU hide device %04x:%02x:%02x.%02x error.",
+                       seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
+            }
+
+            if ( !(lock & PCIDEVS_LOCK) )
+                spin_unlock(&pcidevs_lock);
+
+            break;
+        }
+    }
+
+    if ( !is_hardware_domain(d) )
+        domain_crash(d);
+    rcu_unlock_domain(d);
+}
+
+int dev_invalidate_iotlb_sync(struct iommu *iommu, u16 did,
+                              u16 seg, u8 bus, u8 devfn,
+                              unsigned int lock)
+{
+    struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
+    int rc = 0;
+
+    if ( qi_ctrl->qinval_maddr )
+    {
+        rc = queue_invalidate_wait(iommu, 0, 1, 1);
+        if ( rc == -ETIMEDOUT )
+            dev_invalidate_iotlb_timeout(iommu, did,
+                                         seg, bus, devfn, lock);
+    }
+
+    return rc;
+}
+
 static void queue_invalidate_iec(struct iommu *iommu, u8 granu, u8 im, u16 iidx)
 {
     unsigned long flags;
@@ -350,9 +413,19 @@ static int flush_iotlb_qi(
         queue_invalidate_iotlb(iommu,
                                type >> DMA_TLB_FLUSH_GRANU_OFFSET, dr,
                                dw, did, size_order, 0, addr);
-        if ( flush_dev_iotlb )
-            ret = dev_invalidate_iotlb(iommu, did, addr, size_order, type);
+
+        /*
+         * Before Device-TLB invalidation we need to synchronize
+         * invalidation completions with hardware.
+         */
         rc = invalidate_sync(iommu);
+        if ( rc )
+             return rc;
+
+        if ( flush_dev_iotlb )
+            ret = dev_invalidate_iotlb(iommu, did, addr, size_order,
+                                       type, lock);
+
         if ( !ret )
             ret = rc;
     }
diff --git a/xen/drivers/passthrough/vtd/x86/ats.c b/xen/drivers/passthrough/vtd/x86/ats.c
index 7c797f6..dcc3394 100644
--- a/xen/drivers/passthrough/vtd/x86/ats.c
+++ b/xen/drivers/passthrough/vtd/x86/ats.c
@@ -106,7 +106,7 @@ out:
 }
 
 int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
-    u64 addr, unsigned int size_order, u64 type)
+    u64 addr, unsigned int size_order, u64 type, unsigned int lock)
 {
     struct pci_ats_dev *pdev;
     int ret = 0;
@@ -162,6 +162,18 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
             return -EOPNOTSUPP;
         }
 
+        /*
+         * Synchronize with hardware for Device-TLB invalidate
+         * descriptor.
+         */
+        ret = dev_invalidate_iotlb_sync(iommu, did, pdev->seg,
+                                        pdev->bus, pdev->devfn, lock);
+        if ( ret )
+            printk(XENLOG_ERR
+                   "Flush error %d on device %04x:%02x:%02x.%02x \n",
+                   ret, pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
+                   PCI_FUNC(pdev->devfn));
+
         if ( !ret )
             ret = rc;
     }
-- 
1.9.1


* Re: [PATCH v5 1/7] VT-d: Check VT-d Device-TLB flush error(IOMMU part).
  2016-02-05 10:18 ` [PATCH v5 1/7] VT-d: Check VT-d Device-TLB flush error(IOMMU part) Quan Xu
@ 2016-02-10 17:01   ` Jan Beulich
  2016-02-16 10:50     ` Xu, Quan
  2016-02-23 11:43     ` Xu, Quan
  0 siblings, 2 replies; 31+ messages in thread
From: Jan Beulich @ 2016-02-10 17:01 UTC (permalink / raw)
  To: Quan Xu
  Cc: kevin.tian, feng.wu, george.dunlap, andrew.cooper3, tim, xen-devel

>>> On 05.02.16 at 11:18, <quan.xu@intel.com> wrote:
> This patch checks all kinds of error and all the way up
> the call trees of VT-d Device-TLB flush(IOMMU part).
> 
> Signed-off-by: Quan Xu <quan.xu@intel.com>
> ---
>  xen/drivers/passthrough/amd/iommu_init.c      |   4 +-
>  xen/drivers/passthrough/amd/pci_amd_iommu.c   |   4 +-
>  xen/drivers/passthrough/arm/smmu.c            |  13 +--
>  xen/drivers/passthrough/iommu.c               |  37 +++++---
>  xen/drivers/passthrough/vtd/extern.h          |   4 +-
>  xen/drivers/passthrough/vtd/iommu.c           | 125 ++++++++++++++++----------
>  xen/drivers/passthrough/vtd/qinval.c          |   2 +-
>  xen/drivers/passthrough/vtd/quirks.c          |  26 +++---
>  xen/drivers/passthrough/vtd/x86/vtd.c         |  17 +++-
>  xen/drivers/passthrough/x86/iommu.c           |   6 +-
>  xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |   4 +-
>  xen/include/asm-x86/iommu.h                   |   2 +-
>  xen/include/xen/iommu.h                       |  20 ++---
>  13 files changed, 166 insertions(+), 98 deletions(-)

Please be sure to Cc all maintainers of code you modify.

> @@ -2737,8 +2739,9 @@ static int arm_smmu_iommu_domain_init(struct domain *d)
>  	return 0;
>  }
>  
> -static void __hwdom_init arm_smmu_iommu_hwdom_init(struct domain *d)
> +static int __hwdom_init arm_smmu_iommu_hwdom_init(struct domain *d)
>  {
> +        return 0;
>  }

Bad indentation.

> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -146,14 +146,15 @@ static void __hwdom_init check_hwdom_reqs(struct domain *d)
>      iommu_dom0_strict = 1;
>  }
>  
> -void __hwdom_init iommu_hwdom_init(struct domain *d)
> +int __hwdom_init iommu_hwdom_init(struct domain *d)
>  {
>      struct hvm_iommu *hd = domain_hvm_iommu(d);
> +    int rc = 0;

Pointless initializer (more elsewhere).

> @@ -171,7 +172,10 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
>                   ((page->u.inuse.type_info & PGT_type_mask)
>                    == PGT_writable_page) )
>                  mapping |= IOMMUF_writable;
> -            hd->platform_ops->map_page(d, gfn, mfn, mapping);
> +            rc = hd->platform_ops->map_page(d, gfn, mfn, mapping);
> +            if ( rc )
> +                return rc;

So in this patch I can't find error checking getting introduced for
the caller of this function. And if you were to add it - what would
your intentions be? Panic? IOW I'm not sure bailing here is a good
idea. Logging an error message (but only once in this loop) would
be a possibility. (This pattern repeats further down.)
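[Editorial note: "logging an error message (but only once in this loop)" is commonly done by remembering the first error in a flag while the loop continues. A sketch under that assumption; `map_all()` and `fail_at_two()` are hypothetical names, not Xen functions.]

```c
#include <assert.h>
#include <stdio.h>

/*
 * Map 'n' pages via map_one(); on the first failure print a single
 * message, keep mapping the rest, and return the first error code.
 */
static int map_all(int n, int (*map_one)(int))
{
    int i, err = 0;

    for ( i = 0; i < n; i++ )
    {
        int rc = map_one(i);

        if ( rc && !err )
        {
            printf("mapping failed at page %d: %d\n", i, rc);
            err = rc;               /* remember the first error only */
        }
    }

    return err;
}

/* Test helper: fails only for page index 2. */
static int fail_at_two(int i)
{
    return i == 2 ? -1 : 0;
}
```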

> @@ -369,12 +376,16 @@ void iommu_share_p2m_table(struct domain* d)
>          ops->share_p2m(d);
>  }
>  
> -void iommu_crash_shutdown(void)
> +int iommu_crash_shutdown(void)
>  {
>      const struct iommu_ops *ops = iommu_get_ops();
> +
>      if ( iommu_enabled )
> -        ops->crash_shutdown();
> +        return ops->crash_shutdown();
> +
>      iommu_enabled = iommu_intremap = 0;
> +
> +    return 0;
>  }

Here again the question is - what is the error value going to be used
for? We're trying to shut down a crashed system when coming here.

> @@ -584,37 +587,37 @@ static void __intel_iommu_iotlb_flush(struct domain *d, unsigned long gfn,
>              continue;
>  
>          if ( page_count > 1 || gfn == -1 )
> -        {
> -            if ( iommu_flush_iotlb_dsi(iommu, iommu_domid,
> -                        0, flush_dev_iotlb) )
> -                iommu_flush_write_buffer(iommu);
> -        }
> +            rc = iommu_flush_iotlb_dsi(iommu, iommu_domid,
> +                        0, flush_dev_iotlb);
>          else
> -        {
> -            if ( iommu_flush_iotlb_psi(iommu, iommu_domid,
> +            rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
>                          (paddr_t)gfn << PAGE_SHIFT_4K, 0,
> -                        !dma_old_pte_present, flush_dev_iotlb) )
> -                iommu_flush_write_buffer(iommu);
> -        }
> +                        !dma_old_pte_present, flush_dev_iotlb);
> +
> +        if ( rc )
> +            iommu_flush_write_buffer(iommu);
>      }
> +
> +    return rc;
>  }

rc may be positive here (and afaict that's the indication to do the
flush, not one of failure). Quite likely this code didn't mean to flush
when iommu_flush_iotlb_{d,p}si() returns a negative value...


> @@ -1742,7 +1757,9 @@ static int intel_iommu_map_page(
>      unmap_vtd_domain_page(page);
>  
>      if ( !this_cpu(iommu_dont_flush_iotlb) )
> -        __intel_iommu_iotlb_flush(d, gfn, dma_pte_present(old), 1);
> +    {
> +        return __intel_iommu_iotlb_flush(d, gfn, dma_pte_present(old), 1);
> +    }

Pointless introduction of braces.

> --- a/xen/drivers/passthrough/x86/iommu.c
> +++ b/xen/drivers/passthrough/x86/iommu.c
> @@ -104,7 +104,11 @@ int arch_iommu_populate_page_table(struct domain *d)
>      this_cpu(iommu_dont_flush_iotlb) = 0;
>  
>      if ( !rc )
> -        iommu_iotlb_flush_all(d);
> +    {
> +        rc = iommu_iotlb_flush_all(d);
> +        if ( rc )
> +            return rc;
> +    }

But you leave all page tables set up?

>      else if ( rc != -ERESTART )
>          iommu_teardown(d);

I think you should let things reach this tear down in that case.

Jan


* Re: [PATCH v5 1/7] VT-d: Check VT-d Device-TLB flush error(IOMMU part).
  2016-02-10 17:01   ` Jan Beulich
@ 2016-02-16 10:50     ` Xu, Quan
  2016-02-16 11:06       ` Jan Beulich
  2016-02-23 11:43     ` Xu, Quan
  1 sibling, 1 reply; 31+ messages in thread
From: Xu, Quan @ 2016-02-16 10:50 UTC (permalink / raw)
  To: Jan Beulich, Tian, Kevin
  Cc: Wu, Feng, george.dunlap, andrew.cooper3, Dario Faggioli, tim,
	xen-devel, Nakajima, Jun, keir

> On February 11, 2016 at 1:01am, <JBeulich@suse.com> wrote:
> >>> On 05.02.16 at 11:18, <quan.xu@intel.com> wrote:
> > This patch checks all kinds of error and all the way up the call trees
> > of VT-d Device-TLB flush(IOMMU part).

> Please be sure to Cc all maintainers of code you modify.
> 
OK, I have also CCed Dario Faggioli.
Jan, could I resend v5 and CC all maintainers of the code I modify?


> > @@ -2737,8 +2739,9 @@ static int arm_smmu_iommu_domain_init(struct
> domain *d)
> >  	return 0;
> >  }
> >
> > -static void __hwdom_init arm_smmu_iommu_hwdom_init(struct domain *d)
> > +static int __hwdom_init arm_smmu_iommu_hwdom_init(struct domain *d)
> >  {
> > +        return 0;
> >  }
> 
> Bad indentation.
> 

I will modify it in the next version. I tried to align with the coding style, as
there was similar indentation near arm_smmu_iommu_hwdom_init() in that file.

> > --- a/xen/drivers/passthrough/iommu.c
> > +++ b/xen/drivers/passthrough/iommu.c
> > @@ -146,14 +146,15 @@ static void __hwdom_init
> check_hwdom_reqs(struct domain *d)
> >      iommu_dom0_strict = 1;
> >  }
> >
> > -void __hwdom_init iommu_hwdom_init(struct domain *d)
> > +int __hwdom_init iommu_hwdom_init(struct domain *d)
> >  {
> >      struct hvm_iommu *hd = domain_hvm_iommu(d);
> > +    int rc = 0;
> 
> Pointless initializer (more elsewhere).
> 
I will modify them in next version.

> > @@ -171,7 +172,10 @@ void __hwdom_init iommu_hwdom_init(struct
> domain *d)
> >                   ((page->u.inuse.type_info & PGT_type_mask)
> >                    == PGT_writable_page) )
> >                  mapping |= IOMMUF_writable;
> > -            hd->platform_ops->map_page(d, gfn, mfn, mapping);
> > +            rc = hd->platform_ops->map_page(d, gfn, mfn, mapping);
> > +            if ( rc )
> > +                return rc;
> 
> So in this patch I can't find error checking getting introduced for the caller of
> this function. And if you were to add it - what would your intentions be? Panic?
> IOW I'm not sure bailing here is a good idea. Logging an error message (but only
> once in this loop) would be a possibility. (This pattern repeats further down.)
> 

Could I log an error message and break in this loop?


> > @@ -369,12 +376,16 @@ void iommu_share_p2m_table(struct domain* d)
> >          ops->share_p2m(d);
> >  }
> >
> > -void iommu_crash_shutdown(void)
> > +int iommu_crash_shutdown(void)
> >  {
> >      const struct iommu_ops *ops = iommu_get_ops();
> > +
> >      if ( iommu_enabled )
> > -        ops->crash_shutdown();
> > +        return ops->crash_shutdown();
> > +
> >      iommu_enabled = iommu_intremap = 0;
> > +
> > +    return 0;
> >  }
> 
> Here again the question is - what is the error value going to be used for? We're
> trying to shut down a crashed system when coming here.
> 
 I tried to clean up the error-handling paths all the way up the call chain. The caller logs an error
 message when iommu_crash_shutdown() returns a non-zero value [in patch 2/7].


> > @@ -584,37 +587,37 @@ static void __intel_iommu_iotlb_flush(struct
> domain *d, unsigned long gfn,
> >              continue;
> >
> >          if ( page_count > 1 || gfn == -1 )
> > -        {
> > -            if ( iommu_flush_iotlb_dsi(iommu, iommu_domid,
> > -                        0, flush_dev_iotlb) )
> > -                iommu_flush_write_buffer(iommu);
> > -        }
> > +            rc = iommu_flush_iotlb_dsi(iommu, iommu_domid,
> > +                        0, flush_dev_iotlb);
> >          else
> > -        {
> > -            if ( iommu_flush_iotlb_psi(iommu, iommu_domid,
> > +            rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
> >                          (paddr_t)gfn << PAGE_SHIFT_4K, 0,
> > -                        !dma_old_pte_present, flush_dev_iotlb) )
> > -                iommu_flush_write_buffer(iommu);
> > -        }
> > +                        !dma_old_pte_present, flush_dev_iotlb);
> > +
> > +        if ( rc )
> > +            iommu_flush_write_buffer(iommu);
> >      }
> > +
> > +    return rc;
> >  }
> 
> rc may be positive here (and afaict that's the indication to do the flush, not one
> of failure). Quite likely this code didn't mean to flush
> iommu_flush_iotlb_{d,p}si() returning a negative value...
> 
> 
IIUC, you refer to the following:
 flush_iotlb_qi(
{
...
          if ( !cap_caching_mode(iommu->cap) )
            return 1;
...
}


As Kevin mentioned, originally 0/1 is used to indicate whether the caller needs to flush the cache.
But I try to return ZERO in [PATCH v5 1/7], because:
1) if it returns _1_, device passthrough doesn't work when I try to clean up the error-handling
  paths chained up;
2) as the VT-d spec says, when Caching Mode is 0, invalidations are not required for modifications
  to individual not-present or invalid entries.

This point is tricky. Correct me if I am wrong.


> > @@ -1742,7 +1757,9 @@ static int intel_iommu_map_page(
> >      unmap_vtd_domain_page(page);
> >
> >      if ( !this_cpu(iommu_dont_flush_iotlb) )
> > -        __intel_iommu_iotlb_flush(d, gfn, dma_pte_present(old), 1);
> > +    {
> > +        return __intel_iommu_iotlb_flush(d, gfn, dma_pte_present(old),
> 1);
> > +    }
> 
> Pointless introduction of braces.
> 

I will modify it in next version.

> > --- a/xen/drivers/passthrough/x86/iommu.c
> > +++ b/xen/drivers/passthrough/x86/iommu.c
> > @@ -104,7 +104,11 @@ int arch_iommu_populate_page_table(struct
> domain *d)
> >      this_cpu(iommu_dont_flush_iotlb) = 0;
> >
> >      if ( !rc )
> > -        iommu_iotlb_flush_all(d);
> > +    {
> > +        rc = iommu_iotlb_flush_all(d);
> > +        if ( rc )
> > +            return rc;
> > +    }
> 
> But you leave all page tables set up?
> 
> >      else if ( rc != -ERESTART )
> >          iommu_teardown(d);
> 
> I think you should let things reach this tear down in that case.
> 

Agreed.


Jan, thanks for your comments.
BTW, just to check: did you only review patch v5 1/7?
Hope I didn't miss your other comments.

Quan


* Re: [PATCH v5 1/7] VT-d: Check VT-d Device-TLB flush error(IOMMU part).
  2016-02-16 10:50     ` Xu, Quan
@ 2016-02-16 11:06       ` Jan Beulich
  2016-02-17 12:49         ` Xu, Quan
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2016-02-16 11:06 UTC (permalink / raw)
  To: Quan Xu
  Cc: Kevin Tian, Feng Wu, george.dunlap, andrew.cooper3,
	Dario Faggioli, tim, xen-devel, Jun Nakajima, keir

>>> On 16.02.16 at 11:50, <quan.xu@intel.com> wrote:
>>  On February 11, 2016 at 1:01am, <JBeulich@suse.com> wrote:
>> >>> On 05.02.16 at 11:18, <quan.xu@intel.com> wrote:
>> > This patch checks all kinds of error and all the way up the call trees
>> > of VT-d Device-TLB flush(IOMMU part).
> 
>> Please be sure to Cc all maintainers of code you modify.
>> 
> OK, also CCed Dario Faggioli.
> Jan, could I re-send out the V5 and cc all maintainers of code I modify?

I'd say rather make sure you Cc the right people on v6, to avoid
spamming everyone's mailboxes.

>> > @@ -2737,8 +2739,9 @@ static int arm_smmu_iommu_domain_init(struct
>> domain *d)
>> >  	return 0;
>> >  }
>> >
>> > -static void __hwdom_init arm_smmu_iommu_hwdom_init(struct domain *d)
>> > +static int __hwdom_init arm_smmu_iommu_hwdom_init(struct domain *d)
>> >  {
>> > +        return 0;
>> >  }
>> 
>> Bad indentation.
>> 
> 
> I will modify it in next version. I tried to align with the coding style, as 
> 
> There was similar indentation nearby arm_smmu_iommu_hwdom_init() in that 
> file.

Then I'm afraid you didn't check enough of the properties of the
file: It appears to use Linux style, and hence indentation is by
tabs instead of spaces.

>> > @@ -171,7 +172,10 @@ void __hwdom_init iommu_hwdom_init(struct
>> domain *d)
>> >                   ((page->u.inuse.type_info & PGT_type_mask)
>> >                    == PGT_writable_page) )
>> >                  mapping |= IOMMUF_writable;
>> > -            hd->platform_ops->map_page(d, gfn, mfn, mapping);
>> > +            rc = hd->platform_ops->map_page(d, gfn, mfn, mapping);
>> > +            if ( rc )
>> > +                return rc;
>> 
>> So in this patch I can't find error checking getting introduced for the 
> caller of
>> this function. And if you were to add it - what would your intentions be? 
> Panic?
>> IOW I'm not sure bailing here is a good idea. Logging an error message (but 
> only
>> once in this loop) would be a possibility. (This pattern repeats further 
> down.)
>> 
> 
> Could I log an error message and break in this loop?

Logging an error message - yes. Breaking the loop - no; this should
continue to be a best effort attempt at getting things set up for
Dom0.

>> > @@ -369,12 +376,16 @@ void iommu_share_p2m_table(struct domain* d)
>> >          ops->share_p2m(d);
>> >  }
>> >
>> > -void iommu_crash_shutdown(void)
>> > +int iommu_crash_shutdown(void)
>> >  {
>> >      const struct iommu_ops *ops = iommu_get_ops();
>> > +
>> >      if ( iommu_enabled )
>> > -        ops->crash_shutdown();
>> > +        return ops->crash_shutdown();
>> > +
>> >      iommu_enabled = iommu_intremap = 0;
>> > +
>> > +    return 0;
>> >  }
>> 
>> Here again the question is - what is the error value going to be used for? We're
>> trying to shut down a crashed system when coming here.
>> 
>  I tried to clean up in error handling path chained up. It logs an error 
> message,
>  When it calls iommu_crash_shutdown() and returns a non-zero value [in patch 
> 2/7].

That sounds okay then (I didn't get around to looking at patches 2-7
yet), but is somewhat contrary to my request of adding
__must_check as far as possible, which - if done here - would
break the build without also adjusting the caller(s).

>> > @@ -584,37 +587,37 @@ static void __intel_iommu_iotlb_flush(struct
>> domain *d, unsigned long gfn,
>> >              continue;
>> >
>> >          if ( page_count > 1 || gfn == -1 )
>> > -        {
>> > -            if ( iommu_flush_iotlb_dsi(iommu, iommu_domid,
>> > -                        0, flush_dev_iotlb) )
>> > -                iommu_flush_write_buffer(iommu);
>> > -        }
>> > +            rc = iommu_flush_iotlb_dsi(iommu, iommu_domid,
>> > +                        0, flush_dev_iotlb);
>> >          else
>> > -        {
>> > -            if ( iommu_flush_iotlb_psi(iommu, iommu_domid,
>> > +            rc = iommu_flush_iotlb_psi(iommu, iommu_domid,
>> >                          (paddr_t)gfn << PAGE_SHIFT_4K, 0,
>> > -                        !dma_old_pte_present, flush_dev_iotlb) )
>> > -                iommu_flush_write_buffer(iommu);
>> > -        }
>> > +                        !dma_old_pte_present, flush_dev_iotlb);
>> > +
>> > +        if ( rc )
>> > +            iommu_flush_write_buffer(iommu);
>> >      }
>> > +
>> > +    return rc;
>> >  }
>> 
>> rc may be positive here (and afaict that's the indication to do the flush, 
> not one
>> of failure). Quite likely this code didn't mean to flush
>> iommu_flush_iotlb_{d,p}si() returning a negative value...
>> 
>> 
> IIUC, you refer to the follow:
>  flush_iotlb_qi(
> {
> ...
>           if ( !cap_caching_mode(iommu->cap) )
>             return 1;
> ...
> }
> 
> 
> As Kevin mentioned, originally 0/1 is used to indicate whether caller needs 
> to flush cache.
> But I try to return ZERO in [PATCH v5 1/7]. Because:
> 1) if It returns _1_, device passthrough doesn't work when I tried to clean 
> up in error handling path chained up. 
> 2) as VT-d spec said, Caching Mode is 0: invalidations are not required for 
> modifications to individual not present
>   or invalid entries.
> 
> That is tricky for this point. Correct me if I am wrong.

I'm not sure I see the trickiness: All you need is
- call iommu_flush_write_buffer() only when rc > 0 (if I remember
  the polarity right, else it might be rc == 0)
- return zero from this function when rc is positive.

Jan


* Re: [PATCH v5 1/7] VT-d: Check VT-d Device-TLB flush error(IOMMU part).
  2016-02-16 11:06       ` Jan Beulich
@ 2016-02-17 12:49         ` Xu, Quan
  2016-02-17 13:05           ` Jan Beulich
  0 siblings, 1 reply; 31+ messages in thread
From: Xu, Quan @ 2016-02-17 12:49 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3, tim,
	xen-devel, Xu, Quan

> On February 16, 2016 7:06pm, <JBeulich@suse.com> wrote:
> >>> On 16.02.16 at 11:50, <quan.xu@intel.com> wrote:
> >>  On February 11, 2016 at 1:01am, <JBeulich@suse.com> wrote:
> >> >>> On 05.02.16 at 11:18, <quan.xu@intel.com> wrote:
 
> >> > @@ -369,12 +376,16 @@ void iommu_share_p2m_table(struct domain*
> d)
> >> >          ops->share_p2m(d);
> >> >  }
> >> >
> >> > -void iommu_crash_shutdown(void)
> >> > +int iommu_crash_shutdown(void)
> >> >  {
> >> >      const struct iommu_ops *ops = iommu_get_ops();
> >> > +
> >> >      if ( iommu_enabled )
> >> > -        ops->crash_shutdown();
> >> > +        return ops->crash_shutdown();
> >> > +
> >> >      iommu_enabled = iommu_intremap = 0;
> >> > +
> >> > +    return 0;
> >> >  }
> >>
> >> Here again the question is - what is the error value going to be used
> >> for? We're trying to shut down a crashed system when coming here.
> >>
> >  I tried to clean up in error handling path chained up. It logs an
> > error message,  When it calls iommu_crash_shutdown() and returns a
> > non-zero value [in patch 2/7].
> 
> That sounds okay than (I didn't get around to look at patches 2-7 yet), but is
> somewhat contrary to me request of adding __must_check as far as possible,
> which - if done here - would break the build without also adjusting the caller(s).
> 


If they are in the same patch set, I think it is acceptable.
BTW, with patch 1/7, I can build Xen successfully (make xen).
To align with this rule, I'd better merge patch 1/7 and patch 2/7 into one larger patch.

Quan


* Re: [PATCH v5 1/7] VT-d: Check VT-d Device-TLB flush error(IOMMU part).
  2016-02-17 12:49         ` Xu, Quan
@ 2016-02-17 13:05           ` Jan Beulich
  2016-02-17 13:38             ` Xu, Quan
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2016-02-17 13:05 UTC (permalink / raw)
  To: Quan Xu
  Cc: Kevin Tian, Feng Wu, george.dunlap, andrew.cooper3, tim, xen-devel

>>> On 17.02.16 at 13:49, <quan.xu@intel.com> wrote:
>>  On February 16, 2016 7:06pm, <JBeulich@suse.com> wrote:
>> >>> On 16.02.16 at 11:50, <quan.xu@intel.com> wrote:
>> >>  On February 11, 2016 at 1:01am, <JBeulich@suse.com> wrote:
>> >> >>> On 05.02.16 at 11:18, <quan.xu@intel.com> wrote:
>  
>> >> > @@ -369,12 +376,16 @@ void iommu_share_p2m_table(struct domain*
>> d)
>> >> >          ops->share_p2m(d);
>> >> >  }
>> >> >
>> >> > -void iommu_crash_shutdown(void)
>> >> > +int iommu_crash_shutdown(void)
>> >> >  {
>> >> >      const struct iommu_ops *ops = iommu_get_ops();
>> >> > +
>> >> >      if ( iommu_enabled )
>> >> > -        ops->crash_shutdown();
>> >> > +        return ops->crash_shutdown();
>> >> > +
>> >> >      iommu_enabled = iommu_intremap = 0;
>> >> > +
>> >> > +    return 0;
>> >> >  }
>> >>
>> >> Here again the question is - what is the error value going to be used
>> >> for? We're trying to shut down a crashed system when coming here.
>> >>
>> >  I tried to clean up in error handling path chained up. It logs an
>> > error message,  When it calls iommu_crash_shutdown() and returns a
>> > non-zero value [in patch 2/7].
>> 
>> That sounds okay than (I didn't get around to look at patches 2-7 yet), but 
> is
>> somewhat contrary to me request of adding __must_check as far as possible,
>> which - if done here - would break the build without also adjusting the 
> caller(s).
>> 
> 
> 
> If they are in the same patch set, I think it is acceptable. 

If unavoidable, yes. But please remember that patch series don't
necessarily get committed in one go.

> BTW, with patch 1/7, I can build Xen successfully( make xen ).
> To align this rule, I'd better merge patch1/7 and patch 2/7 into a large 
> patch.

Not having looked at patch 2 yet (I'm about to), I can't tell whether
this makes sense. I'd rather not see large patches become even
larger though - please make a reasonable attempt at splitting things
(in the case here for example by adjusting top level functions first,
working your way down to leaf ones, thus guaranteeing that newly
added __must_check annotations will be properly honored at each
patch boundary).

Jan


* Re: [PATCH v5 1/7] VT-d: Check VT-d Device-TLB flush error(IOMMU part).
  2016-02-17 13:05           ` Jan Beulich
@ 2016-02-17 13:38             ` Xu, Quan
  2016-02-18  6:15               ` Tian, Kevin
  0 siblings, 1 reply; 31+ messages in thread
From: Xu, Quan @ 2016-02-17 13:38 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3, tim, xen-devel

> On February 17, 2016 9:05pm, <JBeulich@suse.com> wrote:
> >>> On 17.02.16 at 13:49, <quan.xu@intel.com> wrote:
> >>  On February 16, 2016 7:06pm, <JBeulich@suse.com> wrote:
> >> >>> On 16.02.16 at 11:50, <quan.xu@intel.com> wrote:
> >> >>  On February 11, 2016 at 1:01am, <JBeulich@suse.com> wrote:
> >> >> >>> On 05.02.16 at 11:18, <quan.xu@intel.com> wrote:
> >
> >> >> > @@ -369,12 +376,16 @@ void iommu_share_p2m_table(struct
> domain*
> >> d)
> >> >> >          ops->share_p2m(d);
> >> >> >  }
> >> >> >
> >> >> > -void iommu_crash_shutdown(void)
> >> >> > +int iommu_crash_shutdown(void)
> >> >> >  {
> >> >> >      const struct iommu_ops *ops = iommu_get_ops();
> >> >> > +
> >> >> >      if ( iommu_enabled )
> >> >> > -        ops->crash_shutdown();
> >> >> > +        return ops->crash_shutdown();
> >> >> > +
> >> >> >      iommu_enabled = iommu_intremap = 0;
> >> >> > +
> >> >> > +    return 0;
> >> >> >  }
> >> >>
> >> >> Here again the question is - what is the error value going to be
> >> >> used for? We're trying to shut down a crashed system when coming here.
> >> >>
> >> >  I tried to clean up in error handling path chained up. It logs an
> >> > error message,  When it calls iommu_crash_shutdown() and returns a
> >> > non-zero value [in patch 2/7].
> >>
> >> That sounds okay than (I didn't get around to look at patches 2-7
> >> yet), but
> > is
> >> somewhat contrary to me request of adding __must_check as far as
> >> possible, which - if done here - would break the build without also
> >> adjusting the
> > caller(s).
> >>
> >
> >
> > If they are in the same patch set, I think it is acceptable.
> 
> If unavoidable, yes. But please remember that patch series don't necessarily get
> committed in one go.
> 
> > BTW, with patch 1/7, I can build Xen successfully( make xen ).
> > To align this rule, I'd better merge patch1/7 and patch 2/7 into a
> > large patch.
> 
> Not having looked at patch 2 yet
> (I'm about to), 

Good news. :):)
I think there may be some minor issues remaining in this patch set (my estimate), and I will address them immediately.


> I can't tell whether this makes
> sense. I'd rather not see large patches become even larger though - please
> make a reasonable attempt at splitting things (in the case here for example by
> adjusting top level functions first, working your way down to leaf ones, thus
> guaranteeing that newly added __must_check annotations will be properly
> honored at each patch boundary).
> 

Okay, I will try to do it in v6. Thanks.


Quan


* Re: [PATCH v5 2/7] VT-d: Check VT-d Device-TLB flush error(MMU part).
  2016-02-05 10:18 ` [PATCH v5 2/7] VT-d: Check VT-d Device-TLB flush error(MMU part) Quan Xu
@ 2016-02-17 14:02   ` Jan Beulich
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2016-02-17 14:02 UTC (permalink / raw)
  To: Quan Xu
  Cc: kevin.tian, feng.wu, george.dunlap, andrew.cooper3, tim, xen-devel

>>> On 05.02.16 at 11:18, <quan.xu@intel.com> wrote:
> --- a/xen/arch/x86/acpi/power.c
> +++ b/xen/arch/x86/acpi/power.c
> @@ -45,6 +45,8 @@ void do_suspend_lowlevel(void);
>  
>  static int device_power_down(void)
>  {
> +    int rc;
> +
>      console_suspend();
>  
>      time_suspend();
> @@ -53,7 +55,9 @@ static int device_power_down(void)
>  
>      ioapic_suspend();
>  
> -    iommu_suspend();
> +    rc = iommu_suspend();
> +    if ( rc )
> +        return rc;

I'm not going to look at the rest of this patch, seeing that the v4
comment[1] regarding the issue here was left unaddressed.

Jan

[1] http://lists.xenproject.org/archives/html/xen-devel/2016-01/msg01555.html


* Re: [PATCH v5 3/7] VT-d: Reduce spin timeout to 1ms, which can be boot-time changed.
  2016-02-05 10:18 ` [PATCH v5 3/7] VT-d: Reduce spin timeout to 1ms, which can be boot-time changed Quan Xu
@ 2016-02-17 14:06   ` Jan Beulich
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2016-02-17 14:06 UTC (permalink / raw)
  To: Quan Xu
  Cc: kevin.tian, feng.wu, george.dunlap, andrew.cooper3, tim, xen-devel

>>> On 05.02.16 at 11:18, <quan.xu@intel.com> wrote:
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -968,6 +968,13 @@ Use this to work around firmware issues providing correct RMRR entries. Rather
>  than only mapping RAM pages for IOMMU accesses for Dom0, with this option all
>  pages not marked as unusable in the E820 table will get a mapping 
> established.
>  
> +### vtd\_qi\_timeout (VT-d)
> +> `= <integer>`
> +
> +> Default: `1`
> +
> +>> Specify the timeout of the VT-d Queued Invalidation in ms.
> +
>  ### irq\_ratelimit
>  > `= <integer>`

Just like for patch 2: I'm sure it was mentioned to you before
that this file has its entries (at least mostly) alphabetically sorted.

> @@ -167,10 +172,12 @@ static int queue_invalidate_wait(struct iommu *iommu,
>          start_time = NOW();
>          while ( poll_slot != QINVAL_STAT_DONE )
>          {
> -            if ( NOW() > (start_time + DMAR_OPERATION_TIMEOUT) )
> +            if ( NOW() > (start_time + IOMMU_QI_TIMEOUT) )
>              {
>                  print_qi_regs(iommu);
> -                panic("queue invalidate wait descriptor was not executed");
> +                dprintk(XENLOG_WARNING VTDPREFIX,
> +                        "Queue invalidate wait descriptor was timeout.\n");
> +                return -ETIMEDOUT;
>              }
>              cpu_relax();
>          }

Without a __must_check annotation on the function I cannot see
how I should reasonably convince myself that all call sites now
handle such an error in one way or another.

Jan


* Re: [PATCH v5 4/7] VT-d: Refactor iommu_ops .iotlb_flush() and iotlb_flush_all()
  2016-02-05 10:18 ` [PATCH v5 4/7] VT-d: Refactor iommu_ops .iotlb_flush() and iotlb_flush_all() Quan Xu
@ 2016-02-17 14:19   ` Jan Beulich
  2016-02-18  8:50     ` Xu, Quan
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2016-02-17 14:19 UTC (permalink / raw)
  To: Quan Xu
  Cc: kevin.tian, feng.wu, george.dunlap, andrew.cooper3, tim, xen-devel

>>> On 05.02.16 at 11:18, <quan.xu@intel.com> wrote:

(Side note: The VT-d: prefix of the subject seems inadequate here.)

> to pass down a flag indicating whether the lock is being held.

"the lock" being which one?

> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1100,7 +1100,7 @@ tlbflush:
>      if ( flush )
>      {
>          flush_tlb_domain(d);
> -        iommu_iotlb_flush(d, sgfn, egfn - sgfn);
> +        iommu_iotlb_flush(d, sgfn, egfn - sgfn, NONE_LOCK);
>      }
>  
>  out:
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -631,9 +631,9 @@ static int xenmem_add_to_physmap(struct domain *d,
>      if ( need_iommu(d) )
>      {
>          this_cpu(iommu_dont_flush_iotlb) = 0;
> -        rc = iommu_iotlb_flush(d, xatp->idx - done, done);
> +        rc = iommu_iotlb_flush(d, xatp->idx - done, done, NONE_LOCK);
>          if ( !rc )
> -            rc = iommu_iotlb_flush(d, xatp->gpfn - done, done);
> +            rc = iommu_iotlb_flush(d, xatp->gpfn - done, done, NONE_LOCK);
>      }

Considering both of these changes, I think it would be better to
hide this implementation detail from callers outside of
xen/drivers/passthrough/, by making iommu_iotlb_flush() a
wrapper around the actual implementation. (This at once would
limit the amount of ack-s such a patch will require.)

> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -131,6 +131,13 @@ struct page_info;
>   * callback pair.
>   */
>  typedef int iommu_grdm_t(xen_pfn_t start, xen_ulong_t nr, u32 id, void *ctxt);
> +/*
> + * A flag indicates whether the lock is being held.
> + * NONE_LOCK - no lock is being held.
> + * PCIDEVS_LOCK - pcidevs_lock is being held.
> + */
> +#define NONE_LOCK 0
> +#define PCIDEVS_LOCK 1

If you really assume further lock holding might need to be
communicated here, this should be made more obviously a bit
mask (by e.g. using hex constants, not having a NON_LOCKS
constant, and not making the first part of the comment refer to
just one lock). If, however, the pcidevs_lock is all you care
about, I think code readability would benefit from making the
respective function parameters bool_t, and adding (besides the
already mentioned wrapper) another wrapper named e.g.
iommu_iotlb_flush_all_locked().

Jan


* Re: [PATCH v5 5/7] VT-d: Refactor iommu_ops .map_page() and unmap_page()
  2016-02-05 10:18 ` [PATCH v5 5/7] VT-d: Refactor iommu_ops .map_page() and unmap_page() Quan Xu
@ 2016-02-17 14:23   ` Jan Beulich
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2016-02-17 14:23 UTC (permalink / raw)
  To: Quan Xu
  Cc: kevin.tian, feng.wu, george.dunlap, andrew.cooper3, tim, xen-devel

>>> On 05.02.16 at 11:18, <quan.xu@intel.com> wrote:
> to pass down a flag indicating whether the lock is being held,
> and check the way up the call trees.

Same comments as on the previous patch; most of the changes
outside of xen/drivers/passthrough/ seem to be avoidable here.

Jan


* Re: [PATCH v5 7/7] VT-d: Fix vt-d Device-TLB flush timeout issue.
  2016-02-05 10:18 ` [PATCH v5 7/7] VT-d: Fix vt-d Device-TLB flush timeout issue Quan Xu
@ 2016-02-17 14:40   ` Jan Beulich
  2016-02-18  7:36     ` Xu, Quan
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2016-02-17 14:40 UTC (permalink / raw)
  To: Quan Xu
  Cc: kevin.tian, feng.wu, george.dunlap, andrew.cooper3, tim, xen-devel

>>> On 05.02.16 at 11:18, <quan.xu@intel.com> wrote:
> --- a/xen/drivers/passthrough/vtd/qinval.c
> +++ b/xen/drivers/passthrough/vtd/qinval.c
> @@ -229,6 +229,69 @@ int qinval_device_iotlb(struct iommu *iommu,
>      return 0;
>  }
>  
> +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did,
> +                                         u16 seg, u8 bus, u8 devfn,
> +                                         unsigned int lock)
> +{
> +    struct domain *d = NULL;
> +    struct pci_dev *pdev;
> +
> +    if ( test_bit(did, iommu->domid_bitmap) )
> +        d = rcu_lock_domain_by_id(iommu->domid_map[did]);
> +
> +    if ( d == NULL )
> +        return;
> +
> +    for_each_pdev(d, pdev)
> +    {
> +        if ( (pdev->seg == seg) &&
> +             (pdev->bus == bus) &&
> +             (pdev->devfn == devfn) )
> +        {
> +            ASSERT ( pdev->domain );

Stray blanks inside the parentheses.

> +            list_del(&pdev->domain_list);
> +            pdev->domain = NULL;
> +
> +            if ( !(lock & PCIDEVS_LOCK) )
> +                spin_lock(&pcidevs_lock);

This is too late - you may not iterate over pdev-s without holding
that lock, and the list_del() may not be done in that case either.
Plus this should be accompanied by an ASSERT() in the else case.

> +            if ( pci_hide_device(bus, devfn) )

But now I'm _really_ puzzled: You acquire the exact lock that
pci_hide_device() acquires. Hence, unless I've overlooked an earlier
change, I can't see this as other than an unconditional dead lock. Did
you test this code path at all?

> +            {
> +                printk(XENLOG_ERR
> +                       "IOMMU hide device %04x:%02x:%02x.%02x error.",
> +                       seg, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
> +            }

I hate to repeat myself: SBDF needs to be printed in canonical
ssss:bb:dd.f form, to be grep-able. Also, no full stops at the end
of log messages please. And (also mentioned before) if there are
error codes available, log them to aid diagnosis of problems.

Also - unnecessary curly braces.

> @@ -350,9 +413,19 @@ static int flush_iotlb_qi(
>          queue_invalidate_iotlb(iommu,
>                                 type >> DMA_TLB_FLUSH_GRANU_OFFSET, dr,
>                                 dw, did, size_order, 0, addr);
> -        if ( flush_dev_iotlb )
> -            ret = dev_invalidate_iotlb(iommu, did, addr, size_order, type);
> +
> +        /*
> +         * Before Device-TLB invalidation we need to synchronize
> +         * invalidation completions with hardware.
> +         */
>          rc = invalidate_sync(iommu);
> +        if ( rc )
> +             return rc;
> +
> +        if ( flush_dev_iotlb )
> +            ret = dev_invalidate_iotlb(iommu, did, addr, size_order,
> +                                       type, lock);
> +
>          if ( !ret )
>              ret = rc;

These two lines have now become pointless, and hence should
be deleted.

> @@ -162,6 +162,18 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
>              return -EOPNOTSUPP;
>          }
>  
> +        /*
> +         * Synchronize with hardware for Device-TLB invalidate
> +         * descriptor.
> +         */
> +        ret = dev_invalidate_iotlb_sync(iommu, did, pdev->seg,
> +                                        pdev->bus, pdev->devfn, lock);
> +        if ( ret )
> +            printk(XENLOG_ERR
> +                   "Flush error %d on device %04x:%02x:%02x.%02x \n",
> +                   ret, pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> +                   PCI_FUNC(pdev->devfn));
> +
>          if ( !ret )
>              ret = rc;

The two context lines here show that you're now clobbering "ret".

Jan


* Re: [PATCH v5 1/7] VT-d: Check VT-d Device-TLB flush error(IOMMU part).
  2016-02-17 13:38             ` Xu, Quan
@ 2016-02-18  6:15               ` Tian, Kevin
  2016-02-18  6:33                 ` Xu, Quan
  0 siblings, 1 reply; 31+ messages in thread
From: Tian, Kevin @ 2016-02-18  6:15 UTC (permalink / raw)
  To: Xu, Quan, Jan Beulich
  Cc: george.dunlap, andrew.cooper3, tim, Wu, Feng, xen-devel

> From: Xu, Quan
> Sent: Wednesday, February 17, 2016 9:38 PM
> >
> > > BTW, with patch 1/7, I can build Xen successfully( make xen ).
> > > To align this rule, I'd better merge patch1/7 and patch 2/7 into a
> > > large patch.
> >
> > Not having looked at patch 2 yet
> > (I'm about to),
> 
> Good news. :):)
> I think there are maybe some minor issues for this patch set ( _my_estimate_), and I will
> address these issues immediately.
> 

Hi, Quan, are you expecting people to continue looking at your v5,
or better to wait for v7? :-)

Thanks
Kevin


* Re: [PATCH v5 1/7] VT-d: Check VT-d Device-TLB flush error(IOMMU part).
  2016-02-18  6:15               ` Tian, Kevin
@ 2016-02-18  6:33                 ` Xu, Quan
  2016-02-18  6:37                   ` Tian, Kevin
  0 siblings, 1 reply; 31+ messages in thread
From: Xu, Quan @ 2016-02-18  6:33 UTC (permalink / raw)
  To: Tian, Kevin, Jan Beulich
  Cc: george.dunlap, andrew.cooper3, tim, Wu, Feng, xen-devel

> On February 18, 2016 2:16pm, <Tian, Kevin> wrote:
> > From: Xu, Quan
> > Sent: Wednesday, February 17, 2016 9:38 PM
> > >
> > > > BTW, with patch 1/7, I can build Xen successfully( make xen ).
> > > > To align this rule, I'd better merge patch1/7 and patch 2/7 into a
> > > > large patch.
> > >
> > > Not having looked at patch 2 yet
> > > (I'm about to),
> >
> > Good news. :):)
> > I think there are maybe some minor issues for this patch set (
> > _my_estimate_), and I will address these issues immediately.
> >
> 
> Hi, Quan, are you expecting people to continue looking at your v5, or better to
> wait for v7? :-)
> 

V7?
I think people had better wait for v6. I plan to send out v6 next week.
Jan gave some valuable comments on v5, which I will fix ASAP.
I apologize to everyone who had to go through the v5 patch set. I missed some v4
comments and didn't fully test v5, as I tried my best to send it out on the last
working day before the Chinese Spring Festival, after a long period of v4 discussion.


Quan

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 1/7] VT-d: Check VT-d Device-TLB flush error(IOMMU part).
  2016-02-18  6:33                 ` Xu, Quan
@ 2016-02-18  6:37                   ` Tian, Kevin
  2016-02-18  6:42                     ` Xu, Quan
  0 siblings, 1 reply; 31+ messages in thread
From: Tian, Kevin @ 2016-02-18  6:37 UTC (permalink / raw)
  To: Xu, Quan, Jan Beulich
  Cc: george.dunlap, andrew.cooper3, tim, Wu, Feng, xen-devel

> From: Xu, Quan
> Sent: Thursday, February 18, 2016 2:34 PM
> 
> > On February 18, 2016 2:16pm, <Tian, Kevin> wrote:
> > > From: Xu, Quan
> > > Sent: Wednesday, February 17, 2016 9:38 PM
> > > >
> > > > > BTW, with patch 1/7, I can build Xen successfully( make xen ).
> > > > > To align this rule, I'd better merge patch1/7 and patch 2/7 into a
> > > > > large patch.
> > > >
> > > > Not having looked at patch 2 yet
> > > > (I'm about to),
> > >
> > > Good news. :):)
> > > I think there are maybe some minor issues for this patch set (
> > > _my_estimate_), and I will address these issues immediately.
> > >
> >
> > Hi, Quan, are you expecting people to continue looking at your v5, or better to
> > wait for v7? :-)
> >
> 
> V7?

sorry. I meant v6.

> I think other people had better wait for v6; I plan to send out v6 next week.
> Jan gave some valuable comments on v5, which I will address ASAP.
> I apologize to everyone who had to go through the v5 patch set. I missed some v4
> comments and didn't fully test v5, as I tried my best to send it out on the last
> working day before Chinese Spring Festival, after a long period of v4 discussion.
> 

Thanks for your work.
Kevin

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 1/7] VT-d: Check VT-d Device-TLB flush error(IOMMU part).
  2016-02-18  6:37                   ` Tian, Kevin
@ 2016-02-18  6:42                     ` Xu, Quan
  0 siblings, 0 replies; 31+ messages in thread
From: Xu, Quan @ 2016-02-18  6:42 UTC (permalink / raw)
  To: Tian, Kevin, Jan Beulich
  Cc: george.dunlap, andrew.cooper3, tim, Wu, Feng, xen-devel

> On February 18, 2016 2:37pm, <Tian, Kevin> wrote:
> > From: Xu, Quan
> > Sent: Thursday, February 18, 2016 2:34 PM


> Thanks for your work.
> Kevin

I appreciate your help! :)
Quan

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 7/7] VT-d: Fix vt-d Device-TLB flush timeout issue.
  2016-02-17 14:40   ` Jan Beulich
@ 2016-02-18  7:36     ` Xu, Quan
  2016-02-18  8:35       ` Jan Beulich
  0 siblings, 1 reply; 31+ messages in thread
From: Xu, Quan @ 2016-02-18  7:36 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3, tim, xen-devel

> On February 17, 2016 10:41pm, <JBeulich@suse.com> wrote:
> >>> On 05.02.16 at 11:18, <quan.xu@intel.com> wrote:
> > --- a/xen/drivers/passthrough/vtd/qinval.c
> > +++ b/xen/drivers/passthrough/vtd/qinval.c
> > +            if ( pci_hide_device(bus, devfn) )
> 
> But now I'm _really_ puzzled: You acquire the exact lock that
> pci_hide_device() acquires. Hence, unless I've overlooked an earlier change, I
> can't see this as other than an unconditional dead lock. Did you test this code
> path at all?

Sorry, I didn't test this code path.
I did test the following:
   1) Create a domain with an ATS device.
   2) Attach / detach the ATS device.

I think I could add a variation of pci_hide_device() without
"spin_lock(&pcidevs_lock) / spin_unlock(&pcidevs_lock)" or "__init".

It is certain, though, that different lock states are possible for different call
trees when flushing an ATS device. I verified it as follows:
1. Print the pcidevs_lock status in flush_iotlb_qi():

flush_iotlb_qi()
{
...
+    printk("__ pcidevs_lock : %d *__\n", spin_is_locked(&pcidevs_lock));
...
}

2. Attach the ATS device:
      $ xl pci-attach TestDom 0000:81:00.0
  #The print is "(XEN) __ pcidevs_lock : 1 *__"

3. Adjust the domain's memory:
      $ xl mem-set TestDom 2047m
  #The print is "(XEN) __ pcidevs_lock : 0 *__"

Quan
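The two call trees verified above can be modeled by a minimal toy sketch (not Xen code; all names are illustrative) showing why the flush path cannot blindly take pcidevs_lock itself and instead needs the lock state passed down:

```c
#include <assert.h>

/* Toy model of the two call trees: pci-attach reaches the flush with
 * pcidevs_lock held, mem-set reaches it with the lock free. */

int pcidevs_lock_depth;                 /* stand-in for the real spinlock */

int flush_iotlb_qi_stub(void)
{
    /* report the observed lock state, like the printk experiment above */
    return pcidevs_lock_depth > 0;
}

int pci_attach_path(void)               /* models `xl pci-attach` */
{
    int observed;

    pcidevs_lock_depth++;               /* caller holds pcidevs_lock */
    observed = flush_iotlb_qi_stub();
    pcidevs_lock_depth--;
    return observed;
}

int mem_set_path(void)                  /* models `xl mem-set` */
{
    return flush_iotlb_qi_stub();       /* lock not held on this path */
}
```

If the flush tried to acquire the lock unconditionally, the pci-attach path would deadlock, which is exactly the problem Jan pointed out.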

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 7/7] VT-d: Fix vt-d Device-TLB flush timeout issue.
  2016-02-18  7:36     ` Xu, Quan
@ 2016-02-18  8:35       ` Jan Beulich
  2016-02-18  8:47         ` Xu, Quan
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2016-02-18  8:35 UTC (permalink / raw)
  To: Quan Xu
  Cc: Kevin Tian, Feng Wu, george.dunlap, andrew.cooper3, tim, xen-devel

>>> On 18.02.16 at 08:36, <quan.xu@intel.com> wrote:
>>  On February 17, 2016 10:41pm, <JBeulich@suse.com> wrote:
>> >>> On 05.02.16 at 11:18, <quan.xu@intel.com> wrote:
>> > --- a/xen/drivers/passthrough/vtd/qinval.c
>> > +++ b/xen/drivers/passthrough/vtd/qinval.c
>> > +            if ( pci_hide_device(bus, devfn) )
>> 
>> But now I'm _really_ puzzled: You acquire the exact lock that
>> pci_hide_device() acquires. Hence, unless I've overlooked an earlier change, 
> I
>> can't see this as other than an unconditional dead lock. Did you test this 
> code
>> path at all?
> 
> Sorry, I didn't test this code path.
> I did test the following:
>    1) Create a domain with an ATS device.
>    2) Attach / detach the ATS device.
> 
> I think I could add a variation of pci_hide_device() without
> "spin_lock(&pcidevs_lock) / spin_unlock(&pcidevs_lock)" or "__init".

Which we already have - _pci_hide_device(). The function would
just need to be made non-static, but would otherwise fit your
purposes since you also don't need the alloc_pdev() the other
function does.

> But it is sure that different lock state is possible for different call 
> trees when to flush an ATS device.

Sure, that's why you pass down the flag. Presumably easiest
(albeit not nicest) will be to call the respective of the two functions
depending on the lock holding flag.

Jan
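A rough sketch of that dispatch (hypothetical names, not the actual Xen code) could look like:

```c
#include <assert.h>

/* Sketch: pick the locked or unlocked hide variant based on the flag
 * the caller passes down the flush path.  Names are made up. */

int lock_depth;                          /* models pcidevs_lock nesting */

int pci_hide_existing_device(void)       /* caller already holds the lock */
{
    /* re-acquiring here would deadlock; require the lock to be held */
    return (lock_depth == 1) ? 0 : -1;
}

int pci_hide_device_stub(void)           /* takes the lock itself */
{
    int rc;

    lock_depth++;                        /* spin_lock(&pcidevs_lock) */
    rc = pci_hide_existing_device();
    lock_depth--;                        /* spin_unlock(&pcidevs_lock) */
    return rc;
}

int hide_on_flush_timeout(int pcidevs_locked)
{
    /* dispatch on the caller's lock state, as suggested above */
    return pcidevs_locked ? pci_hide_existing_device()
                          : pci_hide_device_stub();
}
```

Either way the hide operation runs exactly once under the lock, without the flush path having to guess whether its caller already holds it.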

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 7/7] VT-d: Fix vt-d Device-TLB flush timeout issue.
  2016-02-18  8:35       ` Jan Beulich
@ 2016-02-18  8:47         ` Xu, Quan
  2016-02-18  9:21           ` Jan Beulich
  0 siblings, 1 reply; 31+ messages in thread
From: Xu, Quan @ 2016-02-18  8:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3, tim, xen-devel

> On February 18, 2016 4:36pm, <JBeulich@suse.com> wrote:
> >>> On 18.02.16 at 08:36, <quan.xu@intel.com> wrote:
> >>  On February 17, 2016 10:41pm, <JBeulich@suse.com> wrote:
> >> >>> On 05.02.16 at 11:18, <quan.xu@intel.com> wrote:
> >> > --- a/xen/drivers/passthrough/vtd/qinval.c
> >> > +++ b/xen/drivers/passthrough/vtd/qinval.c
> >> > +            if ( pci_hide_device(bus, devfn) )
> >>
> >> But now I'm _really_ puzzled: You acquire the exact lock that
> >> pci_hide_device() acquires. Hence, unless I've overlooked an earlier
> >> change,
> > I
> >> can't see this as other than an unconditional dead lock. Did you test
> >> this
> > code
> >> path at all?
> >
> > Sorry, I didn't test this code path.
> > I did test the follows:
> >    1) Create domain with ATS device.
> >    2) Attach / Detach ATS device.
> >
> > I think I could add a variation of pci_hide_device(), without
> > "spin_lock(&pcidevs_lock) / spin_unlock(&pcidevs_lock)"
> > Or "__init".
> 
> Which we already have - _pci_hide_device(). The function would just need to be
> made non-static, but would otherwise fit your purposes since you also don't
> need the alloc_pdev() the other function does.
> 

But _pci_hide_device() starts with '_', so I don't think it is a good idea to make it non-static.
I'd better introduce a new function wrapping _pci_hide_device(), though it is difficult to name the new function.
 

> > But it is sure that different lock state is possible for different
> > call trees when to flush an ATS device.
> 
> Sure, that's why you pass down the flag. Presumably easiest (albeit not nicest)
> will be to call the respective of the two functions depending on the lock holding
> flag.
> 

That is really a good idea. Thank you for your advice.

Quan

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 4/7] VT-d: Refactor iommu_ops .iotlb_flush() and iotlb_flush_all()
  2016-02-17 14:19   ` Jan Beulich
@ 2016-02-18  8:50     ` Xu, Quan
  0 siblings, 0 replies; 31+ messages in thread
From: Xu, Quan @ 2016-02-18  8:50 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3, tim, xen-devel

> On February 17, 2016 10:20pm, <JBeulich@suse.com> wrote:
> >>> On 05.02.16 at 11:18, <quan.xu@intel.com> wrote:
> 
> (Side note: The VT-d: prefix of the subject seems inadequate here.)
> 
> > to pass down a flag indicating whether the lock is being held.
> 
> "the lock" being which one?

"pcidevs_lock".


> > --- a/xen/arch/arm/p2m.c
> > +++ b/xen/arch/arm/p2m.c
> > @@ -1100,7 +1100,7 @@ tlbflush:
> >      if ( flush )
> >      {
> >          flush_tlb_domain(d);
> > -        iommu_iotlb_flush(d, sgfn, egfn - sgfn);
> > +        iommu_iotlb_flush(d, sgfn, egfn - sgfn, NONE_LOCK);
> >      }
> >
> >  out:
> > --- a/xen/common/memory.c
> > +++ b/xen/common/memory.c
> > @@ -631,9 +631,9 @@ static int xenmem_add_to_physmap(struct domain
> *d,
> >      if ( need_iommu(d) )
> >      {
> >          this_cpu(iommu_dont_flush_iotlb) = 0;
> > -        rc = iommu_iotlb_flush(d, xatp->idx - done, done);
> > +        rc = iommu_iotlb_flush(d, xatp->idx - done, done, NONE_LOCK);
> >          if ( !rc )
> > -            rc = iommu_iotlb_flush(d, xatp->gpfn - done, done);
> > +            rc = iommu_iotlb_flush(d, xatp->gpfn - done, done,
> > + NONE_LOCK);
> >      }
> 
> Considering both of these changes, I think it would be better to hide this
> implementation detail from callers outside of xen/drivers/passthrough/, by
> making iommu_iotlb_flush() a wrapper around the actual implementation. (This
> at once would limit the amount of ack-s such a patch will require.)
> 

Good idea.

> > --- a/xen/include/xen/iommu.h
> > +++ b/xen/include/xen/iommu.h
> > @@ -131,6 +131,13 @@ struct page_info;
> >   * callback pair.
> >   */
> >  typedef int iommu_grdm_t(xen_pfn_t start, xen_ulong_t nr, u32 id,
> > void *ctxt);
> > +/*
> > + * A flag indicates whether the lock is being held.
> > + * NONE_LOCK - no lock is being held.
> > + * PCIDEVS_LOCK - pcidevs_lock is being held.
> > + */
> > +#define NONE_LOCK 0
> > +#define PCIDEVS_LOCK 1
> 
> If you really assume further lock holding might need to be communicated here,
> this should be made more obviously a bit mask (by e.g. using hex constants, not
> having a NON_LOCKS constant, and not making the first part of the comment
> refer to just one lock). If, however, the pcidevs_lock is all you care about, I think
> code readability would benefit from making the respective function parameters
> bool_t, and adding (besides the already mentioned wrapper) another wrapper
> named e.g.
> iommu_iotlb_flush_all_locked().
> 

Yes, the pcidevs_lock is all I care about. I will make the respective function
parameters bool_t and add the corresponding wrappers in v6, e.g.:

   iommu_iotlb_flush_all_locked() -- the pcidevs_lock is being held.
   iommu_iotlb_flush_all()        -- the pcidevs_lock is NOT being held.

   iommu_iotlb_flush_locked()
   iommu_iotlb_flush()
   etc.

Quan
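The wrapper pairs discussed above might look like the following sketch (illustrative C only; the signatures are simplified, not the real Xen prototypes):

```c
#include <assert.h>

typedef int bool_t;

/* The internal implementation takes the lock-held flag; callers outside
 * the passthrough code only ever see the two thin wrappers, so the
 * pcidevs_lock detail stays hidden, as Jan suggested. */

static int __iommu_iotlb_flush(unsigned long gfn, unsigned int page_count,
                               bool_t pcidevs_locked)
{
    (void)gfn;
    (void)page_count;
    return pcidevs_locked;               /* stand-in for the real flush */
}

int iommu_iotlb_flush(unsigned long gfn, unsigned int page_count)
{
    return __iommu_iotlb_flush(gfn, page_count, 0);   /* lock NOT held */
}

int iommu_iotlb_flush_locked(unsigned long gfn, unsigned int page_count)
{
    return __iommu_iotlb_flush(gfn, page_count, 1);   /* lock held */
}
```

This also limits the number of files (and thus ack-s) a later change to the flag would touch, since only passthrough code sees the extra parameter.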

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 7/7] VT-d: Fix vt-d Device-TLB flush timeout issue.
  2016-02-18  8:47         ` Xu, Quan
@ 2016-02-18  9:21           ` Jan Beulich
  2016-02-18 10:00             ` Xu, Quan
  0 siblings, 1 reply; 31+ messages in thread
From: Jan Beulich @ 2016-02-18  9:21 UTC (permalink / raw)
  To: Quan Xu
  Cc: Kevin Tian, Feng Wu, george.dunlap, andrew.cooper3, tim, xen-devel

>>> On 18.02.16 at 09:47, <quan.xu@intel.com> wrote:
>>  On February 18, 2016 4:36pm, <JBeulich@suse.com> wrote:
>> >>> On 18.02.16 at 08:36, <quan.xu@intel.com> wrote:
>> >>  On February 17, 2016 10:41pm, <JBeulich@suse.com> wrote:
>> >> >>> On 05.02.16 at 11:18, <quan.xu@intel.com> wrote:
>> >> > --- a/xen/drivers/passthrough/vtd/qinval.c
>> >> > +++ b/xen/drivers/passthrough/vtd/qinval.c
>> >> > +            if ( pci_hide_device(bus, devfn) )
>> >>
>> >> But now I'm _really_ puzzled: You acquire the exact lock that
>> >> pci_hide_device() acquires. Hence, unless I've overlooked an earlier
>> >> change,
>> > I
>> >> can't see this as other than an unconditional dead lock. Did you test
>> >> this
>> > code
>> >> path at all?
>> >
>> > Sorry, I didn't test this code path.
>> > I did test the follows:
>> >    1) Create domain with ATS device.
>> >    2) Attach / Detach ATS device.
>> >
>> > I think I could add a variation of pci_hide_device(), without
>> > "spin_lock(&pcidevs_lock) / spin_unlock(&pcidevs_lock)"
>> > Or "__init".
>> 
>> Which we already have - _pci_hide_device(). The function would just need to be
>> made non-static, but would otherwise fit your purposes since you also don't
>> need the alloc_pdev() the other function does.
> 
> But _pci_hide_device() starts with '_', so I don't think it is a good idea
> to make it non-static.
> I'd better introduce a new function wrapping _pci_hide_device(), though it
> is difficult to name the new function.

Why add a wrapper - just rename the function (I indeed appreciate
you not wanting to globalize a function whose name starts with an
underscore), e.g. to pci_hide_device_locked() or
pci_hide_existing_device().

Jan

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 7/7] VT-d: Fix vt-d Device-TLB flush timeout issue.
  2016-02-18  9:21           ` Jan Beulich
@ 2016-02-18 10:00             ` Xu, Quan
  0 siblings, 0 replies; 31+ messages in thread
From: Xu, Quan @ 2016-02-18 10:00 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3, tim, xen-devel

> On  February 18, 2016 5:21pm,  <JBeulich@suse.com> wrote:
> >>> On 18.02.16 at 09:47, <quan.xu@intel.com> wrote:
> >>  On February 18, 2016 4:36pm, <JBeulich@suse.com> wrote:
> >> >>> On 18.02.16 at 08:36, <quan.xu@intel.com> wrote:
> >> >>  On February 17, 2016 10:41pm, <JBeulich@suse.com> wrote:
> >> >> >>> On 05.02.16 at 11:18, <quan.xu@intel.com> wrote:
> >> >> > --- a/xen/drivers/passthrough/vtd/qinval.c
> >> >> > +++ b/xen/drivers/passthrough/vtd/qinval.c
> >> >> > +            if ( pci_hide_device(bus, devfn) )
> >> >>
> >> >> But now I'm _really_ puzzled: You acquire the exact lock that
> >> >> pci_hide_device() acquires. Hence, unless I've overlooked an
> >> >> earlier change,
> >> > I
> >> >> can't see this as other than an unconditional dead lock. Did you
> >> >> test this
> >> > code
> >> >> path at all?
> >> >
> >> > Sorry, I didn't test this code path.
> >> > I did test the follows:
> >> >    1) Create domain with ATS device.
> >> >    2) Attach / Detach ATS device.
> >> >
> >> > I think I could add a variation of pci_hide_device(), without
> >> > "spin_lock(&pcidevs_lock) / spin_unlock(&pcidevs_lock)"
> >> > Or "__init".
> >>
> >> Which we already have - _pci_hide_device(). The function would just
> >> need to be made non-static, but would otherwise fit your purposes
> >> since you also don't need the alloc_pdev() the other function does.
> >
> > But the _pci_hide_device() starts with '_', I think it is not a good
> > idea to make it non-static.
> > I'd better introduce a new function to wrap the '_pci_hide_device()'.
> > It is difficult to name the new function.
> 
> Why add a wrapper - just rename the function (I indeed appreciate you not
> wanting to globalize a function whose name starts with an underscore), e.g. to
> pci_hide_device_locked() or pci_hide_existing_device().
>

Okay, I will rename _pci_hide_device() to pci_hide_existing_device() in v6.
Quan

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 1/7] VT-d: Check VT-d Device-TLB flush error(IOMMU part).
  2016-02-10 17:01   ` Jan Beulich
  2016-02-16 10:50     ` Xu, Quan
@ 2016-02-23 11:43     ` Xu, Quan
  2016-02-23 15:53       ` Jan Beulich
  1 sibling, 1 reply; 31+ messages in thread
From: Xu, Quan @ 2016-02-23 11:43 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Wu, Feng, george.dunlap, andrew.cooper3, tim, xen-devel

> On February 11, 2016 1:01am, Jan Beulich <JBeulich@suse.com> wrote:
> >>> On 05.02.16 at 11:18, <quan.xu@intel.com> wrote:
> > This patch checks all kinds of error and all the way up the call trees
> > of VT-d Device-TLB flush(IOMMU part).
> > @@ -171,7 +172,10 @@ void __hwdom_init iommu_hwdom_init(struct
> domain *d)
> >                   ((page->u.inuse.type_info & PGT_type_mask)
> >                    == PGT_writable_page) )
> >                  mapping |= IOMMUF_writable;
> > -            hd->platform_ops->map_page(d, gfn, mfn, mapping);
> > +            rc = hd->platform_ops->map_page(d, gfn, mfn, mapping);
> > +            if ( rc )
> > +                return rc;
> 
> So in this patch I can't find error checking getting introduced for the caller of
> this function. And if you were to add it - what would your intentions be? Panic?
> IOW I'm not sure bailing here is a good idea. Logging an error message (but only
> once in this loop) would be a possibility. (This pattern repeats further down.)
> 
Having read the code and emails again and again,
I think I'd better _not_ return an error value. That would make the below modifications pointless:
1.
-    void (*hwdom_init)(struct domain *d);
+    int (*hwdom_init)(struct domain *d);

2. 
-void iommu_hwdom_init(struct domain *d);
+int iommu_hwdom_init(struct domain *d);


> > @@ -369,12 +376,16 @@ void iommu_share_p2m_table(struct domain* d)
> >          ops->share_p2m(d);
> >  }
> >
> > -void iommu_crash_shutdown(void)
> > +int iommu_crash_shutdown(void)
> >  {
> >      const struct iommu_ops *ops = iommu_get_ops();
> > +
> >      if ( iommu_enabled )
> > -        ops->crash_shutdown();
> > +        return ops->crash_shutdown();
> > +
> >      iommu_enabled = iommu_intremap = 0;
> > +
> > +    return 0;
> >  }
> 
> Here again the question is - what is the error value going to be used for? We're
> trying to shut down a crashed system when coming here.
> 

It is similar to the above case.
I think I'd better _not_ return an error value. Then the below modifications are pointless:
1.
-    void (*crash_shutdown)(void);
+    int (*crash_shutdown)(void);

2.
-void amd_iommu_crash_shutdown(void);
+int amd_iommu_crash_shutdown(void);

3. 
-void iommu_crash_shutdown(void);
+int iommu_crash_shutdown(void);


Right?

Quan

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v5 1/7] VT-d: Check VT-d Device-TLB flush error(IOMMU part).
  2016-02-23 11:43     ` Xu, Quan
@ 2016-02-23 15:53       ` Jan Beulich
  0 siblings, 0 replies; 31+ messages in thread
From: Jan Beulich @ 2016-02-23 15:53 UTC (permalink / raw)
  To: Quan Xu
  Cc: Kevin Tian, Feng Wu, george.dunlap, andrew.cooper3, tim, xen-devel

>>> On 23.02.16 at 12:43, <quan.xu@intel.com> wrote:
>>  On February 11, 2016 1:01am, Jan Beulich <JBeulich@suse.com> wrote:
>> >>> On 05.02.16 at 11:18, <quan.xu@intel.com> wrote:
>> > This patch checks all kinds of error and all the way up the call trees
>> > of VT-d Device-TLB flush(IOMMU part).
>> > @@ -171,7 +172,10 @@ void __hwdom_init iommu_hwdom_init(struct
>> domain *d)
>> >                   ((page->u.inuse.type_info & PGT_type_mask)
>> >                    == PGT_writable_page) )
>> >                  mapping |= IOMMUF_writable;
>> > -            hd->platform_ops->map_page(d, gfn, mfn, mapping);
>> > +            rc = hd->platform_ops->map_page(d, gfn, mfn, mapping);
>> > +            if ( rc )
>> > +                return rc;
>> 
>> So in this patch I can't find error checking getting introduced for the caller of
>> this function. And if you were to add it - what would your intentions be? Panic?
>> IOW I'm not sure bailing here is a good idea. Logging an error message (but only
>> once in this loop) would be a possibility. (This pattern repeats further down.)
>> 
> Having read the code and emails again and again,
> I think I'd better _not_ return an error value. That would make the below
> modifications pointless:
> 1.
> -    void (*hwdom_init)(struct domain *d);
> +    int (*hwdom_init)(struct domain *d);
> 
> 2. 
> -void iommu_hwdom_init(struct domain *d);
> +int iommu_hwdom_init(struct domain *d);

Agreed. Just log some message for failure in the case above, but
make sure that is (a) useful for debugging and (b) not one message
per page.
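
One way to keep the message useful while avoiding one line per page is sketched below (toy C, made-up names; not actual Xen code): warn on the first failure, keep going, and summarize at the end.

```c
#include <assert.h>
#include <stdio.h>

int failed_maps;                         /* total failing pages */

static int map_page_stub(unsigned long gfn)
{
    return (gfn % 3 == 0) ? -1 : 0;      /* stand-in: every 3rd gfn fails */
}

int hwdom_map_loop(unsigned long nr_pages)
{
    int rc = 0;
    unsigned long gfn;

    for ( gfn = 0; gfn < nr_pages; gfn++ )
    {
        int err = map_page_stub(gfn);

        if ( err )
        {
            failed_maps++;
            if ( !rc )                   /* warn only on the first failure */
                printf("IOMMU: mapping failed, first at gfn %lu\n", gfn);
            rc = err;                    /* remember overall status */
        }
    }
    if ( rc )                            /* one summary line, not one per page */
        printf("IOMMU: %d page mapping(s) failed in total\n", failed_maps);
    return rc;
}
```

Logging the first failing gfn plus a count keeps the output debuggable (point a) without flooding the console (point b).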

>> > @@ -369,12 +376,16 @@ void iommu_share_p2m_table(struct domain* d)
>> >          ops->share_p2m(d);
>> >  }
>> >
>> > -void iommu_crash_shutdown(void)
>> > +int iommu_crash_shutdown(void)
>> >  {
>> >      const struct iommu_ops *ops = iommu_get_ops();
>> > +
>> >      if ( iommu_enabled )
>> > -        ops->crash_shutdown();
>> > +        return ops->crash_shutdown();
>> > +
>> >      iommu_enabled = iommu_intremap = 0;
>> > +
>> > +    return 0;
>> >  }
>> 
>> Here again the question is - what is the error value going to be used for? We're
>> trying to shut down a crashed system when coming here.
>> 
> 
> It is similar to the above case.
> I think I'd better _not_ return an error value. Then the below modifications
> are pointless:
> 1.
> -    void (*crash_shutdown)(void);
> +    int (*crash_shutdown)(void);
> 
> 2.
> -void amd_iommu_crash_shutdown(void);
> +int amd_iommu_crash_shutdown(void);
> 
> 3. 
> -void iommu_crash_shutdown(void);
> +int iommu_crash_shutdown(void);

Indeed, same as above.

Jan

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2016-02-23 15:53 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-05 10:18 [PATCH v5 0/7] VT-d Device-TLB flush issue Quan Xu
2016-02-05 10:18 ` [PATCH v5 1/7] VT-d: Check VT-d Device-TLB flush error(IOMMU part) Quan Xu
2016-02-10 17:01   ` Jan Beulich
2016-02-16 10:50     ` Xu, Quan
2016-02-16 11:06       ` Jan Beulich
2016-02-17 12:49         ` Xu, Quan
2016-02-17 13:05           ` Jan Beulich
2016-02-17 13:38             ` Xu, Quan
2016-02-18  6:15               ` Tian, Kevin
2016-02-18  6:33                 ` Xu, Quan
2016-02-18  6:37                   ` Tian, Kevin
2016-02-18  6:42                     ` Xu, Quan
2016-02-23 11:43     ` Xu, Quan
2016-02-23 15:53       ` Jan Beulich
2016-02-05 10:18 ` [PATCH v5 2/7] VT-d: Check VT-d Device-TLB flush error(MMU part) Quan Xu
2016-02-17 14:02   ` Jan Beulich
2016-02-05 10:18 ` [PATCH v5 3/7] VT-d: Reduce spin timeout to 1ms, which can be boot-time changed Quan Xu
2016-02-17 14:06   ` Jan Beulich
2016-02-05 10:18 ` [PATCH v5 4/7] VT-d: Refactor iommu_ops .iotlb_flush() and iotlb_flush_all() Quan Xu
2016-02-17 14:19   ` Jan Beulich
2016-02-18  8:50     ` Xu, Quan
2016-02-05 10:18 ` [PATCH v5 5/7] VT-d: Refactor iommu_ops .map_page() and unmap_page() Quan Xu
2016-02-17 14:23   ` Jan Beulich
2016-02-05 10:18 ` [PATCH v5 6/7] VT-d: Refactor iommu_flush .iotlb() Quan Xu
2016-02-05 10:18 ` [PATCH v5 7/7] VT-d: Fix vt-d Device-TLB flush timeout issue Quan Xu
2016-02-17 14:40   ` Jan Beulich
2016-02-18  7:36     ` Xu, Quan
2016-02-18  8:35       ` Jan Beulich
2016-02-18  8:47         ` Xu, Quan
2016-02-18  9:21           ` Jan Beulich
2016-02-18 10:00             ` Xu, Quan
