xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [Patch v11 0/3] VT-d Device-TLB flush issue
@ 2016-06-01  9:05 Xu, Quan
  2016-06-01  9:05 ` [Patch v11 1/3] IOMMU: add a timeout parameter for device IOTLB invalidation Xu, Quan
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Xu, Quan @ 2016-06-01  9:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Quan Xu, kevin.tian, feng.wu, dario.faggioli, jbeulich

From: Quan Xu <quan.xu@intel.com>

This patches fix current timeout concern and also allow limited ATS support:

1. add a timeout parameter for device IOTLB invalidation

 The parameter 'iommu_dev_iotlb_timeout' specifies the timeout of the device
 IOTLB invalidation in milliseconds. By default, the timeout is 1ms, which can
 be boot-time changed.

2. Synchronize for Device-TLB flush one by one

 Today we do Device-TLB flush synchronization after issuing flush
 requests for all ATS devices belonging to a VM. Doing so however
 imposes a limitation, i.e. that we can not figure out which flush
 request is blocked in the flush queue list, based on VT-d spec.

 To prepare correct Device-TLB flush timeout handling in next patch,
 we change the behavior to synchronize for every Device-TLB flush
 request. So the Device-TLB flush interface is changed a little bit,
 by checking timeout within the function instead of outside of function.

 Accordingly we also do a similar change for flush interfaces of
 IOTLB/IEC/Context, i.e. moving synchronization into the function.
 Since there is no user of a non-synced interface, we just rename
 existing ones with _sync suffix.

3. Fix vt-d Device-TLB flush timeout issue

 If Device-TLB flush timed out, we hide the target ATS device
 immediately and crash the domain owning this ATS device. If
 impacted domain is hardware domain, just throw out a warning.

 By hiding the device, we make sure it can't be assigned to any
 domain any longer (see device_assigned).


**NOTE**
   This patch set should base on one prereq patch sets:
    a). Check VT-d Device-TLB flush error.

----
Not covered in this series:

    a) Eliminate the panic() in IOMMU_WAIT_OP, used only in VT-d register read/write.
       Further discussion is required on whether and how to improve it.
    b) Handle IOTLB/Context/IEC flush timeout.


Quan Xu (3):
  IOMMU: add a timeout parameter for device IOTLB invalidation
  vt-d: synchronize for Device-TLB flush one by one
  vt-d: fix vt-d Device-TLB flush timeout issue

 docs/misc/xen-command-line.markdown   |   9 ++
 xen/drivers/passthrough/iommu.c       |   3 +
 xen/drivers/passthrough/pci.c         |   6 +-
 xen/drivers/passthrough/vtd/extern.h  |   6 +-
 xen/drivers/passthrough/vtd/qinval.c  | 157 ++++++++++++++++++++++++++--------
 xen/drivers/passthrough/vtd/x86/ats.c |   7 +-
 xen/include/xen/iommu.h               |   2 +
 xen/include/xen/pci.h                 |   1 +
 8 files changed, 144 insertions(+), 47 deletions(-)

-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Patch v11 1/3] IOMMU: add a timeout parameter for device IOTLB invalidation
  2016-06-01  9:05 [Patch v11 0/3] VT-d Device-TLB flush issue Xu, Quan
@ 2016-06-01  9:05 ` Xu, Quan
  2016-06-02 10:24   ` Jan Beulich
  2016-06-01  9:05 ` [Patch v11 2/3] vt-d: synchronize for Device-TLB flush one by one Xu, Quan
  2016-06-01  9:05 ` [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue Xu, Quan
  2 siblings, 1 reply; 17+ messages in thread
From: Xu, Quan @ 2016-06-01  9:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Quan Xu, kevin.tian, feng.wu, dario.faggioli, jbeulich

From: Quan Xu <quan.xu@intel.com>

The parameter 'iommu_dev_iotlb_timeout' specifies the timeout of
the device IOTLB invalidation in milliseconds. By default, the
timeout is 1ms, which can be boot-time changed.

Add a __must_check annotation. The followup patch titled
'VT-d IOTLB/Context/IEC flush issue' addresses the __mustcheck.
That is the other callers of this routine (two or three levels up)
ignore the return code. This patch does not address this but the
other does.

v11: Change the timeout parameter from 'vtd_qi_timeout' to
    'iommu_dev_iotlb_timeout', which is not only for VT-d device
    IOTLB invalidation, but also for other IOMMU implementations.

Signed-off-by: Quan Xu <quan.xu@intel.com>
---
 docs/misc/xen-command-line.markdown  |  9 +++++++++
 xen/drivers/passthrough/iommu.c      |  3 +++
 xen/drivers/passthrough/vtd/qinval.c | 34 +++++++++++++++++++++++-----------
 xen/include/xen/iommu.h              |  2 ++
 4 files changed, 37 insertions(+), 11 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index b4bae11..34a0f9c 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -996,6 +996,15 @@ debug hypervisor only).
 
 >> Enable IOMMU debugging code (implies `verbose`).
 
+### iommu\_dev\_iotlb\_timeout
+> `= <integer>`
+
+> Default: `1`
+
+Specify the timeout of the device IOTLB invalidation in milliseconds.
+By default, the timeout is 1 ms. When you see error 'Queue invalidate
+wait descriptor timed out', try increasing this value.
+
 ### iommu\_inclusive\_mapping (VT-d)
 > `= <boolean>`
 
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 098b601..e12de69 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -24,6 +24,9 @@
 static void parse_iommu_param(char *s);
 static void iommu_dump_p2m_table(unsigned char key);
 
+unsigned int __read_mostly iommu_dev_iotlb_timeout = 1;
+integer_param("iommu_dev_iotlb_timeout", iommu_dev_iotlb_timeout);
+
 /*
  * The 'iommu' parameter enables the IOMMU.  Optional comma separated
  * value may contain:
diff --git a/xen/drivers/passthrough/vtd/qinval.c b/xen/drivers/passthrough/vtd/qinval.c
index aa7841a..1a37565 100644
--- a/xen/drivers/passthrough/vtd/qinval.c
+++ b/xen/drivers/passthrough/vtd/qinval.c
@@ -28,6 +28,8 @@
 #include "vtd.h"
 #include "extern.h"
 
+#define IOMMU_QI_TIMEOUT MILLISECS(1)
+
 static void print_qi_regs(struct iommu *iommu)
 {
     u64 val;
@@ -130,10 +132,10 @@ static void queue_invalidate_iotlb(struct iommu *iommu,
     spin_unlock_irqrestore(&iommu->register_lock, flags);
 }
 
-static int queue_invalidate_wait(struct iommu *iommu,
-    u8 iflag, u8 sw, u8 fn)
+static int __must_check queue_invalidate_wait(struct iommu *iommu,
+                                              u8 iflag, u8 sw, u8 fn,
+                                              bool_t flush_dev_iotlb)
 {
-    s_time_t start_time;
     volatile u32 poll_slot = QINVAL_STAT_INIT;
     unsigned int index;
     unsigned long flags;
@@ -163,14 +165,21 @@ static int queue_invalidate_wait(struct iommu *iommu,
     /* Now we don't support interrupt method */
     if ( sw )
     {
+        s_time_t timeout;
+
         /* In case all wait descriptor writes to same addr with same data */
-        start_time = NOW();
+        timeout = flush_dev_iotlb ?
+                  (NOW() + iommu_dev_iotlb_timeout * MILLISECS(1)) :
+                  (NOW() + IOMMU_QI_TIMEOUT);
+
         while ( poll_slot != QINVAL_STAT_DONE )
         {
-            if ( NOW() > (start_time + DMAR_OPERATION_TIMEOUT) )
+            if ( NOW() > timeout )
             {
                 print_qi_regs(iommu);
-                panic("queue invalidate wait descriptor was not executed");
+                printk(XENLOG_WARNING VTDPREFIX
+                       " Queue invalidate wait descriptor timed out\n");
+                return -ETIMEDOUT;
             }
             cpu_relax();
         }
@@ -180,12 +189,14 @@ static int queue_invalidate_wait(struct iommu *iommu,
     return -EOPNOTSUPP;
 }
 
-static int invalidate_sync(struct iommu *iommu)
+static int __must_check invalidate_sync(struct iommu *iommu,
+                                        bool_t flush_dev_iotlb)
 {
     struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
 
     if ( qi_ctrl->qinval_maddr )
-        return queue_invalidate_wait(iommu, 0, 1, 1);
+        return queue_invalidate_wait(iommu, 0, 1, 1, flush_dev_iotlb);
+
     return 0;
 }
 
@@ -254,7 +265,7 @@ static int __iommu_flush_iec(struct iommu *iommu, u8 granu, u8 im, u16 iidx)
     int ret;
 
     queue_invalidate_iec(iommu, granu, im, iidx);
-    ret = invalidate_sync(iommu);
+    ret = invalidate_sync(iommu, 0);
     /*
      * reading vt-d architecture register will ensure
      * draining happens in implementation independent way.
@@ -300,7 +311,7 @@ static int __must_check flush_context_qi(void *_iommu, u16 did,
     {
         queue_invalidate_context(iommu, did, sid, fm,
                                  type >> DMA_CCMD_INVL_GRANU_OFFSET);
-        ret = invalidate_sync(iommu);
+        ret = invalidate_sync(iommu, 0);
     }
     return ret;
 }
@@ -344,10 +355,11 @@ static int __must_check flush_iotlb_qi(void *_iommu, u16 did, u64 addr,
                                dw, did, size_order, 0, addr);
         if ( flush_dev_iotlb )
             ret = dev_invalidate_iotlb(iommu, did, addr, size_order, type);
-        rc = invalidate_sync(iommu);
+        rc = invalidate_sync(iommu, flush_dev_iotlb);
         if ( !ret )
             ret = rc;
     }
+
     return ret;
 }
 
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index e917031..9891bc5 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -35,6 +35,8 @@ extern bool_t iommu_hap_pt_share;
 extern bool_t iommu_debug;
 extern bool_t amd_iommu_perdev_intremap;
 
+extern unsigned int iommu_dev_iotlb_timeout;
+
 #define IOMMU_PAGE_SIZE(sz) (1UL << PAGE_SHIFT_##sz)
 #define IOMMU_PAGE_MASK(sz) (~(u64)0 << PAGE_SHIFT_##sz)
 #define IOMMU_PAGE_ALIGN(sz, addr)  (((addr) + ~PAGE_MASK_##sz) & PAGE_MASK_##sz)
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Patch v11 2/3] vt-d: synchronize for Device-TLB flush one by one
  2016-06-01  9:05 [Patch v11 0/3] VT-d Device-TLB flush issue Xu, Quan
  2016-06-01  9:05 ` [Patch v11 1/3] IOMMU: add a timeout parameter for device IOTLB invalidation Xu, Quan
@ 2016-06-01  9:05 ` Xu, Quan
  2016-06-02 10:49   ` Jan Beulich
  2016-06-01  9:05 ` [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue Xu, Quan
  2 siblings, 1 reply; 17+ messages in thread
From: Xu, Quan @ 2016-06-01  9:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Quan Xu, kevin.tian, feng.wu, dario.faggioli, jbeulich

From: Quan Xu <quan.xu@intel.com>

Today we do Device-TLB flush synchronization after issuing flush
requests for all ATS devices belonging to a VM. Doing so however
imposes a limitation, i.e. that we can not figure out which flush
request is blocked in the flush queue list, based on VT-d spec.

To prepare correct Device-TLB flush timeout handling in next patch,
we change the behavior to synchronize for every Device-TLB flush
request. So the Device-TLB flush interface is changed a little bit,
by checking timeout within the function instead of outside of function.

Accordingly we also do a similar change for flush interfaces of
IOTLB/IEC/Context, i.e. moving synchronization into the function.
Since there is no user of a non-synced interface, we just rename
existing ones with _sync suffix.

v11:
   1. Add __must_check annotation to invalidate_sync().
   2. Remove the redundant invalidate_sync() in flush_iotlb_qi().
   3. Refactor invalidate_sync() to indicate whether we need to flush
      device IOTLB or not (I'll change it back in next patch, as I'll
      add a specific function - dev_invalidate_sync() for device IOTLB
      invalidation).

Signed-off-by: Quan Xu <quan.xu@intel.com>
---
 xen/drivers/passthrough/vtd/extern.h  |  5 +--
 xen/drivers/passthrough/vtd/qinval.c  | 65 ++++++++++++++++++++---------------
 xen/drivers/passthrough/vtd/x86/ats.c |  8 ++---
 3 files changed, 45 insertions(+), 33 deletions(-)

diff --git a/xen/drivers/passthrough/vtd/extern.h b/xen/drivers/passthrough/vtd/extern.h
index 6772839..45357f2 100644
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -59,8 +59,9 @@ int ats_device(const struct pci_dev *, const struct acpi_drhd_unit *);
 int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
                          u64 addr, unsigned int size_order, u64 type);
 
-int qinval_device_iotlb(struct iommu *iommu,
-                        u32 max_invs_pend, u16 sid, u16 size, u64 addr);
+int __must_check qinval_device_iotlb_sync(struct iommu *iommu,
+                                          u32 max_invs_pend,
+                                          u16 sid, u16 size, u64 addr);
 
 unsigned int get_cache_line_size(void);
 void cacheline_flush(char *);
diff --git a/xen/drivers/passthrough/vtd/qinval.c b/xen/drivers/passthrough/vtd/qinval.c
index 1a37565..9116588 100644
--- a/xen/drivers/passthrough/vtd/qinval.c
+++ b/xen/drivers/passthrough/vtd/qinval.c
@@ -30,6 +30,9 @@
 
 #define IOMMU_QI_TIMEOUT MILLISECS(1)
 
+static int __must_check invalidate_sync(struct iommu *iommu,
+                                        bool_t flush_dev_iotlb);
+
 static void print_qi_regs(struct iommu *iommu)
 {
     u64 val;
@@ -69,8 +72,10 @@ static void qinval_update_qtail(struct iommu *iommu, unsigned int index)
     dmar_writeq(iommu->reg, DMAR_IQT_REG, (val << QINVAL_INDEX_SHIFT));
 }
 
-static void queue_invalidate_context(struct iommu *iommu,
-    u16 did, u16 source_id, u8 function_mask, u8 granu)
+static int __must_check queue_invalidate_context_sync(struct iommu *iommu,
+                                                      u16 did, u16 source_id,
+                                                      u8 function_mask,
+                                                      u8 granu)
 {
     unsigned long flags;
     unsigned int index;
@@ -97,10 +102,14 @@ static void queue_invalidate_context(struct iommu *iommu,
     spin_unlock_irqrestore(&iommu->register_lock, flags);
 
     unmap_vtd_domain_page(qinval_entries);
+
+    return invalidate_sync(iommu, 0);
 }
 
-static void queue_invalidate_iotlb(struct iommu *iommu,
-    u8 granu, u8 dr, u8 dw, u16 did, u8 am, u8 ih, u64 addr)
+static int __must_check queue_invalidate_iotlb_sync(struct iommu *iommu,
+                                                    u8 granu, u8 dr, u8 dw,
+                                                    u16 did, u8 am, u8 ih,
+                                                    u64 addr)
 {
     unsigned long flags;
     unsigned int index;
@@ -130,6 +139,8 @@ static void queue_invalidate_iotlb(struct iommu *iommu,
     unmap_vtd_domain_page(qinval_entries);
     qinval_update_qtail(iommu, index);
     spin_unlock_irqrestore(&iommu->register_lock, flags);
+
+    return invalidate_sync(iommu, 0);
 }
 
 static int __must_check queue_invalidate_wait(struct iommu *iommu,
@@ -200,8 +211,9 @@ static int __must_check invalidate_sync(struct iommu *iommu,
     return 0;
 }
 
-int qinval_device_iotlb(struct iommu *iommu,
-    u32 max_invs_pend, u16 sid, u16 size, u64 addr)
+int qinval_device_iotlb_sync(struct iommu *iommu,
+                             u32 max_invs_pend,
+                             u16 sid, u16 size, u64 addr)
 {
     unsigned long flags;
     unsigned int index;
@@ -230,15 +242,17 @@ int qinval_device_iotlb(struct iommu *iommu,
     qinval_update_qtail(iommu, index);
     spin_unlock_irqrestore(&iommu->register_lock, flags);
 
-    return 0;
+    return invalidate_sync(iommu, 1);
 }
 
-static void queue_invalidate_iec(struct iommu *iommu, u8 granu, u8 im, u16 iidx)
+static int __must_check queue_invalidate_iec_sync(struct iommu *iommu,
+                                                  u8 granu, u8 im, u16 iidx)
 {
     unsigned long flags;
     unsigned int index;
     u64 entry_base;
     struct qinval_entry *qinval_entry, *qinval_entries;
+    int ret;
 
     spin_lock_irqsave(&iommu->register_lock, flags);
     index = qinval_next_index(iommu);
@@ -258,14 +272,9 @@ static void queue_invalidate_iec(struct iommu *iommu, u8 granu, u8 im, u16 iidx)
     unmap_vtd_domain_page(qinval_entries);
     qinval_update_qtail(iommu, index);
     spin_unlock_irqrestore(&iommu->register_lock, flags);
-}
-
-static int __iommu_flush_iec(struct iommu *iommu, u8 granu, u8 im, u16 iidx)
-{
-    int ret;
 
-    queue_invalidate_iec(iommu, granu, im, iidx);
     ret = invalidate_sync(iommu, 0);
+
     /*
      * reading vt-d architecture register will ensure
      * draining happens in implementation independent way.
@@ -277,12 +286,12 @@ static int __iommu_flush_iec(struct iommu *iommu, u8 granu, u8 im, u16 iidx)
 
 int iommu_flush_iec_global(struct iommu *iommu)
 {
-    return __iommu_flush_iec(iommu, IEC_GLOBAL_INVL, 0, 0);
+    return queue_invalidate_iec_sync(iommu, IEC_GLOBAL_INVL, 0, 0);
 }
 
 int iommu_flush_iec_index(struct iommu *iommu, u8 im, u16 iidx)
 {
-   return __iommu_flush_iec(iommu, IEC_INDEX_INVL, im, iidx);
+    return queue_invalidate_iec_sync(iommu, IEC_INDEX_INVL, im, iidx);
 }
 
 static int __must_check flush_context_qi(void *_iommu, u16 did,
@@ -308,11 +317,9 @@ static int __must_check flush_context_qi(void *_iommu, u16 did,
     }
 
     if ( qi_ctrl->qinval_maddr != 0 )
-    {
-        queue_invalidate_context(iommu, did, sid, fm,
-                                 type >> DMA_CCMD_INVL_GRANU_OFFSET);
-        ret = invalidate_sync(iommu, 0);
-    }
+        ret = queue_invalidate_context_sync(iommu, did, sid, fm,
+                                            type >> DMA_CCMD_INVL_GRANU_OFFSET);
+
     return ret;
 }
 
@@ -350,14 +357,18 @@ static int __must_check flush_iotlb_qi(void *_iommu, u16 did, u64 addr,
         if (cap_read_drain(iommu->cap))
             dr = 1;
         /* Need to conside the ih bit later */
-        queue_invalidate_iotlb(iommu,
-                               type >> DMA_TLB_FLUSH_GRANU_OFFSET, dr,
-                               dw, did, size_order, 0, addr);
-        if ( flush_dev_iotlb )
-            ret = dev_invalidate_iotlb(iommu, did, addr, size_order, type);
-        rc = invalidate_sync(iommu, flush_dev_iotlb);
+        rc = queue_invalidate_iotlb_sync(iommu,
+                                         type >> DMA_TLB_FLUSH_GRANU_OFFSET,
+                                         dr, dw, did, size_order, 0, addr);
         if ( !ret )
             ret = rc;
+
+        if ( flush_dev_iotlb )
+        {
+            rc = dev_invalidate_iotlb(iommu, did, addr, size_order, type);
+            if ( !ret )
+                ret = rc;
+        }
     }
 
     return ret;
diff --git a/xen/drivers/passthrough/vtd/x86/ats.c b/xen/drivers/passthrough/vtd/x86/ats.c
index 334b9c1..dfa4d30 100644
--- a/xen/drivers/passthrough/vtd/x86/ats.c
+++ b/xen/drivers/passthrough/vtd/x86/ats.c
@@ -134,8 +134,8 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
             /* invalidate all translations: sbit=1,bit_63=0,bit[62:12]=1 */
             sbit = 1;
             addr = (~0UL << PAGE_SHIFT_4K) & 0x7FFFFFFFFFFFFFFF;
-            rc = qinval_device_iotlb(iommu, pdev->ats_queue_depth,
-                                     sid, sbit, addr);
+            rc = qinval_device_iotlb_sync(iommu, pdev->ats_queue_depth,
+                                          sid, sbit, addr);
             break;
         case DMA_TLB_PSI_FLUSH:
             if ( !device_in_domain(iommu, pdev, did) )
@@ -154,8 +154,8 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
                 addr |= (((u64)1 << (size_order - 1)) - 1) << PAGE_SHIFT_4K;
             }
 
-            rc = qinval_device_iotlb(iommu, pdev->ats_queue_depth,
-                                     sid, sbit, addr);
+            rc = qinval_device_iotlb_sync(iommu, pdev->ats_queue_depth,
+                                          sid, sbit, addr);
             break;
         default:
             dprintk(XENLOG_WARNING VTDPREFIX, "invalid vt-d flush type\n");
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue
  2016-06-01  9:05 [Patch v11 0/3] VT-d Device-TLB flush issue Xu, Quan
  2016-06-01  9:05 ` [Patch v11 1/3] IOMMU: add a timeout parameter for device IOTLB invalidation Xu, Quan
  2016-06-01  9:05 ` [Patch v11 2/3] vt-d: synchronize for Device-TLB flush one by one Xu, Quan
@ 2016-06-01  9:05 ` Xu, Quan
  2016-06-02 11:07   ` Jan Beulich
  2 siblings, 1 reply; 17+ messages in thread
From: Xu, Quan @ 2016-06-01  9:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Quan Xu, kevin.tian, feng.wu, dario.faggioli, jbeulich

From: Quan Xu <quan.xu@intel.com>

If Device-TLB flush timed out, we hide the target ATS device
immediately and crash the domain owning this ATS device. If
impacted domain is hardware domain, just throw out a warning.

By hiding the device, we make sure it can't be assigned to any
domain any longer (see device_assigned).

v11:
   1. Pass down struct pci_dev *, instead of SBDF inside struct
      pci_ats_dev.
   2. change invalidate_sync() back as I add a specific function
      - dev_invalidate_sync() for device IOTLB.

Signed-off-by: Quan Xu <quan.xu@intel.com>
---
 xen/drivers/passthrough/pci.c         |  6 +--
 xen/drivers/passthrough/vtd/extern.h  |  5 +-
 xen/drivers/passthrough/vtd/qinval.c  | 86 +++++++++++++++++++++++++++++------
 xen/drivers/passthrough/vtd/x86/ats.c |  7 +--
 xen/include/xen/pci.h                 |  1 +
 5 files changed, 82 insertions(+), 23 deletions(-)

diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 98936f55c..843dc88 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -419,7 +419,7 @@ static void free_pdev(struct pci_seg *pseg, struct pci_dev *pdev)
     xfree(pdev);
 }
 
-static void _pci_hide_device(struct pci_dev *pdev)
+void pci_hide_existing_device(struct pci_dev *pdev)
 {
     if ( pdev->domain )
         return;
@@ -436,7 +436,7 @@ int __init pci_hide_device(int bus, int devfn)
     pdev = alloc_pdev(get_pseg(0), bus, devfn);
     if ( pdev )
     {
-        _pci_hide_device(pdev);
+        pci_hide_existing_device(pdev);
         rc = 0;
     }
     pcidevs_unlock();
@@ -466,7 +466,7 @@ int __init pci_ro_device(int seg, int bus, int devfn)
     }
 
     __set_bit(PCI_BDF2(bus, devfn), pseg->ro_map);
-    _pci_hide_device(pdev);
+    pci_hide_existing_device(pdev);
 
     return 0;
 }
diff --git a/xen/drivers/passthrough/vtd/extern.h b/xen/drivers/passthrough/vtd/extern.h
index 45357f2..94ca97a 100644
--- a/xen/drivers/passthrough/vtd/extern.h
+++ b/xen/drivers/passthrough/vtd/extern.h
@@ -21,6 +21,7 @@
 #define _VTD_EXTERN_H_
 
 #include "dmar.h"
+#include "../ats.h"
 #include <xen/keyhandler.h>
 
 #define VTDPREFIX "[VT-D]"
@@ -60,8 +61,8 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
                          u64 addr, unsigned int size_order, u64 type);
 
 int __must_check qinval_device_iotlb_sync(struct iommu *iommu,
-                                          u32 max_invs_pend,
-                                          u16 sid, u16 size, u64 addr);
+                                          struct pci_ats_dev *ats_dev,
+                                          u16 did, u16 size, u64 addr);
 
 unsigned int get_cache_line_size(void);
 void cacheline_flush(char *);
diff --git a/xen/drivers/passthrough/vtd/qinval.c b/xen/drivers/passthrough/vtd/qinval.c
index 9116588..bea3e86 100644
--- a/xen/drivers/passthrough/vtd/qinval.c
+++ b/xen/drivers/passthrough/vtd/qinval.c
@@ -30,8 +30,7 @@
 
 #define IOMMU_QI_TIMEOUT MILLISECS(1)
 
-static int __must_check invalidate_sync(struct iommu *iommu,
-                                        bool_t flush_dev_iotlb);
+static int __must_check invalidate_sync(struct iommu *iommu);
 
 static void print_qi_regs(struct iommu *iommu)
 {
@@ -103,7 +102,7 @@ static int __must_check queue_invalidate_context_sync(struct iommu *iommu,
 
     unmap_vtd_domain_page(qinval_entries);
 
-    return invalidate_sync(iommu, 0);
+    return invalidate_sync(iommu);
 }
 
 static int __must_check queue_invalidate_iotlb_sync(struct iommu *iommu,
@@ -140,7 +139,7 @@ static int __must_check queue_invalidate_iotlb_sync(struct iommu *iommu,
     qinval_update_qtail(iommu, index);
     spin_unlock_irqrestore(&iommu->register_lock, flags);
 
-    return invalidate_sync(iommu, 0);
+    return invalidate_sync(iommu);
 }
 
 static int __must_check queue_invalidate_wait(struct iommu *iommu,
@@ -200,20 +199,81 @@ static int __must_check queue_invalidate_wait(struct iommu *iommu,
     return -EOPNOTSUPP;
 }
 
-static int __must_check invalidate_sync(struct iommu *iommu,
-                                        bool_t flush_dev_iotlb)
+static int __must_check invalidate_sync(struct iommu *iommu)
 {
     struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
 
     if ( qi_ctrl->qinval_maddr )
-        return queue_invalidate_wait(iommu, 0, 1, 1, flush_dev_iotlb);
+        return queue_invalidate_wait(iommu, 0, 1, 1, 0);
 
     return 0;
 }
 
+static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did,
+                                         struct pci_ats_dev *ats_dev)
+{
+    struct domain *d = NULL;
+    struct pci_dev *pdev;
+
+    if ( test_bit(did, iommu->domid_bitmap) )
+        d = rcu_lock_domain_by_id(iommu->domid_map[did]);
+
+    /*
+     * In case the domain has been freed or the IOMMU domid bitmap is
+     * not valid, the device no longer belongs to this domain.
+     */
+    if ( d == NULL )
+        return;
+
+    pcidevs_lock();
+
+    for_each_pdev(d, pdev)
+    {
+        if ( (pdev->seg == ats_dev->seg) &&
+             (pdev->bus == ats_dev->bus) &&
+             (pdev->devfn == ats_dev->devfn) )
+        {
+            ASSERT(pdev->domain);
+            list_del(&pdev->domain_list);
+            pdev->domain = NULL;
+            pci_hide_existing_device(pdev);
+            break;
+        }
+    }
+
+    pcidevs_unlock();
+
+    if ( !is_hardware_domain(d) )
+        domain_crash(d);
+    else
+        printk(XENLOG_WARNING VTDPREFIX
+               " dom%d: ATS device %04x:%02x:%02x.%u flush failed\n",
+               d->domain_id, ats_dev->seg, ats_dev->bus,
+               PCI_SLOT(ats_dev->devfn), PCI_FUNC(ats_dev->devfn));
+
+    rcu_unlock_domain(d);
+}
+
+static int __must_check dev_invalidate_sync(struct iommu *iommu, u16 did,
+                                            struct pci_ats_dev *ats_dev)
+{
+    struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
+    int rc = 0;
+
+    if ( qi_ctrl->qinval_maddr )
+    {
+        rc = queue_invalidate_wait(iommu, 0, 1, 1, 1);
+
+        if ( rc == -ETIMEDOUT )
+            dev_invalidate_iotlb_timeout(iommu, did, ats_dev);
+    }
+
+    return rc;
+}
+
 int qinval_device_iotlb_sync(struct iommu *iommu,
-                             u32 max_invs_pend,
-                             u16 sid, u16 size, u64 addr)
+                             struct pci_ats_dev *ats_dev,
+                             u16 did, u16 size, u64 addr)
 {
     unsigned long flags;
     unsigned int index;
@@ -229,9 +289,9 @@ int qinval_device_iotlb_sync(struct iommu *iommu,
 
     qinval_entry->q.dev_iotlb_inv_dsc.lo.type = TYPE_INVAL_DEVICE_IOTLB;
     qinval_entry->q.dev_iotlb_inv_dsc.lo.res_1 = 0;
-    qinval_entry->q.dev_iotlb_inv_dsc.lo.max_invs_pend = max_invs_pend;
+    qinval_entry->q.dev_iotlb_inv_dsc.lo.max_invs_pend = ats_dev->ats_queue_depth;
     qinval_entry->q.dev_iotlb_inv_dsc.lo.res_2 = 0;
-    qinval_entry->q.dev_iotlb_inv_dsc.lo.sid = sid;
+    qinval_entry->q.dev_iotlb_inv_dsc.lo.sid = PCI_BDF2(ats_dev->bus, ats_dev->devfn);
     qinval_entry->q.dev_iotlb_inv_dsc.lo.res_3 = 0;
 
     qinval_entry->q.dev_iotlb_inv_dsc.hi.size = size;
@@ -242,7 +302,7 @@ int qinval_device_iotlb_sync(struct iommu *iommu,
     qinval_update_qtail(iommu, index);
     spin_unlock_irqrestore(&iommu->register_lock, flags);
 
-    return invalidate_sync(iommu, 1);
+    return dev_invalidate_sync(iommu, did, ats_dev);
 }
 
 static int __must_check queue_invalidate_iec_sync(struct iommu *iommu,
@@ -273,7 +333,7 @@ static int __must_check queue_invalidate_iec_sync(struct iommu *iommu,
     qinval_update_qtail(iommu, index);
     spin_unlock_irqrestore(&iommu->register_lock, flags);
 
-    ret = invalidate_sync(iommu, 0);
+    ret = invalidate_sync(iommu);
 
     /*
      * reading vt-d architecture register will ensure
diff --git a/xen/drivers/passthrough/vtd/x86/ats.c b/xen/drivers/passthrough/vtd/x86/ats.c
index dfa4d30..ecae94c 100644
--- a/xen/drivers/passthrough/vtd/x86/ats.c
+++ b/xen/drivers/passthrough/vtd/x86/ats.c
@@ -116,7 +116,6 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
 
     list_for_each_entry( pdev, &ats_devices, list )
     {
-        u16 sid = PCI_BDF2(pdev->bus, pdev->devfn);
         bool_t sbit;
         int rc = 0;
 
@@ -134,8 +133,7 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
             /* invalidate all translations: sbit=1,bit_63=0,bit[62:12]=1 */
             sbit = 1;
             addr = (~0UL << PAGE_SHIFT_4K) & 0x7FFFFFFFFFFFFFFF;
-            rc = qinval_device_iotlb_sync(iommu, pdev->ats_queue_depth,
-                                          sid, sbit, addr);
+            rc = qinval_device_iotlb_sync(iommu, pdev, did, sbit, addr);
             break;
         case DMA_TLB_PSI_FLUSH:
             if ( !device_in_domain(iommu, pdev, did) )
@@ -154,8 +152,7 @@ int dev_invalidate_iotlb(struct iommu *iommu, u16 did,
                 addr |= (((u64)1 << (size_order - 1)) - 1) << PAGE_SHIFT_4K;
             }
 
-            rc = qinval_device_iotlb_sync(iommu, pdev->ats_queue_depth,
-                                          sid, sbit, addr);
+            rc = qinval_device_iotlb_sync(iommu, pdev, did, sbit, addr);
             break;
         default:
             dprintk(XENLOG_WARNING VTDPREFIX, "invalid vt-d flush type\n");
diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h
index 6ed29dd..e4940cd 100644
--- a/xen/include/xen/pci.h
+++ b/xen/include/xen/pci.h
@@ -118,6 +118,7 @@ int pci_add_device(u16 seg, u8 bus, u8 devfn,
 int pci_remove_device(u16 seg, u8 bus, u8 devfn);
 int pci_ro_device(int seg, int bus, int devfn);
 int pci_hide_device(int bus, int devfn);
+void pci_hide_existing_device(struct pci_dev *pdev);
 struct pci_dev *pci_get_pdev(int seg, int bus, int devfn);
 struct pci_dev *pci_get_real_pdev(int seg, int bus, int devfn);
 struct pci_dev *pci_get_pdev_by_domain(
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch v11 1/3] IOMMU: add a timeout parameter for device IOTLB invalidation
  2016-06-01  9:05 ` [Patch v11 1/3] IOMMU: add a timeout parameter for device IOTLB invalidation Xu, Quan
@ 2016-06-02 10:24   ` Jan Beulich
  2016-06-15  2:55     ` Xu, Quan
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Beulich @ 2016-06-02 10:24 UTC (permalink / raw)
  To: Quan Xu; +Cc: dario.faggioli, feng.wu, kevin.tian, xen-devel

>>> On 01.06.16 at 11:05, <quan.xu@intel.com> wrote:
> From: Quan Xu <quan.xu@intel.com>
> 
> The parameter 'iommu_dev_iotlb_timeout' specifies the timeout of
> the device IOTLB invalidation in milliseconds. By default, the
> timeout is 1ms, which can be boot-time changed.
> 
> Add a __must_check annotation. The followup patch titled
> 'VT-d IOTLB/Context/IEC flush issue' addresses the __mustcheck.
> That is the other callers of this routine (two or three levels up)
> ignore the return code. This patch does not address this but the
> other does.

The description really needs to mention that the dropping of
the panic() is intentional, and why.

> v11: Change the timeout parameter from 'vtd_qi_timeout' to
>     'iommu_dev_iotlb_timeout', which is not only for VT-d device
>     IOTLB invalidation, but also for other IOMMU implementations.

This goes after the first --- separator.

> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -996,6 +996,15 @@ debug hypervisor only).
>  
>  >> Enable IOMMU debugging code (implies `verbose`).
>  
> +### iommu\_dev\_iotlb\_timeout
> +> `= <integer>`
> +
> +> Default: `1`

So on v10 I had made clear that any timeout reduction from its
current value is, for the ATS case, not acceptable, unless you have
proof that this lower value will fit all past, present, and future
devices. Otherwise we're risking a regression here.

> --- a/xen/drivers/passthrough/vtd/qinval.c
> +++ b/xen/drivers/passthrough/vtd/qinval.c
> @@ -28,6 +28,8 @@
>  #include "vtd.h"
>  #include "extern.h"
>  
> +#define IOMMU_QI_TIMEOUT MILLISECS(1)

May I suggest VTD_QI_TIMEOUT (but see also below)?

> @@ -163,14 +165,21 @@ static int queue_invalidate_wait(struct iommu *iommu,
>      /* Now we don't support interrupt method */
>      if ( sw )
>      {
> +        s_time_t timeout;
> +
>          /* In case all wait descriptor writes to same addr with same data */
> -        start_time = NOW();
> +        timeout = flush_dev_iotlb ?
> +                  (NOW() + iommu_dev_iotlb_timeout * MILLISECS(1)) :

MILLISECS(iommu_dev_iotlb_timeout)

> +                  (NOW() + IOMMU_QI_TIMEOUT);

Or really the whole expression should probably simply become

        timeout = NOW() + MILLISECS(flush_dev_iotlb ? iommu_dev_iotlb_timeout : VTD_QI_TIMEOUT);

(of course with VTD_QI_TIMEOUT having its MILLISECS() dropped,
and suitably line wrapped).

> @@ -344,10 +355,11 @@ static int __must_check flush_iotlb_qi(void *_iommu, u16 did, u64 addr,
>                                 dw, did, size_order, 0, addr);
>          if ( flush_dev_iotlb )
>              ret = dev_invalidate_iotlb(iommu, did, addr, size_order, type);
> -        rc = invalidate_sync(iommu);
> +        rc = invalidate_sync(iommu, flush_dev_iotlb);
>          if ( !ret )
>              ret = rc;
>      }
> +
>      return ret;

Once again an addition of a blank line that doesn't belong here.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch v11 2/3] vt-d: synchronize for Device-TLB flush one by one
  2016-06-01  9:05 ` [Patch v11 2/3] vt-d: synchronize for Device-TLB flush one by one Xu, Quan
@ 2016-06-02 10:49   ` Jan Beulich
  0 siblings, 0 replies; 17+ messages in thread
From: Jan Beulich @ 2016-06-02 10:49 UTC (permalink / raw)
  To: Quan Xu; +Cc: dario.faggioli, feng.wu, kevin.tian, xen-devel

>>> On 01.06.16 at 11:05, <quan.xu@intel.com> wrote:
> From: Quan Xu <quan.xu@intel.com>
> 
> Today we do Device-TLB flush synchronization after issuing flush
> requests for all ATS devices belonging to a VM. Doing so however
> imposes a limitation, i.e. that we can not figure out which flush
> request is blocked in the flush queue list, based on VT-d spec.
> 
> To prepare correct Device-TLB flush timeout handling in next patch,
> we change the behavior to synchronize for every Device-TLB flush
> request. So the Device-TLB flush interface is changed a little bit,
> by checking timeout within the function instead of outside of function.
> 
> Accordingly we also do a similar change for flush interfaces of
> IOTLB/IEC/Context, i.e. moving synchronization into the function.
> Since there is no user of a non-synced interface, we just rename
> existing ones with _sync suffix.
> 
> v11:
>    1. Add __must_check annotation to invalidate_sync().
>    2. Remove the redundant invalidate_sync() in flush_iotlb_qi().
>    3. Refactor invalidate_sync() to indicate whether we need to flush
>       device IOTLB or not (I'll change it back in next patch, as I'll
>       add a specific function - dev_invalidate_sync() for device IOTLB
>       invalidation).
> 
> Signed-off-by: Quan Xu <quan.xu@intel.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue
  2016-06-01  9:05 ` [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue Xu, Quan
@ 2016-06-02 11:07   ` Jan Beulich
  2016-06-16  8:42     ` Xu, Quan
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Beulich @ 2016-06-02 11:07 UTC (permalink / raw)
  To: Quan Xu; +Cc: dario.faggioli, feng.wu, kevin.tian, xen-devel

>>> On 01.06.16 at 11:05, <quan.xu@intel.com> wrote:
> --- a/xen/drivers/passthrough/vtd/extern.h
> +++ b/xen/drivers/passthrough/vtd/extern.h
> @@ -21,6 +21,7 @@
>  #define _VTD_EXTERN_H_
>  
>  #include "dmar.h"
> +#include "../ats.h"

Why? You don't de-reference struct pci_ats_dev * in this file, so
all you'd need is a forward declaration. But then this is not in line
with your v11 change description above, so I wonder whether
you actually sent a stale patch. After all I thought the v10
discussion (see
http://lists.xenproject.org/archives/html/xen-devel/2016-05/msg02208.html
) had made clear that this passing down, besides reducing the
number of arguments of some function, would also be meant to
eliminate ...

> +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did,
> +                                         struct pci_ats_dev *ats_dev)
> +{
> +    struct domain *d = NULL;
> +    struct pci_dev *pdev;
> +
> +    if ( test_bit(did, iommu->domid_bitmap) )
> +        d = rcu_lock_domain_by_id(iommu->domid_map[did]);
> +
> +    /*
> +     * In case the domain has been freed or the IOMMU domid bitmap is
> +     * not valid, the device no longer belongs to this domain.
> +     */
> +    if ( d == NULL )
> +        return;
> +
> +    pcidevs_lock();
> +
> +    for_each_pdev(d, pdev)
> +    {
> +        if ( (pdev->seg == ats_dev->seg) &&
> +             (pdev->bus == ats_dev->bus) &&
> +             (pdev->devfn == ats_dev->devfn) )
> +        {
> +            ASSERT(pdev->domain);
> +            list_del(&pdev->domain_list);
> +            pdev->domain = NULL;
> +            pci_hide_existing_device(pdev);
> +            break;
> +        }
> +    }
> +
> +    pcidevs_unlock();

... this loop (and locking). (Of course such a change may better be
done in another preparatory patch.)

> +    if ( !is_hardware_domain(d) )
> +        domain_crash(d);
> +    else
> +        printk(XENLOG_WARNING VTDPREFIX
> +               " dom%d: ATS device %04x:%02x:%02x.%u flush failed\n",
> +               d->domain_id, ats_dev->seg, ats_dev->bus,
> +               PCI_SLOT(ats_dev->devfn), PCI_FUNC(ats_dev->devfn));

Please use the same logic for logging and crashing as you do in
the other series, so that at least on average a resulting DomU
crash will be accompanied with some indication of the reason
beyond just the source file name and line number.

> +static int __must_check dev_invalidate_sync(struct iommu *iommu, u16 did,
> +                                            struct pci_ats_dev *ats_dev)
> +{
> +    struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
> +    int rc = 0;
> +
> +    if ( qi_ctrl->qinval_maddr )
> +    {
> +        rc = queue_invalidate_wait(iommu, 0, 1, 1, 1);
> +
> +        if ( rc == -ETIMEDOUT )
> +            dev_invalidate_iotlb_timeout(iommu, did, ats_dev);
> +    }
> +
> +    return rc;
> +}

I've never really understood why invalidate_sync() returns success
when it didn't do anything. Now that you copy this same behavior
here, I really need to ask you to explain that.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch v11 1/3] IOMMU: add a timeout parameter for device IOTLB invalidation
  2016-06-02 10:24   ` Jan Beulich
@ 2016-06-15  2:55     ` Xu, Quan
  0 siblings, 0 replies; 17+ messages in thread
From: Xu, Quan @ 2016-06-15  2:55 UTC (permalink / raw)
  To: Jan Beulich; +Cc: dario.faggioli, Wu, Feng, Tian, Kevin, xen-devel

On June 02, 2016 6:25 PM, Jan Beulich <JBeulich@suse.com> wrote:
> >>> On 01.06.16 at 11:05, <quan.xu@intel.com> wrote:
> > From: Quan Xu <quan.xu@intel.com>
> > v11: Change the timeout parameter from 'vtd_qi_timeout' to
> >     'iommu_dev_iotlb_timeout', which is not only for VT-d device
> >     IOTLB invalidation, but also for other IOMMU implementations.
> 
> This goes after the first --- separator.
> 

Got it. It should be:

---
v11:
       - ...
       - ...
---


> > --- a/docs/misc/xen-command-line.markdown
> > +++ b/docs/misc/xen-command-line.markdown
> > @@ -996,6 +996,15 @@ debug hypervisor only).
> >
> >  >> Enable IOMMU debugging code (implies `verbose`).
> >
> > +### iommu\_dev\_iotlb\_timeout
> > +> `= <integer>`
> > +
> > +> Default: `1`
> 
> So on v10 I had made clear that any timeout reduction from its current value
> is, for the ATS case, not acceptable, unless you have proof that this lower
> value will fit all past, present, and future devices. Otherwise we're risking a
> regression here.
> 

I really misunderstood the 'current value', which should be about 'DMAR_OPERATION_TIMEOUT MILLISECS(1000) ', instead of ' IOMMU_QI_TIMEOUT MILLISECS(1)' in my patch.
So the default is 1000.
 
> > --- a/xen/drivers/passthrough/vtd/qinval.c
> > +++ b/xen/drivers/passthrough/vtd/qinval.c
> > @@ -28,6 +28,8 @@
> >  #include "vtd.h"
> >  #include "extern.h"
> >
> > +#define IOMMU_QI_TIMEOUT MILLISECS(1)
> 
> May I suggest VTD_QI_TIMEOUT (but see also below)?
> 

Agreed. VTD_QI_TIMEOUT is a better one.

> > @@ -163,14 +165,21 @@ static int queue_invalidate_wait(struct iommu
> *iommu,
> >      /* Now we don't support interrupt method */
> >      if ( sw )
> >      {
> > +        s_time_t timeout;
> > +
> >          /* In case all wait descriptor writes to same addr with same data */
> > -        start_time = NOW();
> > +        timeout = flush_dev_iotlb ?
> > +                  (NOW() + iommu_dev_iotlb_timeout * MILLISECS(1)) :
> 
> MILLISECS(iommu_dev_iotlb_timeout)
> 
> > +                  (NOW() + IOMMU_QI_TIMEOUT);
> 
> Or really the whole expression should probably simply become
> 
>         timeout = NOW() + MILLISECS(flush_dev_iotlb ?
> iommu_dev_iotlb_timeout : VTD_QI_TIMEOUT);
> 
> (of course with VTD_QI_TIMEOUT having its MILLISECS() dropped, and
> suitably line wrapped).
> 


I prefer this later one.

Quan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue
  2016-06-02 11:07   ` Jan Beulich
@ 2016-06-16  8:42     ` Xu, Quan
  2016-06-16  9:04       ` Jan Beulich
  0 siblings, 1 reply; 17+ messages in thread
From: Xu, Quan @ 2016-06-16  8:42 UTC (permalink / raw)
  To: Jan Beulich; +Cc: dario.faggioli, Wu, Feng, Tian, Kevin, xen-devel

(* I will CC arm/amd maintainer after this vt-d specific discussion, and then send out my proposal...)

On June 02, 2016 7:07 PM, Jan Beulich <JBeulich@suse.com> wrote:
> >>> On 01.06.16 at 11:05, <quan.xu@intel.com> wrote:
> > --- a/xen/drivers/passthrough/vtd/extern.h
> > +++ b/xen/drivers/passthrough/vtd/extern.h
> > @@ -21,6 +21,7 @@
> >  #define _VTD_EXTERN_H_
> >
> >  #include "dmar.h"
> > +#include "../ats.h"
> 
> Why? You don't de-reference struct pci_ats_dev * in this file, so all you'd need
> is a forward declaration. But then this is not in line with your v11 change
> description above, so I wonder whether you actually sent a stale patch.

Sorry, this patch is really strange to me.

> After all I thought the v10 discussion (see
> http://lists.xenproject.org/archives/html/xen-devel/2016-
> 05/msg02208.html
> ) had made clear that this passing down,

Sure, what you said is very clear. But I read these code again, I found a  pci_get_pdev_by_domain()
Can also get *pdev without below loop.. (also hardware domain calls pci_get_pdev_by_domain() to get pdev.)

To be honest,  now I don't like to add a struct pci_dev * inside struct pci_ats_dev, as I need to change
' struct pci_ats_dev *pdev' to ' struct pci_ats_dev * pci_ats_dev ' in some functions as well.


> besides reducing the number of
> arguments of some function, would also be meant to eliminate ...
> 
> > +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did,
> > +                                         struct pci_ats_dev *ats_dev)
> > +{
> > +    struct domain *d = NULL;
> > +    struct pci_dev *pdev;
> > +
> > +    if ( test_bit(did, iommu->domid_bitmap) )
> > +        d = rcu_lock_domain_by_id(iommu->domid_map[did]);
> > +
> > +    /*
> > +     * In case the domain has been freed or the IOMMU domid bitmap is
> > +     * not valid, the device no longer belongs to this domain.
> > +     */
> > +    if ( d == NULL )
> > +        return;
> > +
> > +    pcidevs_lock();
> > +
> > +    for_each_pdev(d, pdev)
> > +    {
> > +        if ( (pdev->seg == ats_dev->seg) &&
> > +             (pdev->bus == ats_dev->bus) &&
> > +             (pdev->devfn == ats_dev->devfn) )
> > +        {
> > +            ASSERT(pdev->domain);
> > +            list_del(&pdev->domain_list);
> > +            pdev->domain = NULL;
> > +            pci_hide_existing_device(pdev);
> > +            break;
> > +        }
> > +    }
> > +
> > +    pcidevs_unlock();
> 
> ... this loop (and locking). (Of course such a change may better be done in
> another preparatory patch.)
> 

To eliminate the locking?  I am afraid the locking is still a must here even without the loop, also referring  to device_assigned()..


> 
> > +static int __must_check dev_invalidate_sync(struct iommu *iommu, u16
> did,
> > +                                            struct pci_ats_dev
> > +*ats_dev) {
> > +    struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
> > +    int rc = 0;
> > +
> > +    if ( qi_ctrl->qinval_maddr )
> > +    {
> > +        rc = queue_invalidate_wait(iommu, 0, 1, 1, 1);
> > +
> > +        if ( rc == -ETIMEDOUT )
> > +            dev_invalidate_iotlb_timeout(iommu, did, ats_dev);
> > +    }
> > +
> > +    return rc;
> > +}
> 
> I've never really understood why invalidate_sync() returns success when it
> didn't do anything. Now that you copy this same behavior here, I really need
> to ask you to explain that.
> 

It is acceptable to me, returning success when it didn't do anything -- this is worth reflection and criticism:(..
It is better:
+    if ( qi_ctrl->qinval_maddr )
+        ...
+    else
+        rc = -ENOENT;

A question:
I find the page related to qi_ctrl->qinval_maddr is not freed at all. IMO,
In disable_qinval (), we need to do:
     - free the page related to qi_ctrl->qinval_maddr.
     - qi_ctrl->qinval_maddr = 0;

Right?

Quan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue
  2016-06-16  8:42     ` Xu, Quan
@ 2016-06-16  9:04       ` Jan Beulich
  2016-06-17  6:08         ` Xu, Quan
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Beulich @ 2016-06-16  9:04 UTC (permalink / raw)
  To: Quan Xu; +Cc: dario.faggioli, Feng Wu, Kevin Tian, xen-devel

>>> On 16.06.16 at 10:42, <quan.xu@intel.com> wrote:
> On June 02, 2016 7:07 PM, Jan Beulich <JBeulich@suse.com> wrote:
>> >>> On 01.06.16 at 11:05, <quan.xu@intel.com> wrote:
>> > --- a/xen/drivers/passthrough/vtd/extern.h
>> > +++ b/xen/drivers/passthrough/vtd/extern.h
>> > @@ -21,6 +21,7 @@
>> >  #define _VTD_EXTERN_H_
>> >
>> >  #include "dmar.h"
>> > +#include "../ats.h"
>> 
>> Why? You don't de-reference struct pci_ats_dev * in this file, so all you'd need
>> is a forward declaration. But then this is not in line with your v11 change
>> description above, so I wonder whether you actually sent a stale patch.
> 
> Sorry, this patch is really strange to me.

Well, it's yours, so you ought to be able to explain it.

>> After all I thought the v10 discussion (see
>> http://lists.xenproject.org/archives/html/xen-devel/2016- 
>> 05/msg02208.html
>> ) had made clear that this passing down,
> 
> Sure, what you said is very clear. But I read these code again, I found a  
> pci_get_pdev_by_domain()
> Can also get *pdev without below loop.. (also hardware domain calls 
> pci_get_pdev_by_domain() to get pdev.)

Which would not eliminate the loop - pci_get_pdev_by_domain()
does have a loop itself.

> To be honest,  now I don't like to add a struct pci_dev * inside struct 
> pci_ats_dev, as I need to change
> ' struct pci_ats_dev *pdev' to ' struct pci_ats_dev * pci_ats_dev ' in some 
> functions as well.

I don't see why variables would need renaming. If you need a
variable of type struct pci_dev * in a function which already has
a variable named pdev, simply name the new variable differently
(e.g. pci_dev).

>> besides reducing the number of
>> arguments of some function, would also be meant to eliminate ...
>> 
>> > +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did,
>> > +                                         struct pci_ats_dev *ats_dev)
>> > +{
>> > +    struct domain *d = NULL;
>> > +    struct pci_dev *pdev;
>> > +
>> > +    if ( test_bit(did, iommu->domid_bitmap) )
>> > +        d = rcu_lock_domain_by_id(iommu->domid_map[did]);
>> > +
>> > +    /*
>> > +     * In case the domain has been freed or the IOMMU domid bitmap is
>> > +     * not valid, the device no longer belongs to this domain.
>> > +     */
>> > +    if ( d == NULL )
>> > +        return;
>> > +
>> > +    pcidevs_lock();
>> > +
>> > +    for_each_pdev(d, pdev)
>> > +    {
>> > +        if ( (pdev->seg == ats_dev->seg) &&
>> > +             (pdev->bus == ats_dev->bus) &&
>> > +             (pdev->devfn == ats_dev->devfn) )
>> > +        {
>> > +            ASSERT(pdev->domain);
>> > +            list_del(&pdev->domain_list);
>> > +            pdev->domain = NULL;
>> > +            pci_hide_existing_device(pdev);
>> > +            break;
>> > +        }
>> > +    }
>> > +
>> > +    pcidevs_unlock();
>> 
>> ... this loop (and locking). (Of course such a change may better be done in
>> another preparatory patch.)
>> 
> 
> To eliminate the locking?  I am afraid the locking is still a must here even 
> without the loop, also referring  to device_assigned()..

If the entire loop gets eliminated, what would be left is

    pcidevs_lock();
    pcidevs_unlock();

which I don't think does any good.

>> > +static int __must_check dev_invalidate_sync(struct iommu *iommu, u16
>> did,
>> > +                                            struct pci_ats_dev
>> > +*ats_dev) {
>> > +    struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
>> > +    int rc = 0;
>> > +
>> > +    if ( qi_ctrl->qinval_maddr )
>> > +    {
>> > +        rc = queue_invalidate_wait(iommu, 0, 1, 1, 1);
>> > +
>> > +        if ( rc == -ETIMEDOUT )
>> > +            dev_invalidate_iotlb_timeout(iommu, did, ats_dev);
>> > +    }
>> > +
>> > +    return rc;
>> > +}
>> 
>> I've never really understood why invalidate_sync() returns success when it
>> didn't do anything. Now that you copy this same behavior here, I really need
>> to ask you to explain that.
>> 
> 
> It is acceptable to me, returning success when it didn't do anything -- this 
> is worth reflection and criticism:(..
> It is better:
> +    if ( qi_ctrl->qinval_maddr )
> +        ...
> +    else
> +        rc = -ENOENT;

Right. And perhaps a separate patch to make invalidate_sync()
do the same. Question is whether this really ought to be a
conditional, or whether instead this code is unreachable when
qinval is not in use, in which case these conditionals would imo
better be converted to ASSERT()s.

> A question:
> I find the page related to qi_ctrl->qinval_maddr is not freed at all. IMO,
> In disable_qinval (), we need to do:
>      - free the page related to qi_ctrl->qinval_maddr.
>      - qi_ctrl->qinval_maddr = 0;

Well, that's a correct observation, but not a big problem imo: If this
was a per-domain resource, it surely would need fixing. But if freeing
a couple of these pages (one per IOMMU) causes synchronization
headaches (e.g. because there may still be dangling references to
it), then I think freeing them is not a must. But if freeing them is safe
(like you seem to imply), then I'm certainly fine with you fixing this
(not that my opinion would matter much here, as I'm not the
maintainer of this code).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue
  2016-06-16  9:04       ` Jan Beulich
@ 2016-06-17  6:08         ` Xu, Quan
  2016-06-17  7:00           ` Jan Beulich
  0 siblings, 1 reply; 17+ messages in thread
From: Xu, Quan @ 2016-06-17  6:08 UTC (permalink / raw)
  To: Jan Beulich; +Cc: dario.faggioli, Wu, Feng, Tian, Kevin, xen-devel


On June 16, 2016 5:05 PM, Jan Beulich <JBeulich@suse.com> wrote:
> >>> On 16.06.16 at 10:42, <quan.xu@intel.com> wrote:
> > On June 02, 2016 7:07 PM, Jan Beulich <JBeulich@suse.com> wrote:
> >> >>> On 01.06.16 at 11:05, <quan.xu@intel.com> wrote:
> >> > +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did,
> >> > +                                         struct pci_ats_dev
> >> > +*ats_dev) {
> >> > +    struct domain *d = NULL;
> >> > +    struct pci_dev *pdev;
> >> > +
> >> > +    if ( test_bit(did, iommu->domid_bitmap) )
> >> > +        d = rcu_lock_domain_by_id(iommu->domid_map[did]);
> >> > +
> >> > +    /*
> >> > +     * In case the domain has been freed or the IOMMU domid bitmap is
> >> > +     * not valid, the device no longer belongs to this domain.
> >> > +     */
> >> > +    if ( d == NULL )
> >> > +        return;
> >> > +
> >> > +    pcidevs_lock();
> >> > +
> >> > +    for_each_pdev(d, pdev)
> >> > +    {
> >> > +        if ( (pdev->seg == ats_dev->seg) &&
> >> > +             (pdev->bus == ats_dev->bus) &&
> >> > +             (pdev->devfn == ats_dev->devfn) )
> >> > +        {
> >> > +            ASSERT(pdev->domain);
> >> > +            list_del(&pdev->domain_list);
> >> > +            pdev->domain = NULL;
> >> > +            pci_hide_existing_device(pdev);
> >> > +            break;
> >> > +        }
> >> > +    }
> >> > +
> >> > +    pcidevs_unlock();
> >>
> >> ... this loop (and locking). (Of course such a change may better be
> >> done in another preparatory patch.)
> >>
> >
> > To eliminate the locking?  I am afraid the locking is still a must
> > here even without the loop, also referring  to device_assigned()..
> 
> If the entire loop gets eliminated, what would be left is
> 
>     pcidevs_lock();
>     pcidevs_unlock();
> 
> which I don't think does any good.
> 

Why? I can't follow it..


> >> > +static int __must_check dev_invalidate_sync(struct iommu *iommu,
> >> > +u16
> >> did,
> >> > +                                            struct pci_ats_dev
> >> > +*ats_dev) {
> >> > +    struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
> >> > +    int rc = 0;
> >> > +
> >> > +    if ( qi_ctrl->qinval_maddr )
> >> > +    {
> >> > +        rc = queue_invalidate_wait(iommu, 0, 1, 1, 1);
> >> > +
> >> > +        if ( rc == -ETIMEDOUT )
> >> > +            dev_invalidate_iotlb_timeout(iommu, did, ats_dev);
> >> > +    }
> >> > +
> >> > +    return rc;
> >> > +}
> >>
> >> I've never really understood why invalidate_sync() returns success
> >> when it didn't do anything. Now that you copy this same behavior
> >> here, I really need to ask you to explain that.
> >>
> >
> > It is acceptable to me, returning success when it didn't do anything
> > -- this is worth reflection and criticism:(..
> > It is better:
> > +    if ( qi_ctrl->qinval_maddr )
> > +        ...
> > +    else
> > +        rc = -ENOENT;
> 
> Right. And perhaps a separate patch to make invalidate_sync() do the same.

Agreed.

> Question is whether this really ought to be a conditional, or whether instead
> this code is unreachable when qinval is not in use, in which case these
> conditionals would imo better be converted to ASSERT()s.
> 

IMO, this should be a conditional.
As mentioned below, strictly speaking, this is a bug. We can't  ASSERT() based on a bug..
A coming patch may fix it..


> > A question:
> > I find the page related to qi_ctrl->qinval_maddr is not freed at all.
> > IMO, In disable_qinval (), we need to do:
> >      - free the page related to qi_ctrl->qinval_maddr.
> >      - qi_ctrl->qinval_maddr = 0;
> 
> Well, that's a correct observation, but not a big problem imo: If this was a per-
> domain resource, it surely would need fixing. But if freeing a couple of these
> pages (one per IOMMU) causes synchronization headaches (e.g. because
> there may still be dangling references to it), then I think freeing them is not a
> must. But if freeing them is safe (like you seem to imply), then I'm certainly
> fine with you fixing this (not that my opinion would matter much here, as I'm
> not the maintainer of this code).
> 

Agreed, thanks for your explanation. At least I will leave it as is in this patch set.

Quan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue
  2016-06-17  6:08         ` Xu, Quan
@ 2016-06-17  7:00           ` Jan Beulich
  2016-06-17  8:15             ` Xu, Quan
  2016-06-22 15:54             ` Xu, Quan
  0 siblings, 2 replies; 17+ messages in thread
From: Jan Beulich @ 2016-06-17  7:00 UTC (permalink / raw)
  To: Quan Xu; +Cc: dario.faggioli, Feng Wu, Kevin Tian, xen-devel

>>> On 17.06.16 at 08:08, <quan.xu@intel.com> wrote:

> On June 16, 2016 5:05 PM, Jan Beulich <JBeulich@suse.com> wrote:
>> >>> On 16.06.16 at 10:42, <quan.xu@intel.com> wrote:
>> > On June 02, 2016 7:07 PM, Jan Beulich <JBeulich@suse.com> wrote:
>> >> >>> On 01.06.16 at 11:05, <quan.xu@intel.com> wrote:
>> >> > +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16 did,
>> >> > +                                         struct pci_ats_dev
>> >> > +*ats_dev) {
>> >> > +    struct domain *d = NULL;
>> >> > +    struct pci_dev *pdev;
>> >> > +
>> >> > +    if ( test_bit(did, iommu->domid_bitmap) )
>> >> > +        d = rcu_lock_domain_by_id(iommu->domid_map[did]);
>> >> > +
>> >> > +    /*
>> >> > +     * In case the domain has been freed or the IOMMU domid bitmap is
>> >> > +     * not valid, the device no longer belongs to this domain.
>> >> > +     */
>> >> > +    if ( d == NULL )
>> >> > +        return;
>> >> > +
>> >> > +    pcidevs_lock();
>> >> > +
>> >> > +    for_each_pdev(d, pdev)
>> >> > +    {
>> >> > +        if ( (pdev->seg == ats_dev->seg) &&
>> >> > +             (pdev->bus == ats_dev->bus) &&
>> >> > +             (pdev->devfn == ats_dev->devfn) )
>> >> > +        {
>> >> > +            ASSERT(pdev->domain);
>> >> > +            list_del(&pdev->domain_list);
>> >> > +            pdev->domain = NULL;
>> >> > +            pci_hide_existing_device(pdev);
>> >> > +            break;
>> >> > +        }
>> >> > +    }
>> >> > +
>> >> > +    pcidevs_unlock();
>> >>
>> >> ... this loop (and locking). (Of course such a change may better be
>> >> done in another preparatory patch.)
>> >>
>> >
>> > To eliminate the locking?  I am afraid the locking is still a must
>> > here even without the loop, also referring  to device_assigned()..
>> 
>> If the entire loop gets eliminated, what would be left is
>> 
>>     pcidevs_lock();
>>     pcidevs_unlock();
>> 
>> which I don't think does any good.
> 
> Why? I can't follow it..

I don't understand your question. Can you explain what use above
code sequence is, in your opinion? Or else - what does the "why"
refer to?

>> >> > +static int __must_check dev_invalidate_sync(struct iommu *iommu,
>> >> > +u16
>> >> did,
>> >> > +                                            struct pci_ats_dev
>> >> > +*ats_dev) {
>> >> > +    struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
>> >> > +    int rc = 0;
>> >> > +
>> >> > +    if ( qi_ctrl->qinval_maddr )
>> >> > +    {
>> >> > +        rc = queue_invalidate_wait(iommu, 0, 1, 1, 1);
>> >> > +
>> >> > +        if ( rc == -ETIMEDOUT )
>> >> > +            dev_invalidate_iotlb_timeout(iommu, did, ats_dev);
>> >> > +    }
>> >> > +
>> >> > +    return rc;
>> >> > +}
>> >>
>> >> I've never really understood why invalidate_sync() returns success
>> >> when it didn't do anything. Now that you copy this same behavior
>> >> here, I really need to ask you to explain that.
>> >>
>> >
>> > It is acceptable to me, returning success when it didn't do anything
>> > -- this is worth reflection and criticism:(..
>> > It is better:
>> > +    if ( qi_ctrl->qinval_maddr )
>> > +        ...
>> > +    else
>> > +        rc = -ENOENT;
>> 
>> Right. And perhaps a separate patch to make invalidate_sync() do the same.
> 
> Agreed.
> 
>> Question is whether this really ought to be a conditional, or whether 
> instead
>> this code is unreachable when qinval is not in use, in which case these
>> conditionals would imo better be converted to ASSERT()s.
>> 
> 
> IMO, this should be a conditional.
> As mentioned below, strictly speaking, this is a bug. We can't  ASSERT() 
> based on a bug..
> A coming patch may fix it..

And again I don't understand: ASSERT()s are to verify assumed
state. If static code analysis resulted in understanding a function
is unreachable when qi_ctrl->qinval_maddr is zero (because
qinval ought to have got disabled if any of the table setup failed),
then adding ASSERT() would (a) document that and (b) allow to
know quickly if something broke that assumption.

But then again I may simply misunderstand your wording: "We
can't ASSERT() based on a bug" is really pretty unclear to me.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue
  2016-06-17  7:00           ` Jan Beulich
@ 2016-06-17  8:15             ` Xu, Quan
  2016-06-17  8:40               ` Jan Beulich
  2016-06-22 15:54             ` Xu, Quan
  1 sibling, 1 reply; 17+ messages in thread
From: Xu, Quan @ 2016-06-17  8:15 UTC (permalink / raw)
  To: Jan Beulich; +Cc: dario.faggioli, Wu, Feng, Tian, Kevin, xen-devel

On June 17, 2016 3:01 PM, Jan Beulich <JBeulich@suse.com> wrote:
> >>> On 17.06.16 at 08:08, <quan.xu@intel.com> wrote:
> 
> > On June 16, 2016 5:05 PM, Jan Beulich <JBeulich@suse.com> wrote:
> >> >>> On 16.06.16 at 10:42, <quan.xu@intel.com> wrote:
> >> > On June 02, 2016 7:07 PM, Jan Beulich <JBeulich@suse.com> wrote:
> >> >> >>> On 01.06.16 at 11:05, <quan.xu@intel.com> wrote:
> >> >> > +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16
> did,
> >> >> > +                                         struct pci_ats_dev
> >> >> > +*ats_dev) {
> >> >> > +    struct domain *d = NULL;
> >> >> > +    struct pci_dev *pdev;
> >> >> > +
> >> >> > +    if ( test_bit(did, iommu->domid_bitmap) )
> >> >> > +        d = rcu_lock_domain_by_id(iommu->domid_map[did]);
> >> >> > +
> >> >> > +    /*
> >> >> > +     * In case the domain has been freed or the IOMMU domid bitmap
> is
> >> >> > +     * not valid, the device no longer belongs to this domain.
> >> >> > +     */
> >> >> > +    if ( d == NULL )
> >> >> > +        return;
> >> >> > +
> >> >> > +    pcidevs_lock();
> >> >> > +
> >> >> > +    for_each_pdev(d, pdev)
> >> >> > +    {
> >> >> > +        if ( (pdev->seg == ats_dev->seg) &&
> >> >> > +             (pdev->bus == ats_dev->bus) &&
> >> >> > +             (pdev->devfn == ats_dev->devfn) )
> >> >> > +        {
> >> >> > +            ASSERT(pdev->domain);
> >> >> > +            list_del(&pdev->domain_list);
> >> >> > +            pdev->domain = NULL;
> >> >> > +            pci_hide_existing_device(pdev);
> >> >> > +            break;
> >> >> > +        }
> >> >> > +    }
> >> >> > +
> >> >> > +    pcidevs_unlock();
> >> >>
> >> >> ... this loop (and locking). (Of course such a change may better
> >> >> be done in another preparatory patch.)
> >> >>
> >> >
> >> > To eliminate the locking?  I am afraid the locking is still a must
> >> > here even without the loop, also referring  to device_assigned()..
> >>
> >> If the entire loop gets eliminated, what would be left is
> >>
> >>     pcidevs_lock();
> >>     pcidevs_unlock();
> >>
> >> which I don't think does any good.
> >
> > Why? I can't follow it..
> 
> I don't understand your question. Can you explain what use above code
> sequence is, in your opinion? Or else - what does the "why"
> refer to?
> 

Ah, there may be a gap between us. without this loop,  these pdev operation should be still there, such as,


+    ASSERT(pdev->domain);
+    list_del(&pdev->domain_list);
+    pdev->domain = NULL;
+    pci_hide_existing_device(pdev);

So, the left is not only:
   pcidevs_lock();
   pcidevs_unlock();


> >> >> > +static int __must_check dev_invalidate_sync(struct iommu
> >> >> > +*iommu,
> >> >> > +u16
> >> >> did,
> >> >> > +                                            struct pci_ats_dev
> >> >> > +*ats_dev) {
> >> >> > +    struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);
> >> >> > +    int rc = 0;
> >> >> > +
> >> >> > +    if ( qi_ctrl->qinval_maddr )
> >> >> > +    {
> >> >> > +        rc = queue_invalidate_wait(iommu, 0, 1, 1, 1);
> >> >> > +
> >> >> > +        if ( rc == -ETIMEDOUT )
> >> >> > +            dev_invalidate_iotlb_timeout(iommu, did, ats_dev);
> >> >> > +    }
> >> >> > +
> >> >> > +    return rc;
> >> >> > +}
> >> >>
> >> >> I've never really understood why invalidate_sync() returns success
> >> >> when it didn't do anything. Now that you copy this same behavior
> >> >> here, I really need to ask you to explain that.
> >> >>
> >> >
> >> > It is acceptable to me, returning success when it didn't do
> >> > anything
> >> > -- this is worth reflection and criticism:(..
> >> > It is better:
> >> > +    if ( qi_ctrl->qinval_maddr )
> >> > +        ...
> >> > +    else
> >> > +        rc = -ENOENT;
> >>
> >> Right. And perhaps a separate patch to make invalidate_sync() do the
> same.
> >
> > Agreed.
> >
> >> Question is whether this really ought to be a conditional, or whether
> > instead
> >> this code is unreachable when qinval is not in use, in which case
> >> these conditionals would imo better be converted to ASSERT()s.
> >>
> >
> > IMO, this should be a conditional.
> > As mentioned below, strictly speaking, this is a bug. We can't
> > ASSERT() based on a bug..
> > A coming patch may fix it..
> 
> And again I don't understand: ASSERT()s are to verify assumed state. If static
> code analysis resulted in understanding a function is unreachable when
> qi_ctrl->qinval_maddr is zero (because qinval ought to have got disabled if any
> of the table setup failed), then adding ASSERT() would (a) document that and
> (b) allow to know quickly if something broke that assumption.
> 

You are right.  A separate patch does this.

> But then again I may simply misunderstand your wording: "We can't ASSERT()
> based on a bug" is really pretty unclear to me.
> 

I supposed the variable in ASSERT() is always true, but disable_qinval() needs to make
qi_ctrl->qinval_maddr  zero, but today it doesn't do this -- a bug.
With your explanation,  I got it now. Thanks.

Quan








_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue
  2016-06-17  8:15             ` Xu, Quan
@ 2016-06-17  8:40               ` Jan Beulich
  0 siblings, 0 replies; 17+ messages in thread
From: Jan Beulich @ 2016-06-17  8:40 UTC (permalink / raw)
  To: Quan Xu; +Cc: dario.faggioli, Feng Wu, Kevin Tian, xen-devel

>>> On 17.06.16 at 10:15, <quan.xu@intel.com> wrote:
> On June 17, 2016 3:01 PM, Jan Beulich <JBeulich@suse.com> wrote:
>> >>> On 17.06.16 at 08:08, <quan.xu@intel.com> wrote:
>> 
>> > On June 16, 2016 5:05 PM, Jan Beulich <JBeulich@suse.com> wrote:
>> >> >>> On 16.06.16 at 10:42, <quan.xu@intel.com> wrote:
>> >> > On June 02, 2016 7:07 PM, Jan Beulich <JBeulich@suse.com> wrote:
>> >> >> >>> On 01.06.16 at 11:05, <quan.xu@intel.com> wrote:
>> >> >> > +static void dev_invalidate_iotlb_timeout(struct iommu *iommu, u16
>> did,
>> >> >> > +                                         struct pci_ats_dev
>> >> >> > +*ats_dev) {
>> >> >> > +    struct domain *d = NULL;
>> >> >> > +    struct pci_dev *pdev;
>> >> >> > +
>> >> >> > +    if ( test_bit(did, iommu->domid_bitmap) )
>> >> >> > +        d = rcu_lock_domain_by_id(iommu->domid_map[did]);
>> >> >> > +
>> >> >> > +    /*
>> >> >> > +     * In case the domain has been freed or the IOMMU domid bitmap
>> is
>> >> >> > +     * not valid, the device no longer belongs to this domain.
>> >> >> > +     */
>> >> >> > +    if ( d == NULL )
>> >> >> > +        return;
>> >> >> > +
>> >> >> > +    pcidevs_lock();
>> >> >> > +
>> >> >> > +    for_each_pdev(d, pdev)
>> >> >> > +    {
>> >> >> > +        if ( (pdev->seg == ats_dev->seg) &&
>> >> >> > +             (pdev->bus == ats_dev->bus) &&
>> >> >> > +             (pdev->devfn == ats_dev->devfn) )
>> >> >> > +        {
>> >> >> > +            ASSERT(pdev->domain);
>> >> >> > +            list_del(&pdev->domain_list);
>> >> >> > +            pdev->domain = NULL;
>> >> >> > +            pci_hide_existing_device(pdev);
>> >> >> > +            break;
>> >> >> > +        }
>> >> >> > +    }
>> >> >> > +
>> >> >> > +    pcidevs_unlock();
>> >> >>
>> >> >> ... this loop (and locking). (Of course such a change may better
>> >> >> be done in another preparatory patch.)
>> >> >>
>> >> >
>> >> > To eliminate the locking?  I am afraid the locking is still a must
>> >> > here even without the loop, also referring  to device_assigned()..
>> >>
>> >> If the entire loop gets eliminated, what would be left is
>> >>
>> >>     pcidevs_lock();
>> >>     pcidevs_unlock();
>> >>
>> >> which I don't think does any good.
>> >
>> > Why? I can't follow it..
>> 
>> I don't understand your question. Can you explain what use above code
>> sequence is, in your opinion? Or else - what does the "why"
>> refer to?
>> 
> 
> Ah, there may be a gap between us. without this loop,  these pdev operation 
> should be still there, such as,
> 
> 
> +    ASSERT(pdev->domain);
> +    list_del(&pdev->domain_list);
> +    pdev->domain = NULL;
> +    pci_hide_existing_device(pdev);
> 
> So, the left is not only:
>    pcidevs_lock();
>    pcidevs_unlock();

Oh, indeed. My bad.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue
  2016-06-17  7:00           ` Jan Beulich
  2016-06-17  8:15             ` Xu, Quan
@ 2016-06-22 15:54             ` Xu, Quan
  2016-06-22 16:18               ` Jan Beulich
  1 sibling, 1 reply; 17+ messages in thread
From: Xu, Quan @ 2016-06-22 15:54 UTC (permalink / raw)
  To: Jan Beulich; +Cc: dario.faggioli, Wu, Feng, Tian, Kevin, xen-devel

On June 17, 2016 3:01 PM, Jan Beulich <JBeulich@suse.com> wrote:
> And again I don't understand: ASSERT()s are to verify assumed state. If static
> code analysis resulted in understanding a function is unreachable when
> qi_ctrl->qinval_maddr is zero (because qinval ought to have got disabled if any
> of the table setup failed), then adding ASSERT() would (a) document that and
> (b) allow to know quickly if something broke that assumption.

other than enable_qinval() -- yes, I need to convert conditionals of qi_ctrl->qinval_maddr into
 ASSERT()s..
But in enable_qinval(), I am still not quite sure whether I need to convert these conditionals of
 qi_ctrl->qinval_maddr into ASSERT()s or not.

Quan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue
  2016-06-22 15:54             ` Xu, Quan
@ 2016-06-22 16:18               ` Jan Beulich
  2016-06-23  2:08                 ` Xu, Quan
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Beulich @ 2016-06-22 16:18 UTC (permalink / raw)
  To: Quan Xu; +Cc: dario.faggioli, Feng Wu, Kevin Tian, xen-devel

>>> On 22.06.16 at 17:54, <quan.xu@intel.com> wrote:
> On June 17, 2016 3:01 PM, Jan Beulich <JBeulich@suse.com> wrote:
>> And again I don't understand: ASSERT()s are to verify assumed state. If 
> static
>> code analysis resulted in understanding a function is unreachable when
>> qi_ctrl->qinval_maddr is zero (because qinval ought to have got disabled if 
> any
>> of the table setup failed), then adding ASSERT() would (a) document that and
>> (b) allow to know quickly if something broke that assumption.
> 
> other than enable_qinval() -- yes, I need to convert conditionals of 
> qi_ctrl->qinval_maddr into
>  ASSERT()s..
> But in enable_qinval(), I am still not quite sure whether I need to convert 
> these conditionals of
>  qi_ctrl->qinval_maddr into ASSERT()s or not.

No, I don't think you want to so there - you'd bring the system down
in case of an actual initialization error. ASSERT()s should only be
used on conditions controlled entirely by the hypervisor.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue
  2016-06-22 16:18               ` Jan Beulich
@ 2016-06-23  2:08                 ` Xu, Quan
  0 siblings, 0 replies; 17+ messages in thread
From: Xu, Quan @ 2016-06-23  2:08 UTC (permalink / raw)
  To: Jan Beulich; +Cc: dario.faggioli, Wu, Feng, Tian, Kevin, xen-devel

On June 23, 2016 12:18 AM, Jan Beulich <JBeulich@suse.com> wrote:
> >>> On 22.06.16 at 17:54, <quan.xu@intel.com> wrote:
> > On June 17, 2016 3:01 PM, Jan Beulich <JBeulich@suse.com> wrote:
> >> And again I don't understand: ASSERT()s are to verify assumed state.
> >> If
> > static
> >> code analysis resulted in understanding a function is unreachable
> >> when qi_ctrl->qinval_maddr is zero (because qinval ought to have got
> >> disabled if
> > any
> >> of the table setup failed), then adding ASSERT() would (a) document
> >> that and
> >> (b) allow to know quickly if something broke that assumption.
> >
> > other than enable_qinval() -- yes, I need to convert conditionals of
> > qi_ctrl->qinval_maddr into  ASSERT()s..
> > But in enable_qinval(), I am still not quite sure whether I need to
> > convert these conditionals of  qi_ctrl->qinval_maddr into ASSERT()s or
> > not.
> 
> No, I don't think you want to so there - you'd bring the system down in case
> of an actual initialization error. ASSERT()s should only be used on conditions
> controlled entirely by the hypervisor.
> 

Jan, thank you. Now I am clear.

Quan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2016-06-23  2:08 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-01  9:05 [Patch v11 0/3] VT-d Device-TLB flush issue Xu, Quan
2016-06-01  9:05 ` [Patch v11 1/3] IOMMU: add a timeout parameter for device IOTLB invalidation Xu, Quan
2016-06-02 10:24   ` Jan Beulich
2016-06-15  2:55     ` Xu, Quan
2016-06-01  9:05 ` [Patch v11 2/3] vt-d: synchronize for Device-TLB flush one by one Xu, Quan
2016-06-02 10:49   ` Jan Beulich
2016-06-01  9:05 ` [Patch v11 3/3] vt-d: fix vt-d Device-TLB flush timeout issue Xu, Quan
2016-06-02 11:07   ` Jan Beulich
2016-06-16  8:42     ` Xu, Quan
2016-06-16  9:04       ` Jan Beulich
2016-06-17  6:08         ` Xu, Quan
2016-06-17  7:00           ` Jan Beulich
2016-06-17  8:15             ` Xu, Quan
2016-06-17  8:40               ` Jan Beulich
2016-06-22 15:54             ` Xu, Quan
2016-06-22 16:18               ` Jan Beulich
2016-06-23  2:08                 ` Xu, Quan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).