* [PATCH v5 0/4] mitigate the per-pCPU blocking list may be too long
@ 2017-08-16  5:14 Chao Gao
  2017-08-16  5:14 ` [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list Chao Gao
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Chao Gao @ 2017-08-16  5:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Wei Liu, Jan Beulich, George Dunlap, Andrew Cooper,
	Ian Jackson, Jun Nakajima, Chao Gao

Changes in v5:
 - In patch 1, add a sanity check of the vcpu count on the pi blocking list
   and also drop George's Reviewed-by.
 - In patch 3, introduce a new function to find a proper pCPU to accept
   the blocked vcpu.
 - In patch 4, add support for tracking operations on the pi blocking list
   and for generating a scatterplot of the pi list length.

VT-d PI introduces a per-pCPU blocking list to track the vCPUs blocked
on a given pCPU. Theoretically, a single host can have 32K domains with
128 vCPUs each. If all vCPUs were blocked on the same pCPU, 4M vCPUs
would sit on the same list, and traversing such a list consumes far too
much time. More discussion can be found in [1,2,3].

To mitigate this issue, this series puts vCPUs on another pCPU's list
when the local pCPU's list length reaches an upper bound, namely the
average number of vCPUs per pCPU plus a constant.
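
As a rough illustration, the placement decision at block time boils down
to the sketch below. This is a simplification of what PATCH 3/4 actually
implements (names are taken from the series; see pi_over_limit() and
pi_get_blocking_cpu() there for the real logic, fast path and locking
included):

    /* Simplified sketch of the policy in PATCH 3/4. */
    #define PI_LIST_FIXED_LIMIT 128            /* the constant K */

    static bool pi_over_limit(unsigned int cpu)
    {
        /* limit = (total HVM vcpus / online pcpus) + K */
        return per_cpu(vmx_pi_blocking, cpu).counter >
               atomic_read(&num_hvm_vcpus) / num_online_cpus() +
               PI_LIST_FIXED_LIMIT;
    }

    /* On block: stay on the local pcpu unless its list is over the limit. */
    unsigned int pi_cpu = v->processor;     /* v: the vcpu about to block */

    while ( pi_over_limit(pi_cpu) )
        pi_cpu = cpumask_cycle(pi_cpu, &cpu_online_map);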

PATCH 1/4 adds a counter to track the per-pCPU blocking list's length.

PATCH 2/4 uses a global variable to track how many HVM vCPUs exist on
the system. It is used to calculate the average number of vCPUs per pCPU.

PATCH 3/4 employs a policy to restrict the vCPU count on a given
pCPU's pi blocking list in case the list grows too long. In short,
if the list length is smaller than the upper bound, the vCPU is added to
the pi blocking list of the pCPU it is running on. Otherwise,
another online pCPU is chosen to accept the vCPU.

PATCH 4/4 adds some relevant events to xentrace to aid analysis of
the list length. With this patch, data can be gathered to help
validate PATCH 3/4.

[1] https://lists.gt.net/xen/devel/422661?search_string=VT-d%20posted-interrupt%20core%20logic%20handling;#422661
[2] https://lists.gt.net/xen/devel/422567?search_string=%20The%20length%20of%20the%20list%20depends;#422567
[3] https://lists.gt.net/xen/devel/472749?search_string=enable%20vt-d%20pi%20by%20default;#472749

Chao Gao (4):
  VT-d PI: track the number of vcpus on pi blocking list
  x86/vcpu: track hvm vcpu number on the system
  VT-d PI: restrict the number of vcpus in a given pcpu's PI blocking
    list
  xentrace: add support for HVM's PI blocking list operation

 tools/xentrace/formats        |   2 +
 tools/xentrace/xenalyze.c     | 116 +++++++++++++++++++++++++++++
 xen/arch/x86/hvm/hvm.c        |   6 ++
 xen/arch/x86/hvm/vmx/vmx.c    | 166 ++++++++++++++++++++++++++++++++++++------
 xen/include/asm-x86/hvm/hvm.h |   3 +
 xen/include/public/trace.h    |   5 ++
 6 files changed, 275 insertions(+), 23 deletions(-)

-- 
1.8.3.1



* [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
  2017-08-16  5:14 [PATCH v5 0/4] mitigate the per-pCPU blocking list may be too long Chao Gao
@ 2017-08-16  5:14 ` Chao Gao
  2017-08-30 16:00   ` Jan Beulich
  2017-08-16  5:14 ` [PATCH v5 2/4] x86/vcpu: track hvm vcpu number on the system Chao Gao
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 19+ messages in thread
From: Chao Gao @ 2017-08-16  5:14 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, Jan Beulich, Chao Gao

This patch adds a field, 'counter', to struct vmx_pi_blocking_vcpu to track
how many entries are on the pi blocking list.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
v5:
 - introduce two functions for adding or removing vcpus from the pi blocking
   list.
 - add a sanity check of the vcpu count on the pi blocking list
v4:
 - non-trace part of Patch 1 in v3

---
 xen/arch/x86/hvm/vmx/vmx.c | 42 ++++++++++++++++++++++++++++++++----------
 1 file changed, 32 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 67fc85b..bf17988 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -83,6 +83,7 @@ static int vmx_vmfunc_intercept(struct cpu_user_regs *regs);
 struct vmx_pi_blocking_vcpu {
     struct list_head     list;
     spinlock_t           lock;
+    unsigned int         counter;
 };
 
 /*
@@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu)
     spin_lock_init(&per_cpu(vmx_pi_blocking, cpu).lock);
 }
 
+static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv,
+                            struct vmx_pi_blocking_vcpu *vpbv)
+{
+    ASSERT(spin_is_locked(&vpbv->lock));
+    add_sized(&vpbv->counter, 1);
+    ASSERT(read_atomic(&vpbv->counter));
+    list_add_tail(&pbv->list, &vpbv->list);
+}
+
+static void vmx_pi_del_vcpu(struct pi_blocking_vcpu *pbv,
+                            struct vmx_pi_blocking_vcpu *vpbv)
+{
+    ASSERT(spin_is_locked(&vpbv->lock));
+    ASSERT(read_atomic(&vpbv->counter));
+    list_del(&pbv->list);
+    add_sized(&vpbv->counter, -1);
+}
+
 static void vmx_vcpu_block(struct vcpu *v)
 {
     unsigned long flags;
@@ -120,8 +139,8 @@ static void vmx_vcpu_block(struct vcpu *v)
      */
     ASSERT(old_lock == NULL);
 
-    list_add_tail(&v->arch.hvm_vmx.pi_blocking.list,
-                  &per_cpu(vmx_pi_blocking, v->processor).list);
+    vmx_pi_add_vcpu(&v->arch.hvm_vmx.pi_blocking,
+                    &per_cpu(vmx_pi_blocking, v->processor));
     spin_unlock_irqrestore(pi_blocking_list_lock, flags);
 
     ASSERT(!pi_test_sn(pi_desc));
@@ -186,7 +205,9 @@ static void vmx_pi_unblock_vcpu(struct vcpu *v)
     if ( v->arch.hvm_vmx.pi_blocking.lock != NULL )
     {
         ASSERT(v->arch.hvm_vmx.pi_blocking.lock == pi_blocking_list_lock);
-        list_del(&v->arch.hvm_vmx.pi_blocking.list);
+        vmx_pi_del_vcpu(&v->arch.hvm_vmx.pi_blocking,
+                        container_of(pi_blocking_list_lock,
+                                     struct vmx_pi_blocking_vcpu, lock));
         v->arch.hvm_vmx.pi_blocking.lock = NULL;
     }
 
@@ -234,7 +255,7 @@ void vmx_pi_desc_fixup(unsigned int cpu)
          */
         if ( pi_test_on(&vmx->pi_desc) )
         {
-            list_del(&vmx->pi_blocking.list);
+            vmx_pi_del_vcpu(&vmx->pi_blocking, &per_cpu(vmx_pi_blocking, cpu));
             vmx->pi_blocking.lock = NULL;
             vcpu_unblock(container_of(vmx, struct vcpu, arch.hvm_vmx));
         }
@@ -257,8 +278,9 @@ void vmx_pi_desc_fixup(unsigned int cpu)
             write_atomic(&vmx->pi_desc.ndst,
                          x2apic_enabled ? dest : MASK_INSR(dest, PI_xAPIC_NDST_MASK));
 
-            list_move(&vmx->pi_blocking.list,
-                      &per_cpu(vmx_pi_blocking, new_cpu).list);
+            vmx_pi_del_vcpu(&vmx->pi_blocking, &per_cpu(vmx_pi_blocking, cpu));
+            vmx_pi_add_vcpu(&vmx->pi_blocking, &per_cpu(vmx_pi_blocking,
+                                                        new_cpu));
             vmx->pi_blocking.lock = new_lock;
 
             spin_unlock(new_lock);
@@ -2351,9 +2373,9 @@ static struct hvm_function_table __initdata vmx_function_table = {
 static void pi_wakeup_interrupt(struct cpu_user_regs *regs)
 {
     struct arch_vmx_struct *vmx, *tmp;
-    spinlock_t *lock = &per_cpu(vmx_pi_blocking, smp_processor_id()).lock;
-    struct list_head *blocked_vcpus =
-		&per_cpu(vmx_pi_blocking, smp_processor_id()).list;
+    unsigned int cpu = smp_processor_id();
+    spinlock_t *lock = &per_cpu(vmx_pi_blocking, cpu).lock;
+    struct list_head *blocked_vcpus = &per_cpu(vmx_pi_blocking, cpu).list;
 
     ack_APIC_irq();
     this_cpu(irq_count)++;
@@ -2369,7 +2391,7 @@ static void pi_wakeup_interrupt(struct cpu_user_regs *regs)
     {
         if ( pi_test_on(&vmx->pi_desc) )
         {
-            list_del(&vmx->pi_blocking.list);
+            vmx_pi_del_vcpu(&vmx->pi_blocking, &per_cpu(vmx_pi_blocking, cpu));
             ASSERT(vmx->pi_blocking.lock == lock);
             vmx->pi_blocking.lock = NULL;
             vcpu_unblock(container_of(vmx, struct vcpu, arch.hvm_vmx));
-- 
1.8.3.1



* [PATCH v5 2/4] x86/vcpu: track hvm vcpu number on the system
  2017-08-16  5:14 [PATCH v5 0/4] mitigate the per-pCPU blocking list may be too long Chao Gao
  2017-08-16  5:14 ` [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list Chao Gao
@ 2017-08-16  5:14 ` Chao Gao
  2017-08-16  5:14 ` [PATCH v5 3/4] VT-d PI: restrict the number of vcpus in a given pcpu's PI blocking list Chao Gao
  2017-08-16  5:14 ` [PATCH v5 4/4] xentrace: add support for HVM's PI blocking list operation Chao Gao
  3 siblings, 0 replies; 19+ messages in thread
From: Chao Gao @ 2017-08-16  5:14 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Jan Beulich, Chao Gao

This number is used to calculate the average number of vCPUs per pCPU.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
v4:
 - move the place where we increase/decrease the hvm vcpu number to
 hvm_vcpu_{initialise, destroy}

---
 xen/arch/x86/hvm/hvm.c        | 6 ++++++
 xen/include/asm-x86/hvm/hvm.h | 3 +++
 2 files changed, 9 insertions(+)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 555133f..37afdb4 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -109,6 +109,9 @@ static const char __initconst warning_hvm_fep[] =
 static bool_t __initdata opt_altp2m_enabled = 0;
 boolean_param("altp2m", opt_altp2m_enabled);
 
+/* Total number of HVM vCPUs on this system */
+atomic_t num_hvm_vcpus;
+
 static int cpu_callback(
     struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
@@ -1511,6 +1514,7 @@ int hvm_vcpu_initialise(struct vcpu *v)
 
     hvm_update_guest_vendor(v);
 
+    atomic_inc(&num_hvm_vcpus);
     return 0;
 
  fail6:
@@ -1529,6 +1533,8 @@ int hvm_vcpu_initialise(struct vcpu *v)
 
 void hvm_vcpu_destroy(struct vcpu *v)
 {
+    atomic_dec(&num_hvm_vcpus);
+
     viridian_vcpu_deinit(v);
 
     hvm_all_ioreq_servers_remove_vcpu(v->domain, v);
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index b687e03..c51bd9f 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -25,6 +25,7 @@
 #include <asm/hvm/asid.h>
 #include <public/domctl.h>
 #include <public/hvm/save.h>
+#include <xen/atomic.h>
 #include <xen/mm.h>
 
 #ifdef CONFIG_HVM_FEP
@@ -233,6 +234,8 @@ extern bool_t hvm_enabled;
 extern bool_t cpu_has_lmsl;
 extern s8 hvm_port80_allowed;
 
+extern atomic_t num_hvm_vcpus;
+
 extern const struct hvm_function_table *start_svm(void);
 extern const struct hvm_function_table *start_vmx(void);
 
-- 
1.8.3.1



* [PATCH v5 3/4] VT-d PI: restrict the number of vcpus in a given pcpu's PI blocking list
  2017-08-16  5:14 [PATCH v5 0/4] mitigate the per-pCPU blocking list may be too long Chao Gao
  2017-08-16  5:14 ` [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list Chao Gao
  2017-08-16  5:14 ` [PATCH v5 2/4] x86/vcpu: track hvm vcpu number on the system Chao Gao
@ 2017-08-16  5:14 ` Chao Gao
  2017-08-31 16:01   ` Jan Beulich
  2017-08-16  5:14 ` [PATCH v5 4/4] xentrace: add support for HVM's PI blocking list operation Chao Gao
  3 siblings, 1 reply; 19+ messages in thread
From: Chao Gao @ 2017-08-16  5:14 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, Jan Beulich, Chao Gao

Currently, a blocked vCPU is put on its pCPU's pi blocking list. If
too many vCPUs are blocked on a given pCPU, the list can grow too long.
A simple analysis: with 32K domains and 128 vCPUs per domain, about 4M
vCPUs may be blocked on one pCPU's PI blocking list. When a wakeup
interrupt arrives, the list is traversed to wake up vCPUs which have
events pending. In that case the traversal would consume far too much
time.

To mitigate this issue, this patch limits the number of vCPUs tracked by a
given pCPU's blocking list, taking factors such as the performance of the
common case, the current hvm vCPU count and the current pCPU count into
consideration. With this method, the common case stays fast and, for the
extreme cases, the list length is kept under control.

With this patch, when a vCPU is about to block, we check whether the pi
blocking list of the pCPU the vCPU is running on exceeds the limit, which
is the average number of vCPUs per pCPU plus a constant. If not, the vCPU
is added to this pCPU's pi blocking list. Otherwise, another online pCPU
is chosen to accept the vCPU.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
v5:
 - Introduce a function to choose a suitable pcpu to accept the blocked
 vcpu.
v4:
 - use a new lock to avoid adding a blocked vcpu to an offline pcpu's blocking
 list.

---
 xen/arch/x86/hvm/vmx/vmx.c | 109 ++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 97 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index bf17988..646f409 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -119,16 +119,85 @@ static void vmx_pi_del_vcpu(struct pi_blocking_vcpu *pbv,
     add_sized(&vpbv->counter, -1);
 }
 
+/*
+ * By default, the local pcpu (namely, the one the vcpu is currently running on)
+ * is chosen as the destination of wakeup interrupt. But if the number of vcpus
+ * in the default pcpu's PI blocking list exceeds a limit, another suitable
+ * pcpu is chosen as the destination by iterating through all online pcpus.
+ *
+ * Currently, choose (v_tot/p_tot) + K as the limit of vcpus, where
+ * v_tot is the total number of hvm vcpus on the system, p_tot is the total
+ * number of pcpus in the system, and K is a fixed number. An experiment on a
+ * skylake server which has 112 cpus and 64G memory shows the maximum time of
+ * waking up a vcpu from a 128-entry blocking list is about 22us, which is
+ * tolerable. So choose 128 as the fixed number K.
+ *
+ * This policy makes sure:
+ * 1) for common cases, the limit won't be reached and the local pcpu is used
+ * which is beneficial to performance (at least, avoid an IPI when unblocking
+ * vcpu).
+ * 2) for the worst case, the blocking list length scales with the vcpu count
+ * divided by the pcpu count.
+ */
+#define PI_LIST_FIXED_LIMIT 128
+
+static inline bool pi_over_limit(unsigned int cpu)
+{
+    /* Compare w/ constant first to save a division and an add */
+    if ( likely(read_atomic(&per_cpu(vmx_pi_blocking, cpu).counter) <=
+                PI_LIST_FIXED_LIMIT) )
+        return 0;
+    else
+        return read_atomic(&per_cpu(vmx_pi_blocking, cpu).counter) >=
+                           (atomic_read(&num_hvm_vcpus) / num_online_cpus()) +
+                           PI_LIST_FIXED_LIMIT;
+}
+
+/*
+ * Start from @cpu and iterate cpu_online_map to look for one cpu whose
+ * blocking list length is under limit. Return with holding a lock to avoid
+ * others adding entries to the chosen cpu.
+ * There must be at least one suitable cpu for the limit is greater than the
+ * average number of all cpus' blocking list length.
+ */
+static unsigned int pi_get_blocking_cpu(unsigned int cpu, unsigned long *flags)
+{
+    spinlock_t *pi_blocking_list_lock;
+
+    for ( ; ; )
+    {
+        while ( unlikely(pi_over_limit(cpu)) )
+            cpu = cpumask_cycle(cpu, &cpu_online_map);
+
+        pi_blocking_list_lock = &per_cpu(vmx_pi_blocking, cpu).lock;
+        if ( flags )
+            spin_lock_irqsave(pi_blocking_list_lock, *flags);
+        else
+            spin_lock(pi_blocking_list_lock);
+        /*
+         * check again in case the list length exceeds the limit during taking
+         * the lock
+         */
+        if ( !pi_over_limit(cpu) )
+            break;
+        else if ( flags )
+            spin_unlock_irqrestore(pi_blocking_list_lock, *flags);
+        else
+            spin_unlock(pi_blocking_list_lock);
+    }
+
+    return cpu;
+}
+
 static void vmx_vcpu_block(struct vcpu *v)
 {
     unsigned long flags;
-    unsigned int dest;
-    spinlock_t *old_lock;
-    spinlock_t *pi_blocking_list_lock =
-		&per_cpu(vmx_pi_blocking, v->processor).lock;
+    unsigned int dest, pi_cpu;
+    spinlock_t *old_lock, *pi_blocking_list_lock;
     struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
 
-    spin_lock_irqsave(pi_blocking_list_lock, flags);
+    pi_cpu = pi_get_blocking_cpu(v->processor, &flags);
+    pi_blocking_list_lock = &per_cpu(vmx_pi_blocking, pi_cpu).lock;
     old_lock = cmpxchg(&v->arch.hvm_vmx.pi_blocking.lock, NULL,
                        pi_blocking_list_lock);
 
@@ -140,7 +209,7 @@ static void vmx_vcpu_block(struct vcpu *v)
     ASSERT(old_lock == NULL);
 
     vmx_pi_add_vcpu(&v->arch.hvm_vmx.pi_blocking,
-                    &per_cpu(vmx_pi_blocking, v->processor));
+                    &per_cpu(vmx_pi_blocking, pi_cpu));
     spin_unlock_irqrestore(pi_blocking_list_lock, flags);
 
     ASSERT(!pi_test_sn(pi_desc));
@@ -149,6 +218,19 @@ static void vmx_vcpu_block(struct vcpu *v)
 
     ASSERT(pi_desc->ndst ==
            (x2apic_enabled ? dest : MASK_INSR(dest, PI_xAPIC_NDST_MASK)));
+    if ( unlikely(pi_cpu != v->processor) )
+    {
+        /*
+         * The vcpu is put to another pCPU's blocking list. Change 'NDST'
+         * field to that pCPU to make sure it can wake up the vcpu when an
+         * interrupt arrives. The 'NDST' field will be set to the pCPU which
+         * the vcpu is running on during VM-Entry, seeing
+         * vmx_pi_unblock_vcpu().
+         */
+        dest = cpu_physical_id(pi_cpu);
+        write_atomic(&pi_desc->ndst,
+                  x2apic_enabled ? dest : MASK_INSR(dest, PI_xAPIC_NDST_MASK));
+    }
 
     write_atomic(&pi_desc->nv, pi_wakeup_vector);
 }
@@ -179,13 +261,17 @@ static void vmx_pi_unblock_vcpu(struct vcpu *v)
     unsigned long flags;
     spinlock_t *pi_blocking_list_lock;
     struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
+    unsigned int dest = cpu_physical_id(v->processor);
 
     /*
-     * Set 'NV' field back to posted_intr_vector, so the
-     * Posted-Interrupts can be delivered to the vCPU when
-     * it is running in non-root mode.
+     * Set 'NV' field back to posted_intr_vector and 'NDST' field to the pCPU
+     * where the vcpu is running (for this field may now point to another
+     * pCPU), so the Posted-Interrupts can be delivered to the vCPU when it
+     * is running in non-root mode.
      */
     write_atomic(&pi_desc->nv, posted_intr_vector);
+    write_atomic(&pi_desc->ndst,
+                 x2apic_enabled ? dest : MASK_INSR(dest, PI_xAPIC_NDST_MASK));
 
     pi_blocking_list_lock = v->arch.hvm_vmx.pi_blocking.lock;
 
@@ -261,17 +347,16 @@ void vmx_pi_desc_fixup(unsigned int cpu)
         }
         else
         {
+            new_cpu = cpumask_cycle(cpu, &cpu_online_map);
             /*
              * We need to find an online cpu as the NDST of the PI descriptor, it
              * doesn't matter whether it is within the cpupool of the domain or
              * not. As long as it is online, the vCPU will be woken up once the
              * notification event arrives.
              */
-            new_cpu = cpumask_any(&cpu_online_map);
+            new_cpu = pi_get_blocking_cpu(new_cpu, NULL);
             new_lock = &per_cpu(vmx_pi_blocking, new_cpu).lock;
 
-            spin_lock(new_lock);
-
             ASSERT(vmx->pi_blocking.lock == old_lock);
 
             dest = cpu_physical_id(new_cpu);
-- 
1.8.3.1



* [PATCH v5 4/4] xentrace: add support for HVM's PI blocking list operation
  2017-08-16  5:14 [PATCH v5 0/4] mitigate the per-pCPU blocking list may be too long Chao Gao
                   ` (2 preceding siblings ...)
  2017-08-16  5:14 ` [PATCH v5 3/4] VT-d PI: restrict the number of vcpus in a given pcpu's PI blocking list Chao Gao
@ 2017-08-16  5:14 ` Chao Gao
  3 siblings, 0 replies; 19+ messages in thread
From: Chao Gao @ 2017-08-16  5:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Stefano Stabellini, Wei Liu, Jun Nakajima,
	George Dunlap, Andrew Cooper, Tim Deegan, Jan Beulich,
	Ian Jackson, Chao Gao

In order to analyze the frequency of PI blocking list operations and obtain
the list length, add some relevant events to xentrace and some associated
code to xenalyze.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
v5:
 - Put pi list operations under HW events and get rid of the ASYNC stuff
 - generate a scatterplot of the pi list length on each pcpu to aid
 analysis.
v4:
 - trace part of Patch 1 in v3

---
 tools/xentrace/formats     |   2 +
 tools/xentrace/xenalyze.c  | 116 +++++++++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/vmx/vmx.c |  17 ++++++-
 xen/include/public/trace.h |   5 ++
 4 files changed, 138 insertions(+), 2 deletions(-)

diff --git a/tools/xentrace/formats b/tools/xentrace/formats
index c1f584f..e926a18 100644
--- a/tools/xentrace/formats
+++ b/tools/xentrace/formats
@@ -205,6 +205,8 @@
 0x00802006  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  assign_vector [ irq = %(1)d = vector 0x%(2)x, CPU mask: 0x%(3)08x ]
 0x00802007  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  bogus_vector [ 0x%(1)x ]
 0x00802008  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  do_irq [ irq = %(1)d, began = %(2)dus, ended = %(3)dus ]
+0x00804001  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  pi_list_add [ domid = 0x%(1)04x vcpu = 0x%(2)04x, pcpu = 0x%(3)04x, #entry = 0x%(4)04x ]
+0x00804002  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  pi_list_del [ domid = 0x%(1)04x vcpu = 0x%(2)04x ]
 
 0x00084001  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  hpet create [ tn = %(1)d, irq = %(2)d, delta = 0x%(4)08x%(3)08x, period = 0x%(6)08x%(5)08x ]
 0x00084002  CPU%(cpu)d  %(tsc)d (+%(reltsc)8d)  pit create [ delta = 0x%(1)016x, period = 0x%(2)016x ]
diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c
index 24cce2a..2276a23 100644
--- a/tools/xentrace/xenalyze.c
+++ b/tools/xentrace/xenalyze.c
@@ -159,6 +159,7 @@ struct {
         scatterplot_extint_cycles:1,
         scatterplot_rdtsc:1,
         scatterplot_irq:1,
+        scatterplot_pi_list:1,
         histogram_interrupt_eip:1,
         interval_mode:1,
         dump_all:1,
@@ -233,6 +234,7 @@ struct {
     .scatterplot_extint_cycles=0,
     .scatterplot_rdtsc=0,
     .scatterplot_irq=0,
+    .scatterplot_pi_list=0,
     .histogram_interrupt_eip=0,
     .dump_all = 0,
     .dump_raw_process = 0,
@@ -1391,6 +1393,9 @@ struct hvm_data {
 
     /* Historical info */
     tsc_t last_rdtsc;
+
+    /* Destination pcpu of posted interrupt's wakeup interrupt */
+    int pi_cpu;
 };
 
 enum {
@@ -1457,6 +1462,8 @@ void init_hvm_data(struct hvm_data *h, struct vcpu_data *v) {
     }
     for(i=0; i<GUEST_INTERRUPT_MAX+1; i++)
         h->summary.guest_interrupt[i].count=0;
+
+    h->pi_cpu = -1;
 }
 
 /* PV data */
@@ -1852,6 +1859,9 @@ struct pcpu_info {
         tsc_t tsc;
         struct cycle_summary idle, running, lost;
     } time;
+
+    /* Posted Interrupt List Length */
+    int pi_list_length;
 };
 
 void __fill_in_record_info(struct pcpu_info *p);
@@ -8581,8 +8591,97 @@ void irq_process(struct pcpu_info *p) {
     }
 }
 
+static void process_pi_list_add(struct record_info *ri)
+{
+    struct {
+        int did;
+        int vid;
+        int pcpu;
+        int len;
+    } *data = (typeof(data))ri->d;
+    struct vcpu_data *v;
+
+    v = vcpu_find(data->did, data->vid);
+    if ( !v->hvm.init )
+        init_hvm_data(&v->hvm, v);
+
+    if ( opt.dump_all )
+        printf("d%uv%u is added to pi blocking list of pcpu%u. "
+               "The list length is now %d\n",
+               data->did, data->vid, data->pcpu, data->len);
+
+    v->hvm.pi_cpu = data->pcpu;
+    /* Calibrate pi list length */
+    P.pcpu[data->pcpu].pi_list_length = data->len;
+
+    if ( opt.scatterplot_pi_list )
+    {
+        struct time_struct t;
+
+        abs_cycles_to_time(ri->tsc, &t);
+        printf("%d %u.%09u %d\n", data->pcpu, t.s, t.ns,
+               P.pcpu[data->pcpu].pi_list_length);
+    }
+}
+
+static void process_pi_list_del(struct record_info *ri)
+{
+    struct {
+        int did;
+        int vid;
+    } *data = (typeof(data))ri->d;
+    struct vcpu_data *v;
+
+    v = vcpu_find(data->did, data->vid);
+    if ( !v->hvm.init )
+        init_hvm_data(&v->hvm, v);
+
+    if ( opt.dump_all )
+    {
+        if ( v->hvm.pi_cpu != -1 )
+            printf("d%uv%u is removed from pi blocking list of pcpu%u\n",
+                   data->did, data->vid, v->hvm.pi_cpu);
+        else
+            printf("d%uv%u is removed from pi blocking list\n",
+                   data->did, data->vid);
+    }
+
+    if ( (v->hvm.pi_cpu != -1) && (P.pcpu[v->hvm.pi_cpu].pi_list_length != -1) )
+    {
+        P.pcpu[v->hvm.pi_cpu].pi_list_length--;
+
+        if ( opt.scatterplot_pi_list )
+        {
+            struct time_struct t;
+
+            abs_cycles_to_time(ri->tsc, &t);
+            printf("%d %u.%09u %d\n", v->hvm.pi_cpu, t.s, t.ns,
+                   P.pcpu[v->hvm.pi_cpu].pi_list_length);
+        }
+    }
+    v->hvm.pi_cpu = -1;
+}
+
+
+static void vtd_process(struct pcpu_info *p) {
+    struct record_info *ri = &p->ri;
+
+    switch (ri->event)
+    {
+    case TRC_VTD_PI_LIST_ADD:
+        process_pi_list_add(ri);
+        break;
+    case TRC_VTD_PI_LIST_DEL:
+        process_pi_list_del(ri);
+        break;
+    default:
+        process_generic(&p->ri);
+    }
+}
+
 #define TRC_HW_SUB_PM 1
 #define TRC_HW_SUB_IRQ 2
+#define TRC_HW_SUB_VTD 4
 void hw_process(struct pcpu_info *p)
 {
     struct record_info *ri = &p->ri;
@@ -8595,6 +8694,11 @@ void hw_process(struct pcpu_info *p)
     case TRC_HW_SUB_IRQ:
         irq_process(p);
         break;
+    case TRC_HW_SUB_VTD:
+        vtd_process(p);
+        break;
+    default:
+        process_generic(&p->ri);
     }
 
 }
@@ -9027,6 +9131,7 @@ off_t scan_for_new_pcpu(off_t offset) {
 
         p->file_offset = offset;
         p->next_cpu_change_offset = offset;
+        p->pi_list_length = -1;
 
         record_order_insert(p);
 
@@ -9255,6 +9360,7 @@ void process_cpu_change(struct pcpu_info *p) {
         p2->ri.d = p2->ri.rec.u.notsc.data;
         p2->file_offset = p->file_offset;
         p2->next_cpu_change_offset = p->file_offset;
+        p2->pi_list_length = -1;
 
         fprintf(warn, "%s: Activating pcpu %d at offset %lld\n",
                 __func__, r->cpu, (unsigned long long)p->file_offset);
@@ -10015,6 +10121,7 @@ enum {
     OPT_SCATTERPLOT_EXTINT_CYCLES,
     OPT_SCATTERPLOT_RDTSC,
     OPT_SCATTERPLOT_IRQ,
+    OPT_SCATTERPLOT_PI_LIST,
     OPT_HISTOGRAM_INTERRUPT_EIP,
     /* Interval options */
     OPT_INTERVAL_CR3_SCHEDULE_TIME,
@@ -10304,6 +10411,10 @@ error_t cmd_parser(int key, char *arg, struct argp_state *state)
         G.output_defined = 1;
         opt.scatterplot_pcpu=1;
         break;
+    case OPT_SCATTERPLOT_PI_LIST:
+        G.output_defined = 1;
+        opt.scatterplot_pi_list=1;
+        break;
     case OPT_HISTOGRAM_INTERRUPT_EIP:
     {
         char * inval, *p;
@@ -10679,6 +10790,11 @@ const struct argp_option cmd_opts[] =  {
       .group = OPT_GROUP_EXTRA,
       .doc = "Output scatterplot of irqs on pcpus.", },
 
+    { .name = "scatterplot-pi-list",
+      .key = OPT_SCATTERPLOT_PI_LIST,
+      .group = OPT_GROUP_EXTRA,
+      .doc = "Output scatterplot of pi blocking list length on pcpus.", },
+
     { .name = "histogram-interrupt-eip",
       .key = OPT_HISTOGRAM_INTERRUPT_EIP,
       .arg = "vector[,increment]",
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 646f409..3f71681 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -210,6 +210,8 @@ static void vmx_vcpu_block(struct vcpu *v)
 
     vmx_pi_add_vcpu(&v->arch.hvm_vmx.pi_blocking,
                     &per_cpu(vmx_pi_blocking, pi_cpu));
+    TRACE_4D(TRC_VTD_PI_LIST_ADD, v->domain->domain_id, v->vcpu_id, pi_cpu,
+             read_atomic(&per_cpu(vmx_pi_blocking, pi_cpu).counter));
     spin_unlock_irqrestore(pi_blocking_list_lock, flags);
 
     ASSERT(!pi_test_sn(pi_desc));
@@ -291,6 +293,7 @@ static void vmx_pi_unblock_vcpu(struct vcpu *v)
     if ( v->arch.hvm_vmx.pi_blocking.lock != NULL )
     {
         ASSERT(v->arch.hvm_vmx.pi_blocking.lock == pi_blocking_list_lock);
+        TRACE_2D(TRC_VTD_PI_LIST_DEL, v->domain->domain_id, v->vcpu_id);
         vmx_pi_del_vcpu(&v->arch.hvm_vmx.pi_blocking,
                         container_of(pi_blocking_list_lock,
                                      struct vmx_pi_blocking_vcpu, lock));
@@ -328,6 +331,8 @@ void vmx_pi_desc_fixup(unsigned int cpu)
 
     list_for_each_entry_safe(vmx, tmp, blocked_vcpus, pi_blocking.list)
     {
+        struct vcpu *v = container_of(vmx, struct vcpu, arch.hvm_vmx);
+
         /*
          * Suppress notification or we may miss an interrupt when the
          * target cpu is dying.
@@ -341,9 +346,10 @@ void vmx_pi_desc_fixup(unsigned int cpu)
          */
         if ( pi_test_on(&vmx->pi_desc) )
         {
+            TRACE_2D(TRC_VTD_PI_LIST_DEL, v->domain->domain_id, v->vcpu_id);
             vmx_pi_del_vcpu(&vmx->pi_blocking, &per_cpu(vmx_pi_blocking, cpu));
             vmx->pi_blocking.lock = NULL;
-            vcpu_unblock(container_of(vmx, struct vcpu, arch.hvm_vmx));
+            vcpu_unblock(v);
         }
         else
         {
@@ -363,9 +369,13 @@ void vmx_pi_desc_fixup(unsigned int cpu)
             write_atomic(&vmx->pi_desc.ndst,
                          x2apic_enabled ? dest : MASK_INSR(dest, PI_xAPIC_NDST_MASK));
 
+            TRACE_2D(TRC_VTD_PI_LIST_DEL, v->domain->domain_id, v->vcpu_id);
             vmx_pi_del_vcpu(&vmx->pi_blocking, &per_cpu(vmx_pi_blocking, cpu));
             vmx_pi_add_vcpu(&vmx->pi_blocking, &per_cpu(vmx_pi_blocking,
                                                         new_cpu));
+            TRACE_4D(TRC_VTD_PI_LIST_ADD, v->domain->domain_id, v->vcpu_id,
+                     new_cpu,
+                     read_atomic(&per_cpu(vmx_pi_blocking, new_cpu).counter));
             vmx->pi_blocking.lock = new_lock;
 
             spin_unlock(new_lock);
@@ -2476,10 +2486,13 @@ static void pi_wakeup_interrupt(struct cpu_user_regs *regs)
     {
         if ( pi_test_on(&vmx->pi_desc) )
         {
+            struct vcpu *v = container_of(vmx, struct vcpu, arch.hvm_vmx);
+
+            TRACE_2D(TRC_VTD_PI_LIST_DEL, v->domain->domain_id, v->vcpu_id);
             vmx_pi_del_vcpu(&vmx->pi_blocking, &per_cpu(vmx_pi_blocking, cpu));
             ASSERT(vmx->pi_blocking.lock == lock);
             vmx->pi_blocking.lock = NULL;
-            vcpu_unblock(container_of(vmx, struct vcpu, arch.hvm_vmx));
+            vcpu_unblock(v);
         }
     }
 
diff --git a/xen/include/public/trace.h b/xen/include/public/trace.h
index 3746bff..5eeb8ee 100644
--- a/xen/include/public/trace.h
+++ b/xen/include/public/trace.h
@@ -92,6 +92,7 @@
 /* Trace classes for Hardware */
 #define TRC_HW_PM           0x00801000   /* Power management traces */
 #define TRC_HW_IRQ          0x00802000   /* Traces relating to the handling of IRQs */
+#define TRC_HW_VTD          0x00804000   /* Traces relating to VTD */
 
 /* Trace events per class */
 #define TRC_LOST_RECORDS        (TRC_GEN + 1)
@@ -273,6 +274,10 @@
 #define TRC_HW_IRQ_UNMAPPED_VECTOR    (TRC_HW_IRQ + 0x7)
 #define TRC_HW_IRQ_HANDLED            (TRC_HW_IRQ + 0x8)
 
+/* Trace events relating to VT-d */
+#define TRC_VTD_PI_LIST_ADD     (TRC_HW_VTD + 0x1)
+#define TRC_VTD_PI_LIST_DEL     (TRC_HW_VTD + 0x2)
+
 /*
  * Event Flags
  *
-- 
1.8.3.1



* Re: [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
  2017-08-16  5:14 ` [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list Chao Gao
@ 2017-08-30 16:00   ` Jan Beulich
  2017-08-30 22:57     ` Chao Gao
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2017-08-30 16:00 UTC (permalink / raw)
  To: Chao Gao; +Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, xen-devel

>>> On 16.08.17 at 07:14, <chao.gao@intel.com> wrote:
> @@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu)
>      spin_lock_init(&per_cpu(vmx_pi_blocking, cpu).lock);
>  }
>  
> +static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv,
> +                            struct vmx_pi_blocking_vcpu *vpbv)
> +{
> +    ASSERT(spin_is_locked(&vpbv->lock));

You realize this is only a very weak check for a non-recursive lock?

> +    add_sized(&vpbv->counter, 1);
> +    ASSERT(read_atomic(&vpbv->counter));

Why add_sized() and read_atomic() when you hold the lock?

Jan



* Re: [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
  2017-08-30 16:00   ` Jan Beulich
@ 2017-08-30 22:57     ` Chao Gao
  2017-08-31  7:42       ` Jan Beulich
  0 siblings, 1 reply; 19+ messages in thread
From: Chao Gao @ 2017-08-30 22:57 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, xen-devel

On Wed, Aug 30, 2017 at 10:00:49AM -0600, Jan Beulich wrote:
>>>> On 16.08.17 at 07:14, <chao.gao@intel.com> wrote:
>> @@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu)
>>      spin_lock_init(&per_cpu(vmx_pi_blocking, cpu).lock);
>>  }
>>  
>> +static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv,
>> +                            struct vmx_pi_blocking_vcpu *vpbv)
>> +{
>> +    ASSERT(spin_is_locked(&vpbv->lock));
>
>You realize this is only a very weak check for a non-recursive lock?

I just thought the lock should be held when adding one entry to the
blocking list. Do you think we should remove this check or make it
stricter?

>
>> +    add_sized(&vpbv->counter, 1);
>> +    ASSERT(read_atomic(&vpbv->counter));
>
>Why add_sized() and read_atomic() when you hold the lock?
>

In patch 3, the counter is read frequently in order to find a suitable
pcpu, and we can use add_sized() and read_atomic() to avoid acquiring the
lock. In short, the lock doesn't protect the counter.

Thanks
Chao


* Re: [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
  2017-08-31  7:42       ` Jan Beulich
@ 2017-08-31  7:15         ` Chao Gao
  2017-08-31  8:33           ` Jan Beulich
  0 siblings, 1 reply; 19+ messages in thread
From: Chao Gao @ 2017-08-31  7:15 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, xen-devel

On Thu, Aug 31, 2017 at 01:42:53AM -0600, Jan Beulich wrote:
>>>> On 31.08.17 at 00:57, <chao.gao@intel.com> wrote:
>> On Wed, Aug 30, 2017 at 10:00:49AM -0600, Jan Beulich wrote:
>>>>>> On 16.08.17 at 07:14, <chao.gao@intel.com> wrote:
>>>> @@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu)
>>>>      spin_lock_init(&per_cpu(vmx_pi_blocking, cpu).lock);
>>>>  }
>>>>  
>>>> +static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv,
>>>> +                            struct vmx_pi_blocking_vcpu *vpbv)
>>>> +{
>>>> +    ASSERT(spin_is_locked(&vpbv->lock));
>>>
>>>You realize this is only a very weak check for a non-recursive lock?
>> 
>> I just thought the lock should be held when adding one entry to the
>> blocking list. Do you think we should remove this check or make it
>> stricter?
>
>Well, the primary purpose of my comment was to make you aware
>of the fact. If the weak check is good enough for you, then fine.

To be honest, I don't know the difference between a weak check and a tight
check.

>Removing the check would be a bad idea imo (but see also below);
>tightening might be worthwhile, but might also go too far (depending
>mainly on how clearly provable it is that all callers actually hold the
>lock).

IMO, the lock was introduced (not by me) to protect the blocking list.
list_add() and list_del() should be performed with the lock held. So I
think it is clear that all callers should hold the lock.

>
>>>> +    add_sized(&vpbv->counter, 1);
>>>> +    ASSERT(read_atomic(&vpbv->counter));
>>>
>>>Why add_sized() and read_atomic() when you hold the lock?
>>>
>> 
>> In patch 3, frequent reading the counter is used to find a suitable
>> vcpu and we can use add_sized() and read_atomic() to avoid acquiring the
>> lock. In one word, the lock doesn't protect the counter.
>
>In that case it would be more natural to switch to the atomic
>accesses there. Plus you still wouldn't need read_atomic()
>here, with the lock held. Furthermore I would then wonder
>whether it wasn't better to use atomic_t for the counter at

Is there some basic guide on when it is better to use read_atomic()
and add_sized() and when it is better to define an atomic_t variable
directly?

>that point. Also with a lock-less readers the requirement to
>hold a lock here (rather than using suitable LOCKed accesses)
>becomes questionable too.

As I said above, I think the lock is used to protect the list.

I think this patch has two parts:
1. Move all list operations to two inline functions. (With this, adding
a counter is easier and we don't need to add code in several places.)

2. Add a counter.

Thanks
Chao


* Re: [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
  2017-08-30 22:57     ` Chao Gao
@ 2017-08-31  7:42       ` Jan Beulich
  2017-08-31  7:15         ` Chao Gao
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2017-08-31  7:42 UTC (permalink / raw)
  To: Chao Gao; +Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, xen-devel

>>> On 31.08.17 at 00:57, <chao.gao@intel.com> wrote:
> On Wed, Aug 30, 2017 at 10:00:49AM -0600, Jan Beulich wrote:
>>>>> On 16.08.17 at 07:14, <chao.gao@intel.com> wrote:
>>> @@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu)
>>>      spin_lock_init(&per_cpu(vmx_pi_blocking, cpu).lock);
>>>  }
>>>  
>>> +static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv,
>>> +                            struct vmx_pi_blocking_vcpu *vpbv)
>>> +{
>>> +    ASSERT(spin_is_locked(&vpbv->lock));
>>
>>You realize this is only a very weak check for a non-recursive lock?
> 
> I just thought the lock should be held when adding one entry to the
> blocking list. Do you think we should remove this check or make it
> stricter?

Well, the primary purpose of my comment was to make you aware
of the fact. If the weak check is good enough for you, then fine.
Removing the check would be a bad idea imo (but see also below);
tightening might be worthwhile, but might also go too far (depending
mainly on how clearly provable it is that all callers actually hold the
lock).

>>> +    add_sized(&vpbv->counter, 1);
>>> +    ASSERT(read_atomic(&vpbv->counter));
>>
>>Why add_sized() and read_atomic() when you hold the lock?
>>
> 
> In patch 3, frequent reading the counter is used to find a suitable
> vcpu and we can use add_sized() and read_atomic() to avoid acquiring the
> lock. In one word, the lock doesn't protect the counter.

In that case it would be more natural to switch to the atomic
accesses there. Plus you still wouldn't need read_atomic()
here, with the lock held. Furthermore I would then wonder
whether it wasn't better to use atomic_t for the counter at
that point. Also with a lock-less readers the requirement to
hold a lock here (rather than using suitable LOCKed accesses)
becomes questionable too.

Jan



* Re: [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
  2017-08-31  8:33           ` Jan Beulich
@ 2017-08-31  7:53             ` Chao Gao
  2017-09-01  1:39             ` Chao Gao
  1 sibling, 0 replies; 19+ messages in thread
From: Chao Gao @ 2017-08-31  7:53 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, xen-devel

On Thu, Aug 31, 2017 at 02:33:57AM -0600, Jan Beulich wrote:
>>>> On 31.08.17 at 09:15, <chao.gao@intel.com> wrote:
>> On Thu, Aug 31, 2017 at 01:42:53AM -0600, Jan Beulich wrote:
>>>>>> On 31.08.17 at 00:57, <chao.gao@intel.com> wrote:
>>>> On Wed, Aug 30, 2017 at 10:00:49AM -0600, Jan Beulich wrote:
>>>>>>>> On 16.08.17 at 07:14, <chao.gao@intel.com> wrote:
>>>>>> @@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu)
>>>>>>      spin_lock_init(&per_cpu(vmx_pi_blocking, cpu).lock);
>>>>>>  }
>>>>>>  
>>>>>> +static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv,
>>>>>> +                            struct vmx_pi_blocking_vcpu *vpbv)
>>>>>> +{
>>>>>> +    ASSERT(spin_is_locked(&vpbv->lock));
>>>>>
>>>>>You realize this is only a very weak check for a non-recursive lock?
>>>> 
>>>> I just thought the lock should be held when adding one entry to the
>>>> blocking list. Do you think we should remove this check or make it
>>>> stricter?
>>>
>>>Well, the primary purpose of my comment was to make you aware
>>>of the fact. If the weak check is good enough for you, then fine.
>> 
>> To be honest, I don't know the difference between weak check and tight
>> check.
>
>For non-recursive locks spin_is_locked() only tells you if _any_
>CPU in the system currently holds the lock. For recursive ones it
>checks whether it's the local CPU that owns the lock.

This remark is enlightening to me.

>
>>>Removing the check would be a bad idea imo (but see also below);
>>>tightening might be worthwhile, but might also go too far (depending
>>>mainly on how clearly provable it is that all callers actually hold the
>>>lock).
>> 
>> IMO, the lock was introduced (not by me) to protect the blocking list.
>> list_add() and list_del() should be performed with the lock held. So I
>> think it is clear that all callers should hold the lock.
>
>Good.
>
>>>>>> +    add_sized(&vpbv->counter, 1);
>>>>>> +    ASSERT(read_atomic(&vpbv->counter));
>>>>>
>>>>>Why add_sized() and read_atomic() when you hold the lock?
>>>>>
>>>> 
>>>> In patch 3, frequent reading the counter is used to find a suitable
>>>> vcpu and we can use add_sized() and read_atomic() to avoid acquiring the
>>>> lock. In one word, the lock doesn't protect the counter.
>>>
>>>In that case it would be more natural to switch to the atomic
>>>accesses there. Plus you still wouldn't need read_atomic()
>>>here, with the lock held. Furthermore I would then wonder
>>>whether it wasn't better to use atomic_t for the counter at
>> 
>> Is there some basic guide on when it is better to use read_atomic()
>> and add_sized() and when it is better to define a atomic variable
>> directly?
>
>If an atomic_t variable fits your needs, I think it should always
>be preferred. add_sized() was introduced for a case where an
>atomic_t variable would not have been usable. Please also
>consult older commits for understanding the background.

Ok, I will. Thanks for your guidance.

>
>>>that point. Also with a lock-less readers the requirement to
>>>hold a lock here (rather than using suitable LOCKed accesses)
>>>becomes questionable too.
>> 
>> As I said above, I think the lock is used to protect the list.
>> 
>> I think this patch has two parts:
>> 1. Move all list operations to two inline functions. (with this, adding
>> a counter is easier and don't need add code in several places.)
>> 
>> 2. Add a counter.
>
>With it being left unclear whether the counter is meant to
>also be protected by the lock: In the patch here you claim it
>is, yet by later introducing lock-less readers you weaken
>that model. Hence the request to bring things into a
>consistent state right away, and ideally also into the final
>state.

Sure. I will clarify this and make things consistent.

Thanks
Chao


* Re: [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
  2017-08-31  7:15         ` Chao Gao
@ 2017-08-31  8:33           ` Jan Beulich
  2017-08-31  7:53             ` Chao Gao
  2017-09-01  1:39             ` Chao Gao
  0 siblings, 2 replies; 19+ messages in thread
From: Jan Beulich @ 2017-08-31  8:33 UTC (permalink / raw)
  To: Chao Gao; +Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, xen-devel

>>> On 31.08.17 at 09:15, <chao.gao@intel.com> wrote:
> On Thu, Aug 31, 2017 at 01:42:53AM -0600, Jan Beulich wrote:
>>>>> On 31.08.17 at 00:57, <chao.gao@intel.com> wrote:
>>> On Wed, Aug 30, 2017 at 10:00:49AM -0600, Jan Beulich wrote:
>>>>>>> On 16.08.17 at 07:14, <chao.gao@intel.com> wrote:
>>>>> @@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu)
>>>>>      spin_lock_init(&per_cpu(vmx_pi_blocking, cpu).lock);
>>>>>  }
>>>>>  
>>>>> +static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv,
>>>>> +                            struct vmx_pi_blocking_vcpu *vpbv)
>>>>> +{
>>>>> +    ASSERT(spin_is_locked(&vpbv->lock));
>>>>
>>>>You realize this is only a very weak check for a non-recursive lock?
>>> 
>>> I just thought the lock should be held when adding one entry to the
>>> blocking list. Do you think we should remove this check or make it
>>> stricter?
>>
>>Well, the primary purpose of my comment was to make you aware
>>of the fact. If the weak check is good enough for you, then fine.
> 
> To be honest, I don't know the difference between weak check and tight
> check.

For non-recursive locks spin_is_locked() only tells you if _any_
CPU in the system currently holds the lock. For recursive ones it
checks whether it's the local CPU that owns the lock.
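
To illustrate with a contrived sketch (illustration only, not code from
the series): with a plain, non-recursive spinlock the assertion only
proves that *some* CPU currently holds the lock, not that the asserting
CPU does.

    spin_lock(&vpbv->lock);               /* taken by CPU 0 */

    /* ... meanwhile, on CPU 1, which does not hold the lock ... */
    ASSERT(spin_is_locked(&vpbv->lock));  /* still passes on CPU 1 */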

>>Removing the check would be a bad idea imo (but see also below);
>>tightening might be worthwhile, but might also go too far (depending
>>mainly on how clearly provable it is that all callers actually hold the
>>lock).
> 
> IMO, the lock was introduced (not by me) to protect the blocking list.
> list_add() and list_del() should be performed with the lock held. So I
> think it is clear that all callers should hold the lock.

Good.

>>>>> +    add_sized(&vpbv->counter, 1);
>>>>> +    ASSERT(read_atomic(&vpbv->counter));
>>>>
>>>>Why add_sized() and read_atomic() when you hold the lock?
>>>>
>>> 
>>> In patch 3, frequent reading the counter is used to find a suitable
>>> vcpu and we can use add_sized() and read_atomic() to avoid acquiring the
>>> lock. In one word, the lock doesn't protect the counter.
>>
>>In that case it would be more natural to switch to the atomic
>>accesses there. Plus you still wouldn't need read_atomic()
>>here, with the lock held. Furthermore I would then wonder
>>whether it wasn't better to use atomic_t for the counter at
> 
> Is there some basic guide on when it is better to use read_atomic()
> and add_sized() and when it is better to define a atomic variable
> directly?

If an atomic_t variable fits your needs, I think it should always
be preferred. add_sized() was introduced for a case where an
atomic_t variable would not have been usable. Please also
consult older commits for understanding the background.
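
As a minimal sketch of the two styles in question (illustration only,
assuming the usual Xen helpers rather than code from the series):

    /* Style used in this patch: a plain integer plus sized accessors. */
    unsigned int counter = 0;
    unsigned int n;

    add_sized(&counter, 1);        /* update done as one sized access */
    n = read_atomic(&counter);     /* read done as one sized access */

    /* Style to prefer when it fits: a dedicated atomic_t. */
    atomic_t cnt = ATOMIC_INIT(0);

    atomic_inc(&cnt);
    n = atomic_read(&cnt);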

>>that point. Also with a lock-less readers the requirement to
>>hold a lock here (rather than using suitable LOCKed accesses)
>>becomes questionable too.
> 
> As I said above, I think the lock is used to protect the list.
> 
> I think this patch has two parts:
> 1. Move all list operations to two inline functions. (with this, adding
> a counter is easier and don't need add code in several places.)
> 
> 2. Add a counter.

With it being left unclear whether the counter is meant to
also be protected by the lock: In the patch here you claim it
is, yet by later introducing lock-less readers you weaken
that model. Hence the request to bring things into a
consistent state right away, and ideally also into the final
state.

Jan



* Re: [PATCH v5 3/4] VT-d PI: restrict the number of vcpus in a given pcpu's PI blocking list
  2017-08-16  5:14 ` [PATCH v5 3/4] VT-d PI: restrict the number of vcpus in a given pcpu's PI blocking list Chao Gao
@ 2017-08-31 16:01   ` Jan Beulich
  0 siblings, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2017-08-31 16:01 UTC (permalink / raw)
  To: Chao Gao; +Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, xen-devel

>>> On 16.08.17 at 07:14, <chao.gao@intel.com> wrote:
> +static inline bool pi_over_limit(unsigned int cpu)
> +{
> +    /* Compare w/ constant first to save a division and an add */
> +    if ( likely(read_atomic(&per_cpu(vmx_pi_blocking, cpu).counter) <=
> +                PI_LIST_FIXED_LIMIT) )
> +        return 0;

false

> +    else

Pointless else.

> +static unsigned int pi_get_blocking_cpu(unsigned int cpu, unsigned long *flags)
> +{
> +    spinlock_t *pi_blocking_list_lock;
> +
> +    for ( ; ; )
> +    {
> +        while ( unlikely(pi_over_limit(cpu)) )
> +            cpu = cpumask_cycle(cpu, &cpu_online_map);
> +
> +        pi_blocking_list_lock = &per_cpu(vmx_pi_blocking, cpu).lock;
> +        if ( flags )
> +            spin_lock_irqsave(pi_blocking_list_lock, *flags);
> +        else
> +            spin_lock(pi_blocking_list_lock);

This is ugly, but I think I see why you want it this way. Let's see
what the maintainers think.

Jan



* Re: [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
  2017-08-31  8:33           ` Jan Beulich
  2017-08-31  7:53             ` Chao Gao
@ 2017-09-01  1:39             ` Chao Gao
  2017-09-01  8:24               ` Jan Beulich
  1 sibling, 1 reply; 19+ messages in thread
From: Chao Gao @ 2017-09-01  1:39 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, xen-devel

On Thu, Aug 31, 2017 at 02:33:57AM -0600, Jan Beulich wrote:
>>>> On 31.08.17 at 09:15, <chao.gao@intel.com> wrote:
>> On Thu, Aug 31, 2017 at 01:42:53AM -0600, Jan Beulich wrote:
>>>>>> On 31.08.17 at 00:57, <chao.gao@intel.com> wrote:
>>>> On Wed, Aug 30, 2017 at 10:00:49AM -0600, Jan Beulich wrote:
>>>>>>>> On 16.08.17 at 07:14, <chao.gao@intel.com> wrote:
>>>>>> @@ -100,6 +101,24 @@ void vmx_pi_per_cpu_init(unsigned int cpu)
>>>>>>      spin_lock_init(&per_cpu(vmx_pi_blocking, cpu).lock);
>>>>>>  }
>>>>>>  
>>>>>> +static void vmx_pi_add_vcpu(struct pi_blocking_vcpu *pbv,
>>>>>> +                            struct vmx_pi_blocking_vcpu *vpbv)
>>>>>> +{
>>>>>> +    ASSERT(spin_is_locked(&vpbv->lock));
>>>>>
>>>>>You realize this is only a very weak check for a non-recursive lock?
>>>> 
>>>> I just thought the lock should be held when adding one entry to the
>>>> blocking list. Do you think we should remove this check or make it
>>>> stricter?
>>>
>>>Well, the primary purpose of my comment was to make you aware
>>>of the fact. If the weak check is good enough for you, then fine.
>> 
>> To be honest, I don't know the difference between weak check and tight
>> check.
>
>For non-recursive locks spin_is_locked() only tells you if _any_
>CPU in the system currently holds the lock. For recursive ones it
>checks whether it's the local CPU that owns the lock.
>
>>>Removing the check would be a bad idea imo (but see also below);
>>>tightening might be worthwhile, but might also go too far (depending
>>>mainly on how clearly provable it is that all callers actually hold the
>>>lock).
>> 
>> IMO, the lock was introduced (not by me) to protect the blocking list.
>> list_add() and list_del() should be performed with the lock held. So I
>> think it is clear that all callers should hold the lock.
>
>Good.
>
>>>>>> +    add_sized(&vpbv->counter, 1);
>>>>>> +    ASSERT(read_atomic(&vpbv->counter));
>>>>>
>>>>>Why add_sized() and read_atomic() when you hold the lock?
>>>>>
>>>> 
>>>> In patch 3, frequent reading the counter is used to find a suitable
>>>> vcpu and we can use add_sized() and read_atomic() to avoid acquiring the
>>>> lock. In one word, the lock doesn't protect the counter.
>>>
>>>In that case it would be more natural to switch to the atomic
>>>accesses there. Plus you still wouldn't need read_atomic()
>>>here, with the lock held. Furthermore I would then wonder
>>>whether it wasn't better to use atomic_t for the counter at
>> 
>> Is there some basic guide on when it is better to use read_atomic()
>> and add_sized() and when it is better to define a atomic variable
>> directly?
>
>If an atomic_t variable fits your needs, I think it should always
>be preferred. add_sized() was introduced for a case where an
>atomic_t variable would not have been usable. Please also
>consult older commits for understanding the background.
>
>>>that point. Also with a lock-less readers the requirement to
>>>hold a lock here (rather than using suitable LOCKed accesses)
>>>becomes questionable too.
>> 
>> As I said above, I think the lock is used to protect the list.
>> 
>> I think this patch has two parts:
>> 1. Move all list operations to two inline functions. (with this, adding
>> a counter is easier and don't need add code in several places.)
>> 
>> 2. Add a counter.
>
>With it being left unclear whether the counter is meant to
>also be protected by the lock: In the patch here you claim it
>is, yet by later introducing lock-less readers you weaken
>that model. Hence the request to bring things into a
>consistent state right away, and ideally also into the final
>state.
>

Hi, Jan.

After thinking about it again, I want to define the counter as
an unsigned int variable for the following reasons:
1. It is definite that the counter is closely related to
list_add() and list_del(). If the list is protected by the
lock, it is straightforward that the counter is also protected
by the lock.
2. In patch 3, although there are some lock-less readers, we
will check that the counter still meets our requirement with the lock
held. Thus, I don't think there is a race.

Thanks
Chao


* Re: [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
  2017-09-01  8:24               ` Jan Beulich
@ 2017-09-01  7:55                 ` Chao Gao
  2017-09-01  9:13                   ` Jan Beulich
  0 siblings, 1 reply; 19+ messages in thread
From: Chao Gao @ 2017-09-01  7:55 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, xen-devel

On Fri, Sep 01, 2017 at 02:24:08AM -0600, Jan Beulich wrote:
>>>> On 01.09.17 at 03:39, <chao.gao@intel.com> wrote:
>> After thinking it again, I want to define the counter as
>> a unsigned int variable for the following reasion:
>> 1. It is definite that the counter is closely related with
>> list_add() and list_del(). If the list is protected by the
>> lock, it is straightforward that the counter is also protected
>> by the lock.
>> 2. In patch 3, althought there are some lock-less readers, we
>> will check the counter still meets our requirement with the lock
>> held. Thus, I don't think there is a racing issue.
>
>I think that's fine, but then you still don't need LOCKed accesses
>to the counter for updating it; write_atomic() will suffice afaict.

A stupid question:
Isn't it contradictory to say the counter can be protected by the
lock while suggesting write_atomic() instead of LOCKed accesses?

Updating the counter always goes together with updating the list,
and the list update has to happen inside the locked region. I meant
something like:

spin_lock()
list_add()
counter++
spin_unlock()

However, I am afraid that using write_atomic() rather than a LOCKed
access means something like the following (and separating the counter
update from the list update doesn't look right to me):

spin_lock()
list_add()
spin_unlock()
write_atomic()

And what this version of the patch does is:

spin_lock()
list_add()
add_sized()
spin_unlock()
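
Written out as purely illustrative helpers -- pi_blocking_list_add*()
and pb reuse the made-up per-pCPU structure from my earlier sketch and
are not code from the patch -- the two shapes I am comparing are:

/* Sketch only -- illustrative names, nothing here is taken from the patch. */

/* What I am afraid the suggestion amounts to: the counter update gets
 * pulled out of the locked region. */
static void pi_blocking_list_add_counter_outside(struct pi_blocking *pb,
                                                 struct list_head *entry)
{
    spin_lock(&pb->lock);
    list_add_tail(entry, &pb->list);
    spin_unlock(&pb->lock);

    write_atomic(&pb->counter, read_atomic(&pb->counter) + 1);
}

/* What this version of the patch does: list and counter are updated
 * together, inside the locked region. */
static void pi_blocking_list_add(struct pi_blocking *pb,
                                 struct list_head *entry)
{
    spin_lock(&pb->lock);
    list_add_tail(entry, &pb->list);
    add_sized(&pb->counter, 1);
    spin_unlock(&pb->lock);
}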

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
  2017-09-01  1:39             ` Chao Gao
@ 2017-09-01  8:24               ` Jan Beulich
  2017-09-01  7:55                 ` Chao Gao
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2017-09-01  8:24 UTC (permalink / raw)
  To: Chao Gao; +Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, xen-devel

>>> On 01.09.17 at 03:39, <chao.gao@intel.com> wrote:
> After thinking about it again, I want to define the counter as
> a plain unsigned int variable, for the following reasons:
> 1. The counter is closely tied to list_add() and list_del().
> If the list is protected by the lock, it is straightforward
> for the counter to be protected by the lock as well.
> 2. In patch 3, although there are some lock-less readers, we
> re-check, with the lock held, that the counter still meets our
> requirement. Thus, I don't think there is a racing issue.

I think that's fine, but then you still don't need LOCKed accesses
to the counter for updating it; write_atomic() will suffice afaict.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
  2017-09-01  9:13                   ` Jan Beulich
@ 2017-09-01  8:37                     ` Chao Gao
  2017-09-01  9:55                       ` Jan Beulich
  0 siblings, 1 reply; 19+ messages in thread
From: Chao Gao @ 2017-09-01  8:37 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, xen-devel

On Fri, Sep 01, 2017 at 03:13:17AM -0600, Jan Beulich wrote:
>>>> On 01.09.17 at 09:55, <chao.gao@intel.com> wrote:
>> On Fri, Sep 01, 2017 at 02:24:08AM -0600, Jan Beulich wrote:
>>>>>> On 01.09.17 at 03:39, <chao.gao@intel.com> wrote:
>>>> After thinking about it again, I want to define the counter as
>>>> a plain unsigned int variable, for the following reasons:
>>>> 1. The counter is closely tied to list_add() and list_del().
>>>> If the list is protected by the lock, it is straightforward
>>>> for the counter to be protected by the lock as well.
>>>> 2. In patch 3, although there are some lock-less readers, we
>>>> re-check, with the lock held, that the counter still meets our
>>>> requirement. Thus, I don't think there is a racing issue.
>>>
>>>I think that's fine, but then you still don't need LOCKed accesses
>>>to the counter for updating it; write_atomic() will suffice afaict.
>> 
>> A stupid question:
>> Isn't it contradictory to say the counter can be protected by the
>> lock while suggesting write_atomic() instead of LOCKed accesses?
>> 
>> Updating the counter always goes together with updating the list,
>> and the list update has to happen inside the locked region. I meant
>> something like:
>> 
>> spin_lock()
>> list_add()
>> counter++
>> spin_unlock()
>> 
>> However, I am afraid that using write_atomic() rather than a LOCKed
>> access means something like the following (and separating the counter
>> update from the list update doesn't look right to me):
>> 
>> spin_lock()
>> list_add()
>> spin_unlock()
>> write_atomic()
>
>No, I mean
>
> spin_lock()
> list_add()
> write_atomic()
> spin_unlock()
>
>whereas ...
>
>> And what this version of the patch does is:
>> 
>> spin_lock()
>> list_add()
>> add_sized()
>> spin_unlock()
>
>... this produces a needless LOCKed instruction, redundant with being
>inside the locked region.

It seems add_sized() isn't actually a LOCKed instruction:
#define build_add_sized(name, size, type, reg) \
    static inline void name(volatile type *addr, type val)              \
    {                                                                   \
        asm volatile("add" size " %1,%0"                                \
                     : "=m" (*addr)                                     \
                     : reg (val));                                      \
    }
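
For a 32-bit counter, my understanding is that this instantiates a
helper along the following lines (a sketch from memory -- the exact
instantiation name and constraint string may differ):

/* Plain "addl" on the counter, i.e. no LOCK prefix: */
build_add_sized(add_u32_sized, "l", uint32_t, "ri")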

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
  2017-09-01  7:55                 ` Chao Gao
@ 2017-09-01  9:13                   ` Jan Beulich
  2017-09-01  8:37                     ` Chao Gao
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2017-09-01  9:13 UTC (permalink / raw)
  To: Chao Gao; +Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, xen-devel

>>> On 01.09.17 at 09:55, <chao.gao@intel.com> wrote:
> On Fri, Sep 01, 2017 at 02:24:08AM -0600, Jan Beulich wrote:
>>>>> On 01.09.17 at 03:39, <chao.gao@intel.com> wrote:
>>> After thinking about it again, I want to define the counter as
>>> a plain unsigned int variable, for the following reasons:
>>> 1. The counter is closely tied to list_add() and list_del().
>>> If the list is protected by the lock, it is straightforward
>>> for the counter to be protected by the lock as well.
>>> 2. In patch 3, although there are some lock-less readers, we
>>> re-check, with the lock held, that the counter still meets our
>>> requirement. Thus, I don't think there is a racing issue.
>>
>>I think that's fine, but then you still don't need LOCKed accesses
>>to the counter for updating it; write_atomic() will suffice afaict.
> 
> A stupid question:
> Isn't it contradictory to say the counter can be protected by the
> lock while suggesting write_atomic() instead of LOCKed accesses?
> 
> Updating the counter always goes together with updating the list,
> and the list update has to happen inside the locked region. I meant
> something like:
> 
> spin_lock()
> list_add()
> counter++
> spin_unlock()
> 
> However, I am afraid that using write_atomic() rather than a LOCKed
> access means something like the following (and separating the counter
> update from the list update doesn't look right to me):
> 
> spin_lock()
> list_add()
> spin_unlock()
> write_atomic()

No, I mean

 spin_lock()
 list_add()
 write_atomic()
 spin_unlock()

whereas ...

> And what this version of the patch does is:
> 
> spin_lock()
> list_add()
> add_sized()
> spin_unlock()

... this produces a needless LOCKed instruction, redundant with being
inside the locked region.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
  2017-09-01  8:37                     ` Chao Gao
@ 2017-09-01  9:55                       ` Jan Beulich
  2017-09-01 10:04                         ` Jan Beulich
  0 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2017-09-01  9:55 UTC (permalink / raw)
  To: Chao Gao; +Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, xen-devel

>>> On 01.09.17 at 10:37, <chao.gao@intel.com> wrote:
> It seems add_sized() isn't actually a LOCKed instruction:
> #define build_add_sized(name, size, type, reg) \
>     static inline void name(volatile type *addr, type val)              \
>     {                                                                   \
>         asm volatile("add" size " %1,%0"                                \
>                      : "=m" (*addr)                                     \
>                      : reg (val));                                      \
>     }

Oh, you're right. But then I'd still like you not to add a new
user, as I don't see why it was introduced in the first place:
independent of the architecture, it is equivalent to

write_atomic(p, read_atomic(p) + c)

and hence I'd like to get rid of it as misleading/redundant.
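
As a sketch of that equivalence (generic_add_sized() is a hypothetical
name, not a macro that exists in the tree):

/* Hypothetical generic definition -- not an existing Xen macro.  On any
 * architecture it has the same effect as add_sized(), with the load and
 * the store issued as separate (individually atomic) accesses. */
#define generic_add_sized(p, c) \
    write_atomic((p), read_atomic(p) + (c))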

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list
  2017-09-01  9:55                       ` Jan Beulich
@ 2017-09-01 10:04                         ` Jan Beulich
  0 siblings, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2017-09-01 10:04 UTC (permalink / raw)
  To: Chao Gao; +Cc: Andrew Cooper, Kevin Tian, Jun Nakajima, xen-devel

>>> On 01.09.17 at 11:55, <JBeulich@suse.com> wrote:
>>>> On 01.09.17 at 10:37, <chao.gao@intel.com> wrote:
>> It seems add_sized() isn't actually a LOCKed instruction:
>> #define build_add_sized(name, size, type, reg) \
>>     static inline void name(volatile type *addr, type val)              \
>>     {                                                                   \
>>         asm volatile("add" size " %1,%0"                                \
>>                      : "=m" (*addr)                                     \
>>                      : reg (val));                                      \
>>     }
> 
> Oh, you're right. But then I'd still like you not to add a new
> user, as I don't see why it was introduced in the first place:
> independent of the architecture, it is equivalent to
> 
> write_atomic(p, read_atomic(p) + c)
> 
> and hence I'd like to get rid of it as misleading/redundant.

Actually, on x86 it still is a bit better than the generic replacement,
i.e. it would only be worthwhile dropping the custom ARM variant
in favor of a generic one. Keep using it here then.
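
Roughly -- a sketch of plausible generated code, not taken from an
actual build -- the x86 difference is a single non-LOCKed
read-modify-write instruction versus a separate load and store:

/*
 * add_sized(&counter, 1):
 *
 *     addl $1, counter(%rip)        # one RMW instruction, no LOCK prefix
 *
 * write_atomic(&counter, read_atomic(&counter) + 1):
 *
 *     movl counter(%rip), %eax      # separate load ...
 *     addl $1, %eax
 *     movl %eax, counter(%rip)      # ... and store
 *
 * Either form is safe here because every writer holds the same lock; a
 * LOCK prefix would only matter for concurrent lock-less writers.
 */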

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2017-09-01 10:04 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-16  5:14 [PATCH v5 0/4] mitigate the per-pCPU blocking list may be too long Chao Gao
2017-08-16  5:14 ` [PATCH v5 1/4] VT-d PI: track the number of vcpus on pi blocking list Chao Gao
2017-08-30 16:00   ` Jan Beulich
2017-08-30 22:57     ` Chao Gao
2017-08-31  7:42       ` Jan Beulich
2017-08-31  7:15         ` Chao Gao
2017-08-31  8:33           ` Jan Beulich
2017-08-31  7:53             ` Chao Gao
2017-09-01  1:39             ` Chao Gao
2017-09-01  8:24               ` Jan Beulich
2017-09-01  7:55                 ` Chao Gao
2017-09-01  9:13                   ` Jan Beulich
2017-09-01  8:37                     ` Chao Gao
2017-09-01  9:55                       ` Jan Beulich
2017-09-01 10:04                         ` Jan Beulich
2017-08-16  5:14 ` [PATCH v5 2/4] x86/vcpu: track hvm vcpu number on the system Chao Gao
2017-08-16  5:14 ` [PATCH v5 3/4] VT-d PI: restrict the number of vcpus in a given pcpu's PI blocking list Chao Gao
2017-08-31 16:01   ` Jan Beulich
2017-08-16  5:14 ` [PATCH v5 4/4] xentrace: add support for HVM's PI blocking list operation Chao Gao
