* [PATCH v6 00/11] implement vcpu preempted check
@ 2016-10-28  8:11 ` Pan Xinhui
  0 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: benh, paulus, mpe, mingo, peterz, paulmck, will.deacon,
	kernellwp, jgross, pbonzini, bsingharora, boqun.feng,
	borntraeger, rkrcmar, David.Laight, Pan Xinhui

change from v5:
	split x86/kvm patch into guest/host parts.
	introduce kvm_write_guest_offset_cached.
	fix some typos.
	rebase patch onto 4.9-rc2
change from v4:
	split x86 kvm vcpu preempted check into two patches.
	add documentation patch.
	add x86 vcpu preempted check patch under xen
	add s390 vcpu preempted check patch 
change from v3:
	add x86 vcpu preempted check patch
change from v2:
	no code change, fix typos, update some comments
change from v1:
	a simpler definition of the default vcpu_is_preempted
	skip machine type check on ppc, and add config. remove dedicated macro.
	add one patch to drop overload of rwsem_spin_on_owner and mutex_spin_on_owner. 
	add more comments
	thanks to Boqun's and Peter's suggestions.

This patch set aims to fix lock holder preemption issues.

test-case:
perf record -a perf bench sched messaging -g 400 -p && perf report

18.09%  sched-messaging  [kernel.vmlinux]  [k] osq_lock
12.28%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
 5.27%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 3.89%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task
 3.64%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
 3.41%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner.is
 2.49%  sched-messaging  [kernel.vmlinux]  [k] system_call

We introduce the interface bool vcpu_is_preempted(int cpu) and use it in the
spin loops of osq_lock, rwsem_spin_on_owner and mutex_spin_on_owner.
These spin_on_owner variants could also cause RCU stalls before this patch set
was applied.
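
As a rough illustration, the pattern applied to those spin loops looks like
the following (a simplified sketch, not the exact hunks in patches 02 and 03;
lock->locked and holder_cpu stand in for whatever each call site actually
spins on and however it derives the lock holder's CPU):

	/* simplified optimistic spin loop */
	while (!READ_ONCE(lock->locked)) {
		/*
		 * Stop spinning if we must reschedule, or if the lock
		 * holder's vCPU has been preempted by the hypervisor and
		 * therefore cannot make progress towards releasing the lock.
		 */
		if (need_resched() || vcpu_is_preempted(holder_cpu))
			break;
		cpu_relax();
	}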

We have also observed some performance improvements in UnixBench tests.

PPC test result (improvement with the patch set):
1 copy - 0.94%
2 copy - 7.17%
4 copy - 11.9%
8 copy -  3.04%
16 copy - 15.11%

details below:
Without patch:

1 copy - File Write 4096 bufsize 8000 maxblocks      2188223.0 KBps  (30.0 s, 1 samples)
2 copy - File Write 4096 bufsize 8000 maxblocks      1804433.0 KBps  (30.0 s, 1 samples)
4 copy - File Write 4096 bufsize 8000 maxblocks      1237257.0 KBps  (30.0 s, 1 samples)
8 copy - File Write 4096 bufsize 8000 maxblocks      1032658.0 KBps  (30.0 s, 1 samples)
16 copy - File Write 4096 bufsize 8000 maxblocks       768000.0 KBps  (30.1 s, 1 samples)

With patch: 

1 copy - File Write 4096 bufsize 8000 maxblocks      2209189.0 KBps  (30.0 s, 1 samples)
2 copy - File Write 4096 bufsize 8000 maxblocks      1943816.0 KBps  (30.0 s, 1 samples)
4 copy - File Write 4096 bufsize 8000 maxblocks      1405591.0 KBps  (30.0 s, 1 samples)
8 copy - File Write 4096 bufsize 8000 maxblocks      1065080.0 KBps  (30.0 s, 1 samples)
16 copy - File Write 4096 bufsize 8000 maxblocks       904762.0 KBps  (30.0 s, 1 samples)

X86 test result:
test-case                              |   after-patch   |  before-patch
Execl Throughput                       |    18307.9 lps  |    11701.6 lps 
File Copy 1024 bufsize 2000 maxblocks  |  1352407.3 KBps |   790418.9 KBps
File Copy 256 bufsize 500 maxblocks    |   367555.6 KBps |   222867.7 KBps
File Copy 4096 bufsize 8000 maxblocks  |  3675649.7 KBps |  1780614.4 KBps
Pipe Throughput                        | 11872208.7 lps  | 11855628.9 lps 
Pipe-based Context Switching           |  1495126.5 lps  |  1490533.9 lps 
Process Creation                       |    29881.2 lps  |    28572.8 lps 
Shell Scripts (1 concurrent)           |    23224.3 lpm  |    22607.4 lpm 
Shell Scripts (8 concurrent)           |     3531.4 lpm  |     3211.9 lpm 
System Call Overhead                   | 10385653.0 lps  | 10419979.0 lps 

Christian Borntraeger (1):
  s390/spinlock: Provide vcpu_is_preempted

Juergen Gross (1):
  x86, xen: support vcpu preempted check

Pan Xinhui (9):
  kernel/sched: introduce vcpu preempted check interface
  locking/osq: Drop the overload of osq_lock()
  kernel/locking: Drop the overload of {mutex,rwsem}_spin_on_owner
  powerpc/spinlock: support vcpu preempted check
  x86, paravirt: Add interface to support kvm/xen vcpu preempted check
  KVM: Introduce kvm_write_guest_offset_cached
  x86, kvm/x86.c: support vcpu preempted check
  x86, kernel/kvm.c: support vcpu preempted check
  Documentation: virtual: kvm: Support vcpu preempted check

 Documentation/virtual/kvm/msr.txt     |  9 ++++++++-
 arch/powerpc/include/asm/spinlock.h   |  8 ++++++++
 arch/s390/include/asm/spinlock.h      |  8 ++++++++
 arch/s390/kernel/smp.c                |  9 +++++++--
 arch/s390/lib/spinlock.c              | 25 ++++++++-----------------
 arch/x86/include/asm/paravirt_types.h |  2 ++
 arch/x86/include/asm/spinlock.h       |  8 ++++++++
 arch/x86/include/uapi/asm/kvm_para.h  |  4 +++-
 arch/x86/kernel/kvm.c                 | 12 ++++++++++++
 arch/x86/kernel/paravirt-spinlocks.c  |  6 ++++++
 arch/x86/kvm/x86.c                    | 16 ++++++++++++++++
 arch/x86/xen/spinlock.c               |  3 ++-
 include/linux/kvm_host.h              |  2 ++
 include/linux/sched.h                 | 12 ++++++++++++
 kernel/locking/mutex.c                | 15 +++++++++++++--
 kernel/locking/osq_lock.c             | 10 +++++++++-
 kernel/locking/rwsem-xadd.c           | 16 +++++++++++++---
 virt/kvm/kvm_main.c                   | 20 ++++++++++++++------
 18 files changed, 151 insertions(+), 34 deletions(-)

-- 
2.4.11

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v6 00/11] implement vcpu preempted check
@ 2016-10-28  8:11 ` Pan Xinhui
  0 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: kernellwp, jgross, David.Laight, rkrcmar, peterz, benh,
	will.deacon, Pan Xinhui, mingo, paulus, mpe, pbonzini, paulmck,
	boqun.feng

change from v5:
	split x86/kvm patch into guest/host parts.
	introduce kvm_write_guest_offset_cached.
	fix some typos.
	rebase patch onto 4.9-rc2
change from v4:
	split x86 kvm vcpu preempted check into two patches.
	add documentation patch.
	add x86 vcpu preempted check patch under xen
	add s390 vcpu preempted check patch 
change from v3:
	add x86 vcpu preempted check patch
change from v2:
	no code change, fix typos, update some comments
change from v1:
	a simpler definition of the default vcpu_is_preempted
	skip machine type check on ppc, and add config. remove dedicated macro.
	add one patch to drop overload of rwsem_spin_on_owner and mutex_spin_on_owner. 
	add more comments
	thanks to Boqun's and Peter's suggestions.

This patch set aims to fix lock holder preemption issues.

test-case:
perf record -a perf bench sched messaging -g 400 -p && perf report

18.09%  sched-messaging  [kernel.vmlinux]  [k] osq_lock
12.28%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
 5.27%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 3.89%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task
 3.64%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
 3.41%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner.is
 2.49%  sched-messaging  [kernel.vmlinux]  [k] system_call

We introduce the interface bool vcpu_is_preempted(int cpu) and use it in the
spin loops of osq_lock, rwsem_spin_on_owner and mutex_spin_on_owner.
These spin_on_owner variants could also cause RCU stalls before this patch set
was applied.

We have also observed some performance improvements in UnixBench tests.

PPC test result (improvement with the patch set):
1 copy - 0.94%
2 copy - 7.17%
4 copy - 11.9%
8 copy -  3.04%
16 copy - 15.11%

details below:
Without patch:

1 copy - File Write 4096 bufsize 8000 maxblocks      2188223.0 KBps  (30.0 s, 1 samples)
2 copy - File Write 4096 bufsize 8000 maxblocks      1804433.0 KBps  (30.0 s, 1 samples)
4 copy - File Write 4096 bufsize 8000 maxblocks      1237257.0 KBps  (30.0 s, 1 samples)
8 copy - File Write 4096 bufsize 8000 maxblocks      1032658.0 KBps  (30.0 s, 1 samples)
16 copy - File Write 4096 bufsize 8000 maxblocks       768000.0 KBps  (30.1 s, 1 samples)

With patch: 

1 copy - File Write 4096 bufsize 8000 maxblocks      2209189.0 KBps  (30.0 s, 1 samples)
2 copy - File Write 4096 bufsize 8000 maxblocks      1943816.0 KBps  (30.0 s, 1 samples)
4 copy - File Write 4096 bufsize 8000 maxblocks      1405591.0 KBps  (30.0 s, 1 samples)
8 copy - File Write 4096 bufsize 8000 maxblocks      1065080.0 KBps  (30.0 s, 1 samples)
16 copy - File Write 4096 bufsize 8000 maxblocks       904762.0 KBps  (30.0 s, 1 samples)

X86 test result:
test-case                              |   after-patch   |  before-patch
Execl Throughput                       |    18307.9 lps  |    11701.6 lps 
File Copy 1024 bufsize 2000 maxblocks  |  1352407.3 KBps |   790418.9 KBps
File Copy 256 bufsize 500 maxblocks    |   367555.6 KBps |   222867.7 KBps
File Copy 4096 bufsize 8000 maxblocks  |  3675649.7 KBps |  1780614.4 KBps
Pipe Throughput                        | 11872208.7 lps  | 11855628.9 lps 
Pipe-based Context Switching           |  1495126.5 lps  |  1490533.9 lps 
Process Creation                       |    29881.2 lps  |    28572.8 lps 
Shell Scripts (1 concurrent)           |    23224.3 lpm  |    22607.4 lpm 
Shell Scripts (8 concurrent)           |     3531.4 lpm  |     3211.9 lpm 
System Call Overhead                   | 10385653.0 lps  | 10419979.0 lps 

Christian Borntraeger (1):
  s390/spinlock: Provide vcpu_is_preempted

Juergen Gross (1):
  x86, xen: support vcpu preempted check

Pan Xinhui (9):
  kernel/sched: introduce vcpu preempted check interface
  locking/osq: Drop the overload of osq_lock()
  kernel/locking: Drop the overload of {mutex,rwsem}_spin_on_owner
  powerpc/spinlock: support vcpu preempted check
  x86, paravirt: Add interface to support kvm/xen vcpu preempted check
  KVM: Introduce kvm_write_guest_offset_cached
  x86, kvm/x86.c: support vcpu preempted check
  x86, kernel/kvm.c: support vcpu preempted check
  Documentation: virtual: kvm: Support vcpu preempted check

 Documentation/virtual/kvm/msr.txt     |  9 ++++++++-
 arch/powerpc/include/asm/spinlock.h   |  8 ++++++++
 arch/s390/include/asm/spinlock.h      |  8 ++++++++
 arch/s390/kernel/smp.c                |  9 +++++++--
 arch/s390/lib/spinlock.c              | 25 ++++++++-----------------
 arch/x86/include/asm/paravirt_types.h |  2 ++
 arch/x86/include/asm/spinlock.h       |  8 ++++++++
 arch/x86/include/uapi/asm/kvm_para.h  |  4 +++-
 arch/x86/kernel/kvm.c                 | 12 ++++++++++++
 arch/x86/kernel/paravirt-spinlocks.c  |  6 ++++++
 arch/x86/kvm/x86.c                    | 16 ++++++++++++++++
 arch/x86/xen/spinlock.c               |  3 ++-
 include/linux/kvm_host.h              |  2 ++
 include/linux/sched.h                 | 12 ++++++++++++
 kernel/locking/mutex.c                | 15 +++++++++++++--
 kernel/locking/osq_lock.c             | 10 +++++++++-
 kernel/locking/rwsem-xadd.c           | 16 +++++++++++++---
 virt/kvm/kvm_main.c                   | 20 ++++++++++++++------
 18 files changed, 151 insertions(+), 34 deletions(-)

-- 
2.4.11

^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCH v6 01/11] kernel/sched: introduce vcpu preempted check interface
  2016-10-28  8:11 ` Pan Xinhui
  (?)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: benh, paulus, mpe, mingo, peterz, paulmck, will.deacon,
	kernellwp, jgross, pbonzini, bsingharora, boqun.feng,
	borntraeger, rkrcmar, David.Laight, Pan Xinhui

This patch adds support for fixing the lock holder preemption issue.

For kernel users, bool vcpu_is_preempted(int cpu) can be used to detect
whether a given vCPU is preempted or not.

The default implementation is a macro defined as false, so the compiler can
optimize it out if the arch does not support such a vCPU preempted check.
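
As an illustration only (the real arch hooks come later in this series), an
architecture opts in by providing its own definition before this fallback is
seen; arch_check_preempted() below is a hypothetical stand-in for the
arch-specific test:

	/* e.g. in the arch's asm/spinlock.h */
	#define vcpu_is_preempted vcpu_is_preempted
	static inline bool vcpu_is_preempted(int cpu)
	{
		return arch_check_preempted(cpu);	/* hypothetical helper */
	}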

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Juergen Gross <jgross@suse.com>
---
 include/linux/sched.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 348f51b..44c1ce7 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -3506,6 +3506,18 @@ static inline void set_task_cpu(struct task_struct *p, unsigned int cpu)
 
 #endif /* CONFIG_SMP */
 
+/*
+ * In order to deal with various lock holder preemption issues, provide an
+ * interface to see if a vCPU is currently running or not.
+ *
+ * This allows us to terminate optimistic spin loops and block, analogous to
+ * the native optimistic spin heuristic of testing if the lock owner task is
+ * running or not.
+ */
+#ifndef vcpu_is_preempted
+#define vcpu_is_preempted(cpu)	false
+#endif
+
 extern long sched_setaffinity(pid_t pid, const struct cpumask *new_mask);
 extern long sched_getaffinity(pid_t pid, struct cpumask *mask);
 
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 01/11] kernel/sched: introduce vcpu preempted check interface
  2016-10-28  8:11 ` Pan Xinhui
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: kernellwp, jgross, David.Laight, rkrcmar, peterz, benh,
	will.deacon, Pan Xinhui, mingo, paulus, mpe, pbonzini, paulmck,
	boqun.feng

This patch adds support for fixing the lock holder preemption issue.

For kernel users, bool vcpu_is_preempted(int cpu) can be used to detect
whether a given vCPU is preempted or not.

The default implementation is a macro defined as false, so the compiler can
optimize it out if the arch does not support such a vCPU preempted check.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Juergen Gross <jgross@suse.com>
---
 include/linux/sched.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 348f51b..44c1ce7 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -3506,6 +3506,18 @@ static inline void set_task_cpu(struct task_struct *p, unsigned int cpu)
 
 #endif /* CONFIG_SMP */
 
+/*
+ * In order to deal with various lock holder preemption issues, provide an
+ * interface to see if a vCPU is currently running or not.
+ *
+ * This allows us to terminate optimistic spin loops and block, analogous to
+ * the native optimistic spin heuristic of testing if the lock owner task is
+ * running or not.
+ */
+#ifndef vcpu_is_preempted
+#define vcpu_is_preempted(cpu)	false
+#endif
+
 extern long sched_setaffinity(pid_t pid, const struct cpumask *new_mask);
 extern long sched_getaffinity(pid_t pid, struct cpumask *mask);
 
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 01/11] kernel/sched: introduce vcpu preempted check interface
  2016-10-28  8:11 ` Pan Xinhui
                   ` (2 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: kernellwp, jgross, David.Laight, rkrcmar, peterz, benh,
	bsingharora, will.deacon, Pan Xinhui, borntraeger, mingo, paulus,
	mpe, pbonzini, paulmck, boqun.feng

This patch adds support for fixing the lock holder preemption issue.

For kernel users, bool vcpu_is_preempted(int cpu) can be used to detect
whether a given vCPU is preempted or not.

The default implementation is a macro defined as false, so the compiler can
optimize it out if the arch does not support such a vCPU preempted check.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Juergen Gross <jgross@suse.com>
---
 include/linux/sched.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 348f51b..44c1ce7 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -3506,6 +3506,18 @@ static inline void set_task_cpu(struct task_struct *p, unsigned int cpu)
 
 #endif /* CONFIG_SMP */
 
+/*
+ * In order to deal with various lock holder preemption issues, provide an
+ * interface to see if a vCPU is currently running or not.
+ *
+ * This allows us to terminate optimistic spin loops and block, analogous to
+ * the native optimistic spin heuristic of testing if the lock owner task is
+ * running or not.
+ */
+#ifndef vcpu_is_preempted
+#define vcpu_is_preempted(cpu)	false
+#endif
+
 extern long sched_setaffinity(pid_t pid, const struct cpumask *new_mask);
 extern long sched_getaffinity(pid_t pid, struct cpumask *mask);
 
-- 
2.4.11


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 02/11] locking/osq: Drop the overload of osq_lock()
  2016-10-28  8:11 ` Pan Xinhui
                   ` (4 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  2016-10-29 16:52   ` Davidlohr Bueso
                     ` (2 more replies)
  -1 siblings, 3 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: benh, paulus, mpe, mingo, peterz, paulmck, will.deacon,
	kernellwp, jgross, pbonzini, bsingharora, boqun.feng,
	borntraeger, rkrcmar, David.Laight, Pan Xinhui

An over-committed guest with more vCPUs than pCPUs shows heavy overhead in
osq_lock().

This is because vCPU A holds the osq lock and yields out, while vCPU B waits
for the per-CPU node->locked to be set. IOW, vCPU B waits for vCPU A to run
and unlock the osq lock.

The kernel has an interface, bool vcpu_is_preempted(int cpu), to check whether
a vCPU is currently running or not. So break the spin loops once it returns
true.

test case:
perf record -a perf bench sched messaging -g 400 -p && perf report

before patch:
18.09%  sched-messaging  [kernel.vmlinux]  [k] osq_lock
12.28%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
 5.27%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 3.89%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task
 3.64%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
 3.41%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner.is
 2.49%  sched-messaging  [kernel.vmlinux]  [k] system_call

after patch:
20.68%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner
 8.45%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 4.12%  sched-messaging  [kernel.vmlinux]  [k] system_call
 3.01%  sched-messaging  [kernel.vmlinux]  [k] system_call_common
 2.83%  sched-messaging  [kernel.vmlinux]  [k] copypage_power7
 2.64%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
 2.00%  sched-messaging  [kernel.vmlinux]  [k] osq_lock

Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Juergen Gross <jgross@suse.com>
---
 kernel/locking/osq_lock.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 05a3785..39d1385 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -21,6 +21,11 @@ static inline int encode_cpu(int cpu_nr)
 	return cpu_nr + 1;
 }
 
+static inline int node_cpu(struct optimistic_spin_node *node)
+{
+	return node->cpu - 1;
+}
+
 static inline struct optimistic_spin_node *decode_cpu(int encoded_cpu_val)
 {
 	int cpu_nr = encoded_cpu_val - 1;
@@ -118,8 +123,11 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	while (!READ_ONCE(node->locked)) {
 		/*
 		 * If we need to reschedule bail... so we can block.
+		 * Use vcpu_is_preempted() to detect the lock holder preemption
+		 * issue and break out. vcpu_is_preempted() is a macro defined
+		 * as false if the arch does not support the vcpu preempted check.
 		 */
-		if (need_resched())
+		if (need_resched() || vcpu_is_preempted(node_cpu(node->prev)))
 			goto unqueue;
 
 		cpu_relax_lowlatency();
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 02/11] locking/osq: Drop the overload of osq_lock()
  2016-10-28  8:11 ` Pan Xinhui
                   ` (5 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: kernellwp, jgross, David.Laight, rkrcmar, peterz, benh,
	will.deacon, Pan Xinhui, mingo, paulus, mpe, pbonzini, paulmck,
	boqun.feng

An over-committed guest with more vCPUs than pCPUs shows heavy overhead in
osq_lock().

This is because vCPU A holds the osq lock and yields out, while vCPU B waits
for the per-CPU node->locked to be set. IOW, vCPU B waits for vCPU A to run
and unlock the osq lock.

The kernel has an interface, bool vcpu_is_preempted(int cpu), to check whether
a vCPU is currently running or not. So break the spin loops once it returns
true.

test case:
perf record -a perf bench sched messaging -g 400 -p && perf report

before patch:
18.09%  sched-messaging  [kernel.vmlinux]  [k] osq_lock
12.28%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
 5.27%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 3.89%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task
 3.64%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
 3.41%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner.is
 2.49%  sched-messaging  [kernel.vmlinux]  [k] system_call

after patch:
20.68%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner
 8.45%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 4.12%  sched-messaging  [kernel.vmlinux]  [k] system_call
 3.01%  sched-messaging  [kernel.vmlinux]  [k] system_call_common
 2.83%  sched-messaging  [kernel.vmlinux]  [k] copypage_power7
 2.64%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
 2.00%  sched-messaging  [kernel.vmlinux]  [k] osq_lock

Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Juergen Gross <jgross@suse.com>
---
 kernel/locking/osq_lock.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 05a3785..39d1385 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -21,6 +21,11 @@ static inline int encode_cpu(int cpu_nr)
 	return cpu_nr + 1;
 }
 
+static inline int node_cpu(struct optimistic_spin_node *node)
+{
+	return node->cpu - 1;
+}
+
 static inline struct optimistic_spin_node *decode_cpu(int encoded_cpu_val)
 {
 	int cpu_nr = encoded_cpu_val - 1;
@@ -118,8 +123,11 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	while (!READ_ONCE(node->locked)) {
 		/*
 		 * If we need to reschedule bail... so we can block.
+		 * Use vcpu_is_preempted() to detect the lock holder preemption
+		 * issue and break out. vcpu_is_preempted() is a macro defined
+		 * as false if the arch does not support the vcpu preempted check.
 		 */
-		if (need_resched())
+		if (need_resched() || vcpu_is_preempted(node_cpu(node->prev)))
 			goto unqueue;
 
 		cpu_relax_lowlatency();
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 02/11] locking/osq: Drop the overload of osq_lock()
  2016-10-28  8:11 ` Pan Xinhui
                   ` (3 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: kernellwp, jgross, David.Laight, rkrcmar, peterz, benh,
	bsingharora, will.deacon, Pan Xinhui, borntraeger, mingo, paulus,
	mpe, pbonzini, paulmck, boqun.feng

An over-committed guest with more vCPUs than pCPUs shows heavy overhead in
osq_lock().

This is because vCPU A holds the osq lock and yields out, while vCPU B waits
for the per-CPU node->locked to be set. IOW, vCPU B waits for vCPU A to run
and unlock the osq lock.

The kernel has an interface, bool vcpu_is_preempted(int cpu), to check whether
a vCPU is currently running or not. So break the spin loops once it returns
true.

test case:
perf record -a perf bench sched messaging -g 400 -p && perf report

before patch:
18.09%  sched-messaging  [kernel.vmlinux]  [k] osq_lock
12.28%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
 5.27%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 3.89%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task
 3.64%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
 3.41%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner.is
 2.49%  sched-messaging  [kernel.vmlinux]  [k] system_call

after patch:
20.68%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner
 8.45%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 4.12%  sched-messaging  [kernel.vmlinux]  [k] system_call
 3.01%  sched-messaging  [kernel.vmlinux]  [k] system_call_common
 2.83%  sched-messaging  [kernel.vmlinux]  [k] copypage_power7
 2.64%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
 2.00%  sched-messaging  [kernel.vmlinux]  [k] osq_lock

Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Juergen Gross <jgross@suse.com>
---
 kernel/locking/osq_lock.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 05a3785..39d1385 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -21,6 +21,11 @@ static inline int encode_cpu(int cpu_nr)
 	return cpu_nr + 1;
 }
 
+static inline int node_cpu(struct optimistic_spin_node *node)
+{
+	return node->cpu - 1;
+}
+
 static inline struct optimistic_spin_node *decode_cpu(int encoded_cpu_val)
 {
 	int cpu_nr = encoded_cpu_val - 1;
@@ -118,8 +123,11 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	while (!READ_ONCE(node->locked)) {
 		/*
 		 * If we need to reschedule bail... so we can block.
+		 * Use vcpu_is_preempted() to detect the lock holder preemption
+		 * issue and break out. vcpu_is_preempted() is a macro defined
+		 * as false if the arch does not support the vcpu preempted check.
 		 */
-		if (need_resched())
+		if (need_resched() || vcpu_is_preempted(node_cpu(node->prev)))
 			goto unqueue;
 
 		cpu_relax_lowlatency();
-- 
2.4.11


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 03/11] kernel/locking: Drop the overload of {mutex,rwsem}_spin_on_owner
  2016-10-28  8:11 ` Pan Xinhui
  (?)
@ 2016-10-28  8:11   ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: benh, paulus, mpe, mingo, peterz, paulmck, will.deacon,
	kernellwp, jgross, pbonzini, bsingharora, boqun.feng,
	borntraeger, rkrcmar, David.Laight, Pan Xinhui

An over-committed guest with more vCPUs than pCPUs shows heavy overhead in
the two *_spin_on_owner() paths. This is caused by the lock holder preemption
issue.

The kernel has an interface, bool vcpu_is_preempted(int cpu), to check whether
a vCPU is currently running or not. So break the spin loops once it returns
true.

test-case:
perf record -a perf bench sched messaging -g 400 -p && perf report

before patch:
20.68%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner
 8.45%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 4.12%  sched-messaging  [kernel.vmlinux]  [k] system_call
 3.01%  sched-messaging  [kernel.vmlinux]  [k] system_call_common
 2.83%  sched-messaging  [kernel.vmlinux]  [k] copypage_power7
 2.64%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
 2.00%  sched-messaging  [kernel.vmlinux]  [k] osq_lock

after patch:
 9.99%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 5.28%  sched-messaging  [unknown]         [H] 0xc0000000000768e0
 4.27%  sched-messaging  [kernel.vmlinux]  [k] __copy_tofrom_user_power7
 3.77%  sched-messaging  [kernel.vmlinux]  [k] copypage_power7
 3.24%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
 3.02%  sched-messaging  [kernel.vmlinux]  [k] system_call
 2.69%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task

Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Juergen Gross <jgross@suse.com>
---
 kernel/locking/mutex.c      | 15 +++++++++++++--
 kernel/locking/rwsem-xadd.c | 16 +++++++++++++---
 2 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index a70b90d..82108f5 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -236,7 +236,13 @@ bool mutex_spin_on_owner(struct mutex *lock, struct task_struct *owner)
 		 */
 		barrier();
 
-		if (!owner->on_cpu || need_resched()) {
+		/*
+		 * Use vcpu_is_preempted() to detect the lock holder preemption
+		 * issue and break out. vcpu_is_preempted() is a macro defined
+		 * as false if the arch does not support the vcpu preempted check.
+		 */
+		if (!owner->on_cpu || need_resched() ||
+				vcpu_is_preempted(task_cpu(owner))) {
 			ret = false;
 			break;
 		}
@@ -261,8 +267,13 @@ static inline int mutex_can_spin_on_owner(struct mutex *lock)
 
 	rcu_read_lock();
 	owner = READ_ONCE(lock->owner);
+
+	/*
+	 * Due to the lock holder preemption issue, we also skip spinning if
+	 * the task is not on a cpu or its cpu is preempted.
+	 */
 	if (owner)
-		retval = owner->on_cpu;
+		retval = owner->on_cpu && !vcpu_is_preempted(task_cpu(owner));
 	rcu_read_unlock();
 	/*
 	 * if lock->owner is not set, the mutex owner may have just acquired
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 2337b4b..0897179 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -336,7 +336,11 @@ static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
 		goto done;
 	}
 
-	ret = owner->on_cpu;
+	/*
+	 * Due to the lock holder preemption issue, we also skip spinning if
+	 * the task is not on a cpu or its cpu is preempted.
+	 */
+	ret = owner->on_cpu && !vcpu_is_preempted(task_cpu(owner));
 done:
 	rcu_read_unlock();
 	return ret;
@@ -362,8 +366,14 @@ static noinline bool rwsem_spin_on_owner(struct rw_semaphore *sem)
 		 */
 		barrier();
 
-		/* abort spinning when need_resched or owner is not running */
-		if (!owner->on_cpu || need_resched()) {
+		/*
+		 * Abort spinning when need_resched() is set, the owner is not
+		 * running, or the owner's cpu is preempted. vcpu_is_preempted()
+		 * is a macro defined as false if the arch does not support the
+		 * vcpu preempted check.
+		 */
+		if (!owner->on_cpu || need_resched() ||
+				vcpu_is_preempted(task_cpu(owner))) {
 			rcu_read_unlock();
 			return false;
 		}
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 03/11] kernel/locking: Drop the overload of {mutex, rwsem}_spin_on_owner
@ 2016-10-28  8:11   ` Pan Xinhui
  0 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: kernellwp, jgross, David.Laight, rkrcmar, peterz, benh,
	will.deacon, Pan Xinhui, mingo, paulus, mpe, pbonzini, paulmck,
	boqun.feng

An over-committed guest with more vCPUs than pCPUs shows heavy overhead in
the two *_spin_on_owner() paths. This is caused by the lock holder preemption
issue.

The kernel has an interface, bool vcpu_is_preempted(int cpu), to check whether
a vCPU is currently running or not. So break the spin loops once it returns
true.

test-case:
perf record -a perf bench sched messaging -g 400 -p && perf report

before patch:
20.68%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner
 8.45%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 4.12%  sched-messaging  [kernel.vmlinux]  [k] system_call
 3.01%  sched-messaging  [kernel.vmlinux]  [k] system_call_common
 2.83%  sched-messaging  [kernel.vmlinux]  [k] copypage_power7
 2.64%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
 2.00%  sched-messaging  [kernel.vmlinux]  [k] osq_lock

after patch:
 9.99%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 5.28%  sched-messaging  [unknown]         [H] 0xc0000000000768e0
 4.27%  sched-messaging  [kernel.vmlinux]  [k] __copy_tofrom_user_power7
 3.77%  sched-messaging  [kernel.vmlinux]  [k] copypage_power7
 3.24%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
 3.02%  sched-messaging  [kernel.vmlinux]  [k] system_call
 2.69%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task

Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Juergen Gross <jgross@suse.com>
---
 kernel/locking/mutex.c      | 15 +++++++++++++--
 kernel/locking/rwsem-xadd.c | 16 +++++++++++++---
 2 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index a70b90d..82108f5 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -236,7 +236,13 @@ bool mutex_spin_on_owner(struct mutex *lock, struct task_struct *owner)
 		 */
 		barrier();
 
-		if (!owner->on_cpu || need_resched()) {
+		/*
+		 * Use vcpu_is_preempted() to detect the lock holder preemption
+		 * issue and break out. vcpu_is_preempted() is a macro defined
+		 * as false if the arch does not support the vcpu preempted check.
+		 */
+		if (!owner->on_cpu || need_resched() ||
+				vcpu_is_preempted(task_cpu(owner))) {
 			ret = false;
 			break;
 		}
@@ -261,8 +267,13 @@ static inline int mutex_can_spin_on_owner(struct mutex *lock)
 
 	rcu_read_lock();
 	owner = READ_ONCE(lock->owner);
+
+	/*
+	 * Due to the lock holder preemption issue, we also skip spinning if
+	 * the task is not on a cpu or its cpu is preempted.
+	 */
 	if (owner)
-		retval = owner->on_cpu;
+		retval = owner->on_cpu && !vcpu_is_preempted(task_cpu(owner));
 	rcu_read_unlock();
 	/*
 	 * if lock->owner is not set, the mutex owner may have just acquired
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 2337b4b..0897179 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -336,7 +336,11 @@ static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
 		goto done;
 	}
 
-	ret = owner->on_cpu;
+	/*
+	 * Due to the lock holder preemption issue, we also skip spinning if
+	 * the task is not on a cpu or its cpu is preempted.
+	 */
+	ret = owner->on_cpu && !vcpu_is_preempted(task_cpu(owner));
 done:
 	rcu_read_unlock();
 	return ret;
@@ -362,8 +366,14 @@ static noinline bool rwsem_spin_on_owner(struct rw_semaphore *sem)
 		 */
 		barrier();
 
-		/* abort spinning when need_resched or owner is not running */
-		if (!owner->on_cpu || need_resched()) {
+		/*
+		 * Abort spinning when need_resched() is set, the owner is not
+		 * running, or the owner's cpu is preempted. vcpu_is_preempted()
+		 * is a macro defined as false if the arch does not support the
+		 * vcpu preempted check.
+		 */
+		if (!owner->on_cpu || need_resched() ||
+				vcpu_is_preempted(task_cpu(owner))) {
 			rcu_read_unlock();
 			return false;
 		}
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 03/11] kernel/locking: Drop the overload of {mutex, rwsem}_spin_on_owner
@ 2016-10-28  8:11   ` Pan Xinhui
  0 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: benh, paulus, mpe, mingo, peterz, paulmck, will.deacon,
	kernellwp, jgross, pbonzini, bsingharora, boqun.feng,
	borntraeger, rkrcmar, David.Laight, Pan Xinhui

An over-committed guest with more vCPUs than pCPUs shows heavy overhead in
the two *_spin_on_owner() paths. This is caused by the lock holder preemption
issue.

The kernel has an interface, bool vcpu_is_preempted(int cpu), to check whether
a vCPU is currently running or not. So break the spin loops once it returns
true.

test-case:
perf record -a perf bench sched messaging -g 400 -p && perf report

before patch:
20.68%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner
 8.45%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 4.12%  sched-messaging  [kernel.vmlinux]  [k] system_call
 3.01%  sched-messaging  [kernel.vmlinux]  [k] system_call_common
 2.83%  sched-messaging  [kernel.vmlinux]  [k] copypage_power7
 2.64%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
 2.00%  sched-messaging  [kernel.vmlinux]  [k] osq_lock

after patch:
 9.99%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 5.28%  sched-messaging  [unknown]         [H] 0xc0000000000768e0
 4.27%  sched-messaging  [kernel.vmlinux]  [k] __copy_tofrom_user_power7
 3.77%  sched-messaging  [kernel.vmlinux]  [k] copypage_power7
 3.24%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
 3.02%  sched-messaging  [kernel.vmlinux]  [k] system_call
 2.69%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task

Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Juergen Gross <jgross@suse.com>
---
 kernel/locking/mutex.c      | 15 +++++++++++++--
 kernel/locking/rwsem-xadd.c | 16 +++++++++++++---
 2 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index a70b90d..82108f5 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -236,7 +236,13 @@ bool mutex_spin_on_owner(struct mutex *lock, struct task_struct *owner)
 		 */
 		barrier();
 
-		if (!owner->on_cpu || need_resched()) {
+		/*
+		 * Use vcpu_is_preempted() to detect the lock holder preemption
+		 * issue and break out. vcpu_is_preempted() is a macro defined
+		 * as false if the arch does not support the vcpu preempted check.
+		 */
+		if (!owner->on_cpu || need_resched() ||
+				vcpu_is_preempted(task_cpu(owner))) {
 			ret = false;
 			break;
 		}
@@ -261,8 +267,13 @@ static inline int mutex_can_spin_on_owner(struct mutex *lock)
 
 	rcu_read_lock();
 	owner = READ_ONCE(lock->owner);
+
+	/*
+	 * Due to the lock holder preemption issue, we also skip spinning if
+	 * the task is not on a cpu or its cpu is preempted.
+	 */
 	if (owner)
-		retval = owner->on_cpu;
+		retval = owner->on_cpu && !vcpu_is_preempted(task_cpu(owner));
 	rcu_read_unlock();
 	/*
 	 * if lock->owner is not set, the mutex owner may have just acquired
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 2337b4b..0897179 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -336,7 +336,11 @@ static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
 		goto done;
 	}
 
-	ret = owner->on_cpu;
+	/*
+	 * Due to the lock holder preemption issue, we also skip spinning if
+	 * the task is not on a cpu or its cpu is preempted.
+	 */
+	ret = owner->on_cpu && !vcpu_is_preempted(task_cpu(owner));
 done:
 	rcu_read_unlock();
 	return ret;
@@ -362,8 +366,14 @@ static noinline bool rwsem_spin_on_owner(struct rw_semaphore *sem)
 		 */
 		barrier();
 
-		/* abort spinning when need_resched or owner is not running */
-		if (!owner->on_cpu || need_resched()) {
+		/*
+		 * Abort spinning when need_resched() is set, the owner is not
+		 * running, or the owner's cpu is preempted. vcpu_is_preempted()
+		 * is a macro defined as false if the arch does not support the
+		 * vcpu preempted check.
+		 */
+		if (!owner->on_cpu || need_resched() ||
+				vcpu_is_preempted(task_cpu(owner))) {
 			rcu_read_unlock();
 			return false;
 		}
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 03/11] kernel/locking: Drop the overload of {mutex, rwsem}_spin_on_owner
  2016-10-28  8:11 ` Pan Xinhui
                   ` (6 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: kernellwp, jgross, David.Laight, rkrcmar, peterz, benh,
	bsingharora, will.deacon, Pan Xinhui, borntraeger, mingo, paulus,
	mpe, pbonzini, paulmck, boqun.feng

An over-committed guest with more vCPUs than pCPUs shows heavy overhead in
the two *_spin_on_owner() paths. This is caused by the lock holder preemption
issue.

The kernel has an interface, bool vcpu_is_preempted(int cpu), to check whether
a vCPU is currently running or not. So break the spin loops once it returns
true.

test-case:
perf record -a perf bench sched messaging -g 400 -p && perf report

before patch:
20.68%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner
 8.45%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 4.12%  sched-messaging  [kernel.vmlinux]  [k] system_call
 3.01%  sched-messaging  [kernel.vmlinux]  [k] system_call_common
 2.83%  sched-messaging  [kernel.vmlinux]  [k] copypage_power7
 2.64%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
 2.00%  sched-messaging  [kernel.vmlinux]  [k] osq_lock

after patch:
 9.99%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
 5.28%  sched-messaging  [unknown]         [H] 0xc0000000000768e0
 4.27%  sched-messaging  [kernel.vmlinux]  [k] __copy_tofrom_user_power7
 3.77%  sched-messaging  [kernel.vmlinux]  [k] copypage_power7
 3.24%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
 3.02%  sched-messaging  [kernel.vmlinux]  [k] system_call
 2.69%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task

Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Juergen Gross <jgross@suse.com>
---
 kernel/locking/mutex.c      | 15 +++++++++++++--
 kernel/locking/rwsem-xadd.c | 16 +++++++++++++---
 2 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index a70b90d..82108f5 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -236,7 +236,13 @@ bool mutex_spin_on_owner(struct mutex *lock, struct task_struct *owner)
 		 */
 		barrier();
 
-		if (!owner->on_cpu || need_resched()) {
+		/*
+		 * Use vcpu_is_preempted() to detect the lock holder preemption
+		 * issue and break out. vcpu_is_preempted() is a macro defined
+		 * as false if the arch does not support the vcpu preempted check.
+		 */
+		if (!owner->on_cpu || need_resched() ||
+				vcpu_is_preempted(task_cpu(owner))) {
 			ret = false;
 			break;
 		}
@@ -261,8 +267,13 @@ static inline int mutex_can_spin_on_owner(struct mutex *lock)
 
 	rcu_read_lock();
 	owner = READ_ONCE(lock->owner);
+
+	/*
+	 * Due to the lock holder preemption issue, we also skip spinning if
+	 * the task is not on a cpu or its cpu is preempted.
+	 */
 	if (owner)
-		retval = owner->on_cpu;
+		retval = owner->on_cpu && !vcpu_is_preempted(task_cpu(owner));
 	rcu_read_unlock();
 	/*
 	 * if lock->owner is not set, the mutex owner may have just acquired
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 2337b4b..0897179 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -336,7 +336,11 @@ static inline bool rwsem_can_spin_on_owner(struct rw_semaphore *sem)
 		goto done;
 	}
 
-	ret = owner->on_cpu;
+	/*
+	 * Due to the lock holder preemption issue, we also skip spinning if
+	 * the task is not on a cpu or its cpu is preempted.
+	 */
+	ret = owner->on_cpu && !vcpu_is_preempted(task_cpu(owner));
 done:
 	rcu_read_unlock();
 	return ret;
@@ -362,8 +366,14 @@ static noinline bool rwsem_spin_on_owner(struct rw_semaphore *sem)
 		 */
 		barrier();
 
-		/* abort spinning when need_resched or owner is not running */
-		if (!owner->on_cpu || need_resched()) {
+		/*
+		 * Abort spinning when need_resched() is set, the owner is not
+		 * running, or the owner's cpu is preempted. vcpu_is_preempted()
+		 * is a macro defined as false if the arch does not support the
+		 * vcpu preempted check.
+		 */
+		if (!owner->on_cpu || need_resched() ||
+				vcpu_is_preempted(task_cpu(owner))) {
 			rcu_read_unlock();
 			return false;
 		}
-- 
2.4.11


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 04/11] powerpc/spinlock: support vcpu preempted check
  2016-10-28  8:11 ` Pan Xinhui
                   ` (10 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: benh, paulus, mpe, mingo, peterz, paulmck, will.deacon,
	kernellwp, jgross, pbonzini, bsingharora, boqun.feng,
	borntraeger, rkrcmar, David.Laight, Pan Xinhui

This is to fix some lock holder preemption issues. Some other lock
implementations do a spin loop before acquiring the lock itself. Currently
the kernel has an interface, bool vcpu_is_preempted(int cpu). It takes the
cpu as a parameter and returns true if that cpu is preempted. The kernel can
then break out of the spin loops based on the return value of
vcpu_is_preempted.

As the kernel already uses this interface, let's support it here.

Only pSeries needs to support it, but powerNV is built into the same kernel
image as pSeries, so we would have to return false when running as powerNV.
However, lppaca->yield_count stays zero on powerNV, so we can just skip the
machine type check.

Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/spinlock.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h
index fa37fe9..8c1b913 100644
--- a/arch/powerpc/include/asm/spinlock.h
+++ b/arch/powerpc/include/asm/spinlock.h
@@ -52,6 +52,14 @@
 #define SYNC_IO
 #endif
 
+#ifdef CONFIG_PPC_PSERIES
+#define vcpu_is_preempted vcpu_is_preempted
+static inline bool vcpu_is_preempted(int cpu)
+{
+	return !!(be32_to_cpu(lppaca_of(cpu).yield_count) & 1);
+}
+#endif
+
 static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
 {
 	return lock.slock == 0;
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 04/11] powerpc/spinlock: support vcpu preempted check
  2016-10-28  8:11 ` Pan Xinhui
                   ` (9 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: kernellwp, jgross, David.Laight, rkrcmar, peterz, benh,
	will.deacon, Pan Xinhui, mingo, paulus, mpe, pbonzini, paulmck,
	boqun.feng

This is to fix some lock holder preemption issues. Some other lock
implementations do a spin loop before acquiring the lock itself. Currently
the kernel has an interface, bool vcpu_is_preempted(int cpu). It takes the
cpu as a parameter and returns true if that cpu is preempted. The kernel can
then break out of the spin loops based on the return value of
vcpu_is_preempted.

As the kernel already uses this interface, let's support it here.

Only pSeries needs to support it, but powerNV is built into the same kernel
image as pSeries, so we would have to return false when running as powerNV.
However, lppaca->yield_count stays zero on powerNV, so we can just skip the
machine type check.

Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/spinlock.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h
index fa37fe9..8c1b913 100644
--- a/arch/powerpc/include/asm/spinlock.h
+++ b/arch/powerpc/include/asm/spinlock.h
@@ -52,6 +52,14 @@
 #define SYNC_IO
 #endif
 
+#ifdef CONFIG_PPC_PSERIES
+#define vcpu_is_preempted vcpu_is_preempted
+static inline bool vcpu_is_preempted(int cpu)
+{
+	return !!(be32_to_cpu(lppaca_of(cpu).yield_count) & 1);
+}
+#endif
+
 static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
 {
 	return lock.slock == 0;
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 04/11] powerpc/spinlock: support vcpu preempted check
  2016-10-28  8:11 ` Pan Xinhui
                   ` (8 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: kernellwp, jgross, David.Laight, rkrcmar, peterz, benh,
	bsingharora, will.deacon, Pan Xinhui, borntraeger, mingo, paulus,
	mpe, pbonzini, paulmck, boqun.feng

This is to fix some lock holder preemption issues. Some other lock
implementations do a spin loop before acquiring the lock itself. Currently
the kernel has an interface, bool vcpu_is_preempted(int cpu). It takes the
cpu as a parameter and returns true if that cpu is preempted. The kernel can
then break out of the spin loops based on the return value of
vcpu_is_preempted.

As the kernel already uses this interface, let's support it here.

Only pSeries needs to support it, but powerNV is built into the same kernel
image as pSeries, so we would have to return false when running as powerNV.
However, lppaca->yield_count stays zero on powerNV, so we can just skip the
machine type check.

Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/spinlock.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h
index fa37fe9..8c1b913 100644
--- a/arch/powerpc/include/asm/spinlock.h
+++ b/arch/powerpc/include/asm/spinlock.h
@@ -52,6 +52,14 @@
 #define SYNC_IO
 #endif
 
+#ifdef CONFIG_PPC_PSERIES
+#define vcpu_is_preempted vcpu_is_preempted
+static inline bool vcpu_is_preempted(int cpu)
+{
+	return !!(be32_to_cpu(lppaca_of(cpu).yield_count) & 1);
+}
+#endif
+
 static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
 {
 	return lock.slock == 0;
-- 
2.4.11


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 05/11] s390/spinlock: Provide vcpu_is_preempted
  2016-10-28  8:11 ` Pan Xinhui
                   ` (11 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: benh, paulus, mpe, mingo, peterz, paulmck, will.deacon,
	kernellwp, jgross, pbonzini, bsingharora, boqun.feng,
	borntraeger, rkrcmar, David.Laight

From: Christian Borntraeger <borntraeger@de.ibm.com>

This implements the s390 backend for commit
"kernel/sched: introduce vcpu preempted check interface"
by reworking the existing smp_vcpu_scheduled into
arch_vcpu_is_preempted. We can then also get rid of the
local cpu_is_preempted function by moving the
CIF_ENABLED_WAIT test into arch_vcpu_is_preempted.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
---
 arch/s390/include/asm/spinlock.h |  8 ++++++++
 arch/s390/kernel/smp.c           |  9 +++++++--
 arch/s390/lib/spinlock.c         | 25 ++++++++-----------------
 3 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/arch/s390/include/asm/spinlock.h b/arch/s390/include/asm/spinlock.h
index 7e9e09f..7ecd890 100644
--- a/arch/s390/include/asm/spinlock.h
+++ b/arch/s390/include/asm/spinlock.h
@@ -23,6 +23,14 @@ _raw_compare_and_swap(unsigned int *lock, unsigned int old, unsigned int new)
 	return __sync_bool_compare_and_swap(lock, old, new);
 }
 
+#ifndef CONFIG_SMP
+static inline bool arch_vcpu_is_preempted(int cpu) { return false; }
+#else
+bool arch_vcpu_is_preempted(int cpu);
+#endif
+
+#define vcpu_is_preempted arch_vcpu_is_preempted
+
 /*
  * Simple spin lock operations.  There are two variants, one clears IRQ's
  * on the local processor, one does not.
diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
index 35531fe..b988ed1 100644
--- a/arch/s390/kernel/smp.c
+++ b/arch/s390/kernel/smp.c
@@ -368,10 +368,15 @@ int smp_find_processor_id(u16 address)
 	return -1;
 }
 
-int smp_vcpu_scheduled(int cpu)
+bool arch_vcpu_is_preempted(int cpu)
 {
-	return pcpu_running(pcpu_devices + cpu);
+	if (test_cpu_flag_of(CIF_ENABLED_WAIT, cpu))
+		return false;
+	if (pcpu_running(pcpu_devices + cpu))
+		return false;
+	return true;
 }
+EXPORT_SYMBOL(arch_vcpu_is_preempted);
 
 void smp_yield_cpu(int cpu)
 {
diff --git a/arch/s390/lib/spinlock.c b/arch/s390/lib/spinlock.c
index e5f50a7..e48a48e 100644
--- a/arch/s390/lib/spinlock.c
+++ b/arch/s390/lib/spinlock.c
@@ -37,15 +37,6 @@ static inline void _raw_compare_and_delay(unsigned int *lock, unsigned int old)
 	asm(".insn rsy,0xeb0000000022,%0,0,%1" : : "d" (old), "Q" (*lock));
 }
 
-static inline int cpu_is_preempted(int cpu)
-{
-	if (test_cpu_flag_of(CIF_ENABLED_WAIT, cpu))
-		return 0;
-	if (smp_vcpu_scheduled(cpu))
-		return 0;
-	return 1;
-}
-
 void arch_spin_lock_wait(arch_spinlock_t *lp)
 {
 	unsigned int cpu = SPINLOCK_LOCKVAL;
@@ -62,7 +53,7 @@ void arch_spin_lock_wait(arch_spinlock_t *lp)
 			continue;
 		}
 		/* First iteration: check if the lock owner is running. */
-		if (first_diag && cpu_is_preempted(~owner)) {
+		if (first_diag && arch_vcpu_is_preempted(~owner)) {
 			smp_yield_cpu(~owner);
 			first_diag = 0;
 			continue;
@@ -81,7 +72,7 @@ void arch_spin_lock_wait(arch_spinlock_t *lp)
 		 * yield the CPU unconditionally. For LPAR rely on the
 		 * sense running status.
 		 */
-		if (!MACHINE_IS_LPAR || cpu_is_preempted(~owner)) {
+		if (!MACHINE_IS_LPAR || arch_vcpu_is_preempted(~owner)) {
 			smp_yield_cpu(~owner);
 			first_diag = 0;
 		}
@@ -108,7 +99,7 @@ void arch_spin_lock_wait_flags(arch_spinlock_t *lp, unsigned long flags)
 			continue;
 		}
 		/* Check if the lock owner is running. */
-		if (first_diag && cpu_is_preempted(~owner)) {
+		if (first_diag && arch_vcpu_is_preempted(~owner)) {
 			smp_yield_cpu(~owner);
 			first_diag = 0;
 			continue;
@@ -127,7 +118,7 @@ void arch_spin_lock_wait_flags(arch_spinlock_t *lp, unsigned long flags)
 		 * yield the CPU unconditionally. For LPAR rely on the
 		 * sense running status.
 		 */
-		if (!MACHINE_IS_LPAR || cpu_is_preempted(~owner)) {
+		if (!MACHINE_IS_LPAR || arch_vcpu_is_preempted(~owner)) {
 			smp_yield_cpu(~owner);
 			first_diag = 0;
 		}
@@ -165,7 +156,7 @@ void _raw_read_lock_wait(arch_rwlock_t *rw)
 	owner = 0;
 	while (1) {
 		if (count-- <= 0) {
-			if (owner && cpu_is_preempted(~owner))
+			if (owner && arch_vcpu_is_preempted(~owner))
 				smp_yield_cpu(~owner);
 			count = spin_retry;
 		}
@@ -211,7 +202,7 @@ void _raw_write_lock_wait(arch_rwlock_t *rw, unsigned int prev)
 	owner = 0;
 	while (1) {
 		if (count-- <= 0) {
-			if (owner && cpu_is_preempted(~owner))
+			if (owner && arch_vcpu_is_preempted(~owner))
 				smp_yield_cpu(~owner);
 			count = spin_retry;
 		}
@@ -241,7 +232,7 @@ void _raw_write_lock_wait(arch_rwlock_t *rw)
 	owner = 0;
 	while (1) {
 		if (count-- <= 0) {
-			if (owner && cpu_is_preempted(~owner))
+			if (owner && arch_vcpu_is_preempted(~owner))
 				smp_yield_cpu(~owner);
 			count = spin_retry;
 		}
@@ -285,7 +276,7 @@ void arch_lock_relax(unsigned int cpu)
 {
 	if (!cpu)
 		return;
-	if (MACHINE_IS_LPAR && !cpu_is_preempted(~cpu))
+	if (MACHINE_IS_LPAR && !arch_vcpu_is_preempted(~cpu))
 		return;
 	smp_yield_cpu(~cpu);
 }
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 05/11] s390/spinlock: Provide vcpu_is_preempted
  2016-10-28  8:11 ` Pan Xinhui
                   ` (13 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: kernellwp, jgross, David.Laight, rkrcmar, peterz, benh,
	will.deacon, mingo, paulus, mpe, pbonzini, paulmck, boqun.feng

From: Christian Borntraeger <borntraeger@de.ibm.com>

This implements the s390 backend for commit
"kernel/sched: introduce vcpu preempted check interface"
by reworking the existing smp_vcpu_scheduled() into
arch_vcpu_is_preempted(). We can then also get rid of the
local cpu_is_preempted() helper by moving the
CIF_ENABLED_WAIT test into arch_vcpu_is_preempted().
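
As a quick illustration (a sketch, not part of the patch), the reworked
helper lets an s390 wait loop decide whether yielding makes sense with a
single call; this mirrors the pattern used in arch/s390/lib/spinlock.c in
the diff below, where ~owner is the existing encoding of the owning cpu:

	/* sketch only: give up the cpu if the lock owner's vcpu is not running */
	if (arch_vcpu_is_preempted(~owner))
		smp_yield_cpu(~owner);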

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
---
 arch/s390/include/asm/spinlock.h |  8 ++++++++
 arch/s390/kernel/smp.c           |  9 +++++++--
 arch/s390/lib/spinlock.c         | 25 ++++++++-----------------
 3 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/arch/s390/include/asm/spinlock.h b/arch/s390/include/asm/spinlock.h
index 7e9e09f..7ecd890 100644
--- a/arch/s390/include/asm/spinlock.h
+++ b/arch/s390/include/asm/spinlock.h
@@ -23,6 +23,14 @@ _raw_compare_and_swap(unsigned int *lock, unsigned int old, unsigned int new)
 	return __sync_bool_compare_and_swap(lock, old, new);
 }
 
+#ifndef CONFIG_SMP
+static inline bool arch_vcpu_is_preempted(int cpu) { return false; }
+#else
+bool arch_vcpu_is_preempted(int cpu);
+#endif
+
+#define vcpu_is_preempted arch_vcpu_is_preempted
+
 /*
  * Simple spin lock operations.  There are two variants, one clears IRQ's
  * on the local processor, one does not.
diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
index 35531fe..b988ed1 100644
--- a/arch/s390/kernel/smp.c
+++ b/arch/s390/kernel/smp.c
@@ -368,10 +368,15 @@ int smp_find_processor_id(u16 address)
 	return -1;
 }
 
-int smp_vcpu_scheduled(int cpu)
+bool arch_vcpu_is_preempted(int cpu)
 {
-	return pcpu_running(pcpu_devices + cpu);
+	if (test_cpu_flag_of(CIF_ENABLED_WAIT, cpu))
+		return false;
+	if (pcpu_running(pcpu_devices + cpu))
+		return false;
+	return true;
 }
+EXPORT_SYMBOL(arch_vcpu_is_preempted);
 
 void smp_yield_cpu(int cpu)
 {
diff --git a/arch/s390/lib/spinlock.c b/arch/s390/lib/spinlock.c
index e5f50a7..e48a48e 100644
--- a/arch/s390/lib/spinlock.c
+++ b/arch/s390/lib/spinlock.c
@@ -37,15 +37,6 @@ static inline void _raw_compare_and_delay(unsigned int *lock, unsigned int old)
 	asm(".insn rsy,0xeb0000000022,%0,0,%1" : : "d" (old), "Q" (*lock));
 }
 
-static inline int cpu_is_preempted(int cpu)
-{
-	if (test_cpu_flag_of(CIF_ENABLED_WAIT, cpu))
-		return 0;
-	if (smp_vcpu_scheduled(cpu))
-		return 0;
-	return 1;
-}
-
 void arch_spin_lock_wait(arch_spinlock_t *lp)
 {
 	unsigned int cpu = SPINLOCK_LOCKVAL;
@@ -62,7 +53,7 @@ void arch_spin_lock_wait(arch_spinlock_t *lp)
 			continue;
 		}
 		/* First iteration: check if the lock owner is running. */
-		if (first_diag && cpu_is_preempted(~owner)) {
+		if (first_diag && arch_vcpu_is_preempted(~owner)) {
 			smp_yield_cpu(~owner);
 			first_diag = 0;
 			continue;
@@ -81,7 +72,7 @@ void arch_spin_lock_wait(arch_spinlock_t *lp)
 		 * yield the CPU unconditionally. For LPAR rely on the
 		 * sense running status.
 		 */
-		if (!MACHINE_IS_LPAR || cpu_is_preempted(~owner)) {
+		if (!MACHINE_IS_LPAR || arch_vcpu_is_preempted(~owner)) {
 			smp_yield_cpu(~owner);
 			first_diag = 0;
 		}
@@ -108,7 +99,7 @@ void arch_spin_lock_wait_flags(arch_spinlock_t *lp, unsigned long flags)
 			continue;
 		}
 		/* Check if the lock owner is running. */
-		if (first_diag && cpu_is_preempted(~owner)) {
+		if (first_diag && arch_vcpu_is_preempted(~owner)) {
 			smp_yield_cpu(~owner);
 			first_diag = 0;
 			continue;
@@ -127,7 +118,7 @@ void arch_spin_lock_wait_flags(arch_spinlock_t *lp, unsigned long flags)
 		 * yield the CPU unconditionally. For LPAR rely on the
 		 * sense running status.
 		 */
-		if (!MACHINE_IS_LPAR || cpu_is_preempted(~owner)) {
+		if (!MACHINE_IS_LPAR || arch_vcpu_is_preempted(~owner)) {
 			smp_yield_cpu(~owner);
 			first_diag = 0;
 		}
@@ -165,7 +156,7 @@ void _raw_read_lock_wait(arch_rwlock_t *rw)
 	owner = 0;
 	while (1) {
 		if (count-- <= 0) {
-			if (owner && cpu_is_preempted(~owner))
+			if (owner && arch_vcpu_is_preempted(~owner))
 				smp_yield_cpu(~owner);
 			count = spin_retry;
 		}
@@ -211,7 +202,7 @@ void _raw_write_lock_wait(arch_rwlock_t *rw, unsigned int prev)
 	owner = 0;
 	while (1) {
 		if (count-- <= 0) {
-			if (owner && cpu_is_preempted(~owner))
+			if (owner && arch_vcpu_is_preempted(~owner))
 				smp_yield_cpu(~owner);
 			count = spin_retry;
 		}
@@ -241,7 +232,7 @@ void _raw_write_lock_wait(arch_rwlock_t *rw)
 	owner = 0;
 	while (1) {
 		if (count-- <= 0) {
-			if (owner && cpu_is_preempted(~owner))
+			if (owner && arch_vcpu_is_preempted(~owner))
 				smp_yield_cpu(~owner);
 			count = spin_retry;
 		}
@@ -285,7 +276,7 @@ void arch_lock_relax(unsigned int cpu)
 {
 	if (!cpu)
 		return;
-	if (MACHINE_IS_LPAR && !cpu_is_preempted(~cpu))
+	if (MACHINE_IS_LPAR && !arch_vcpu_is_preempted(~cpu))
 		return;
 	smp_yield_cpu(~cpu);
 }
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 05/11] s390/spinlock: Provide vcpu_is_preempted
  2016-10-28  8:11 ` Pan Xinhui
                   ` (12 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: kernellwp, jgross, David.Laight, rkrcmar, peterz, benh,
	bsingharora, will.deacon, borntraeger, mingo, paulus, mpe,
	pbonzini, paulmck, boqun.feng

From: Christian Borntraeger <borntraeger@de.ibm.com>

This implements the s390 backend for commit
"kernel/sched: introduce vcpu preempted check interface"
by reworking the existing smp_vcpu_scheduled() into
arch_vcpu_is_preempted(). We can then also get rid of the
local cpu_is_preempted() helper by moving the
CIF_ENABLED_WAIT test into arch_vcpu_is_preempted().

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
---
 arch/s390/include/asm/spinlock.h |  8 ++++++++
 arch/s390/kernel/smp.c           |  9 +++++++--
 arch/s390/lib/spinlock.c         | 25 ++++++++-----------------
 3 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/arch/s390/include/asm/spinlock.h b/arch/s390/include/asm/spinlock.h
index 7e9e09f..7ecd890 100644
--- a/arch/s390/include/asm/spinlock.h
+++ b/arch/s390/include/asm/spinlock.h
@@ -23,6 +23,14 @@ _raw_compare_and_swap(unsigned int *lock, unsigned int old, unsigned int new)
 	return __sync_bool_compare_and_swap(lock, old, new);
 }
 
+#ifndef CONFIG_SMP
+static inline bool arch_vcpu_is_preempted(int cpu) { return false; }
+#else
+bool arch_vcpu_is_preempted(int cpu);
+#endif
+
+#define vcpu_is_preempted arch_vcpu_is_preempted
+
 /*
  * Simple spin lock operations.  There are two variants, one clears IRQ's
  * on the local processor, one does not.
diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
index 35531fe..b988ed1 100644
--- a/arch/s390/kernel/smp.c
+++ b/arch/s390/kernel/smp.c
@@ -368,10 +368,15 @@ int smp_find_processor_id(u16 address)
 	return -1;
 }
 
-int smp_vcpu_scheduled(int cpu)
+bool arch_vcpu_is_preempted(int cpu)
 {
-	return pcpu_running(pcpu_devices + cpu);
+	if (test_cpu_flag_of(CIF_ENABLED_WAIT, cpu))
+		return false;
+	if (pcpu_running(pcpu_devices + cpu))
+		return false;
+	return true;
 }
+EXPORT_SYMBOL(arch_vcpu_is_preempted);
 
 void smp_yield_cpu(int cpu)
 {
diff --git a/arch/s390/lib/spinlock.c b/arch/s390/lib/spinlock.c
index e5f50a7..e48a48e 100644
--- a/arch/s390/lib/spinlock.c
+++ b/arch/s390/lib/spinlock.c
@@ -37,15 +37,6 @@ static inline void _raw_compare_and_delay(unsigned int *lock, unsigned int old)
 	asm(".insn rsy,0xeb0000000022,%0,0,%1" : : "d" (old), "Q" (*lock));
 }
 
-static inline int cpu_is_preempted(int cpu)
-{
-	if (test_cpu_flag_of(CIF_ENABLED_WAIT, cpu))
-		return 0;
-	if (smp_vcpu_scheduled(cpu))
-		return 0;
-	return 1;
-}
-
 void arch_spin_lock_wait(arch_spinlock_t *lp)
 {
 	unsigned int cpu = SPINLOCK_LOCKVAL;
@@ -62,7 +53,7 @@ void arch_spin_lock_wait(arch_spinlock_t *lp)
 			continue;
 		}
 		/* First iteration: check if the lock owner is running. */
-		if (first_diag && cpu_is_preempted(~owner)) {
+		if (first_diag && arch_vcpu_is_preempted(~owner)) {
 			smp_yield_cpu(~owner);
 			first_diag = 0;
 			continue;
@@ -81,7 +72,7 @@ void arch_spin_lock_wait(arch_spinlock_t *lp)
 		 * yield the CPU unconditionally. For LPAR rely on the
 		 * sense running status.
 		 */
-		if (!MACHINE_IS_LPAR || cpu_is_preempted(~owner)) {
+		if (!MACHINE_IS_LPAR || arch_vcpu_is_preempted(~owner)) {
 			smp_yield_cpu(~owner);
 			first_diag = 0;
 		}
@@ -108,7 +99,7 @@ void arch_spin_lock_wait_flags(arch_spinlock_t *lp, unsigned long flags)
 			continue;
 		}
 		/* Check if the lock owner is running. */
-		if (first_diag && cpu_is_preempted(~owner)) {
+		if (first_diag && arch_vcpu_is_preempted(~owner)) {
 			smp_yield_cpu(~owner);
 			first_diag = 0;
 			continue;
@@ -127,7 +118,7 @@ void arch_spin_lock_wait_flags(arch_spinlock_t *lp, unsigned long flags)
 		 * yield the CPU unconditionally. For LPAR rely on the
 		 * sense running status.
 		 */
-		if (!MACHINE_IS_LPAR || cpu_is_preempted(~owner)) {
+		if (!MACHINE_IS_LPAR || arch_vcpu_is_preempted(~owner)) {
 			smp_yield_cpu(~owner);
 			first_diag = 0;
 		}
@@ -165,7 +156,7 @@ void _raw_read_lock_wait(arch_rwlock_t *rw)
 	owner = 0;
 	while (1) {
 		if (count-- <= 0) {
-			if (owner && cpu_is_preempted(~owner))
+			if (owner && arch_vcpu_is_preempted(~owner))
 				smp_yield_cpu(~owner);
 			count = spin_retry;
 		}
@@ -211,7 +202,7 @@ void _raw_write_lock_wait(arch_rwlock_t *rw, unsigned int prev)
 	owner = 0;
 	while (1) {
 		if (count-- <= 0) {
-			if (owner && cpu_is_preempted(~owner))
+			if (owner && arch_vcpu_is_preempted(~owner))
 				smp_yield_cpu(~owner);
 			count = spin_retry;
 		}
@@ -241,7 +232,7 @@ void _raw_write_lock_wait(arch_rwlock_t *rw)
 	owner = 0;
 	while (1) {
 		if (count-- <= 0) {
-			if (owner && cpu_is_preempted(~owner))
+			if (owner && arch_vcpu_is_preempted(~owner))
 				smp_yield_cpu(~owner);
 			count = spin_retry;
 		}
@@ -285,7 +276,7 @@ void arch_lock_relax(unsigned int cpu)
 {
 	if (!cpu)
 		return;
-	if (MACHINE_IS_LPAR && !cpu_is_preempted(~cpu))
+	if (MACHINE_IS_LPAR && !arch_vcpu_is_preempted(~cpu))
 		return;
 	smp_yield_cpu(~cpu);
 }
-- 
2.4.11



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 06/11] x86, paravirt: Add interface to support kvm/xen vcpu preempted check
  2016-10-28  8:11 ` Pan Xinhui
                   ` (14 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: benh, paulus, mpe, mingo, peterz, paulmck, will.deacon,
	kernellwp, jgross, pbonzini, bsingharora, boqun.feng,
	borntraeger, rkrcmar, David.Laight, Pan Xinhui

This is to fix some lock holder preemption issues. Some lock
implementations do a spin loop before acquiring the lock itself.
The kernel now has an interface, bool vcpu_is_preempted(int cpu),
which takes a cpu as parameter and returns true if that cpu is
preempted. The kernel can then break out of such spin loops based
on the return value of vcpu_is_preempted().

As the kernel already uses this interface, let us support it on x86.

To bridge the generic code and kvm/xen, add a vcpu_is_preempted hook
to struct pv_lock_ops.

KVM or Xen can then provide their own implementation backing
vcpu_is_preempted().
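
As a hedged sketch of how generic code consumes the new hook once a
backend fills it in (owner_cpu and still_contended() are illustrative
names, not part of this patch):

	/*
	 * Stop optimistic spinning as soon as the lock owner's vcpu is
	 * scheduled out; spinning on a preempted owner only burns cycles.
	 */
	while (still_contended(lock)) {
		if (vcpu_is_preempted(owner_cpu))
			break;
		cpu_relax();
	}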

Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
---
 arch/x86/include/asm/paravirt_types.h | 2 ++
 arch/x86/include/asm/spinlock.h       | 8 ++++++++
 arch/x86/kernel/paravirt-spinlocks.c  | 6 ++++++
 3 files changed, 16 insertions(+)

diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 0f400c0..38c3bb7 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -310,6 +310,8 @@ struct pv_lock_ops {
 
 	void (*wait)(u8 *ptr, u8 val);
 	void (*kick)(int cpu);
+
+	bool (*vcpu_is_preempted)(int cpu);
 };
 
 /* This contains all the paravirt structures: we get a convenient
diff --git a/arch/x86/include/asm/spinlock.h b/arch/x86/include/asm/spinlock.h
index 921bea7..0526f59 100644
--- a/arch/x86/include/asm/spinlock.h
+++ b/arch/x86/include/asm/spinlock.h
@@ -26,6 +26,14 @@
 extern struct static_key paravirt_ticketlocks_enabled;
 static __always_inline bool static_key_false(struct static_key *key);
 
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
+#define vcpu_is_preempted vcpu_is_preempted
+static inline bool vcpu_is_preempted(int cpu)
+{
+	return pv_lock_ops.vcpu_is_preempted(cpu);
+}
+#endif
+
 #include <asm/qspinlock.h>
 
 /*
diff --git a/arch/x86/kernel/paravirt-spinlocks.c b/arch/x86/kernel/paravirt-spinlocks.c
index 2c55a00..2f204dd 100644
--- a/arch/x86/kernel/paravirt-spinlocks.c
+++ b/arch/x86/kernel/paravirt-spinlocks.c
@@ -21,12 +21,18 @@ bool pv_is_native_spin_unlock(void)
 		__raw_callee_save___native_queued_spin_unlock;
 }
 
+static bool native_vcpu_is_preempted(int cpu)
+{
+	return false;
+}
+
 struct pv_lock_ops pv_lock_ops = {
 #ifdef CONFIG_SMP
 	.queued_spin_lock_slowpath = native_queued_spin_lock_slowpath,
 	.queued_spin_unlock = PV_CALLEE_SAVE(__native_queued_spin_unlock),
 	.wait = paravirt_nop,
 	.kick = paravirt_nop,
+	.vcpu_is_preempted = native_vcpu_is_preempted,
 #endif /* SMP */
 };
 EXPORT_SYMBOL(pv_lock_ops);
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 06/11] x86, paravirt: Add interface to support kvm/xen vcpu preempted check
  2016-10-28  8:11 ` Pan Xinhui
                   ` (16 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: kernellwp, jgross, David.Laight, rkrcmar, peterz, benh,
	will.deacon, Pan Xinhui, mingo, paulus, mpe, pbonzini, paulmck,
	boqun.feng

This is to fix some lock holder preemption issues. Some lock
implementations do a spin loop before acquiring the lock itself.
The kernel now has an interface, bool vcpu_is_preempted(int cpu),
which takes a cpu as parameter and returns true if that cpu is
preempted. The kernel can then break out of such spin loops based
on the return value of vcpu_is_preempted().

As the kernel already uses this interface, let us support it on x86.

To bridge the generic code and kvm/xen, add a vcpu_is_preempted hook
to struct pv_lock_ops.

KVM or Xen can then provide their own implementation backing
vcpu_is_preempted().

Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
---
 arch/x86/include/asm/paravirt_types.h | 2 ++
 arch/x86/include/asm/spinlock.h       | 8 ++++++++
 arch/x86/kernel/paravirt-spinlocks.c  | 6 ++++++
 3 files changed, 16 insertions(+)

diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 0f400c0..38c3bb7 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -310,6 +310,8 @@ struct pv_lock_ops {
 
 	void (*wait)(u8 *ptr, u8 val);
 	void (*kick)(int cpu);
+
+	bool (*vcpu_is_preempted)(int cpu);
 };
 
 /* This contains all the paravirt structures: we get a convenient
diff --git a/arch/x86/include/asm/spinlock.h b/arch/x86/include/asm/spinlock.h
index 921bea7..0526f59 100644
--- a/arch/x86/include/asm/spinlock.h
+++ b/arch/x86/include/asm/spinlock.h
@@ -26,6 +26,14 @@
 extern struct static_key paravirt_ticketlocks_enabled;
 static __always_inline bool static_key_false(struct static_key *key);
 
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
+#define vcpu_is_preempted vcpu_is_preempted
+static inline bool vcpu_is_preempted(int cpu)
+{
+	return pv_lock_ops.vcpu_is_preempted(cpu);
+}
+#endif
+
 #include <asm/qspinlock.h>
 
 /*
diff --git a/arch/x86/kernel/paravirt-spinlocks.c b/arch/x86/kernel/paravirt-spinlocks.c
index 2c55a00..2f204dd 100644
--- a/arch/x86/kernel/paravirt-spinlocks.c
+++ b/arch/x86/kernel/paravirt-spinlocks.c
@@ -21,12 +21,18 @@ bool pv_is_native_spin_unlock(void)
 		__raw_callee_save___native_queued_spin_unlock;
 }
 
+static bool native_vcpu_is_preempted(int cpu)
+{
+	return false;
+}
+
 struct pv_lock_ops pv_lock_ops = {
 #ifdef CONFIG_SMP
 	.queued_spin_lock_slowpath = native_queued_spin_lock_slowpath,
 	.queued_spin_unlock = PV_CALLEE_SAVE(__native_queued_spin_unlock),
 	.wait = paravirt_nop,
 	.kick = paravirt_nop,
+	.vcpu_is_preempted = native_vcpu_is_preempted,
 #endif /* SMP */
 };
 EXPORT_SYMBOL(pv_lock_ops);
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 06/11] x86, paravirt: Add interface to support kvm/xen vcpu preempted check
  2016-10-28  8:11 ` Pan Xinhui
                   ` (15 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: kernellwp, jgross, David.Laight, rkrcmar, peterz, benh,
	bsingharora, will.deacon, Pan Xinhui, borntraeger, mingo, paulus,
	mpe, pbonzini, paulmck, boqun.feng

This is to fix some lock holder preemption issues. Some lock
implementations do a spin loop before acquiring the lock itself.
The kernel now has an interface, bool vcpu_is_preempted(int cpu),
which takes a cpu as parameter and returns true if that cpu is
preempted. The kernel can then break out of such spin loops based
on the return value of vcpu_is_preempted().

As the kernel already uses this interface, let us support it on x86.

To bridge the generic code and kvm/xen, add a vcpu_is_preempted hook
to struct pv_lock_ops.

KVM or Xen can then provide their own implementation backing
vcpu_is_preempted().

Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
---
 arch/x86/include/asm/paravirt_types.h | 2 ++
 arch/x86/include/asm/spinlock.h       | 8 ++++++++
 arch/x86/kernel/paravirt-spinlocks.c  | 6 ++++++
 3 files changed, 16 insertions(+)

diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 0f400c0..38c3bb7 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -310,6 +310,8 @@ struct pv_lock_ops {
 
 	void (*wait)(u8 *ptr, u8 val);
 	void (*kick)(int cpu);
+
+	bool (*vcpu_is_preempted)(int cpu);
 };
 
 /* This contains all the paravirt structures: we get a convenient
diff --git a/arch/x86/include/asm/spinlock.h b/arch/x86/include/asm/spinlock.h
index 921bea7..0526f59 100644
--- a/arch/x86/include/asm/spinlock.h
+++ b/arch/x86/include/asm/spinlock.h
@@ -26,6 +26,14 @@
 extern struct static_key paravirt_ticketlocks_enabled;
 static __always_inline bool static_key_false(struct static_key *key);
 
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
+#define vcpu_is_preempted vcpu_is_preempted
+static inline bool vcpu_is_preempted(int cpu)
+{
+	return pv_lock_ops.vcpu_is_preempted(cpu);
+}
+#endif
+
 #include <asm/qspinlock.h>
 
 /*
diff --git a/arch/x86/kernel/paravirt-spinlocks.c b/arch/x86/kernel/paravirt-spinlocks.c
index 2c55a00..2f204dd 100644
--- a/arch/x86/kernel/paravirt-spinlocks.c
+++ b/arch/x86/kernel/paravirt-spinlocks.c
@@ -21,12 +21,18 @@ bool pv_is_native_spin_unlock(void)
 		__raw_callee_save___native_queued_spin_unlock;
 }
 
+static bool native_vcpu_is_preempted(int cpu)
+{
+	return false;
+}
+
 struct pv_lock_ops pv_lock_ops = {
 #ifdef CONFIG_SMP
 	.queued_spin_lock_slowpath = native_queued_spin_lock_slowpath,
 	.queued_spin_unlock = PV_CALLEE_SAVE(__native_queued_spin_unlock),
 	.wait = paravirt_nop,
 	.kick = paravirt_nop,
+	.vcpu_is_preempted = native_vcpu_is_preempted,
 #endif /* SMP */
 };
 EXPORT_SYMBOL(pv_lock_ops);
-- 
2.4.11



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 07/11] KVM: Introduce kvm_write_guest_offset_cached
  2016-10-28  8:11 ` Pan Xinhui
                   ` (18 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: benh, paulus, mpe, mingo, peterz, paulmck, will.deacon,
	kernellwp, jgross, pbonzini, bsingharora, boqun.feng,
	borntraeger, rkrcmar, David.Laight, Pan Xinhui

It allows us to partially update some status or field of a struct kept
in guest memory.

We can also save one kvm_read_guest_cached() call if we just update one
field of the struct regardless of its current value.
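
A hedged usage sketch (struct and field names are illustrative; a later
patch in this series uses exactly this pattern for kvm_steal_time's
preempted field):

	/* update a single member of a cached guest structure, no read needed */
	kvm_write_guest_offset_cached(kvm, ghc, &val,
				      offsetof(struct guest_area, field),
				      sizeof(val));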

Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
---
 include/linux/kvm_host.h |  2 ++
 virt/kvm/kvm_main.c      | 20 ++++++++++++++------
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 01c0b9c..6f00237 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -645,6 +645,8 @@ int kvm_write_guest(struct kvm *kvm, gpa_t gpa, const void *data,
 		    unsigned long len);
 int kvm_write_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
 			   void *data, unsigned long len);
+int kvm_write_guest_offset_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
+			   void *data, int offset, unsigned long len);
 int kvm_gfn_to_hva_cache_init(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
 			      gpa_t gpa, unsigned long len);
 int kvm_clear_guest_page(struct kvm *kvm, gfn_t gfn, int offset, int len);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2907b7b..95308ee 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1972,30 +1972,38 @@ int kvm_gfn_to_hva_cache_init(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
 }
 EXPORT_SYMBOL_GPL(kvm_gfn_to_hva_cache_init);
 
-int kvm_write_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
-			   void *data, unsigned long len)
+int kvm_write_guest_offset_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
+			   void *data, int offset, unsigned long len)
 {
 	struct kvm_memslots *slots = kvm_memslots(kvm);
 	int r;
+	gpa_t gpa = ghc->gpa + offset;
 
-	BUG_ON(len > ghc->len);
+	BUG_ON(len + offset > ghc->len);
 
 	if (slots->generation != ghc->generation)
 		kvm_gfn_to_hva_cache_init(kvm, ghc, ghc->gpa, ghc->len);
 
 	if (unlikely(!ghc->memslot))
-		return kvm_write_guest(kvm, ghc->gpa, data, len);
+		return kvm_write_guest(kvm, gpa, data, len);
 
 	if (kvm_is_error_hva(ghc->hva))
 		return -EFAULT;
 
-	r = __copy_to_user((void __user *)ghc->hva, data, len);
+	r = __copy_to_user((void __user *)ghc->hva + offset, data, len);
 	if (r)
 		return -EFAULT;
-	mark_page_dirty_in_slot(ghc->memslot, ghc->gpa >> PAGE_SHIFT);
+	mark_page_dirty_in_slot(ghc->memslot, gpa >> PAGE_SHIFT);
 
 	return 0;
 }
+EXPORT_SYMBOL_GPL(kvm_write_guest_offset_cached);
+
+int kvm_write_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
+			   void *data, unsigned long len)
+{
+	return kvm_write_guest_offset_cached(kvm, ghc, data, 0, len);
+}
 EXPORT_SYMBOL_GPL(kvm_write_guest_cached);
 
 int kvm_read_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 07/11] KVM: Introduce kvm_write_guest_offset_cached
  2016-10-28  8:11 ` Pan Xinhui
                   ` (17 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: kernellwp, jgross, David.Laight, rkrcmar, peterz, benh,
	will.deacon, Pan Xinhui, mingo, paulus, mpe, pbonzini, paulmck,
	boqun.feng

It allows us to partially update some status or field of a struct kept
in guest memory.

We can also save one kvm_read_guest_cached() call if we just update one
field of the struct regardless of its current value.

Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
---
 include/linux/kvm_host.h |  2 ++
 virt/kvm/kvm_main.c      | 20 ++++++++++++++------
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 01c0b9c..6f00237 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -645,6 +645,8 @@ int kvm_write_guest(struct kvm *kvm, gpa_t gpa, const void *data,
 		    unsigned long len);
 int kvm_write_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
 			   void *data, unsigned long len);
+int kvm_write_guest_offset_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
+			   void *data, int offset, unsigned long len);
 int kvm_gfn_to_hva_cache_init(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
 			      gpa_t gpa, unsigned long len);
 int kvm_clear_guest_page(struct kvm *kvm, gfn_t gfn, int offset, int len);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2907b7b..95308ee 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1972,30 +1972,38 @@ int kvm_gfn_to_hva_cache_init(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
 }
 EXPORT_SYMBOL_GPL(kvm_gfn_to_hva_cache_init);
 
-int kvm_write_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
-			   void *data, unsigned long len)
+int kvm_write_guest_offset_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
+			   void *data, int offset, unsigned long len)
 {
 	struct kvm_memslots *slots = kvm_memslots(kvm);
 	int r;
+	gpa_t gpa = ghc->gpa + offset;
 
-	BUG_ON(len > ghc->len);
+	BUG_ON(len + offset > ghc->len);
 
 	if (slots->generation != ghc->generation)
 		kvm_gfn_to_hva_cache_init(kvm, ghc, ghc->gpa, ghc->len);
 
 	if (unlikely(!ghc->memslot))
-		return kvm_write_guest(kvm, ghc->gpa, data, len);
+		return kvm_write_guest(kvm, gpa, data, len);
 
 	if (kvm_is_error_hva(ghc->hva))
 		return -EFAULT;
 
-	r = __copy_to_user((void __user *)ghc->hva, data, len);
+	r = __copy_to_user((void __user *)ghc->hva + offset, data, len);
 	if (r)
 		return -EFAULT;
-	mark_page_dirty_in_slot(ghc->memslot, ghc->gpa >> PAGE_SHIFT);
+	mark_page_dirty_in_slot(ghc->memslot, gpa >> PAGE_SHIFT);
 
 	return 0;
 }
+EXPORT_SYMBOL_GPL(kvm_write_guest_offset_cached);
+
+int kvm_write_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
+			   void *data, unsigned long len)
+{
+	return kvm_write_guest_offset_cached(kvm, ghc, data, 0, len);
+}
 EXPORT_SYMBOL_GPL(kvm_write_guest_cached);
 
 int kvm_read_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 07/11] KVM: Introduce kvm_write_guest_offset_cached
  2016-10-28  8:11 ` Pan Xinhui
                   ` (19 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: kernellwp, jgross, David.Laight, rkrcmar, peterz, benh,
	bsingharora, will.deacon, Pan Xinhui, borntraeger, mingo, paulus,
	mpe, pbonzini, paulmck, boqun.feng

It allows us to partially update some status or field of a struct kept
in guest memory.

We can also save one kvm_read_guest_cached() call if we just update one
field of the struct regardless of its current value.

Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
---
 include/linux/kvm_host.h |  2 ++
 virt/kvm/kvm_main.c      | 20 ++++++++++++++------
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 01c0b9c..6f00237 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -645,6 +645,8 @@ int kvm_write_guest(struct kvm *kvm, gpa_t gpa, const void *data,
 		    unsigned long len);
 int kvm_write_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
 			   void *data, unsigned long len);
+int kvm_write_guest_offset_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
+			   void *data, int offset, unsigned long len);
 int kvm_gfn_to_hva_cache_init(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
 			      gpa_t gpa, unsigned long len);
 int kvm_clear_guest_page(struct kvm *kvm, gfn_t gfn, int offset, int len);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2907b7b..95308ee 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1972,30 +1972,38 @@ int kvm_gfn_to_hva_cache_init(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
 }
 EXPORT_SYMBOL_GPL(kvm_gfn_to_hva_cache_init);
 
-int kvm_write_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
-			   void *data, unsigned long len)
+int kvm_write_guest_offset_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
+			   void *data, int offset, unsigned long len)
 {
 	struct kvm_memslots *slots = kvm_memslots(kvm);
 	int r;
+	gpa_t gpa = ghc->gpa + offset;
 
-	BUG_ON(len > ghc->len);
+	BUG_ON(len + offset > ghc->len);
 
 	if (slots->generation != ghc->generation)
 		kvm_gfn_to_hva_cache_init(kvm, ghc, ghc->gpa, ghc->len);
 
 	if (unlikely(!ghc->memslot))
-		return kvm_write_guest(kvm, ghc->gpa, data, len);
+		return kvm_write_guest(kvm, gpa, data, len);
 
 	if (kvm_is_error_hva(ghc->hva))
 		return -EFAULT;
 
-	r = __copy_to_user((void __user *)ghc->hva, data, len);
+	r = __copy_to_user((void __user *)ghc->hva + offset, data, len);
 	if (r)
 		return -EFAULT;
-	mark_page_dirty_in_slot(ghc->memslot, ghc->gpa >> PAGE_SHIFT);
+	mark_page_dirty_in_slot(ghc->memslot, gpa >> PAGE_SHIFT);
 
 	return 0;
 }
+EXPORT_SYMBOL_GPL(kvm_write_guest_offset_cached);
+
+int kvm_write_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
+			   void *data, unsigned long len)
+{
+	return kvm_write_guest_offset_cached(kvm, ghc, data, 0, len);
+}
 EXPORT_SYMBOL_GPL(kvm_write_guest_cached);
 
 int kvm_read_guest_cached(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
-- 
2.4.11



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 08/11] x86, kvm/x86.c: support vcpu preempted check
  2016-10-28  8:11 ` Pan Xinhui
                   ` (20 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: benh, paulus, mpe, mingo, peterz, paulmck, will.deacon,
	kernellwp, jgross, pbonzini, bsingharora, boqun.feng,
	borntraeger, rkrcmar, David.Laight, Pan Xinhui

Support the vcpu_is_preempted() functionality under KVM. This will
enhance lock performance on overcommitted hosts (more runnable vcpus
than physical cpus in the system) as doing busy waits for preempted
vcpus will hurt system performance far worse than early yielding.

Use one field of struct kvm_steal_time, ::preempted, to indicate whether
a vcpu is running or not.
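
Because the new __u8 plus its explicit padding replaces exactly one of
the twelve __u32 pad slots, the layout and total size of the structure do
not change; a compile-time check along these lines (a sketch, not in this
patch) makes that explicit:

	/* 8 + 4 + 4 + 1 + 3 + 11 * 4 == 64 bytes, same as before */
	BUILD_BUG_ON(sizeof(struct kvm_steal_time) != 64);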

Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
---
 arch/x86/include/uapi/asm/kvm_para.h |  4 +++-
 arch/x86/kvm/x86.c                   | 16 ++++++++++++++++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 94dc8ca..1421a65 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -45,7 +45,9 @@ struct kvm_steal_time {
 	__u64 steal;
 	__u32 version;
 	__u32 flags;
-	__u32 pad[12];
+	__u8  preempted;
+	__u8  u8_pad[3];
+	__u32 pad[11];
 };
 
 #define KVM_STEAL_ALIGNMENT_BITS 5
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e375235..f06e115 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2057,6 +2057,8 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
 		&vcpu->arch.st.steal, sizeof(struct kvm_steal_time))))
 		return;
 
+	vcpu->arch.st.steal.preempted = 0;
+
 	if (vcpu->arch.st.steal.version & 1)
 		vcpu->arch.st.steal.version += 1;  /* first time write, random junk */
 
@@ -2810,8 +2812,22 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	kvm_make_request(KVM_REQ_STEAL_UPDATE, vcpu);
 }
 
+static void kvm_steal_time_set_preempted(struct kvm_vcpu *vcpu)
+{
+	if (!(vcpu->arch.st.msr_val & KVM_MSR_ENABLED))
+		return;
+
+	vcpu->arch.st.steal.preempted = 1;
+
+	kvm_write_guest_offset_cached(vcpu->kvm, &vcpu->arch.st.stime,
+			&vcpu->arch.st.steal.preempted,
+			offsetof(struct kvm_steal_time, preempted),
+			sizeof(vcpu->arch.st.steal.preempted));
+}
+
 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 {
+	kvm_steal_time_set_preempted(vcpu);
 	kvm_x86_ops->vcpu_put(vcpu);
 	kvm_put_guest_fpu(vcpu);
 	vcpu->arch.last_host_tsc = rdtsc();
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 08/11] x86, kvm/x86.c: support vcpu preempted check
  2016-10-28  8:11 ` Pan Xinhui
                   ` (22 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: kernellwp, jgross, David.Laight, rkrcmar, peterz, benh,
	will.deacon, Pan Xinhui, mingo, paulus, mpe, pbonzini, paulmck,
	boqun.feng

Support the vcpu_is_preempted() functionality under KVM. This will
enhance lock performance on overcommitted hosts (more runnable vcpus
than physical cpus in the system) as doing busy waits for preempted
vcpus will hurt system performance far worse than early yielding.

Use one field of struct kvm_steal_time, ::preempted, to indicate whether
a vcpu is running or not.

Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
---
 arch/x86/include/uapi/asm/kvm_para.h |  4 +++-
 arch/x86/kvm/x86.c                   | 16 ++++++++++++++++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 94dc8ca..1421a65 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -45,7 +45,9 @@ struct kvm_steal_time {
 	__u64 steal;
 	__u32 version;
 	__u32 flags;
-	__u32 pad[12];
+	__u8  preempted;
+	__u8  u8_pad[3];
+	__u32 pad[11];
 };
 
 #define KVM_STEAL_ALIGNMENT_BITS 5
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e375235..f06e115 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2057,6 +2057,8 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
 		&vcpu->arch.st.steal, sizeof(struct kvm_steal_time))))
 		return;
 
+	vcpu->arch.st.steal.preempted = 0;
+
 	if (vcpu->arch.st.steal.version & 1)
 		vcpu->arch.st.steal.version += 1;  /* first time write, random junk */
 
@@ -2810,8 +2812,22 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	kvm_make_request(KVM_REQ_STEAL_UPDATE, vcpu);
 }
 
+static void kvm_steal_time_set_preempted(struct kvm_vcpu *vcpu)
+{
+	if (!(vcpu->arch.st.msr_val & KVM_MSR_ENABLED))
+		return;
+
+	vcpu->arch.st.steal.preempted = 1;
+
+	kvm_write_guest_offset_cached(vcpu->kvm, &vcpu->arch.st.stime,
+			&vcpu->arch.st.steal.preempted,
+			offsetof(struct kvm_steal_time, preempted),
+			sizeof(vcpu->arch.st.steal.preempted));
+}
+
 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 {
+	kvm_steal_time_set_preempted(vcpu);
 	kvm_x86_ops->vcpu_put(vcpu);
 	kvm_put_guest_fpu(vcpu);
 	vcpu->arch.last_host_tsc = rdtsc();
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 08/11] x86, kvm/x86.c: support vcpu preempted check
  2016-10-28  8:11 ` Pan Xinhui
                   ` (21 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: kernellwp, jgross, David.Laight, rkrcmar, peterz, benh,
	bsingharora, will.deacon, Pan Xinhui, borntraeger, mingo, paulus,
	mpe, pbonzini, paulmck, boqun.feng

Support the vcpu_is_preempted() functionality under KVM. This will
enhance lock performance on overcommitted hosts (more runnable vcpus
than physical cpus in the system) as doing busy waits for preempted
vcpus will hurt system performance far worse than early yielding.

Use one field of struct kvm_steal_time, ::preempted, to indicate whether
a vcpu is running or not.

Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
---
 arch/x86/include/uapi/asm/kvm_para.h |  4 +++-
 arch/x86/kvm/x86.c                   | 16 ++++++++++++++++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 94dc8ca..1421a65 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -45,7 +45,9 @@ struct kvm_steal_time {
 	__u64 steal;
 	__u32 version;
 	__u32 flags;
-	__u32 pad[12];
+	__u8  preempted;
+	__u8  u8_pad[3];
+	__u32 pad[11];
 };
 
 #define KVM_STEAL_ALIGNMENT_BITS 5
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e375235..f06e115 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2057,6 +2057,8 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
 		&vcpu->arch.st.steal, sizeof(struct kvm_steal_time))))
 		return;
 
+	vcpu->arch.st.steal.preempted = 0;
+
 	if (vcpu->arch.st.steal.version & 1)
 		vcpu->arch.st.steal.version += 1;  /* first time write, random junk */
 
@@ -2810,8 +2812,22 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	kvm_make_request(KVM_REQ_STEAL_UPDATE, vcpu);
 }
 
+static void kvm_steal_time_set_preempted(struct kvm_vcpu *vcpu)
+{
+	if (!(vcpu->arch.st.msr_val & KVM_MSR_ENABLED))
+		return;
+
+	vcpu->arch.st.steal.preempted = 1;
+
+	kvm_write_guest_offset_cached(vcpu->kvm, &vcpu->arch.st.stime,
+			&vcpu->arch.st.steal.preempted,
+			offsetof(struct kvm_steal_time, preempted),
+			sizeof(vcpu->arch.st.steal.preempted));
+}
+
 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 {
+	kvm_steal_time_set_preempted(vcpu);
 	kvm_x86_ops->vcpu_put(vcpu);
 	kvm_put_guest_fpu(vcpu);
 	vcpu->arch.last_host_tsc = rdtsc();
-- 
2.4.11



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 09/11] x86, kernel/kvm.c: support vcpu preempted check
  2016-10-28  8:11 ` Pan Xinhui
                   ` (25 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: benh, paulus, mpe, mingo, peterz, paulmck, will.deacon,
	kernellwp, jgross, pbonzini, bsingharora, boqun.feng,
	borntraeger, rkrcmar, David.Laight, Pan Xinhui

Support the vcpu_is_preempted() functionality under KVM. This will
enhance lock performance on overcommitted hosts (more runnable vcpus
than physical cpus in the system) as doing busy waits for preempted
vcpus will hurt system performance far worse than early yielding.

struct kvm_steal_time::preempted indicates whether a vcpu is running or
not, since commit ("x86, kvm/x86.c: support vcpu preempted check").
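
In a hedged sketch, the guest half added here only has to read the mark
that the host half leaves in the per-cpu steal time area when it
schedules the vcpu out (condensed from the diff below):

	/* non-zero means the host preempted this vcpu's thread */
	static bool kvm_vcpu_is_preempted(int cpu)
	{
		return !!per_cpu(steal_time, cpu).preempted;
	}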

unix benchmark result:
host:  kernel 4.8.1, i5-4570, 4 cpus
guest: kernel 4.8.1, 8 vcpus

        test-case                       after-patch       before-patch
Execl Throughput                       |    18307.9 lps  |    11701.6 lps
File Copy 1024 bufsize 2000 maxblocks  |  1352407.3 KBps |   790418.9 KBps
File Copy 256 bufsize 500 maxblocks    |   367555.6 KBps |   222867.7 KBps
File Copy 4096 bufsize 8000 maxblocks  |  3675649.7 KBps |  1780614.4 KBps
Pipe Throughput                        | 11872208.7 lps  | 11855628.9 lps
Pipe-based Context Switching           |  1495126.5 lps  |  1490533.9 lps
Process Creation                       |    29881.2 lps  |    28572.8 lps
Shell Scripts (1 concurrent)           |    23224.3 lpm  |    22607.4 lpm
Shell Scripts (8 concurrent)           |     3531.4 lpm  |     3211.9 lpm
System Call Overhead                   | 10385653.0 lps  | 10419979.0 lps

Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
---
 arch/x86/kernel/kvm.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index edbbfc8..0b48dd2 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -415,6 +415,15 @@ void kvm_disable_steal_time(void)
 	wrmsr(MSR_KVM_STEAL_TIME, 0, 0);
 }
 
+static bool kvm_vcpu_is_preempted(int cpu)
+{
+	struct kvm_steal_time *src;
+
+	src = &per_cpu(steal_time, cpu);
+
+	return !!src->preempted;
+}
+
 #ifdef CONFIG_SMP
 static void __init kvm_smp_prepare_boot_cpu(void)
 {
@@ -471,6 +480,9 @@ void __init kvm_guest_init(void)
 	if (kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) {
 		has_steal_clock = 1;
 		pv_time_ops.steal_clock = kvm_steal_clock;
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
+		pv_lock_ops.vcpu_is_preempted = kvm_vcpu_is_preempted;
+#endif
 	}
 
 	if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 09/11] x86, kernel/kvm.c: support vcpu preempted check
  2016-10-28  8:11 ` Pan Xinhui
                   ` (23 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: kernellwp, jgross, David.Laight, rkrcmar, peterz, benh,
	will.deacon, Pan Xinhui, mingo, paulus, mpe, pbonzini, paulmck,
	boqun.feng

Support the vcpu_is_preempted() functionality under KVM. This will
enhance lock performance on overcommitted hosts (more runnable vcpus
than physical cpus in the system) as doing busy waits for preempted
vcpus will hurt system performance far worse than early yielding.

struct kvm_steal_time::preempted indicates whether a vcpu is running or
not, since commit ("x86, kvm/x86.c: support vcpu preempted check").

unix benchmark result:
host:  kernel 4.8.1, i5-4570, 4 cpus
guest: kernel 4.8.1, 8 vcpus

        test-case                       after-patch       before-patch
Execl Throughput                       |    18307.9 lps  |    11701.6 lps
File Copy 1024 bufsize 2000 maxblocks  |  1352407.3 KBps |   790418.9 KBps
File Copy 256 bufsize 500 maxblocks    |   367555.6 KBps |   222867.7 KBps
File Copy 4096 bufsize 8000 maxblocks  |  3675649.7 KBps |  1780614.4 KBps
Pipe Throughput                        | 11872208.7 lps  | 11855628.9 lps
Pipe-based Context Switching           |  1495126.5 lps  |  1490533.9 lps
Process Creation                       |    29881.2 lps  |    28572.8 lps
Shell Scripts (1 concurrent)           |    23224.3 lpm  |    22607.4 lpm
Shell Scripts (8 concurrent)           |     3531.4 lpm  |     3211.9 lpm
System Call Overhead                   | 10385653.0 lps  | 10419979.0 lps

Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
---
 arch/x86/kernel/kvm.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index edbbfc8..0b48dd2 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -415,6 +415,15 @@ void kvm_disable_steal_time(void)
 	wrmsr(MSR_KVM_STEAL_TIME, 0, 0);
 }
 
+static bool kvm_vcpu_is_preempted(int cpu)
+{
+	struct kvm_steal_time *src;
+
+	src = &per_cpu(steal_time, cpu);
+
+	return !!src->preempted;
+}
+
 #ifdef CONFIG_SMP
 static void __init kvm_smp_prepare_boot_cpu(void)
 {
@@ -471,6 +480,9 @@ void __init kvm_guest_init(void)
 	if (kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) {
 		has_steal_clock = 1;
 		pv_time_ops.steal_clock = kvm_steal_clock;
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
+		pv_lock_ops.vcpu_is_preempted = kvm_vcpu_is_preempted;
+#endif
 	}
 
 	if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 09/11] x86, kernel/kvm.c: support vcpu preempted check
  2016-10-28  8:11 ` Pan Xinhui
                   ` (24 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: kernellwp, jgross, David.Laight, rkrcmar, peterz, benh,
	bsingharora, will.deacon, Pan Xinhui, borntraeger, mingo, paulus,
	mpe, pbonzini, paulmck, boqun.feng

Support the vcpu_is_preempted() functionality under KVM. This will
enhance lock performance on overcommitted hosts (more runnable vcpus
than physical cpus in the system) as doing busy waits for preempted
vcpus will hurt system performance far worse than early yielding.

struct kvm_steal_time::preempted indicates whether a vcpu is running or
not, since commit ("x86, kvm/x86.c: support vcpu preempted check").

unix benchmark result:
host:  kernel 4.8.1, i5-4570, 4 cpus
guest: kernel 4.8.1, 8 vcpus

        test-case                       after-patch       before-patch
Execl Throughput                       |    18307.9 lps  |    11701.6 lps
File Copy 1024 bufsize 2000 maxblocks  |  1352407.3 KBps |   790418.9 KBps
File Copy 256 bufsize 500 maxblocks    |   367555.6 KBps |   222867.7 KBps
File Copy 4096 bufsize 8000 maxblocks  |  3675649.7 KBps |  1780614.4 KBps
Pipe Throughput                        | 11872208.7 lps  | 11855628.9 lps
Pipe-based Context Switching           |  1495126.5 lps  |  1490533.9 lps
Process Creation                       |    29881.2 lps  |    28572.8 lps
Shell Scripts (1 concurrent)           |    23224.3 lpm  |    22607.4 lpm
Shell Scripts (8 concurrent)           |     3531.4 lpm  |     3211.9 lpm
System Call Overhead                   | 10385653.0 lps  | 10419979.0 lps

Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
---
 arch/x86/kernel/kvm.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index edbbfc8..0b48dd2 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -415,6 +415,15 @@ void kvm_disable_steal_time(void)
 	wrmsr(MSR_KVM_STEAL_TIME, 0, 0);
 }
 
+static bool kvm_vcpu_is_preempted(int cpu)
+{
+	struct kvm_steal_time *src;
+
+	src = &per_cpu(steal_time, cpu);
+
+	return !!src->preempted;
+}
+
 #ifdef CONFIG_SMP
 static void __init kvm_smp_prepare_boot_cpu(void)
 {
@@ -471,6 +480,9 @@ void __init kvm_guest_init(void)
 	if (kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) {
 		has_steal_clock = 1;
 		pv_time_ops.steal_clock = kvm_steal_clock;
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
+		pv_lock_ops.vcpu_is_preempted = kvm_vcpu_is_preempted;
+#endif
 	}
 
 	if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
-- 
2.4.11



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 10/11] x86, xen: support vcpu preempted check
  2016-10-28  8:11 ` Pan Xinhui
                   ` (26 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  2016-10-28 19:43     ` Konrad Rzeszutek Wilk
  2016-10-28 19:43   ` Konrad Rzeszutek Wilk
  -1 siblings, 2 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: benh, paulus, mpe, mingo, peterz, paulmck, will.deacon,
	kernellwp, jgross, pbonzini, bsingharora, boqun.feng,
	borntraeger, rkrcmar, David.Laight, Pan Xinhui

From: Juergen Gross <jgross@suse.com>

Support the vcpu_is_preempted() functionality under Xen. This will
enhance lock performance on overcommitted hosts (more runnable vcpus
than physical cpus in the system) as doing busy waits for preempted
vcpus will hurt system performance far worse than early yielding.

A quick test (4 vcpus on 1 physical cpu doing a parallel build job
with "make -j 8") reduced system time by about 5% with this patch.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
---
 arch/x86/xen/spinlock.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c
index 3d6e006..74756bb 100644
--- a/arch/x86/xen/spinlock.c
+++ b/arch/x86/xen/spinlock.c
@@ -114,7 +114,6 @@ void xen_uninit_lock_cpu(int cpu)
 	per_cpu(irq_name, cpu) = NULL;
 }
 
-
 /*
  * Our init of PV spinlocks is split in two init functions due to us
  * using paravirt patching and jump labels patching and having to do
@@ -137,6 +136,8 @@ void __init xen_init_spinlocks(void)
 	pv_lock_ops.queued_spin_unlock = PV_CALLEE_SAVE(__pv_queued_spin_unlock);
 	pv_lock_ops.wait = xen_qlock_wait;
 	pv_lock_ops.kick = xen_qlock_kick;
+
+	pv_lock_ops.vcpu_is_preempted = xen_vcpu_stolen;
 }
 
 /*
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 10/11] x86, xen: support vcpu preempted check
  2016-10-28  8:11 ` Pan Xinhui
                   ` (27 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: kernellwp, jgross, David.Laight, rkrcmar, peterz, benh,
	will.deacon, Pan Xinhui, mingo, paulus, mpe, pbonzini, paulmck,
	boqun.feng

From: Juergen Gross <jgross@suse.com>

Support the vcpu_is_preempted() functionality under Xen. This will
enhance lock performance on overcommitted hosts (more runnable vcpus
than physical cpus in the system) as doing busy waits for preempted
vcpus will hurt system performance far worse than early yielding.

A quick test (4 vcpus on 1 physical cpu doing a parallel build job
with "make -j 8") reduced system time by about 5% with this patch.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
---
 arch/x86/xen/spinlock.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c
index 3d6e006..74756bb 100644
--- a/arch/x86/xen/spinlock.c
+++ b/arch/x86/xen/spinlock.c
@@ -114,7 +114,6 @@ void xen_uninit_lock_cpu(int cpu)
 	per_cpu(irq_name, cpu) = NULL;
 }
 
-
 /*
  * Our init of PV spinlocks is split in two init functions due to us
  * using paravirt patching and jump labels patching and having to do
@@ -137,6 +136,8 @@ void __init xen_init_spinlocks(void)
 	pv_lock_ops.queued_spin_unlock = PV_CALLEE_SAVE(__pv_queued_spin_unlock);
 	pv_lock_ops.wait = xen_qlock_wait;
 	pv_lock_ops.kick = xen_qlock_kick;
+
+	pv_lock_ops.vcpu_is_preempted = xen_vcpu_stolen;
 }
 
 /*
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 10/11] x86, xen: support vcpu preempted check
  2016-10-28  8:11 ` Pan Xinhui
                   ` (28 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: kernellwp, jgross, David.Laight, rkrcmar, peterz, benh,
	bsingharora, will.deacon, Pan Xinhui, borntraeger, mingo, paulus,
	mpe, pbonzini, paulmck, boqun.feng

From: Juergen Gross <jgross@suse.com>

Support the vcpu_is_preempted() functionality under Xen. This will
enhance lock performance on overcommitted hosts (more runnable vcpus
than physical cpus in the system) as doing busy waits for preempted
vcpus will hurt system performance far worse than early yielding.

A quick test (4 vcpus on 1 physical cpu doing a parallel build job
with "make -j 8") reduced system time by about 5% with this patch.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
---
 arch/x86/xen/spinlock.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c
index 3d6e006..74756bb 100644
--- a/arch/x86/xen/spinlock.c
+++ b/arch/x86/xen/spinlock.c
@@ -114,7 +114,6 @@ void xen_uninit_lock_cpu(int cpu)
 	per_cpu(irq_name, cpu) = NULL;
 }
 
-
 /*
  * Our init of PV spinlocks is split in two init functions due to us
  * using paravirt patching and jump labels patching and having to do
@@ -137,6 +136,8 @@ void __init xen_init_spinlocks(void)
 	pv_lock_ops.queued_spin_unlock = PV_CALLEE_SAVE(__pv_queued_spin_unlock);
 	pv_lock_ops.wait = xen_qlock_wait;
 	pv_lock_ops.kick = xen_qlock_kick;
+
+	pv_lock_ops.vcpu_is_preempted = xen_vcpu_stolen;
 }
 
 /*
-- 
2.4.11



^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH v6 11/11] Documentation: virtual: kvm: Support vcpu preempted check
  2016-10-28  8:11 ` Pan Xinhui
                   ` (30 preceding siblings ...)
  (?)
@ 2016-10-28  8:11 ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-28  8:11 UTC (permalink / raw)
  To: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86
  Cc: benh, paulus, mpe, mingo, peterz, paulmck, will.deacon,
	kernellwp, jgross, pbonzini, bsingharora, boqun.feng,
	borntraeger, rkrcmar, David.Laight, Pan Xinhui

Commit ("x86, kvm: support vcpu preempted check") added one field, "__u8
preempted", to struct kvm_steal_time. This field tells whether one vcpu
is running or not.

It is zero if 1) some old KVM does not support this field, or 2) the vcpu
is not preempted. Any other value means the vcpu has been preempted.
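
A hedged reader-side sketch of these semantics (the helper name is
illustrative; the real guest code lives in arch/x86/kernel/kvm.c):

	/* zero covers both "not preempted" and hypervisors without the field */
	static bool steal_time_preempted(const struct kvm_steal_time *st)
	{
		return st->preempted != 0;
	}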

Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Acked-by: Radim Krčmář <rkrcmar@redhat.com>
---
 Documentation/virtual/kvm/msr.txt | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/Documentation/virtual/kvm/msr.txt b/Documentation/virtual/kvm/msr.txt
index 2a71c8f..ab2ab76 100644
--- a/Documentation/virtual/kvm/msr.txt
+++ b/Documentation/virtual/kvm/msr.txt
@@ -208,7 +208,9 @@ MSR_KVM_STEAL_TIME: 0x4b564d03
 		__u64 steal;
 		__u32 version;
 		__u32 flags;
-		__u32 pad[12];
+		__u8  preempted;
+		__u8  u8_pad[3];
+		__u32 pad[11];
 	}
 
 	whose data will be filled in by the hypervisor periodically. Only one
@@ -232,6 +234,11 @@ MSR_KVM_STEAL_TIME: 0x4b564d03
 		nanoseconds. Time during which the vcpu is idle, will not be
 		reported as steal time.
 
+		preempted: indicates whether the VCPU that owns this struct is
+		preempted. Non-zero values mean the VCPU has been preempted.
+		Zero means the VCPU is not preempted. NOTE: it is always zero
+		if the hypervisor doesn't support this field.
+
 MSR_KVM_EOI_EN: 0x4b564d04
 	data: Bit 0 is 1 when PV end of interrupt is enabled on the vcpu; 0
 	when disabled.  Bit 1 is reserved and must be zero.  When PV end of
-- 
2.4.11

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [PATCH v6 00/11] implement vcpu preempted check
  2016-10-28  8:11 ` Pan Xinhui
                   ` (33 preceding siblings ...)
  (?)
@ 2016-10-28  9:57 ` Paolo Bonzini
  -1 siblings, 0 replies; 57+ messages in thread
From: Paolo Bonzini @ 2016-10-28  9:57 UTC (permalink / raw)
  To: Pan Xinhui, linux-kernel, linuxppc-dev, virtualization,
	linux-s390, xen-devel-request, kvm, xen-devel, x86
  Cc: benh, paulus, mpe, mingo, peterz, paulmck, will.deacon,
	kernellwp, jgross, bsingharora, boqun.feng, borntraeger, rkrcmar,
	David.Laight



On 28/10/2016 10:11, Pan Xinhui wrote:
> change from v5:
> 	spilt x86/kvm patch into guest/host part.
> 	introduce kvm_write_guest_offset_cached.
> 	fix some typos.
> 	rebase patch onto 4.9.2

Acked-by: Paolo Bonzini <pbonzini@redhat.com>

Thanks,

Paolo

> change from v4:
> 	spilt x86 kvm vcpu preempted check into two patches.
> 	add documentation patch.
> 	add x86 vcpu preempted check patch under xen
> 	add s390 vcpu preempted check patch 
> change from v3:
> 	add x86 vcpu preempted check patch
> change from v2:
> 	no code change, fix typos, update some comments
> change from v1:
> 	a simplier definition of default vcpu_is_preempted
> 	skip mahcine type check on ppc, and add config. remove dedicated macro.
> 	add one patch to drop overload of rwsem_spin_on_owner and mutex_spin_on_owner. 
> 	add more comments
> 	thanks boqun and Peter's suggestion.
> 
> This patch set aims to fix lock holder preemption issues.
> 
> test-case:
> perf record -a perf bench sched messaging -g 400 -p && perf report
> 
> 18.09%  sched-messaging  [kernel.vmlinux]  [k] osq_lock
> 12.28%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
>  5.27%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
>  3.89%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task
>  3.64%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
>  3.41%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner.is
>  2.49%  sched-messaging  [kernel.vmlinux]  [k] system_call
> 
> We introduce interface bool vcpu_is_preempted(int cpu) and use it in some spin
> loops of osq_lock, rwsem_spin_on_owner and mutex_spin_on_owner.
> These spin_on_onwer variant also cause rcu stall before we apply this patch set
> 
> We also have observed some performace improvements in uninx benchmark tests.
> 
> PPC test result:
> 1 copy - 0.94%
> 2 copy - 7.17%
> 4 copy - 11.9%
> 8 copy -  3.04%
> 16 copy - 15.11%
> 
> details below:
> Without patch:
> 
> 1 copy - File Write 4096 bufsize 8000 maxblocks      2188223.0 KBps  (30.0 s, 1 samples)
> 2 copy - File Write 4096 bufsize 8000 maxblocks      1804433.0 KBps  (30.0 s, 1 samples)
> 4 copy - File Write 4096 bufsize 8000 maxblocks      1237257.0 KBps  (30.0 s, 1 samples)
> 8 copy - File Write 4096 bufsize 8000 maxblocks      1032658.0 KBps  (30.0 s, 1 samples)
> 16 copy - File Write 4096 bufsize 8000 maxblocks       768000.0 KBps  (30.1 s, 1 samples)
> 
> With patch: 
> 
> 1 copy - File Write 4096 bufsize 8000 maxblocks      2209189.0 KBps  (30.0 s, 1 samples)
> 2 copy - File Write 4096 bufsize 8000 maxblocks      1943816.0 KBps  (30.0 s, 1 samples)
> 4 copy - File Write 4096 bufsize 8000 maxblocks      1405591.0 KBps  (30.0 s, 1 samples)
> 8 copy - File Write 4096 bufsize 8000 maxblocks      1065080.0 KBps  (30.0 s, 1 samples)
> 16 copy - File Write 4096 bufsize 8000 maxblocks       904762.0 KBps  (30.0 s, 1 samples)
> 
> X86 test result:
> 	test-case			after-patch	  before-patch
> Execl Throughput                       |    18307.9 lps  |    11701.6 lps 
> File Copy 1024 bufsize 2000 maxblocks  |  1352407.3 KBps |   790418.9 KBps
> File Copy 256 bufsize 500 maxblocks    |   367555.6 KBps |   222867.7 KBps
> File Copy 4096 bufsize 8000 maxblocks  |  3675649.7 KBps |  1780614.4 KBps
> Pipe Throughput                        | 11872208.7 lps  | 11855628.9 lps 
> Pipe-based Context Switching           |  1495126.5 lps  |  1490533.9 lps 
> Process Creation                       |    29881.2 lps  |    28572.8 lps 
> Shell Scripts (1 concurrent)           |    23224.3 lpm  |    22607.4 lpm 
> Shell Scripts (8 concurrent)           |     3531.4 lpm  |     3211.9 lpm 
> System Call Overhead                   | 10385653.0 lps  | 10419979.0 lps 
> 
> Christian Borntraeger (1):
>   s390/spinlock: Provide vcpu_is_preempted
> 
> Juergen Gross (1):
>   x86, xen: support vcpu preempted check
> 
> Pan Xinhui (9):
>   kernel/sched: introduce vcpu preempted check interface
>   locking/osq: Drop the overload of osq_lock()
>   kernel/locking: Drop the overload of {mutex,rwsem}_spin_on_owner
>   powerpc/spinlock: support vcpu preempted check
>   x86, paravirt: Add interface to support kvm/xen vcpu preempted check
>   KVM: Introduce kvm_write_guest_offset_cached
>   x86, kvm/x86.c: support vcpu preempted check
>   x86, kernel/kvm.c: support vcpu preempted check
>   Documentation: virtual: kvm: Support vcpu preempted check
> 
>  Documentation/virtual/kvm/msr.txt     |  9 ++++++++-
>  arch/powerpc/include/asm/spinlock.h   |  8 ++++++++
>  arch/s390/include/asm/spinlock.h      |  8 ++++++++
>  arch/s390/kernel/smp.c                |  9 +++++++--
>  arch/s390/lib/spinlock.c              | 25 ++++++++-----------------
>  arch/x86/include/asm/paravirt_types.h |  2 ++
>  arch/x86/include/asm/spinlock.h       |  8 ++++++++
>  arch/x86/include/uapi/asm/kvm_para.h  |  4 +++-
>  arch/x86/kernel/kvm.c                 | 12 ++++++++++++
>  arch/x86/kernel/paravirt-spinlocks.c  |  6 ++++++
>  arch/x86/kvm/x86.c                    | 16 ++++++++++++++++
>  arch/x86/xen/spinlock.c               |  3 ++-
>  include/linux/kvm_host.h              |  2 ++
>  include/linux/sched.h                 | 12 ++++++++++++
>  kernel/locking/mutex.c                | 15 +++++++++++++--
>  kernel/locking/osq_lock.c             | 10 +++++++++-
>  kernel/locking/rwsem-xadd.c           | 16 +++++++++++++---
>  virt/kvm/kvm_main.c                   | 20 ++++++++++++++------
>  18 files changed, 151 insertions(+), 34 deletions(-)
> 
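
As a concrete illustration of the pattern described in the quoted cover letter,
the stand-alone sketch below spins only while the lock owner looks runnable and
bails out once a vcpu_is_preempted()-style hint reports the owner's vcpu has
been scheduled out. The lock structure and the stubbed hint are invented for
the example and are not the kernel's osq/mutex code.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

struct demo_lock {
	atomic_int owner_cpu;	/* -1 when unlocked */
};

/* Stub standing in for the paravirt hook; a real guest would consult the
 * hypervisor (steal_time.preempted on KVM, the runstate on Xen). */
static bool vcpu_is_preempted(int cpu)
{
	return cpu == 2;	/* pretend cpu 2 is currently preempted */
}

/*
 * Returns true if the lock was released while we spun (worth retrying),
 * false if spinning stopped making sense because the owner's vcpu was
 * preempted: the caller should then yield/block instead of burning cycles.
 */
static bool spin_on_owner(struct demo_lock *lock)
{
	int owner;

	while ((owner = atomic_load(&lock->owner_cpu)) >= 0) {
		if (vcpu_is_preempted(owner))
			return false;
	}
	return true;
}

int main(void)
{
	struct demo_lock lock;

	atomic_init(&lock.owner_cpu, 2);
	printf("worth spinning: %d\n", spin_on_owner(&lock));
	return 0;
}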

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Xen-devel] [PATCH v6 00/11] implement vcpu preempted check
  2016-10-28  8:11 ` Pan Xinhui
@ 2016-10-28 19:38   ` Konrad Rzeszutek Wilk
  -1 siblings, 0 replies; 57+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-10-28 19:38 UTC (permalink / raw)
  To: Pan Xinhui
  Cc: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86, kernellwp, jgross,
	David.Laight, rkrcmar, peterz, benh, bsingharora, will.deacon,
	borntraeger, mingo, paulus, mpe, pbonzini, paulmck, boqun.feng

On Fri, Oct 28, 2016 at 04:11:16AM -0400, Pan Xinhui wrote:
> change from v5:
> 	spilt x86/kvm patch into guest/host part.
> 	introduce kvm_write_guest_offset_cached.
> 	fix some typos.
> 	rebase patch onto 4.9.2
> change from v4:
> 	spilt x86 kvm vcpu preempted check into two patches.
> 	add documentation patch.
> 	add x86 vcpu preempted check patch under xen
> 	add s390 vcpu preempted check patch 
> change from v3:
> 	add x86 vcpu preempted check patch
> change from v2:
> 	no code change, fix typos, update some comments
> change from v1:
> 	a simplier definition of default vcpu_is_preempted
> 	skip mahcine type check on ppc, and add config. remove dedicated macro.
> 	add one patch to drop overload of rwsem_spin_on_owner and mutex_spin_on_owner. 
> 	add more comments
> 	thanks boqun and Peter's suggestion.
> 
> This patch set aims to fix lock holder preemption issues.

Do you have a git tree with these patches?

> 
> test-case:
> perf record -a perf bench sched messaging -g 400 -p && perf report
> 
> 18.09%  sched-messaging  [kernel.vmlinux]  [k] osq_lock
> 12.28%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
>  5.27%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
>  3.89%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task
>  3.64%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
>  3.41%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner.is
>  2.49%  sched-messaging  [kernel.vmlinux]  [k] system_call
> 
> We introduce interface bool vcpu_is_preempted(int cpu) and use it in some spin
> loops of osq_lock, rwsem_spin_on_owner and mutex_spin_on_owner.
> These spin_on_onwer variant also cause rcu stall before we apply this patch set
> 
> We also have observed some performace improvements in uninx benchmark tests.
> 
> PPC test result:
> 1 copy - 0.94%
> 2 copy - 7.17%
> 4 copy - 11.9%
> 8 copy -  3.04%
> 16 copy - 15.11%
> 
> details below:
> Without patch:
> 
> 1 copy - File Write 4096 bufsize 8000 maxblocks      2188223.0 KBps  (30.0 s, 1 samples)
> 2 copy - File Write 4096 bufsize 8000 maxblocks      1804433.0 KBps  (30.0 s, 1 samples)
> 4 copy - File Write 4096 bufsize 8000 maxblocks      1237257.0 KBps  (30.0 s, 1 samples)
> 8 copy - File Write 4096 bufsize 8000 maxblocks      1032658.0 KBps  (30.0 s, 1 samples)
> 16 copy - File Write 4096 bufsize 8000 maxblocks       768000.0 KBps  (30.1 s, 1 samples)
> 
> With patch: 
> 
> 1 copy - File Write 4096 bufsize 8000 maxblocks      2209189.0 KBps  (30.0 s, 1 samples)
> 2 copy - File Write 4096 bufsize 8000 maxblocks      1943816.0 KBps  (30.0 s, 1 samples)
> 4 copy - File Write 4096 bufsize 8000 maxblocks      1405591.0 KBps  (30.0 s, 1 samples)
> 8 copy - File Write 4096 bufsize 8000 maxblocks      1065080.0 KBps  (30.0 s, 1 samples)
> 16 copy - File Write 4096 bufsize 8000 maxblocks       904762.0 KBps  (30.0 s, 1 samples)
> 
> X86 test result:
> 	test-case			after-patch	  before-patch
> Execl Throughput                       |    18307.9 lps  |    11701.6 lps 
> File Copy 1024 bufsize 2000 maxblocks  |  1352407.3 KBps |   790418.9 KBps
> File Copy 256 bufsize 500 maxblocks    |   367555.6 KBps |   222867.7 KBps
> File Copy 4096 bufsize 8000 maxblocks  |  3675649.7 KBps |  1780614.4 KBps
> Pipe Throughput                        | 11872208.7 lps  | 11855628.9 lps 
> Pipe-based Context Switching           |  1495126.5 lps  |  1490533.9 lps 
> Process Creation                       |    29881.2 lps  |    28572.8 lps 
> Shell Scripts (1 concurrent)           |    23224.3 lpm  |    22607.4 lpm 
> Shell Scripts (8 concurrent)           |     3531.4 lpm  |     3211.9 lpm 
> System Call Overhead                   | 10385653.0 lps  | 10419979.0 lps 
> 
> Christian Borntraeger (1):
>   s390/spinlock: Provide vcpu_is_preempted
> 
> Juergen Gross (1):
>   x86, xen: support vcpu preempted check
> 
> Pan Xinhui (9):
>   kernel/sched: introduce vcpu preempted check interface
>   locking/osq: Drop the overload of osq_lock()
>   kernel/locking: Drop the overload of {mutex,rwsem}_spin_on_owner
>   powerpc/spinlock: support vcpu preempted check
>   x86, paravirt: Add interface to support kvm/xen vcpu preempted check
>   KVM: Introduce kvm_write_guest_offset_cached
>   x86, kvm/x86.c: support vcpu preempted check
>   x86, kernel/kvm.c: support vcpu preempted check
>   Documentation: virtual: kvm: Support vcpu preempted check
> 
>  Documentation/virtual/kvm/msr.txt     |  9 ++++++++-
>  arch/powerpc/include/asm/spinlock.h   |  8 ++++++++
>  arch/s390/include/asm/spinlock.h      |  8 ++++++++
>  arch/s390/kernel/smp.c                |  9 +++++++--
>  arch/s390/lib/spinlock.c              | 25 ++++++++-----------------
>  arch/x86/include/asm/paravirt_types.h |  2 ++
>  arch/x86/include/asm/spinlock.h       |  8 ++++++++
>  arch/x86/include/uapi/asm/kvm_para.h  |  4 +++-
>  arch/x86/kernel/kvm.c                 | 12 ++++++++++++
>  arch/x86/kernel/paravirt-spinlocks.c  |  6 ++++++
>  arch/x86/kvm/x86.c                    | 16 ++++++++++++++++
>  arch/x86/xen/spinlock.c               |  3 ++-
>  include/linux/kvm_host.h              |  2 ++
>  include/linux/sched.h                 | 12 ++++++++++++
>  kernel/locking/mutex.c                | 15 +++++++++++++--
>  kernel/locking/osq_lock.c             | 10 +++++++++-
>  kernel/locking/rwsem-xadd.c           | 16 +++++++++++++---
>  virt/kvm/kvm_main.c                   | 20 ++++++++++++++------
>  18 files changed, 151 insertions(+), 34 deletions(-)
> 
> -- 
> 2.4.11
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Xen-devel] [PATCH v6 10/11] x86, xen: support vcpu preempted check
  2016-10-28  8:11 ` [PATCH v6 10/11] x86, xen: " Pan Xinhui
@ 2016-10-28 19:43     ` Konrad Rzeszutek Wilk
  2016-10-28 19:43   ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 57+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-10-28 19:43 UTC (permalink / raw)
  To: Pan Xinhui
  Cc: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86, kernellwp, jgross,
	David.Laight, rkrcmar, peterz, benh, bsingharora, will.deacon,
	borntraeger, mingo, paulus, mpe, pbonzini, paulmck, boqun.feng

On Fri, Oct 28, 2016 at 04:11:26AM -0400, Pan Xinhui wrote:
> From: Juergen Gross <jgross@suse.com>
> 
> Support the vcpu_is_preempted() functionality under Xen. This will
> enhance lock performance on overcommitted hosts (more runnable vcpus
> than physical cpus in the system) as doing busy waits for preempted
> vcpus will hurt system performance far worse than early yielding.
> 
> A quick test (4 vcpus on 1 physical cpu doing a parallel build job
> with "make -j 8") reduced system time by about 5% with this patch.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
> ---
>  arch/x86/xen/spinlock.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c
> index 3d6e006..74756bb 100644
> --- a/arch/x86/xen/spinlock.c
> +++ b/arch/x86/xen/spinlock.c
> @@ -114,7 +114,6 @@ void xen_uninit_lock_cpu(int cpu)
>  	per_cpu(irq_name, cpu) = NULL;
>  }
>  
> -

Spurious change.
>  /*
>   * Our init of PV spinlocks is split in two init functions due to us
>   * using paravirt patching and jump labels patching and having to do
> @@ -137,6 +136,8 @@ void __init xen_init_spinlocks(void)
>  	pv_lock_ops.queued_spin_unlock = PV_CALLEE_SAVE(__pv_queued_spin_unlock);
>  	pv_lock_ops.wait = xen_qlock_wait;
>  	pv_lock_ops.kick = xen_qlock_kick;
> +
> +	pv_lock_ops.vcpu_is_preempted = xen_vcpu_stolen;
>  }
>  
>  /*
> -- 
> 2.4.11
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel
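
For background on the xen_vcpu_stolen() hook wired up in the quoted hunk: Xen
exposes each vcpu's runstate to the guest, and a vcpu that is runnable but not
running is exactly one whose time is being stolen. The snippet below only
models that idea in plain C; the enum and the array are simplified stand-ins,
not Xen's actual runstate ABI.

#include <stdbool.h>
#include <stdio.h>

/* Simplified model of per-vcpu scheduling state as seen by the guest. */
enum runstate { RUNNING, RUNNABLE, BLOCKED, OFFLINE };

#define NR_VCPUS 4
static enum runstate runstate[NR_VCPUS] = { RUNNING, RUNNABLE, BLOCKED, RUNNING };

/* "Runnable but not running" means the vcpu wants the CPU and is not
 * getting it, i.e. its time is being stolen -- the preempted case. */
static bool vcpu_stolen(int vcpu)
{
	return runstate[vcpu] == RUNNABLE;
}

int main(void)
{
	for (int cpu = 0; cpu < NR_VCPUS; cpu++)
		printf("vcpu%d stolen: %d\n", cpu, vcpu_stolen(cpu));
	return 0;
}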

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Xen-devel] [PATCH v6 10/11] x86, xen: support vcpu preempted check
  2016-10-28 19:43     ` Konrad Rzeszutek Wilk
@ 2016-10-29  4:26       ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-29  4:26 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, Pan Xinhui
  Cc: kvm, rkrcmar, peterz, benh, will.deacon, virtualization, paulus,
	kernellwp, linux-s390, xen-devel-request, x86, mingo, xen-devel,
	paulmck, boqun.feng, jgross, linux-kernel, David.Laight, mpe,
	pbonzini, linuxppc-dev



在 2016/10/29 03:43, Konrad Rzeszutek Wilk 写道:
> On Fri, Oct 28, 2016 at 04:11:26AM -0400, Pan Xinhui wrote:
>> From: Juergen Gross <jgross@suse.com>
>>
>> Support the vcpu_is_preempted() functionality under Xen. This will
>> enhance lock performance on overcommitted hosts (more runnable vcpus
>> than physical cpus in the system) as doing busy waits for preempted
>> vcpus will hurt system performance far worse than early yielding.
>>
>> A quick test (4 vcpus on 1 physical cpu doing a parallel build job
>> with "make -j 8") reduced system time by about 5% with this patch.
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
>> ---
>>  arch/x86/xen/spinlock.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c
>> index 3d6e006..74756bb 100644
>> --- a/arch/x86/xen/spinlock.c
>> +++ b/arch/x86/xen/spinlock.c
>> @@ -114,7 +114,6 @@ void xen_uninit_lock_cpu(int cpu)
>>  	per_cpu(irq_name, cpu) = NULL;
>>  }
>>
>> -
>
> Spurious change.
Well, I just removed an unnecessary blank line while at it.

>>  /*
>>   * Our init of PV spinlocks is split in two init functions due to us
>>   * using paravirt patching and jump labels patching and having to do
>> @@ -137,6 +136,8 @@ void __init xen_init_spinlocks(void)
>>  	pv_lock_ops.queued_spin_unlock = PV_CALLEE_SAVE(__pv_queued_spin_unlock);
>>  	pv_lock_ops.wait = xen_qlock_wait;
>>  	pv_lock_ops.kick = xen_qlock_kick;
>> +
>> +	pv_lock_ops.vcpu_is_preempted = xen_vcpu_stolen;
>>  }
>>
>>  /*
>> --
>> 2.4.11
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> https://lists.xen.org/xen-devel
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Xen-devel] [PATCH v6 00/11] implement vcpu preempted check
  2016-10-28 19:38   ` Konrad Rzeszutek Wilk
@ 2016-10-29  4:37     ` Pan Xinhui
  -1 siblings, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-29  4:37 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, Pan Xinhui
  Cc: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86, kernellwp, jgross,
	David.Laight, rkrcmar, peterz, benh, bsingharora, will.deacon,
	borntraeger, mingo, paulus, mpe, pbonzini, paulmck, boqun.feng

[-- Attachment #1: Type: text/plain, Size: 5899 bytes --]



在 2016/10/29 03:38, Konrad Rzeszutek Wilk 写道:
> On Fri, Oct 28, 2016 at 04:11:16AM -0400, Pan Xinhui wrote:
>> change from v5:
>> 	spilt x86/kvm patch into guest/host part.
>> 	introduce kvm_write_guest_offset_cached.
>> 	fix some typos.
>> 	rebase patch onto 4.9.2
>> change from v4:
>> 	spilt x86 kvm vcpu preempted check into two patches.
>> 	add documentation patch.
>> 	add x86 vcpu preempted check patch under xen
>> 	add s390 vcpu preempted check patch
>> change from v3:
>> 	add x86 vcpu preempted check patch
>> change from v2:
>> 	no code change, fix typos, update some comments
>> change from v1:
>> 	a simplier definition of default vcpu_is_preempted
>> 	skip mahcine type check on ppc, and add config. remove dedicated macro.
>> 	add one patch to drop overload of rwsem_spin_on_owner and mutex_spin_on_owner.
>> 	add more comments
>> 	thanks boqun and Peter's suggestion.
>>
>> This patch set aims to fix lock holder preemption issues.
>
> Do you have a git tree with these patches?
>
Currently no, sorry :(

I made a tar file for this patch set. Maybe it's a little easier to apply :)

thanks
xinhui
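
P.S. For anyone skimming the tarball: the whole series hangs off one
tiny generic hook, and everything else is just architectures overriding
it and the spin loops calling it. Roughly (a sketch of the shape of
patch 1, not the exact hunk):

	/* include/linux/sched.h */
	#ifndef vcpu_is_preempted
	/* default: assume the (v)CPU holding the lock is never preempted */
	#define vcpu_is_preempted(cpu)	false
	#endif

	/* a spin loop can then bail out early, e.g. */
	/*   if (need_resched() || vcpu_is_preempted(owner_cpu)) break;  */

where owner_cpu stands for whatever cpu the particular lock records for
its owner.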

>>
>> test-case:
>> perf record -a perf bench sched messaging -g 400 -p && perf report
>>
>> 18.09%  sched-messaging  [kernel.vmlinux]  [k] osq_lock
>> 12.28%  sched-messaging  [kernel.vmlinux]  [k] rwsem_spin_on_owner
>>  5.27%  sched-messaging  [kernel.vmlinux]  [k] mutex_unlock
>>  3.89%  sched-messaging  [kernel.vmlinux]  [k] wait_consider_task
>>  3.64%  sched-messaging  [kernel.vmlinux]  [k] _raw_write_lock_irq
>>  3.41%  sched-messaging  [kernel.vmlinux]  [k] mutex_spin_on_owner.is
>>  2.49%  sched-messaging  [kernel.vmlinux]  [k] system_call
>>
>> We introduce interface bool vcpu_is_preempted(int cpu) and use it in some spin
>> loops of osq_lock, rwsem_spin_on_owner and mutex_spin_on_owner.
>> These spin_on_owner variants also cause rcu stalls before we apply this patch set
>>
>> We have also observed some performance improvements in unix benchmark tests.
>>
>> PPC test result:
>> 1 copy - 0.94%
>> 2 copy - 7.17%
>> 4 copy - 11.9%
>> 8 copy -  3.04%
>> 16 copy - 15.11%
>>
>> details below:
>> Without patch:
>>
>> 1 copy - File Write 4096 bufsize 8000 maxblocks      2188223.0 KBps  (30.0 s, 1 samples)
>> 2 copy - File Write 4096 bufsize 8000 maxblocks      1804433.0 KBps  (30.0 s, 1 samples)
>> 4 copy - File Write 4096 bufsize 8000 maxblocks      1237257.0 KBps  (30.0 s, 1 samples)
>> 8 copy - File Write 4096 bufsize 8000 maxblocks      1032658.0 KBps  (30.0 s, 1 samples)
>> 16 copy - File Write 4096 bufsize 8000 maxblocks       768000.0 KBps  (30.1 s, 1 samples)
>>
>> With patch:
>>
>> 1 copy - File Write 4096 bufsize 8000 maxblocks      2209189.0 KBps  (30.0 s, 1 samples)
>> 2 copy - File Write 4096 bufsize 8000 maxblocks      1943816.0 KBps  (30.0 s, 1 samples)
>> 4 copy - File Write 4096 bufsize 8000 maxblocks      1405591.0 KBps  (30.0 s, 1 samples)
>> 8 copy - File Write 4096 bufsize 8000 maxblocks      1065080.0 KBps  (30.0 s, 1 samples)
>> 16 copy - File Write 4096 bufsize 8000 maxblocks       904762.0 KBps  (30.0 s, 1 samples)
>>
>> X86 test result:
>> 	test-case			after-patch	  before-patch
>> Execl Throughput                       |    18307.9 lps  |    11701.6 lps
>> File Copy 1024 bufsize 2000 maxblocks  |  1352407.3 KBps |   790418.9 KBps
>> File Copy 256 bufsize 500 maxblocks    |   367555.6 KBps |   222867.7 KBps
>> File Copy 4096 bufsize 8000 maxblocks  |  3675649.7 KBps |  1780614.4 KBps
>> Pipe Throughput                        | 11872208.7 lps  | 11855628.9 lps
>> Pipe-based Context Switching           |  1495126.5 lps  |  1490533.9 lps
>> Process Creation                       |    29881.2 lps  |    28572.8 lps
>> Shell Scripts (1 concurrent)           |    23224.3 lpm  |    22607.4 lpm
>> Shell Scripts (8 concurrent)           |     3531.4 lpm  |     3211.9 lpm
>> System Call Overhead                   | 10385653.0 lps  | 10419979.0 lps
>>
>> Christian Borntraeger (1):
>>   s390/spinlock: Provide vcpu_is_preempted
>>
>> Juergen Gross (1):
>>   x86, xen: support vcpu preempted check
>>
>> Pan Xinhui (9):
>>   kernel/sched: introduce vcpu preempted check interface
>>   locking/osq: Drop the overload of osq_lock()
>>   kernel/locking: Drop the overload of {mutex,rwsem}_spin_on_owner
>>   powerpc/spinlock: support vcpu preempted check
>>   x86, paravirt: Add interface to support kvm/xen vcpu preempted check
>>   KVM: Introduce kvm_write_guest_offset_cached
>>   x86, kvm/x86.c: support vcpu preempted check
>>   x86, kernel/kvm.c: support vcpu preempted check
>>   Documentation: virtual: kvm: Support vcpu preempted check
>>
>>  Documentation/virtual/kvm/msr.txt     |  9 ++++++++-
>>  arch/powerpc/include/asm/spinlock.h   |  8 ++++++++
>>  arch/s390/include/asm/spinlock.h      |  8 ++++++++
>>  arch/s390/kernel/smp.c                |  9 +++++++--
>>  arch/s390/lib/spinlock.c              | 25 ++++++++-----------------
>>  arch/x86/include/asm/paravirt_types.h |  2 ++
>>  arch/x86/include/asm/spinlock.h       |  8 ++++++++
>>  arch/x86/include/uapi/asm/kvm_para.h  |  4 +++-
>>  arch/x86/kernel/kvm.c                 | 12 ++++++++++++
>>  arch/x86/kernel/paravirt-spinlocks.c  |  6 ++++++
>>  arch/x86/kvm/x86.c                    | 16 ++++++++++++++++
>>  arch/x86/xen/spinlock.c               |  3 ++-
>>  include/linux/kvm_host.h              |  2 ++
>>  include/linux/sched.h                 | 12 ++++++++++++
>>  kernel/locking/mutex.c                | 15 +++++++++++++--
>>  kernel/locking/osq_lock.c             | 10 +++++++++-
>>  kernel/locking/rwsem-xadd.c           | 16 +++++++++++++---
>>  virt/kvm/kvm_main.c                   | 20 ++++++++++++++------
>>  18 files changed, 151 insertions(+), 34 deletions(-)
>>
>> --
>> 2.4.11
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> https://lists.xen.org/xen-devel
>

[-- Attachment #2: vcpu.tar --]
[-- Type: application/x-tar, Size: 81920 bytes --]

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v6 02/11] locking/osq: Drop the overload of osq_lock()
  2016-10-28  8:11 ` Pan Xinhui
  2016-10-29 16:52   ` Davidlohr Bueso
  2016-10-29 16:52   ` Davidlohr Bueso
@ 2016-10-29 16:52   ` Davidlohr Bueso
  2016-10-30 14:39       ` Pan Xinhui
  2016-10-30 14:39     ` Pan Xinhui
  2 siblings, 2 replies; 57+ messages in thread
From: Davidlohr Bueso @ 2016-10-29 16:52 UTC (permalink / raw)
  To: Pan Xinhui
  Cc: linux-kernel, linuxppc-dev, virtualization, linux-s390,
	xen-devel-request, kvm, xen-devel, x86, benh, paulus, mpe, mingo,
	peterz, paulmck, will.deacon, kernellwp, jgross, pbonzini,
	bsingharora, boqun.feng, borntraeger, rkrcmar, David.Laight

On Fri, 28 Oct 2016, Pan Xinhui wrote:
> 		/*
> 		 * If we need to reschedule bail... so we can block.
>+		 * Use vcpu_is_preempted to detech lock holder preemption issue
                                            ^^ detect
>+		 * and break. 

Could you please remove the rest of this comment? It's just noise to point out
that vcpu_is_preempted is a macro defined by arch/false. This is standard protocol
in the kernel.

Same goes for all locks you change with this.
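
IOW, something this brief would do (illustrative only, reusing the
node_cpu() helper from your patch):

	/*
	 * If we need to reschedule bail... so we can block.  Also bail
	 * if the lock holder's vcpu is preempted -- spinning on it would
	 * only burn cycles.
	 */
	if (need_resched() || vcpu_is_preempted(node_cpu(node->prev)))
		goto unqueue;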

Thanks,
Davidlohr

>                * vcpu_is_preempted is a macro defined by false if
>+		 * arch does not support vcpu preempted check,
> 		 */
>-		if (need_resched())
>+		if (need_resched() || vcpu_is_preempted(node_cpu(node->prev)))
> 			goto unqueue;
>
> 		cpu_relax_lowlatency();
>-- 
>2.4.11
>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH v6 02/11] locking/osq: Drop the overload of osq_lock()
  2016-10-29 16:52   ` Davidlohr Bueso
@ 2016-10-30 14:39       ` Pan Xinhui
  2016-10-30 14:39     ` Pan Xinhui
  1 sibling, 0 replies; 57+ messages in thread
From: Pan Xinhui @ 2016-10-30 14:39 UTC (permalink / raw)
  To: Davidlohr Bueso, Pan Xinhui
  Cc: kvm, rkrcmar, peterz, benh, will.deacon, virtualization, paulus,
	kernellwp, linux-s390, xen-devel-request, x86, mingo, xen-devel,
	paulmck, boqun.feng, jgross, linux-kernel, David.Laight, mpe,
	pbonzini, linuxppc-dev



On 2016/10/30 00:52, Davidlohr Bueso wrote:
> On Fri, 28 Oct 2016, Pan Xinhui wrote:
>>         /*
>>          * If we need to reschedule bail... so we can block.
>> +         * Use vcpu_is_preempted to detech lock holder preemption issue
>                                            ^^ detect
ok. thanks for pointing it out.
>> +         * and break.
>
> Could you please remove the rest of this comment? It's just noise to point out
> that vcpu_is_preempted is a macro defined by arch/false. This is standard protocol
> in the kernel.
>
fair enough.

> Same goes for all locks you change with this.
>
> Thanks,
> Davidlohr
>
>>                * vcpu_is_preempted is a macro defined by false if
>> +         * arch does not support vcpu preempted check,
>>          */
>> -        if (need_resched())
>> +        if (need_resched() || vcpu_is_preempted(node_cpu(node->prev)))
>>             goto unqueue;
>>
>>         cpu_relax_lowlatency();
>> --
>> 2.4.11
>>
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 57+ messages in thread

end of thread, other threads:[~2016-10-30 14:39 UTC | newest]

Thread overview: 57+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-28  8:11 [PATCH v6 00/11] implement vcpu preempted check Pan Xinhui
2016-10-28  8:11 ` Pan Xinhui
2016-10-28  8:11 ` [PATCH v6 01/11] kernel/sched: introduce vcpu preempted check interface Pan Xinhui
2016-10-28  8:11 ` Pan Xinhui
2016-10-28  8:11 ` Pan Xinhui
2016-10-28  8:11 ` [PATCH v6 02/11] locking/osq: Drop the overload of osq_lock() Pan Xinhui
2016-10-28  8:11 ` Pan Xinhui
2016-10-29 16:52   ` Davidlohr Bueso
2016-10-29 16:52   ` Davidlohr Bueso
2016-10-29 16:52   ` Davidlohr Bueso
2016-10-30 14:39     ` Pan Xinhui
2016-10-30 14:39       ` Pan Xinhui
2016-10-30 14:39     ` Pan Xinhui
2016-10-28  8:11 ` Pan Xinhui
2016-10-28  8:11 ` [PATCH v6 03/11] kernel/locking: Drop the overload of {mutex, rwsem}_spin_on_owner Pan Xinhui
2016-10-28  8:11 ` [PATCH v6 03/11] kernel/locking: Drop the overload of {mutex,rwsem}_spin_on_owner Pan Xinhui
2016-10-28  8:11   ` [PATCH v6 03/11] kernel/locking: Drop the overload of {mutex, rwsem}_spin_on_owner Pan Xinhui
2016-10-28  8:11   ` Pan Xinhui
2016-10-28  8:11 ` [PATCH v6 04/11] powerpc/spinlock: support vcpu preempted check Pan Xinhui
2016-10-28  8:11 ` Pan Xinhui
2016-10-28  8:11 ` Pan Xinhui
2016-10-28  8:11 ` [PATCH v6 05/11] s390/spinlock: Provide vcpu_is_preempted Pan Xinhui
2016-10-28  8:11 ` Pan Xinhui
2016-10-28  8:11 ` Pan Xinhui
2016-10-28  8:11 ` [PATCH v6 06/11] x86, paravirt: Add interface to support kvm/xen vcpu preempted check Pan Xinhui
2016-10-28  8:11 ` Pan Xinhui
2016-10-28  8:11 ` Pan Xinhui
2016-10-28  8:11 ` [PATCH v6 07/11] KVM: Introduce kvm_write_guest_offset_cached Pan Xinhui
2016-10-28  8:11 ` Pan Xinhui
2016-10-28  8:11 ` Pan Xinhui
2016-10-28  8:11 ` [PATCH v6 08/11] x86, kvm/x86.c: support vcpu preempted check Pan Xinhui
2016-10-28  8:11 ` Pan Xinhui
2016-10-28  8:11 ` Pan Xinhui
2016-10-28  8:11 ` [PATCH v6 09/11] x86, kernel/kvm.c: " Pan Xinhui
2016-10-28  8:11 ` Pan Xinhui
2016-10-28  8:11 ` Pan Xinhui
2016-10-28  8:11 ` [PATCH v6 10/11] x86, xen: " Pan Xinhui
2016-10-28 19:43   ` [Xen-devel] " Konrad Rzeszutek Wilk
2016-10-28 19:43     ` Konrad Rzeszutek Wilk
2016-10-29  4:26     ` Pan Xinhui
2016-10-29  4:26     ` [Xen-devel] " Pan Xinhui
2016-10-29  4:26       ` Pan Xinhui
2016-10-28 19:43   ` Konrad Rzeszutek Wilk
2016-10-28  8:11 ` Pan Xinhui
2016-10-28  8:11 ` Pan Xinhui
2016-10-28  8:11 ` [PATCH v6 11/11] Documentation: virtual: kvm: Support " Pan Xinhui
2016-10-28  8:11 ` Pan Xinhui
2016-10-28  8:11 ` Pan Xinhui
2016-10-28  9:57 ` [PATCH v6 00/11] implement " Paolo Bonzini
2016-10-28  9:57 ` Paolo Bonzini
2016-10-28  9:57 ` Paolo Bonzini
2016-10-28 19:38 ` Konrad Rzeszutek Wilk
2016-10-28 19:38 ` [Xen-devel] " Konrad Rzeszutek Wilk
2016-10-28 19:38   ` Konrad Rzeszutek Wilk
2016-10-29  4:37   ` Pan Xinhui
2016-10-29  4:37     ` Pan Xinhui
2016-10-29  4:37   ` Pan Xinhui
