linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
@ 2017-11-06 20:26 Eduardo Valentin
  2017-11-07 12:23 ` Paolo Bonzini
                   ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Eduardo Valentin @ 2017-11-06 20:26 UTC (permalink / raw)
  To: Paolo Bonzini, rkrcmar
  Cc: Matt Wilson, Eduardo Valentin, Jonathan Corbet, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, x86, Peter Zijlstra, Waiman Long,
	kvm, linux-doc, linux-kernel, Jan H . Schoenherr,
	Anthony Liguori

Currently, the existing qspinlock implementation will fallback to
test-and-set if the hypervisor has not set the PV_UNHALT flag.

This patch gives the opportunity to guest kernels to select
between test-and-set and the regular queueu fair lock implementation
based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
flag is not set, the code will still fall back to test-and-set,
but when the PV_DEDICATED flag is set, the code will use
the regular queue spinlock implementation.

With this patch, when in autoselect mode, the guest will
use the default spinlock implementation based on host feature
flags as follows:

PV_DEDICATED = 1, PV_UNHALT = anything: default is qspinlock
PV_DEDICATED = 0, PV_UNHALT = 1: default is pvqspinlock
PV_DEDICATED = 0, PV_UNHALT = 0: default is tas

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Jan H. Schoenherr <jschoenh@amazon.de>
Cc: Anthony Liguori <aliguori@amazon.com>
Suggested-by: Matt Wilson <msw@amazon.com>
Signed-off-by: Eduardo Valentin <eduval@amazon.com>
---
V3:
 - When PV_DEDICATED is set (1), qspinlock is selected,
   regardless of the value of PV_UNHAULT. Suggested by Paolo Bonzini. 
 - Refreshed on top of tip/master.
V2:
 - rebase on top of tip/master

 Documentation/virtual/kvm/cpuid.txt  | 6 ++++++
 arch/x86/include/asm/qspinlock.h     | 4 ++++
 arch/x86/include/uapi/asm/kvm_para.h | 1 +
 arch/x86/kernel/kvm.c                | 2 ++
 4 files changed, 13 insertions(+)

diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
index 3c65feb..117066a 100644
--- a/Documentation/virtual/kvm/cpuid.txt
+++ b/Documentation/virtual/kvm/cpuid.txt
@@ -54,6 +54,12 @@ KVM_FEATURE_PV_UNHALT              ||     7 || guest checks this feature bit
                                    ||       || before enabling paravirtualized
                                    ||       || spinlock support.
 ------------------------------------------------------------------------------
+KVM_FEATURE_PV_DEDICATED           ||     8 || guest checks this feature bit
+                                   ||       || to determine if they run on
+                                   ||       || dedicated vCPUs, allowing opti-
+                                   ||       || mizations such as usage of
+                                   ||       || qspinlocks.
+------------------------------------------------------------------------------
 KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||    24 || host will warn if no guest-side
                                    ||       || per-cpu warps are expected in
                                    ||       || kvmclock.
diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h
index 5e16b5d..de42694 100644
--- a/arch/x86/include/asm/qspinlock.h
+++ b/arch/x86/include/asm/qspinlock.h
@@ -3,6 +3,8 @@
 #define _ASM_X86_QSPINLOCK_H
 
 #include <linux/jump_label.h>
+#include <linux/kvm_para.h>
+
 #include <asm/cpufeature.h>
 #include <asm-generic/qspinlock_types.h>
 #include <asm/paravirt.h>
@@ -58,6 +60,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
 	if (!static_branch_likely(&virt_spin_lock_key))
 		return false;
 
+	if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
+		return false;
 	/*
 	 * On hypervisors without PARAVIRT_SPINLOCKS support we fall
 	 * back to a Test-and-Set spinlock, because fair locks have
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 554aa8f..85a9875 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -25,6 +25,7 @@
 #define KVM_FEATURE_STEAL_TIME		5
 #define KVM_FEATURE_PV_EOI		6
 #define KVM_FEATURE_PV_UNHALT		7
+#define KVM_FEATURE_PV_DEDICATED	8
 
 /* The last 8 bits are used to indicate how to interpret the flags field
  * in pvclock structure. If no bits are set, all flags are ignored.
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 8bb9594..dacd7cf 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -642,6 +642,8 @@ void __init kvm_spinlock_init(void)
 {
 	if (!kvm_para_available())
 		return;
+	if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
+		return;
 	/* Does host kernel support KVM_FEATURE_PV_UNHALT? */
 	if (!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT))
 		return;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-06 20:26 [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set Eduardo Valentin
@ 2017-11-07 12:23 ` Paolo Bonzini
  2017-11-07 12:39   ` Eduardo Valentin
  2017-11-08 17:36 ` Radim Krčmář
  2017-11-09 12:43 ` Wanpeng Li
  2 siblings, 1 reply; 27+ messages in thread
From: Paolo Bonzini @ 2017-11-07 12:23 UTC (permalink / raw)
  To: Eduardo Valentin, rkrcmar
  Cc: Matt Wilson, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, Peter Zijlstra, Waiman Long, kvm, linux-doc,
	linux-kernel, Jan H . Schoenherr, Anthony Liguori

On 06/11/2017 21:26, Eduardo Valentin wrote:
> Currently, the existing qspinlock implementation will fallback to
> test-and-set if the hypervisor has not set the PV_UNHALT flag.
> 
> This patch gives the opportunity to guest kernels to select
> between test-and-set and the regular queueu fair lock implementation
> based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> flag is not set, the code will still fall back to test-and-set,
> but when the PV_DEDICATED flag is set, the code will use
> the regular queue spinlock implementation.
> 
> With this patch, when in autoselect mode, the guest will
> use the default spinlock implementation based on host feature
> flags as follows:
> 
> PV_DEDICATED = 1, PV_UNHALT = anything: default is qspinlock
> PV_DEDICATED = 0, PV_UNHALT = 1: default is pvqspinlock
> PV_DEDICATED = 0, PV_UNHALT = 0: default is tas

Hi Eduardo,

besides the suggestion to use a separate word than the one for features,
is this still needed after Waiman's patch to adaptively switch between
tas and pvqspinlock?

Paolo

> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: x86@kernel.org
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Waiman Long <longman@redhat.com>
> Cc: kvm@vger.kernel.org
> Cc: linux-doc@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: Jan H. Schoenherr <jschoenh@amazon.de>
> Cc: Anthony Liguori <aliguori@amazon.com>
> Suggested-by: Matt Wilson <msw@amazon.com>
> Signed-off-by: Eduardo Valentin <eduval@amazon.com>
> ---
> V3:
>  - When PV_DEDICATED is set (1), qspinlock is selected,
>    regardless of the value of PV_UNHAULT. Suggested by Paolo Bonzini. 
>  - Refreshed on top of tip/master.
> V2:
>  - rebase on top of tip/master
> 
>  Documentation/virtual/kvm/cpuid.txt  | 6 ++++++
>  arch/x86/include/asm/qspinlock.h     | 4 ++++
>  arch/x86/include/uapi/asm/kvm_para.h | 1 +
>  arch/x86/kernel/kvm.c                | 2 ++
>  4 files changed, 13 insertions(+)
> 
> diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
> index 3c65feb..117066a 100644
> --- a/Documentation/virtual/kvm/cpuid.txt
> +++ b/Documentation/virtual/kvm/cpuid.txt
> @@ -54,6 +54,12 @@ KVM_FEATURE_PV_UNHALT              ||     7 || guest checks this feature bit
>                                     ||       || before enabling paravirtualized
>                                     ||       || spinlock support.
>  ------------------------------------------------------------------------------
> +KVM_FEATURE_PV_DEDICATED           ||     8 || guest checks this feature bit
> +                                   ||       || to determine if they run on
> +                                   ||       || dedicated vCPUs, allowing opti-
> +                                   ||       || mizations such as usage of
> +                                   ||       || qspinlocks.
> +------------------------------------------------------------------------------
>  KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||    24 || host will warn if no guest-side
>                                     ||       || per-cpu warps are expected in
>                                     ||       || kvmclock.
> diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h
> index 5e16b5d..de42694 100644
> --- a/arch/x86/include/asm/qspinlock.h
> +++ b/arch/x86/include/asm/qspinlock.h
> @@ -3,6 +3,8 @@
>  #define _ASM_X86_QSPINLOCK_H
>  
>  #include <linux/jump_label.h>
> +#include <linux/kvm_para.h>
> +
>  #include <asm/cpufeature.h>
>  #include <asm-generic/qspinlock_types.h>
>  #include <asm/paravirt.h>
> @@ -58,6 +60,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
>  	if (!static_branch_likely(&virt_spin_lock_key))
>  		return false;
>  
> +	if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> +		return false;
>  	/*
>  	 * On hypervisors without PARAVIRT_SPINLOCKS support we fall
>  	 * back to a Test-and-Set spinlock, because fair locks have
> diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
> index 554aa8f..85a9875 100644
> --- a/arch/x86/include/uapi/asm/kvm_para.h
> +++ b/arch/x86/include/uapi/asm/kvm_para.h
> @@ -25,6 +25,7 @@
>  #define KVM_FEATURE_STEAL_TIME		5
>  #define KVM_FEATURE_PV_EOI		6
>  #define KVM_FEATURE_PV_UNHALT		7
> +#define KVM_FEATURE_PV_DEDICATED	8
>  
>  /* The last 8 bits are used to indicate how to interpret the flags field
>   * in pvclock structure. If no bits are set, all flags are ignored.
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 8bb9594..dacd7cf 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -642,6 +642,8 @@ void __init kvm_spinlock_init(void)
>  {
>  	if (!kvm_para_available())
>  		return;
> +	if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> +		return;
>  	/* Does host kernel support KVM_FEATURE_PV_UNHALT? */
>  	if (!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT))
>  		return;
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-07 12:23 ` Paolo Bonzini
@ 2017-11-07 12:39   ` Eduardo Valentin
  2017-11-07 12:43     ` Paolo Bonzini
  0 siblings, 1 reply; 27+ messages in thread
From: Eduardo Valentin @ 2017-11-07 12:39 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Eduardo Valentin, rkrcmar, Matt Wilson, Jonathan Corbet,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Peter Zijlstra, Waiman Long, kvm, linux-doc, linux-kernel,
	Jan H . Schoenherr, Anthony Liguori

On Tue, Nov 07, 2017 at 01:23:56PM +0100, Paolo Bonzini wrote:
> On 06/11/2017 21:26, Eduardo Valentin wrote:
> > Currently, the existing qspinlock implementation will fallback to
> > test-and-set if the hypervisor has not set the PV_UNHALT flag.
> > 
> > This patch gives the opportunity to guest kernels to select
> > between test-and-set and the regular queueu fair lock implementation
> > based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> > flag is not set, the code will still fall back to test-and-set,
> > but when the PV_DEDICATED flag is set, the code will use
> > the regular queue spinlock implementation.
> > 
> > With this patch, when in autoselect mode, the guest will
> > use the default spinlock implementation based on host feature
> > flags as follows:
> > 
> > PV_DEDICATED = 1, PV_UNHALT = anything: default is qspinlock
> > PV_DEDICATED = 0, PV_UNHALT = 1: default is pvqspinlock
> > PV_DEDICATED = 0, PV_UNHALT = 0: default is tas
> 
> Hi Eduardo,
> 
> besides the suggestion to use a separate word than the one for features,

Ok. I will take a look.

> is this still needed after Waiman's patch to adaptively switch between
> tas and pvqspinlock?

Can you please point me to it ? Is it already in tip/master?

> 
> Paolo
> 
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> > Cc: Jonathan Corbet <corbet@lwn.net>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > Cc: x86@kernel.org
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Waiman Long <longman@redhat.com>
> > Cc: kvm@vger.kernel.org
> > Cc: linux-doc@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > Cc: Jan H. Schoenherr <jschoenh@amazon.de>
> > Cc: Anthony Liguori <aliguori@amazon.com>
> > Suggested-by: Matt Wilson <msw@amazon.com>
> > Signed-off-by: Eduardo Valentin <eduval@amazon.com>
> > ---
> > V3:
> >  - When PV_DEDICATED is set (1), qspinlock is selected,
> >    regardless of the value of PV_UNHAULT. Suggested by Paolo Bonzini. 
> >  - Refreshed on top of tip/master.
> > V2:
> >  - rebase on top of tip/master
> > 
> >  Documentation/virtual/kvm/cpuid.txt  | 6 ++++++
> >  arch/x86/include/asm/qspinlock.h     | 4 ++++
> >  arch/x86/include/uapi/asm/kvm_para.h | 1 +
> >  arch/x86/kernel/kvm.c                | 2 ++
> >  4 files changed, 13 insertions(+)
> > 
> > diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
> > index 3c65feb..117066a 100644
> > --- a/Documentation/virtual/kvm/cpuid.txt
> > +++ b/Documentation/virtual/kvm/cpuid.txt
> > @@ -54,6 +54,12 @@ KVM_FEATURE_PV_UNHALT              ||     7 || guest checks this feature bit
> >                                     ||       || before enabling paravirtualized
> >                                     ||       || spinlock support.
> >  ------------------------------------------------------------------------------
> > +KVM_FEATURE_PV_DEDICATED           ||     8 || guest checks this feature bit
> > +                                   ||       || to determine if they run on
> > +                                   ||       || dedicated vCPUs, allowing opti-
> > +                                   ||       || mizations such as usage of
> > +                                   ||       || qspinlocks.
> > +------------------------------------------------------------------------------
> >  KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||    24 || host will warn if no guest-side
> >                                     ||       || per-cpu warps are expected in
> >                                     ||       || kvmclock.
> > diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h
> > index 5e16b5d..de42694 100644
> > --- a/arch/x86/include/asm/qspinlock.h
> > +++ b/arch/x86/include/asm/qspinlock.h
> > @@ -3,6 +3,8 @@
> >  #define _ASM_X86_QSPINLOCK_H
> >  
> >  #include <linux/jump_label.h>
> > +#include <linux/kvm_para.h>
> > +
> >  #include <asm/cpufeature.h>
> >  #include <asm-generic/qspinlock_types.h>
> >  #include <asm/paravirt.h>
> > @@ -58,6 +60,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
> >  	if (!static_branch_likely(&virt_spin_lock_key))
> >  		return false;
> >  
> > +	if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> > +		return false;
> >  	/*
> >  	 * On hypervisors without PARAVIRT_SPINLOCKS support we fall
> >  	 * back to a Test-and-Set spinlock, because fair locks have
> > diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
> > index 554aa8f..85a9875 100644
> > --- a/arch/x86/include/uapi/asm/kvm_para.h
> > +++ b/arch/x86/include/uapi/asm/kvm_para.h
> > @@ -25,6 +25,7 @@
> >  #define KVM_FEATURE_STEAL_TIME		5
> >  #define KVM_FEATURE_PV_EOI		6
> >  #define KVM_FEATURE_PV_UNHALT		7
> > +#define KVM_FEATURE_PV_DEDICATED	8
> >  
> >  /* The last 8 bits are used to indicate how to interpret the flags field
> >   * in pvclock structure. If no bits are set, all flags are ignored.
> > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> > index 8bb9594..dacd7cf 100644
> > --- a/arch/x86/kernel/kvm.c
> > +++ b/arch/x86/kernel/kvm.c
> > @@ -642,6 +642,8 @@ void __init kvm_spinlock_init(void)
> >  {
> >  	if (!kvm_para_available())
> >  		return;
> > +	if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> > +		return;
> >  	/* Does host kernel support KVM_FEATURE_PV_UNHALT? */
> >  	if (!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT))
> >  		return;
> > 
> 
> 

-- 
All the best,
Eduardo Valentin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-07 12:39   ` Eduardo Valentin
@ 2017-11-07 12:43     ` Paolo Bonzini
  2017-11-08  8:45       ` Eduardo Valentin
  0 siblings, 1 reply; 27+ messages in thread
From: Paolo Bonzini @ 2017-11-07 12:43 UTC (permalink / raw)
  To: Eduardo Valentin
  Cc: rkrcmar, Matt Wilson, Jonathan Corbet, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, x86, Peter Zijlstra, Waiman Long,
	kvm, linux-doc, linux-kernel, Jan H . Schoenherr,
	Anthony Liguori

On 07/11/2017 13:39, Eduardo Valentin wrote:
>> is this still needed after Waiman's patch to adaptively switch between
>> tas and pvqspinlock?
> Can you please point me to it ? Is it already in tip/master?
> 

No, he just posted it:

https://marc.info/?l=linux-kernel&m=150972337909996&w=2

Paolo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-07 12:43     ` Paolo Bonzini
@ 2017-11-08  8:45       ` Eduardo Valentin
  0 siblings, 0 replies; 27+ messages in thread
From: Eduardo Valentin @ 2017-11-08  8:45 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Eduardo Valentin, rkrcmar, Matt Wilson, Jonathan Corbet,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Peter Zijlstra, Waiman Long, kvm, linux-doc, linux-kernel,
	Jan H . Schoenherr, Anthony Liguori

Paolo,

On Tue, Nov 07, 2017 at 01:43:15PM +0100, Paolo Bonzini wrote:
> On 07/11/2017 13:39, Eduardo Valentin wrote:
> >> is this still needed after Waiman's patch to adaptively switch between
> >> tas and pvqspinlock?
> > Can you please point me to it ? Is it already in tip/master?
> > 
> 
> No, he just posted it:
> 
> https://marc.info/?l=linux-kernel&m=150972337909996&w=2

OK, Thanks. I've reviewed his V2. I think the patch to have PV_DEDICATED is still interesting to have,
for the case the guest is in autoselect mode and honors what the host gives as hints.

> 
> Paolo

-- 
All the best,
Eduardo Valentin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-06 20:26 [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set Eduardo Valentin
  2017-11-07 12:23 ` Paolo Bonzini
@ 2017-11-08 17:36 ` Radim Krčmář
  2017-11-09  8:55   ` Eduardo Valentin
  2017-11-09 12:43 ` Wanpeng Li
  2 siblings, 1 reply; 27+ messages in thread
From: Radim Krčmář @ 2017-11-08 17:36 UTC (permalink / raw)
  To: Eduardo Valentin
  Cc: Paolo Bonzini, Matt Wilson, Jonathan Corbet, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, x86, Peter Zijlstra, Waiman Long,
	kvm, linux-doc, linux-kernel, Jan H . Schoenherr,
	Anthony Liguori

2017-11-06 12:26-0800, Eduardo Valentin:
> Currently, the existing qspinlock implementation will fallback to
> test-and-set if the hypervisor has not set the PV_UNHALT flag.
> 
> This patch gives the opportunity to guest kernels to select
> between test-and-set and the regular queueu fair lock implementation
> based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> flag is not set, the code will still fall back to test-and-set,
> but when the PV_DEDICATED flag is set, the code will use
> the regular queue spinlock implementation.
> 
> With this patch, when in autoselect mode, the guest will
> use the default spinlock implementation based on host feature
> flags as follows:
> 
> PV_DEDICATED = 1, PV_UNHALT = anything: default is qspinlock
> PV_DEDICATED = 0, PV_UNHALT = 1: default is pvqspinlock
> PV_DEDICATED = 0, PV_UNHALT = 0: default is tas
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: x86@kernel.org
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Waiman Long <longman@redhat.com>
> Cc: kvm@vger.kernel.org
> Cc: linux-doc@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: Jan H. Schoenherr <jschoenh@amazon.de>
> Cc: Anthony Liguori <aliguori@amazon.com>
> Suggested-by: Matt Wilson <msw@amazon.com>
> Signed-off-by: Eduardo Valentin <eduval@amazon.com>
> ---
> V3:
>  - When PV_DEDICATED is set (1), qspinlock is selected,
>    regardless of the value of PV_UNHAULT. Suggested by Paolo Bonzini. 
>  - Refreshed on top of tip/master.
> V2:
>  - rebase on top of tip/master
> 
>  Documentation/virtual/kvm/cpuid.txt  | 6 ++++++
>  arch/x86/include/asm/qspinlock.h     | 4 ++++
>  arch/x86/include/uapi/asm/kvm_para.h | 1 +
>  arch/x86/kernel/kvm.c                | 2 ++
>  4 files changed, 13 insertions(+)
> 
> diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
> index 3c65feb..117066a 100644
> --- a/Documentation/virtual/kvm/cpuid.txt
> +++ b/Documentation/virtual/kvm/cpuid.txt
> @@ -54,6 +54,12 @@ KVM_FEATURE_PV_UNHALT              ||     7 || guest checks this feature bit
>                                     ||       || before enabling paravirtualized
>                                     ||       || spinlock support.
>  ------------------------------------------------------------------------------
> +KVM_FEATURE_PV_DEDICATED           ||     8 || guest checks this feature bit
> +                                   ||       || to determine if they run on
> +                                   ||       || dedicated vCPUs, allowing opti-
> +                                   ||       || mizations such as usage of
> +                                   ||       || qspinlocks.
> +------------------------------------------------------------------------------
>  KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||    24 || host will warn if no guest-side
>                                     ||       || per-cpu warps are expected in
>                                     ||       || kvmclock.
> diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h
> index 5e16b5d..de42694 100644
> --- a/arch/x86/include/asm/qspinlock.h
> +++ b/arch/x86/include/asm/qspinlock.h
> @@ -3,6 +3,8 @@
>  #define _ASM_X86_QSPINLOCK_H
>  
>  #include <linux/jump_label.h>
> +#include <linux/kvm_para.h>
> +
>  #include <asm/cpufeature.h>
>  #include <asm-generic/qspinlock_types.h>
>  #include <asm/paravirt.h>
> @@ -58,6 +60,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
>  	if (!static_branch_likely(&virt_spin_lock_key))
>  		return false;
>  
> +	if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> +		return false;

Hm, every spinlock slowpath calls cpuid, which causes a VM exit, so I
wouldn't expect it to be faster than the existing implementations.
(Using the static key would be better.)

How does this patch perform compared to user-forced qspinlock and hybrid
pvqspinlock?

Thanks.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-08 17:36 ` Radim Krčmář
@ 2017-11-09  8:55   ` Eduardo Valentin
  2017-11-09 14:17     ` Radim Krčmář
  0 siblings, 1 reply; 27+ messages in thread
From: Eduardo Valentin @ 2017-11-09  8:55 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: Eduardo Valentin, Paolo Bonzini, Matt Wilson, Jonathan Corbet,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Peter Zijlstra, Waiman Long, kvm, linux-doc, linux-kernel,
	Jan H . Schoenherr, Anthony Liguori

Hello,

On Wed, Nov 08, 2017 at 06:36:52PM +0100, Radim Krčmář wrote:
> 2017-11-06 12:26-0800, Eduardo Valentin:
> > Currently, the existing qspinlock implementation will fallback to
> > test-and-set if the hypervisor has not set the PV_UNHALT flag.
> > 
> > This patch gives the opportunity to guest kernels to select
> > between test-and-set and the regular queueu fair lock implementation
> > based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> > flag is not set, the code will still fall back to test-and-set,
> > but when the PV_DEDICATED flag is set, the code will use
> > the regular queue spinlock implementation.
> > 
> > With this patch, when in autoselect mode, the guest will
> > use the default spinlock implementation based on host feature
> > flags as follows:
> > 
> > PV_DEDICATED = 1, PV_UNHALT = anything: default is qspinlock
> > PV_DEDICATED = 0, PV_UNHALT = 1: default is pvqspinlock
> > PV_DEDICATED = 0, PV_UNHALT = 0: default is tas
> > 
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> > Cc: Jonathan Corbet <corbet@lwn.net>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > Cc: x86@kernel.org
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Waiman Long <longman@redhat.com>
> > Cc: kvm@vger.kernel.org
> > Cc: linux-doc@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > Cc: Jan H. Schoenherr <jschoenh@amazon.de>
> > Cc: Anthony Liguori <aliguori@amazon.com>
> > Suggested-by: Matt Wilson <msw@amazon.com>
> > Signed-off-by: Eduardo Valentin <eduval@amazon.com>
> > ---
> > V3:
> >  - When PV_DEDICATED is set (1), qspinlock is selected,
> >    regardless of the value of PV_UNHAULT. Suggested by Paolo Bonzini. 
> >  - Refreshed on top of tip/master.
> > V2:
> >  - rebase on top of tip/master
> > 
> >  Documentation/virtual/kvm/cpuid.txt  | 6 ++++++
> >  arch/x86/include/asm/qspinlock.h     | 4 ++++
> >  arch/x86/include/uapi/asm/kvm_para.h | 1 +
> >  arch/x86/kernel/kvm.c                | 2 ++
> >  4 files changed, 13 insertions(+)
> > 
> > diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
> > index 3c65feb..117066a 100644
> > --- a/Documentation/virtual/kvm/cpuid.txt
> > +++ b/Documentation/virtual/kvm/cpuid.txt
> > @@ -54,6 +54,12 @@ KVM_FEATURE_PV_UNHALT              ||     7 || guest checks this feature bit
> >                                     ||       || before enabling paravirtualized
> >                                     ||       || spinlock support.
> >  ------------------------------------------------------------------------------
> > +KVM_FEATURE_PV_DEDICATED           ||     8 || guest checks this feature bit
> > +                                   ||       || to determine if they run on
> > +                                   ||       || dedicated vCPUs, allowing opti-
> > +                                   ||       || mizations such as usage of
> > +                                   ||       || qspinlocks.
> > +------------------------------------------------------------------------------
> >  KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||    24 || host will warn if no guest-side
> >                                     ||       || per-cpu warps are expected in
> >                                     ||       || kvmclock.
> > diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h
> > index 5e16b5d..de42694 100644
> > --- a/arch/x86/include/asm/qspinlock.h
> > +++ b/arch/x86/include/asm/qspinlock.h
> > @@ -3,6 +3,8 @@
> >  #define _ASM_X86_QSPINLOCK_H
> >  
> >  #include <linux/jump_label.h>
> > +#include <linux/kvm_para.h>
> > +
> >  #include <asm/cpufeature.h>
> >  #include <asm-generic/qspinlock_types.h>
> >  #include <asm/paravirt.h>
> > @@ -58,6 +60,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
> >  	if (!static_branch_likely(&virt_spin_lock_key))
> >  		return false;
> >  
> > +	if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> > +		return false;
> 
> Hm, every spinlock slowpath calls cpuid, which causes a VM exit, so I
> wouldn't expect it to be faster than the existing implementations.
> (Using the static key would be better.)
> 
> How does this patch perform compared to user-forced qspinlock and hybrid
> pvqspinlock?

This patch should have same effect as user-forced qspinlock. However, the key aspect
here is this patch gives a way for the host to instruct the guest to use qspinlock.
Even with Longman's patch which allows guest to select the spinlock implementation,
there should still be the auto-select mode. In such mode, PV_DEDICATED should
allow the host to get the guest to use qspinlock, without, the guest will fallback
to tas when PV_UNHALT == 0.

> 
> Thanks.
> 

-- 
All the best,
Eduardo Valentin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-06 20:26 [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set Eduardo Valentin
  2017-11-07 12:23 ` Paolo Bonzini
  2017-11-08 17:36 ` Radim Krčmář
@ 2017-11-09 12:43 ` Wanpeng Li
  2017-11-09 15:53   ` Pankaj Gupta
  2017-11-09 16:00   ` Radim Krcmar
  2 siblings, 2 replies; 27+ messages in thread
From: Wanpeng Li @ 2017-11-09 12:43 UTC (permalink / raw)
  To: Eduardo Valentin
  Cc: Paolo Bonzini, Radim Krcmar, Matt Wilson, Jonathan Corbet,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	the arch/x86 maintainers, Peter Zijlstra, Waiman Long, kvm,
	linux-doc, linux-kernel, Jan H . Schoenherr, Anthony Liguori

2017-11-07 4:26 GMT+08:00 Eduardo Valentin <eduval@amazon.com>:
> Currently, the existing qspinlock implementation will fallback to
> test-and-set if the hypervisor has not set the PV_UNHALT flag.
>
> This patch gives the opportunity to guest kernels to select
> between test-and-set and the regular queueu fair lock implementation
> based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> flag is not set, the code will still fall back to test-and-set,
> but when the PV_DEDICATED flag is set, the code will use
> the regular queue spinlock implementation.
>
> With this patch, when in autoselect mode, the guest will
> use the default spinlock implementation based on host feature
> flags as follows:
>
> PV_DEDICATED = 1, PV_UNHALT = anything: default is qspinlock
> PV_DEDICATED = 0, PV_UNHALT = 1: default is pvqspinlock
> PV_DEDICATED = 0, PV_UNHALT = 0: default is tas
>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: x86@kernel.org
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Waiman Long <longman@redhat.com>
> Cc: kvm@vger.kernel.org
> Cc: linux-doc@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: Jan H. Schoenherr <jschoenh@amazon.de>
> Cc: Anthony Liguori <aliguori@amazon.com>
> Suggested-by: Matt Wilson <msw@amazon.com>
> Signed-off-by: Eduardo Valentin <eduval@amazon.com>
> ---
> V3:
>  - When PV_DEDICATED is set (1), qspinlock is selected,
>    regardless of the value of PV_UNHAULT. Suggested by Paolo Bonzini.
>  - Refreshed on top of tip/master.
> V2:
>  - rebase on top of tip/master
>
>  Documentation/virtual/kvm/cpuid.txt  | 6 ++++++
>  arch/x86/include/asm/qspinlock.h     | 4 ++++
>  arch/x86/include/uapi/asm/kvm_para.h | 1 +
>  arch/x86/kernel/kvm.c                | 2 ++
>  4 files changed, 13 insertions(+)
>
> diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
> index 3c65feb..117066a 100644
> --- a/Documentation/virtual/kvm/cpuid.txt
> +++ b/Documentation/virtual/kvm/cpuid.txt
> @@ -54,6 +54,12 @@ KVM_FEATURE_PV_UNHALT              ||     7 || guest checks this feature bit
>                                     ||       || before enabling paravirtualized
>                                     ||       || spinlock support.
>  ------------------------------------------------------------------------------
> +KVM_FEATURE_PV_DEDICATED           ||     8 || guest checks this feature bit
> +                                   ||       || to determine if they run on
> +                                   ||       || dedicated vCPUs, allowing opti-
> +                                   ||       || mizations such as usage of
> +                                   ||       || qspinlocks.
> +------------------------------------------------------------------------------
>  KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||    24 || host will warn if no guest-side
>                                     ||       || per-cpu warps are expected in
>                                     ||       || kvmclock.
> diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h
> index 5e16b5d..de42694 100644
> --- a/arch/x86/include/asm/qspinlock.h
> +++ b/arch/x86/include/asm/qspinlock.h
> @@ -3,6 +3,8 @@
>  #define _ASM_X86_QSPINLOCK_H
>
>  #include <linux/jump_label.h>
> +#include <linux/kvm_para.h>
> +
>  #include <asm/cpufeature.h>
>  #include <asm-generic/qspinlock_types.h>
>  #include <asm/paravirt.h>
> @@ -58,6 +60,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
>         if (!static_branch_likely(&virt_spin_lock_key))
>                 return false;
>
> +       if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> +               return false;
>         /*
>          * On hypervisors without PARAVIRT_SPINLOCKS support we fall
>          * back to a Test-and-Set spinlock, because fair locks have
> diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
> index 554aa8f..85a9875 100644
> --- a/arch/x86/include/uapi/asm/kvm_para.h
> +++ b/arch/x86/include/uapi/asm/kvm_para.h
> @@ -25,6 +25,7 @@
>  #define KVM_FEATURE_STEAL_TIME         5
>  #define KVM_FEATURE_PV_EOI             6
>  #define KVM_FEATURE_PV_UNHALT          7
> +#define KVM_FEATURE_PV_DEDICATED       8
>
>  /* The last 8 bits are used to indicate how to interpret the flags field
>   * in pvclock structure. If no bits are set, all flags are ignored.
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 8bb9594..dacd7cf 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -642,6 +642,8 @@ void __init kvm_spinlock_init(void)
>  {
>         if (!kvm_para_available())
>                 return;
> +       if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> +               return;
>         /* Does host kernel support KVM_FEATURE_PV_UNHALT? */
>         if (!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT))
>                 return;
> --
> 2.7.4
>

You should also add a cpuid flag in kvm part.

Regards,
Wanpeng Li

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-09  8:55   ` Eduardo Valentin
@ 2017-11-09 14:17     ` Radim Krčmář
  2017-11-16  4:54       ` Eduardo Valentin
  0 siblings, 1 reply; 27+ messages in thread
From: Radim Krčmář @ 2017-11-09 14:17 UTC (permalink / raw)
  To: Eduardo Valentin
  Cc: Paolo Bonzini, Matt Wilson, Jonathan Corbet, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, x86, Peter Zijlstra, Waiman Long,
	kvm, linux-doc, linux-kernel, Jan H . Schoenherr,
	Anthony Liguori

2017-11-09 00:55-0800, Eduardo Valentin:
> Hello,
> 
> On Wed, Nov 08, 2017 at 06:36:52PM +0100, Radim Krčmář wrote:
> > 2017-11-06 12:26-0800, Eduardo Valentin:
> > > Currently, the existing qspinlock implementation will fallback to
> > > test-and-set if the hypervisor has not set the PV_UNHALT flag.
> > > 
> > > This patch gives the opportunity to guest kernels to select
> > > between test-and-set and the regular queueu fair lock implementation
> > > based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> > > flag is not set, the code will still fall back to test-and-set,
> > > but when the PV_DEDICATED flag is set, the code will use
> > > the regular queue spinlock implementation.
> > > 
> > > With this patch, when in autoselect mode, the guest will
> > > use the default spinlock implementation based on host feature
> > > flags as follows:
> > > 
> > > PV_DEDICATED = 1, PV_UNHALT = anything: default is qspinlock
> > > PV_DEDICATED = 0, PV_UNHALT = 1: default is pvqspinlock
> > > PV_DEDICATED = 0, PV_UNHALT = 0: default is tas
> > > 
> > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> > > Cc: Jonathan Corbet <corbet@lwn.net>
> > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > Cc: Ingo Molnar <mingo@redhat.com>
> > > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > > Cc: x86@kernel.org
> > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > Cc: Waiman Long <longman@redhat.com>
> > > Cc: kvm@vger.kernel.org
> > > Cc: linux-doc@vger.kernel.org
> > > Cc: linux-kernel@vger.kernel.org
> > > Cc: Jan H. Schoenherr <jschoenh@amazon.de>
> > > Cc: Anthony Liguori <aliguori@amazon.com>
> > > Suggested-by: Matt Wilson <msw@amazon.com>
> > > Signed-off-by: Eduardo Valentin <eduval@amazon.com>
> > > ---
> > > V3:
> > >  - When PV_DEDICATED is set (1), qspinlock is selected,
> > >    regardless of the value of PV_UNHAULT. Suggested by Paolo Bonzini. 
> > >  - Refreshed on top of tip/master.
> > > V2:
> > >  - rebase on top of tip/master
> > > 
> > >  Documentation/virtual/kvm/cpuid.txt  | 6 ++++++
> > >  arch/x86/include/asm/qspinlock.h     | 4 ++++
> > >  arch/x86/include/uapi/asm/kvm_para.h | 1 +
> > >  arch/x86/kernel/kvm.c                | 2 ++
> > >  4 files changed, 13 insertions(+)
> > > 
> > > diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
> > > index 3c65feb..117066a 100644
> > > --- a/Documentation/virtual/kvm/cpuid.txt
> > > +++ b/Documentation/virtual/kvm/cpuid.txt
> > > @@ -54,6 +54,12 @@ KVM_FEATURE_PV_UNHALT              ||     7 || guest checks this feature bit
> > >                                     ||       || before enabling paravirtualized
> > >                                     ||       || spinlock support.
> > >  ------------------------------------------------------------------------------
> > > +KVM_FEATURE_PV_DEDICATED           ||     8 || guest checks this feature bit
> > > +                                   ||       || to determine if they run on
> > > +                                   ||       || dedicated vCPUs, allowing opti-
> > > +                                   ||       || mizations such as usage of
> > > +                                   ||       || qspinlocks.
> > > +------------------------------------------------------------------------------
> > >  KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||    24 || host will warn if no guest-side
> > >                                     ||       || per-cpu warps are expected in
> > >                                     ||       || kvmclock.
> > > diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h
> > > index 5e16b5d..de42694 100644
> > > --- a/arch/x86/include/asm/qspinlock.h
> > > +++ b/arch/x86/include/asm/qspinlock.h
> > > @@ -3,6 +3,8 @@
> > >  #define _ASM_X86_QSPINLOCK_H
> > >  
> > >  #include <linux/jump_label.h>
> > > +#include <linux/kvm_para.h>
> > > +
> > >  #include <asm/cpufeature.h>
> > >  #include <asm-generic/qspinlock_types.h>
> > >  #include <asm/paravirt.h>
> > > @@ -58,6 +60,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
> > >  	if (!static_branch_likely(&virt_spin_lock_key))
> > >  		return false;
> > >  
> > > +	if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> > > +		return false;
> > 
> > Hm, every spinlock slowpath calls cpuid, which causes a VM exit, so I
> > wouldn't expect it to be faster than the existing implementations.
> > (Using the static key would be better.)
> > 
> > How does this patch perform compared to user-forced qspinlock and hybrid
> > pvqspinlock?
> 
> This patch should have same effect as user-forced qspinlock.

This is what I'm doubting, because the patch is adding about two
thousand cycles to every spinlock-taken path.
Doesn't this patch yield better results?

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 3df743b60c80..d9225e48c11a 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -676,6 +676,12 @@ void __init kvm_spinlock_init(void)
 {
 	if (!kvm_para_available())
 		return;
+
+	if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED)) {
+		static_branch_disable(&virt_spin_lock_key);
+		return;
+	}
+
 	/* Does host kernel support KVM_FEATURE_PV_UNHALT? */
 	if (!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT))
 		return;

>                                                              However, the key aspect
> here is this patch gives a way for the host to instruct the guest to use qspinlock.
> Even with Longman's patch which allows guest to select the spinlock implementation,
> there should still be the auto-select mode. In such mode, PV_DEDICATED should
> allow the host to get the guest to use qspinlock, without, the guest will fallback
> to tas when PV_UNHALT == 0.

I agree that a flag can be useful for certains setups.

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-09 12:43 ` Wanpeng Li
@ 2017-11-09 15:53   ` Pankaj Gupta
  2017-11-09 16:05     ` Radim Krcmar
  2017-11-09 16:00   ` Radim Krcmar
  1 sibling, 1 reply; 27+ messages in thread
From: Pankaj Gupta @ 2017-11-09 15:53 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Eduardo Valentin, Paolo Bonzini, Radim Krcmar, Matt Wilson,
	Jonathan Corbet, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	the arch/x86 maintainers, Peter Zijlstra, Waiman Long, kvm,
	linux-doc, linux-kernel, Jan H . Schoenherr, Anthony Liguori



> 2017-11-07 4:26 GMT+08:00 Eduardo Valentin <eduval@amazon.com>:
> > Currently, the existing qspinlock implementation will fallback to
> > test-and-set if the hypervisor has not set the PV_UNHALT flag.
> >
> > This patch gives the opportunity to guest kernels to select
> > between test-and-set and the regular queueu fair lock implementation
> > based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> > flag is not set, the code will still fall back to test-and-set,
> > but when the PV_DEDICATED flag is set, the code will use
> > the regular queue spinlock implementation.
> >
> > With this patch, when in autoselect mode, the guest will
> > use the default spinlock implementation based on host feature
> > flags as follows:
> >
> > PV_DEDICATED = 1, PV_UNHALT = anything: default is qspinlock
> > PV_DEDICATED = 0, PV_UNHALT = 1: default is pvqspinlock
> > PV_DEDICATED = 0, PV_UNHALT = 0: default is tas
> >
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> > Cc: Jonathan Corbet <corbet@lwn.net>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > Cc: x86@kernel.org
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Waiman Long <longman@redhat.com>
> > Cc: kvm@vger.kernel.org
> > Cc: linux-doc@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > Cc: Jan H. Schoenherr <jschoenh@amazon.de>
> > Cc: Anthony Liguori <aliguori@amazon.com>
> > Suggested-by: Matt Wilson <msw@amazon.com>
> > Signed-off-by: Eduardo Valentin <eduval@amazon.com>
> > ---
> > V3:
> >  - When PV_DEDICATED is set (1), qspinlock is selected,
> >    regardless of the value of PV_UNHAULT. Suggested by Paolo Bonzini.
> >  - Refreshed on top of tip/master.
> > V2:
> >  - rebase on top of tip/master
> >
> >  Documentation/virtual/kvm/cpuid.txt  | 6 ++++++
> >  arch/x86/include/asm/qspinlock.h     | 4 ++++
> >  arch/x86/include/uapi/asm/kvm_para.h | 1 +
> >  arch/x86/kernel/kvm.c                | 2 ++
> >  4 files changed, 13 insertions(+)
> >
> > diff --git a/Documentation/virtual/kvm/cpuid.txt
> > b/Documentation/virtual/kvm/cpuid.txt
> > index 3c65feb..117066a 100644
> > --- a/Documentation/virtual/kvm/cpuid.txt
> > +++ b/Documentation/virtual/kvm/cpuid.txt
> > @@ -54,6 +54,12 @@ KVM_FEATURE_PV_UNHALT              ||     7 || guest
> > checks this feature bit
> >                                     ||       || before enabling
> >                                     ||       || paravirtualized
> >                                     ||       || spinlock support.
> >  ------------------------------------------------------------------------------
> > +KVM_FEATURE_PV_DEDICATED           ||     8 || guest checks this feature
> > bit
> > +                                   ||       || to determine if they run on
> > +                                   ||       || dedicated vCPUs, allowing
> > opti-
> > +                                   ||       || mizations such as usage of
> > +                                   ||       || qspinlocks.
> > +------------------------------------------------------------------------------
> >  KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||    24 || host will warn if no
> >  guest-side
> >                                     ||       || per-cpu warps are expected
> >                                     ||       || in
> >                                     ||       || kvmclock.
> > diff --git a/arch/x86/include/asm/qspinlock.h
> > b/arch/x86/include/asm/qspinlock.h
> > index 5e16b5d..de42694 100644
> > --- a/arch/x86/include/asm/qspinlock.h
> > +++ b/arch/x86/include/asm/qspinlock.h
> > @@ -3,6 +3,8 @@
> >  #define _ASM_X86_QSPINLOCK_H
> >
> >  #include <linux/jump_label.h>
> > +#include <linux/kvm_para.h>
> > +
> >  #include <asm/cpufeature.h>
> >  #include <asm-generic/qspinlock_types.h>
> >  #include <asm/paravirt.h>
> > @@ -58,6 +60,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
> >         if (!static_branch_likely(&virt_spin_lock_key))
> >                 return false;
> >
> > +       if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> > +               return false;
> >         /*
> >          * On hypervisors without PARAVIRT_SPINLOCKS support we fall
> >          * back to a Test-and-Set spinlock, because fair locks have
> > diff --git a/arch/x86/include/uapi/asm/kvm_para.h
> > b/arch/x86/include/uapi/asm/kvm_para.h
> > index 554aa8f..85a9875 100644
> > --- a/arch/x86/include/uapi/asm/kvm_para.h
> > +++ b/arch/x86/include/uapi/asm/kvm_para.h
> > @@ -25,6 +25,7 @@
> >  #define KVM_FEATURE_STEAL_TIME         5
> >  #define KVM_FEATURE_PV_EOI             6
> >  #define KVM_FEATURE_PV_UNHALT          7
> > +#define KVM_FEATURE_PV_DEDICATED       8
> >
> >  /* The last 8 bits are used to indicate how to interpret the flags field
> >   * in pvclock structure. If no bits are set, all flags are ignored.
> > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> > index 8bb9594..dacd7cf 100644
> > --- a/arch/x86/kernel/kvm.c
> > +++ b/arch/x86/kernel/kvm.c
> > @@ -642,6 +642,8 @@ void __init kvm_spinlock_init(void)
> >  {
> >         if (!kvm_para_available())
> >                 return;
> > +       if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> > +               return;
> >         /* Does host kernel support KVM_FEATURE_PV_UNHALT? */
> >         if (!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT))
> >                 return;
> > --
> > 2.7.4
> >
> 
> You should also add a cpuid flag in kvm part.

Also, I am thinking if PV_DEDICATED helps in performance and with conjunction
with PV TLB patch in other thread. For use-case e.g KVM-RT where we don't overcommit 
vCPU's and pin vCPU:pCPU 1:1 we need a way from host side with which user can decide 
to enable PV_DEDICATED option. Such that if vCPU's are unlikely going to preempt or 
sleep we should avoid traversing the cpulist in PV TLB code.

So, two things:

1] A way to configure PV_DEDICATED from host.
2] PV TLB should also behave as per option PV_DEDICATED for better performance.

Or I am missing any context here?
> 
> Regards,
> Wanpeng Li
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-09 12:43 ` Wanpeng Li
  2017-11-09 15:53   ` Pankaj Gupta
@ 2017-11-09 16:00   ` Radim Krcmar
  2017-11-10  6:07     ` Wanpeng Li
  1 sibling, 1 reply; 27+ messages in thread
From: Radim Krcmar @ 2017-11-09 16:00 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Eduardo Valentin, Paolo Bonzini, Matt Wilson, Jonathan Corbet,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	the arch/x86 maintainers, Peter Zijlstra, Waiman Long, kvm,
	linux-doc, linux-kernel, Jan H . Schoenherr, Anthony Liguori

2017-11-09 20:43+0800, Wanpeng Li:
> 2017-11-07 4:26 GMT+08:00 Eduardo Valentin <eduval@amazon.com>:
> > Currently, the existing qspinlock implementation will fallback to
> > test-and-set if the hypervisor has not set the PV_UNHALT flag.
> >
> > This patch gives the opportunity to guest kernels to select
> > between test-and-set and the regular queueu fair lock implementation
> > based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> > flag is not set, the code will still fall back to test-and-set,
> > but when the PV_DEDICATED flag is set, the code will use
> > the regular queue spinlock implementation.
> >
> > With this patch, when in autoselect mode, the guest will
> > use the default spinlock implementation based on host feature
> > flags as follows:
> >
> > PV_DEDICATED = 1, PV_UNHALT = anything: default is qspinlock
> > PV_DEDICATED = 0, PV_UNHALT = 1: default is pvqspinlock
> > PV_DEDICATED = 0, PV_UNHALT = 0: default is tas
> >
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> > Cc: Jonathan Corbet <corbet@lwn.net>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > Cc: x86@kernel.org
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Waiman Long <longman@redhat.com>
> > Cc: kvm@vger.kernel.org
> > Cc: linux-doc@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > Cc: Jan H. Schoenherr <jschoenh@amazon.de>
> > Cc: Anthony Liguori <aliguori@amazon.com>
> > Suggested-by: Matt Wilson <msw@amazon.com>
> > Signed-off-by: Eduardo Valentin <eduval@amazon.com>
> > ---
> 
> You should also add a cpuid flag in kvm part.

It is better without that.  The flag has no dependency on KVM (kernel
hypervisor) code.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-09 15:53   ` Pankaj Gupta
@ 2017-11-09 16:05     ` Radim Krcmar
  2017-11-09 16:17       ` Peter Zijlstra
  0 siblings, 1 reply; 27+ messages in thread
From: Radim Krcmar @ 2017-11-09 16:05 UTC (permalink / raw)
  To: Pankaj Gupta
  Cc: Wanpeng Li, Eduardo Valentin, Paolo Bonzini, Matt Wilson,
	Jonathan Corbet, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	the arch/x86 maintainers, Peter Zijlstra, Waiman Long, kvm,
	linux-doc, linux-kernel, Jan H . Schoenherr, Anthony Liguori

2017-11-09 10:53-0500, Pankaj Gupta:
> 
> 
> > 2017-11-07 4:26 GMT+08:00 Eduardo Valentin <eduval@amazon.com>:
> > > Currently, the existing qspinlock implementation will fallback to
> > > test-and-set if the hypervisor has not set the PV_UNHALT flag.
> > >
> > > This patch gives the opportunity to guest kernels to select
> > > between test-and-set and the regular queueu fair lock implementation
> > > based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> > > flag is not set, the code will still fall back to test-and-set,
> > > but when the PV_DEDICATED flag is set, the code will use
> > > the regular queue spinlock implementation.
> > >
> > > With this patch, when in autoselect mode, the guest will
> > > use the default spinlock implementation based on host feature
> > > flags as follows:
> > >
> > > PV_DEDICATED = 1, PV_UNHALT = anything: default is qspinlock
> > > PV_DEDICATED = 0, PV_UNHALT = 1: default is pvqspinlock
> > > PV_DEDICATED = 0, PV_UNHALT = 0: default is tas
> > >
> > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> > > Cc: Jonathan Corbet <corbet@lwn.net>
> > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > Cc: Ingo Molnar <mingo@redhat.com>
> > > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > > Cc: x86@kernel.org
> > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > Cc: Waiman Long <longman@redhat.com>
> > > Cc: kvm@vger.kernel.org
> > > Cc: linux-doc@vger.kernel.org
> > > Cc: linux-kernel@vger.kernel.org
> > > Cc: Jan H. Schoenherr <jschoenh@amazon.de>
> > > Cc: Anthony Liguori <aliguori@amazon.com>
> > > Suggested-by: Matt Wilson <msw@amazon.com>
> > > Signed-off-by: Eduardo Valentin <eduval@amazon.com>
> > > ---
> > > V3:
> > >  - When PV_DEDICATED is set (1), qspinlock is selected,
> > >    regardless of the value of PV_UNHAULT. Suggested by Paolo Bonzini.
> > >  - Refreshed on top of tip/master.
> > > V2:
> > >  - rebase on top of tip/master
> > >
> > >  Documentation/virtual/kvm/cpuid.txt  | 6 ++++++
> > >  arch/x86/include/asm/qspinlock.h     | 4 ++++
> > >  arch/x86/include/uapi/asm/kvm_para.h | 1 +
> > >  arch/x86/kernel/kvm.c                | 2 ++
> > >  4 files changed, 13 insertions(+)
> > >
> > > diff --git a/Documentation/virtual/kvm/cpuid.txt
> > > b/Documentation/virtual/kvm/cpuid.txt
> > > index 3c65feb..117066a 100644
> > > --- a/Documentation/virtual/kvm/cpuid.txt
> > > +++ b/Documentation/virtual/kvm/cpuid.txt
> > > @@ -54,6 +54,12 @@ KVM_FEATURE_PV_UNHALT              ||     7 || guest
> > > checks this feature bit
> > >                                     ||       || before enabling
> > >                                     ||       || paravirtualized
> > >                                     ||       || spinlock support.
> > >  ------------------------------------------------------------------------------
> > > +KVM_FEATURE_PV_DEDICATED           ||     8 || guest checks this feature
> > > bit
> > > +                                   ||       || to determine if they run on
> > > +                                   ||       || dedicated vCPUs, allowing
> > > opti-
> > > +                                   ||       || mizations such as usage of
> > > +                                   ||       || qspinlocks.
> > > +------------------------------------------------------------------------------
> > >  KVM_FEATURE_CLOCKSOURCE_STABLE_BIT ||    24 || host will warn if no
> > >  guest-side
> > >                                     ||       || per-cpu warps are expected
> > >                                     ||       || in
> > >                                     ||       || kvmclock.
> > > diff --git a/arch/x86/include/asm/qspinlock.h
> > > b/arch/x86/include/asm/qspinlock.h
> > > index 5e16b5d..de42694 100644
> > > --- a/arch/x86/include/asm/qspinlock.h
> > > +++ b/arch/x86/include/asm/qspinlock.h
> > > @@ -3,6 +3,8 @@
> > >  #define _ASM_X86_QSPINLOCK_H
> > >
> > >  #include <linux/jump_label.h>
> > > +#include <linux/kvm_para.h>
> > > +
> > >  #include <asm/cpufeature.h>
> > >  #include <asm-generic/qspinlock_types.h>
> > >  #include <asm/paravirt.h>
> > > @@ -58,6 +60,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
> > >         if (!static_branch_likely(&virt_spin_lock_key))
> > >                 return false;
> > >
> > > +       if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> > > +               return false;
> > >         /*
> > >          * On hypervisors without PARAVIRT_SPINLOCKS support we fall
> > >          * back to a Test-and-Set spinlock, because fair locks have
> > > diff --git a/arch/x86/include/uapi/asm/kvm_para.h
> > > b/arch/x86/include/uapi/asm/kvm_para.h
> > > index 554aa8f..85a9875 100644
> > > --- a/arch/x86/include/uapi/asm/kvm_para.h
> > > +++ b/arch/x86/include/uapi/asm/kvm_para.h
> > > @@ -25,6 +25,7 @@
> > >  #define KVM_FEATURE_STEAL_TIME         5
> > >  #define KVM_FEATURE_PV_EOI             6
> > >  #define KVM_FEATURE_PV_UNHALT          7
> > > +#define KVM_FEATURE_PV_DEDICATED       8
> > >
> > >  /* The last 8 bits are used to indicate how to interpret the flags field
> > >   * in pvclock structure. If no bits are set, all flags are ignored.
> > > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> > > index 8bb9594..dacd7cf 100644
> > > --- a/arch/x86/kernel/kvm.c
> > > +++ b/arch/x86/kernel/kvm.c
> > > @@ -642,6 +642,8 @@ void __init kvm_spinlock_init(void)
> > >  {
> > >         if (!kvm_para_available())
> > >                 return;
> > > +       if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> > > +               return;
> > >         /* Does host kernel support KVM_FEATURE_PV_UNHALT? */
> > >         if (!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT))
> > >                 return;
> > > --
> > > 2.7.4
> > >
> > 
> > You should also add a cpuid flag in kvm part.
> 
> Also, I am thinking if PV_DEDICATED helps in performance and with conjunction
> with PV TLB patch in other thread. For use-case e.g KVM-RT where we don't overcommit 
> vCPU's and pin vCPU:pCPU 1:1 we need a way from host side with which user can decide 
> to enable PV_DEDICATED option. Such that if vCPU's are unlikely going to preempt or 
> sleep we should avoid traversing the cpulist in PV TLB code.
> 
> So, two things:
> 
> 1] A way to configure PV_DEDICATED from host.

Userspace can already configure it through KVM_SET_CPUID2 ioctl
(regardless of KVM version).

> 2] PV TLB should also behave as per option PV_DEDICATED for better performance.

Right,

thanks.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-09 16:05     ` Radim Krcmar
@ 2017-11-09 16:17       ` Peter Zijlstra
  2017-11-09 16:37         ` Pankaj Gupta
  2017-11-09 16:45         ` Radim Krcmar
  0 siblings, 2 replies; 27+ messages in thread
From: Peter Zijlstra @ 2017-11-09 16:17 UTC (permalink / raw)
  To: Radim Krcmar
  Cc: Pankaj Gupta, Wanpeng Li, Eduardo Valentin, Paolo Bonzini,
	Matt Wilson, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, the arch/x86 maintainers, Waiman Long, kvm,
	linux-doc, linux-kernel, Jan H . Schoenherr, Anthony Liguori

On Thu, Nov 09, 2017 at 05:05:36PM +0100, Radim Krcmar wrote:
> 2017-11-09 10:53-0500, Pankaj Gupta:
> > 2] PV TLB should also behave as per option PV_DEDICATED for better performance.
> 
> Right,

Shouldn't KVM do flush_tlb_other() in any case? Not sure how
PV_DEDICATED can help with that.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-09 16:17       ` Peter Zijlstra
@ 2017-11-09 16:37         ` Pankaj Gupta
  2017-11-09 16:45         ` Radim Krcmar
  1 sibling, 0 replies; 27+ messages in thread
From: Pankaj Gupta @ 2017-11-09 16:37 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Radim Krcmar, Wanpeng Li, Eduardo Valentin, Paolo Bonzini,
	Matt Wilson, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, the arch/x86 maintainers, Waiman Long, kvm,
	linux-doc, linux-kernel, Jan H . Schoenherr, Anthony Liguori


> Subject: Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
> 
> On Thu, Nov 09, 2017 at 05:05:36PM +0100, Radim Krcmar wrote:
> > 2017-11-09 10:53-0500, Pankaj Gupta:
> > > 2] PV TLB should also behave as per option PV_DEDICATED for better
> > > performance.
> > 
> > Right,
> 
> Shouldn't KVM do flush_tlb_other() in any case? Not sure how
> PV_DEDICATED can help with that.

Yes.

If I understand it correctly, It will just avoid traversing all the 
cpus another time just to check/set 'KVM_VCPU_PREEMPTED'/FLUSH value and 
clear the cpu mask.

Thanks,
Pankaj

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-09 16:17       ` Peter Zijlstra
  2017-11-09 16:37         ` Pankaj Gupta
@ 2017-11-09 16:45         ` Radim Krcmar
  2017-11-09 17:12           ` Peter Zijlstra
  1 sibling, 1 reply; 27+ messages in thread
From: Radim Krcmar @ 2017-11-09 16:45 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Pankaj Gupta, Wanpeng Li, Eduardo Valentin, Paolo Bonzini,
	Matt Wilson, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, the arch/x86 maintainers, Waiman Long, kvm,
	linux-doc, linux-kernel, Jan H . Schoenherr, Anthony Liguori

2017-11-09 17:17+0100, Peter Zijlstra:
> On Thu, Nov 09, 2017 at 05:05:36PM +0100, Radim Krcmar wrote:
> > 2017-11-09 10:53-0500, Pankaj Gupta:
> > > 2] PV TLB should also behave as per option PV_DEDICATED for better performance.
> > 
> > Right,
> 
> Shouldn't KVM do flush_tlb_other() in any case? Not sure how
> PV_DEDICATED can help with that.

It will, the suggestion was based on recent extension of the
flush_tlb_others implementaion, https://lkml.org/lkml/2017/11/8/1146.

PV_TLB_FLUSH allows a guest to set a flush bit instead of sending flush
IPI if the target VCPU is not running.  This would be a waste of time
with PV_DEDICATED as all VCPUs are expected to always running.

With PV_DEDICATED, the guest should keep using native_flush_tlb_others.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-09 16:45         ` Radim Krcmar
@ 2017-11-09 17:12           ` Peter Zijlstra
  2017-11-09 17:15             ` Peter Zijlstra
  2017-11-09 17:31             ` Radim Krcmar
  0 siblings, 2 replies; 27+ messages in thread
From: Peter Zijlstra @ 2017-11-09 17:12 UTC (permalink / raw)
  To: Radim Krcmar
  Cc: Pankaj Gupta, Wanpeng Li, Eduardo Valentin, Paolo Bonzini,
	Matt Wilson, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, the arch/x86 maintainers, Waiman Long, kvm,
	linux-doc, linux-kernel, Jan H . Schoenherr, Anthony Liguori

On Thu, Nov 09, 2017 at 05:45:23PM +0100, Radim Krcmar wrote:
> 2017-11-09 17:17+0100, Peter Zijlstra:
> > On Thu, Nov 09, 2017 at 05:05:36PM +0100, Radim Krcmar wrote:
> > > 2017-11-09 10:53-0500, Pankaj Gupta:
> > > > 2] PV TLB should also behave as per option PV_DEDICATED for better performance.
> > > 
> > > Right,
> > 
> > Shouldn't KVM do flush_tlb_other() in any case? Not sure how
> > PV_DEDICATED can help with that.
> 
> It will, the suggestion was based on recent extension of the
> flush_tlb_others implementaion, https://lkml.org/lkml/2017/11/8/1146.
> 
> PV_TLB_FLUSH allows a guest to set a flush bit instead of sending flush
> IPI if the target VCPU is not running.  This would be a waste of time
> with PV_DEDICATED as all VCPUs are expected to always running.
> 
> With PV_DEDICATED, the guest should keep using native_flush_tlb_others.

Is saving that for_each_cpu() really worth the effort compared to the
cost of actually doing the IPIs and CR3 write?

Also, you should not put cpumask_t on stack, that's 'broken'.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-09 17:12           ` Peter Zijlstra
@ 2017-11-09 17:15             ` Peter Zijlstra
  2017-11-09 17:28               ` Peter Zijlstra
  2017-11-10  2:07               ` Wanpeng Li
  2017-11-09 17:31             ` Radim Krcmar
  1 sibling, 2 replies; 27+ messages in thread
From: Peter Zijlstra @ 2017-11-09 17:15 UTC (permalink / raw)
  To: Radim Krcmar
  Cc: Pankaj Gupta, Wanpeng Li, Eduardo Valentin, Paolo Bonzini,
	Matt Wilson, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, the arch/x86 maintainers, Waiman Long, kvm,
	linux-doc, linux-kernel, Jan H . Schoenherr, Anthony Liguori

On Thu, Nov 09, 2017 at 06:12:41PM +0100, Peter Zijlstra wrote:
> On Thu, Nov 09, 2017 at 05:45:23PM +0100, Radim Krcmar wrote:
> > 2017-11-09 17:17+0100, Peter Zijlstra:
> > > On Thu, Nov 09, 2017 at 05:05:36PM +0100, Radim Krcmar wrote:
> > > > 2017-11-09 10:53-0500, Pankaj Gupta:
> > > > > 2] PV TLB should also behave as per option PV_DEDICATED for better performance.
> > > > 
> > > > Right,
> > > 
> > > Shouldn't KVM do flush_tlb_other() in any case? Not sure how
> > > PV_DEDICATED can help with that.
> > 
> > It will, the suggestion was based on recent extension of the
> > flush_tlb_others implementaion, https://lkml.org/lkml/2017/11/8/1146.
> > 
> > PV_TLB_FLUSH allows a guest to set a flush bit instead of sending flush
> > IPI if the target VCPU is not running.  This would be a waste of time
> > with PV_DEDICATED as all VCPUs are expected to always running.
> > 
> > With PV_DEDICATED, the guest should keep using native_flush_tlb_others.
> 
> Is saving that for_each_cpu() really worth the effort compared to the
> cost of actually doing the IPIs and CR3 write?
> 
> Also, you should not put cpumask_t on stack, that's 'broken'.

Also, you'll want to use __cpumask_clear_cpu() there.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-09 17:15             ` Peter Zijlstra
@ 2017-11-09 17:28               ` Peter Zijlstra
  2017-11-09 17:35                 ` Radim Krcmar
  2017-11-10  2:07               ` Wanpeng Li
  1 sibling, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2017-11-09 17:28 UTC (permalink / raw)
  To: Radim Krcmar
  Cc: Pankaj Gupta, Wanpeng Li, Eduardo Valentin, Paolo Bonzini,
	Matt Wilson, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, the arch/x86 maintainers, Waiman Long, kvm,
	linux-doc, linux-kernel, Jan H . Schoenherr, Anthony Liguori

On Thu, Nov 09, 2017 at 06:15:11PM +0100, Peter Zijlstra wrote:
> On Thu, Nov 09, 2017 at 06:12:41PM +0100, Peter Zijlstra wrote:
> > On Thu, Nov 09, 2017 at 05:45:23PM +0100, Radim Krcmar wrote:
> > > 2017-11-09 17:17+0100, Peter Zijlstra:
> > > > On Thu, Nov 09, 2017 at 05:05:36PM +0100, Radim Krcmar wrote:
> > > > > 2017-11-09 10:53-0500, Pankaj Gupta:
> > > > > > 2] PV TLB should also behave as per option PV_DEDICATED for better performance.
> > > > > 
> > > > > Right,
> > > > 
> > > > Shouldn't KVM do flush_tlb_other() in any case? Not sure how
> > > > PV_DEDICATED can help with that.
> > > 
> > > It will, the suggestion was based on recent extension of the
> > > flush_tlb_others implementaion, https://lkml.org/lkml/2017/11/8/1146.
> > > 
> > > PV_TLB_FLUSH allows a guest to set a flush bit instead of sending flush
> > > IPI if the target VCPU is not running.  This would be a waste of time
> > > with PV_DEDICATED as all VCPUs are expected to always running.
> > > 
> > > With PV_DEDICATED, the guest should keep using native_flush_tlb_others.
> > 
> > Is saving that for_each_cpu() really worth the effort compared to the
> > cost of actually doing the IPIs and CR3 write?
> > 
> > Also, you should not put cpumask_t on stack, that's 'broken'.
> 
> Also, you'll want to use __cpumask_clear_cpu() there.

Also^2, that patch split is crazy, after patch 2/3 your machine is
broken due to lost TLB flushes. You have to first add the SHOULD_FLUSH
handling and then clear CPUs from the native_flush_tlb_other() mask.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-09 17:12           ` Peter Zijlstra
  2017-11-09 17:15             ` Peter Zijlstra
@ 2017-11-09 17:31             ` Radim Krcmar
  1 sibling, 0 replies; 27+ messages in thread
From: Radim Krcmar @ 2017-11-09 17:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Pankaj Gupta, Wanpeng Li, Eduardo Valentin, Paolo Bonzini,
	Matt Wilson, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, the arch/x86 maintainers, Waiman Long, kvm,
	linux-doc, linux-kernel, Jan H . Schoenherr, Anthony Liguori

2017-11-09 18:12+0100, Peter Zijlstra:
> On Thu, Nov 09, 2017 at 05:45:23PM +0100, Radim Krcmar wrote:
> > 2017-11-09 17:17+0100, Peter Zijlstra:
> > > On Thu, Nov 09, 2017 at 05:05:36PM +0100, Radim Krcmar wrote:
> > > > 2017-11-09 10:53-0500, Pankaj Gupta:
> > > > > 2] PV TLB should also behave as per option PV_DEDICATED for better performance.
> > > > 
> > > > Right,
> > > 
> > > Shouldn't KVM do flush_tlb_other() in any case? Not sure how
> > > PV_DEDICATED can help with that.
> > 
> > It will, the suggestion was based on recent extension of the
> > flush_tlb_others implementaion, https://lkml.org/lkml/2017/11/8/1146.
> > 
> > PV_TLB_FLUSH allows a guest to set a flush bit instead of sending flush
> > IPI if the target VCPU is not running.  This would be a waste of time
> > with PV_DEDICATED as all VCPUs are expected to always running.
> > 
> > With PV_DEDICATED, the guest should keep using native_flush_tlb_others.
> 
> Is saving that for_each_cpu() really worth the effort compared to the
> cost of actually doing the IPIs and CR3 write?

It is one line for a few percent (hopefully better for AMD with AVIC).

Still, keeping the decision completely on userspace would be cleaner.

> Also, you should not put cpumask_t on stack, that's 'broken'.

Good catch, thanks.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-09 17:28               ` Peter Zijlstra
@ 2017-11-09 17:35                 ` Radim Krcmar
  0 siblings, 0 replies; 27+ messages in thread
From: Radim Krcmar @ 2017-11-09 17:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Pankaj Gupta, Wanpeng Li, Eduardo Valentin, Paolo Bonzini,
	Matt Wilson, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, the arch/x86 maintainers, Waiman Long, kvm,
	linux-doc, linux-kernel, Jan H . Schoenherr, Anthony Liguori

2017-11-09 18:28+0100, Peter Zijlstra:
> On Thu, Nov 09, 2017 at 06:15:11PM +0100, Peter Zijlstra wrote:
> > On Thu, Nov 09, 2017 at 06:12:41PM +0100, Peter Zijlstra wrote:
> > > On Thu, Nov 09, 2017 at 05:45:23PM +0100, Radim Krcmar wrote:
> > > > 2017-11-09 17:17+0100, Peter Zijlstra:
> > > > > On Thu, Nov 09, 2017 at 05:05:36PM +0100, Radim Krcmar wrote:
> > > > > > 2017-11-09 10:53-0500, Pankaj Gupta:
> > > > > > > 2] PV TLB should also behave as per option PV_DEDICATED for better performance.
> > > > > > 
> > > > > > Right,
> > > > > 
> > > > > Shouldn't KVM do flush_tlb_other() in any case? Not sure how
> > > > > PV_DEDICATED can help with that.
> > > > 
> > > > It will, the suggestion was based on recent extension of the
> > > > flush_tlb_others implementaion, https://lkml.org/lkml/2017/11/8/1146.
> > > > 
> > > > PV_TLB_FLUSH allows a guest to set a flush bit instead of sending flush
> > > > IPI if the target VCPU is not running.  This would be a waste of time
> > > > with PV_DEDICATED as all VCPUs are expected to always running.
> > > > 
> > > > With PV_DEDICATED, the guest should keep using native_flush_tlb_others.
> > > 
> > > Is saving that for_each_cpu() really worth the effort compared to the
> > > cost of actually doing the IPIs and CR3 write?
> > > 
> > > Also, you should not put cpumask_t on stack, that's 'broken'.
> > 
> > Also, you'll want to use __cpumask_clear_cpu() there.
> 
> Also^2, that patch split is crazy, after patch 2/3 your machine is
> broken due to lost TLB flushes. You have to first add the SHOULD_FLUSH
> handling and then clear CPUs from the native_flush_tlb_other() mask.

That should be fixed in v2 -- [2/3] must not enable this feature if the
host has not exposed it and [3/3] has to expose it.  (The ordering of
those two doesn't matter as they are separate kernel.)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-09 17:15             ` Peter Zijlstra
  2017-11-09 17:28               ` Peter Zijlstra
@ 2017-11-10  2:07               ` Wanpeng Li
  2017-11-10  7:59                 ` Peter Zijlstra
  1 sibling, 1 reply; 27+ messages in thread
From: Wanpeng Li @ 2017-11-10  2:07 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Radim Krcmar, Pankaj Gupta, Eduardo Valentin, Paolo Bonzini,
	Matt Wilson, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, the arch/x86 maintainers, Waiman Long, kvm,
	linux-doc, linux-kernel, Jan H . Schoenherr, Anthony Liguori

2017-11-10 1:15 GMT+08:00 Peter Zijlstra <peterz@infradead.org>:
> On Thu, Nov 09, 2017 at 06:12:41PM +0100, Peter Zijlstra wrote:
>> On Thu, Nov 09, 2017 at 05:45:23PM +0100, Radim Krcmar wrote:
>> > 2017-11-09 17:17+0100, Peter Zijlstra:
>> > > On Thu, Nov 09, 2017 at 05:05:36PM +0100, Radim Krcmar wrote:
>> > > > 2017-11-09 10:53-0500, Pankaj Gupta:
>> > > > > 2] PV TLB should also behave as per option PV_DEDICATED for better performance.
>> > > >
>> > > > Right,
>> > >
>> > > Shouldn't KVM do flush_tlb_other() in any case? Not sure how
>> > > PV_DEDICATED can help with that.
>> >
>> > It will, the suggestion was based on recent extension of the
>> > flush_tlb_others implementaion, https://lkml.org/lkml/2017/11/8/1146.
>> >
>> > PV_TLB_FLUSH allows a guest to set a flush bit instead of sending flush
>> > IPI if the target VCPU is not running.  This would be a waste of time
>> > with PV_DEDICATED as all VCPUs are expected to always running.
>> >
>> > With PV_DEDICATED, the guest should keep using native_flush_tlb_others.
>>
>> Is saving that for_each_cpu() really worth the effort compared to the
>> cost of actually doing the IPIs and CR3 write?
>>
>> Also, you should not put cpumask_t on stack, that's 'broken'.

Thanks pointing out this. I found a useful comments in arch/x86/kernel/irq.c:

/* These two declarations are only used in check_irq_vectors_for_cpu_disable()
 * below, which is protected by stop_machine().  Putting them on the stack
 * results in a stack frame overflow.  Dynamically allocating could result in a
 * failure so declare these two cpumasks as global.
 */
static struct cpumask affinity_new, online_new;

>
> Also, you'll want to use __cpumask_clear_cpu() there.

Will do.

Regards,
Wanpeng Li

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-09 16:00   ` Radim Krcmar
@ 2017-11-10  6:07     ` Wanpeng Li
  2017-11-10  8:19       ` Paolo Bonzini
  0 siblings, 1 reply; 27+ messages in thread
From: Wanpeng Li @ 2017-11-10  6:07 UTC (permalink / raw)
  To: Radim Krcmar
  Cc: Eduardo Valentin, Paolo Bonzini, Matt Wilson, Jonathan Corbet,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	the arch/x86 maintainers, Peter Zijlstra, Waiman Long, kvm,
	linux-doc, linux-kernel, Jan H . Schoenherr, Anthony Liguori

2017-11-10 0:00 GMT+08:00 Radim Krcmar <rkrcmar@redhat.com>:
> 2017-11-09 20:43+0800, Wanpeng Li:
>> 2017-11-07 4:26 GMT+08:00 Eduardo Valentin <eduval@amazon.com>:
>> > Currently, the existing qspinlock implementation will fallback to
>> > test-and-set if the hypervisor has not set the PV_UNHALT flag.
>> >
>> > This patch gives the opportunity to guest kernels to select
>> > between test-and-set and the regular queueu fair lock implementation
>> > based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
>> > flag is not set, the code will still fall back to test-and-set,
>> > but when the PV_DEDICATED flag is set, the code will use
>> > the regular queue spinlock implementation.
>> >
>> > With this patch, when in autoselect mode, the guest will
>> > use the default spinlock implementation based on host feature
>> > flags as follows:
>> >
>> > PV_DEDICATED = 1, PV_UNHALT = anything: default is qspinlock
>> > PV_DEDICATED = 0, PV_UNHALT = 1: default is pvqspinlock
>> > PV_DEDICATED = 0, PV_UNHALT = 0: default is tas
>> >
>> > Cc: Paolo Bonzini <pbonzini@redhat.com>
>> > Cc: "Radim Krčmář" <rkrcmar@redhat.com>
>> > Cc: Jonathan Corbet <corbet@lwn.net>
>> > Cc: Thomas Gleixner <tglx@linutronix.de>
>> > Cc: Ingo Molnar <mingo@redhat.com>
>> > Cc: "H. Peter Anvin" <hpa@zytor.com>
>> > Cc: x86@kernel.org
>> > Cc: Peter Zijlstra <peterz@infradead.org>
>> > Cc: Waiman Long <longman@redhat.com>
>> > Cc: kvm@vger.kernel.org
>> > Cc: linux-doc@vger.kernel.org
>> > Cc: linux-kernel@vger.kernel.org
>> > Cc: Jan H. Schoenherr <jschoenh@amazon.de>
>> > Cc: Anthony Liguori <aliguori@amazon.com>
>> > Suggested-by: Matt Wilson <msw@amazon.com>
>> > Signed-off-by: Eduardo Valentin <eduval@amazon.com>
>> > ---
>>
>> You should also add a cpuid flag in kvm part.
>
> It is better without that.  The flag has no dependency on KVM (kernel
> hypervisor) code.

Do you mean -cpu host, +xxxx,I think it will result in "warning: host
doesn't support requested feature: CPUID.40000001H:eax.XXXXXX"

Regards,
Wanpeng Li

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-10  2:07               ` Wanpeng Li
@ 2017-11-10  7:59                 ` Peter Zijlstra
  2017-11-10  8:04                   ` Wanpeng Li
  0 siblings, 1 reply; 27+ messages in thread
From: Peter Zijlstra @ 2017-11-10  7:59 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Radim Krcmar, Pankaj Gupta, Eduardo Valentin, Paolo Bonzini,
	Matt Wilson, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, the arch/x86 maintainers, Waiman Long, kvm,
	linux-doc, linux-kernel, Jan H . Schoenherr, Anthony Liguori

On Fri, Nov 10, 2017 at 10:07:56AM +0800, Wanpeng Li wrote:

> >> Also, you should not put cpumask_t on stack, that's 'broken'.
> 
> Thanks pointing out this. I found a useful comments in arch/x86/kernel/irq.c:
> 
> /* These two declarations are only used in check_irq_vectors_for_cpu_disable()
>  * below, which is protected by stop_machine().  Putting them on the stack
>  * results in a stack frame overflow.  Dynamically allocating could result in a
>  * failure so declare these two cpumasks as global.
>  */
> static struct cpumask affinity_new, online_new;

That code no longer exists.. Also not entirely sure how it would be
helpful.

What you probably want to do is have a per-cpu cpumask, since
flush_tlb_others() is called with preemption disabled. But you probably
don't want an unconditionally allocated one, since most kernels will not
in fact be PV.

So you'll want something like:

	static DEFINE_PER_CPU(cpumask_var_t, __pv_tlb_mask);

And then you need something like:

	for_each_possible_cpu(cpu) {
		zalloc_cpumask_var_node(per_cpu_ptr(&__pb_tlb_mask, cpu),
					GFP_KERNEL, cpu_to_node(cpu));
	}

before you set the pv-op or so.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-10  7:59                 ` Peter Zijlstra
@ 2017-11-10  8:04                   ` Wanpeng Li
  0 siblings, 0 replies; 27+ messages in thread
From: Wanpeng Li @ 2017-11-10  8:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Radim Krcmar, Pankaj Gupta, Eduardo Valentin, Paolo Bonzini,
	Matt Wilson, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, the arch/x86 maintainers, Waiman Long, kvm,
	linux-doc, linux-kernel, Jan H . Schoenherr, Anthony Liguori

2017-11-10 15:59 GMT+08:00 Peter Zijlstra <peterz@infradead.org>:
> On Fri, Nov 10, 2017 at 10:07:56AM +0800, Wanpeng Li wrote:
>
>> >> Also, you should not put cpumask_t on stack, that's 'broken'.
>>
>> Thanks pointing out this. I found a useful comments in arch/x86/kernel/irq.c:
>>
>> /* These two declarations are only used in check_irq_vectors_for_cpu_disable()
>>  * below, which is protected by stop_machine().  Putting them on the stack
>>  * results in a stack frame overflow.  Dynamically allocating could result in a
>>  * failure so declare these two cpumasks as global.
>>  */
>> static struct cpumask affinity_new, online_new;
>
> That code no longer exists.. Also not entirely sure how it would be
> helpful.
>
> What you probably want to do is have a per-cpu cpumask, since
> flush_tlb_others() is called with preemption disabled. But you probably
> don't want an unconditionally allocated one, since most kernels will not
> in fact be PV.
>
> So you'll want something like:
>
>         static DEFINE_PER_CPU(cpumask_var_t, __pv_tlb_mask);
>
> And then you need something like:
>
>         for_each_possible_cpu(cpu) {
>                 zalloc_cpumask_var_node(per_cpu_ptr(&__pb_tlb_mask, cpu),
>                                         GFP_KERNEL, cpu_to_node(cpu));
>         }
>
> before you set the pv-op or so.

Thanks Peterz, :) I will have a try.

Regards,
Wanpeng Li

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-10  6:07     ` Wanpeng Li
@ 2017-11-10  8:19       ` Paolo Bonzini
  0 siblings, 0 replies; 27+ messages in thread
From: Paolo Bonzini @ 2017-11-10  8:19 UTC (permalink / raw)
  To: Wanpeng Li, Radim Krcmar
  Cc: Eduardo Valentin, Matt Wilson, Jonathan Corbet, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, the arch/x86 maintainers,
	Peter Zijlstra, Waiman Long, kvm, linux-doc, linux-kernel,
	Jan H . Schoenherr, Anthony Liguori

On 10/11/2017 07:07, Wanpeng Li wrote:
>>> You should also add a cpuid flag in kvm part.
>> It is better without that.  The flag has no dependency on KVM (kernel
>> hypervisor) code.
> Do you mean -cpu host, +xxxx,I think it will result in "warning: host
> doesn't support requested feature: CPUID.40000001H:eax.XXXXXX"

There are some exceptions where QEMU overrides the values of
KVM_GET_SUPPORTED_CPUID.

I think it is better to add the flag to KVM *and* to QEMU's override in
kvm_arch_get_supported_cpuid (target/i386/kvm.c).

Paolo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-09 14:17     ` Radim Krčmář
@ 2017-11-16  4:54       ` Eduardo Valentin
  2017-11-27  0:43         ` Wanpeng Li
  0 siblings, 1 reply; 27+ messages in thread
From: Eduardo Valentin @ 2017-11-16  4:54 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: Eduardo Valentin, Paolo Bonzini, Matt Wilson, Jonathan Corbet,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Peter Zijlstra, Waiman Long, kvm, linux-doc, linux-kernel,
	Jan H . Schoenherr, Anthony Liguori

Hey Radim,

On Thu, Nov 09, 2017 at 03:17:33PM +0100, Radim Krčmář wrote:

<cut>

> 
> This is what I'm doubting, because the patch is adding about two
> thousand cycles to every spinlock-taken path.
> Doesn't this patch yield better results?
> 
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 3df743b60c80..d9225e48c11a 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -676,6 +676,12 @@ void __init kvm_spinlock_init(void)
>  {
>  	if (!kvm_para_available())
>  		return;
> +
> +	if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED)) {
> +		static_branch_disable(&virt_spin_lock_key);
> +		return;
> +	}
> +

Yes, the above suggestion is a much better approach. The code has probably changed from the time I wrote the first version. I will refresh with the above suggestion.


>  	/* Does host kernel support KVM_FEATURE_PV_UNHALT? */
>  	if (!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT))
>  		return;
> 
> >                                                              However, the key aspect
> > here is this patch gives a way for the host to instruct the guest to use qspinlock.
> > Even with Longman's patch which allows guest to select the spinlock implementation,
> > there should still be the auto-select mode. In such mode, PV_DEDICATED should
> > allow the host to get the guest to use qspinlock, without, the guest will fallback
> > to tas when PV_UNHALT == 0.
> 
> I agree that a flag can be useful for certains setups.

Cool!

> 

-- 
All the best,
Eduardo Valentin

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
  2017-11-16  4:54       ` Eduardo Valentin
@ 2017-11-27  0:43         ` Wanpeng Li
  0 siblings, 0 replies; 27+ messages in thread
From: Wanpeng Li @ 2017-11-27  0:43 UTC (permalink / raw)
  To: Eduardo Valentin
  Cc: Radim Krčmář,
	Paolo Bonzini, Matt Wilson, Jonathan Corbet, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, the arch/x86 maintainers,
	Peter Zijlstra, Waiman Long, kvm, linux-doc, linux-kernel,
	Jan H . Schoenherr, Anthony Liguori

Hi Eduardo,
2017-11-16 12:54 GMT+08:00 Eduardo Valentin <eduval@amazon.com>:
> Hey Radim,
>
> On Thu, Nov 09, 2017 at 03:17:33PM +0100, Radim Krčmář wrote:
>
> <cut>
>
>>
>> This is what I'm doubting, because the patch is adding about two
>> thousand cycles to every spinlock-taken path.
>> Doesn't this patch yield better results?
>>
>> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
>> index 3df743b60c80..d9225e48c11a 100644
>> --- a/arch/x86/kernel/kvm.c
>> +++ b/arch/x86/kernel/kvm.c
>> @@ -676,6 +676,12 @@ void __init kvm_spinlock_init(void)
>>  {
>>       if (!kvm_para_available())
>>               return;
>> +
>> +     if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED)) {
>> +             static_branch_disable(&virt_spin_lock_key);
>> +             return;
>> +     }
>> +
>
> Yes, the above suggestion is a much better approach. The code has probably changed from the time I wrote the first version. I will refresh with the above suggestion.

Do you mind to send a new version since the merge window is closed?

Regards,
Wanpeng Li

>
>
>>       /* Does host kernel support KVM_FEATURE_PV_UNHALT? */
>>       if (!kvm_para_has_feature(KVM_FEATURE_PV_UNHALT))
>>               return;
>>
>> >                                                              However, the key aspect
>> > here is this patch gives a way for the host to instruct the guest to use qspinlock.
>> > Even with Longman's patch which allows guest to select the spinlock implementation,
>> > there should still be the auto-select mode. In such mode, PV_DEDICATED should
>> > allow the host to get the guest to use qspinlock, without, the guest will fallback
>> > to tas when PV_UNHALT == 0.
>>
>> I agree that a flag can be useful for certains setups.
>
> Cool!
>
>>
>
> --
> All the best,
> Eduardo Valentin

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2017-11-27  0:43 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-06 20:26 [PATCHv3 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set Eduardo Valentin
2017-11-07 12:23 ` Paolo Bonzini
2017-11-07 12:39   ` Eduardo Valentin
2017-11-07 12:43     ` Paolo Bonzini
2017-11-08  8:45       ` Eduardo Valentin
2017-11-08 17:36 ` Radim Krčmář
2017-11-09  8:55   ` Eduardo Valentin
2017-11-09 14:17     ` Radim Krčmář
2017-11-16  4:54       ` Eduardo Valentin
2017-11-27  0:43         ` Wanpeng Li
2017-11-09 12:43 ` Wanpeng Li
2017-11-09 15:53   ` Pankaj Gupta
2017-11-09 16:05     ` Radim Krcmar
2017-11-09 16:17       ` Peter Zijlstra
2017-11-09 16:37         ` Pankaj Gupta
2017-11-09 16:45         ` Radim Krcmar
2017-11-09 17:12           ` Peter Zijlstra
2017-11-09 17:15             ` Peter Zijlstra
2017-11-09 17:28               ` Peter Zijlstra
2017-11-09 17:35                 ` Radim Krcmar
2017-11-10  2:07               ` Wanpeng Li
2017-11-10  7:59                 ` Peter Zijlstra
2017-11-10  8:04                   ` Wanpeng Li
2017-11-09 17:31             ` Radim Krcmar
2017-11-09 16:00   ` Radim Krcmar
2017-11-10  6:07     ` Wanpeng Li
2017-11-10  8:19       ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).