* [PATCH 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
@ 2017-10-24 0:44 Eduardo Valentin
2017-10-24 8:13 ` Peter Zijlstra
2017-10-24 11:18 ` Radim Krčmář
0 siblings, 2 replies; 8+ messages in thread
From: Eduardo Valentin @ 2017-10-24 0:44 UTC (permalink / raw)
To: Paolo Bonzini, rkrcmar
Cc: Eduardo Valentin, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin, x86, Peter Zijlstra, Waiman Long, kvm, linux-doc,
linux-kernel, Jan H . Schoenherr, Anthony Liguori
Currently, the existing qspinlock implementation will fall back to
test-and-set if the hypervisor has not set the PV_UNHALT flag.
This patch gives the opportunity to guest kernels to select
between test-and-set and the regular queue fair lock implementation
based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
flag is not set, the code will still fall back to test-and-set,
but when the PV_DEDICATED flag is set, the code will use
the regular queue spinlock implementation.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: kvm@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Jan H. Schoenherr <jschoenh@amazon.de>
Cc: Anthony Liguori <aliguori@amazon.com>
Suggested-by: Matt Wilson <msw@amazon.com>
Signed-off-by: Eduardo Valentin <eduval@amazon.com>
---
Documentation/virtual/kvm/cpuid.txt | 6 ++++++
arch/x86/include/asm/qspinlock.h | 4 ++++
arch/x86/include/uapi/asm/kvm_para.h | 1 +
3 files changed, 11 insertions(+)
diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
index 3c65feb..117066a 100644
--- a/Documentation/virtual/kvm/cpuid.txt
+++ b/Documentation/virtual/kvm/cpuid.txt
@@ -54,6 +54,12 @@ KVM_FEATURE_PV_UNHALT || 7 || guest checks this feature bit
|| || before enabling paravirtualized
|| || spinlock support.
------------------------------------------------------------------------------
+KVM_FEATURE_PV_DEDICATED || 8 || guest checks this feature bit
+ || || to determine if they run on
+ || || dedicated vCPUs, allowing opti-
+ || || mizations such as usage of
+ || || qspinlocks.
+------------------------------------------------------------------------------
KVM_FEATURE_CLOCKSOURCE_STABLE_BIT || 24 || host will warn if no guest-side
|| || per-cpu warps are expected in
|| || kvmclock.
diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h
index eaba080..f89b469 100644
--- a/arch/x86/include/asm/qspinlock.h
+++ b/arch/x86/include/asm/qspinlock.h
@@ -1,6 +1,8 @@
#ifndef _ASM_X86_QSPINLOCK_H
#define _ASM_X86_QSPINLOCK_H
+#include <linux/kvm_para.h>
+
#include <asm/cpufeature.h>
#include <asm-generic/qspinlock_types.h>
#include <asm/paravirt.h>
@@ -46,6 +48,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
if (!static_cpu_has(X86_FEATURE_HYPERVISOR))
return false;
+ if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
+ return false;
/*
* On hypervisors without PARAVIRT_SPINLOCKS support we fall
* back to a Test-and-Set spinlock, because fair locks have
diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
index 94dc8ca..ad2e8fe 100644
--- a/arch/x86/include/uapi/asm/kvm_para.h
+++ b/arch/x86/include/uapi/asm/kvm_para.h
@@ -24,6 +24,7 @@
#define KVM_FEATURE_STEAL_TIME 5
#define KVM_FEATURE_PV_EOI 6
#define KVM_FEATURE_PV_UNHALT 7
+#define KVM_FEATURE_PV_DEDICATED 8
/* The last 8 bits are used to indicate how to interpret the flags field
* in pvclock structure. If no bits are set, all flags are ignored.
--
2.7.5
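
The selection described in the commit message can be sketched in userspace C.
This is a minimal illustration under stated assumptions, not the kernel code:
struct guest_features and virt_spin_lock_sketch() are hypothetical names, the
two booleans stand in for static_cpu_has(X86_FEATURE_HYPERVISOR) and
kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED), and a plain atomic_int
replaces struct qspinlock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's feature checks. */
struct guest_features {
	bool hypervisor;   /* models static_cpu_has(X86_FEATURE_HYPERVISOR) */
	bool pv_dedicated; /* models kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED) */
};

/*
 * Returns true if the lock was taken with test-and-set.
 * Returns false if the caller should continue into the fair
 * queued slow path instead (bare metal, or dedicated vCPUs).
 */
static bool virt_spin_lock_sketch(atomic_int *lock,
				  const struct guest_features *f)
{
	if (!f->hypervisor)
		return false;

	/* The check this patch adds: dedicated vCPUs keep the queued lock. */
	if (f->pv_dedicated)
		return false;

	for (;;) {
		int expected = 0;

		/* Spin on a plain load until the lock looks free. */
		while (atomic_load_explicit(lock, memory_order_relaxed) != 0)
			;

		/* Then try to take it; retry if another CPU won the race. */
		if (atomic_compare_exchange_strong(lock, &expected, 1))
			return true;
	}
}
```

With pv_dedicated set, the function returns false without touching the lock
word, which is the behaviour the patch is after: the guest stays on the fair
queued path because its vCPUs are not expected to be preempted.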
* Re: [PATCH 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
2017-10-24 0:44 [PATCH 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set Eduardo Valentin
@ 2017-10-24 8:13 ` Peter Zijlstra
2017-10-24 15:37 ` Eduardo Valentin
2017-10-24 11:18 ` Radim Krčmář
1 sibling, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2017-10-24 8:13 UTC (permalink / raw)
To: Eduardo Valentin
Cc: Paolo Bonzini, rkrcmar, Jonathan Corbet, Thomas Gleixner,
Ingo Molnar, H. Peter Anvin, x86, Waiman Long, kvm, linux-doc,
linux-kernel, Jan H . Schoenherr, Anthony Liguori
On Mon, Oct 23, 2017 at 05:44:27PM -0700, Eduardo Valentin wrote:
> @@ -46,6 +48,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
> if (!static_cpu_has(X86_FEATURE_HYPERVISOR))
> return false;
>
> + if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> + return false;
> /*
> * On hypervisors without PARAVIRT_SPINLOCKS support we fall
> * back to a Test-and-Set spinlock, because fair locks have
This does not apply. Much has been changed here recently.
* Re: [PATCH 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
2017-10-24 0:44 [PATCH 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set Eduardo Valentin
2017-10-24 8:13 ` Peter Zijlstra
@ 2017-10-24 11:18 ` Radim Krčmář
2017-10-31 17:02 ` Eduardo Valentin
1 sibling, 1 reply; 8+ messages in thread
From: Radim Krčmář @ 2017-10-24 11:18 UTC (permalink / raw)
To: Eduardo Valentin
Cc: Paolo Bonzini, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin, x86, Peter Zijlstra, Waiman Long, kvm, linux-doc,
linux-kernel, Jan H . Schoenherr, Anthony Liguori
2017-10-23 17:44-0700, Eduardo Valentin:
> Currently, the existing qspinlock implementation will fall back to
> test-and-set if the hypervisor has not set the PV_UNHALT flag.
Where have you detected the main source of overhead with pinned VCPUs?
Makes me wonder if we couldn't improve general PV_UNHALT,
thanks.
> This patch gives the opportunity to guest kernels to select
> between test-and-set and the regular queue fair lock implementation
> based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> flag is not set, the code will still fall back to test-and-set,
> but when the PV_DEDICATED flag is set, the code will use
> the regular queue spinlock implementation.
Some flag makes sense and we do want to make sure that userspaces don't
enable it in pass-through-cpuid mode.
* Re: [PATCH 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
2017-10-24 8:13 ` Peter Zijlstra
@ 2017-10-24 15:37 ` Eduardo Valentin
2017-10-24 16:07 ` Waiman Long
0 siblings, 1 reply; 8+ messages in thread
From: Eduardo Valentin @ 2017-10-24 15:37 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Eduardo Valentin, Paolo Bonzini, rkrcmar, Jonathan Corbet,
Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86, Waiman Long,
kvm, linux-doc, linux-kernel, Jan H . Schoenherr,
Anthony Liguori
Hello Peter,
On Tue, Oct 24, 2017 at 10:13:45AM +0200, Peter Zijlstra wrote:
> On Mon, Oct 23, 2017 at 05:44:27PM -0700, Eduardo Valentin wrote:
> > @@ -46,6 +48,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
> > if (!static_cpu_has(X86_FEATURE_HYPERVISOR))
> > return false;
> >
> > + if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> > + return false;
> > /*
> > * On hypervisors without PARAVIRT_SPINLOCKS support we fall
> > * back to a Test-and-Set spinlock, because fair locks have
>
> This does not apply. Much has been changed here recently.
>
I checked against Linus' master branch before sending. Which tree/branch are you referring to, and which one should I base this on?
--
All the best,
Eduardo Valentin
* Re: [PATCH 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
2017-10-24 15:37 ` Eduardo Valentin
@ 2017-10-24 16:07 ` Waiman Long
2017-10-24 16:26 ` Eduardo Valentin
0 siblings, 1 reply; 8+ messages in thread
From: Waiman Long @ 2017-10-24 16:07 UTC (permalink / raw)
To: Eduardo Valentin, Peter Zijlstra
Cc: Paolo Bonzini, rkrcmar, Jonathan Corbet, Thomas Gleixner,
Ingo Molnar, H. Peter Anvin, x86, kvm, linux-doc, linux-kernel,
Jan H . Schoenherr, Anthony Liguori
On 10/24/2017 11:37 AM, Eduardo Valentin wrote:
> Hello Peter,
> On Tue, Oct 24, 2017 at 10:13:45AM +0200, Peter Zijlstra wrote:
>> On Mon, Oct 23, 2017 at 05:44:27PM -0700, Eduardo Valentin wrote:
>>> @@ -46,6 +48,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
>>> if (!static_cpu_has(X86_FEATURE_HYPERVISOR))
>>> return false;
>>>
>>> + if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
>>> + return false;
>>> /*
>>> * On hypervisors without PARAVIRT_SPINLOCKS support we fall
>>> * back to a Test-and-Set spinlock, because fair locks have
>> This does not apply. Much has been changed here recently.
>>
> I checked against Linus' master branch before sending. Which tree/branch are you referring to, and which one should I base this on?
>
Please check the tip tree
(https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git) which has
the latest changes in locking code.
Cheers,
Longman
* Re: [PATCH 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
2017-10-24 16:07 ` Waiman Long
@ 2017-10-24 16:26 ` Eduardo Valentin
0 siblings, 0 replies; 8+ messages in thread
From: Eduardo Valentin @ 2017-10-24 16:26 UTC (permalink / raw)
To: Waiman Long
Cc: Eduardo Valentin, Peter Zijlstra, Paolo Bonzini, rkrcmar,
Jonathan Corbet, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
x86, kvm, linux-doc, linux-kernel, Jan H . Schoenherr,
Anthony Liguori
Hey Waiman,
On Tue, Oct 24, 2017 at 12:07:04PM -0400, Waiman Long wrote:
> On 10/24/2017 11:37 AM, Eduardo Valentin wrote:
> > Hello Peter,
> > On Tue, Oct 24, 2017 at 10:13:45AM +0200, Peter Zijlstra wrote:
> >> On Mon, Oct 23, 2017 at 05:44:27PM -0700, Eduardo Valentin wrote:
> >>> @@ -46,6 +48,8 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
> >>> if (!static_cpu_has(X86_FEATURE_HYPERVISOR))
> >>> return false;
> >>>
> >>> + if (kvm_para_has_feature(KVM_FEATURE_PV_DEDICATED))
> >>> + return false;
> >>> /*
> >>> * On hypervisors without PARAVIRT_SPINLOCKS support we fall
> >>> * back to a Test-and-Set spinlock, because fair locks have
> >> This does not apply. Much has been changed here recently.
> >>
> > I checked against Linus' master branch before sending. Which tree/branch are you referring to, and which one should I base this on?
> >
> Please check the tip tree
> (https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git) which has
> the latest changes in locking code.
I will rebase the patch on top of the tip tree.
Thanks.
>
> Cheers,
> Longman
>
--
All the best,
Eduardo Valentin
* Re: [PATCH 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
2017-10-24 11:18 ` Radim Krčmář
@ 2017-10-31 17:02 ` Eduardo Valentin
2017-11-08 18:41 ` Radim Krčmář
0 siblings, 1 reply; 8+ messages in thread
From: Eduardo Valentin @ 2017-10-31 17:02 UTC (permalink / raw)
To: Radim Krčmář
Cc: Eduardo Valentin, Paolo Bonzini, Jonathan Corbet,
Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
Peter Zijlstra, Waiman Long, kvm, linux-doc, linux-kernel,
Jan H . Schoenherr, Anthony Liguori, msw
Hello Radim,
On Tue, Oct 24, 2017 at 01:18:59PM +0200, Radim Krčmář wrote:
> 2017-10-23 17:44-0700, Eduardo Valentin:
> > Currently, the existing qspinlock implementation will fall back to
> > test-and-set if the hypervisor has not set the PV_UNHALT flag.
>
> Where have you detected the main source of overhead with pinned VCPUs?
> Makes me wonder if we couldn't improve general PV_UNHALT,
This is essentially for cases of non-overcommitted vCPUs in which we want
the instance vCPUs to run uninterrupted as much as possible. Here by disabling
the PV_UNHALT, we avoid the accounting needed to properly do the PV_UNHALT
hypercall, as the lock holder won't be preempted anyway for the 1:1 pin case.
>
> thanks.
>
> > This patch gives the opportunity to guest kernels to select
> > between test-and-set and the regular queue fair lock implementation
> > based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> > flag is not set, the code will still fall back to test-and-set,
> > but when the PV_DEDICATED flag is set, the code will use
> > the regular queue spinlock implementation.
>
> Some flag makes sense and we do want to make sure that userspaces don't
> enable it in pass-through-cpuid mode.
Did you mean something like:
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 0099e10..8ceb503 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -211,7 +211,8 @@ int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
}
for (i = 0; i < cpuid->nent; i++) {
vcpu->arch.cpuid_entries[i].function = cpuid_entries[i].function;
- vcpu->arch.cpuid_entries[i].eax = cpuid_entries[i].eax;
+ vcpu->arch.cpuid_entries[i].eax = cpuid_entries[i].eax &
+ ~KVM_FEATURE_PV_DEDICATED;
vcpu->arch.cpuid_entries[i].ebx = cpuid_entries[i].ebx;
vcpu->arch.cpuid_entries[i].ecx = cpuid_entries[i].ecx;
vcpu->arch.cpuid_entries[i].edx = cpuid_entries[i].edx;
But I do not see any other KVM_FEATURE_* being enforced (e.g. PV_UNHALT).
Do you mind elaborating a bit here?
>
--
All the best,
Eduardo Valentin
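
One detail in the cpuid.c snippet above is worth spelling out. The
KVM_FEATURE_* constants are bit numbers, not masks (kvm_para_has_feature()
tests 1UL << feature), so "eax & ~KVM_FEATURE_PV_DEDICATED" clears bit 3
rather than bit 8. The snippet also applies the mask to every CPUID entry
instead of only the KVM features leaf. A minimal sketch of the bit-number to
mask conversion, where kvm_feature_mask() and clear_kvm_feature() are
hypothetical helpers written for illustration and not kernel functions:

```c
#include <stdint.h>

#define KVM_FEATURE_PV_UNHALT    7
#define KVM_FEATURE_PV_DEDICATED 8 /* bit number proposed in this thread */

/* Convert a feature bit number into a mask for the EAX feature word. */
static inline uint32_t kvm_feature_mask(unsigned int feature)
{
	return UINT32_C(1) << feature;
}

/* Hypothetical helper: clear a single feature bit from an EAX value. */
static inline uint32_t clear_kvm_feature(uint32_t eax, unsigned int feature)
{
	return eax & ~kvm_feature_mask(feature);
}
```

For an EAX value with bits 3, 7 and 8 set, clear_kvm_feature(eax,
KVM_FEATURE_PV_DEDICATED) leaves bits 3 and 7, while the unshifted form
"eax & ~KVM_FEATURE_PV_DEDICATED" leaves bits 7 and 8.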
* Re: [PATCH 1/1] locking/qspinlock/x86: Avoid test-and-set when PV_DEDICATED is set
2017-10-31 17:02 ` Eduardo Valentin
@ 2017-11-08 18:41 ` Radim Krčmář
0 siblings, 0 replies; 8+ messages in thread
From: Radim Krčmář @ 2017-11-08 18:41 UTC (permalink / raw)
To: Eduardo Valentin
Cc: Paolo Bonzini, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
H. Peter Anvin, x86, Peter Zijlstra, Waiman Long, kvm, linux-doc,
linux-kernel, Jan H . Schoenherr, Anthony Liguori, msw
2017-10-31 10:02-0700, Eduardo Valentin:
> Hello Radim,
>
> On Tue, Oct 24, 2017 at 01:18:59PM +0200, Radim Krčmář wrote:
> > 2017-10-23 17:44-0700, Eduardo Valentin:
> > > Currently, the existing qspinlock implementation will fall back to
> > > test-and-set if the hypervisor has not set the PV_UNHALT flag.
> >
> > Where have you detected the main source of overhead with pinned VCPUs?
> > Makes me wonder if we couldn't improve general PV_UNHALT,
>
> This is essentially for cases of non-overcommitted vCPUs in which we want
> the instance vCPUs to run uninterrupted as much as possible. Here by disabling
> the PV_UNHALT, we avoid the accounting needed to properly do the PV_UNHALT
> hypercall, as the lock holder won't be preempted anyway for the 1:1 pin case.
Right, I would expect that the scenario should very rarely go into the
halt/kick path -- is SPIN_THRESHOLD too low?
We could also try abolishing the SPIN_THRESHOLD completely and only use
vcpu_is_preempted() and state of the previous lock holder to enter the
halt/kick path.
(The drawback is that vcpu_is_preempted() currently gets set even when
dropping into userspace.)
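
The two policies contrasted above can be sketched as plain decision
functions. This is only an illustration of the idea, assuming a waiter that
must choose between spinning and entering the halt/kick path; the enum, both
function names and their parameters are hypothetical, and holder_preempted
stands in for what vcpu_is_preempted() would report for the previous lock
holder.

```c
#include <stdbool.h>

enum wait_action {
	KEEP_SPINNING,
	HALT_AND_WAIT_FOR_KICK,
};

/* Threshold policy: enter the halt/kick path after a fixed spin count. */
static enum wait_action threshold_policy(unsigned long spins,
					 unsigned long spin_threshold)
{
	return spins >= spin_threshold ? HALT_AND_WAIT_FOR_KICK
				       : KEEP_SPINNING;
}

/*
 * Preemption policy: ignore the spin count and halt only when the
 * previous lock holder's vCPU is reported as preempted.
 */
static enum wait_action preemption_policy(bool holder_preempted)
{
	return holder_preempted ? HALT_AND_WAIT_FOR_KICK : KEEP_SPINNING;
}
```

Under the second policy a guest whose vCPUs are never reported as preempted
would never take the halt/kick path, however long it spins.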
> > > This patch gives the opportunity to guest kernels to select
> > > between test-and-set and the regular queue fair lock implementation
> > > based on the PV_DEDICATED KVM feature flag. When the PV_DEDICATED
> > > flag is not set, the code will still fall back to test-and-set,
> > > but when the PV_DEDICATED flag is set, the code will use
> > > the regular queue spinlock implementation.
> >
> > Some flag makes sense and we do want to make sure that userspaces don't
> > enable it in pass-through-cpuid mode.
>
> Did you mean something like:
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 0099e10..8ceb503 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -211,7 +211,8 @@ int kvm_vcpu_ioctl_set_cpuid(struct kvm_vcpu *vcpu,
> }
> for (i = 0; i < cpuid->nent; i++) {
> vcpu->arch.cpuid_entries[i].function = cpuid_entries[i].function;
> - vcpu->arch.cpuid_entries[i].eax = cpuid_entries[i].eax;
> + vcpu->arch.cpuid_entries[i].eax = cpuid_entries[i].eax &
> + ~KVM_FEATURE_PV_DEDICATED;
> vcpu->arch.cpuid_entries[i].ebx = cpuid_entries[i].ebx;
> vcpu->arch.cpuid_entries[i].ecx = cpuid_entries[i].ecx;
> vcpu->arch.cpuid_entries[i].edx = cpuid_entries[i].edx;
>
>
> But I do not see any other KVM_FEATURE_* being enforced (e.g. PV_UNHALT).
> Do you mind elaborating a bit here?
Sorry, nothing is needed. I somehow thought that we need to expose this
to the userspace through CPUID, but KVM just needs to consider the flag
as reserved.