kvm.vger.kernel.org archive mirror
* [PATCH v2] KVM: x86: use wrpkru directly in kvm_load_{guest|host}_xsave_state
@ 2021-05-11 15:59 Jon Kohler
  2021-05-11 16:45 ` Dave Hansen
  0 siblings, 1 reply; 5+ messages in thread
From: Jon Kohler @ 2021-05-11 15:59 UTC (permalink / raw)
  Cc: Jon Kohler, Babu Moger, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, Dave Hansen, Yu-cheng Yu, Fenghua Yu, Tony Luck,
	Petteri Aimonen, Kan Liang, Uros Bizjak, Andrew Morton,
	Mike Rapoport, Benjamin Thiel, Anshuman Khandual, Juergen Gross,
	Fan Yang, Dave Jiang, Peter Zijlstra (Intel),
	Ricardo Neri, Arvind Sankar, linux-kernel, kvm

kvm_load_host_xsave_state handles xsave state on vm exit, part of which
is managing memory protection key state. The latest arch.pkru is updated
with a rdpkru, and if it does not match the base host_pkru (which it
does not roughly 70% of the time), we issue a __write_pkru.

__write_pkru issues another rdpkru internally to try to avoid the
wrpkru, so we're reading the same value back to back even though at
this point we already know the write is needed. That second read is
redundant work and a guaranteed branch miss, both of which can be
skipped.

To improve performance, use wrpkru directly in KVM code and simplify
the uses of __write_pkru such that it can be removed completely.

While we're in this section of code, optimize the if condition ordering
prior to wrpkru in both kvm_load_{guest|host}_xsave_state.

For both functions, flip the ordering of the || condition so that
arch.xcr0 & XFEATURE_MASK_PKRU is checked first; when instrumented in
our environment, it appeared to be always true and is less overall
work than kvm_read_cr4_bits.

For kvm_load_guest_xsave_state, hoist the arch.pkru != host_pkru check
ahead one position. When instrumented, I saw this be true roughly 70%
of the time, whereas the other conditions were almost always true.
With this change, we avoid the third condition check roughly 30% of
the time.

Cc: Babu Moger <babu.moger@amd.com>
Signed-off-by: Jon Kohler <jon@nutanix.com>
---
 arch/x86/include/asm/fpu/internal.h  | 10 +++++++++-
 arch/x86/include/asm/pgtable.h       |  2 +-
 arch/x86/include/asm/special_insns.h | 20 +++++---------------
 arch/x86/kvm/x86.c                   | 14 +++++++-------
 4 files changed, 22 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 8d33ad80704f..6b50fa98370e 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -583,7 +583,15 @@ static inline void switch_fpu_finish(struct fpu *new_fpu)
 		if (pk)
 			pkru_val = pk->pkru;
 	}
-	__write_pkru(pkru_val);
+
+	/*
+	 * WRPKRU is relatively expensive compared to RDPKRU.
+	 * Avoid WRPKRU when it would not change the value.
+	 */
+	if (pkru_val == rdpkru())
+		return;
+
+	wrpkru(pkru_val);
 
 	/*
 	 * Expensive PASID MSR write will be avoided in update_pasid() because
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index b1099f2d9800..20f1fb8be7ef 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -151,7 +151,7 @@ static inline void write_pkru(u32 pkru)
 	fpregs_lock();
 	if (pk)
 		pk->pkru = pkru;
-	__write_pkru(pkru);
+	wrpkru(pkru);
 	fpregs_unlock();
 }
 
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 2acd6cb62328..3c361b5cbed5 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -99,32 +99,22 @@ static inline void wrpkru(u32 pkru)
 	/*
 	 * "wrpkru" instruction.  Loads contents in EAX to PKRU,
 	 * requires that ecx = edx = 0.
+	 * WRPKRU is relatively expensive compared to RDPKRU, callers
+	 * should try to compare pkru == rdpkru() and avoid the call
+	 * when it will not change the value; however, there are no
+	 * correctness issues if a caller WRPKRU's for the same value.
 	 */
 	asm volatile(".byte 0x0f,0x01,0xef\n\t"
 		     : : "a" (pkru), "c"(ecx), "d"(edx));
 }
 
-static inline void __write_pkru(u32 pkru)
-{
-	/*
-	 * WRPKRU is relatively expensive compared to RDPKRU.
-	 * Avoid WRPKRU when it would not change the value.
-	 */
-	if (pkru == rdpkru())
-		return;
-
-	wrpkru(pkru);
-}
-
 #else
 static inline u32 rdpkru(void)
 {
 	return 0;
 }
 
-static inline void __write_pkru(u32 pkru)
-{
-}
+static inline void wrpkru(u32 pkru) {}
 #endif
 
 static inline void native_wbinvd(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index cebdaa1e3cf5..3222c7f60f31 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -912,10 +912,10 @@ void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu)
 	}
 
 	if (static_cpu_has(X86_FEATURE_PKU) &&
-	    (kvm_read_cr4_bits(vcpu, X86_CR4_PKE) ||
-	     (vcpu->arch.xcr0 & XFEATURE_MASK_PKRU)) &&
-	    vcpu->arch.pkru != vcpu->arch.host_pkru)
-		__write_pkru(vcpu->arch.pkru);
+	    vcpu->arch.pkru != vcpu->arch.host_pkru &&
+	    ((vcpu->arch.xcr0 & XFEATURE_MASK_PKRU) ||
+	     kvm_read_cr4_bits(vcpu, X86_CR4_PKE)))
+		wrpkru(vcpu->arch.pkru);
 }
 EXPORT_SYMBOL_GPL(kvm_load_guest_xsave_state);
 
@@ -925,11 +925,11 @@ void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu)
 		return;
 
 	if (static_cpu_has(X86_FEATURE_PKU) &&
-	    (kvm_read_cr4_bits(vcpu, X86_CR4_PKE) ||
-	     (vcpu->arch.xcr0 & XFEATURE_MASK_PKRU))) {
+	    ((vcpu->arch.xcr0 & XFEATURE_MASK_PKRU) ||
+	     kvm_read_cr4_bits(vcpu, X86_CR4_PKE))) {
 		vcpu->arch.pkru = rdpkru();
 		if (vcpu->arch.pkru != vcpu->arch.host_pkru)
-			__write_pkru(vcpu->arch.host_pkru);
+			wrpkru(vcpu->arch.host_pkru);
 	}
 
 	if (kvm_read_cr4_bits(vcpu, X86_CR4_OSXSAVE)) {
-- 
2.30.1 (Apple Git-130)


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] KVM: x86: use wrpkru directly in kvm_load_{guest|host}_xsave_state
  2021-05-11 15:59 [PATCH v2] KVM: x86: use wrpkru directly in kvm_load_{guest|host}_xsave_state Jon Kohler
@ 2021-05-11 16:45 ` Dave Hansen
  2021-05-11 16:49   ` Jon Kohler
  2021-05-11 17:08   ` Paolo Bonzini
  0 siblings, 2 replies; 5+ messages in thread
From: Dave Hansen @ 2021-05-11 16:45 UTC (permalink / raw)
  To: Jon Kohler
  Cc: Babu Moger, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, Paolo Bonzini, Sean Christopherson,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	Yu-cheng Yu, Fenghua Yu, Tony Luck, Petteri Aimonen, Kan Liang,
	Uros Bizjak, Andrew Morton, Mike Rapoport, Benjamin Thiel,
	Anshuman Khandual, Juergen Gross, Fan Yang, Dave Jiang,
	Peter Zijlstra (Intel),
	Ricardo Neri, Arvind Sankar, linux-kernel, kvm

On 5/11/21 8:59 AM, Jon Kohler wrote:
> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> index b1099f2d9800..20f1fb8be7ef 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -151,7 +151,7 @@ static inline void write_pkru(u32 pkru)
>  	fpregs_lock();
>  	if (pk)
>  		pk->pkru = pkru;
> -	__write_pkru(pkru);
> +	wrpkru(pkru);
>  	fpregs_unlock();
>  }

This removes the:

	if (pkru == rdpkru())
		return;

optimization from a couple of write_pkru() users:
arch_set_user_pkey_access() and copy_init_pkru_to_fpregs().

Was that intentional?  Those aren't the hottest paths in the kernel, but
copy_init_pkru_to_fpregs() is used in signal handling and execve().


* Re: [PATCH v2] KVM: x86: use wrpkru directly in kvm_load_{guest|host}_xsave_state
  2021-05-11 16:45 ` Dave Hansen
@ 2021-05-11 16:49   ` Jon Kohler
  2021-05-11 17:08   ` Paolo Bonzini
  1 sibling, 0 replies; 5+ messages in thread
From: Jon Kohler @ 2021-05-11 16:49 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Jon Kohler, Babu Moger, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, Paolo Bonzini,
	Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, Yu-cheng Yu, Fenghua Yu, Tony Luck,
	Petteri Aimonen, Kan Liang, Uros Bizjak, Andrew Morton,
	Mike Rapoport, Benjamin Thiel, Anshuman Khandual, Juergen Gross,
	Fan Yang, Dave Jiang, Peter Zijlstra (Intel),
	Ricardo Neri, Arvind Sankar, linux-kernel, kvm



> On May 11, 2021, at 12:45 PM, Dave Hansen <dave.hansen@intel.com> wrote:
> 
> On 5/11/21 8:59 AM, Jon Kohler wrote:
>> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
>> index b1099f2d9800..20f1fb8be7ef 100644
>> --- a/arch/x86/include/asm/pgtable.h
>> +++ b/arch/x86/include/asm/pgtable.h
>> @@ -151,7 +151,7 @@ static inline void write_pkru(u32 pkru)
>> 	fpregs_lock();
>> 	if (pk)
>> 		pk->pkru = pkru;
>> -	__write_pkru(pkru);
>> +	wrpkru(pkru);
>> 	fpregs_unlock();
>> }
> 
> This removes the:
> 
> 	if (pkru == rdpkru())
> 		return;
> 
> optimization from a couple of write_pkru() users:
> arch_set_user_pkey_access() and copy_init_pkru_to_fpregs().
> 
> Was that intentional?  Those aren't the hottest paths in the kernel, but
> copy_init_pkru_to_fpregs() is used in signal handling and execve().

That wasn’t intentional, I’ll take a look and send out a v3 pronto.

Thanks, Dave



* Re: [PATCH v2] KVM: x86: use wrpkru directly in kvm_load_{guest|host}_xsave_state
  2021-05-11 16:45 ` Dave Hansen
  2021-05-11 16:49   ` Jon Kohler
@ 2021-05-11 17:08   ` Paolo Bonzini
  2021-05-11 17:10     ` Jon Kohler
  1 sibling, 1 reply; 5+ messages in thread
From: Paolo Bonzini @ 2021-05-11 17:08 UTC (permalink / raw)
  To: Dave Hansen, Jon Kohler
  Cc: Babu Moger, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, Yu-cheng Yu, Fenghua Yu,
	Tony Luck, Petteri Aimonen, Kan Liang, Uros Bizjak,
	Andrew Morton, Mike Rapoport, Benjamin Thiel, Anshuman Khandual,
	Juergen Gross, Fan Yang, Dave Jiang, Peter Zijlstra (Intel),
	Ricardo Neri, Arvind Sankar, linux-kernel, kvm

On 11/05/21 18:45, Dave Hansen wrote:
> On 5/11/21 8:59 AM, Jon Kohler wrote:
>> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
>> index b1099f2d9800..20f1fb8be7ef 100644
>> --- a/arch/x86/include/asm/pgtable.h
>> +++ b/arch/x86/include/asm/pgtable.h
>> @@ -151,7 +151,7 @@ static inline void write_pkru(u32 pkru)
>>   	fpregs_lock();
>>   	if (pk)
>>   		pk->pkru = pkru;
>> -	__write_pkru(pkru);
>> +	wrpkru(pkru);
>>   	fpregs_unlock();
>>   }
> 
> This removes the:
> 
> 	if (pkru == rdpkru())
> 		return;
> 
> optimization from a couple of write_pkru() users:
> arch_set_user_pkey_access() and copy_init_pkru_to_fpregs().
> 
> Was that intentional?  Those aren't the hottest paths in the kernel, but
> copy_init_pkru_to_fpregs() is used in signal handling and execve().

Yeah, you should move it from __write_pkru() to write_pkru() but not 
remove it completely.

Paolo



* Re: [PATCH v2] KVM: x86: use wrpkru directly in kvm_load_{guest|host}_xsave_state
  2021-05-11 17:08   ` Paolo Bonzini
@ 2021-05-11 17:10     ` Jon Kohler
  0 siblings, 0 replies; 5+ messages in thread
From: Jon Kohler @ 2021-05-11 17:10 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Dave Hansen, Jon Kohler, Babu Moger, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin,
	Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, Yu-cheng Yu, Fenghua Yu, Tony Luck,
	Petteri Aimonen, Kan Liang, Uros Bizjak, Andrew Morton,
	Mike Rapoport, Benjamin Thiel, Anshuman Khandual, Juergen Gross,
	Fan Yang, Dave Jiang, Peter Zijlstra (Intel),
	Ricardo Neri, Arvind Sankar, linux-kernel, kvm



> On May 11, 2021, at 1:08 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> On 11/05/21 18:45, Dave Hansen wrote:
>> On 5/11/21 8:59 AM, Jon Kohler wrote:
>>> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
>>> index b1099f2d9800..20f1fb8be7ef 100644
>>> --- a/arch/x86/include/asm/pgtable.h
>>> +++ b/arch/x86/include/asm/pgtable.h
>>> @@ -151,7 +151,7 @@ static inline void write_pkru(u32 pkru)
>>>  	fpregs_lock();
>>>  	if (pk)
>>>  		pk->pkru = pkru;
>>> -	__write_pkru(pkru);
>>> +	wrpkru(pkru);
>>>  	fpregs_unlock();
>>>  }
>> This removes the:
>> 	if (pkru == rdpkru())
>> 		return;
>> optimization from a couple of write_pkru() users:
>> arch_set_user_pkey_access() and copy_init_pkru_to_fpregs().
>> Was that intentional?  Those aren't the hottest paths in the kernel, but
>> copy_init_pkru_to_fpregs() is used in signal handling and execve().
> 
> Yeah, you should move it from __write_pkru() to write_pkru() but not remove it completely.
> 
> Paolo

Thanks, Paolo. Just sent out a v3 with that fix-up included, so all of
the previous callers should be at par and not notice any differences.


