* [RFC PATCH] x86: Remove X86_FEATURE_MFENCE_RDTSC
@ 2019-07-04 15:46 Josh Poimboeuf
  2019-07-08 19:32 ` Lendacky, Thomas
  2019-07-22 10:04 ` [tip:x86/cpu] " tip-bot for Josh Poimboeuf
  0 siblings, 2 replies; 6+ messages in thread
From: Josh Poimboeuf @ 2019-07-04 15:46 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, Tom Lendacky, Andrew Cooper, Thomas Gleixner,
	Pu Wen, Borislav Petkov

AMD and Intel both have serializing lfence (X86_FEATURE_LFENCE_RDTSC).
They've both had it for a long time, and AMD has had it enabled in Linux
since Spectre v1 was announced.

Back then, there was a proposal to remove the serializing mfence feature
bit (X86_FEATURE_MFENCE_RDTSC), since both AMD and Intel have
serializing lfence.  At the time, it was (ahem) speculated that some
hypervisors might not yet support its removal, so it remained for the
time being.

Now a year-and-a-half later, it should be safe to remove.

I asked Andrew Cooper about whether it's still needed:

  So if you're virtualised, you've got no choice in the matter.  lfence
  is either dispatch-serialising or not on AMD, and you won't be able to
  change it.

  Furthermore, you can't accurately tell what state the bit is in, because
  the MSR might not be virtualised at all, or may not reflect the true
  state in hardware.  Worse still, attempting to set the bit may not be
  successful even if there isn't a fault for doing so.

  Xen sets the DE_CFG bit unconditionally, as does Linux by the looks of
  things (see MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT).  ISTR other hypervisor
  vendors saying the same, but I don't have any information to hand.

  If you are running under a hypervisor which has been updated, then
  lfence will almost certainly be dispatch-serialising in practice, and
  you'll almost certainly see the bit already set in DE_CFG.  If you're
  running under a hypervisor which hasn't been patched since Spectre,
  you've already lost in many more ways.

  I'd argue that X86_FEATURE_MFENCE_RDTSC is not worth keeping.

So remove it.  This will reduce some code rot, and also make it easier
to hook barrier_nospec() up to a cmdline disable for performance
raisins, without needing an alternative_3() macro.
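
For context, barrier_nospec() is the generic speculation barrier placed after
bounds/access checks on user-controlled values.  A minimal usage sketch
(hypothetical array and index names, not taken from this patch) of what the
single-alternative version does at runtime:

  /*
   * barrier_nospec() is patched to LFENCE on CPUs with
   * X86_FEATURE_LFENCE_RDTSC and to nothing otherwise.
   */
  if (idx < ARRAY_SIZE(table)) {
          barrier_nospec();       /* no speculative load of table[idx] past here */
          val = table[idx];
  }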

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Pu Wen <puwen@hygon.cn>
Cc: Borislav Petkov <bp@alien8.de>
---
 arch/x86/include/asm/barrier.h           |  3 +--
 arch/x86/include/asm/cpufeatures.h       |  1 -
 arch/x86/include/asm/msr.h               |  3 +--
 arch/x86/kernel/cpu/amd.c                | 21 +++------------------
 arch/x86/kernel/cpu/hygon.c              | 21 +++------------------
 tools/arch/x86/include/asm/cpufeatures.h |  1 -
 6 files changed, 8 insertions(+), 42 deletions(-)

diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index 84f848c2541a..7f828fe49797 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -49,8 +49,7 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
 #define array_index_mask_nospec array_index_mask_nospec
 
 /* Prevent speculative execution past this barrier. */
-#define barrier_nospec() alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC, \
-					   "lfence", X86_FEATURE_LFENCE_RDTSC)
+#define barrier_nospec() alternative("", "lfence", X86_FEATURE_LFENCE_RDTSC)
 
 #define dma_rmb()	barrier()
 #define dma_wmb()	barrier()
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 998c2cc08363..9b98edb6b2d3 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -96,7 +96,6 @@
 #define X86_FEATURE_SYSCALL32		( 3*32+14) /* "" syscall in IA32 userspace */
 #define X86_FEATURE_SYSENTER32		( 3*32+15) /* "" sysenter in IA32 userspace */
 #define X86_FEATURE_REP_GOOD		( 3*32+16) /* REP microcode works well */
-#define X86_FEATURE_MFENCE_RDTSC	( 3*32+17) /* "" MFENCE synchronizes RDTSC */
 #define X86_FEATURE_LFENCE_RDTSC	( 3*32+18) /* "" LFENCE synchronizes RDTSC */
 #define X86_FEATURE_ACC_POWER		( 3*32+19) /* AMD Accumulated Power Mechanism */
 #define X86_FEATURE_NOPL		( 3*32+20) /* The NOPL (0F 1F) instructions */
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 5cc3930cb465..86f20d520a07 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -233,8 +233,7 @@ static __always_inline unsigned long long rdtsc_ordered(void)
 	 * Thus, use the preferred barrier on the respective CPU, aiming for
 	 * RDTSCP as the default.
 	 */
-	asm volatile(ALTERNATIVE_3("rdtsc",
-				   "mfence; rdtsc", X86_FEATURE_MFENCE_RDTSC,
+	asm volatile(ALTERNATIVE_2("rdtsc",
 				   "lfence; rdtsc", X86_FEATURE_LFENCE_RDTSC,
 				   "rdtscp", X86_FEATURE_RDTSCP)
 			: EAX_EDX_RET(val, low, high)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 8d4e50428b68..3afe07d602dd 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -879,12 +879,8 @@ static void init_amd(struct cpuinfo_x86 *c)
 	init_amd_cacheinfo(c);
 
 	if (cpu_has(c, X86_FEATURE_XMM2)) {
-		unsigned long long val;
-		int ret;
-
 		/*
-		 * A serializing LFENCE has less overhead than MFENCE, so
-		 * use it for execution serialization.  On families which
+		 * Use LFENCE for execution serialization.  On families which
 		 * don't have that MSR, LFENCE is already serializing.
 		 * msr_set_bit() uses the safe accessors, too, even if the MSR
 		 * is not present.
@@ -892,19 +888,8 @@ static void init_amd(struct cpuinfo_x86 *c)
 		msr_set_bit(MSR_F10H_DECFG,
 			    MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT);
 
-		/*
-		 * Verify that the MSR write was successful (could be running
-		 * under a hypervisor) and only then assume that LFENCE is
-		 * serializing.
-		 */
-		ret = rdmsrl_safe(MSR_F10H_DECFG, &val);
-		if (!ret && (val & MSR_F10H_DECFG_LFENCE_SERIALIZE)) {
-			/* A serializing LFENCE stops RDTSC speculation */
-			set_cpu_cap(c, X86_FEATURE_LFENCE_RDTSC);
-		} else {
-			/* MFENCE stops RDTSC speculation */
-			set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC);
-		}
+		/* A serializing LFENCE stops RDTSC speculation */
+		set_cpu_cap(c, X86_FEATURE_LFENCE_RDTSC);
 	}
 
 	/*
diff --git a/arch/x86/kernel/cpu/hygon.c b/arch/x86/kernel/cpu/hygon.c
index 415621ddb8a2..4e28c1fc8749 100644
--- a/arch/x86/kernel/cpu/hygon.c
+++ b/arch/x86/kernel/cpu/hygon.c
@@ -330,12 +330,8 @@ static void init_hygon(struct cpuinfo_x86 *c)
 	init_hygon_cacheinfo(c);
 
 	if (cpu_has(c, X86_FEATURE_XMM2)) {
-		unsigned long long val;
-		int ret;
-
 		/*
-		 * A serializing LFENCE has less overhead than MFENCE, so
-		 * use it for execution serialization.  On families which
+		 * Use LFENCE for execution serialization.  On families which
 		 * don't have that MSR, LFENCE is already serializing.
 		 * msr_set_bit() uses the safe accessors, too, even if the MSR
 		 * is not present.
@@ -343,19 +339,8 @@ static void init_hygon(struct cpuinfo_x86 *c)
 		msr_set_bit(MSR_F10H_DECFG,
 			    MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT);
 
-		/*
-		 * Verify that the MSR write was successful (could be running
-		 * under a hypervisor) and only then assume that LFENCE is
-		 * serializing.
-		 */
-		ret = rdmsrl_safe(MSR_F10H_DECFG, &val);
-		if (!ret && (val & MSR_F10H_DECFG_LFENCE_SERIALIZE)) {
-			/* A serializing LFENCE stops RDTSC speculation */
-			set_cpu_cap(c, X86_FEATURE_LFENCE_RDTSC);
-		} else {
-			/* MFENCE stops RDTSC speculation */
-			set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC);
-		}
+		/* A serializing LFENCE stops RDTSC speculation */
+		set_cpu_cap(c, X86_FEATURE_LFENCE_RDTSC);
 	}
 
 	/*
diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
index 75f27ee2c263..4cebefd007f1 100644
--- a/tools/arch/x86/include/asm/cpufeatures.h
+++ b/tools/arch/x86/include/asm/cpufeatures.h
@@ -96,7 +96,6 @@
 #define X86_FEATURE_SYSCALL32		( 3*32+14) /* "" syscall in IA32 userspace */
 #define X86_FEATURE_SYSENTER32		( 3*32+15) /* "" sysenter in IA32 userspace */
 #define X86_FEATURE_REP_GOOD		( 3*32+16) /* REP microcode works well */
-#define X86_FEATURE_MFENCE_RDTSC	( 3*32+17) /* "" MFENCE synchronizes RDTSC */
 #define X86_FEATURE_LFENCE_RDTSC	( 3*32+18) /* "" LFENCE synchronizes RDTSC */
 #define X86_FEATURE_ACC_POWER		( 3*32+19) /* AMD Accumulated Power Mechanism */
 #define X86_FEATURE_NOPL		( 3*32+20) /* The NOPL (0F 1F) instructions */
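
The msr.h hunk above drops the MFENCE alternative from rdtsc_ordered().  On a
CPU that has X86_FEATURE_LFENCE_RDTSC but not RDTSCP, the patched sequence
behaves like this rough sketch (illustrative C with a made-up helper name; the
real code patches the instructions in place via alternatives instead of
branching):

  static inline unsigned long long rdtsc_ordered_sketch(void)
  {
          unsigned int lo, hi;

          /* LFENCE keeps RDTSC from executing speculatively early. */
          asm volatile("lfence; rdtsc" : "=a" (lo), "=d" (hi));

          return ((unsigned long long)hi << 32) | lo;
  }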
-- 
2.20.1



* Re: [RFC PATCH] x86: Remove X86_FEATURE_MFENCE_RDTSC
  2019-07-04 15:46 [RFC PATCH] x86: Remove X86_FEATURE_MFENCE_RDTSC Josh Poimboeuf
@ 2019-07-08 19:32 ` Lendacky, Thomas
  2019-07-10  6:20   ` Paolo Bonzini
  2019-07-22 10:04 ` [tip:x86/cpu] " tip-bot for Josh Poimboeuf
  1 sibling, 1 reply; 6+ messages in thread
From: Lendacky, Thomas @ 2019-07-08 19:32 UTC (permalink / raw)
  To: Josh Poimboeuf, x86, Paolo Bonzini
  Cc: linux-kernel, Andrew Cooper, Thomas Gleixner, Pu Wen, Borislav Petkov

On 7/4/19 10:46 AM, Josh Poimboeuf wrote:
> AMD and Intel both have serializing lfence (X86_FEATURE_LFENCE_RDTSC).
> They've both had it for a long time, and AMD has had it enabled in Linux
> since Spectre v1 was announced.
> 
> Back then, there was a proposal to remove the serializing mfence feature
> bit (X86_FEATURE_MFENCE_RDTSC), since both AMD and Intel have
> serializing lfence.  At the time, it was (ahem) speculated that some
> hypervisors might not yet support its removal, so it remained for the
> time being.
> 
> Now a year-and-a-half later, it should be safe to remove.

I vaguely remember a concern from a migration point of view, maybe? Adding
Paolo to see if he has any concerns.

Thanks,
Tom



* Re: [RFC PATCH] x86: Remove X86_FEATURE_MFENCE_RDTSC
  2019-07-08 19:32 ` Lendacky, Thomas
@ 2019-07-10  6:20   ` Paolo Bonzini
  2019-07-10 11:33     ` Thomas Gleixner
  0 siblings, 1 reply; 6+ messages in thread
From: Paolo Bonzini @ 2019-07-10  6:20 UTC (permalink / raw)
  To: Lendacky, Thomas, Josh Poimboeuf, x86
  Cc: linux-kernel, Andrew Cooper, Thomas Gleixner, Pu Wen, Borislav Petkov

On 08/07/19 21:32, Lendacky, Thomas wrote:
>> AMD and Intel both have serializing lfence (X86_FEATURE_LFENCE_RDTSC).
>> They've both had it for a long time, and AMD has had it enabled in Linux
>> since Spectre v1 was announced.
>>
>> Back then, there was a proposal to remove the serializing mfence feature
>> bit (X86_FEATURE_MFENCE_RDTSC), since both AMD and Intel have
>> serializing lfence.  At the time, it was (ahem) speculated that some
>> hypervisors might not yet support its removal, so it remained for the
>> time being.
>>
>> Now a year-and-a-half later, it should be safe to remove.
>
> I vaguely remember a concern from a migration point of view, maybe? Adding
> Paolo to see if he has any concerns.

It would be a problem to remove the conditional "if
(boot_cpu_has(X86_FEATURE_LFENCE_RDTSC))" from svm_get_msr_feature.  But
removing support for X86_FEATURE_MFENCE_RDTSC essentially amounts to
removing support for hypervisors that haven't been updated since Spectre.
That's fair enough, I think.

Paolo
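
The conditional Paolo refers to is in the KVM SVM code; at the time it looked
roughly like the following (a paraphrased sketch, not a verbatim quote of
arch/x86/kvm/svm.c):

  static int svm_get_msr_feature(struct kvm_msr_entry *msr)
  {
          switch (msr->index) {
          case MSR_F10H_DECFG:
                  /* Tell the guest LFENCE is serializing if the host says so. */
                  if (boot_cpu_has(X86_FEATURE_LFENCE_RDTSC))
                          msr->data |= MSR_F10H_DECFG_LFENCE_SERIALIZE;
                  break;
          default:
                  return 1;
          }

          return 0;
  }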


* Re: [RFC PATCH] x86: Remove X86_FEATURE_MFENCE_RDTSC
  2019-07-10  6:20   ` Paolo Bonzini
@ 2019-07-10 11:33     ` Thomas Gleixner
  2019-07-11 17:45       ` Josh Poimboeuf
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Gleixner @ 2019-07-10 11:33 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Lendacky, Thomas, Josh Poimboeuf, x86, linux-kernel,
	Andrew Cooper, Pu Wen, Borislav Petkov

On Wed, 10 Jul 2019, Paolo Bonzini wrote:
> On 08/07/19 21:32, Lendacky, Thomas wrote:
> >> AMD and Intel both have serializing lfence (X86_FEATURE_LFENCE_RDTSC).
> >> They've both had it for a long time, and AMD has had it enabled in Linux
> >> since Spectre v1 was announced.
> >>
> >> Back then, there was a proposal to remove the serializing mfence feature
> >> bit (X86_FEATURE_MFENCE_RDTSC), since both AMD and Intel have
> >> serializing lfence.  At the time, it was (ahem) speculated that some
> >> hypervisors might not yet support its removal, so it remained for the
> >> time being.
> >>
> >> Now a year-and-a-half later, it should be safe to remove.
> >
> > I vaguely remember a concern from a migration point of view, maybe? Adding
> > Paolo to see if he has any concerns.
> 
> It would be a problem to remove the conditional "if
> (boot_cpu_has(X86_FEATURE_LFENCE_RDTSC))" from svm_get_msr_feature.  But
> removing support for X86_FEATURE_MFENCE_RDTSC essentially amounts to
> removing support for hypervisors that haven't been updated since Spectre.
> That's fair enough, I think.

Yes, they have other more interesting problems :)


* Re: [RFC PATCH] x86: Remove X86_FEATURE_MFENCE_RDTSC
  2019-07-10 11:33     ` Thomas Gleixner
@ 2019-07-11 17:45       ` Josh Poimboeuf
  0 siblings, 0 replies; 6+ messages in thread
From: Josh Poimboeuf @ 2019-07-11 17:45 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Paolo Bonzini, Lendacky, Thomas, x86, linux-kernel,
	Andrew Cooper, Pu Wen, Borislav Petkov

On Wed, Jul 10, 2019 at 01:33:48PM +0200, Thomas Gleixner wrote:
> On Wed, 10 Jul 2019, Paolo Bonzini wrote:
> > On 08/07/19 21:32, Lendacky, Thomas wrote:
> > >> AMD and Intel both have serializing lfence (X86_FEATURE_LFENCE_RDTSC).
> > >> They've both had it for a long time, and AMD has had it enabled in Linux
> > >> since Spectre v1 was announced.
> > >>
> > >> Back then, there was a proposal to remove the serializing mfence feature
> > >> bit (X86_FEATURE_MFENCE_RDTSC), since both AMD and Intel have
> > >> serializing lfence.  At the time, it was (ahem) speculated that some
> > >> hypervisors might not yet support its removal, so it remained for the
> > >> time being.
> > >>
> > >> Now a year-and-a-half later, it should be safe to remove.
> > >
> > > I vaguely remember a concern from a migration point of view, maybe? Adding
> > > Paolo to see if he has any concerns.
> > 
> > It would be a problem to remove the conditional "if
> > (boot_cpu_has(X86_FEATURE_LFENCE_RDTSC))" from svm_get_msr_feature.  But
> > removing support for X86_FEATURE_MFENCE_RDTSC essentially amounts to
> > removing support for hypervisors that haven't been updated since Spectre.
> > That's fair enough, I think.
> 
> Yes, they have other more interesting problems :)

Great.  Anyone care to give an ACK? :-)

-- 
Josh


* [tip:x86/cpu] x86: Remove X86_FEATURE_MFENCE_RDTSC
  2019-07-04 15:46 [RFC PATCH] x86: Remove X86_FEATURE_MFENCE_RDTSC Josh Poimboeuf
  2019-07-08 19:32 ` Lendacky, Thomas
@ 2019-07-22 10:04 ` tip-bot for Josh Poimboeuf
  1 sibling, 0 replies; 6+ messages in thread
From: tip-bot for Josh Poimboeuf @ 2019-07-22 10:04 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: jpoimboe, tglx, mingo, linux-kernel, hpa

Commit-ID:  be261ffce6f13229dad50f59c5e491f933d3167f
Gitweb:     https://git.kernel.org/tip/be261ffce6f13229dad50f59c5e491f933d3167f
Author:     Josh Poimboeuf <jpoimboe@redhat.com>
AuthorDate: Thu, 4 Jul 2019 10:46:37 -0500
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 22 Jul 2019 12:00:51 +0200

x86: Remove X86_FEATURE_MFENCE_RDTSC

AMD and Intel both have serializing lfence (X86_FEATURE_LFENCE_RDTSC).
They've both had it for a long time, and AMD has had it enabled in Linux
since Spectre v1 was announced.

Back then, there was a proposal to remove the serializing mfence feature
bit (X86_FEATURE_MFENCE_RDTSC), since both AMD and Intel have
serializing lfence.  At the time, it was (ahem) speculated that some
hypervisors might not yet support its removal, so it remained for the
time being.

Now a year-and-a-half later, it should be safe to remove.

I asked Andrew Cooper about whether it's still needed:

  So if you're virtualised, you've got no choice in the matter.  lfence
  is either dispatch-serialising or not on AMD, and you won't be able to
  change it.

  Furthermore, you can't accurately tell what state the bit is in, because
  the MSR might not be virtualised at all, or may not reflect the true
  state in hardware.  Worse still, attempting to set the bit may not be
  successful even if there isn't a fault for doing so.

  Xen sets the DE_CFG bit unconditionally, as does Linux by the looks of
  things (see MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT).  ISTR other hypervisor
  vendors saying the same, but I don't have any information to hand.

  If you are running under a hypervisor which has been updated, then
  lfence will almost certainly be dispatch-serialising in practice, and
  you'll almost certainly see the bit already set in DE_CFG.  If you're
  running under a hypervisor which hasn't been patched since Spectre,
  you've already lost in many more ways.

  I'd argue that X86_FEATURE_MFENCE_RDTSC is not worth keeping.

So remove it.  This will reduce some code rot, and also make it easier
to hook barrier_nospec() up to a cmdline disable for performance
raisins, without needing an alternative_3() macro.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/d990aa51e40063acb9888e8c1b688e41355a9588.1562255067.git.jpoimboe@redhat.com

---
 arch/x86/include/asm/barrier.h           |  3 +--
 arch/x86/include/asm/cpufeatures.h       |  1 -
 arch/x86/include/asm/msr.h               |  3 +--
 arch/x86/kernel/cpu/amd.c                | 21 +++------------------
 arch/x86/kernel/cpu/hygon.c              | 21 +++------------------
 tools/arch/x86/include/asm/cpufeatures.h |  1 -
 6 files changed, 8 insertions(+), 42 deletions(-)

diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index 84f848c2541a..7f828fe49797 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -49,8 +49,7 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
 #define array_index_mask_nospec array_index_mask_nospec
 
 /* Prevent speculative execution past this barrier. */
-#define barrier_nospec() alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC, \
-					   "lfence", X86_FEATURE_LFENCE_RDTSC)
+#define barrier_nospec() alternative("", "lfence", X86_FEATURE_LFENCE_RDTSC)
 
 #define dma_rmb()	barrier()
 #define dma_wmb()	barrier()
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 56f53bf3bbbf..fcc70ffd88c2 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -96,7 +96,6 @@
 #define X86_FEATURE_SYSCALL32		( 3*32+14) /* "" syscall in IA32 userspace */
 #define X86_FEATURE_SYSENTER32		( 3*32+15) /* "" sysenter in IA32 userspace */
 #define X86_FEATURE_REP_GOOD		( 3*32+16) /* REP microcode works well */
-#define X86_FEATURE_MFENCE_RDTSC	( 3*32+17) /* "" MFENCE synchronizes RDTSC */
 #define X86_FEATURE_LFENCE_RDTSC	( 3*32+18) /* "" LFENCE synchronizes RDTSC */
 #define X86_FEATURE_ACC_POWER		( 3*32+19) /* AMD Accumulated Power Mechanism */
 #define X86_FEATURE_NOPL		( 3*32+20) /* The NOPL (0F 1F) instructions */
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 5cc3930cb465..86f20d520a07 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -233,8 +233,7 @@ static __always_inline unsigned long long rdtsc_ordered(void)
 	 * Thus, use the preferred barrier on the respective CPU, aiming for
 	 * RDTSCP as the default.
 	 */
-	asm volatile(ALTERNATIVE_3("rdtsc",
-				   "mfence; rdtsc", X86_FEATURE_MFENCE_RDTSC,
+	asm volatile(ALTERNATIVE_2("rdtsc",
 				   "lfence; rdtsc", X86_FEATURE_LFENCE_RDTSC,
 				   "rdtscp", X86_FEATURE_RDTSCP)
 			: EAX_EDX_RET(val, low, high)
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 8d4e50428b68..3afe07d602dd 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -879,12 +879,8 @@ static void init_amd(struct cpuinfo_x86 *c)
 	init_amd_cacheinfo(c);
 
 	if (cpu_has(c, X86_FEATURE_XMM2)) {
-		unsigned long long val;
-		int ret;
-
 		/*
-		 * A serializing LFENCE has less overhead than MFENCE, so
-		 * use it for execution serialization.  On families which
+		 * Use LFENCE for execution serialization.  On families which
 		 * don't have that MSR, LFENCE is already serializing.
 		 * msr_set_bit() uses the safe accessors, too, even if the MSR
 		 * is not present.
@@ -892,19 +888,8 @@ static void init_amd(struct cpuinfo_x86 *c)
 		msr_set_bit(MSR_F10H_DECFG,
 			    MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT);
 
-		/*
-		 * Verify that the MSR write was successful (could be running
-		 * under a hypervisor) and only then assume that LFENCE is
-		 * serializing.
-		 */
-		ret = rdmsrl_safe(MSR_F10H_DECFG, &val);
-		if (!ret && (val & MSR_F10H_DECFG_LFENCE_SERIALIZE)) {
-			/* A serializing LFENCE stops RDTSC speculation */
-			set_cpu_cap(c, X86_FEATURE_LFENCE_RDTSC);
-		} else {
-			/* MFENCE stops RDTSC speculation */
-			set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC);
-		}
+		/* A serializing LFENCE stops RDTSC speculation */
+		set_cpu_cap(c, X86_FEATURE_LFENCE_RDTSC);
 	}
 
 	/*
diff --git a/arch/x86/kernel/cpu/hygon.c b/arch/x86/kernel/cpu/hygon.c
index 415621ddb8a2..4e28c1fc8749 100644
--- a/arch/x86/kernel/cpu/hygon.c
+++ b/arch/x86/kernel/cpu/hygon.c
@@ -330,12 +330,8 @@ static void init_hygon(struct cpuinfo_x86 *c)
 	init_hygon_cacheinfo(c);
 
 	if (cpu_has(c, X86_FEATURE_XMM2)) {
-		unsigned long long val;
-		int ret;
-
 		/*
-		 * A serializing LFENCE has less overhead than MFENCE, so
-		 * use it for execution serialization.  On families which
+		 * Use LFENCE for execution serialization.  On families which
 		 * don't have that MSR, LFENCE is already serializing.
 		 * msr_set_bit() uses the safe accessors, too, even if the MSR
 		 * is not present.
@@ -343,19 +339,8 @@ static void init_hygon(struct cpuinfo_x86 *c)
 		msr_set_bit(MSR_F10H_DECFG,
 			    MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT);
 
-		/*
-		 * Verify that the MSR write was successful (could be running
-		 * under a hypervisor) and only then assume that LFENCE is
-		 * serializing.
-		 */
-		ret = rdmsrl_safe(MSR_F10H_DECFG, &val);
-		if (!ret && (val & MSR_F10H_DECFG_LFENCE_SERIALIZE)) {
-			/* A serializing LFENCE stops RDTSC speculation */
-			set_cpu_cap(c, X86_FEATURE_LFENCE_RDTSC);
-		} else {
-			/* MFENCE stops RDTSC speculation */
-			set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC);
-		}
+		/* A serializing LFENCE stops RDTSC speculation */
+		set_cpu_cap(c, X86_FEATURE_LFENCE_RDTSC);
 	}
 
 	/*
diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
index 998c2cc08363..9b98edb6b2d3 100644
--- a/tools/arch/x86/include/asm/cpufeatures.h
+++ b/tools/arch/x86/include/asm/cpufeatures.h
@@ -96,7 +96,6 @@
 #define X86_FEATURE_SYSCALL32		( 3*32+14) /* "" syscall in IA32 userspace */
 #define X86_FEATURE_SYSENTER32		( 3*32+15) /* "" sysenter in IA32 userspace */
 #define X86_FEATURE_REP_GOOD		( 3*32+16) /* REP microcode works well */
-#define X86_FEATURE_MFENCE_RDTSC	( 3*32+17) /* "" MFENCE synchronizes RDTSC */
 #define X86_FEATURE_LFENCE_RDTSC	( 3*32+18) /* "" LFENCE synchronizes RDTSC */
 #define X86_FEATURE_ACC_POWER		( 3*32+19) /* AMD Accumulated Power Mechanism */
 #define X86_FEATURE_NOPL		( 3*32+20) /* The NOPL (0F 1F) instructions */

