linux-edac.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3] x86/cpu: Use SERIALIZE in sync_core() when available
@ 2020-08-06 19:25 Ricardo Neri
  2020-08-06 19:57 ` Dave Hansen
  0 siblings, 1 reply; 5+ messages in thread
From: Ricardo Neri @ 2020-08-06 19:25 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Andy Lutomirski, x86
  Cc: Dave Hansen, Tony Luck, Cathy Zhang, Fenghua Yu, H. Peter Anvin,
	Kyung Min Park, Peter Zijlstra (Intel),
	Ravi V. Shankar, Sean Christopherson, linux-kernel, Ricardo Neri,
	Ricardo Neri, Dave Hansen, linux-edac

The SERIALIZE instruction gives software a way to force the processor to
complete all modifications to flags, registers and memory from previous
instructions and drain all buffered writes to memory before the next
instruction is fetched and executed. Thus, it serves the purpose of
sync_core(). Use it when available.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Cathy Zhang <cathy.zhang@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kyung Min Park <kyung.min.park@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: linux-edac@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Tony Luck <tony.luck@intel.com>
Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
This is a v3 from my two previous submissions [1], [2]. The first three
patches of the series have been merged into Linus' tree. Hence, I am
submitting only this patch for review.

[1]. https://lkml.org/lkml/2020/7/27/8
[2]. https://lkml.org/lkml/2020/8/4/1090

Changes since v2:
 * Support serialize with static_cpu_has() instead of using alternative
   runtime patching directly. (Borislav Petkov)

Changes since v1:
 * Support SERIALIZE using alternative runtime patching.
   (Peter Zijlstra, H. Peter Anvin)
 * Added a note to specify which version of binutils supports SERIALIZE.
   (Peter Zijlstra)
 * Verified that (::: "memory") is used. (H. Peter Anvin)
---
 arch/x86/include/asm/special_insns.h | 2 ++
 arch/x86/include/asm/sync_core.h     | 9 ++++++++-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 59a3e13204c3..25cd67801dda 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -10,6 +10,8 @@
 #include <linux/irqflags.h>
 #include <linux/jump_label.h>
 
+/* Instruction opcode for SERIALIZE; supported in binutils >= 2.35. */
+#define __ASM_SERIALIZE ".byte 0xf, 0x1, 0xe8"
 /*
  * Volatile isn't enough to prevent the compiler from reordering the
  * read/write functions for the control registers and messing everything up.
diff --git a/arch/x86/include/asm/sync_core.h b/arch/x86/include/asm/sync_core.h
index fdb5b356e59b..b74580413a0b 100644
--- a/arch/x86/include/asm/sync_core.h
+++ b/arch/x86/include/asm/sync_core.h
@@ -5,6 +5,7 @@
 #include <linux/preempt.h>
 #include <asm/processor.h>
 #include <asm/cpufeature.h>
+#include <asm/special_insns.h>
 
 #ifdef CONFIG_X86_32
 static inline void iret_to_self(void)
@@ -54,7 +55,8 @@ static inline void iret_to_self(void)
 static inline void sync_core(void)
 {
 	/*
-	 * There are quite a few ways to do this.  IRET-to-self is nice
+	 * Hardware can do this for us if SERIALIZE is available. Otherwise,
+	 * there are quite a few ways to do this.  IRET-to-self is nice
 	 * because it works on every CPU, at any CPL (so it's compatible
 	 * with paravirtualization), and it never exits to a hypervisor.
 	 * The only down sides are that it's a bit slow (it seems to be
@@ -75,6 +77,11 @@ static inline void sync_core(void)
 	 * Like all of Linux's memory ordering operations, this is a
 	 * compiler barrier as well.
 	 */
+	if (static_cpu_has(X86_FEATURE_SERIALIZE)) {
+		asm volatile(__ASM_SERIALIZE ::: "memory");
+		return;
+	}
+
 	iret_to_self();
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v3] x86/cpu: Use SERIALIZE in sync_core() when available
  2020-08-06 19:25 [PATCH v3] x86/cpu: Use SERIALIZE in sync_core() when available Ricardo Neri
@ 2020-08-06 19:57 ` Dave Hansen
  2020-08-06 23:04   ` Ricardo Neri
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Hansen @ 2020-08-06 19:57 UTC (permalink / raw)
  To: Ricardo Neri, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, x86
  Cc: Tony Luck, Cathy Zhang, Fenghua Yu, H. Peter Anvin,
	Kyung Min Park, Peter Zijlstra (Intel),
	Ravi V. Shankar, Sean Christopherson, linux-kernel, Ricardo Neri,
	Dave Hansen, linux-edac

On 8/6/20 12:25 PM, Ricardo Neri wrote:
>  static inline void sync_core(void)
>  {
>  	/*
> -	 * There are quite a few ways to do this.  IRET-to-self is nice
> +	 * Hardware can do this for us if SERIALIZE is available. Otherwise,
> +	 * there are quite a few ways to do this.  IRET-to-self is nice

This seems to imply that things other than SERIALIZE aren't the hardware
doing this.  All of these methods are equally architecturally
serializing *and* provided by the hardware.

I also don't quite get the point of separating the comments from the
code.  Shouldn't this be:

	/*
	 * The SERIALIZE instruction is the most straightforward way to
	 * do this but it not universally available.
	 */
	if (static_cpu_has(X86_FEATURE_SERIALIZE)) {
		asm volatile(__ASM_SERIALIZE ::: "memory");
		return;
	}

	/*
	 * For all other processors, IRET-to-self is nice ...
	 */
	iret_to_self();

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v3] x86/cpu: Use SERIALIZE in sync_core() when available
  2020-08-06 19:57 ` Dave Hansen
@ 2020-08-06 23:04   ` Ricardo Neri
  2020-08-06 23:08     ` Dave Hansen
  0 siblings, 1 reply; 5+ messages in thread
From: Ricardo Neri @ 2020-08-06 23:04 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Andy Lutomirski,
	x86, Tony Luck, Cathy Zhang, Fenghua Yu, H. Peter Anvin,
	Kyung Min Park, Peter Zijlstra (Intel),
	Ravi V. Shankar, Sean Christopherson, linux-kernel, Ricardo Neri,
	Dave Hansen, linux-edac

On Thu, Aug 06, 2020 at 12:57:26PM -0700, Dave Hansen wrote:
> On 8/6/20 12:25 PM, Ricardo Neri wrote:
> >  static inline void sync_core(void)
> >  {
> >  	/*
> > -	 * There are quite a few ways to do this.  IRET-to-self is nice
> > +	 * Hardware can do this for us if SERIALIZE is available. Otherwise,
> > +	 * there are quite a few ways to do this.  IRET-to-self is nice
> 
> This seems to imply that things other than SERIALIZE aren't the hardware
> doing this.  All of these methods are equally architecturally
> serializing *and* provided by the hardware.

Indeed, I can see how the wording may imply that.

> 
> I also don't quite get the point of separating the comments from the
> code.  Shouldn't this be:
> 
> 	/*
> 	 * The SERIALIZE instruction is the most straightforward way to
> 	 * do this but it not universally available.
> 	 */

I regard the comment as describing all the considered options to for
architectural serialization. What about if I add SERIALIZE as another
option? I propose to discuss it towards the end of the comment:

	/*
	 * There are quite a few ways to do this.  IRET-to-self is nice
	 * because it works on every CPU, at any CPL (so it's compatible
	 * with paravirtualization), and it never exits to a hypervisor.
	 * The only down sides are that it's a bit slow (it seems to be
	 * a bit more than 2x slower than the fastest options) and that
	 * it unmasks NMIs.  The "push %cs" is needed because, in
	 * paravirtual environments, __KERNEL_CS may not be a valid CS
	 * value when we do IRET directly.
	 *
	 * In case NMI unmasking or performance ever becomes a problem,
	 * the next best option appears to be MOV-to-CR2 and an
	 * unconditional jump.  That sequence also works on all CPUs,
	 * but it will fault at CPL3 (i.e. Xen PV).
	 *
	 * CPUID is the conventional way, but it's nasty: it doesn't
	 * exist on some 486-like CPUs, and it usually exits to a
	 * hypervisor.
	 *
 	 * The SERIALIZE instruction is the most straightforward way to
 	 * do this as it does not clobber registers or exit to a
	 * hypervisor. However, it is not universally available.
 	 *
	 * Like all of Linux's memory ordering operations, this is a
	 * compiler barrier as well.
	 */

What do you think?

> 	if (static_cpu_has(X86_FEATURE_SERIALIZE)) {
> 		asm volatile(__ASM_SERIALIZE ::: "memory");
> 		return;
> 	}
> 
> 	/*
> 	 * For all other processors, IRET-to-self is nice ...
> 	 */
> 	iret_to_self();


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v3] x86/cpu: Use SERIALIZE in sync_core() when available
  2020-08-06 23:04   ` Ricardo Neri
@ 2020-08-06 23:08     ` Dave Hansen
  2020-08-06 23:39       ` Ricardo Neri
  0 siblings, 1 reply; 5+ messages in thread
From: Dave Hansen @ 2020-08-06 23:08 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Andy Lutomirski,
	x86, Tony Luck, Cathy Zhang, Fenghua Yu, H. Peter Anvin,
	Kyung Min Park, Peter Zijlstra (Intel),
	Ravi V. Shankar, Sean Christopherson, linux-kernel, Ricardo Neri,
	Dave Hansen, linux-edac

On 8/6/20 4:04 PM, Ricardo Neri wrote:
> 	 * CPUID is the conventional way, but it's nasty: it doesn't
> 	 * exist on some 486-like CPUs, and it usually exits to a
> 	 * hypervisor.
> 	 *
>  	 * The SERIALIZE instruction is the most straightforward way to
>  	 * do this as it does not clobber registers or exit to a
> 	 * hypervisor. However, it is not universally available.
>  	 *
> 	 * Like all of Linux's memory ordering operations, this is a
> 	 * compiler barrier as well.
> 	 */
> 
> What do you think?

I like what I suggested.  :)

SERIALIZE is best where available.  Do it first, comment it by itself.

Then, go into the long discussion of the other alternatives.  They only
make sense when SERIALIZE isn't there, and the logic for selection there
is substantially more complicated.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v3] x86/cpu: Use SERIALIZE in sync_core() when available
  2020-08-06 23:08     ` Dave Hansen
@ 2020-08-06 23:39       ` Ricardo Neri
  0 siblings, 0 replies; 5+ messages in thread
From: Ricardo Neri @ 2020-08-06 23:39 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Andy Lutomirski,
	x86, Tony Luck, Cathy Zhang, Fenghua Yu, H. Peter Anvin,
	Kyung Min Park, Peter Zijlstra (Intel),
	Ravi V. Shankar, Sean Christopherson, linux-kernel, Ricardo Neri,
	Dave Hansen, linux-edac

On Thu, Aug 06, 2020 at 04:08:47PM -0700, Dave Hansen wrote:
> On 8/6/20 4:04 PM, Ricardo Neri wrote:
> > 	 * CPUID is the conventional way, but it's nasty: it doesn't
> > 	 * exist on some 486-like CPUs, and it usually exits to a
> > 	 * hypervisor.
> > 	 *
> >  	 * The SERIALIZE instruction is the most straightforward way to
> >  	 * do this as it does not clobber registers or exit to a
> > 	 * hypervisor. However, it is not universally available.
> >  	 *
> > 	 * Like all of Linux's memory ordering operations, this is a
> > 	 * compiler barrier as well.
> > 	 */
> > 
> > What do you think?
> 
> I like what I suggested.  :)
> 
> SERIALIZE is best where available.  Do it first, comment it by itself.
> 
> Then, go into the long discussion of the other alternatives.  They only
> make sense when SERIALIZE isn't there, and the logic for selection there
> is substantially more complicated.

Sure Dave, I think this layout makes sense. I will rework the comments.

Thanks and BR,
Ricardo

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-08-06 23:39 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-06 19:25 [PATCH v3] x86/cpu: Use SERIALIZE in sync_core() when available Ricardo Neri
2020-08-06 19:57 ` Dave Hansen
2020-08-06 23:04   ` Ricardo Neri
2020-08-06 23:08     ` Dave Hansen
2020-08-06 23:39       ` Ricardo Neri

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).