linux-edac.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] x86/cpu: Use SERIALIZE in sync_core() when available
@ 2020-08-05  2:10 Ricardo Neri
  2020-08-05  4:48 ` Borislav Petkov
  0 siblings, 1 reply; 11+ messages in thread
From: Ricardo Neri @ 2020-08-05  2:10 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Andy Lutomirski,
	x86, Peter Zijlstra (Intel)
  Cc: Dave Hansen, Tony Luck, Cathy Zhang, Fenghua Yu, H. Peter Anvin,
	Kyung Min Park, Ravi V. Shankar, Sean Christopherson,
	linux-kernel, Ricardo Neri, Ricardo Neri, Dave Hansen,
	linux-edac

The SERIALIZE instruction gives software a way to force the processor to
complete all modifications to flags, registers and memory from previous
instructions and drain all buffered writes to memory before the next
instruction is fetched and executed. Thus, it serves the purpose of
sync_core(). Use it when available.

Commit 7117f16bf460 ("objtool: Fix ORC vs alternatives") enforced stack
invariance in alternatives. The iret-to-self does not comply with such
invariance. Thus, it cannot be used inside alternative code. Instead, use
an alternative that jumps to SERIALIZE when available.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Cathy Zhang <cathy.zhang@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kyung Min Park <kyung.min.park@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: linux-edac@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
This is a v2 from my initial submission [1]. The first three patches of
the series have been merged in Linus' tree. Hence, I am submitting only
this patch for review.

[1]. https://lkml.org/lkml/2020/7/27/8

Changes since v1:
 * Support SERIALIZE using alternative runtime patching.
   (Peter Zijlstra, H. Peter Anvin)
 * Added a note to specify which version of binutils supports SERIALIZE.
   (Peter Zijlstra)
 * Verified that (::: "memory") is used. (H. Peter Anvin)
---
 arch/x86/include/asm/special_insns.h |  2 ++
 arch/x86/include/asm/sync_core.h     | 10 +++++++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 59a3e13204c3..25cd67801dda 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -10,6 +10,8 @@
 #include <linux/irqflags.h>
 #include <linux/jump_label.h>
 
+/* Instruction opcode for SERIALIZE; supported in binutils >= 2.35. */
+#define __ASM_SERIALIZE ".byte 0xf, 0x1, 0xe8"
 /*
  * Volatile isn't enough to prevent the compiler from reordering the
  * read/write functions for the control registers and messing everything up.
diff --git a/arch/x86/include/asm/sync_core.h b/arch/x86/include/asm/sync_core.h
index fdb5b356e59b..201ea3d9a6bd 100644
--- a/arch/x86/include/asm/sync_core.h
+++ b/arch/x86/include/asm/sync_core.h
@@ -5,15 +5,19 @@
 #include <linux/preempt.h>
 #include <asm/processor.h>
 #include <asm/cpufeature.h>
+#include <asm/special_insns.h>
 
 #ifdef CONFIG_X86_32
 static inline void iret_to_self(void)
 {
 	asm volatile (
+		ALTERNATIVE("", "jmp 2f", X86_FEATURE_SERIALIZE)
 		"pushfl\n\t"
 		"pushl %%cs\n\t"
 		"pushl $1f\n\t"
 		"iret\n\t"
+		"2:\n\t"
+		__ASM_SERIALIZE "\n"
 		"1:"
 		: ASM_CALL_CONSTRAINT : : "memory");
 }
@@ -23,6 +27,7 @@ static inline void iret_to_self(void)
 	unsigned int tmp;
 
 	asm volatile (
+		ALTERNATIVE("", "jmp 2f", X86_FEATURE_SERIALIZE)
 		"mov %%ss, %0\n\t"
 		"pushq %q0\n\t"
 		"pushq %%rsp\n\t"
@@ -32,6 +37,8 @@ static inline void iret_to_self(void)
 		"pushq %q0\n\t"
 		"pushq $1f\n\t"
 		"iretq\n\t"
+		"2:\n\t"
+		__ASM_SERIALIZE "\n"
 		"1:"
 		: "=&r" (tmp), ASM_CALL_CONSTRAINT : : "cc", "memory");
 }
@@ -54,7 +61,8 @@ static inline void iret_to_self(void)
 static inline void sync_core(void)
 {
 	/*
-	 * There are quite a few ways to do this.  IRET-to-self is nice
+	 * Hardware can do this for us if SERIALIZE is available. Otherwise,
+	 * there are quite a few ways to do this.  IRET-to-self is nice
 	 * because it works on every CPU, at any CPL (so it's compatible
 	 * with paravirtualization), and it never exits to a hypervisor.
 	 * The only down sides are that it's a bit slow (it seems to be
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] x86/cpu: Use SERIALIZE in sync_core() when available
  2020-08-05  2:10 [PATCH v2] x86/cpu: Use SERIALIZE in sync_core() when available Ricardo Neri
@ 2020-08-05  4:48 ` Borislav Petkov
  2020-08-05  4:58   ` hpa
  2020-08-05 15:54   ` Ricardo Neri
  0 siblings, 2 replies; 11+ messages in thread
From: Borislav Petkov @ 2020-08-05  4:48 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Thomas Gleixner, Ingo Molnar, Andy Lutomirski, x86,
	Peter Zijlstra (Intel),
	Dave Hansen, Tony Luck, Cathy Zhang, Fenghua Yu, H. Peter Anvin,
	Kyung Min Park, Ravi V. Shankar, Sean Christopherson,
	linux-kernel, Ricardo Neri, Dave Hansen, linux-edac

On Tue, Aug 04, 2020 at 07:10:59PM -0700, Ricardo Neri wrote:
> The SERIALIZE instruction gives software a way to force the processor to
> complete all modifications to flags, registers and memory from previous
> instructions and drain all buffered writes to memory before the next
> instruction is fetched and executed. Thus, it serves the purpose of
> sync_core(). Use it when available.
> 
> Commit 7117f16bf460 ("objtool: Fix ORC vs alternatives") enforced stack
> invariance in alternatives. The iret-to-self does not comply with such
> invariance. Thus, it cannot be used inside alternative code. Instead, use
> an alternative that jumps to SERIALIZE when available.
> 
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Cathy Zhang <cathy.zhang@intel.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Fenghua Yu <fenghua.yu@intel.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Kyung Min Park <kyung.min.park@intel.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
> Cc: Sean Christopherson <sean.j.christopherson@intel.com>
> Cc: linux-edac@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Suggested-by: Andy Lutomirski <luto@kernel.org>
> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> ---
> This is a v2 from my initial submission [1]. The first three patches of
> the series have been merged in Linus' tree. Hence, I am submitting only
> this patch for review.
> 
> [1]. https://lkml.org/lkml/2020/7/27/8
> 
> Changes since v1:
>  * Support SERIALIZE using alternative runtime patching.
>    (Peter Zijlstra, H. Peter Anvin)
>  * Added a note to specify which version of binutils supports SERIALIZE.
>    (Peter Zijlstra)
>  * Verified that (::: "memory") is used. (H. Peter Anvin)
> ---
>  arch/x86/include/asm/special_insns.h |  2 ++
>  arch/x86/include/asm/sync_core.h     | 10 +++++++++-
>  2 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
> index 59a3e13204c3..25cd67801dda 100644
> --- a/arch/x86/include/asm/special_insns.h
> +++ b/arch/x86/include/asm/special_insns.h
> @@ -10,6 +10,8 @@
>  #include <linux/irqflags.h>
>  #include <linux/jump_label.h>
>  
> +/* Instruction opcode for SERIALIZE; supported in binutils >= 2.35. */
> +#define __ASM_SERIALIZE ".byte 0xf, 0x1, 0xe8"
>  /*
>   * Volatile isn't enough to prevent the compiler from reordering the
>   * read/write functions for the control registers and messing everything up.
> diff --git a/arch/x86/include/asm/sync_core.h b/arch/x86/include/asm/sync_core.h
> index fdb5b356e59b..201ea3d9a6bd 100644
> --- a/arch/x86/include/asm/sync_core.h
> +++ b/arch/x86/include/asm/sync_core.h
> @@ -5,15 +5,19 @@
>  #include <linux/preempt.h>
>  #include <asm/processor.h>
>  #include <asm/cpufeature.h>
> +#include <asm/special_insns.h>
>  
>  #ifdef CONFIG_X86_32
>  static inline void iret_to_self(void)
>  {
>  	asm volatile (
> +		ALTERNATIVE("", "jmp 2f", X86_FEATURE_SERIALIZE)
>  		"pushfl\n\t"
>  		"pushl %%cs\n\t"
>  		"pushl $1f\n\t"
>  		"iret\n\t"
> +		"2:\n\t"
> +		__ASM_SERIALIZE "\n"
>  		"1:"
>  		: ASM_CALL_CONSTRAINT : : "memory");
>  }
> @@ -23,6 +27,7 @@ static inline void iret_to_self(void)
>  	unsigned int tmp;
>  
>  	asm volatile (
> +		ALTERNATIVE("", "jmp 2f", X86_FEATURE_SERIALIZE)

Why is this and above stuck inside the asm statement?

Why can't you simply do:

	if (static_cpu_has(X86_FEATURE_SERIALIZE)) {
		asm volatile(__ASM_SERIALIZE ::: "memory");
		return;
	}

on function entry instead of making it more unreadable for no particular
reason?

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Mary Higgins, Sri Rasiah, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] x86/cpu: Use SERIALIZE in sync_core() when available
  2020-08-05  4:48 ` Borislav Petkov
@ 2020-08-05  4:58   ` hpa
  2020-08-05  5:08     ` Borislav Petkov
  2020-08-05 15:54   ` Ricardo Neri
  1 sibling, 1 reply; 11+ messages in thread
From: hpa @ 2020-08-05  4:58 UTC (permalink / raw)
  To: Borislav Petkov, Ricardo Neri
  Cc: Thomas Gleixner, Ingo Molnar, Andy Lutomirski, x86,
	Peter Zijlstra (Intel),
	Dave Hansen, Tony Luck, Cathy Zhang, Fenghua Yu, Kyung Min Park,
	Ravi V. Shankar, Sean Christopherson, linux-kernel, Ricardo Neri,
	Dave Hansen, linux-edac

On August 4, 2020 9:48:40 PM PDT, Borislav Petkov <bp@suse.de> wrote:
>On Tue, Aug 04, 2020 at 07:10:59PM -0700, Ricardo Neri wrote:
>> The SERIALIZE instruction gives software a way to force the processor
>to
>> complete all modifications to flags, registers and memory from
>previous
>> instructions and drain all buffered writes to memory before the next
>> instruction is fetched and executed. Thus, it serves the purpose of
>> sync_core(). Use it when available.
>> 
>> Commit 7117f16bf460 ("objtool: Fix ORC vs alternatives") enforced
>stack
>> invariance in alternatives. The iret-to-self does not comply with
>such
>> invariance. Thus, it cannot be used inside alternative code. Instead,
>use
>> an alternative that jumps to SERIALIZE when available.
>> 
>> Cc: Andy Lutomirski <luto@kernel.org>
>> Cc: Cathy Zhang <cathy.zhang@intel.com>
>> Cc: Dave Hansen <dave.hansen@linux.intel.com>
>> Cc: Fenghua Yu <fenghua.yu@intel.com>
>> Cc: "H. Peter Anvin" <hpa@zytor.com>
>> Cc: Kyung Min Park <kyung.min.park@intel.com>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
>> Cc: Sean Christopherson <sean.j.christopherson@intel.com>
>> Cc: linux-edac@vger.kernel.org
>> Cc: linux-kernel@vger.kernel.org
>> Suggested-by: Andy Lutomirski <luto@kernel.org>
>> Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
>> ---
>> This is a v2 from my initial submission [1]. The first three patches
>of
>> the series have been merged in Linus' tree. Hence, I am submitting
>only
>> this patch for review.
>> 
>> [1]. https://lkml.org/lkml/2020/7/27/8
>> 
>> Changes since v1:
>>  * Support SERIALIZE using alternative runtime patching.
>>    (Peter Zijlstra, H. Peter Anvin)
>>  * Added a note to specify which version of binutils supports
>SERIALIZE.
>>    (Peter Zijlstra)
>>  * Verified that (::: "memory") is used. (H. Peter Anvin)
>> ---
>>  arch/x86/include/asm/special_insns.h |  2 ++
>>  arch/x86/include/asm/sync_core.h     | 10 +++++++++-
>>  2 files changed, 11 insertions(+), 1 deletion(-)
>> 
>> diff --git a/arch/x86/include/asm/special_insns.h
>b/arch/x86/include/asm/special_insns.h
>> index 59a3e13204c3..25cd67801dda 100644
>> --- a/arch/x86/include/asm/special_insns.h
>> +++ b/arch/x86/include/asm/special_insns.h
>> @@ -10,6 +10,8 @@
>>  #include <linux/irqflags.h>
>>  #include <linux/jump_label.h>
>>  
>> +/* Instruction opcode for SERIALIZE; supported in binutils >= 2.35.
>*/
>> +#define __ASM_SERIALIZE ".byte 0xf, 0x1, 0xe8"
>>  /*
>>   * Volatile isn't enough to prevent the compiler from reordering the
>>   * read/write functions for the control registers and messing
>everything up.
>> diff --git a/arch/x86/include/asm/sync_core.h
>b/arch/x86/include/asm/sync_core.h
>> index fdb5b356e59b..201ea3d9a6bd 100644
>> --- a/arch/x86/include/asm/sync_core.h
>> +++ b/arch/x86/include/asm/sync_core.h
>> @@ -5,15 +5,19 @@
>>  #include <linux/preempt.h>
>>  #include <asm/processor.h>
>>  #include <asm/cpufeature.h>
>> +#include <asm/special_insns.h>
>>  
>>  #ifdef CONFIG_X86_32
>>  static inline void iret_to_self(void)
>>  {
>>  	asm volatile (
>> +		ALTERNATIVE("", "jmp 2f", X86_FEATURE_SERIALIZE)
>>  		"pushfl\n\t"
>>  		"pushl %%cs\n\t"
>>  		"pushl $1f\n\t"
>>  		"iret\n\t"
>> +		"2:\n\t"
>> +		__ASM_SERIALIZE "\n"
>>  		"1:"
>>  		: ASM_CALL_CONSTRAINT : : "memory");
>>  }
>> @@ -23,6 +27,7 @@ static inline void iret_to_self(void)
>>  	unsigned int tmp;
>>  
>>  	asm volatile (
>> +		ALTERNATIVE("", "jmp 2f", X86_FEATURE_SERIALIZE)
>
>Why is this and above stuck inside the asm statement?
>
>Why can't you simply do:
>
>	if (static_cpu_has(X86_FEATURE_SERIALIZE)) {
>		asm volatile(__ASM_SERIALIZE ::: "memory");
>		return;
>	}
>
>on function entry instead of making it more unreadable for no
>particular
>reason?

Because why use an alternative to jump over one instruction?

I personally would prefer to have the IRET put out of line and have the call/jmp replaced by SERIALIZE inline.

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] x86/cpu: Use SERIALIZE in sync_core() when available
  2020-08-05  4:58   ` hpa
@ 2020-08-05  5:08     ` Borislav Petkov
  2020-08-05  5:10       ` hpa
  2020-08-05 17:07       ` Ricardo Neri
  0 siblings, 2 replies; 11+ messages in thread
From: Borislav Petkov @ 2020-08-05  5:08 UTC (permalink / raw)
  To: hpa
  Cc: Borislav Petkov, Ricardo Neri, Thomas Gleixner, Ingo Molnar,
	Andy Lutomirski, x86, Peter Zijlstra (Intel),
	Dave Hansen, Tony Luck, Cathy Zhang, Fenghua Yu, Kyung Min Park,
	Ravi V. Shankar, Sean Christopherson, linux-kernel, Ricardo Neri,
	Dave Hansen, linux-edac

On Tue, Aug 04, 2020 at 09:58:25PM -0700, hpa@zytor.com wrote:
> Because why use an alternative to jump over one instruction?
>
> I personally would prefer to have the IRET put out of line

Can't yet - SERIALIZE CPUs are a minority at the moment.

> and have the call/jmp replaced by SERIALIZE inline.

Well, we could do:

	alternative_io("... IRET bunch", __ASM_SERIALIZE, X86_FEATURE_SERIALIZE, ...);

and avoid all kinds of jumping. Alternatives get padded so there
would be a couple of NOPs following when SERIALIZE gets patched in
but it shouldn't be a problem. I guess one needs to look at what gcc
generates...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] x86/cpu: Use SERIALIZE in sync_core() when available
  2020-08-05  5:08     ` Borislav Petkov
@ 2020-08-05  5:10       ` hpa
  2020-08-05 17:07       ` Ricardo Neri
  1 sibling, 0 replies; 11+ messages in thread
From: hpa @ 2020-08-05  5:10 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Borislav Petkov, Ricardo Neri, Thomas Gleixner, Ingo Molnar,
	Andy Lutomirski, x86, Peter Zijlstra (Intel),
	Dave Hansen, Tony Luck, Cathy Zhang, Fenghua Yu, Kyung Min Park,
	Ravi V. Shankar, Sean Christopherson, linux-kernel, Ricardo Neri,
	Dave Hansen, linux-edac

On August 4, 2020 10:08:08 PM PDT, Borislav Petkov <bp@alien8.de> wrote:
>On Tue, Aug 04, 2020 at 09:58:25PM -0700, hpa@zytor.com wrote:
>> Because why use an alternative to jump over one instruction?
>>
>> I personally would prefer to have the IRET put out of line
>
>Can't yet - SERIALIZE CPUs are a minority at the moment.
>
>> and have the call/jmp replaced by SERIALIZE inline.
>
>Well, we could do:
>
>	alternative_io("... IRET bunch", __ASM_SERIALIZE,
>X86_FEATURE_SERIALIZE, ...);
>
>and avoid all kinds of jumping. Alternatives get padded so there
>would be a couple of NOPs following when SERIALIZE gets patched in
>but it shouldn't be a problem. I guess one needs to look at what gcc
>generates...

I didn't say behind a trap. IRET is a control transfer instruction, and slow, so putting it out of line really isn't unreasonable. Can even do a call to a common handler.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] x86/cpu: Use SERIALIZE in sync_core() when available
  2020-08-05  4:48 ` Borislav Petkov
  2020-08-05  4:58   ` hpa
@ 2020-08-05 15:54   ` Ricardo Neri
  1 sibling, 0 replies; 11+ messages in thread
From: Ricardo Neri @ 2020-08-05 15:54 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Thomas Gleixner, Ingo Molnar, Andy Lutomirski, x86,
	Peter Zijlstra (Intel),
	Dave Hansen, Tony Luck, Cathy Zhang, Fenghua Yu, H. Peter Anvin,
	Kyung Min Park, Ravi V. Shankar, Sean Christopherson,
	linux-kernel, Ricardo Neri, Dave Hansen, linux-edac

On Wed, Aug 05, 2020 at 06:48:40AM +0200, Borislav Petkov wrote:
> On Tue, Aug 04, 2020 at 07:10:59PM -0700, Ricardo Neri wrote:
> > The SERIALIZE instruction gives software a way to force the processor to
> > complete all modifications to flags, registers and memory from previous
> > instructions and drain all buffered writes to memory before the next
> > instruction is fetched and executed. Thus, it serves the purpose of
> > sync_core(). Use it when available.
> > 
> > Commit 7117f16bf460 ("objtool: Fix ORC vs alternatives") enforced stack
> > invariance in alternatives. The iret-to-self does not comply with such
> > invariance. Thus, it cannot be used inside alternative code. Instead, use
> > an alternative that jumps to SERIALIZE when available.
> > 
> > Cc: Andy Lutomirski <luto@kernel.org>
> > Cc: Cathy Zhang <cathy.zhang@intel.com>
> > Cc: Dave Hansen <dave.hansen@linux.intel.com>
> > Cc: Fenghua Yu <fenghua.yu@intel.com>
> > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > Cc: Kyung Min Park <kyung.min.park@intel.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
> > Cc: Sean Christopherson <sean.j.christopherson@intel.com>
> > Cc: linux-edac@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > Suggested-by: Andy Lutomirski <luto@kernel.org>
> > Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> > ---
> > This is a v2 from my initial submission [1]. The first three patches of
> > the series have been merged in Linus' tree. Hence, I am submitting only
> > this patch for review.
> > 
> > [1]. https://lkml.org/lkml/2020/7/27/8
> > 
> > Changes since v1:
> >  * Support SERIALIZE using alternative runtime patching.
> >    (Peter Zijlstra, H. Peter Anvin)
> >  * Added a note to specify which version of binutils supports SERIALIZE.
> >    (Peter Zijlstra)
> >  * Verified that (::: "memory") is used. (H. Peter Anvin)
> > ---
> >  arch/x86/include/asm/special_insns.h |  2 ++
> >  arch/x86/include/asm/sync_core.h     | 10 +++++++++-
> >  2 files changed, 11 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
> > index 59a3e13204c3..25cd67801dda 100644
> > --- a/arch/x86/include/asm/special_insns.h
> > +++ b/arch/x86/include/asm/special_insns.h
> > @@ -10,6 +10,8 @@
> >  #include <linux/irqflags.h>
> >  #include <linux/jump_label.h>
> >  
> > +/* Instruction opcode for SERIALIZE; supported in binutils >= 2.35. */
> > +#define __ASM_SERIALIZE ".byte 0xf, 0x1, 0xe8"
> >  /*
> >   * Volatile isn't enough to prevent the compiler from reordering the
> >   * read/write functions for the control registers and messing everything up.
> > diff --git a/arch/x86/include/asm/sync_core.h b/arch/x86/include/asm/sync_core.h
> > index fdb5b356e59b..201ea3d9a6bd 100644
> > --- a/arch/x86/include/asm/sync_core.h
> > +++ b/arch/x86/include/asm/sync_core.h
> > @@ -5,15 +5,19 @@
> >  #include <linux/preempt.h>
> >  #include <asm/processor.h>
> >  #include <asm/cpufeature.h>
> > +#include <asm/special_insns.h>
> >  
> >  #ifdef CONFIG_X86_32
> >  static inline void iret_to_self(void)
> >  {
> >  	asm volatile (
> > +		ALTERNATIVE("", "jmp 2f", X86_FEATURE_SERIALIZE)
> >  		"pushfl\n\t"
> >  		"pushl %%cs\n\t"
> >  		"pushl $1f\n\t"
> >  		"iret\n\t"
> > +		"2:\n\t"
> > +		__ASM_SERIALIZE "\n"
> >  		"1:"
> >  		: ASM_CALL_CONSTRAINT : : "memory");
> >  }
> > @@ -23,6 +27,7 @@ static inline void iret_to_self(void)
> >  	unsigned int tmp;
> >  
> >  	asm volatile (
> > +		ALTERNATIVE("", "jmp 2f", X86_FEATURE_SERIALIZE)
> 
> Why is this and above stuck inside the asm statement?
> 
> Why can't you simply do:
> 
> 	if (static_cpu_has(X86_FEATURE_SERIALIZE)) {
> 		asm volatile(__ASM_SERIALIZE ::: "memory");
> 		return;
> 	}
> 
> on function entry instead of making it more unreadable for no particular
> reason?

My my first submission I had implemented it as you describe. The only
difference was that I used boot_cpu_has() instead of static_cpu_has()
as the latter has a comment stating that:
	"Use static_cpu_has() only in fast paths (...) boot_cpu_has() is
	 already fast enough for the majority of cases..."

sync_core_before_usermode() already handles what I think are the
critical paths.

Thanks and BR,
Ricardo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] x86/cpu: Use SERIALIZE in sync_core() when available
  2020-08-05  5:08     ` Borislav Petkov
  2020-08-05  5:10       ` hpa
@ 2020-08-05 17:07       ` Ricardo Neri
  2020-08-05 18:28         ` Andy Lutomirski
  1 sibling, 1 reply; 11+ messages in thread
From: Ricardo Neri @ 2020-08-05 17:07 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: hpa, Borislav Petkov, Thomas Gleixner, Ingo Molnar,
	Andy Lutomirski, x86, Peter Zijlstra (Intel),
	Dave Hansen, Tony Luck, Cathy Zhang, Fenghua Yu, Kyung Min Park,
	Ravi V. Shankar, Sean Christopherson, linux-kernel, Ricardo Neri,
	Dave Hansen, linux-edac

On Wed, Aug 05, 2020 at 07:08:08AM +0200, Borislav Petkov wrote:
> On Tue, Aug 04, 2020 at 09:58:25PM -0700, hpa@zytor.com wrote:
> > Because why use an alternative to jump over one instruction?
> >
> > I personally would prefer to have the IRET put out of line
> 
> Can't yet - SERIALIZE CPUs are a minority at the moment.
> 
> > and have the call/jmp replaced by SERIALIZE inline.
> 
> Well, we could do:
> 
> 	alternative_io("... IRET bunch", __ASM_SERIALIZE, X86_FEATURE_SERIALIZE, ...);
> 
> and avoid all kinds of jumping. Alternatives get padded so there
> would be a couple of NOPs following when SERIALIZE gets patched in
> but it shouldn't be a problem. I guess one needs to look at what gcc
> generates...

But the IRET-TO-SELF code has instruction which modify the stack. This
would violate stack invariance in alternatives as enforced in commit
7117f16bf460 ("objtool: Fix ORC vs alternatives"). As a result, objtool
gives warnings as follows:

arch/x86/kernel/alternative.o: warning: objtool: do_sync_core()+0xe:
alternative modifies stack

Perhaps in this specific case it does not matter as the changes in the
stack will be undone by IRET. However, using alternative_io would require
adding the macro STACK_FRAME_NON_STANDARD to functions using sync_core().
IMHO, it wouldn't look good.

So maybe the best approach is to implement as you suggested using
static_cpu_has()?

Thanks and BR,
Ricardo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] x86/cpu: Use SERIALIZE in sync_core() when available
  2020-08-05 17:07       ` Ricardo Neri
@ 2020-08-05 18:28         ` Andy Lutomirski
  2020-08-05 19:11           ` Ricardo Neri
  0 siblings, 1 reply; 11+ messages in thread
From: Andy Lutomirski @ 2020-08-05 18:28 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Borislav Petkov, H. Peter Anvin, Borislav Petkov,
	Thomas Gleixner, Ingo Molnar, Andy Lutomirski, X86 ML,
	Peter Zijlstra (Intel),
	Dave Hansen, Tony Luck, Cathy Zhang, Fenghua Yu, Kyung Min Park,
	Ravi V. Shankar, Sean Christopherson, LKML, Ricardo Neri,
	Dave Hansen, linux-edac

On Wed, Aug 5, 2020 at 10:07 AM Ricardo Neri
<ricardo.neri-calderon@linux.intel.com> wrote:
>
> On Wed, Aug 05, 2020 at 07:08:08AM +0200, Borislav Petkov wrote:
> > On Tue, Aug 04, 2020 at 09:58:25PM -0700, hpa@zytor.com wrote:
> > > Because why use an alternative to jump over one instruction?
> > >
> > > I personally would prefer to have the IRET put out of line
> >
> > Can't yet - SERIALIZE CPUs are a minority at the moment.
> >
> > > and have the call/jmp replaced by SERIALIZE inline.
> >
> > Well, we could do:
> >
> >       alternative_io("... IRET bunch", __ASM_SERIALIZE, X86_FEATURE_SERIALIZE, ...);
> >
> > and avoid all kinds of jumping. Alternatives get padded so there
> > would be a couple of NOPs following when SERIALIZE gets patched in
> > but it shouldn't be a problem. I guess one needs to look at what gcc
> > generates...
>
> But the IRET-TO-SELF code has instruction which modify the stack. This
> would violate stack invariance in alternatives as enforced in commit
> 7117f16bf460 ("objtool: Fix ORC vs alternatives"). As a result, objtool
> gives warnings as follows:
>
> arch/x86/kernel/alternative.o: warning: objtool: do_sync_core()+0xe:
> alternative modifies stack
>
> Perhaps in this specific case it does not matter as the changes in the
> stack will be undone by IRET. However, using alternative_io would require
> adding the macro STACK_FRAME_NON_STANDARD to functions using sync_core().
> IMHO, it wouldn't look good.
>
> So maybe the best approach is to implement as you suggested using
> static_cpu_has()?

I agree.  Let's keep it simple.

Honestly, I think the right solution is to have iret_to_self() in
actual asm and invoke it from C as needed.  IRET is *slow* -- trying
to optimize it at all is silly.  The big optimization was switching
from CPUID to IRET, since CPUID is slooooooooooooooooooow in virtual
environments, whereas IRET is merely sloooooooow and SERIALIZE is
probably just sloooow.

(I once benchmarked it.  IIRC the winning version on my laptop is MOV
to CR2 on bare metal and IRET in a Xen PV guest.  This optimization
was not obviously worthwhile.)

>
> Thanks and BR,
> Ricardo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] x86/cpu: Use SERIALIZE in sync_core() when available
  2020-08-05 18:28         ` Andy Lutomirski
@ 2020-08-05 19:11           ` Ricardo Neri
  2020-08-05 20:30             ` Andy Lutomirski
  0 siblings, 1 reply; 11+ messages in thread
From: Ricardo Neri @ 2020-08-05 19:11 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Borislav Petkov, H. Peter Anvin, Borislav Petkov,
	Thomas Gleixner, Ingo Molnar, X86 ML, Peter Zijlstra (Intel),
	Dave Hansen, Tony Luck, Cathy Zhang, Fenghua Yu, Kyung Min Park,
	Ravi V. Shankar, Sean Christopherson, LKML, Ricardo Neri,
	Dave Hansen, linux-edac

On Wed, Aug 05, 2020 at 11:28:31AM -0700, Andy Lutomirski wrote:
> On Wed, Aug 5, 2020 at 10:07 AM Ricardo Neri
> <ricardo.neri-calderon@linux.intel.com> wrote:
> >
> > On Wed, Aug 05, 2020 at 07:08:08AM +0200, Borislav Petkov wrote:
> > > On Tue, Aug 04, 2020 at 09:58:25PM -0700, hpa@zytor.com wrote:
> > > > Because why use an alternative to jump over one instruction?
> > > >
> > > > I personally would prefer to have the IRET put out of line
> > >
> > > Can't yet - SERIALIZE CPUs are a minority at the moment.
> > >
> > > > and have the call/jmp replaced by SERIALIZE inline.
> > >
> > > Well, we could do:
> > >
> > >       alternative_io("... IRET bunch", __ASM_SERIALIZE, X86_FEATURE_SERIALIZE, ...);
> > >
> > > and avoid all kinds of jumping. Alternatives get padded so there
> > > would be a couple of NOPs following when SERIALIZE gets patched in
> > > but it shouldn't be a problem. I guess one needs to look at what gcc
> > > generates...
> >
> > But the IRET-TO-SELF code has instruction which modify the stack. This
> > would violate stack invariance in alternatives as enforced in commit
> > 7117f16bf460 ("objtool: Fix ORC vs alternatives"). As a result, objtool
> > gives warnings as follows:
> >
> > arch/x86/kernel/alternative.o: warning: objtool: do_sync_core()+0xe:
> > alternative modifies stack
> >
> > Perhaps in this specific case it does not matter as the changes in the
> > stack will be undone by IRET. However, using alternative_io would require
> > adding the macro STACK_FRAME_NON_STANDARD to functions using sync_core().
> > IMHO, it wouldn't look good.
> >
> > So maybe the best approach is to implement as you suggested using
> > static_cpu_has()?
> 
> I agree.  Let's keep it simple.
> 
> Honestly, I think the right solution is to have iret_to_self() in
> actual asm and invoke it from C as needed. 

Do you mean anything different from what we have already [1]? If I
understand your comment correctly, we have exactly that: an
iret_to_self() asm implementation invoked from C.

Thanks and BR,
Ricardo

[1]. https://lore.kernel.org/lkml/20200727043132.15082-4-ricardo.neri-calderon@linux.intel.com/

Thanks and BR,
Ricardo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] x86/cpu: Use SERIALIZE in sync_core() when available
  2020-08-05 19:11           ` Ricardo Neri
@ 2020-08-05 20:30             ` Andy Lutomirski
  2020-08-05 22:19               ` Luck, Tony
  0 siblings, 1 reply; 11+ messages in thread
From: Andy Lutomirski @ 2020-08-05 20:30 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: Andy Lutomirski, Borislav Petkov, H. Peter Anvin,
	Borislav Petkov, Thomas Gleixner, Ingo Molnar, X86 ML,
	Peter Zijlstra (Intel),
	Dave Hansen, Tony Luck, Cathy Zhang, Fenghua Yu, Kyung Min Park,
	Ravi V. Shankar, Sean Christopherson, LKML, Ricardo Neri,
	Dave Hansen, linux-edac



> On Aug 5, 2020, at 12:11 PM, Ricardo Neri <ricardo.neri-calderon@linux.intel.com> wrote:
> 
> On Wed, Aug 05, 2020 at 11:28:31AM -0700, Andy Lutomirski wrote:
>>> On Wed, Aug 5, 2020 at 10:07 AM Ricardo Neri
>>> <ricardo.neri-calderon@linux.intel.com> wrote:
>>> 
>>> On Wed, Aug 05, 2020 at 07:08:08AM +0200, Borislav Petkov wrote:
>>>> On Tue, Aug 04, 2020 at 09:58:25PM -0700, hpa@zytor.com wrote:
>>>>> Because why use an alternative to jump over one instruction?
>>>>> 
>>>>> I personally would prefer to have the IRET put out of line
>>>> 
>>>> Can't yet - SERIALIZE CPUs are a minority at the moment.
>>>> 
>>>>> and have the call/jmp replaced by SERIALIZE inline.
>>>> 
>>>> Well, we could do:
>>>> 
>>>>      alternative_io("... IRET bunch", __ASM_SERIALIZE, X86_FEATURE_SERIALIZE, ...);
>>>> 
>>>> and avoid all kinds of jumping. Alternatives get padded so there
>>>> would be a couple of NOPs following when SERIALIZE gets patched in
>>>> but it shouldn't be a problem. I guess one needs to look at what gcc
>>>> generates...
>>> 
>>> But the IRET-TO-SELF code has instruction which modify the stack. This
>>> would violate stack invariance in alternatives as enforced in commit
>>> 7117f16bf460 ("objtool: Fix ORC vs alternatives"). As a result, objtool
>>> gives warnings as follows:
>>> 
>>> arch/x86/kernel/alternative.o: warning: objtool: do_sync_core()+0xe:
>>> alternative modifies stack
>>> 
>>> Perhaps in this specific case it does not matter as the changes in the
>>> stack will be undone by IRET. However, using alternative_io would require
>>> adding the macro STACK_FRAME_NON_STANDARD to functions using sync_core().
>>> IMHO, it wouldn't look good.
>>> 
>>> So maybe the best approach is to implement as you suggested using
>>> static_cpu_has()?
>> 
>> I agree.  Let's keep it simple.
>> 
>> Honestly, I think the right solution is to have iret_to_self() in
>> actual asm and invoke it from C as needed. 
> 
> Do you mean anything different from what we have already [1]? If I
> understand your comment correctly, we have exactly that: an
> iret_to_self() asm implementation invoked from C.

I meant asm as in a .S file. But the code we have is fine for this purpose, at least for now.

> 
> Thanks and BR,
> Ricardo
> 
> [1]. https://lore.kernel.org/lkml/20200727043132.15082-4-ricardo.neri-calderon@linux.intel.com/
> 
> Thanks and BR,
> Ricardo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH v2] x86/cpu: Use SERIALIZE in sync_core() when available
  2020-08-05 20:30             ` Andy Lutomirski
@ 2020-08-05 22:19               ` Luck, Tony
  0 siblings, 0 replies; 11+ messages in thread
From: Luck, Tony @ 2020-08-05 22:19 UTC (permalink / raw)
  To: Andy Lutomirski, Ricardo Neri
  Cc: Andy Lutomirski, Borislav Petkov, H. Peter Anvin,
	Borislav Petkov, Thomas Gleixner, Ingo Molnar, X86 ML,
	Peter Zijlstra (Intel),
	Hansen, Dave, Zhang, Cathy, Yu, Fenghua, Park, Kyung Min,
	Shankar, Ravi V, Christopherson, Sean J, LKML, Neri, Ricardo,
	Dave Hansen, linux-edac

> I meant asm as in a .S file. But the code we have is fine for this purpose, at least for now.

There seem to be some drivers that call sync_core:

drivers/misc/sgi-gru/grufault.c:                sync_core();
drivers/misc/sgi-gru/grufault.c:                sync_core();            /* make sure we are have current data */
drivers/misc/sgi-gru/gruhandles.c:      sync_core();
drivers/misc/sgi-gru/gruhandles.c:      sync_core();
drivers/misc/sgi-gru/grukservices.c:    sync_core();

So if you go this path some day be sure to EXPORT the iret_to_self() function.

-Tony

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-08-05 22:20 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-05  2:10 [PATCH v2] x86/cpu: Use SERIALIZE in sync_core() when available Ricardo Neri
2020-08-05  4:48 ` Borislav Petkov
2020-08-05  4:58   ` hpa
2020-08-05  5:08     ` Borislav Petkov
2020-08-05  5:10       ` hpa
2020-08-05 17:07       ` Ricardo Neri
2020-08-05 18:28         ` Andy Lutomirski
2020-08-05 19:11           ` Ricardo Neri
2020-08-05 20:30             ` Andy Lutomirski
2020-08-05 22:19               ` Luck, Tony
2020-08-05 15:54   ` Ricardo Neri

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).