LKML Archive on lore.kernel.org
* [PATCH] x86/retpoline: Optimize inline assembler for vmexit_fill_RSB
@ 2018-01-17 22:53 Andi Kleen
  2018-01-18 12:41 ` David Woodhouse
  2018-01-19 15:49 ` [tip:x86/pti] " tip-bot for Andi Kleen
  0 siblings, 2 replies; 3+ messages in thread
From: Andi Kleen @ 2018-01-17 22:53 UTC
  To: tglx; +Cc: dwmw, linux-kernel, gregkh, torvalds, arjan, dave.hansen, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

I was looking at the generated assembler for the C fill RSB
inline asm operations, and noticed several issues:

- The C code sets up the loop register, which
is then immediately overwritten in __FILL_RETURN_BUFFER
with the same value again.

- The C code also passes in the iteration count
in another register, which is not used at all.

Remove these two unnecessary operations. Just rely on
the single constant passed to the macro for the iterations.

This eliminates several instructions and avoids unnecessarily
clobbering a register.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 arch/x86/include/asm/nospec-branch.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 1e170fd3dc51..fed8703a28b9 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -204,15 +204,15 @@ enum spectre_v2_mitigation {
 static inline void vmexit_fill_RSB(void)
 {
 #ifdef CONFIG_RETPOLINE
-	unsigned long loops = RSB_CLEAR_LOOPS / 2;
+	unsigned long loops;
 
 	asm volatile (ANNOTATE_NOSPEC_ALTERNATIVE
 		      ALTERNATIVE("jmp 910f",
 				  __stringify(__FILL_RETURN_BUFFER(%0, RSB_CLEAR_LOOPS, %1)),
 				  X86_FEATURE_RETPOLINE)
 		      "910:"
-		      : "=&r" (loops), ASM_CALL_CONSTRAINT
-		      : "r" (loops) : "memory" );
+		      : "=r" (loops), ASM_CALL_CONSTRAINT
+		      : : "memory" );
 #endif
 }
 
-- 
2.14.3
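
The constraint change in the hunk above can be tried outside the kernel. What follows is a minimal user-space sketch, not the kernel code: DEMO_CLEAR_LOOPS, run_old_shape() and run_new_shape() are hypothetical names, the value 32 is an assumption, and a plain counting loop stands in for __FILL_RETURN_BUFFER. On targets other than x86-64 it falls back to an ordinary C loop.

```c
/* Hypothetical stand-in for RSB_CLEAR_LOOPS; 32 is an assumed value. */
#define DEMO_CLEAR_LOOPS 32

/*
 * Old shape: the C code initializes `loops` and also passes it in as an
 * input, but the asm body loads the counter itself, so both are wasted.
 */
static unsigned long run_old_shape(void)
{
	unsigned long loops = DEMO_CLEAR_LOOPS / 2;
	unsigned long done = 0;

#if defined(__x86_64__)
	__asm__ volatile("mov %[n], %[cnt]\n\t"	/* counter set inside the asm */
			 "1:\n\t"
			 "inc %[done]\n\t"
			 "dec %[cnt]\n\t"
			 "jnz 1b"
			 : [cnt] "=&r" (loops), [done] "+r" (done)
			 : [n] "i" (DEMO_CLEAR_LOOPS / 2), [unused] "r" (loops)
			 : "cc");
#else
	for (; loops; loops--)
		done++;
#endif
	return done;
}

/*
 * New shape: `loops` is only a scratch output. With no register input left
 * that it could overlap, the early-clobber "&" is dropped as in the patch.
 */
static unsigned long run_new_shape(void)
{
	unsigned long loops;
	unsigned long done = 0;

#if defined(__x86_64__)
	__asm__ volatile("mov %[n], %[cnt]\n\t"
			 "1:\n\t"
			 "inc %[done]\n\t"
			 "dec %[cnt]\n\t"
			 "jnz 1b"
			 : [cnt] "=r" (loops), [done] "+r" (done)
			 : [n] "i" (DEMO_CLEAR_LOOPS / 2)
			 : "cc");
#else
	for (loops = DEMO_CLEAR_LOOPS / 2; loops; loops--)
		done++;
#endif
	return done;
}
```

Both functions make DEMO_CLEAR_LOOPS / 2 trips through the loop. The second one just no longer asks the compiler to load `loops` beforehand or to reserve a second register for an input the asm never reads.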

* Re: [PATCH] x86/retpoline: Optimize inline assembler for vmexit_fill_RSB
  2018-01-17 22:53 [PATCH] x86/retpoline: Optimize inline assembler for vmexit_fill_RSB Andi Kleen
@ 2018-01-18 12:41 ` David Woodhouse
  2018-01-19 15:49 ` [tip:x86/pti] " tip-bot for Andi Kleen
  1 sibling, 0 replies; 3+ messages in thread
From: David Woodhouse @ 2018-01-18 12:41 UTC
  To: Andi Kleen, tglx
  Cc: linux-kernel, gregkh, torvalds, arjan, dave.hansen, Andi Kleen

On Wed, 2018-01-17 at 14:53 -0800, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
> 
> I was looking at the generated assembler for the C fill RSB
> inline asm operations, and noticed several issues:
> 
> - The C code sets up the loop register, which
> is then immediately overwritten in __FILL_RETURN_BUFFER
> with the same value again.
> 
> - The C code also passes in the iteration count
> in another register, which is not used at all.
> 
> Remove these two unnecessary operations. Just rely on
> the single constant passed to the macro for the iterations.
> 
> This eliminates several instructions and avoids unnecessarily
> clobbering a register.
> 
> Signed-off-by: Andi Kleen <ak@linux.intel.com>

We still clobber the register, but you're right that it's now filled in
by the __FILL_RETURN_BUFFER macro itself. It was an earlier revision of
the code which had the loop count passed in.

Acked-by: David Woodhouse <dwmw@amazon.co.uk>
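
Since the count now comes only from the constant handed to the macro, the arithmetic can be modelled in plain C. This is a rough sketch under stated assumptions: the thread only confirms that the counter is loaded with RSB_CLEAR_LOOPS / 2. The value 32, the two calls per loop trip, and the one machine word of stack per call are assumptions, and DEMO_CLEAR_LOOPS, struct fill_model and model_fill() are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical stand-in for RSB_CLEAR_LOOPS; 32 is an assumed value. */
#define DEMO_CLEAR_LOOPS 32

struct fill_model {
	unsigned int iterations;	/* trips through the loop */
	unsigned int calls;		/* return addresses pushed */
	size_t stack_bytes;		/* stack to hand back afterwards */
};

/* Model a fill sequence that is given only the constant `nr`. */
static struct fill_model model_fill(unsigned int nr)
{
	struct fill_model m = { 0, 0, 0 };
	unsigned int reg = nr / 2;	/* counter loaded inside the macro */

	while (reg) {
		m.calls += 2;		/* assumed: two calls per trip */
		m.iterations++;
		reg--;
	}
	/* assumed: each call leaves one machine word on the stack */
	m.stack_bytes = (size_t)m.calls * sizeof(unsigned long);
	return m;
}
```

Calling model_fill(DEMO_CLEAR_LOOPS) gives 16 trips and 32 modelled calls, which is why the caller has nothing useful to contribute beyond the constant.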

* [tip:x86/pti] x86/retpoline: Optimize inline assembler for vmexit_fill_RSB
  2018-01-17 22:53 [PATCH] x86/retpoline: Optimize inline assembler for vmexit_fill_RSB Andi Kleen
  2018-01-18 12:41 ` David Woodhouse
@ 2018-01-19 15:49 ` " tip-bot for Andi Kleen
  1 sibling, 0 replies; 3+ messages in thread
From: tip-bot for Andi Kleen @ 2018-01-19 15:49 UTC
  To: linux-tip-commits; +Cc: hpa, mingo, ak, tglx, dwmw, linux-kernel

Commit-ID:  3f7d875566d8e79c5e0b2c9a413e91b2c29e0854
Gitweb:     https://git.kernel.org/tip/3f7d875566d8e79c5e0b2c9a413e91b2c29e0854
Author:     Andi Kleen <ak@linux.intel.com>
AuthorDate: Wed, 17 Jan 2018 14:53:28 -0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Fri, 19 Jan 2018 16:31:30 +0100

x86/retpoline: Optimize inline assembler for vmexit_fill_RSB

The generated assembler for the C fill RSB inline asm operations has
several issues:

- The C code sets up the loop register, which is then immediately
  overwritten in __FILL_RETURN_BUFFER with the same value again.

- The C code also passes in the iteration count in another register, which
  is not used at all.

Remove these two unnecessary operations. Just rely on the single constant
passed to the macro for the iterations.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Cc: dave.hansen@intel.com
Cc: gregkh@linuxfoundation.org
Cc: torvalds@linux-foundation.org
Cc: arjan@linux.intel.com
Link: https://lkml.kernel.org/r/20180117225328.15414-1-andi@firstfloor.org

---
 arch/x86/include/asm/nospec-branch.h | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 19ba5ad..4ad4108 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -206,16 +206,17 @@ extern char __indirect_thunk_end[];
 static inline void vmexit_fill_RSB(void)
 {
 #ifdef CONFIG_RETPOLINE
-	unsigned long loops = RSB_CLEAR_LOOPS / 2;
+	unsigned long loops;
 
 	asm volatile (ANNOTATE_NOSPEC_ALTERNATIVE
 		      ALTERNATIVE("jmp 910f",
 				  __stringify(__FILL_RETURN_BUFFER(%0, RSB_CLEAR_LOOPS, %1)),
 				  X86_FEATURE_RETPOLINE)
 		      "910:"
-		      : "=&r" (loops), ASM_CALL_CONSTRAINT
-		      : "r" (loops) : "memory" );
+		      : "=r" (loops), ASM_CALL_CONSTRAINT
+		      : : "memory" );
 #endif
 }
+
 #endif /* __ASSEMBLY__ */
 #endif /* __NOSPEC_BRANCH_H__ */
