linux-kernel.vger.kernel.org archive mirror
* [PATCH] x86/asm/entry/64: better check for canonical address
@ 2015-04-21 16:27 Denys Vlasenko
  2015-04-21 18:08 ` Andy Lutomirski
  2015-04-22 14:11 ` [tip:x86/asm] x86/asm/entry/64: Implement better check for canonical addresses tip-bot for Denys Vlasenko
  0 siblings, 2 replies; 33+ messages in thread
From: Denys Vlasenko @ 2015-04-21 16:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Denys Vlasenko, Linus Torvalds, Steven Rostedt, Borislav Petkov,
	H. Peter Anvin, Andy Lutomirski, Oleg Nesterov,
	Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook,
	x86, linux-kernel

This change makes the check exact (no more false positives
on "negative" addresses).

It isn't really important to be fully correct here -
almost all addresses we'll ever see will be userspace ones -
but the exact check turns out to be cheap enough:
the new code uses two more ALU ops but preserves %rcx,
which lets us avoid reloading it from pt_regs->cx afterwards.
At the disassembly level, the changes are:

cmp %rcx,0x80(%rsp) -> mov 0x80(%rsp),%r11; cmp %rcx,%r11
shr $0x2f,%rcx      -> shl $0x10,%rcx; sar $0x10,%rcx; cmp %rcx,%r11
mov 0x58(%rsp),%rcx -> (eliminated)
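
For reference, the same check can be written as a minimal user-space
C sketch (is_canonical() here is a hypothetical illustration, not
kernel code; it assumes __VIRTUAL_MASK_SHIFT == 47, and relies on
arithmetic right shift of signed values, which is implementation-
defined in C but sign-extends on all compilers the kernel supports):

#include <stdint.h>
#include <stdio.h>

/* An address is canonical iff bits 63..47 all equal bit 47,
 * i.e. iff a 16-bit shift-left / arithmetic-shift-right round
 * trip returns the input unchanged. */
static int is_canonical(uint64_t addr)
{
	int shift = 64 - (47 + 1);	/* 16 with __VIRTUAL_MASK_SHIFT == 47 */
	return ((int64_t)(addr << shift) >> shift) == (int64_t)addr;
}

int main(void)
{
	/* 1: highest userspace address */
	printf("%d\n", is_canonical(0x00007fffffffffffULL));
	/* 1: vsyscall page - canonical, but the old shr-based check rejected it */
	printf("%d\n", is_canonical(0xffffffffff600000ULL));
	/* 0: bit 47 set but bits 63..48 clear -> non-canonical */
	printf("%d\n", is_canonical(0x0000800000000000ULL));
	return 0;
}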

On 03/26/2015 07:45 PM, Andy Lutomirski wrote:
> I suspect that the two added ALU ops are free for all practical
> purposes, and the performance of this path isn't *that* critical.
>
> If anyone is running with vsyscall=native because they need the
> performance, then this would be a big win.  Otherwise I don't have a
> real preference.  Anyone else have any thoughts here?
>
> Let me just run through the math quickly to make sure I believe all the numbers:
>
> Canonical addresses either start with 17 zeros or 17 ones.
>
> In the old code, we checked that the top (64-47) = 17 bits were all
> zero.  We did this by shifting right by 47 bits and making sure that
> nothing was left.
>
> In the new code, we're shifting left by (64 - 48) = 16 bits and then
> signed shifting right by the same amount, thus propagating the 17th
> highest bit to all positions to its left.  If we get the same value we
> started with, then we're good to go.
>
> So it looks okay to me.
>
> IOW, the new code extends the optimization correctly to one more case
> (native vsyscalls or the really weird corner case of returns to
> emulated vsyscalls, although that should basically never happen) at
> the cost of two probably-free ALU ops.
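
To make the round trip concrete, here are two example values
(chosen for illustration):

    0xffffffffff600000  (vsyscall page: a canonical kernel address
                         that the old check rejected)
        shl $16  ->  0xffffff6000000000
        sar $16  ->  0xffffffffff600000   == original: canonical, SYSRET ok

    0x0000800000000000  (bit 47 set, bits 63..48 clear: non-canonical)
        shl $16  ->  0x8000000000000000
        sar $16  ->  0xffff800000000000   != original: fall back to IRET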

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ingo Molnar <mingo@kernel.org>
CC: Borislav Petkov <bp@alien8.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Alexei Starovoitov <ast@plumgrid.com>
CC: Will Drewry <wad@chromium.org>
CC: Kees Cook <keescook@chromium.org>
CC: x86@kernel.org
CC: linux-kernel@vger.kernel.org
---

Changes since last submission: expanded commit message with Andy's reply
as requested by Ingo.

 arch/x86/kernel/entry_64.S | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 3bdfdcd..e952f6b 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -410,26 +410,27 @@ syscall_return:
 	 * a completely clean 64-bit userspace context.
 	 */
 	movq RCX(%rsp),%rcx
-	cmpq %rcx,RIP(%rsp)		/* RCX == RIP */
+	movq RIP(%rsp),%r11
+	cmpq %rcx,%r11			/* RCX == RIP */
 	jne opportunistic_sysret_failed
 
 	/*
 	 * On Intel CPUs, SYSRET with non-canonical RCX/RIP will #GP
 	 * in kernel space.  This essentially lets the user take over
-	 * the kernel, since userspace controls RSP.  It's not worth
-	 * testing for canonicalness exactly -- this check detects any
-	 * of the 17 high bits set, which is true for non-canonical
-	 * or kernel addresses.  (This will pessimize vsyscall=native.
-	 * Big deal.)
+	 * the kernel, since userspace controls RSP.
 	 *
-	 * If virtual addresses ever become wider, this will need
+	 * If width of "canonical tail" ever becomes variable, this will need
 	 * to be updated to remain correct on both old and new CPUs.
 	 */
 	.ifne __VIRTUAL_MASK_SHIFT - 47
 	.error "virtual address width changed -- SYSRET checks need update"
 	.endif
-	shr $__VIRTUAL_MASK_SHIFT, %rcx
-	jnz opportunistic_sysret_failed
+	/* Change top 16 bits to be the sign-extension of the 47th bit */
+	shl	$(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx
+	sar	$(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx
+	/* If this changed %rcx, it was not canonical */
+	cmpq	%rcx, %r11
+	jne	opportunistic_sysret_failed
 
 	cmpq $__USER_CS,CS(%rsp)	/* CS must match SYSRET */
 	jne opportunistic_sysret_failed
@@ -466,8 +467,8 @@ syscall_return:
 	 */
 syscall_return_via_sysret:
 	CFI_REMEMBER_STATE
-	/* r11 is already restored (see code above) */
-	RESTORE_C_REGS_EXCEPT_R11
+	/* rcx and r11 are already restored (see code above) */
+	RESTORE_C_REGS_EXCEPT_RCX_R11
 	movq RSP(%rsp),%rsp
 	USERGS_SYSRET64
 	CFI_RESTORE_STATE
-- 
1.8.1.4


* [PATCH] x86/asm/entry/64: better check for canonical address
@ 2015-03-26 12:42 Denys Vlasenko
  2015-03-26 18:45 ` Andy Lutomirski
                   ` (3 more replies)
  0 siblings, 4 replies; 33+ messages in thread
From: Denys Vlasenko @ 2015-03-26 12:42 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Denys Vlasenko, Borislav Petkov, x86, linux-kernel

This change makes the check exact (no more false positives
on kernel addresses).

It isn't really important to be fully correct here -
almost all addresses we'll ever see will be userspace ones -
but the exact check turns out to be cheap enough:
the new code uses two more ALU ops but preserves %rcx,
which lets us avoid reloading it from pt_regs->cx afterwards.
At the disassembly level, the changes are:

cmp %rcx,0x80(%rsp) -> mov 0x80(%rsp),%r11; cmp %rcx,%r11
shr $0x2f,%rcx      -> shl $0x10,%rcx; sar $0x10,%rcx; cmp %rcx,%r11
mov 0x58(%rsp),%rcx -> (eliminated)

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Borislav Petkov <bp@alien8.de>
CC: x86@kernel.org
CC: linux-kernel@vger.kernel.org
---

Andy, I'm undecided myself on the merits of doing this.
If you like it, feel free to take it in your tree.
I trimmed the CC list so as not to bother too many people with this
trivial and quite possibly "useless churn"-class change.

 arch/x86/kernel/entry_64.S | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index bf9afad..a36d04d 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -688,26 +688,27 @@ retint_swapgs:		/* return to user-space */
 	 * a completely clean 64-bit userspace context.
 	 */
 	movq RCX(%rsp),%rcx
-	cmpq %rcx,RIP(%rsp)		/* RCX == RIP */
+	movq RIP(%rsp),%r11
+	cmpq %rcx,%r11			/* RCX == RIP */
 	jne opportunistic_sysret_failed
 
 	/*
 	 * On Intel CPUs, sysret with non-canonical RCX/RIP will #GP
 	 * in kernel space.  This essentially lets the user take over
-	 * the kernel, since userspace controls RSP.  It's not worth
-	 * testing for canonicalness exactly -- this check detects any
-	 * of the 17 high bits set, which is true for non-canonical
-	 * or kernel addresses.  (This will pessimize vsyscall=native.
-	 * Big deal.)
+	 * the kernel, since userspace controls RSP.
 	 *
-	 * If virtual addresses ever become wider, this will need
+	 * If width of "canonical tail" ever becomes variable, this will need
 	 * to be updated to remain correct on both old and new CPUs.
 	 */
 	.ifne __VIRTUAL_MASK_SHIFT - 47
 	.error "virtual address width changed -- sysret checks need update"
 	.endif
-	shr $__VIRTUAL_MASK_SHIFT, %rcx
-	jnz opportunistic_sysret_failed
+	/* Change top 16 bits to be a sign-extension of the rest */
+	shl	$(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx
+	sar	$(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx
+	/* If this changed %rcx, it was not canonical */
+	cmpq	%rcx, %r11
+	jne	opportunistic_sysret_failed
 
 	cmpq $__USER_CS,CS(%rsp)	/* CS must match SYSRET */
 	jne opportunistic_sysret_failed
@@ -730,8 +731,8 @@ retint_swapgs:		/* return to user-space */
 	 */
 irq_return_via_sysret:
 	CFI_REMEMBER_STATE
-	/* r11 is already restored (see code above) */
-	RESTORE_C_REGS_EXCEPT_R11
+	/* rcx and r11 are already restored (see code above) */
+	RESTORE_C_REGS_EXCEPT_RCX_R11
 	movq RSP(%rsp),%rsp
 	USERGS_SYSRET64
 	CFI_RESTORE_STATE
-- 
1.8.1.4



Thread overview: 33+ messages
2015-04-21 16:27 [PATCH] x86/asm/entry/64: better check for canonical address Denys Vlasenko
2015-04-21 18:08 ` Andy Lutomirski
2015-04-23 15:10   ` Borislav Petkov
2015-04-23 15:41     ` Andy Lutomirski
2015-04-23 15:49       ` Borislav Petkov
2015-04-23 15:52         ` Andy Lutomirski
2015-04-22 14:11 ` [tip:x86/asm] x86/asm/entry/64: Implement better check for canonical addresses tip-bot for Denys Vlasenko
  -- strict thread matches above, loose matches on Subject: below --
2015-03-26 12:42 [PATCH] x86/asm/entry/64: better check for canonical address Denys Vlasenko
2015-03-26 18:45 ` Andy Lutomirski
2015-03-27  8:57   ` Borislav Petkov
2015-03-30 14:27   ` Denys Vlasenko
2015-03-30 14:30     ` Andy Lutomirski
2015-03-30 14:45       ` Andy Lutomirski
2015-03-27  8:11 ` Ingo Molnar
2015-03-27 10:45   ` Denys Vlasenko
2015-03-27 11:17     ` Ingo Molnar
2015-03-27 11:28       ` Brian Gerst
2015-03-27 11:34         ` Ingo Molnar
2015-03-27 12:14           ` Denys Vlasenko
2015-03-27 12:16             ` Ingo Molnar
2015-03-27 12:31               ` Denys Vlasenko
2015-03-28  9:11                 ` Ingo Molnar
2015-03-29 19:36                   ` Denys Vlasenko
2015-03-29 21:12                     ` Andy Lutomirski
2015-03-29 21:46                       ` Denys Vlasenko
2015-03-31 16:43                     ` Ingo Molnar
2015-03-31 17:08                       ` Andy Lutomirski
2015-03-31 17:31                         ` Denys Vlasenko
2015-03-27 11:27 ` Brian Gerst
2015-03-27 11:31   ` Ingo Molnar
2015-03-27 21:37     ` Andy Lutomirski
2015-04-02 17:37 ` Denys Vlasenko
2015-04-02 18:10   ` Ingo Molnar
