All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH urgent v2] x86, asm: Disable opportunistic SYSRET if regs->flags has TF set
@ 2015-04-01 21:26 Andy Lutomirski
  2015-04-02  6:21 ` Borislav Petkov
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Andy Lutomirski @ 2015-04-01 21:26 UTC (permalink / raw)
  To: Ingo Molnar, x86, linux-kernel
  Cc: Borislav Petkov, Denys Vlasenko, Andy Lutomirski

When I wrote the opportunistic SYSRET code, I missed an important
difference between SYSRET and IRET.  Both instructions are capable
of setting EFLAGS.TF, but they behave differently when doing so.
IRET will not issue a #DB trap after execution when it sets TF This
is critical -- otherwise you'd never be able to make forward
progress when returning to userspace.  SYSRET, on the other hand,
will trap with #DB immediately after returning to CPL3, and the next
instruction will never execute.

This breaks anything that opportunistically SYSRETs to a user
context with TF set.  For example, running this code with TF set and
a SIGTRAP handler loaded never gets past post_nop.

	extern unsigned char post_nop[];
	asm volatile ("pushfq\n\t"
		      "popq %%r11\n\t"
		      "nop\n\t"
		      "post_nop:"
		      : : "c" (post_nop) : "r11");

In my defense, I can't find this documented in the AMD or Intel
manual.

Fix it by using IRET to restore TF.

Fixes: 2a23c6b8a9c4 x86_64, entry: Use sysret to return to userspace when possible
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---

This affects 4.0-rc as well as -tip.  A full test case lives here:

https://git.kernel.org/cgit/linux/kernel/git/luto/misc-tests.git/

It's called single_step_syscall_64.

On Intel systems, the 32-bit version of that test fails for unrelated
reasons, but that's not a regression, and fixing it will be much more
intrusive.

Changes from v1:
 - Remove mention of testl from changelog.
 - Improve comment per Denys' suggestion.

arch/x86/kernel/entry_64.S | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 750c6efcb718..537716380959 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -715,7 +715,21 @@ retint_swapgs:		/* return to user-space */
 	cmpq %r11,EFLAGS(%rsp)		/* R11 == RFLAGS */
 	jne opportunistic_sysret_failed
 
-	testq $X86_EFLAGS_RF,%r11	/* sysret can't restore RF */
+	/*
+	 * SYSRET can't restore RF.  SYSRET can restore TF, but unlike IRET,
+	 * restoring TF results in a trap from userspace immediately after
+	 * SYSRET.  This would cause an infinite loop whenever #DB happens
+	 * with register state that satisfies the opportunistic SYSRET
+	 * conditions.  For example, single-stepping this user code:
+	 *
+	 *           movq $stuck_here,%rcx
+	 *           pushfq
+	 *           popq %r11
+	 *   stuck_here:
+	 *
+	 * would never get past stuck_here.
+	 */
+	testq $(X86_EFLAGS_RF|X86_EFLAGS_TF),%r11
 	jnz opportunistic_sysret_failed
 
 	/* nothing to check for RSP */
-- 
2.3.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2015-04-02 16:08 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-01 21:26 [PATCH urgent v2] x86, asm: Disable opportunistic SYSRET if regs->flags has TF set Andy Lutomirski
2015-04-02  6:21 ` Borislav Petkov
2015-04-02  9:07 ` Ingo Molnar
2015-04-02 10:07   ` Denys Vlasenko
2015-04-02 10:37     ` Ingo Molnar
2015-04-02 11:14       ` Brian Gerst
2015-04-02 12:24         ` Denys Vlasenko
2015-04-02 12:31           ` Ingo Molnar
2015-04-02 12:59             ` Denys Vlasenko
2015-04-02 15:49               ` Denys Vlasenko
2015-04-02 16:08                 ` Ingo Molnar
2015-04-02 14:26             ` Andy Lutomirski
2015-04-02 12:32 ` [tip:x86/urgent] x86/asm/entry/64: " tip-bot for Andy Lutomirski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.