linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: hpa@zytor.com
To: Nadav Amit <namit@vmware.com>, Ingo Molnar <mingo@redhat.com>,
	Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	Edward Cree <ecree@solarflare.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>,
	Nadav Amit <nadav.amit@gmail.com>, X86 ML <x86@kernel.org>,
	Paolo Abeni <pabeni@redhat.com>, Borislav Petkov <bp@alien8.de>,
	David Woodhouse <dwmw@amazon.co.uk>
Subject: Re: [RFC v2 1/6] x86: introduce kernel restartable sequence
Date: Thu, 03 Jan 2019 16:34:58 -0800	[thread overview]
Message-ID: <7C07ACBD-A269-4F00-A3FD-2041B27146D4@zytor.com> (raw)
In-Reply-To: <20181231072112.21051-2-namit@vmware.com>

On December 30, 2018 11:21:07 PM PST, Nadav Amit <namit@vmware.com> wrote:
>It is sometimes beneficial to have a restartable sequence - very few
>instructions which if they are preempted jump to a predefined point.
>
>To provide such functionality on x86-64, we use an empty REX-prefix
>(opcode 0x40) as an indication for instruction in such a sequence.
>Before
>calling the schedule IRQ routine, if the "magic" prefix is found, we
>call a routine to adjust the instruction pointer.  It is expected that
>this opcode is not in common use.
>
>The following patch will make use of this function. Since there are no
>other users (yet?), the patch does not bother to create a general
>infrastructure and API that others can use for such sequences. Yet, it
>should not be hard to make such extension later.
>
>Signed-off-by: Nadav Amit <namit@vmware.com>
>---
> arch/x86/entry/entry_64.S            | 16 ++++++++++++++--
> arch/x86/include/asm/nospec-branch.h | 12 ++++++++++++
> arch/x86/kernel/traps.c              |  7 +++++++
> 3 files changed, 33 insertions(+), 2 deletions(-)
>
>diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
>index 1f0efdb7b629..e144ff8b914f 100644
>--- a/arch/x86/entry/entry_64.S
>+++ b/arch/x86/entry/entry_64.S
>@@ -644,12 +644,24 @@ retint_kernel:
> 	/* Interrupts are off */
> 	/* Check if we need preemption */
> 	btl	$9, EFLAGS(%rsp)		/* were interrupts off? */
>-	jnc	1f
>+	jnc	2f
> 0:	cmpl	$0, PER_CPU_VAR(__preempt_count)
>+	jnz	2f
>+
>+	/*
>+	 * Allow to use restartable code sections in the kernel. Consider an
>+	 * instruction with the first byte having REX prefix without any bits
>+	 * set as an indication for an instruction in such a section.
>+	 */
>+	movq    RIP(%rsp), %rax
>+	cmpb    $KERNEL_RESTARTABLE_PREFIX, (%rax)
> 	jnz	1f
>+	mov	%rsp, %rdi
>+	call	restart_kernel_rseq
>+1:
> 	call	preempt_schedule_irq
> 	jmp	0b
>-1:
>+2:
> #endif
> 	/*
> 	 * The iretq could re-enable interrupts:
>diff --git a/arch/x86/include/asm/nospec-branch.h
>b/arch/x86/include/asm/nospec-branch.h
>index dad12b767ba0..be4713ef0940 100644
>--- a/arch/x86/include/asm/nospec-branch.h
>+++ b/arch/x86/include/asm/nospec-branch.h
>@@ -54,6 +54,12 @@
> 	jnz	771b;				\
> 	add	$(BITS_PER_LONG/8) * nr, sp;
> 
>+/*
>+ * An empty REX-prefix is an indication that this instruction is part
>of kernel
>+ * restartable sequence.
>+ */
>+#define KERNEL_RESTARTABLE_PREFIX		(0x40)
>+
> #ifdef __ASSEMBLY__
> 
> /*
>@@ -150,6 +156,12 @@
> #endif
> .endm
> 
>+.macro restartable_seq_prefix
>+#ifdef CONFIG_PREEMPT
>+	.byte	KERNEL_RESTARTABLE_PREFIX
>+#endif
>+.endm
>+
> #else /* __ASSEMBLY__ */
> 
> #define ANNOTATE_NOSPEC_ALTERNATIVE				\
>diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
>index 85cccadb9a65..b1e855bad5ac 100644
>--- a/arch/x86/kernel/traps.c
>+++ b/arch/x86/kernel/traps.c
>@@ -59,6 +59,7 @@
> #include <asm/fpu/xstate.h>
> #include <asm/vm86.h>
> #include <asm/umip.h>
>+#include <asm/nospec-branch.h>
> 
> #ifdef CONFIG_X86_64
> #include <asm/x86_init.h>
>@@ -186,6 +187,12 @@ int fixup_bug(struct pt_regs *regs, int trapnr)
> 	return 0;
> }
> 
>+asmlinkage __visible void restart_kernel_rseq(struct pt_regs *regs)
>+{
>+	if (user_mode(regs) || *(u8 *)regs->ip != KERNEL_RESTARTABLE_PREFIX)
>+		return;
>+}
>+
> static nokprobe_inline int
>do_trap_no_signal(struct task_struct *tsk, int trapnr, const char *str,
> 		  struct pt_regs *regs,	long error_code)

A 0x40 prefix is *not* a noop. It changes the interpretation of byte registers 4 though 7 from ah, ch, dh, bh to spl, bpl, sil and dil.

It may not matter in your application but:

a. You need to clarify that so is the case, and why;
b. Phrase it differently so others don't propagate the same misunderstanding.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

  parent reply	other threads:[~2019-01-04  0:35 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-31  7:21 [RFC v2 0/6] x86: dynamic indirect branch promotion Nadav Amit
2018-12-31  7:21 ` [RFC v2 1/6] x86: introduce kernel restartable sequence Nadav Amit
2018-12-31 20:08   ` Andy Lutomirski
2018-12-31 21:12     ` Nadav Amit
2019-01-03 22:21   ` Andi Kleen
2019-01-03 22:29     ` Nadav Amit
2019-01-03 22:48       ` Andi Kleen
2019-01-03 22:52         ` Nadav Amit
2019-01-03 23:40           ` Andi Kleen
2019-01-03 23:56             ` Nadav Amit
2019-01-04  0:34   ` hpa [this message]
2018-12-31  7:21 ` [RFC v2 2/6] objtool: ignore instructions Nadav Amit
2018-12-31  7:21 ` [RFC v2 3/6] x86: patch indirect branch promotion Nadav Amit
2018-12-31  7:21 ` [RFC v2 4/6] x86: interface for accessing indirect branch locations Nadav Amit
2018-12-31  7:21 ` [RFC v2 5/6] x86: learning and patching indirect branch targets Nadav Amit
2018-12-31 20:05   ` Andy Lutomirski
2018-12-31 21:07     ` Nadav Amit
2018-12-31  7:21 ` [RFC v2 6/6] x86: outline optpoline Nadav Amit
2018-12-31 19:51 ` [RFC v2 0/6] x86: dynamic indirect branch promotion Andy Lutomirski
2018-12-31 19:53   ` Nadav Amit
2019-01-03 18:10     ` Josh Poimboeuf
2019-01-03 18:30       ` Nadav Amit
2019-01-03 20:31         ` Josh Poimboeuf
2019-01-03 22:18 ` Andi Kleen
2019-01-07 16:32   ` Peter Zijlstra
2019-01-08  7:47     ` Adrian Hunter
2019-01-08  9:25       ` Peter Zijlstra
2019-01-08 10:01         ` Adrian Hunter
2019-01-08 10:10           ` Peter Zijlstra
2019-01-08 17:27             ` Andi Kleen
2019-01-08 18:28               ` Nadav Amit
2019-01-08 19:01                 ` Peter Zijlstra
2019-01-08 20:47                   ` Nadav Amit
2019-01-08 20:53                     ` Andi Kleen
2019-01-09 10:35                     ` Peter Zijlstra
2019-08-29  8:23                       ` Tracing text poke / kernel self-modifying code (Was: Re: [RFC v2 0/6] x86: dynamic indirect branch promotion) Adrian Hunter
2019-08-29  8:53                         ` Peter Zijlstra
2019-08-29  9:40                           ` Adrian Hunter
2019-08-29 11:46                             ` Peter Zijlstra
2019-09-12  7:00                               ` Adrian Hunter
2019-09-12 12:17                                 ` hpa
2019-01-08 18:57               ` [RFC v2 0/6] x86: dynamic indirect branch promotion Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7C07ACBD-A269-4F00-A3FD-2041B27146D4@zytor.com \
    --to=hpa@zytor.com \
    --cc=bp@alien8.de \
    --cc=dwmw@amazon.co.uk \
    --cc=ecree@solarflare.com \
    --cc=jpoimboe@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=nadav.amit@gmail.com \
    --cc=namit@vmware.com \
    --cc=pabeni@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).