From: tip-bot for Andy Lutomirski <tipbot@zytor.com>
To: linux-tip-commits@vger.kernel.org
Cc: jgross@suse.com, gnomes@lxorguk.ukuu.org.uk, hmh@hmh.eng.br,
	peterz@infradead.org, brgerst@gmail.com, hpa@zytor.com,
	linux-kernel@vger.kernel.org, tedheadster@gmail.com,
	mingo@kernel.org, bp@alien8.de, luto@kernel.org,
	andrew.cooper3@citrix.com, tglx@linutronix.de,
	Xen-devel@lists.xen.org, boris.ostrovsky@oracle.com
Subject: [tip:x86/urgent] x86/asm: Rewrite sync_core() to use IRET-to-self
Date: Mon, 19 Dec 2016 03:05:41 -0800
Message-ID: <tip-c198b121b1a1d7a7171770c634cd49191bac4477__17624.7133910239$1482145689$gmane$org@git.kernel.org>
In-Reply-To: <5c79f0225f68bc8c40335612bf624511abb78941.1481307769.git.luto@kernel.org>

Commit-ID:  c198b121b1a1d7a7171770c634cd49191bac4477
Gitweb:     http://git.kernel.org/tip/c198b121b1a1d7a7171770c634cd49191bac4477
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 9 Dec 2016 10:24:08 -0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 19 Dec 2016 11:54:21 +0100

x86/asm: Rewrite sync_core() to use IRET-to-self

Aside from being excessively slow, CPUID is problematic: Linux runs
on a handful of CPUs that don't have CPUID.  Use IRET-to-self
instead.  IRET-to-self works everywhere, so it makes testing easy.

For reference, on my laptop, IRET-to-self is ~110ns,
CPUID(eax=1, ecx=0) is ~83ns on native and very very slow under KVM,
and MOV-to-CR2 is ~42ns.

While we're at it: sync_core() serves a very specific purpose.
Document it.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Matthew Whitehead <tedheadster@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: xen-devel <Xen-devel@lists.xen.org>
Link: http://lkml.kernel.org/r/5c79f0225f68bc8c40335612bf624511abb78941.1481307769.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/x86/include/asm/processor.h | 80 +++++++++++++++++++++++++++++-----------
 1 file changed, 58 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index b934871..eaf1005 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -602,33 +602,69 @@ static __always_inline void cpu_relax(void)
 	rep_nop();
 }
 
-/* Stop speculative execution and prefetching of modified code. */
+/*
+ * This function forces the icache and prefetched instruction stream to
+ * catch up with reality in two very specific cases:
+ *
+ *  a) Text was modified using one virtual address and is about to be executed
+ *     from the same physical page at a different virtual address.
+ *
+ *  b) Text was modified on a different CPU, may subsequently be
+ *     executed on this CPU, and you want to make sure the new version
+ *     gets executed.  This generally means you're calling this in a IPI.
+ *
+ * If you're calling this for a different reason, you're probably doing
+ * it wrong.
+ */
 static inline void sync_core(void)
 {
-	int tmp;
-
-#ifdef CONFIG_X86_32
 	/*
-	 * Do a CPUID if available, otherwise do a jump.  The jump
-	 * can conveniently enough be the jump around CPUID.
+	 * There are quite a few ways to do this.  IRET-to-self is nice
+	 * because it works on every CPU, at any CPL (so it's compatible
+	 * with paravirtualization), and it never exits to a hypervisor.
+	 * The only down sides are that it's a bit slow (it seems to be
+	 * a bit more than 2x slower than the fastest options) and that
+	 * it unmasks NMIs.  The "push %cs" is needed because, in
+	 * paravirtual environments, __KERNEL_CS may not be a valid CS
+	 * value when we do IRET directly.
+	 *
+	 * In case NMI unmasking or performance ever becomes a problem,
+	 * the next best option appears to be MOV-to-CR2 and an
+	 * unconditional jump.  That sequence also works on all CPUs,
+	 * but it will fault at CPL3 (i.e. Xen PV and lguest).
+	 *
+	 * CPUID is the conventional way, but it's nasty: it doesn't
+	 * exist on some 486-like CPUs, and it usually exits to a
+	 * hypervisor.
+	 *
+	 * Like all of Linux's memory ordering operations, this is a
+	 * compiler barrier as well.
 	 */
-	asm volatile("cmpl %2,%1\n\t"
-		     "jl 1f\n\t"
-		     "cpuid\n"
-		     "1:"
-		     : "=a" (tmp)
-		     : "rm" (boot_cpu_data.cpuid_level), "ri" (0), "0" (1)
-		     : "ebx", "ecx", "edx", "memory");
+	register void *__sp asm(_ASM_SP);
+
+#ifdef CONFIG_X86_32
+	asm volatile (
+		"pushfl\n\t"
+		"pushl %%cs\n\t"
+		"pushl $1f\n\t"
+		"iret\n\t"
+		"1:"
+		: "+r" (__sp) : : "memory");
 #else
-	/*
-	 * CPUID is a barrier to speculative execution.
-	 * Prefetched instructions are automatically
-	 * invalidated when modified.
-	 */
-	asm volatile("cpuid"
-		     : "=a" (tmp)
-		     : "0" (1)
-		     : "ebx", "ecx", "edx", "memory");
+	unsigned int tmp;
+
+	asm volatile (
+		"mov %%ss, %0\n\t"
+		"pushq %q0\n\t"
+		"pushq %%rsp\n\t"
+		"addq $8, (%%rsp)\n\t"
+		"pushfq\n\t"
+		"mov %%cs, %0\n\t"
+		"pushq %q0\n\t"
+		"pushq $1f\n\t"
+		"iretq\n\t"
+		"1:"
+		: "=&r" (tmp), "+r" (__sp) : : "cc", "memory");
 #endif
 }
 

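The 64-bit sequence in the patch builds a five-slot IRET frame by hand,
and the "addq $8, (%rsp)" step is easy to misread.  "pushq %rsp" stores
the stack pointer as it was just before that push, which is already 8
bytes below the starting value because SS has been pushed.  Adding 8 to
the stored slot makes the RSP image equal the starting stack pointer, so
IRETQ resumes at label 1 with the stack where it began.  The following
is a minimal user-space sketch of that bookkeeping in plain C, with no
inline asm.  All names in it (sim_stack, sim_push, sim_at, iret_frame,
sim_iret_to_self) are hypothetical and are not kernel interfaces; it
models only the stack layout, not the serializing effect of IRET.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_SLOTS 8

/* Hypothetical model of a downward-growing 64-bit stack; not kernel code. */
struct sim_stack {
	uint64_t slot[SIM_SLOTS];
	uint64_t top;	/* simulated address just above slot[SIM_SLOTS - 1] */
	uint64_t rsp;	/* simulated stack pointer */
};

/* Values IRETQ pops in 64-bit mode, lowest address first. */
struct iret_frame {
	uint64_t rip, cs, rflags, rsp, ss;
};

/* Map a simulated stack address to its backing slot. */
static uint64_t *sim_at(struct sim_stack *s, uint64_t addr)
{
	uint64_t idx = SIM_SLOTS - (s->top - addr) / 8;

	assert(addr < s->top && idx < SIM_SLOTS);
	return &s->slot[idx];
}

/* Model of pushq: decrement the stack pointer, then store. */
static void sim_push(struct sim_stack *s, uint64_t val)
{
	s->rsp -= 8;
	*sim_at(s, s->rsp) = val;
}

/*
 * Replay the pushes from the 64-bit branch of the patch and return the
 * frame that IRETQ would consume.  The selector, flags and resume
 * address are caller-supplied placeholders.
 */
static struct iret_frame sim_iret_to_self(uint64_t initial_rsp, uint64_t ss,
					   uint64_t cs, uint64_t rflags,
					   uint64_t resume_rip)
{
	struct sim_stack s = { .top = initial_rsp, .rsp = initial_rsp };
	struct iret_frame f;
	uint64_t pre_push_rsp;

	sim_push(&s, ss);		/* mov %ss, tmp; pushq tmp */
	pre_push_rsp = s.rsp;		/* pushq %rsp stores the pre-push value */
	sim_push(&s, pre_push_rsp);
	*sim_at(&s, s.rsp) += 8;	/* addq $8, (%rsp) */
	sim_push(&s, rflags);		/* pushfq */
	sim_push(&s, cs);		/* mov %cs, tmp; pushq tmp */
	sim_push(&s, resume_rip);	/* pushq $1f */

	/* IRETQ pops RIP, CS, RFLAGS, RSP and SS in that order. */
	f.rip    = *sim_at(&s, s.rsp);
	f.cs     = *sim_at(&s, s.rsp + 8);
	f.rflags = *sim_at(&s, s.rsp + 16);
	f.rsp    = *sim_at(&s, s.rsp + 24);
	f.ss     = *sim_at(&s, s.rsp + 32);
	return f;
}
```

Calling sim_iret_to_self() with any starting stack pointer of at least
64 returns a frame whose rsp field equals that starting value.  Without
the "+= 8" step the field would be 8 bytes lower, and every call would
leave the stack pointer 8 bytes below where it started.
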