All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements
@ 2016-12-09 18:24 Andy Lutomirski
  2016-12-09 18:24 ` [PATCH v4 1/4] x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels Andy Lutomirski
                   ` (9 more replies)
  0 siblings, 10 replies; 20+ messages in thread
From: Andy Lutomirski @ 2016-12-09 18:24 UTC (permalink / raw)
  To: x86
  Cc: One Thousand Gnomes, Borislav Petkov, linux-kernel, Brian Gerst,
	Matthew Whitehead, Henrique de Moraes Holschuh, Peter Zijlstra,
	xen-devel, Juergen Gross, Boris Ostrovsky, Andrew Cooper,
	Andy Lutomirski

*** PATCHES 1 and 2 MAY BE 4.9 MATERIAL ***

Alan Cox pointed out that the 486 isn't the only supported CPU that
doesn't have CPUID.  Let's clean up the mess and make everything
faster while we're at it.

Patch 1 is intended to be an easy fix: it makes sync_core() work
without CPUID on all 32-bit kernels.  It should be quite safe.  This
will have a negligible performance cost during boot on kernels built
for newer CPUs.  With this in place, patch 2 reverts the buggy 486
check I added.

Patches 3-4 are meant to improve the situation.  Patch 3 cleans up
the Intel microcode loader and the patch 4 (which depends on patch 3
to work correctly) stops using CPUID in sync_core() altogether.

Changes from v3:
 - Improve sync_core() comments.
 - Tidy up sync_core() asm.

Changes from v2:
 - Switch to IRET-to-self and get rid of all the paravirt code.
 - Further immprove the sync_core() comment.

Changes from v1:
 - Fix Xen
 - Add timing info to the changelog (hint: 2x speedup)
 - Document patch 1 a bit better.

Andy Lutomirski (4):
  x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit
    kernels
  Revert "x86/boot: Fail the boot if !M486 and CPUID is missing"
  x86/microcode/intel: Replace sync_core() with native_cpuid()
  x86/asm: Rewrite sync_core() to use IRET-to-self

 arch/x86/boot/cpu.c                   |  6 ---
 arch/x86/include/asm/processor.h      | 80 +++++++++++++++++++++++++----------
 arch/x86/kernel/cpu/microcode/intel.c | 26 ++++++++++--
 3 files changed, 81 insertions(+), 31 deletions(-)

-- 
2.9.3

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v4 1/4] x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels
  2016-12-09 18:24 [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements Andy Lutomirski
@ 2016-12-09 18:24 ` Andy Lutomirski
  2016-12-19 11:03   ` [tip:x86/urgent] " tip-bot for Andy Lutomirski
  2016-12-19 11:03   ` tip-bot for Andy Lutomirski
  2016-12-09 18:24 ` [PATCH v4 1/4] " Andy Lutomirski
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 20+ messages in thread
From: Andy Lutomirski @ 2016-12-09 18:24 UTC (permalink / raw)
  To: x86
  Cc: One Thousand Gnomes, Borislav Petkov, linux-kernel, Brian Gerst,
	Matthew Whitehead, Henrique de Moraes Holschuh, Peter Zijlstra,
	xen-devel, Juergen Gross, Boris Ostrovsky, Andrew Cooper,
	Andy Lutomirski

We support various non-Intel CPUs that don't have the CPUID
instruction, so the M486 test was wrong.  For now, fix it with a big
hammer: handle missing CPUID on all 32-bit CPUs.

Reported-by: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/processor.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 984a7bf17f6a..64fbc937d586 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -595,7 +595,7 @@ static inline void sync_core(void)
 {
 	int tmp;
 
-#ifdef CONFIG_M486
+#ifdef CONFIG_X86_32
 	/*
 	 * Do a CPUID if available, otherwise do a jump.  The jump
 	 * can conveniently enough be the jump around CPUID.
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v4 1/4] x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels
  2016-12-09 18:24 [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements Andy Lutomirski
  2016-12-09 18:24 ` [PATCH v4 1/4] x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels Andy Lutomirski
@ 2016-12-09 18:24 ` Andy Lutomirski
  2016-12-09 18:24 ` [PATCH v4 2/4] Revert "x86/boot: Fail the boot if !M486 and CPUID is missing" Andy Lutomirski
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Andy Lutomirski @ 2016-12-09 18:24 UTC (permalink / raw)
  To: x86
  Cc: Juergen Gross, One Thousand Gnomes, Andy Lutomirski,
	Peter Zijlstra, Brian Gerst, linux-kernel, Matthew Whitehead,
	Borislav Petkov, Henrique de Moraes Holschuh, Andrew Cooper,
	Boris Ostrovsky, xen-devel

We support various non-Intel CPUs that don't have the CPUID
instruction, so the M486 test was wrong.  For now, fix it with a big
hammer: handle missing CPUID on all 32-bit CPUs.

Reported-by: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/processor.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 984a7bf17f6a..64fbc937d586 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -595,7 +595,7 @@ static inline void sync_core(void)
 {
 	int tmp;
 
-#ifdef CONFIG_M486
+#ifdef CONFIG_X86_32
 	/*
 	 * Do a CPUID if available, otherwise do a jump.  The jump
 	 * can conveniently enough be the jump around CPUID.
-- 
2.9.3


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v4 2/4] Revert "x86/boot: Fail the boot if !M486 and CPUID is missing"
  2016-12-09 18:24 [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements Andy Lutomirski
                   ` (2 preceding siblings ...)
  2016-12-09 18:24 ` [PATCH v4 2/4] Revert "x86/boot: Fail the boot if !M486 and CPUID is missing" Andy Lutomirski
@ 2016-12-09 18:24 ` Andy Lutomirski
  2016-12-19 11:04   ` [tip:x86/urgent] " tip-bot for Andy Lutomirski
  2016-12-19 11:04   ` tip-bot for Andy Lutomirski
  2016-12-09 18:24 ` [PATCH v4 3/4] x86/microcode/intel: Replace sync_core() with native_cpuid() Andy Lutomirski
                   ` (5 subsequent siblings)
  9 siblings, 2 replies; 20+ messages in thread
From: Andy Lutomirski @ 2016-12-09 18:24 UTC (permalink / raw)
  To: x86
  Cc: One Thousand Gnomes, Borislav Petkov, linux-kernel, Brian Gerst,
	Matthew Whitehead, Henrique de Moraes Holschuh, Peter Zijlstra,
	xen-devel, Juergen Gross, Boris Ostrovsky, Andrew Cooper,
	Andy Lutomirski, Denys Vlasenko, H . Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Thomas Gleixner

This reverts commit ed68d7e9b9cfb64f3045ffbcb108df03c09a0f98.

The patch wasn't quite correct -- there are non-Intel (and hence
non-486) CPUs that we support that don't have CPUID.  Since we no
longer require CPUID for sync_core(), just revert the patch.

I think the relevant CPUs are Geode and Elan, but I'm not sure.

In principle, we should try to do better at identifying CPUID-less
CPUs in early boot, but that's more complicated.

Reported-by: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: Matthew Whitehead <tedheadster@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/boot/cpu.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/arch/x86/boot/cpu.c b/arch/x86/boot/cpu.c
index 4224ede43b4e..26240dde081e 100644
--- a/arch/x86/boot/cpu.c
+++ b/arch/x86/boot/cpu.c
@@ -87,12 +87,6 @@ int validate_cpu(void)
 		return -1;
 	}
 
-	if (CONFIG_X86_MINIMUM_CPU_FAMILY <= 4 && !IS_ENABLED(CONFIG_M486) &&
-	    !has_eflag(X86_EFLAGS_ID)) {
-		printf("This kernel requires a CPU with the CPUID instruction.  Build with CONFIG_M486=y to run on this CPU.\n");
-		return -1;
-	}
-
 	if (err_flags) {
 		puts("This kernel requires the following features "
 		     "not present on the CPU:\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v4 2/4] Revert "x86/boot: Fail the boot if !M486 and CPUID is missing"
  2016-12-09 18:24 [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements Andy Lutomirski
  2016-12-09 18:24 ` [PATCH v4 1/4] x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels Andy Lutomirski
  2016-12-09 18:24 ` [PATCH v4 1/4] " Andy Lutomirski
@ 2016-12-09 18:24 ` Andy Lutomirski
  2016-12-09 18:24 ` Andy Lutomirski
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Andy Lutomirski @ 2016-12-09 18:24 UTC (permalink / raw)
  To: x86
  Cc: Juergen Gross, One Thousand Gnomes, Denys Vlasenko,
	Andy Lutomirski, Josh Poimboeuf, Peter Zijlstra, Brian Gerst,
	H . Peter Anvin, linux-kernel, Matthew Whitehead,
	Borislav Petkov, Henrique de Moraes Holschuh, Andrew Cooper,
	Boris Ostrovsky, xen-devel, Linus Torvalds, Thomas Gleixner

This reverts commit ed68d7e9b9cfb64f3045ffbcb108df03c09a0f98.

The patch wasn't quite correct -- there are non-Intel (and hence
non-486) CPUs that we support that don't have CPUID.  Since we no
longer require CPUID for sync_core(), just revert the patch.

I think the relevant CPUs are Geode and Elan, but I'm not sure.

In principle, we should try to do better at identifying CPUID-less
CPUs in early boot, but that's more complicated.

Reported-by: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: Matthew Whitehead <tedheadster@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/boot/cpu.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/arch/x86/boot/cpu.c b/arch/x86/boot/cpu.c
index 4224ede43b4e..26240dde081e 100644
--- a/arch/x86/boot/cpu.c
+++ b/arch/x86/boot/cpu.c
@@ -87,12 +87,6 @@ int validate_cpu(void)
 		return -1;
 	}
 
-	if (CONFIG_X86_MINIMUM_CPU_FAMILY <= 4 && !IS_ENABLED(CONFIG_M486) &&
-	    !has_eflag(X86_EFLAGS_ID)) {
-		printf("This kernel requires a CPU with the CPUID instruction.  Build with CONFIG_M486=y to run on this CPU.\n");
-		return -1;
-	}
-
 	if (err_flags) {
 		puts("This kernel requires the following features "
 		     "not present on the CPU:\n");
-- 
2.9.3


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v4 3/4] x86/microcode/intel: Replace sync_core() with native_cpuid()
  2016-12-09 18:24 [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements Andy Lutomirski
                   ` (3 preceding siblings ...)
  2016-12-09 18:24 ` Andy Lutomirski
@ 2016-12-09 18:24 ` Andy Lutomirski
  2016-12-19 11:05   ` [tip:x86/urgent] " tip-bot for Andy Lutomirski
  2016-12-19 11:05   ` tip-bot for Andy Lutomirski
  2016-12-09 18:24 ` [PATCH v4 3/4] " Andy Lutomirski
                   ` (4 subsequent siblings)
  9 siblings, 2 replies; 20+ messages in thread
From: Andy Lutomirski @ 2016-12-09 18:24 UTC (permalink / raw)
  To: x86
  Cc: One Thousand Gnomes, Borislav Petkov, linux-kernel, Brian Gerst,
	Matthew Whitehead, Henrique de Moraes Holschuh, Peter Zijlstra,
	xen-devel, Juergen Gross, Boris Ostrovsky, Andrew Cooper,
	Andy Lutomirski

The Intel microcode driver is using sync_core() to mean "do CPUID
with EAX=1".  I want to rework sync_core(), but first the Intel
microcode driver needs to stop depending on its current behavior.

Reported-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Acked-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/kernel/cpu/microcode/intel.c | 26 +++++++++++++++++++++++---
 1 file changed, 23 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index cdc0deab00c9..e0981bb2a351 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -356,6 +356,26 @@ get_matching_model_microcode(unsigned long start, void *data, size_t size,
 	return state;
 }
 
+static void cpuid_1(void)
+{
+	/*
+	 * According to the Intel SDM, Volume 3, 9.11.7:
+	 *
+	 *   CPUID returns a value in a model specific register in
+	 *   addition to its usual register return values. The
+	 *   semantics of CPUID cause it to deposit an update ID value
+	 *   in the 64-bit model-specific register at address 08BH
+	 *   (IA32_BIOS_SIGN_ID). If no update is present in the
+	 *   processor, the value in the MSR remains unmodified.
+	 *
+	 * Use native_cpuid -- this code runs very early and we don't
+	 * want to mess with paravirt.
+	 */
+	unsigned int eax = 1, ebx, ecx = 0, edx;
+
+	native_cpuid(&eax, &ebx, &ecx, &edx);
+}
+
 static int collect_cpu_info_early(struct ucode_cpu_info *uci)
 {
 	unsigned int val[2];
@@ -385,7 +405,7 @@ static int collect_cpu_info_early(struct ucode_cpu_info *uci)
 	native_wrmsrl(MSR_IA32_UCODE_REV, 0);
 
 	/* As documented in the SDM: Do a CPUID 1 here */
-	sync_core();
+	cpuid_1();
 
 	/* get the current revision from MSR 0x8B */
 	native_rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
@@ -627,7 +647,7 @@ static int apply_microcode_early(struct ucode_cpu_info *uci, bool early)
 	native_wrmsrl(MSR_IA32_UCODE_REV, 0);
 
 	/* As documented in the SDM: Do a CPUID 1 here */
-	sync_core();
+	cpuid_1();
 
 	/* get the current revision from MSR 0x8B */
 	native_rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
@@ -927,7 +947,7 @@ static int apply_microcode_intel(int cpu)
 	wrmsrl(MSR_IA32_UCODE_REV, 0);
 
 	/* As documented in the SDM: Do a CPUID 1 here */
-	sync_core();
+	cpuid_1();
 
 	/* get the current revision from MSR 0x8B */
 	rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v4 3/4] x86/microcode/intel: Replace sync_core() with native_cpuid()
  2016-12-09 18:24 [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements Andy Lutomirski
                   ` (4 preceding siblings ...)
  2016-12-09 18:24 ` [PATCH v4 3/4] x86/microcode/intel: Replace sync_core() with native_cpuid() Andy Lutomirski
@ 2016-12-09 18:24 ` Andy Lutomirski
  2016-12-09 18:24 ` [PATCH v4 4/4] x86/asm: Rewrite sync_core() to use IRET-to-self Andy Lutomirski
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Andy Lutomirski @ 2016-12-09 18:24 UTC (permalink / raw)
  To: x86
  Cc: Juergen Gross, One Thousand Gnomes, Andy Lutomirski,
	Peter Zijlstra, Brian Gerst, linux-kernel, Matthew Whitehead,
	Borislav Petkov, Henrique de Moraes Holschuh, Andrew Cooper,
	Boris Ostrovsky, xen-devel

The Intel microcode driver is using sync_core() to mean "do CPUID
with EAX=1".  I want to rework sync_core(), but first the Intel
microcode driver needs to stop depending on its current behavior.

Reported-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Acked-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/kernel/cpu/microcode/intel.c | 26 +++++++++++++++++++++++---
 1 file changed, 23 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index cdc0deab00c9..e0981bb2a351 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -356,6 +356,26 @@ get_matching_model_microcode(unsigned long start, void *data, size_t size,
 	return state;
 }
 
+static void cpuid_1(void)
+{
+	/*
+	 * According to the Intel SDM, Volume 3, 9.11.7:
+	 *
+	 *   CPUID returns a value in a model specific register in
+	 *   addition to its usual register return values. The
+	 *   semantics of CPUID cause it to deposit an update ID value
+	 *   in the 64-bit model-specific register at address 08BH
+	 *   (IA32_BIOS_SIGN_ID). If no update is present in the
+	 *   processor, the value in the MSR remains unmodified.
+	 *
+	 * Use native_cpuid -- this code runs very early and we don't
+	 * want to mess with paravirt.
+	 */
+	unsigned int eax = 1, ebx, ecx = 0, edx;
+
+	native_cpuid(&eax, &ebx, &ecx, &edx);
+}
+
 static int collect_cpu_info_early(struct ucode_cpu_info *uci)
 {
 	unsigned int val[2];
@@ -385,7 +405,7 @@ static int collect_cpu_info_early(struct ucode_cpu_info *uci)
 	native_wrmsrl(MSR_IA32_UCODE_REV, 0);
 
 	/* As documented in the SDM: Do a CPUID 1 here */
-	sync_core();
+	cpuid_1();
 
 	/* get the current revision from MSR 0x8B */
 	native_rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
@@ -627,7 +647,7 @@ static int apply_microcode_early(struct ucode_cpu_info *uci, bool early)
 	native_wrmsrl(MSR_IA32_UCODE_REV, 0);
 
 	/* As documented in the SDM: Do a CPUID 1 here */
-	sync_core();
+	cpuid_1();
 
 	/* get the current revision from MSR 0x8B */
 	native_rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
@@ -927,7 +947,7 @@ static int apply_microcode_intel(int cpu)
 	wrmsrl(MSR_IA32_UCODE_REV, 0);
 
 	/* As documented in the SDM: Do a CPUID 1 here */
-	sync_core();
+	cpuid_1();
 
 	/* get the current revision from MSR 0x8B */
 	rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
-- 
2.9.3


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v4 4/4] x86/asm: Rewrite sync_core() to use IRET-to-self
  2016-12-09 18:24 [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements Andy Lutomirski
                   ` (6 preceding siblings ...)
  2016-12-09 18:24 ` [PATCH v4 4/4] x86/asm: Rewrite sync_core() to use IRET-to-self Andy Lutomirski
@ 2016-12-09 18:24 ` Andy Lutomirski
  2016-12-19 11:05   ` [tip:x86/urgent] " tip-bot for Andy Lutomirski
  2016-12-19 11:05   ` tip-bot for Andy Lutomirski
  2016-12-15 18:06 ` [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements Andy Lutomirski
  2016-12-15 18:06 ` Andy Lutomirski
  9 siblings, 2 replies; 20+ messages in thread
From: Andy Lutomirski @ 2016-12-09 18:24 UTC (permalink / raw)
  To: x86
  Cc: One Thousand Gnomes, Borislav Petkov, linux-kernel, Brian Gerst,
	Matthew Whitehead, Henrique de Moraes Holschuh, Peter Zijlstra,
	xen-devel, Juergen Gross, Boris Ostrovsky, Andrew Cooper,
	Andy Lutomirski, H. Peter Anvin

Aside from being excessively slow, CPUID is problematic: Linux runs
on a handful of CPUs that don't have CPUID.  Use IRET-to-self
instead.  IRET-to-self works everywhere, so it makes testing easy.

For reference, On my laptop, IRET-to-self is ~110ns,
CPUID(eax=1, ecx=0) is ~83ns on native and very very slow under KVM,
and MOV-to-CR2 is ~42ns.

While we're at it: sync_core() serves a very specific purpose.
Document it.

Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/processor.h | 80 +++++++++++++++++++++++++++++-----------
 1 file changed, 58 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 64fbc937d586..ceb1f4d3f3fa 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -590,33 +590,69 @@ static __always_inline void cpu_relax(void)
 
 #define cpu_relax_lowlatency() cpu_relax()
 
-/* Stop speculative execution and prefetching of modified code. */
+/*
+ * This function forces the icache and prefetched instruction stream to
+ * catch up with reality in two very specific cases:
+ *
+ *  a) Text was modified using one virtual address and is about to be executed
+ *     from the same physical page at a different virtual address.
+ *
+ *  b) Text was modified on a different CPU, may subsequently be
+ *     executed on this CPU, and you want to make sure the new version
+ *     gets executed.  This generally means you're calling this in a IPI.
+ *
+ * If you're calling this for a different reason, you're probably doing
+ * it wrong.
+ */
 static inline void sync_core(void)
 {
-	int tmp;
-
-#ifdef CONFIG_X86_32
 	/*
-	 * Do a CPUID if available, otherwise do a jump.  The jump
-	 * can conveniently enough be the jump around CPUID.
+	 * There are quite a few ways to do this.  IRET-to-self is nice
+	 * because it works on every CPU, at any CPL (so it's compatible
+	 * with paravirtualization), and it never exits to a hypervisor.
+	 * The only down sides are that it's a bit slow (it seems to be
+	 * a bit more than 2x slower than the fastest options) and that
+	 * it unmasks NMIs.  The "push %cs" is needed because, in
+	 * paravirtual environments, __KERNEL_CS may not be a valid CS
+	 * value when we do IRET directly.
+	 *
+	 * In case NMI unmasking or performance ever becomes a problem,
+	 * the next best option appears to be MOV-to-CR2 and an
+	 * unconditional jump.  That sequence also works on all CPUs,
+	 * but it will fault at CPL3 (i.e. Xen PV and lguest).
+	 *
+	 * CPUID is the conventional way, but it's nasty: it doesn't
+	 * exist on some 486-like CPUs, and it usually exits to a
+	 * hypervisor.
+	 *
+	 * Like all of Linux's memory ordering operations, this is a
+	 * compiler barrier as well.
 	 */
-	asm volatile("cmpl %2,%1\n\t"
-		     "jl 1f\n\t"
-		     "cpuid\n"
-		     "1:"
-		     : "=a" (tmp)
-		     : "rm" (boot_cpu_data.cpuid_level), "ri" (0), "0" (1)
-		     : "ebx", "ecx", "edx", "memory");
+	register void *__sp asm(_ASM_SP);
+
+#ifdef CONFIG_X86_32
+	asm volatile (
+		"pushfl\n\t"
+		"pushl %%cs\n\t"
+		"pushl $1f\n\t"
+		"iret\n\t"
+		"1:"
+		: "+r" (__sp) : : "memory");
 #else
-	/*
-	 * CPUID is a barrier to speculative execution.
-	 * Prefetched instructions are automatically
-	 * invalidated when modified.
-	 */
-	asm volatile("cpuid"
-		     : "=a" (tmp)
-		     : "0" (1)
-		     : "ebx", "ecx", "edx", "memory");
+	unsigned int tmp;
+
+	asm volatile (
+		"mov %%ss, %0\n\t"
+		"pushq %q0\n\t"
+		"pushq %%rsp\n\t"
+		"addq $8, (%%rsp)\n\t"
+		"pushfq\n\t"
+		"mov %%cs, %0\n\t"
+		"pushq %q0\n\t"
+		"pushq $1f\n\t"
+		"iretq\n\t"
+		"1:"
+		: "=&r" (tmp), "+r" (__sp) : : "cc", "memory");
 #endif
 }
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v4 4/4] x86/asm: Rewrite sync_core() to use IRET-to-self
  2016-12-09 18:24 [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements Andy Lutomirski
                   ` (5 preceding siblings ...)
  2016-12-09 18:24 ` [PATCH v4 3/4] " Andy Lutomirski
@ 2016-12-09 18:24 ` Andy Lutomirski
  2016-12-09 18:24 ` Andy Lutomirski
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 20+ messages in thread
From: Andy Lutomirski @ 2016-12-09 18:24 UTC (permalink / raw)
  To: x86
  Cc: Juergen Gross, One Thousand Gnomes, Andy Lutomirski,
	Peter Zijlstra, Brian Gerst, H. Peter Anvin, linux-kernel,
	Matthew Whitehead, Borislav Petkov, Henrique de Moraes Holschuh,
	Andrew Cooper, Boris Ostrovsky, xen-devel

Aside from being excessively slow, CPUID is problematic: Linux runs
on a handful of CPUs that don't have CPUID.  Use IRET-to-self
instead.  IRET-to-self works everywhere, so it makes testing easy.

For reference, On my laptop, IRET-to-self is ~110ns,
CPUID(eax=1, ecx=0) is ~83ns on native and very very slow under KVM,
and MOV-to-CR2 is ~42ns.

While we're at it: sync_core() serves a very specific purpose.
Document it.

Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/processor.h | 80 +++++++++++++++++++++++++++++-----------
 1 file changed, 58 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 64fbc937d586..ceb1f4d3f3fa 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -590,33 +590,69 @@ static __always_inline void cpu_relax(void)
 
 #define cpu_relax_lowlatency() cpu_relax()
 
-/* Stop speculative execution and prefetching of modified code. */
+/*
+ * This function forces the icache and prefetched instruction stream to
+ * catch up with reality in two very specific cases:
+ *
+ *  a) Text was modified using one virtual address and is about to be executed
+ *     from the same physical page at a different virtual address.
+ *
+ *  b) Text was modified on a different CPU, may subsequently be
+ *     executed on this CPU, and you want to make sure the new version
+ *     gets executed.  This generally means you're calling this in a IPI.
+ *
+ * If you're calling this for a different reason, you're probably doing
+ * it wrong.
+ */
 static inline void sync_core(void)
 {
-	int tmp;
-
-#ifdef CONFIG_X86_32
 	/*
-	 * Do a CPUID if available, otherwise do a jump.  The jump
-	 * can conveniently enough be the jump around CPUID.
+	 * There are quite a few ways to do this.  IRET-to-self is nice
+	 * because it works on every CPU, at any CPL (so it's compatible
+	 * with paravirtualization), and it never exits to a hypervisor.
+	 * The only down sides are that it's a bit slow (it seems to be
+	 * a bit more than 2x slower than the fastest options) and that
+	 * it unmasks NMIs.  The "push %cs" is needed because, in
+	 * paravirtual environments, __KERNEL_CS may not be a valid CS
+	 * value when we do IRET directly.
+	 *
+	 * In case NMI unmasking or performance ever becomes a problem,
+	 * the next best option appears to be MOV-to-CR2 and an
+	 * unconditional jump.  That sequence also works on all CPUs,
+	 * but it will fault at CPL3 (i.e. Xen PV and lguest).
+	 *
+	 * CPUID is the conventional way, but it's nasty: it doesn't
+	 * exist on some 486-like CPUs, and it usually exits to a
+	 * hypervisor.
+	 *
+	 * Like all of Linux's memory ordering operations, this is a
+	 * compiler barrier as well.
 	 */
-	asm volatile("cmpl %2,%1\n\t"
-		     "jl 1f\n\t"
-		     "cpuid\n"
-		     "1:"
-		     : "=a" (tmp)
-		     : "rm" (boot_cpu_data.cpuid_level), "ri" (0), "0" (1)
-		     : "ebx", "ecx", "edx", "memory");
+	register void *__sp asm(_ASM_SP);
+
+#ifdef CONFIG_X86_32
+	asm volatile (
+		"pushfl\n\t"
+		"pushl %%cs\n\t"
+		"pushl $1f\n\t"
+		"iret\n\t"
+		"1:"
+		: "+r" (__sp) : : "memory");
 #else
-	/*
-	 * CPUID is a barrier to speculative execution.
-	 * Prefetched instructions are automatically
-	 * invalidated when modified.
-	 */
-	asm volatile("cpuid"
-		     : "=a" (tmp)
-		     : "0" (1)
-		     : "ebx", "ecx", "edx", "memory");
+	unsigned int tmp;
+
+	asm volatile (
+		"mov %%ss, %0\n\t"
+		"pushq %q0\n\t"
+		"pushq %%rsp\n\t"
+		"addq $8, (%%rsp)\n\t"
+		"pushfq\n\t"
+		"mov %%cs, %0\n\t"
+		"pushq %q0\n\t"
+		"pushq $1f\n\t"
+		"iretq\n\t"
+		"1:"
+		: "=&r" (tmp), "+r" (__sp) : : "cc", "memory");
 #endif
 }
 
-- 
2.9.3


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements
  2016-12-09 18:24 [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements Andy Lutomirski
                   ` (7 preceding siblings ...)
  2016-12-09 18:24 ` Andy Lutomirski
@ 2016-12-15 18:06 ` Andy Lutomirski
  2016-12-15 18:06 ` Andy Lutomirski
  9 siblings, 0 replies; 20+ messages in thread
From: Andy Lutomirski @ 2016-12-15 18:06 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: X86 ML, One Thousand Gnomes, Borislav Petkov, linux-kernel,
	Brian Gerst, Matthew Whitehead, Henrique de Moraes Holschuh,
	Peter Zijlstra, xen-devel, Juergen Gross, Boris Ostrovsky,
	Andrew Cooper

On Fri, Dec 9, 2016 at 10:24 AM, Andy Lutomirski <luto@kernel.org> wrote:
> *** PATCHES 1 and 2 MAY BE 4.9 MATERIAL ***
>
> Alan Cox pointed out that the 486 isn't the only supported CPU that
> doesn't have CPUID.  Let's clean up the mess and make everything
> faster while we're at it.
>

Ping?  Any chance of getting these in for 4.10?

--Andy

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements
  2016-12-09 18:24 [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements Andy Lutomirski
                   ` (8 preceding siblings ...)
  2016-12-15 18:06 ` [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements Andy Lutomirski
@ 2016-12-15 18:06 ` Andy Lutomirski
  9 siblings, 0 replies; 20+ messages in thread
From: Andy Lutomirski @ 2016-12-15 18:06 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Juergen Gross, One Thousand Gnomes, Peter Zijlstra, Brian Gerst,
	X86 ML, linux-kernel, Matthew Whitehead, Borislav Petkov,
	Henrique de Moraes Holschuh, Andrew Cooper, Boris Ostrovsky,
	xen-devel

On Fri, Dec 9, 2016 at 10:24 AM, Andy Lutomirski <luto@kernel.org> wrote:
> *** PATCHES 1 and 2 MAY BE 4.9 MATERIAL ***
>
> Alan Cox pointed out that the 486 isn't the only supported CPU that
> doesn't have CPUID.  Let's clean up the mess and make everything
> faster while we're at it.
>

Ping?  Any chance of getting these in for 4.10?

--Andy

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [tip:x86/urgent] x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels
  2016-12-09 18:24 ` [PATCH v4 1/4] x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels Andy Lutomirski
@ 2016-12-19 11:03   ` tip-bot for Andy Lutomirski
  2016-12-19 11:03   ` tip-bot for Andy Lutomirski
  1 sibling, 0 replies; 20+ messages in thread
From: tip-bot for Andy Lutomirski @ 2016-12-19 11:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: boris.ostrovsky, jgross, peterz, mingo, bp, tglx, luto,
	Xen-devel, hmh, andrew.cooper3, linux-kernel, hpa, gnomes,
	tedheadster, brgerst

Commit-ID:  1c52d859cb2d417e7216d3e56bb7fea88444cec9
Gitweb:     http://git.kernel.org/tip/1c52d859cb2d417e7216d3e56bb7fea88444cec9
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 9 Dec 2016 10:24:05 -0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 19 Dec 2016 11:54:20 +0100

x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels

We support various non-Intel CPUs that don't have the CPUID
instruction, so the M486 test was wrong.  For now, fix it with a big
hammer: handle missing CPUID on all 32-bit CPUs.

Reported-by: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Matthew Whitehead <tedheadster@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: xen-devel <Xen-devel@lists.xen.org>
Link: http://lkml.kernel.org/r/685bd083a7c036f7769510b6846315b17d6ba71f.1481307769.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/x86/include/asm/processor.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 6aa741f..b934871 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -607,7 +607,7 @@ static inline void sync_core(void)
 {
 	int tmp;
 
-#ifdef CONFIG_M486
+#ifdef CONFIG_X86_32
 	/*
 	 * Do a CPUID if available, otherwise do a jump.  The jump
 	 * can conveniently enough be the jump around CPUID.

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [tip:x86/urgent] x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels
  2016-12-09 18:24 ` [PATCH v4 1/4] x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels Andy Lutomirski
  2016-12-19 11:03   ` [tip:x86/urgent] " tip-bot for Andy Lutomirski
@ 2016-12-19 11:03   ` tip-bot for Andy Lutomirski
  1 sibling, 0 replies; 20+ messages in thread
From: tip-bot for Andy Lutomirski @ 2016-12-19 11:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jgross, gnomes, tglx, hmh, brgerst, peterz, andrew.cooper3,
	linux-kernel, Xen-devel, bp, luto, hpa, boris.ostrovsky,
	tedheadster, mingo

Commit-ID:  1c52d859cb2d417e7216d3e56bb7fea88444cec9
Gitweb:     http://git.kernel.org/tip/1c52d859cb2d417e7216d3e56bb7fea88444cec9
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 9 Dec 2016 10:24:05 -0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 19 Dec 2016 11:54:20 +0100

x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels

We support various non-Intel CPUs that don't have the CPUID
instruction, so the M486 test was wrong.  For now, fix it with a big
hammer: handle missing CPUID on all 32-bit CPUs.

Reported-by: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Matthew Whitehead <tedheadster@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: xen-devel <Xen-devel@lists.xen.org>
Link: http://lkml.kernel.org/r/685bd083a7c036f7769510b6846315b17d6ba71f.1481307769.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/x86/include/asm/processor.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 6aa741f..b934871 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -607,7 +607,7 @@ static inline void sync_core(void)
 {
 	int tmp;
 
-#ifdef CONFIG_M486
+#ifdef CONFIG_X86_32
 	/*
 	 * Do a CPUID if available, otherwise do a jump.  The jump
 	 * can conveniently enough be the jump around CPUID.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [tip:x86/urgent] Revert "x86/boot: Fail the boot if !M486 and CPUID is missing"
  2016-12-09 18:24 ` Andy Lutomirski
@ 2016-12-19 11:04   ` tip-bot for Andy Lutomirski
  2016-12-19 11:04   ` tip-bot for Andy Lutomirski
  1 sibling, 0 replies; 20+ messages in thread
From: tip-bot for Andy Lutomirski @ 2016-12-19 11:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jpoimboe, tglx, tedheadster, luto, linux-kernel, torvalds,
	brgerst, hpa, Xen-devel, hmh, jgross, gnomes, andrew.cooper3,
	dvlasenk, peterz, mingo, boris.ostrovsky, bp

Commit-ID:  426d1aff3138cf38da14e912df3c75e312f96e9e
Gitweb:     http://git.kernel.org/tip/426d1aff3138cf38da14e912df3c75e312f96e9e
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 9 Dec 2016 10:24:06 -0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 19 Dec 2016 11:54:20 +0100

Revert "x86/boot: Fail the boot if !M486 and CPUID is missing"

This reverts commit ed68d7e9b9cfb64f3045ffbcb108df03c09a0f98.

The patch wasn't quite correct -- there are non-Intel (and hence
non-486) CPUs that we support that don't have CPUID.  Since we no
longer require CPUID for sync_core(), just revert the patch.

I think the relevant CPUs are Geode and Elan, but I'm not sure.

In principle, we should try to do better at identifying CPUID-less
CPUs in early boot, but that's more complicated.

Reported-by: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Matthew Whitehead <tedheadster@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: xen-devel <Xen-devel@lists.xen.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/82acde18a108b8e353180dd6febcc2876df33f24.1481307769.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/x86/boot/cpu.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/arch/x86/boot/cpu.c b/arch/x86/boot/cpu.c
index 4224ede..26240dd 100644
--- a/arch/x86/boot/cpu.c
+++ b/arch/x86/boot/cpu.c
@@ -87,12 +87,6 @@ int validate_cpu(void)
 		return -1;
 	}
 
-	if (CONFIG_X86_MINIMUM_CPU_FAMILY <= 4 && !IS_ENABLED(CONFIG_M486) &&
-	    !has_eflag(X86_EFLAGS_ID)) {
-		printf("This kernel requires a CPU with the CPUID instruction.  Build with CONFIG_M486=y to run on this CPU.\n");
-		return -1;
-	}
-
 	if (err_flags) {
 		puts("This kernel requires the following features "
 		     "not present on the CPU:\n");

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [tip:x86/urgent] Revert "x86/boot: Fail the boot if !M486 and CPUID is missing"
  2016-12-09 18:24 ` Andy Lutomirski
  2016-12-19 11:04   ` [tip:x86/urgent] " tip-bot for Andy Lutomirski
@ 2016-12-19 11:04   ` tip-bot for Andy Lutomirski
  1 sibling, 0 replies; 20+ messages in thread
From: tip-bot for Andy Lutomirski @ 2016-12-19 11:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jgross, gnomes, dvlasenk, hmh, peterz, brgerst, boris.ostrovsky,
	hpa, linux-kernel, tedheadster, andrew.cooper3, bp, luto,
	jpoimboe, tglx, Xen-devel, torvalds, mingo

Commit-ID:  426d1aff3138cf38da14e912df3c75e312f96e9e
Gitweb:     http://git.kernel.org/tip/426d1aff3138cf38da14e912df3c75e312f96e9e
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 9 Dec 2016 10:24:06 -0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 19 Dec 2016 11:54:20 +0100

Revert "x86/boot: Fail the boot if !M486 and CPUID is missing"

This reverts commit ed68d7e9b9cfb64f3045ffbcb108df03c09a0f98.

The patch wasn't quite correct -- there are non-Intel (and hence
non-486) CPUs that we support that don't have CPUID.  Since we no
longer require CPUID for sync_core(), just revert the patch.

I think the relevant CPUs are Geode and Elan, but I'm not sure.

In principle, we should try to do better at identifying CPUID-less
CPUs in early boot, but that's more complicated.

Reported-by: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Matthew Whitehead <tedheadster@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: xen-devel <Xen-devel@lists.xen.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/82acde18a108b8e353180dd6febcc2876df33f24.1481307769.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/x86/boot/cpu.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/arch/x86/boot/cpu.c b/arch/x86/boot/cpu.c
index 4224ede..26240dd 100644
--- a/arch/x86/boot/cpu.c
+++ b/arch/x86/boot/cpu.c
@@ -87,12 +87,6 @@ int validate_cpu(void)
 		return -1;
 	}
 
-	if (CONFIG_X86_MINIMUM_CPU_FAMILY <= 4 && !IS_ENABLED(CONFIG_M486) &&
-	    !has_eflag(X86_EFLAGS_ID)) {
-		printf("This kernel requires a CPU with the CPUID instruction.  Build with CONFIG_M486=y to run on this CPU.\n");
-		return -1;
-	}
-
 	if (err_flags) {
 		puts("This kernel requires the following features "
 		     "not present on the CPU:\n");

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [tip:x86/urgent] x86/microcode/intel: Replace sync_core() with native_cpuid()
  2016-12-09 18:24 ` [PATCH v4 3/4] x86/microcode/intel: Replace sync_core() with native_cpuid() Andy Lutomirski
@ 2016-12-19 11:05   ` tip-bot for Andy Lutomirski
  2016-12-19 11:05   ` tip-bot for Andy Lutomirski
  1 sibling, 0 replies; 20+ messages in thread
From: tip-bot for Andy Lutomirski @ 2016-12-19 11:05 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, luto, hpa, boris.ostrovsky, gnomes, tglx, tedheadster,
	jgross, bp, brgerst, hmh, mingo, Xen-devel, andrew.cooper3,
	linux-kernel

Commit-ID:  484d0e5c7943644cc46e7308a8f9d83be598f2b9
Gitweb:     http://git.kernel.org/tip/484d0e5c7943644cc46e7308a8f9d83be598f2b9
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 9 Dec 2016 10:24:07 -0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 19 Dec 2016 11:54:21 +0100

x86/microcode/intel: Replace sync_core() with native_cpuid()

The Intel microcode driver is using sync_core() to mean "do CPUID
with EAX=1".  I want to rework sync_core(), but first the Intel
microcode driver needs to stop depending on its current behavior.

Reported-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Matthew Whitehead <tedheadster@gmail.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: xen-devel <Xen-devel@lists.xen.org>
Link: http://lkml.kernel.org/r/535a025bb91fed1a019c5412b036337ad239e5bb.1481307769.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/x86/kernel/cpu/microcode/intel.c | 26 +++++++++++++++++++++++---
 1 file changed, 23 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 54d50c3..b624b54 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -368,6 +368,26 @@ next:
 	return patch;
 }
 
+static void cpuid_1(void)
+{
+	/*
+	 * According to the Intel SDM, Volume 3, 9.11.7:
+	 *
+	 *   CPUID returns a value in a model specific register in
+	 *   addition to its usual register return values. The
+	 *   semantics of CPUID cause it to deposit an update ID value
+	 *   in the 64-bit model-specific register at address 08BH
+	 *   (IA32_BIOS_SIGN_ID). If no update is present in the
+	 *   processor, the value in the MSR remains unmodified.
+	 *
+	 * Use native_cpuid -- this code runs very early and we don't
+	 * want to mess with paravirt.
+	 */
+	unsigned int eax = 1, ebx, ecx = 0, edx;
+
+	native_cpuid(&eax, &ebx, &ecx, &edx);
+}
+
 static int collect_cpu_info_early(struct ucode_cpu_info *uci)
 {
 	unsigned int val[2];
@@ -393,7 +413,7 @@ static int collect_cpu_info_early(struct ucode_cpu_info *uci)
 	native_wrmsrl(MSR_IA32_UCODE_REV, 0);
 
 	/* As documented in the SDM: Do a CPUID 1 here */
-	sync_core();
+	cpuid_1();
 
 	/* get the current revision from MSR 0x8B */
 	native_rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
@@ -593,7 +613,7 @@ static int apply_microcode_early(struct ucode_cpu_info *uci, bool early)
 	native_wrmsrl(MSR_IA32_UCODE_REV, 0);
 
 	/* As documented in the SDM: Do a CPUID 1 here */
-	sync_core();
+	cpuid_1();
 
 	/* get the current revision from MSR 0x8B */
 	native_rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
@@ -805,7 +825,7 @@ static int apply_microcode_intel(int cpu)
 	wrmsrl(MSR_IA32_UCODE_REV, 0);
 
 	/* As documented in the SDM: Do a CPUID 1 here */
-	sync_core();
+	cpuid_1();
 
 	/* get the current revision from MSR 0x8B */
 	rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [tip:x86/urgent] x86/microcode/intel: Replace sync_core() with native_cpuid()
  2016-12-09 18:24 ` [PATCH v4 3/4] x86/microcode/intel: Replace sync_core() with native_cpuid() Andy Lutomirski
  2016-12-19 11:05   ` [tip:x86/urgent] " tip-bot for Andy Lutomirski
@ 2016-12-19 11:05   ` tip-bot for Andy Lutomirski
  1 sibling, 0 replies; 20+ messages in thread
From: tip-bot for Andy Lutomirski @ 2016-12-19 11:05 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jgross, gnomes, hmh, peterz, brgerst, mingo, linux-kernel,
	tedheadster, andrew.cooper3, bp, luto, hpa, boris.ostrovsky,
	Xen-devel, tglx

Commit-ID:  484d0e5c7943644cc46e7308a8f9d83be598f2b9
Gitweb:     http://git.kernel.org/tip/484d0e5c7943644cc46e7308a8f9d83be598f2b9
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 9 Dec 2016 10:24:07 -0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 19 Dec 2016 11:54:21 +0100

x86/microcode/intel: Replace sync_core() with native_cpuid()

The Intel microcode driver is using sync_core() to mean "do CPUID
with EAX=1".  I want to rework sync_core(), but first the Intel
microcode driver needs to stop depending on its current behavior.

Reported-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Matthew Whitehead <tedheadster@gmail.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: xen-devel <Xen-devel@lists.xen.org>
Link: http://lkml.kernel.org/r/535a025bb91fed1a019c5412b036337ad239e5bb.1481307769.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/x86/kernel/cpu/microcode/intel.c | 26 +++++++++++++++++++++++---
 1 file changed, 23 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 54d50c3..b624b54 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -368,6 +368,26 @@ next:
 	return patch;
 }
 
+static void cpuid_1(void)
+{
+	/*
+	 * According to the Intel SDM, Volume 3, 9.11.7:
+	 *
+	 *   CPUID returns a value in a model specific register in
+	 *   addition to its usual register return values. The
+	 *   semantics of CPUID cause it to deposit an update ID value
+	 *   in the 64-bit model-specific register at address 08BH
+	 *   (IA32_BIOS_SIGN_ID). If no update is present in the
+	 *   processor, the value in the MSR remains unmodified.
+	 *
+	 * Use native_cpuid -- this code runs very early and we don't
+	 * want to mess with paravirt.
+	 */
+	unsigned int eax = 1, ebx, ecx = 0, edx;
+
+	native_cpuid(&eax, &ebx, &ecx, &edx);
+}
+
 static int collect_cpu_info_early(struct ucode_cpu_info *uci)
 {
 	unsigned int val[2];
@@ -393,7 +413,7 @@ static int collect_cpu_info_early(struct ucode_cpu_info *uci)
 	native_wrmsrl(MSR_IA32_UCODE_REV, 0);
 
 	/* As documented in the SDM: Do a CPUID 1 here */
-	sync_core();
+	cpuid_1();
 
 	/* get the current revision from MSR 0x8B */
 	native_rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
@@ -593,7 +613,7 @@ static int apply_microcode_early(struct ucode_cpu_info *uci, bool early)
 	native_wrmsrl(MSR_IA32_UCODE_REV, 0);
 
 	/* As documented in the SDM: Do a CPUID 1 here */
-	sync_core();
+	cpuid_1();
 
 	/* get the current revision from MSR 0x8B */
 	native_rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
@@ -805,7 +825,7 @@ static int apply_microcode_intel(int cpu)
 	wrmsrl(MSR_IA32_UCODE_REV, 0);
 
 	/* As documented in the SDM: Do a CPUID 1 here */
-	sync_core();
+	cpuid_1();
 
 	/* get the current revision from MSR 0x8B */
 	rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [tip:x86/urgent] x86/asm: Rewrite sync_core() to use IRET-to-self
  2016-12-09 18:24 ` Andy Lutomirski
  2016-12-19 11:05   ` [tip:x86/urgent] " tip-bot for Andy Lutomirski
@ 2016-12-19 11:05   ` tip-bot for Andy Lutomirski
  1 sibling, 0 replies; 20+ messages in thread
From: tip-bot for Andy Lutomirski @ 2016-12-19 11:05 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, jgross, Xen-devel, tedheadster, peterz, brgerst, gnomes,
	bp, boris.ostrovsky, luto, andrew.cooper3, linux-kernel, hmh,
	hpa, mingo

Commit-ID:  c198b121b1a1d7a7171770c634cd49191bac4477
Gitweb:     http://git.kernel.org/tip/c198b121b1a1d7a7171770c634cd49191bac4477
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 9 Dec 2016 10:24:08 -0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 19 Dec 2016 11:54:21 +0100

x86/asm: Rewrite sync_core() to use IRET-to-self

Aside from being excessively slow, CPUID is problematic: Linux runs
on a handful of CPUs that don't have CPUID.  Use IRET-to-self
instead.  IRET-to-self works everywhere, so it makes testing easy.

For reference, On my laptop, IRET-to-self is ~110ns,
CPUID(eax=1, ecx=0) is ~83ns on native and very very slow under KVM,
and MOV-to-CR2 is ~42ns.

While we're at it: sync_core() serves a very specific purpose.
Document it.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Matthew Whitehead <tedheadster@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: xen-devel <Xen-devel@lists.xen.org>
Link: http://lkml.kernel.org/r/5c79f0225f68bc8c40335612bf624511abb78941.1481307769.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/x86/include/asm/processor.h | 80 +++++++++++++++++++++++++++++-----------
 1 file changed, 58 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index b934871..eaf1005 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -602,33 +602,69 @@ static __always_inline void cpu_relax(void)
 	rep_nop();
 }
 
-/* Stop speculative execution and prefetching of modified code. */
+/*
+ * This function forces the icache and prefetched instruction stream to
+ * catch up with reality in two very specific cases:
+ *
+ *  a) Text was modified using one virtual address and is about to be executed
+ *     from the same physical page at a different virtual address.
+ *
+ *  b) Text was modified on a different CPU, may subsequently be
+ *     executed on this CPU, and you want to make sure the new version
+ *     gets executed.  This generally means you're calling this in a IPI.
+ *
+ * If you're calling this for a different reason, you're probably doing
+ * it wrong.
+ */
 static inline void sync_core(void)
 {
-	int tmp;
-
-#ifdef CONFIG_X86_32
 	/*
-	 * Do a CPUID if available, otherwise do a jump.  The jump
-	 * can conveniently enough be the jump around CPUID.
+	 * There are quite a few ways to do this.  IRET-to-self is nice
+	 * because it works on every CPU, at any CPL (so it's compatible
+	 * with paravirtualization), and it never exits to a hypervisor.
+	 * The only down sides are that it's a bit slow (it seems to be
+	 * a bit more than 2x slower than the fastest options) and that
+	 * it unmasks NMIs.  The "push %cs" is needed because, in
+	 * paravirtual environments, __KERNEL_CS may not be a valid CS
+	 * value when we do IRET directly.
+	 *
+	 * In case NMI unmasking or performance ever becomes a problem,
+	 * the next best option appears to be MOV-to-CR2 and an
+	 * unconditional jump.  That sequence also works on all CPUs,
+	 * but it will fault at CPL3 (i.e. Xen PV and lguest).
+	 *
+	 * CPUID is the conventional way, but it's nasty: it doesn't
+	 * exist on some 486-like CPUs, and it usually exits to a
+	 * hypervisor.
+	 *
+	 * Like all of Linux's memory ordering operations, this is a
+	 * compiler barrier as well.
 	 */
-	asm volatile("cmpl %2,%1\n\t"
-		     "jl 1f\n\t"
-		     "cpuid\n"
-		     "1:"
-		     : "=a" (tmp)
-		     : "rm" (boot_cpu_data.cpuid_level), "ri" (0), "0" (1)
-		     : "ebx", "ecx", "edx", "memory");
+	register void *__sp asm(_ASM_SP);
+
+#ifdef CONFIG_X86_32
+	asm volatile (
+		"pushfl\n\t"
+		"pushl %%cs\n\t"
+		"pushl $1f\n\t"
+		"iret\n\t"
+		"1:"
+		: "+r" (__sp) : : "memory");
 #else
-	/*
-	 * CPUID is a barrier to speculative execution.
-	 * Prefetched instructions are automatically
-	 * invalidated when modified.
-	 */
-	asm volatile("cpuid"
-		     : "=a" (tmp)
-		     : "0" (1)
-		     : "ebx", "ecx", "edx", "memory");
+	unsigned int tmp;
+
+	asm volatile (
+		"mov %%ss, %0\n\t"
+		"pushq %q0\n\t"
+		"pushq %%rsp\n\t"
+		"addq $8, (%%rsp)\n\t"
+		"pushfq\n\t"
+		"mov %%cs, %0\n\t"
+		"pushq %q0\n\t"
+		"pushq $1f\n\t"
+		"iretq\n\t"
+		"1:"
+		: "=&r" (tmp), "+r" (__sp) : : "cc", "memory");
 #endif
 }
 

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [tip:x86/urgent] x86/asm: Rewrite sync_core() to use IRET-to-self
  2016-12-09 18:24 ` Andy Lutomirski
@ 2016-12-19 11:05   ` tip-bot for Andy Lutomirski
  2016-12-19 11:05   ` tip-bot for Andy Lutomirski
  1 sibling, 0 replies; 20+ messages in thread
From: tip-bot for Andy Lutomirski @ 2016-12-19 11:05 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jgross, gnomes, hmh, peterz, brgerst, hpa, linux-kernel,
	tedheadster, mingo, bp, luto, andrew.cooper3, tglx, Xen-devel,
	boris.ostrovsky

Commit-ID:  c198b121b1a1d7a7171770c634cd49191bac4477
Gitweb:     http://git.kernel.org/tip/c198b121b1a1d7a7171770c634cd49191bac4477
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 9 Dec 2016 10:24:08 -0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 19 Dec 2016 11:54:21 +0100

x86/asm: Rewrite sync_core() to use IRET-to-self

Aside from being excessively slow, CPUID is problematic: Linux runs
on a handful of CPUs that don't have CPUID.  Use IRET-to-self
instead.  IRET-to-self works everywhere, so it makes testing easy.

For reference, On my laptop, IRET-to-self is ~110ns,
CPUID(eax=1, ecx=0) is ~83ns on native and very very slow under KVM,
and MOV-to-CR2 is ~42ns.

While we're at it: sync_core() serves a very specific purpose.
Document it.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Matthew Whitehead <tedheadster@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: xen-devel <Xen-devel@lists.xen.org>
Link: http://lkml.kernel.org/r/5c79f0225f68bc8c40335612bf624511abb78941.1481307769.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/x86/include/asm/processor.h | 80 +++++++++++++++++++++++++++++-----------
 1 file changed, 58 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index b934871..eaf1005 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -602,33 +602,69 @@ static __always_inline void cpu_relax(void)
 	rep_nop();
 }
 
-/* Stop speculative execution and prefetching of modified code. */
+/*
+ * This function forces the icache and prefetched instruction stream to
+ * catch up with reality in two very specific cases:
+ *
+ *  a) Text was modified using one virtual address and is about to be executed
+ *     from the same physical page at a different virtual address.
+ *
+ *  b) Text was modified on a different CPU, may subsequently be
+ *     executed on this CPU, and you want to make sure the new version
+ *     gets executed.  This generally means you're calling this in a IPI.
+ *
+ * If you're calling this for a different reason, you're probably doing
+ * it wrong.
+ */
 static inline void sync_core(void)
 {
-	int tmp;
-
-#ifdef CONFIG_X86_32
 	/*
-	 * Do a CPUID if available, otherwise do a jump.  The jump
-	 * can conveniently enough be the jump around CPUID.
+	 * There are quite a few ways to do this.  IRET-to-self is nice
+	 * because it works on every CPU, at any CPL (so it's compatible
+	 * with paravirtualization), and it never exits to a hypervisor.
+	 * The only down sides are that it's a bit slow (it seems to be
+	 * a bit more than 2x slower than the fastest options) and that
+	 * it unmasks NMIs.  The "push %cs" is needed because, in
+	 * paravirtual environments, __KERNEL_CS may not be a valid CS
+	 * value when we do IRET directly.
+	 *
+	 * In case NMI unmasking or performance ever becomes a problem,
+	 * the next best option appears to be MOV-to-CR2 and an
+	 * unconditional jump.  That sequence also works on all CPUs,
+	 * but it will fault at CPL3 (i.e. Xen PV and lguest).
+	 *
+	 * CPUID is the conventional way, but it's nasty: it doesn't
+	 * exist on some 486-like CPUs, and it usually exits to a
+	 * hypervisor.
+	 *
+	 * Like all of Linux's memory ordering operations, this is a
+	 * compiler barrier as well.
 	 */
-	asm volatile("cmpl %2,%1\n\t"
-		     "jl 1f\n\t"
-		     "cpuid\n"
-		     "1:"
-		     : "=a" (tmp)
-		     : "rm" (boot_cpu_data.cpuid_level), "ri" (0), "0" (1)
-		     : "ebx", "ecx", "edx", "memory");
+	register void *__sp asm(_ASM_SP);
+
+#ifdef CONFIG_X86_32
+	asm volatile (
+		"pushfl\n\t"
+		"pushl %%cs\n\t"
+		"pushl $1f\n\t"
+		"iret\n\t"
+		"1:"
+		: "+r" (__sp) : : "memory");
 #else
-	/*
-	 * CPUID is a barrier to speculative execution.
-	 * Prefetched instructions are automatically
-	 * invalidated when modified.
-	 */
-	asm volatile("cpuid"
-		     : "=a" (tmp)
-		     : "0" (1)
-		     : "ebx", "ecx", "edx", "memory");
+	unsigned int tmp;
+
+	asm volatile (
+		"mov %%ss, %0\n\t"
+		"pushq %q0\n\t"
+		"pushq %%rsp\n\t"
+		"addq $8, (%%rsp)\n\t"
+		"pushfq\n\t"
+		"mov %%cs, %0\n\t"
+		"pushq %q0\n\t"
+		"pushq $1f\n\t"
+		"iretq\n\t"
+		"1:"
+		: "=&r" (tmp), "+r" (__sp) : : "cc", "memory");
 #endif
 }
 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements
@ 2016-12-09 18:24 Andy Lutomirski
  0 siblings, 0 replies; 20+ messages in thread
From: Andy Lutomirski @ 2016-12-09 18:24 UTC (permalink / raw)
  To: x86
  Cc: Juergen Gross, One Thousand Gnomes, Andy Lutomirski,
	Peter Zijlstra, Brian Gerst, linux-kernel, Matthew Whitehead,
	Borislav Petkov, Henrique de Moraes Holschuh, Andrew Cooper,
	Boris Ostrovsky, xen-devel

*** PATCHES 1 and 2 MAY BE 4.9 MATERIAL ***

Alan Cox pointed out that the 486 isn't the only supported CPU that
doesn't have CPUID.  Let's clean up the mess and make everything
faster while we're at it.

Patch 1 is intended to be an easy fix: it makes sync_core() work
without CPUID on all 32-bit kernels.  It should be quite safe.  This
will have a negligible performance cost during boot on kernels built
for newer CPUs.  With this in place, patch 2 reverts the buggy 486
check I added.

Patches 3-4 are meant to improve the situation.  Patch 3 cleans up
the Intel microcode loader and the patch 4 (which depends on patch 3
to work correctly) stops using CPUID in sync_core() altogether.

Changes from v3:
 - Improve sync_core() comments.
 - Tidy up sync_core() asm.

Changes from v2:
 - Switch to IRET-to-self and get rid of all the paravirt code.
 - Further immprove the sync_core() comment.

Changes from v1:
 - Fix Xen
 - Add timing info to the changelog (hint: 2x speedup)
 - Document patch 1 a bit better.

Andy Lutomirski (4):
  x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit
    kernels
  Revert "x86/boot: Fail the boot if !M486 and CPUID is missing"
  x86/microcode/intel: Replace sync_core() with native_cpuid()
  x86/asm: Rewrite sync_core() to use IRET-to-self

 arch/x86/boot/cpu.c                   |  6 ---
 arch/x86/include/asm/processor.h      | 80 +++++++++++++++++++++++++----------
 arch/x86/kernel/cpu/microcode/intel.c | 26 ++++++++++--
 3 files changed, 81 insertions(+), 31 deletions(-)

-- 
2.9.3


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2016-12-19 11:07 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-09 18:24 [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements Andy Lutomirski
2016-12-09 18:24 ` [PATCH v4 1/4] x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels Andy Lutomirski
2016-12-19 11:03   ` [tip:x86/urgent] " tip-bot for Andy Lutomirski
2016-12-19 11:03   ` tip-bot for Andy Lutomirski
2016-12-09 18:24 ` [PATCH v4 1/4] " Andy Lutomirski
2016-12-09 18:24 ` [PATCH v4 2/4] Revert "x86/boot: Fail the boot if !M486 and CPUID is missing" Andy Lutomirski
2016-12-09 18:24 ` Andy Lutomirski
2016-12-19 11:04   ` [tip:x86/urgent] " tip-bot for Andy Lutomirski
2016-12-19 11:04   ` tip-bot for Andy Lutomirski
2016-12-09 18:24 ` [PATCH v4 3/4] x86/microcode/intel: Replace sync_core() with native_cpuid() Andy Lutomirski
2016-12-19 11:05   ` [tip:x86/urgent] " tip-bot for Andy Lutomirski
2016-12-19 11:05   ` tip-bot for Andy Lutomirski
2016-12-09 18:24 ` [PATCH v4 3/4] " Andy Lutomirski
2016-12-09 18:24 ` [PATCH v4 4/4] x86/asm: Rewrite sync_core() to use IRET-to-self Andy Lutomirski
2016-12-09 18:24 ` Andy Lutomirski
2016-12-19 11:05   ` [tip:x86/urgent] " tip-bot for Andy Lutomirski
2016-12-19 11:05   ` tip-bot for Andy Lutomirski
2016-12-15 18:06 ` [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements Andy Lutomirski
2016-12-15 18:06 ` Andy Lutomirski
2016-12-09 18:24 Andy Lutomirski

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.