* [PATCH v4 1/4] x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels
2016-12-09 18:24 [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements Andy Lutomirski
@ 2016-12-09 18:24 ` Andy Lutomirski
2016-12-19 11:03 ` [tip:x86/urgent] " tip-bot for Andy Lutomirski
2016-12-09 18:24 ` [PATCH v4 2/4] Revert "x86/boot: Fail the boot if !M486 and CPUID is missing" Andy Lutomirski
` (3 subsequent siblings)
4 siblings, 1 reply; 10+ messages in thread
From: Andy Lutomirski @ 2016-12-09 18:24 UTC (permalink / raw)
To: x86
Cc: One Thousand Gnomes, Borislav Petkov, linux-kernel, Brian Gerst,
Matthew Whitehead, Henrique de Moraes Holschuh, Peter Zijlstra,
xen-devel, Juergen Gross, Boris Ostrovsky, Andrew Cooper,
Andy Lutomirski
We support various non-Intel CPUs that don't have the CPUID
instruction, so the M486 test was wrong. For now, fix it with a big
hammer: handle missing CPUID on all 32-bit CPUs.
Reported-by: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
arch/x86/include/asm/processor.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 984a7bf17f6a..64fbc937d586 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -595,7 +595,7 @@ static inline void sync_core(void)
{
int tmp;
-#ifdef CONFIG_M486
+#ifdef CONFIG_X86_32
/*
* Do a CPUID if available, otherwise do a jump. The jump
* can conveniently enough be the jump around CPUID.
--
2.9.3
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [tip:x86/urgent] x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels
2016-12-09 18:24 ` [PATCH v4 1/4] x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels Andy Lutomirski
@ 2016-12-19 11:03 ` tip-bot for Andy Lutomirski
0 siblings, 0 replies; 10+ messages in thread
From: tip-bot for Andy Lutomirski @ 2016-12-19 11:03 UTC (permalink / raw)
To: linux-tip-commits
Cc: boris.ostrovsky, jgross, peterz, mingo, bp, tglx, luto,
Xen-devel, hmh, andrew.cooper3, linux-kernel, hpa, gnomes,
tedheadster, brgerst
Commit-ID: 1c52d859cb2d417e7216d3e56bb7fea88444cec9
Gitweb: http://git.kernel.org/tip/1c52d859cb2d417e7216d3e56bb7fea88444cec9
Author: Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 9 Dec 2016 10:24:05 -0800
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 19 Dec 2016 11:54:20 +0100
x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels
We support various non-Intel CPUs that don't have the CPUID
instruction, so the M486 test was wrong. For now, fix it with a big
hammer: handle missing CPUID on all 32-bit CPUs.
Reported-by: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Matthew Whitehead <tedheadster@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: xen-devel <Xen-devel@lists.xen.org>
Link: http://lkml.kernel.org/r/685bd083a7c036f7769510b6846315b17d6ba71f.1481307769.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/include/asm/processor.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 6aa741f..b934871 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -607,7 +607,7 @@ static inline void sync_core(void)
{
int tmp;
-#ifdef CONFIG_M486
+#ifdef CONFIG_X86_32
/*
* Do a CPUID if available, otherwise do a jump. The jump
* can conveniently enough be the jump around CPUID.
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v4 2/4] Revert "x86/boot: Fail the boot if !M486 and CPUID is missing"
2016-12-09 18:24 [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements Andy Lutomirski
2016-12-09 18:24 ` [PATCH v4 1/4] x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels Andy Lutomirski
@ 2016-12-09 18:24 ` Andy Lutomirski
2016-12-19 11:04 ` [tip:x86/urgent] " tip-bot for Andy Lutomirski
2016-12-09 18:24 ` [PATCH v4 3/4] x86/microcode/intel: Replace sync_core() with native_cpuid() Andy Lutomirski
` (2 subsequent siblings)
4 siblings, 1 reply; 10+ messages in thread
From: Andy Lutomirski @ 2016-12-09 18:24 UTC (permalink / raw)
To: x86
Cc: One Thousand Gnomes, Borislav Petkov, linux-kernel, Brian Gerst,
Matthew Whitehead, Henrique de Moraes Holschuh, Peter Zijlstra,
xen-devel, Juergen Gross, Boris Ostrovsky, Andrew Cooper,
Andy Lutomirski, Denys Vlasenko, H . Peter Anvin, Josh Poimboeuf,
Linus Torvalds, Thomas Gleixner
This reverts commit ed68d7e9b9cfb64f3045ffbcb108df03c09a0f98.
The patch wasn't quite correct -- there are non-Intel (and hence
non-486) CPUs that we support that don't have CPUID. Since we no
longer require CPUID for sync_core(), just revert the patch.
I think the relevant CPUs are Geode and Elan, but I'm not sure.
In principle, we should try to do better at identifying CPUID-less
CPUs in early boot, but that's more complicated.
Reported-by: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: Matthew Whitehead <tedheadster@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
arch/x86/boot/cpu.c | 6 ------
1 file changed, 6 deletions(-)
diff --git a/arch/x86/boot/cpu.c b/arch/x86/boot/cpu.c
index 4224ede43b4e..26240dde081e 100644
--- a/arch/x86/boot/cpu.c
+++ b/arch/x86/boot/cpu.c
@@ -87,12 +87,6 @@ int validate_cpu(void)
return -1;
}
- if (CONFIG_X86_MINIMUM_CPU_FAMILY <= 4 && !IS_ENABLED(CONFIG_M486) &&
- !has_eflag(X86_EFLAGS_ID)) {
- printf("This kernel requires a CPU with the CPUID instruction. Build with CONFIG_M486=y to run on this CPU.\n");
- return -1;
- }
-
if (err_flags) {
puts("This kernel requires the following features "
"not present on the CPU:\n");
--
2.9.3
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [tip:x86/urgent] Revert "x86/boot: Fail the boot if !M486 and CPUID is missing"
2016-12-09 18:24 ` [PATCH v4 2/4] Revert "x86/boot: Fail the boot if !M486 and CPUID is missing" Andy Lutomirski
@ 2016-12-19 11:04 ` tip-bot for Andy Lutomirski
0 siblings, 0 replies; 10+ messages in thread
From: tip-bot for Andy Lutomirski @ 2016-12-19 11:04 UTC (permalink / raw)
To: linux-tip-commits
Cc: jpoimboe, tglx, tedheadster, luto, linux-kernel, torvalds,
brgerst, hpa, Xen-devel, hmh, jgross, gnomes, andrew.cooper3,
dvlasenk, peterz, mingo, boris.ostrovsky, bp
Commit-ID: 426d1aff3138cf38da14e912df3c75e312f96e9e
Gitweb: http://git.kernel.org/tip/426d1aff3138cf38da14e912df3c75e312f96e9e
Author: Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 9 Dec 2016 10:24:06 -0800
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 19 Dec 2016 11:54:20 +0100
Revert "x86/boot: Fail the boot if !M486 and CPUID is missing"
This reverts commit ed68d7e9b9cfb64f3045ffbcb108df03c09a0f98.
The patch wasn't quite correct -- there are non-Intel (and hence
non-486) CPUs that we support that don't have CPUID. Since we no
longer require CPUID for sync_core(), just revert the patch.
I think the relevant CPUs are Geode and Elan, but I'm not sure.
In principle, we should try to do better at identifying CPUID-less
CPUs in early boot, but that's more complicated.
Reported-by: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Matthew Whitehead <tedheadster@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: xen-devel <Xen-devel@lists.xen.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/82acde18a108b8e353180dd6febcc2876df33f24.1481307769.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/boot/cpu.c | 6 ------
1 file changed, 6 deletions(-)
diff --git a/arch/x86/boot/cpu.c b/arch/x86/boot/cpu.c
index 4224ede..26240dd 100644
--- a/arch/x86/boot/cpu.c
+++ b/arch/x86/boot/cpu.c
@@ -87,12 +87,6 @@ int validate_cpu(void)
return -1;
}
- if (CONFIG_X86_MINIMUM_CPU_FAMILY <= 4 && !IS_ENABLED(CONFIG_M486) &&
- !has_eflag(X86_EFLAGS_ID)) {
- printf("This kernel requires a CPU with the CPUID instruction. Build with CONFIG_M486=y to run on this CPU.\n");
- return -1;
- }
-
if (err_flags) {
puts("This kernel requires the following features "
"not present on the CPU:\n");
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v4 3/4] x86/microcode/intel: Replace sync_core() with native_cpuid()
2016-12-09 18:24 [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements Andy Lutomirski
2016-12-09 18:24 ` [PATCH v4 1/4] x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels Andy Lutomirski
2016-12-09 18:24 ` [PATCH v4 2/4] Revert "x86/boot: Fail the boot if !M486 and CPUID is missing" Andy Lutomirski
@ 2016-12-09 18:24 ` Andy Lutomirski
2016-12-19 11:05 ` [tip:x86/urgent] " tip-bot for Andy Lutomirski
2016-12-09 18:24 ` [PATCH v4 4/4] x86/asm: Rewrite sync_core() to use IRET-to-self Andy Lutomirski
2016-12-15 18:06 ` [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements Andy Lutomirski
4 siblings, 1 reply; 10+ messages in thread
From: Andy Lutomirski @ 2016-12-09 18:24 UTC (permalink / raw)
To: x86
Cc: One Thousand Gnomes, Borislav Petkov, linux-kernel, Brian Gerst,
Matthew Whitehead, Henrique de Moraes Holschuh, Peter Zijlstra,
xen-devel, Juergen Gross, Boris Ostrovsky, Andrew Cooper,
Andy Lutomirski
The Intel microcode driver is using sync_core() to mean "do CPUID
with EAX=1". I want to rework sync_core(), but first the Intel
microcode driver needs to stop depending on its current behavior.
Reported-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Acked-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
arch/x86/kernel/cpu/microcode/intel.c | 26 +++++++++++++++++++++++---
1 file changed, 23 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index cdc0deab00c9..e0981bb2a351 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -356,6 +356,26 @@ get_matching_model_microcode(unsigned long start, void *data, size_t size,
return state;
}
+static void cpuid_1(void)
+{
+ /*
+ * According to the Intel SDM, Volume 3, 9.11.7:
+ *
+ * CPUID returns a value in a model specific register in
+ * addition to its usual register return values. The
+ * semantics of CPUID cause it to deposit an update ID value
+ * in the 64-bit model-specific register at address 08BH
+ * (IA32_BIOS_SIGN_ID). If no update is present in the
+ * processor, the value in the MSR remains unmodified.
+ *
+ * Use native_cpuid -- this code runs very early and we don't
+ * want to mess with paravirt.
+ */
+ unsigned int eax = 1, ebx, ecx = 0, edx;
+
+ native_cpuid(&eax, &ebx, &ecx, &edx);
+}
+
static int collect_cpu_info_early(struct ucode_cpu_info *uci)
{
unsigned int val[2];
@@ -385,7 +405,7 @@ static int collect_cpu_info_early(struct ucode_cpu_info *uci)
native_wrmsrl(MSR_IA32_UCODE_REV, 0);
/* As documented in the SDM: Do a CPUID 1 here */
- sync_core();
+ cpuid_1();
/* get the current revision from MSR 0x8B */
native_rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
@@ -627,7 +647,7 @@ static int apply_microcode_early(struct ucode_cpu_info *uci, bool early)
native_wrmsrl(MSR_IA32_UCODE_REV, 0);
/* As documented in the SDM: Do a CPUID 1 here */
- sync_core();
+ cpuid_1();
/* get the current revision from MSR 0x8B */
native_rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
@@ -927,7 +947,7 @@ static int apply_microcode_intel(int cpu)
wrmsrl(MSR_IA32_UCODE_REV, 0);
/* As documented in the SDM: Do a CPUID 1 here */
- sync_core();
+ cpuid_1();
/* get the current revision from MSR 0x8B */
rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
--
2.9.3
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [tip:x86/urgent] x86/microcode/intel: Replace sync_core() with native_cpuid()
2016-12-09 18:24 ` [PATCH v4 3/4] x86/microcode/intel: Replace sync_core() with native_cpuid() Andy Lutomirski
@ 2016-12-19 11:05 ` tip-bot for Andy Lutomirski
0 siblings, 0 replies; 10+ messages in thread
From: tip-bot for Andy Lutomirski @ 2016-12-19 11:05 UTC (permalink / raw)
To: linux-tip-commits
Cc: peterz, luto, hpa, boris.ostrovsky, gnomes, tglx, tedheadster,
jgross, bp, brgerst, hmh, mingo, Xen-devel, andrew.cooper3,
linux-kernel
Commit-ID: 484d0e5c7943644cc46e7308a8f9d83be598f2b9
Gitweb: http://git.kernel.org/tip/484d0e5c7943644cc46e7308a8f9d83be598f2b9
Author: Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 9 Dec 2016 10:24:07 -0800
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 19 Dec 2016 11:54:21 +0100
x86/microcode/intel: Replace sync_core() with native_cpuid()
The Intel microcode driver is using sync_core() to mean "do CPUID
with EAX=1". I want to rework sync_core(), but first the Intel
microcode driver needs to stop depending on its current behavior.
Reported-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Matthew Whitehead <tedheadster@gmail.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: xen-devel <Xen-devel@lists.xen.org>
Link: http://lkml.kernel.org/r/535a025bb91fed1a019c5412b036337ad239e5bb.1481307769.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/kernel/cpu/microcode/intel.c | 26 +++++++++++++++++++++++---
1 file changed, 23 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 54d50c3..b624b54 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -368,6 +368,26 @@ next:
return patch;
}
+static void cpuid_1(void)
+{
+ /*
+ * According to the Intel SDM, Volume 3, 9.11.7:
+ *
+ * CPUID returns a value in a model specific register in
+ * addition to its usual register return values. The
+ * semantics of CPUID cause it to deposit an update ID value
+ * in the 64-bit model-specific register at address 08BH
+ * (IA32_BIOS_SIGN_ID). If no update is present in the
+ * processor, the value in the MSR remains unmodified.
+ *
+ * Use native_cpuid -- this code runs very early and we don't
+ * want to mess with paravirt.
+ */
+ unsigned int eax = 1, ebx, ecx = 0, edx;
+
+ native_cpuid(&eax, &ebx, &ecx, &edx);
+}
+
static int collect_cpu_info_early(struct ucode_cpu_info *uci)
{
unsigned int val[2];
@@ -393,7 +413,7 @@ static int collect_cpu_info_early(struct ucode_cpu_info *uci)
native_wrmsrl(MSR_IA32_UCODE_REV, 0);
/* As documented in the SDM: Do a CPUID 1 here */
- sync_core();
+ cpuid_1();
/* get the current revision from MSR 0x8B */
native_rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
@@ -593,7 +613,7 @@ static int apply_microcode_early(struct ucode_cpu_info *uci, bool early)
native_wrmsrl(MSR_IA32_UCODE_REV, 0);
/* As documented in the SDM: Do a CPUID 1 here */
- sync_core();
+ cpuid_1();
/* get the current revision from MSR 0x8B */
native_rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
@@ -805,7 +825,7 @@ static int apply_microcode_intel(int cpu)
wrmsrl(MSR_IA32_UCODE_REV, 0);
/* As documented in the SDM: Do a CPUID 1 here */
- sync_core();
+ cpuid_1();
/* get the current revision from MSR 0x8B */
rdmsr(MSR_IA32_UCODE_REV, val[0], val[1]);
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v4 4/4] x86/asm: Rewrite sync_core() to use IRET-to-self
2016-12-09 18:24 [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements Andy Lutomirski
` (2 preceding siblings ...)
2016-12-09 18:24 ` [PATCH v4 3/4] x86/microcode/intel: Replace sync_core() with native_cpuid() Andy Lutomirski
@ 2016-12-09 18:24 ` Andy Lutomirski
2016-12-19 11:05 ` [tip:x86/urgent] " tip-bot for Andy Lutomirski
2016-12-15 18:06 ` [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements Andy Lutomirski
4 siblings, 1 reply; 10+ messages in thread
From: Andy Lutomirski @ 2016-12-09 18:24 UTC (permalink / raw)
To: x86
Cc: One Thousand Gnomes, Borislav Petkov, linux-kernel, Brian Gerst,
Matthew Whitehead, Henrique de Moraes Holschuh, Peter Zijlstra,
xen-devel, Juergen Gross, Boris Ostrovsky, Andrew Cooper,
Andy Lutomirski, H. Peter Anvin
Aside from being excessively slow, CPUID is problematic: Linux runs
on a handful of CPUs that don't have CPUID. Use IRET-to-self
instead. IRET-to-self works everywhere, so it makes testing easy.
For reference, On my laptop, IRET-to-self is ~110ns,
CPUID(eax=1, ecx=0) is ~83ns on native and very very slow under KVM,
and MOV-to-CR2 is ~42ns.
While we're at it: sync_core() serves a very specific purpose.
Document it.
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
arch/x86/include/asm/processor.h | 80 +++++++++++++++++++++++++++++-----------
1 file changed, 58 insertions(+), 22 deletions(-)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 64fbc937d586..ceb1f4d3f3fa 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -590,33 +590,69 @@ static __always_inline void cpu_relax(void)
#define cpu_relax_lowlatency() cpu_relax()
-/* Stop speculative execution and prefetching of modified code. */
+/*
+ * This function forces the icache and prefetched instruction stream to
+ * catch up with reality in two very specific cases:
+ *
+ * a) Text was modified using one virtual address and is about to be executed
+ * from the same physical page at a different virtual address.
+ *
+ * b) Text was modified on a different CPU, may subsequently be
+ * executed on this CPU, and you want to make sure the new version
+ * gets executed. This generally means you're calling this in a IPI.
+ *
+ * If you're calling this for a different reason, you're probably doing
+ * it wrong.
+ */
static inline void sync_core(void)
{
- int tmp;
-
-#ifdef CONFIG_X86_32
/*
- * Do a CPUID if available, otherwise do a jump. The jump
- * can conveniently enough be the jump around CPUID.
+ * There are quite a few ways to do this. IRET-to-self is nice
+ * because it works on every CPU, at any CPL (so it's compatible
+ * with paravirtualization), and it never exits to a hypervisor.
+ * The only down sides are that it's a bit slow (it seems to be
+ * a bit more than 2x slower than the fastest options) and that
+ * it unmasks NMIs. The "push %cs" is needed because, in
+ * paravirtual environments, __KERNEL_CS may not be a valid CS
+ * value when we do IRET directly.
+ *
+ * In case NMI unmasking or performance ever becomes a problem,
+ * the next best option appears to be MOV-to-CR2 and an
+ * unconditional jump. That sequence also works on all CPUs,
+ * but it will fault at CPL3 (i.e. Xen PV and lguest).
+ *
+ * CPUID is the conventional way, but it's nasty: it doesn't
+ * exist on some 486-like CPUs, and it usually exits to a
+ * hypervisor.
+ *
+ * Like all of Linux's memory ordering operations, this is a
+ * compiler barrier as well.
*/
- asm volatile("cmpl %2,%1\n\t"
- "jl 1f\n\t"
- "cpuid\n"
- "1:"
- : "=a" (tmp)
- : "rm" (boot_cpu_data.cpuid_level), "ri" (0), "0" (1)
- : "ebx", "ecx", "edx", "memory");
+ register void *__sp asm(_ASM_SP);
+
+#ifdef CONFIG_X86_32
+ asm volatile (
+ "pushfl\n\t"
+ "pushl %%cs\n\t"
+ "pushl $1f\n\t"
+ "iret\n\t"
+ "1:"
+ : "+r" (__sp) : : "memory");
#else
- /*
- * CPUID is a barrier to speculative execution.
- * Prefetched instructions are automatically
- * invalidated when modified.
- */
- asm volatile("cpuid"
- : "=a" (tmp)
- : "0" (1)
- : "ebx", "ecx", "edx", "memory");
+ unsigned int tmp;
+
+ asm volatile (
+ "mov %%ss, %0\n\t"
+ "pushq %q0\n\t"
+ "pushq %%rsp\n\t"
+ "addq $8, (%%rsp)\n\t"
+ "pushfq\n\t"
+ "mov %%cs, %0\n\t"
+ "pushq %q0\n\t"
+ "pushq $1f\n\t"
+ "iretq\n\t"
+ "1:"
+ : "=&r" (tmp), "+r" (__sp) : : "cc", "memory");
#endif
}
--
2.9.3
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [tip:x86/urgent] x86/asm: Rewrite sync_core() to use IRET-to-self
2016-12-09 18:24 ` [PATCH v4 4/4] x86/asm: Rewrite sync_core() to use IRET-to-self Andy Lutomirski
@ 2016-12-19 11:05 ` tip-bot for Andy Lutomirski
0 siblings, 0 replies; 10+ messages in thread
From: tip-bot for Andy Lutomirski @ 2016-12-19 11:05 UTC (permalink / raw)
To: linux-tip-commits
Cc: tglx, jgross, Xen-devel, tedheadster, peterz, brgerst, gnomes,
bp, boris.ostrovsky, luto, andrew.cooper3, linux-kernel, hmh,
hpa, mingo
Commit-ID: c198b121b1a1d7a7171770c634cd49191bac4477
Gitweb: http://git.kernel.org/tip/c198b121b1a1d7a7171770c634cd49191bac4477
Author: Andy Lutomirski <luto@kernel.org>
AuthorDate: Fri, 9 Dec 2016 10:24:08 -0800
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 19 Dec 2016 11:54:21 +0100
x86/asm: Rewrite sync_core() to use IRET-to-self
Aside from being excessively slow, CPUID is problematic: Linux runs
on a handful of CPUs that don't have CPUID. Use IRET-to-self
instead. IRET-to-self works everywhere, so it makes testing easy.
For reference, On my laptop, IRET-to-self is ~110ns,
CPUID(eax=1, ecx=0) is ~83ns on native and very very slow under KVM,
and MOV-to-CR2 is ~42ns.
While we're at it: sync_core() serves a very specific purpose.
Document it.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Matthew Whitehead <tedheadster@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: xen-devel <Xen-devel@lists.xen.org>
Link: http://lkml.kernel.org/r/5c79f0225f68bc8c40335612bf624511abb78941.1481307769.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/include/asm/processor.h | 80 +++++++++++++++++++++++++++++-----------
1 file changed, 58 insertions(+), 22 deletions(-)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index b934871..eaf1005 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -602,33 +602,69 @@ static __always_inline void cpu_relax(void)
rep_nop();
}
-/* Stop speculative execution and prefetching of modified code. */
+/*
+ * This function forces the icache and prefetched instruction stream to
+ * catch up with reality in two very specific cases:
+ *
+ * a) Text was modified using one virtual address and is about to be executed
+ * from the same physical page at a different virtual address.
+ *
+ * b) Text was modified on a different CPU, may subsequently be
+ * executed on this CPU, and you want to make sure the new version
+ * gets executed. This generally means you're calling this in a IPI.
+ *
+ * If you're calling this for a different reason, you're probably doing
+ * it wrong.
+ */
static inline void sync_core(void)
{
- int tmp;
-
-#ifdef CONFIG_X86_32
/*
- * Do a CPUID if available, otherwise do a jump. The jump
- * can conveniently enough be the jump around CPUID.
+ * There are quite a few ways to do this. IRET-to-self is nice
+ * because it works on every CPU, at any CPL (so it's compatible
+ * with paravirtualization), and it never exits to a hypervisor.
+ * The only down sides are that it's a bit slow (it seems to be
+ * a bit more than 2x slower than the fastest options) and that
+ * it unmasks NMIs. The "push %cs" is needed because, in
+ * paravirtual environments, __KERNEL_CS may not be a valid CS
+ * value when we do IRET directly.
+ *
+ * In case NMI unmasking or performance ever becomes a problem,
+ * the next best option appears to be MOV-to-CR2 and an
+ * unconditional jump. That sequence also works on all CPUs,
+ * but it will fault at CPL3 (i.e. Xen PV and lguest).
+ *
+ * CPUID is the conventional way, but it's nasty: it doesn't
+ * exist on some 486-like CPUs, and it usually exits to a
+ * hypervisor.
+ *
+ * Like all of Linux's memory ordering operations, this is a
+ * compiler barrier as well.
*/
- asm volatile("cmpl %2,%1\n\t"
- "jl 1f\n\t"
- "cpuid\n"
- "1:"
- : "=a" (tmp)
- : "rm" (boot_cpu_data.cpuid_level), "ri" (0), "0" (1)
- : "ebx", "ecx", "edx", "memory");
+ register void *__sp asm(_ASM_SP);
+
+#ifdef CONFIG_X86_32
+ asm volatile (
+ "pushfl\n\t"
+ "pushl %%cs\n\t"
+ "pushl $1f\n\t"
+ "iret\n\t"
+ "1:"
+ : "+r" (__sp) : : "memory");
#else
- /*
- * CPUID is a barrier to speculative execution.
- * Prefetched instructions are automatically
- * invalidated when modified.
- */
- asm volatile("cpuid"
- : "=a" (tmp)
- : "0" (1)
- : "ebx", "ecx", "edx", "memory");
+ unsigned int tmp;
+
+ asm volatile (
+ "mov %%ss, %0\n\t"
+ "pushq %q0\n\t"
+ "pushq %%rsp\n\t"
+ "addq $8, (%%rsp)\n\t"
+ "pushfq\n\t"
+ "mov %%cs, %0\n\t"
+ "pushq %q0\n\t"
+ "pushq $1f\n\t"
+ "iretq\n\t"
+ "1:"
+ : "=&r" (tmp), "+r" (__sp) : : "cc", "memory");
#endif
}
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements
2016-12-09 18:24 [PATCH v4 0/4] CPUID-less CPU/sync_core fixes and improvements Andy Lutomirski
` (3 preceding siblings ...)
2016-12-09 18:24 ` [PATCH v4 4/4] x86/asm: Rewrite sync_core() to use IRET-to-self Andy Lutomirski
@ 2016-12-15 18:06 ` Andy Lutomirski
4 siblings, 0 replies; 10+ messages in thread
From: Andy Lutomirski @ 2016-12-15 18:06 UTC (permalink / raw)
To: Andy Lutomirski
Cc: X86 ML, One Thousand Gnomes, Borislav Petkov, linux-kernel,
Brian Gerst, Matthew Whitehead, Henrique de Moraes Holschuh,
Peter Zijlstra, xen-devel, Juergen Gross, Boris Ostrovsky,
Andrew Cooper
On Fri, Dec 9, 2016 at 10:24 AM, Andy Lutomirski <luto@kernel.org> wrote:
> *** PATCHES 1 and 2 MAY BE 4.9 MATERIAL ***
>
> Alan Cox pointed out that the 486 isn't the only supported CPU that
> doesn't have CPUID. Let's clean up the mess and make everything
> faster while we're at it.
>
Ping? Any chance of getting these in for 4.10?
--Andy
^ permalink raw reply [flat|nested] 10+ messages in thread