* [PATCH v6 0/5] x86: Enable LKGS instruction
@ 2023-01-12  7:20 Xin Li
  2023-01-12  7:20 ` [PATCH v6 1/5] x86/cpufeature: add the cpu feature bit for LKGS Xin Li
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Xin Li @ 2023-01-12  7:20 UTC (permalink / raw)
  To: linux-kernel, x86
  Cc: tglx, mingo, bp, dave.hansen, hpa, peterz, brgerst,
	chang.seok.bae, jgross

The LKGS instruction is introduced with the Intel FRED (flexible return and
event delivery) specification. As LKGS is independent of FRED, we enable it
as a standalone CPU feature.

LKGS behaves like the MOV to GS instruction except that it loads the base
address into the IA32_KERNEL_GS_BASE MSR instead of the GS segment’s
descriptor cache, which is exactly what the Linux kernel does to load the
user-level GS base.  Thus, with LKGS, there is no need to SWAPGS away from
the kernel GS base.
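
As an illustration only (not part of the series), the equivalence described
above can be sketched with a toy C model of the GS state. The struct, the
function names and the desc_base() lookup are assumptions made up for this
sketch; the legacy sequence (SWAPGS, MOV to GS, SWAPGS) reflects my reading
of the existing asm_load_gs_index() path.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the GS-related CPU state; not kernel code. */
struct gs_state {
	uint16_t gs_sel;         /* visible GS selector */
	uint64_t gs_base;        /* active GS base (descriptor cache) */
	uint64_t kernel_gs_base; /* IA32_KERNEL_GS_BASE MSR */
};

/* Hypothetical descriptor lookup: derive a base from the selector index. */
static uint64_t desc_base(uint16_t sel)
{
	return (uint64_t)(sel >> 3) * 0x1000;
}

/* SWAPGS exchanges the active GS base with the MSR value. */
static void model_swapgs(struct gs_state *s)
{
	uint64_t tmp = s->gs_base;

	s->gs_base = s->kernel_gs_base;
	s->kernel_gs_base = tmp;
}

/* MOV to GS loads the selector and the base into the descriptor cache. */
static void model_mov_to_gs(struct gs_state *s, uint16_t sel)
{
	s->gs_sel = sel;
	s->gs_base = desc_base(sel);
}

/* Legacy path: swap to the user base, load GS, swap back. */
static void model_legacy_load_gs_index(struct gs_state *s, uint16_t sel)
{
	model_swapgs(s);
	model_mov_to_gs(s, sel);
	model_swapgs(s);
}

/* LKGS: same selector load, but the base goes straight into the MSR. */
static void model_lkgs(struct gs_state *s, uint16_t sel)
{
	s->gs_sel = sel;
	s->kernel_gs_base = desc_base(sel);
}

/* Both paths should leave the modelled state identical. */
static void check_lkgs_matches_legacy(uint16_t sel)
{
	struct gs_state a = { 0, 0xffff888000000000ULL, 0x7f0000000000ULL };
	struct gs_state b = a;

	model_legacy_load_gs_index(&a, sel);
	model_lkgs(&b, sel);
	assert(a.gs_sel == b.gs_sel);
	assert(a.gs_base == b.gs_base);
	assert(a.kernel_gs_base == b.kernel_gs_base);
}
```

In this model the kernel GS base is never swapped away on the LKGS path,
which is the point of the instruction; the model says nothing about
faults or interrupt handling.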

Changes since v5:
* Recommend searching for the latest FRED spec instead of providing
  a FRED spec URL, which is likely to be unstable (Borislav Petkov).
* Remove reviewers' SOBs (Borislav Petkov).

Changes since v4:
* Clear the LKGS feature from Xen PV guests (Juergen Gross).

Changes since v3:
* We want less ASM not more, thus keep local_irq_{save,restore}() inside
  native_load_gs_index() (Thomas Gleixner).
* For paravirt enabled kernels, initialize pv_ops.cpu.load_gs_index to
  native_lkgs (Thomas Gleixner).

Changes since v2:
* Add "" so that "lkgs" is not shown in /proc/cpuinfo (Chang S. Bae).
* Mark DI as input and output (+D) as in v1, since the exception handler
  modifies it (Brian Gerst).

Changes since v1:
* Use EX_TYPE_ZERO_REG instead of fixup code in the obsolete .fixup code
  section (Peter Zijlstra).
* Add a comment that states the LKGS_DI macro will be replaced with "lkgs %di"
  once binutils supports the LKGS instruction (Peter Zijlstra).

H. Peter Anvin (Intel) (5):
  x86/cpufeature: add the cpu feature bit for LKGS
  x86/opcode: add the LKGS instruction to x86-opcode-map
  x86/gsseg: make asm_load_gs_index() take an u16
  x86/gsseg: move load_gs_index() to its own new header file
  x86/gsseg: use the LKGS instruction if available for load_gs_index()

 arch/x86/entry/entry_64.S                |  2 +-
 arch/x86/include/asm/cpufeatures.h       |  1 +
 arch/x86/include/asm/gsseg.h             | 66 ++++++++++++++++++++++++
 arch/x86/include/asm/mmu_context.h       |  1 +
 arch/x86/include/asm/special_insns.h     | 21 --------
 arch/x86/kernel/cpu/common.c             |  1 +
 arch/x86/kernel/paravirt.c               |  1 +
 arch/x86/kernel/signal_32.c              |  1 +
 arch/x86/kernel/tls.c                    |  1 +
 arch/x86/lib/x86-opcode-map.txt          |  1 +
 arch/x86/xen/enlighten_pv.c              |  1 +
 tools/arch/x86/include/asm/cpufeatures.h |  1 +
 tools/arch/x86/lib/x86-opcode-map.txt    |  1 +
 13 files changed, 77 insertions(+), 22 deletions(-)
 create mode 100644 arch/x86/include/asm/gsseg.h

-- 
2.34.1



* [PATCH v6 1/5] x86/cpufeature: add the cpu feature bit for LKGS
  2023-01-12  7:20 [PATCH v6 0/5] x86: Enable LKGS instruction Xin Li
@ 2023-01-12  7:20 ` Xin Li
  2023-01-12 12:16   ` [tip: x86/cpu] x86/cpufeature: Add the CPU " tip-bot2 for H. Peter Anvin (Intel)
  2023-01-12  7:20 ` [PATCH v6 2/5] x86/opcode: add the LKGS instruction to x86-opcode-map Xin Li
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Xin Li @ 2023-01-12  7:20 UTC (permalink / raw)
  To: linux-kernel, x86
  Cc: tglx, mingo, bp, dave.hansen, hpa, peterz, brgerst,
	chang.seok.bae, jgross

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

Add the CPU feature bit for LKGS (Load "Kernel" GS).

The LKGS instruction is introduced with the Intel FRED (flexible return
and event delivery) specification. Search for the latest FRED spec in
most search engines with this search pattern:

  site:intel.com FRED (flexible return and event delivery) specification

LKGS behaves like the MOV to GS instruction except that it loads
the base address into the IA32_KERNEL_GS_BASE MSR instead of the
GS segment’s descriptor cache, which is exactly what the Linux kernel
does to load a user-level GS base.  Thus, with LKGS, there is no
need to SWAPGS away from the kernel GS base.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---

Changes since v5:
* Recommend searching for the latest FRED spec instead of providing
  a FRED spec URL, which is likely to be unstable (Borislav Petkov).

Changes since v2:
* Add "" so that "lkgs" is not shown in /proc/cpuinfo (Chang S. Bae).
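
For readers unfamiliar with the (word*32 + bit) encoding used in
cpufeatures.h, here is a minimal sketch of how such a feature number
decodes. As far as I know, word 12 holds CPUID.(EAX=7,ECX=1):EAX, so this
is bit 18 of that register. FEATURE_LKGS, SKETCH_NCAPINTS and the helper
names are invented for the sketch and are not kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Same encoding as in the patch: capability word 12, bit 18. */
#define FEATURE_LKGS (12 * 32 + 18)

/* Hypothetical number of 32-bit capability words, for this sketch only. */
#define SKETCH_NCAPINTS 32

/* Index of the 32-bit capability word that holds the feature. */
static unsigned int feature_word(unsigned int feature)
{
	return feature / 32;
}

/* Bit position of the feature inside its capability word. */
static unsigned int feature_bit(unsigned int feature)
{
	return feature % 32;
}

/* Test one feature bit in an array of capability words. */
static int sketch_has_feature(const uint32_t caps[SKETCH_NCAPINTS],
			      unsigned int feature)
{
	assert(feature < SKETCH_NCAPINTS * 32);
	return (caps[feature_word(feature)] >> feature_bit(feature)) & 1;
}
```

With caps[12] set to 1 << 18, sketch_has_feature(caps, FEATURE_LKGS)
returns 1. The kernel's own cpu_feature_enabled() does more than this
(compile-time disabling and alternatives patching, for example).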
---
 arch/x86/include/asm/cpufeatures.h       | 1 +
 tools/arch/x86/include/asm/cpufeatures.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 61012476d66e..4d93c60407fe 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -312,6 +312,7 @@
 #define X86_FEATURE_AVX_VNNI		(12*32+ 4) /* AVX VNNI instructions */
 #define X86_FEATURE_AVX512_BF16		(12*32+ 5) /* AVX512 BFLOAT16 instructions */
 #define X86_FEATURE_CMPCCXADD           (12*32+ 7) /* "" CMPccXADD instructions */
+#define X86_FEATURE_LKGS		(12*32+18) /* "" Load "kernel" (userspace) gs */
 #define X86_FEATURE_AMX_FP16		(12*32+21) /* "" AMX fp16 Support */
 #define X86_FEATURE_AVX_IFMA            (12*32+23) /* "" Support for VPMADD52[H,L]UQ */
 
diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
index 61012476d66e..4d93c60407fe 100644
--- a/tools/arch/x86/include/asm/cpufeatures.h
+++ b/tools/arch/x86/include/asm/cpufeatures.h
@@ -312,6 +312,7 @@
 #define X86_FEATURE_AVX_VNNI		(12*32+ 4) /* AVX VNNI instructions */
 #define X86_FEATURE_AVX512_BF16		(12*32+ 5) /* AVX512 BFLOAT16 instructions */
 #define X86_FEATURE_CMPCCXADD           (12*32+ 7) /* "" CMPccXADD instructions */
+#define X86_FEATURE_LKGS		(12*32+18) /* "" Load "kernel" (userspace) gs */
 #define X86_FEATURE_AMX_FP16		(12*32+21) /* "" AMX fp16 Support */
 #define X86_FEATURE_AVX_IFMA            (12*32+23) /* "" Support for VPMADD52[H,L]UQ */
 
-- 
2.34.1



* [PATCH v6 2/5] x86/opcode: add the LKGS instruction to x86-opcode-map
  2023-01-12  7:20 [PATCH v6 0/5] x86: Enable LKGS instruction Xin Li
  2023-01-12  7:20 ` [PATCH v6 1/5] x86/cpufeature: add the cpu feature bit for LKGS Xin Li
@ 2023-01-12  7:20 ` Xin Li
  2023-01-12 12:16   ` [tip: x86/cpu] x86/opcode: Add " tip-bot2 for H. Peter Anvin (Intel)
  2023-01-12  7:20 ` [PATCH v6 3/5] x86/gsseg: make asm_load_gs_index() take an u16 Xin Li
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Xin Li @ 2023-01-12  7:20 UTC (permalink / raw)
  To: linux-kernel, x86
  Cc: tglx, mingo, bp, dave.hansen, hpa, peterz, brgerst,
	chang.seok.bae, jgross

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

Add the instruction opcode used by LKGS to x86-opcode-map.

The opcode number is per the public FRED draft spec v3.0.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---
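
A hedged sketch of how the "6: LKGS Ew (F2)" map entry corresponds to
actual bytes: Grp6 is opcode 0F 00, the F2 prefix is mandatory, and /6 goes
into the reg field of the ModRM byte. With a register-direct operand of
%di (register number 7) this yields F2 0F 00 F7, the same bytes as the
LKGS_DI macro in patch 5/5. The helper names below are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ModRM byte for a register-direct operand: mod = 11b, then reg and rm. */
static uint8_t modrm_reg_direct(unsigned int reg, unsigned int rm)
{
	assert(reg < 8 && rm < 8);
	return (uint8_t)(0xC0 | (reg << 3) | rm);
}

/* Emit "F2 0F 00 /6" with a register operand; returns the length (4). */
static size_t encode_lkgs_reg(uint8_t out[4], unsigned int rm)
{
	out[0] = 0xF2;                    /* mandatory prefix, "(F2)" in the map */
	out[1] = 0x0F;                    /* two-byte opcode escape */
	out[2] = 0x00;                    /* Grp6 opcode */
	out[3] = modrm_reg_direct(6, rm); /* /6 selects LKGS within Grp6 */
	return 4;
}
```

This only covers the register-direct form without REX or other prefixes;
memory operands would need the usual ModRM/SIB/displacement handling.
---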
 arch/x86/lib/x86-opcode-map.txt       | 1 +
 tools/arch/x86/lib/x86-opcode-map.txt | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt
index d12d1358f96d..5168ee0360b2 100644
--- a/arch/x86/lib/x86-opcode-map.txt
+++ b/arch/x86/lib/x86-opcode-map.txt
@@ -1047,6 +1047,7 @@ GrpTable: Grp6
 3: LTR Ew
 4: VERR Ew
 5: VERW Ew
+6: LKGS Ew (F2)
 EndTable
 
 GrpTable: Grp7
diff --git a/tools/arch/x86/lib/x86-opcode-map.txt b/tools/arch/x86/lib/x86-opcode-map.txt
index d12d1358f96d..5168ee0360b2 100644
--- a/tools/arch/x86/lib/x86-opcode-map.txt
+++ b/tools/arch/x86/lib/x86-opcode-map.txt
@@ -1047,6 +1047,7 @@ GrpTable: Grp6
 3: LTR Ew
 4: VERR Ew
 5: VERW Ew
+6: LKGS Ew (F2)
 EndTable
 
 GrpTable: Grp7
-- 
2.34.1



* [PATCH v6 3/5] x86/gsseg: make asm_load_gs_index() take an u16
  2023-01-12  7:20 [PATCH v6 0/5] x86: Enable LKGS instruction Xin Li
  2023-01-12  7:20 ` [PATCH v6 1/5] x86/cpufeature: add the cpu feature bit for LKGS Xin Li
  2023-01-12  7:20 ` [PATCH v6 2/5] x86/opcode: add the LKGS instruction to x86-opcode-map Xin Li
@ 2023-01-12  7:20 ` Xin Li
  2023-01-12 12:16   ` [tip: x86/cpu] x86/gsseg: Make " tip-bot2 for H. Peter Anvin (Intel)
  2023-01-12  7:20 ` [PATCH v6 4/5] x86/gsseg: move load_gs_index() to its own new header file Xin Li
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Xin Li @ 2023-01-12  7:20 UTC (permalink / raw)
  To: linux-kernel, x86
  Cc: tglx, mingo, bp, dave.hansen, hpa, peterz, brgerst,
	chang.seok.bae, jgross

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

Let GCC know that only the low 16 bits of the load_gs_index() argument
actually matter. This might allow it to generate slightly better
code. However, do not propagate this into the prototypes of functions
that end up being paravirtualized, to avoid unnecessary changes.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---
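
To illustrate why only the low 16 bits matter, a small sketch of how a
segment selector is laid out (index in bits 15:3, table indicator in bit 2,
RPL in bits 1:0). The helper names are invented for this sketch and do not
exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Only the low 16 bits of the argument form the selector. */
static uint16_t selector_from_arg(unsigned int arg)
{
	return (uint16_t)arg;
}

/* Build a selector from its three fields. */
static uint16_t selector_make(unsigned int index, unsigned int ti,
			      unsigned int rpl)
{
	assert(index < 8192 && ti < 2 && rpl < 4);
	return (uint16_t)((index << 3) | (ti << 2) | rpl);
}

/* Descriptor table index, bits 15:3. */
static unsigned int selector_index(uint16_t sel)
{
	return sel >> 3;
}

/* Table indicator, bit 2: 0 = GDT, 1 = LDT. */
static unsigned int selector_ti(uint16_t sel)
{
	return (sel >> 2) & 1;
}

/* Requested privilege level, bits 1:0. */
static unsigned int selector_rpl(uint16_t sel)
{
	return sel & 3;
}
```

For example, selector_from_arg(0xdead002b) and selector_make(5, 0, 3) both
give 0x2b; any upper bits of the unsigned int argument are simply dropped.
---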
 arch/x86/entry/entry_64.S            | 2 +-
 arch/x86/include/asm/special_insns.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 15739a2c0983..7ecd2aeeeffc 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -782,7 +782,7 @@ _ASM_NOKPROBE(common_interrupt_return)
 
 /*
  * Reload gs selector with exception handling
- * edi:  new selector
+ *  di:  new selector
  *
  * Is in entry.text as it shouldn't be instrumented.
  */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 35f709f619fb..a71d0e8d4684 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -120,7 +120,7 @@ static inline void native_wbinvd(void)
 	asm volatile("wbinvd": : :"memory");
 }
 
-extern asmlinkage void asm_load_gs_index(unsigned int selector);
+extern asmlinkage void asm_load_gs_index(u16 selector);
 
 static inline void native_load_gs_index(unsigned int selector)
 {
-- 
2.34.1



* [PATCH v6 4/5] x86/gsseg: move load_gs_index() to its own new header file
  2023-01-12  7:20 [PATCH v6 0/5] x86: Enable LKGS instruction Xin Li
                   ` (2 preceding siblings ...)
  2023-01-12  7:20 ` [PATCH v6 3/5] x86/gsseg: make asm_load_gs_index() take an u16 Xin Li
@ 2023-01-12  7:20 ` Xin Li
  2023-01-12 12:16   ` [tip: x86/cpu] x86/gsseg: Move " tip-bot2 for H. Peter Anvin (Intel)
  2023-01-12  7:20 ` [PATCH v6 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index() Xin Li
  2023-01-12 12:13 ` [PATCH v6 0/5] x86: Enable LKGS instruction Ingo Molnar
  5 siblings, 1 reply; 15+ messages in thread
From: Xin Li @ 2023-01-12  7:20 UTC (permalink / raw)
  To: linux-kernel, x86
  Cc: tglx, mingo, bp, dave.hansen, hpa, peterz, brgerst,
	chang.seok.bae, jgross

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

GS is a special segment on x86_64; move load_gs_index() to its own new
header file to simplify header inclusion.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---
 arch/x86/include/asm/gsseg.h         | 41 ++++++++++++++++++++++++++++
 arch/x86/include/asm/mmu_context.h   |  1 +
 arch/x86/include/asm/special_insns.h | 21 --------------
 arch/x86/kernel/paravirt.c           |  1 +
 arch/x86/kernel/signal_32.c          |  1 +
 arch/x86/kernel/tls.c                |  1 +
 6 files changed, 45 insertions(+), 21 deletions(-)
 create mode 100644 arch/x86/include/asm/gsseg.h

diff --git a/arch/x86/include/asm/gsseg.h b/arch/x86/include/asm/gsseg.h
new file mode 100644
index 000000000000..d15577c39e8d
--- /dev/null
+++ b/arch/x86/include/asm/gsseg.h
@@ -0,0 +1,41 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef _ASM_X86_GSSEG_H
+#define _ASM_X86_GSSEG_H
+
+#include <linux/types.h>
+
+#include <asm/asm.h>
+#include <asm/cpufeature.h>
+#include <asm/alternative.h>
+#include <asm/processor.h>
+#include <asm/nops.h>
+
+#ifdef CONFIG_X86_64
+
+extern asmlinkage void asm_load_gs_index(u16 selector);
+
+static inline void native_load_gs_index(unsigned int selector)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	asm_load_gs_index(selector);
+	local_irq_restore(flags);
+}
+
+#endif /* CONFIG_X86_64 */
+
+#ifndef CONFIG_PARAVIRT_XXL
+
+static inline void load_gs_index(unsigned int selector)
+{
+#ifdef CONFIG_X86_64
+	native_load_gs_index(selector);
+#else
+	loadsegment(gs, selector);
+#endif
+}
+
+#endif /* CONFIG_PARAVIRT_XXL */
+
+#endif /* _ASM_X86_GSSEG_H */
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index b8d40ddeab00..e01aa74a6de7 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -12,6 +12,7 @@
 #include <asm/tlbflush.h>
 #include <asm/paravirt.h>
 #include <asm/debugreg.h>
+#include <asm/gsseg.h>
 
 extern atomic64_t last_mm_ctx_id;
 
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index a71d0e8d4684..cfd9499b617c 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -120,17 +120,6 @@ static inline void native_wbinvd(void)
 	asm volatile("wbinvd": : :"memory");
 }
 
-extern asmlinkage void asm_load_gs_index(u16 selector);
-
-static inline void native_load_gs_index(unsigned int selector)
-{
-	unsigned long flags;
-
-	local_irq_save(flags);
-	asm_load_gs_index(selector);
-	local_irq_restore(flags);
-}
-
 static inline unsigned long __read_cr4(void)
 {
 	return native_read_cr4();
@@ -184,16 +173,6 @@ static inline void wbinvd(void)
 	native_wbinvd();
 }
 
-
-static inline void load_gs_index(unsigned int selector)
-{
-#ifdef CONFIG_X86_64
-	native_load_gs_index(selector);
-#else
-	loadsegment(gs, selector);
-#endif
-}
-
 #endif /* CONFIG_PARAVIRT_XXL */
 
 static inline void clflush(volatile void *__p)
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 327757afb027..bdc886c3f13a 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -32,6 +32,7 @@
 #include <asm/special_insns.h>
 #include <asm/tlb.h>
 #include <asm/io_bitmap.h>
+#include <asm/gsseg.h>
 
 /*
  * nop stub, which must not clobber anything *including the stack* to
diff --git a/arch/x86/kernel/signal_32.c b/arch/x86/kernel/signal_32.c
index 2553136cf39b..bb4f3f3b1c84 100644
--- a/arch/x86/kernel/signal_32.c
+++ b/arch/x86/kernel/signal_32.c
@@ -31,6 +31,7 @@
 #include <asm/sigframe.h>
 #include <asm/sighandling.h>
 #include <asm/smap.h>
+#include <asm/gsseg.h>
 
 #ifdef CONFIG_IA32_EMULATION
 #include <asm/ia32_unistd.h>
diff --git a/arch/x86/kernel/tls.c b/arch/x86/kernel/tls.c
index 3c883e064242..3ffbab0081f4 100644
--- a/arch/x86/kernel/tls.c
+++ b/arch/x86/kernel/tls.c
@@ -12,6 +12,7 @@
 #include <asm/ldt.h>
 #include <asm/processor.h>
 #include <asm/proto.h>
+#include <asm/gsseg.h>
 
 #include "tls.h"
 
-- 
2.34.1



* [PATCH v6 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index()
  2023-01-12  7:20 [PATCH v6 0/5] x86: Enable LKGS instruction Xin Li
                   ` (3 preceding siblings ...)
  2023-01-12  7:20 ` [PATCH v6 4/5] x86/gsseg: move load_gs_index() to its own new header file Xin Li
@ 2023-01-12  7:20 ` Xin Li
  2023-01-13  9:36   ` [tip: x86/cpu] x86/gsseg: Use " tip-bot2 for H. Peter Anvin (Intel)
  2023-01-12 12:13 ` [PATCH v6 0/5] x86: Enable LKGS instruction Ingo Molnar
  5 siblings, 1 reply; 15+ messages in thread
From: Xin Li @ 2023-01-12  7:20 UTC (permalink / raw)
  To: linux-kernel, x86
  Cc: tglx, mingo, bp, dave.hansen, hpa, peterz, brgerst,
	chang.seok.bae, jgross

From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

The LKGS instruction atomically loads a segment descriptor into the
%gs descriptor registers, *except* that %gs.base is unchanged, and the
base is instead loaded into MSR_IA32_KERNEL_GS_BASE, which is exactly
what we want this function to do.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---

Changes since v5:
* Remove reviewers' SOBs (Borislav Petkov).

Changes since v4:
* Clear the LKGS feature from Xen PV guests (Juergen Gross).

Changes since v3:
* We want less ASM not more, thus keep local_irq_{save,restore}() inside
  native_load_gs_index() (Thomas Gleixner).
* For paravirt enabled kernels, initialize pv_ops.cpu.load_gs_index to
  native_lkgs (Thomas Gleixner).

Changes since v2:
* Mark DI as input and output (+D) as in v1, since the exception handler
  modifies it (Brian Gerst).

Changes since v1:
* Use EX_TYPE_ZERO_REG instead of fixup code in the obsolete .fixup code
  section (Peter Zijlstra).
* Add a comment that states the LKGS_DI macro will be replaced with "lkgs %di"
  once binutils supports the LKGS instruction (Peter Zijlstra).
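
A rough model, in plain C, of how I read the EX_TYPE_ZERO_REG entry used
below: the fixup address is the faulting instruction itself (1b) and the
handler zeroes the named register, so a faulting LKGS is retried with the
null selector. That is also why the register is marked as input and output
("+D"). The validity rule and all names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical validity rule: only these selectors load without faulting. */
static bool sketch_selector_is_loadable(uint16_t sel)
{
	return sel == 0 || sel == 0x2b;
}

/*
 * Model of "1: lkgs %di" with an exception-table entry whose fixup address
 * is the faulting instruction itself and whose type zeroes the register.
 * Returns the number of faults taken; *sel is both input and output.
 */
static unsigned int model_lkgs_with_zero_fixup(uint16_t *sel, uint16_t *loaded)
{
	unsigned int faults = 0;

	while (!sketch_selector_is_loadable(*sel)) {
		faults++;
		*sel = 0;	/* handler zeroes the register, resumes at 1b */
	}
	*loaded = *sel;
	assert(faults <= 1);	/* the null selector always loads */
	return faults;
}
```

A bad selector such as 0x1234 takes one fault in this model and ends up
loading 0, while a good selector loads unchanged with no fault.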
---
 arch/x86/include/asm/gsseg.h | 33 +++++++++++++++++++++++++++++----
 arch/x86/kernel/cpu/common.c |  1 +
 arch/x86/xen/enlighten_pv.c  |  1 +
 3 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/gsseg.h b/arch/x86/include/asm/gsseg.h
index d15577c39e8d..ab6a595cea70 100644
--- a/arch/x86/include/asm/gsseg.h
+++ b/arch/x86/include/asm/gsseg.h
@@ -14,17 +14,42 @@
 
 extern asmlinkage void asm_load_gs_index(u16 selector);
 
+/* Replace with "lkgs %di" once binutils support LKGS instruction */
+#define LKGS_DI _ASM_BYTES(0xf2,0x0f,0x00,0xf7)
+
+static inline void native_lkgs(unsigned int selector)
+{
+	u16 sel = selector;
+	asm_inline volatile("1: " LKGS_DI
+			    _ASM_EXTABLE_TYPE_REG(1b, 1b, EX_TYPE_ZERO_REG, %k[sel])
+			    : [sel] "+D" (sel));
+}
+
 static inline void native_load_gs_index(unsigned int selector)
 {
-	unsigned long flags;
+	if (cpu_feature_enabled(X86_FEATURE_LKGS)) {
+		native_lkgs(selector);
+	} else {
+		unsigned long flags;
 
-	local_irq_save(flags);
-	asm_load_gs_index(selector);
-	local_irq_restore(flags);
+		local_irq_save(flags);
+		asm_load_gs_index(selector);
+		local_irq_restore(flags);
+	}
 }
 
 #endif /* CONFIG_X86_64 */
 
+static inline void __init lkgs_init(void)
+{
+#ifdef CONFIG_PARAVIRT_XXL
+#ifdef CONFIG_X86_64
+	if (cpu_feature_enabled(X86_FEATURE_LKGS))
+		pv_ops.cpu.load_gs_index = native_lkgs;
+#endif
+#endif
+}
+
 #ifndef CONFIG_PARAVIRT_XXL
 
 static inline void load_gs_index(unsigned int selector)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 9cfca3d7d0e2..b7ac85a1e5df 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1960,6 +1960,7 @@ void __init identify_boot_cpu(void)
 	setup_cr_pinning();
 
 	tsx_init();
+	lkgs_init();
 }
 
 void identify_secondary_cpu(struct cpuinfo_x86 *c)
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 5b1379662877..ce2f19ee4bfc 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -276,6 +276,7 @@ static void __init xen_init_capabilities(void)
 	setup_clear_cpu_cap(X86_FEATURE_ACC);
 	setup_clear_cpu_cap(X86_FEATURE_X2APIC);
 	setup_clear_cpu_cap(X86_FEATURE_SME);
+	setup_clear_cpu_cap(X86_FEATURE_LKGS);
 
 	/*
 	 * Xen PV would need some work to support PCID: CR3 handling as well
-- 
2.34.1



* Re: [PATCH v6 0/5] x86: Enable LKGS instruction
  2023-01-12  7:20 [PATCH v6 0/5] x86: Enable LKGS instruction Xin Li
                   ` (4 preceding siblings ...)
  2023-01-12  7:20 ` [PATCH v6 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index() Xin Li
@ 2023-01-12 12:13 ` Ingo Molnar
  2023-01-12 14:57   ` Peter Zijlstra
  5 siblings, 1 reply; 15+ messages in thread
From: Ingo Molnar @ 2023-01-12 12:13 UTC (permalink / raw)
  To: Xin Li
  Cc: linux-kernel, x86, tglx, mingo, bp, dave.hansen, hpa, peterz,
	brgerst, chang.seok.bae, jgross


* Xin Li <xin3.li@intel.com> wrote:

> LKGS instruction is introduced with Intel FRED (flexible return and event 
> delivery) specification. As LKGS is independent of FRED, we enable it as 
> a standalone CPU feature.
> 
> LKGS behaves like the MOV to GS instruction except that it loads the base 
> address into the IA32_KERNEL_GS_BASE MSR instead of the GS segment’s 
> descriptor cache, which is exactly what Linux kernel does to load user 
> level GS base.  Thus, with LKGS, there is no need to SWAPGS away from the 
> kernel GS base.

Ok, this looks good to me.

I've applied the first 4 patches to tip:x86/cpu, as the instruction exists 
in a public document and these patches are fine stand-alone as well, such 
as the factoring out of load_gs_index() methods from a high-use low level 
header into a new header file.

Planning to apply the final LKGS enabler patch as well, unless there are
any objections from others?

Thanks,

	Ingo


* [tip: x86/cpu] x86/gsseg: Move load_gs_index() to its own new header file
  2023-01-12  7:20 ` [PATCH v6 4/5] x86/gsseg: move load_gs_index() to its own new header file Xin Li
@ 2023-01-12 12:16   ` tip-bot2 for H. Peter Anvin (Intel)
  0 siblings, 0 replies; 15+ messages in thread
From: tip-bot2 for H. Peter Anvin (Intel) @ 2023-01-12 12:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: H. Peter Anvin (Intel), Xin Li, Ingo Molnar, x86, linux-kernel

The following commit has been merged into the x86/cpu branch of tip:

Commit-ID:     ae53fa18703000f507107df43efd1168a0365361
Gitweb:        https://git.kernel.org/tip/ae53fa18703000f507107df43efd1168a0365361
Author:        H. Peter Anvin (Intel) <hpa@zytor.com>
AuthorDate:    Wed, 11 Jan 2023 23:20:31 -08:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Thu, 12 Jan 2023 13:06:36 +01:00

x86/gsseg: Move load_gs_index() to its own new header file

GS is a special segment on x86_64; move load_gs_index() to its own new
header file to simplify header inclusion.

No change in functionality.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230112072032.35626-5-xin3.li@intel.com
---
 arch/x86/include/asm/gsseg.h         | 41 +++++++++++++++++++++++++++-
 arch/x86/include/asm/mmu_context.h   |  1 +-
 arch/x86/include/asm/special_insns.h | 21 +--------------
 arch/x86/kernel/paravirt.c           |  1 +-
 arch/x86/kernel/signal_32.c          |  1 +-
 arch/x86/kernel/tls.c                |  1 +-
 6 files changed, 45 insertions(+), 21 deletions(-)
 create mode 100644 arch/x86/include/asm/gsseg.h

diff --git a/arch/x86/include/asm/gsseg.h b/arch/x86/include/asm/gsseg.h
new file mode 100644
index 0000000..d15577c
--- /dev/null
+++ b/arch/x86/include/asm/gsseg.h
@@ -0,0 +1,41 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef _ASM_X86_GSSEG_H
+#define _ASM_X86_GSSEG_H
+
+#include <linux/types.h>
+
+#include <asm/asm.h>
+#include <asm/cpufeature.h>
+#include <asm/alternative.h>
+#include <asm/processor.h>
+#include <asm/nops.h>
+
+#ifdef CONFIG_X86_64
+
+extern asmlinkage void asm_load_gs_index(u16 selector);
+
+static inline void native_load_gs_index(unsigned int selector)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	asm_load_gs_index(selector);
+	local_irq_restore(flags);
+}
+
+#endif /* CONFIG_X86_64 */
+
+#ifndef CONFIG_PARAVIRT_XXL
+
+static inline void load_gs_index(unsigned int selector)
+{
+#ifdef CONFIG_X86_64
+	native_load_gs_index(selector);
+#else
+	loadsegment(gs, selector);
+#endif
+}
+
+#endif /* CONFIG_PARAVIRT_XXL */
+
+#endif /* _ASM_X86_GSSEG_H */
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index b8d40dd..e01aa74 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -12,6 +12,7 @@
 #include <asm/tlbflush.h>
 #include <asm/paravirt.h>
 #include <asm/debugreg.h>
+#include <asm/gsseg.h>
 
 extern atomic64_t last_mm_ctx_id;
 
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index a71d0e8..cfd9499 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -120,17 +120,6 @@ static inline void native_wbinvd(void)
 	asm volatile("wbinvd": : :"memory");
 }
 
-extern asmlinkage void asm_load_gs_index(u16 selector);
-
-static inline void native_load_gs_index(unsigned int selector)
-{
-	unsigned long flags;
-
-	local_irq_save(flags);
-	asm_load_gs_index(selector);
-	local_irq_restore(flags);
-}
-
 static inline unsigned long __read_cr4(void)
 {
 	return native_read_cr4();
@@ -184,16 +173,6 @@ static inline void wbinvd(void)
 	native_wbinvd();
 }
 
-
-static inline void load_gs_index(unsigned int selector)
-{
-#ifdef CONFIG_X86_64
-	native_load_gs_index(selector);
-#else
-	loadsegment(gs, selector);
-#endif
-}
-
 #endif /* CONFIG_PARAVIRT_XXL */
 
 static inline void clflush(volatile void *__p)
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 327757a..bdc886c 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -32,6 +32,7 @@
 #include <asm/special_insns.h>
 #include <asm/tlb.h>
 #include <asm/io_bitmap.h>
+#include <asm/gsseg.h>
 
 /*
  * nop stub, which must not clobber anything *including the stack* to
diff --git a/arch/x86/kernel/signal_32.c b/arch/x86/kernel/signal_32.c
index 2553136..bb4f3f3 100644
--- a/arch/x86/kernel/signal_32.c
+++ b/arch/x86/kernel/signal_32.c
@@ -31,6 +31,7 @@
 #include <asm/sigframe.h>
 #include <asm/sighandling.h>
 #include <asm/smap.h>
+#include <asm/gsseg.h>
 
 #ifdef CONFIG_IA32_EMULATION
 #include <asm/ia32_unistd.h>
diff --git a/arch/x86/kernel/tls.c b/arch/x86/kernel/tls.c
index 3c883e0..3ffbab0 100644
--- a/arch/x86/kernel/tls.c
+++ b/arch/x86/kernel/tls.c
@@ -12,6 +12,7 @@
 #include <asm/ldt.h>
 #include <asm/processor.h>
 #include <asm/proto.h>
+#include <asm/gsseg.h>
 
 #include "tls.h"
 


* [tip: x86/cpu] x86/gsseg: Make asm_load_gs_index() take an u16
  2023-01-12  7:20 ` [PATCH v6 3/5] x86/gsseg: make asm_load_gs_index() take an u16 Xin Li
@ 2023-01-12 12:16   ` tip-bot2 for H. Peter Anvin (Intel)
  0 siblings, 0 replies; 15+ messages in thread
From: tip-bot2 for H. Peter Anvin (Intel) @ 2023-01-12 12:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: H. Peter Anvin (Intel), Xin Li, Ingo Molnar, x86, linux-kernel

The following commit has been merged into the x86/cpu branch of tip:

Commit-ID:     df729fb05ae2db52f7de150439392a88ee9d9b4f
Gitweb:        https://git.kernel.org/tip/df729fb05ae2db52f7de150439392a88ee9d9b4f
Author:        H. Peter Anvin (Intel) <hpa@zytor.com>
AuthorDate:    Wed, 11 Jan 2023 23:20:30 -08:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Thu, 12 Jan 2023 13:06:36 +01:00

x86/gsseg: Make asm_load_gs_index() take an u16

Let GCC know that only the low 16 bits of the load_gs_index() argument
actually matter. This might allow it to generate slightly better
code. However, do not propagate this into the prototypes of functions
that end up being paravirtualized, to avoid unnecessary changes.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230112072032.35626-4-xin3.li@intel.com
---
 arch/x86/entry/entry_64.S            | 2 +-
 arch/x86/include/asm/special_insns.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 15739a2..7ecd2ae 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -782,7 +782,7 @@ _ASM_NOKPROBE(common_interrupt_return)
 
 /*
  * Reload gs selector with exception handling
- * edi:  new selector
+ *  di:  new selector
  *
  * Is in entry.text as it shouldn't be instrumented.
  */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 35f709f..a71d0e8 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -120,7 +120,7 @@ static inline void native_wbinvd(void)
 	asm volatile("wbinvd": : :"memory");
 }
 
-extern asmlinkage void asm_load_gs_index(unsigned int selector);
+extern asmlinkage void asm_load_gs_index(u16 selector);
 
 static inline void native_load_gs_index(unsigned int selector)
 {


* [tip: x86/cpu] x86/cpufeature: Add the CPU feature bit for LKGS
  2023-01-12  7:20 ` [PATCH v6 1/5] x86/cpufeature: add the cpu feature bit for LKGS Xin Li
@ 2023-01-12 12:16   ` tip-bot2 for H. Peter Anvin (Intel)
  0 siblings, 0 replies; 15+ messages in thread
From: tip-bot2 for H. Peter Anvin (Intel) @ 2023-01-12 12:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: H. Peter Anvin (Intel), Xin Li, Ingo Molnar, x86, linux-kernel

The following commit has been merged into the x86/cpu branch of tip:

Commit-ID:     660569472dd7ac64571375b6727c3f2c1d70ba40
Gitweb:        https://git.kernel.org/tip/660569472dd7ac64571375b6727c3f2c1d70ba40
Author:        H. Peter Anvin (Intel) <hpa@zytor.com>
AuthorDate:    Wed, 11 Jan 2023 23:20:28 -08:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Thu, 12 Jan 2023 13:06:20 +01:00

x86/cpufeature: Add the CPU feature bit for LKGS

Add the CPU feature bit for LKGS (Load "Kernel" GS).

The LKGS instruction is introduced with the Intel FRED (flexible return
and event delivery) specification. Search for the latest FRED spec in
most search engines with this search pattern:

  site:intel.com FRED (flexible return and event delivery) specification

LKGS behaves like the MOV to GS instruction except that it loads
the base address into the IA32_KERNEL_GS_BASE MSR instead of the
GS segment’s descriptor cache, which is exactly what the Linux kernel
does to load a user-level GS base.  Thus, with LKGS, there is no
need to SWAPGS away from the kernel GS base.

[ mingo: Minor tweaks to the description. ]

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230112072032.35626-2-xin3.li@intel.com
---
 arch/x86/include/asm/cpufeatures.h       | 1 +
 tools/arch/x86/include/asm/cpufeatures.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 6101247..b70111a 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -312,6 +312,7 @@
 #define X86_FEATURE_AVX_VNNI		(12*32+ 4) /* AVX VNNI instructions */
 #define X86_FEATURE_AVX512_BF16		(12*32+ 5) /* AVX512 BFLOAT16 instructions */
 #define X86_FEATURE_CMPCCXADD           (12*32+ 7) /* "" CMPccXADD instructions */
+#define X86_FEATURE_LKGS		(12*32+18) /* "" Load "kernel" (userspace) GS */
 #define X86_FEATURE_AMX_FP16		(12*32+21) /* "" AMX fp16 Support */
 #define X86_FEATURE_AVX_IFMA            (12*32+23) /* "" Support for VPMADD52[H,L]UQ */
 
diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
index 6101247..b70111a 100644
--- a/tools/arch/x86/include/asm/cpufeatures.h
+++ b/tools/arch/x86/include/asm/cpufeatures.h
@@ -312,6 +312,7 @@
 #define X86_FEATURE_AVX_VNNI		(12*32+ 4) /* AVX VNNI instructions */
 #define X86_FEATURE_AVX512_BF16		(12*32+ 5) /* AVX512 BFLOAT16 instructions */
 #define X86_FEATURE_CMPCCXADD           (12*32+ 7) /* "" CMPccXADD instructions */
+#define X86_FEATURE_LKGS		(12*32+18) /* "" Load "kernel" (userspace) GS */
 #define X86_FEATURE_AMX_FP16		(12*32+21) /* "" AMX fp16 Support */
 #define X86_FEATURE_AVX_IFMA            (12*32+23) /* "" Support for VPMADD52[H,L]UQ */
 

^ permalink raw reply related	[flat|nested] 15+ messages in thread
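
The feature definition in the patch above, (12*32+18), packs two numbers
into one: the index of a 32-bit capability word and a bit position inside
that word. In the kernel header, word 12 is documented as CPUID leaf 7,
subleaf 1, register EAX, so LKGS is bit 18 of that register. The sketch
below only illustrates the arithmetic; feature_word(), feature_bit() and
feature_present() are hypothetical helper names, not kernel APIs, and the
capability array is an assumed stand-in for the kernel's per-CPU data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value copied from the patch above. */
#define X86_FEATURE_LKGS (12 * 32 + 18)

/* Hypothetical helper: index of the 32-bit capability word. */
static inline unsigned int feature_word(unsigned int feature)
{
	return feature / 32;
}

/* Hypothetical helper: bit position inside that capability word. */
static inline unsigned int feature_bit(unsigned int feature)
{
	return feature % 32;
}

/*
 * Hypothetical helper: test a feature against an array of capability
 * words. The caller must supply at least feature_word(feature) + 1 words.
 */
static inline int feature_present(const uint32_t *caps, unsigned int feature)
{
	assert(caps != NULL);
	return (caps[feature_word(feature)] >> feature_bit(feature)) & 1u;
}
```

With these helpers, X86_FEATURE_LKGS splits into word 12 and bit 18, and
feature_present() reports the feature only when that bit is set in the
twelfth word of the array.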

* [tip: x86/cpu] x86/opcode: Add the LKGS instruction to x86-opcode-map
  2023-01-12  7:20 ` [PATCH v6 2/5] x86/opcode: add the LKGS instruction to x86-opcode-map Xin Li
@ 2023-01-12 12:16   ` tip-bot2 for H. Peter Anvin (Intel)
  0 siblings, 0 replies; 15+ messages in thread
From: tip-bot2 for H. Peter Anvin (Intel) @ 2023-01-12 12:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: H. Peter Anvin (Intel), Xin Li, Ingo Molnar, x86, linux-kernel

The following commit has been merged into the x86/cpu branch of tip:

Commit-ID:     5a91f12660fe7249e37b11372bf599e02b6a319c
Gitweb:        https://git.kernel.org/tip/5a91f12660fe7249e37b11372bf599e02b6a319c
Author:        H. Peter Anvin (Intel) <hpa@zytor.com>
AuthorDate:    Wed, 11 Jan 2023 23:20:29 -08:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Thu, 12 Jan 2023 13:06:36 +01:00

x86/opcode: Add the LKGS instruction to x86-opcode-map

Add the instruction opcode used by LKGS to x86-opcode-map.

The opcode number is per the public FRED draft spec v3.0.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230112072032.35626-3-xin3.li@intel.com
---
 arch/x86/lib/x86-opcode-map.txt       | 1 +
 tools/arch/x86/lib/x86-opcode-map.txt | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/lib/x86-opcode-map.txt b/arch/x86/lib/x86-opcode-map.txt
index d12d135..5168ee0 100644
--- a/arch/x86/lib/x86-opcode-map.txt
+++ b/arch/x86/lib/x86-opcode-map.txt
@@ -1047,6 +1047,7 @@ GrpTable: Grp6
 3: LTR Ew
 4: VERR Ew
 5: VERW Ew
+6: LKGS Ew (F2)
 EndTable
 
 GrpTable: Grp7
diff --git a/tools/arch/x86/lib/x86-opcode-map.txt b/tools/arch/x86/lib/x86-opcode-map.txt
index d12d135..5168ee0 100644
--- a/tools/arch/x86/lib/x86-opcode-map.txt
+++ b/tools/arch/x86/lib/x86-opcode-map.txt
@@ -1047,6 +1047,7 @@ GrpTable: Grp6
 3: LTR Ew
 4: VERR Ew
 5: VERW Ew
+6: LKGS Ew (F2)
 EndTable
 
 GrpTable: Grp7

^ permalink raw reply related	[flat|nested] 15+ messages in thread
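
The opcode-map entry above ("6: LKGS Ew (F2)" in Grp6) and the
LKGS_DI byte sequence used later in this series (0xf2,0x0f,0x00,0xf7)
describe the same encoding: an F2 prefix, the two-byte Grp6 opcode 0F 00,
and a ModRM byte whose reg field is 6. The sketch below shows how the last
byte comes out as 0xf7 for %di. The helper names are hypothetical, and the
sketch assumes a register operand among the eight legacy registers (no REX
prefix, no memory operand).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build a ModRM byte from its fields: mod (2 bits), reg (3 bits), rm (3 bits). */
static inline uint8_t modrm(unsigned int mod, unsigned int reg, unsigned int rm)
{
	assert(mod < 4 && reg < 8 && rm < 8);
	return (uint8_t)((mod << 6) | (reg << 3) | rm);
}

/*
 * Hypothetical helper: encode LKGS with a register operand.
 * rm is the hardware register number (7 is DI). Returns the byte count.
 */
static inline size_t encode_lkgs_reg(uint8_t out[4], unsigned int rm)
{
	out[0] = 0xf2;            /* mandatory prefix, "(F2)" in the opcode map */
	out[1] = 0x0f;            /* two-byte opcode escape */
	out[2] = 0x00;            /* Grp6 opcode */
	out[3] = modrm(3, 6, rm); /* mod=3: register operand; reg=6: LKGS slot */
	return 4;
}
```

Calling encode_lkgs_reg() with rm = 7 reproduces the four bytes of the
LKGS_DI macro, which is why the macro can stand in for "lkgs %di" until
the assembler knows the mnemonic.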

* Re: [PATCH v6 0/5] x86: Enable LKGS instruction
  2023-01-12 12:13 ` [PATCH v6 0/5] x86: Enable LKGS instruction Ingo Molnar
@ 2023-01-12 14:57   ` Peter Zijlstra
  2023-01-13 13:29     ` Ingo Molnar
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2023-01-12 14:57 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Xin Li, linux-kernel, x86, tglx, mingo, bp, dave.hansen, hpa,
	brgerst, chang.seok.bae, jgross

On Thu, Jan 12, 2023 at 01:13:20PM +0100, Ingo Molnar wrote:
> 
> * Xin Li <xin3.li@intel.com> wrote:
> 
> > LKGS instruction is introduced with Intel FRED (flexible return and event 
> > delivery) specification. As LKGS is independent of FRED, we enable it as 
> > a standalone CPU feature.
> > 
> > LKGS behaves like the MOV to GS instruction except that it loads the base 
> > address into the IA32_KERNEL_GS_BASE MSR instead of the GS segment’s 
> > descriptor cache, which is exactly what Linux kernel does to load user 
> > level GS base.  Thus, with LKGS, there is no need to SWAPGS away from the 
> > kernel GS base.
> 
> Ok, this looks good to me.
> 
> I've applied the first 4 patches to tip:x86/cpu, as the instruction exists 
> in a public document and these patches are fine stand-alone as well, such 
> as the factoring out of load_gs_index() methods from a high-use low level 
> header into a new header file.
> 
> Planning to apply the final, LKGS enabler patch as well, unless there's any 
> objections from others?

Nah, I think that thing's bike-shedded to near death. Let's just do it.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [tip: x86/cpu] x86/gsseg: Use the LKGS instruction if available for load_gs_index()
  2023-01-12  7:20 ` [PATCH v6 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index() Xin Li
@ 2023-01-13  9:36   ` tip-bot2 for H. Peter Anvin (Intel)
  0 siblings, 0 replies; 15+ messages in thread
From: tip-bot2 for H. Peter Anvin (Intel) @ 2023-01-13  9:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: H. Peter Anvin (Intel),
	Xin Li, Ingo Molnar, Peter Zijlstra, Andy Lutomirski,
	Dave Hansen, Linus Torvalds, x86, linux-kernel

The following commit has been merged into the x86/cpu branch of tip:

Commit-ID:     92cbbadf73f45c5d8bb26ed8668ff59671ff21e6
Gitweb:        https://git.kernel.org/tip/92cbbadf73f45c5d8bb26ed8668ff59671ff21e6
Author:        H. Peter Anvin (Intel) <hpa@zytor.com>
AuthorDate:    Wed, 11 Jan 2023 23:20:32 -08:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 13 Jan 2023 10:07:27 +01:00

x86/gsseg: Use the LKGS instruction if available for load_gs_index()

The LKGS instruction atomically loads a segment descriptor into the
%gs descriptor registers, *except* that %gs.base is unchanged, and the
base is instead loaded into MSR_IA32_KERNEL_GS_BASE, which is exactly
what we want this function to do.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230112072032.35626-6-xin3.li@intel.com
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
---
 arch/x86/include/asm/gsseg.h | 33 +++++++++++++++++++++++++++++----
 arch/x86/kernel/cpu/common.c |  1 +
 arch/x86/xen/enlighten_pv.c  |  1 +
 3 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/gsseg.h b/arch/x86/include/asm/gsseg.h
index d15577c..ab6a595 100644
--- a/arch/x86/include/asm/gsseg.h
+++ b/arch/x86/include/asm/gsseg.h
@@ -14,17 +14,42 @@
 
 extern asmlinkage void asm_load_gs_index(u16 selector);
 
+/* Replace with "lkgs %di" once binutils support LKGS instruction */
+#define LKGS_DI _ASM_BYTES(0xf2,0x0f,0x00,0xf7)
+
+static inline void native_lkgs(unsigned int selector)
+{
+	u16 sel = selector;
+	asm_inline volatile("1: " LKGS_DI
+			    _ASM_EXTABLE_TYPE_REG(1b, 1b, EX_TYPE_ZERO_REG, %k[sel])
+			    : [sel] "+D" (sel));
+}
+
 static inline void native_load_gs_index(unsigned int selector)
 {
-	unsigned long flags;
+	if (cpu_feature_enabled(X86_FEATURE_LKGS)) {
+		native_lkgs(selector);
+	} else {
+		unsigned long flags;
 
-	local_irq_save(flags);
-	asm_load_gs_index(selector);
-	local_irq_restore(flags);
+		local_irq_save(flags);
+		asm_load_gs_index(selector);
+		local_irq_restore(flags);
+	}
 }
 
 #endif /* CONFIG_X86_64 */
 
+static inline void __init lkgs_init(void)
+{
+#ifdef CONFIG_PARAVIRT_XXL
+#ifdef CONFIG_X86_64
+	if (cpu_feature_enabled(X86_FEATURE_LKGS))
+		pv_ops.cpu.load_gs_index = native_lkgs;
+#endif
+#endif
+}
+
 #ifndef CONFIG_PARAVIRT_XXL
 
 static inline void load_gs_index(unsigned int selector)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 9cfca3d..b7ac85a 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1960,6 +1960,7 @@ void __init identify_boot_cpu(void)
 	setup_cr_pinning();
 
 	tsx_init();
+	lkgs_init();
 }
 
 void identify_secondary_cpu(struct cpuinfo_x86 *c)
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 5b13796..ce2f19e 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -276,6 +276,7 @@ static void __init xen_init_capabilities(void)
 	setup_clear_cpu_cap(X86_FEATURE_ACC);
 	setup_clear_cpu_cap(X86_FEATURE_X2APIC);
 	setup_clear_cpu_cap(X86_FEATURE_SME);
+	setup_clear_cpu_cap(X86_FEATURE_LKGS);
 
 	/*
 	 * Xen PV would need some work to support PCID: CR3 handling as well

^ permalink raw reply related	[flat|nested] 15+ messages in thread
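
The commit message above says LKGS leaves %gs.base unchanged and puts the
new base into MSR_IA32_KERNEL_GS_BASE. A toy model makes the equivalence
with the older path easier to see. It assumes the legacy path is
"SWAPGS, load %gs, SWAPGS" with interrupts disabled, as the cover letter's
mention of SWAPGS suggests. It tracks only the two base values and ignores
the selector, the descriptor lookup, and fault handling. All names here are
hypothetical; none of this is kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of the two GS base values the CPU keeps in 64-bit mode. */
struct gs_model {
	uint64_t gs_base;        /* active GS base (the kernel's, in kernel mode) */
	uint64_t kernel_gs_base; /* models IA32_KERNEL_GS_BASE (the inactive base) */
};

/* SWAPGS exchanges the active base with the MSR value. */
static inline void model_swapgs(struct gs_model *m)
{
	uint64_t tmp = m->gs_base;

	m->gs_base = m->kernel_gs_base;
	m->kernel_gs_base = tmp;
}

/* A plain load of GS replaces the active base with the descriptor's base. */
static inline void model_mov_to_gs(struct gs_model *m, uint64_t desc_base)
{
	m->gs_base = desc_base;
}

/* Assumed legacy sequence: swap away, load, swap back. */
static inline void model_legacy_load_gs_index(struct gs_model *m, uint64_t desc_base)
{
	assert(m != NULL);
	model_swapgs(m);
	model_mov_to_gs(m, desc_base);
	model_swapgs(m);
}

/* LKGS: the descriptor's base goes straight to the inactive slot. */
static inline void model_lkgs(struct gs_model *m, uint64_t desc_base)
{
	assert(m != NULL);
	m->kernel_gs_base = desc_base;
}
```

Starting from the same state, both model functions leave gs_base untouched
and store the new base in kernel_gs_base. In the model, the LKGS path never
makes the kernel's own base inactive, which is consistent with the patch
dropping local_irq_save()/local_irq_restore() on that branch.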

* Re: [PATCH v6 0/5] x86: Enable LKGS instruction
  2023-01-12 14:57   ` Peter Zijlstra
@ 2023-01-13 13:29     ` Ingo Molnar
  2023-01-13 18:26       ` Li, Xin3
  0 siblings, 1 reply; 15+ messages in thread
From: Ingo Molnar @ 2023-01-13 13:29 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Xin Li, linux-kernel, x86, tglx, mingo, bp, dave.hansen, hpa,
	brgerst, chang.seok.bae, jgross


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Thu, Jan 12, 2023 at 01:13:20PM +0100, Ingo Molnar wrote:
> > 
> > * Xin Li <xin3.li@intel.com> wrote:
> > 
> > > LKGS instruction is introduced with Intel FRED (flexible return and 
> > > event delivery) specification. As LKGS is independent of FRED, we 
> > > enable it as a standalone CPU feature.
> > > 
> > > LKGS behaves like the MOV to GS instruction except that it loads the 
> > > base address into the IA32_KERNEL_GS_BASE MSR instead of the GS 
> > > segment’s descriptor cache, which is exactly what Linux kernel does 
> > > to load user level GS base.  Thus, with LKGS, there is no need to 
> > > SWAPGS away from the kernel GS base.
> > 
> > Ok, this looks good to me.
> > 
> > I've applied the first 4 patches to tip:x86/cpu, as the instruction 
> > exists in a public document and these patches are fine stand-alone as 
> > well, such as the factoring out of load_gs_index() methods from a 
> > high-use low level header into a new header file.
> > 
> > Planning to apply the final, LKGS enabler patch as well, unless there's 
> > any objections from others?
> 
> Nah, I think that thing's bike-shedded to near death. Let's just do it.

Ok - applied the #5 patch to tip:x86/cpu, for a v6.3 merge.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: [PATCH v6 0/5] x86: Enable LKGS instruction
  2023-01-13 13:29     ` Ingo Molnar
@ 2023-01-13 18:26       ` Li, Xin3
  0 siblings, 0 replies; 15+ messages in thread
From: Li, Xin3 @ 2023-01-13 18:26 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: linux-kernel, x86, tglx, mingo, bp, dave.hansen, hpa, brgerst,
	Bae, Chang Seok, Gross, Jurgen

> > > > LKGS instruction is introduced with Intel FRED (flexible return
> > > > and event delivery) specification. As LKGS is independent of FRED,
> > > > we enable it as a standalone CPU feature.
> > > >
> > > > LKGS behaves like the MOV to GS instruction except that it loads
> > > > the base address into the IA32_KERNEL_GS_BASE MSR instead of the
> > > > GS segment’s descriptor cache, which is exactly what Linux kernel
> > > > does to load user level GS base.  Thus, with LKGS, there is no
> > > > need to SWAPGS away from the kernel GS base.
> > >
> > > Ok, this looks good to me.
> > >
> > > I've applied the first 4 patches to tip:x86/cpu, as the instruction
> > > exists in a public document and these patches are fine stand-alone
> > > as well, such as the factoring out of load_gs_index() methods from a
> > > high-use low level header into a new header file.
> > >
> > > Planning to apply the final, LKGS enabler patch as well, unless
> > > there's any objections from others?
> >
> > Nah, I think that thing's bike-shedded to near death. Let's just do it.
> 
> Ok - applied the #5 patch to tip:x86/cpu, for a v6.3 merge.
> 
> Thanks,
> 
> 	Ingo

Thanks a lot!
Xin


^ permalink raw reply	[flat|nested] 15+ messages in thread

Thread overview: 15+ messages
2023-01-12  7:20 [PATCH v6 0/5] x86: Enable LKGS instruction Xin Li
2023-01-12  7:20 ` [PATCH v6 1/5] x86/cpufeature: add the cpu feature bit for LKGS Xin Li
2023-01-12 12:16   ` [tip: x86/cpu] x86/cpufeature: Add the CPU " tip-bot2 for H. Peter Anvin (Intel)
2023-01-12  7:20 ` [PATCH v6 2/5] x86/opcode: add the LKGS instruction to x86-opcode-map Xin Li
2023-01-12 12:16   ` [tip: x86/cpu] x86/opcode: Add " tip-bot2 for H. Peter Anvin (Intel)
2023-01-12  7:20 ` [PATCH v6 3/5] x86/gsseg: make asm_load_gs_index() take an u16 Xin Li
2023-01-12 12:16   ` [tip: x86/cpu] x86/gsseg: Make " tip-bot2 for H. Peter Anvin (Intel)
2023-01-12  7:20 ` [PATCH v6 4/5] x86/gsseg: move load_gs_index() to its own new header file Xin Li
2023-01-12 12:16   ` [tip: x86/cpu] x86/gsseg: Move " tip-bot2 for H. Peter Anvin (Intel)
2023-01-12  7:20 ` [PATCH v6 5/5] x86/gsseg: use the LKGS instruction if available for load_gs_index() Xin Li
2023-01-13  9:36   ` [tip: x86/cpu] x86/gsseg: Use " tip-bot2 for H. Peter Anvin (Intel)
2023-01-12 12:13 ` [PATCH v6 0/5] x86: Enable LKGS instruction Ingo Molnar
2023-01-12 14:57   ` Peter Zijlstra
2023-01-13 13:29     ` Ingo Molnar
2023-01-13 18:26       ` Li, Xin3
