* [PATCH V2 00/41] x86/entry/64: Convert a bunch of ASM entry code into C code
@ 2021-09-26 15:07 Lai Jiangshan
  2021-09-26 15:07 ` [PATCH V2 01/41] x86/entry: Fix swapgs fence Lai Jiangshan
                   ` (40 more replies)
  0 siblings, 41 replies; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-26 15:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Peter Zijlstra, Andy Lutomirski, H. Peter Anvin, Joerg Roedel

From: Lai Jiangshan <laijs@linux.alibaba.com>

Much of the ASM code in entry_64.S can be rewritten in C, provided it is
written to be non-instrumentable and is called in the right order with
respect to whether CR3/gsbase has been switched to the kernel CR3/gsbase.

This patchset converts some of it to C code.

Patch 16 converts error_entry() to C code, and patches 1-15 are
preparation for it.

Patches 17-37 convert the IST entry code to C code.  Many of them are
preparation for the actual conversion.

Patch 41 converts a small part of the syscall ASM code to C: the check
for whether sysret can be used to return to userspace.

Some other paths could also be implemented in C, for example the error
exit and the syscall entry/exit; their PTI handling could be in C code
too.  But that would require pt_regs to be copied/pushed to the entry
stack, which means the C code would not be efficient.

When converting ASM to C, most of the effort went into keeping the two
the same; almost no creativity was involved.  The code is kept as close
to the ASM as possible, and no functional change is intended unless my
misunderstanding of the ASM code was involved.  The functions called by
the C entry code have been checked to be noinstr or __always_inline.
Some of them have more than one definition and need extra care from
reviewers.  The comments in the ASM are also copied to the right places
in the C code.
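
For illustration only (a hypothetical helper, not taken from this
series), code on the C entry path has to look roughly like this:

	/*
	 * Sketch: helpers on the C entry path must be noinstr or
	 * __always_inline, because they may run before the kernel CR3
	 * and GS base are established: no tracing, no KASAN/KCSAN,
	 * no stack-protector.
	 */
	static __always_inline unsigned long entry_read_cr3(void)
	{
		unsigned long cr3;

		asm volatile("mov %%cr3, %0" : "=r" (cr3));
		return cr3;
	}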

Changed from V1:
	Added a fix as patch 1, found by trying to apply Peterz's
		suggestion in patch 11.
	The whole error_entry() is converted to C instead of only a part.
	The whole paranoid_entry() is converted to C instead of only a part.
	The ASM code of "paranoid_entry() cfunc() paranoid_exit()" is
		converted to C, as suggested by Peterz.
	Added entry64.c rather than moving traps.c to arch/x86/entry/.
	The order of some commits is changed.
	Removed two cleanups.

[V1]: https://lore.kernel.org/all/20210831175025.27570-1-jiangshanlai@gmail.com/

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Joerg Roedel <jroedel@suse.de>

Lai Jiangshan (41):
  x86/entry: Fix swapgs fence
  x86/traps: Remove stack-protector from traps.c
  compiler_types.h: Add __noinstr_section() for noinstr
  x86/entry: Introduce __entry_text for entry code written in C
  x86/entry: Move PTI_USER_* to arch/x86/include/asm/processor-flags.h
  x86: Mark __native_read_cr3() & native_write_cr3() as __always_inline
  x86/traps: Move the declaration of native_irq_return_iret into proto.h
  x86/entry: Add arch/x86/entry/entry64.c for C entry code
  x86/entry: Expose the address of .Lgs_change to entry64.c
  x86/entry: Add C version of SWITCH_TO_KERNEL_CR3 as
    switch_to_kernel_cr3()
  x86/entry: Add C user_entry_swapgs_and_fence() and
    kernel_entry_fence_no_swapgs()
  x86/traps: Move pt_regs only in fixup_bad_iret()
  x86/entry: Switch the stack after error_entry() returns
  x86/entry: move PUSH_AND_CLEAR_REGS out of error_entry
  objtool: Allow .entry.text function using CLD instruction
  x86/entry: Implement the whole error_entry() as C code
  x86/entry: Make paranoid_exit() callable
  x86/entry: Call paranoid_exit() in asm_exc_nmi()
  x86/entry: move PUSH_AND_CLEAR_REGS out of paranoid_entry
  x86/entry: Add the C version ist_switch_to_kernel_cr3()
  x86/entry: Add the C version ist_restore_cr3()
  x86/entry: Add the C version get_percpu_base()
  x86/entry: Add the C version ist_switch_to_kernel_gsbase()
  x86/entry: Implement the C version ist_paranoid_entry()
  x86/entry: Implement the C version ist_paranoid_exit()
  x86/entry: Add a C macro to define the function body for IST in
    .entry.text
  x86/mce: Remove stack protector from mce/core.c
  x86/debug, mce: Use C entry code
  x86/idtentry.h: Move the definitions *IDTENTRY_{MCE|DEBUG}* up
  x86/nmi: Use DEFINE_IDTENTRY_NMI for nmi
  x86/nmi: Remove stack protector from nmi.c
  x86/nmi: Use C entry code
  x86/entry: Add a C macro to define the function body for IST in
    .entry.text with an error code
  x86/doublefault: Use C entry code
  x86/sev: Add and use ist_vc_switch_off_ist()
  x86/sev: Remove stack protector from sev.c
  x86/sev: Use C entry code
  x86/entry: Remove ASM function paranoid_entry() and paranoid_exit()
  x86/entry: Remove the unused ASM macros
  x86/entry: Remove save_ret from PUSH_AND_CLEAR_REGS
  x86/syscall/64: Move the checking for sysret to C code

 arch/x86/entry/Makefile                |   5 +-
 arch/x86/entry/calling.h               | 142 +--------
 arch/x86/entry/common.c                |  73 ++++-
 arch/x86/entry/entry64.c               | 354 ++++++++++++++++++++++
 arch/x86/entry/entry_64.S              | 403 +++----------------------
 arch/x86/include/asm/idtentry.h        |  64 +++-
 arch/x86/include/asm/processor-flags.h |  15 +
 arch/x86/include/asm/proto.h           |   1 +
 arch/x86/include/asm/special_insns.h   |   4 +-
 arch/x86/include/asm/syscall.h         |   2 +-
 arch/x86/include/asm/traps.h           |   6 +-
 arch/x86/kernel/Makefile               |   7 +
 arch/x86/kernel/cpu/mce/Makefile       |   4 +
 arch/x86/kernel/nmi.c                  |   2 +-
 arch/x86/kernel/traps.c                |  33 +-
 include/linux/compiler_types.h         |   6 +-
 tools/objtool/check.c                  |   2 +-
 17 files changed, 580 insertions(+), 543 deletions(-)
 create mode 100644 arch/x86/entry/entry64.c

-- 
2.19.1.6.gb485710b



* [PATCH V2 01/41] x86/entry: Fix swapgs fence
  2021-09-26 15:07 [PATCH V2 00/41] x86/entry/64: Convert a bunch of ASM entry code into C code Lai Jiangshan
@ 2021-09-26 15:07 ` Lai Jiangshan
  2021-09-26 20:43   ` Thomas Gleixner
  2021-09-26 15:07 ` [PATCH V2 02/41] x86/traps: Remove stack-protector from traps.c Lai Jiangshan
                   ` (39 subsequent siblings)
  40 siblings, 1 reply; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-26 15:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Josh Poimboeuf, Chang S . Bae, Sasha Levin,
	Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, H. Peter Anvin

From: Lai Jiangshan <laijs@linux.alibaba.com>

Commit 18ec54fdd6d18 ("x86/speculation: Prepare entry code for Spectre
v1 swapgs mitigations") (commit1) added FENCE_SWAPGS_{KERNEL|USER}_ENTRY
for conditional swapgs.  In paranoid_entry(), it used only
FENCE_SWAPGS_KERNEL_ENTRY for both branches.  That is correct because
FENCE_SWAPGS_KERNEL_ENTRY implies FENCE_SWAPGS_USER_ENTRY: as can be
seen in spectre_v1_select_mitigation(), whenever
X86_FEATURE_FENCE_SWAPGS_USER is set, X86_FEATURE_FENCE_SWAPGS_KERNEL
is also set.

Commit1 also added a comment explaining why FENCE_SWAPGS_KERNEL_ENTRY
is needed there even though writing CR3 implies an lfence: the CR3
write is itself conditional.

But commit 96b2371413e8f ("x86/entry/64: Switch CR3 before SWAPGS in
paranoid entry") (commit2) switched the code order, and at the very
least this comment is now stale: there is no CR3 write between the
conditional swapgs and the fence anymore.

Even worse, commit2 does not use FENCE_SWAPGS_{KERNEL|USER}_ENTRY in
the corresponding branches.  It uses FENCE_SWAPGS_KERNEL_ENTRY in the
user path and no fence at all in the kernel path.  Possibly this is
because commit1 had used FENCE_SWAPGS_KERNEL_ENTRY in both paths,
which obscured the per-path lfence requirement.

Fix it and remove the unneeded comment.
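
The implication can be seen in spectre_v1_select_mitigation(); a
simplified paraphrase of its swapgs part (not the exact upstream code):

	if (boot_cpu_has_bug(X86_BUG_SWAPGS)) {
		/*
		 * With PTI, user entry does an unconditional CR3 write,
		 * which serializes, so the user fence is only forced
		 * when PTI is off.
		 */
		if (!boot_cpu_has(X86_FEATURE_PTI))
			setup_force_cpu_cap(X86_FEATURE_FENCE_SWAPGS_USER);

		/* The kernel-path fence is forced unconditionally. */
		setup_force_cpu_cap(X86_FEATURE_FENCE_SWAPGS_KERNEL);
	}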

Fixes: 96b2371413e8f ("x86/entry/64: Switch CR3 before SWAPGS in paranoid entry")
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Sasha Levin <sashal@kernel.org>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry_64.S | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index e38a4cf795d9..95d85b16710b 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -898,17 +898,12 @@ SYM_CODE_START_LOCAL(paranoid_entry)
 	rdmsr
 	testl	%edx, %edx
 	jns	.Lparanoid_entry_swapgs
+	FENCE_SWAPGS_KERNEL_ENTRY
 	ret
 
 .Lparanoid_entry_swapgs:
 	swapgs
-
-	/*
-	 * The above SAVE_AND_SWITCH_TO_KERNEL_CR3 macro doesn't do an
-	 * unconditional CR3 write, even in the PTI case.  So do an lfence
-	 * to prevent GS speculation, regardless of whether PTI is enabled.
-	 */
-	FENCE_SWAPGS_KERNEL_ENTRY
+	FENCE_SWAPGS_USER_ENTRY
 
 	/* EBX = 0 -> SWAPGS required on exit */
 	xorl	%ebx, %ebx
-- 
2.19.1.6.gb485710b



* [PATCH V2 02/41] x86/traps: Remove stack-protector from traps.c
  2021-09-26 15:07 [PATCH V2 00/41] x86/entry/64: Convert a bunch of ASM entry code into C code Lai Jiangshan
  2021-09-26 15:07 ` [PATCH V2 01/41] x86/entry: Fix swapgs fence Lai Jiangshan
@ 2021-09-26 15:07 ` Lai Jiangshan
  2021-09-27 10:19   ` Borislav Petkov
  2021-09-26 15:08 ` [PATCH V2 03/41] compiler_types.h: Add __noinstr_section() for noinstr Lai Jiangshan
                   ` (38 subsequent siblings)
  40 siblings, 1 reply; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-26 15:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, H. Peter Anvin, Joerg Roedel, Javier Martinez Canillas,
	Daniel Bristot de Oliveira, Brijesh Singh, Andy Shevchenko,
	Arvind Sankar, Juergen Gross, Chester Lin

From: Lai Jiangshan <laijs@linux.alibaba.com>

When stack-protector is enabled, the compiler adds some instrumentation
code at the beginning and the end of some functions, but many functions
in traps.c must not be instrumented.  Moreover, the stack-protector code
at the beginning of an affected function accesses the canary, which
might be watched by hardware breakpoints; this also violates the
non-instrumentable nature of those functions and can cause an infinitely
recursive #DB, because the canary is accessed before DR7 is reset.

So it is better to remove stack-protector from traps.c.

This also prepares for later patches that move some entry code into
traps.c; some of that code can NOT use the percpu register until the GS
base has been properly switched, and stack-protector depends on the
percpu register to work.
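
For reference, with stack-protector the compiler brackets an affected
function with roughly the following (simplified sketch; on x86_64 the
canary is reached through the percpu segment at %gs:40):

	movq	%gs:40, %rax		# load canary: needs kernel GS, and
	movq	%rax, -8(%rbp)		# the access can trip a data breakpoint
	...
	movq	-8(%rbp), %rdx
	xorq	%gs:40, %rdx		# canary check on function exit
	jne	.Lstack_chk_fail	# which calls __stack_chk_fail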

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/kernel/Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 8f4e8fa6ed75..0e054e2304c6 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -48,6 +48,9 @@ KCOV_INSTRUMENT		:= n
 
 CFLAGS_head$(BITS).o	+= -fno-stack-protector
 
+CFLAGS_REMOVE_traps.o		= -fstack-protector -fstack-protector-strong
+CFLAGS_traps.o			+= -fno-stack-protector
+
 CFLAGS_irq.o := -I $(srctree)/$(src)/../include/asm/trace
 
 obj-y			:= process_$(BITS).o signal.o
-- 
2.19.1.6.gb485710b



* [PATCH V2 03/41] compiler_types.h: Add __noinstr_section() for noinstr
  2021-09-26 15:07 [PATCH V2 00/41] x86/entry/64: Convert a bunch of ASM entry code into C code Lai Jiangshan
  2021-09-26 15:07 ` [PATCH V2 01/41] x86/entry: Fix swapgs fence Lai Jiangshan
  2021-09-26 15:07 ` [PATCH V2 02/41] x86/traps: Remove stack-protector from traps.c Lai Jiangshan
@ 2021-09-26 15:08 ` Lai Jiangshan
  2021-09-27 18:09   ` Kees Cook
  2021-09-26 15:08 ` [PATCH V2 04/41] x86/entry: Introduce __entry_text for entry code written in C Lai Jiangshan
                   ` (37 subsequent siblings)
  40 siblings, 1 reply; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Kees Cook, Nathan Chancellor, Miguel Ojeda,
	Nick Desaulniers, Peter Zijlstra (Intel),
	Sami Tolvanen, Masahiro Yamada, Marco Elver, Arnd Bergmann,
	Ard Biesheuvel

From: Lai Jiangshan <laijs@linux.alibaba.com>

Add __noinstr_section() and implement noinstr on top of it.  It will be
reused for C entry code in a later patch.
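
For example, the next patch in this series uses it as:

	/* Entry code written in C. */
	#define __entry_text __noinstr_section(".entry.text")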

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 include/linux/compiler_types.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index b6ff83a714ca..3c77631c68bd 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -208,10 +208,12 @@ struct ftrace_likely_data {
 #endif
 
 /* Section for code which can't be instrumented at all */
-#define noinstr								\
-	noinline notrace __attribute((__section__(".noinstr.text")))	\
+#define __noinstr_section(section)				\
+	noinline notrace __attribute((__section__(section)))	\
 	__no_kcsan __no_sanitize_address __no_profile __no_sanitize_coverage
 
+#define noinstr __noinstr_section(".noinstr.text")
+
 #endif /* __KERNEL__ */
 
 #endif /* __ASSEMBLY__ */
-- 
2.19.1.6.gb485710b



* [PATCH V2 04/41] x86/entry: Introduce __entry_text for entry code written in C
  2021-09-26 15:07 [PATCH V2 00/41] x86/entry/64: Convert a bunch of ASM entry code into C code Lai Jiangshan
                   ` (2 preceding siblings ...)
  2021-09-26 15:08 ` [PATCH V2 03/41] compiler_types.h: Add __noinstr_section() for noinstr Lai Jiangshan
@ 2021-09-26 15:08 ` Lai Jiangshan
  2021-09-30 11:49   ` Borislav Petkov
  2021-09-26 15:08 ` [PATCH V2 05/41] x86/entry: Move PTI_USER_* to arch/x86/include/asm/processor-flags.h Lai Jiangshan
                   ` (36 subsequent siblings)
  40 siblings, 1 reply; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, H. Peter Anvin, Juergen Gross, Peter Zijlstra (Intel),
	Joerg Roedel, Mike Travis

From: Lai Jiangshan <laijs@linux.alibaba.com>

Some entry code will be implemented in C files.  Introduce __entry_text
to place such code in the .entry.text section.  __entry_text disables
instrumentation like noinstr does, but it does not disable
stack-protector, since not all compilers supported by the kernel can
disable stack-protector at function granularity.  Stack-protector will
instead be disabled at the C-file level.
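
C entry functions are then declared with it; for example, a later patch
in this series adds:

	asmlinkage __visible __entry_text
	struct pt_regs *error_entry(struct pt_regs *eregs);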

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/include/asm/idtentry.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 1345088e9902..6779def97591 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -11,6 +11,9 @@
 
 #include <asm/irq_stack.h>
 
+/* Entry code written in C. */
+#define __entry_text __noinstr_section(".entry.text")
+
 /**
  * DECLARE_IDTENTRY - Declare functions for simple IDT entry points
  *		      No error code pushed by hardware
-- 
2.19.1.6.gb485710b



* [PATCH V2 05/41] x86/entry: Move PTI_USER_* to arch/x86/include/asm/processor-flags.h
  2021-09-26 15:07 [PATCH V2 00/41] x86/entry/64: Convert a bunch of ASM entry code into C code Lai Jiangshan
                   ` (3 preceding siblings ...)
  2021-09-26 15:08 ` [PATCH V2 04/41] x86/entry: Introduce __entry_text for entry code written in C Lai Jiangshan
@ 2021-09-26 15:08 ` Lai Jiangshan
  2021-09-26 15:08 ` [PATCH V2 06/41] x86: Mark __native_read_cr3() & native_write_cr3() as __always_inline Lai Jiangshan
                   ` (35 subsequent siblings)
  40 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin

From: Lai Jiangshan <laijs@linux.alibaba.com>

These constants will also be used in C files, so move them to
arch/x86/include/asm/processor-flags.h, which already has the related
X86_CR3_PTI_PCID_USER_BIT defined in it.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/calling.h               | 10 ----------
 arch/x86/include/asm/processor-flags.h | 15 +++++++++++++++
 2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index a4c061fb7c6e..996b041e92d2 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -149,16 +149,6 @@ For 32-bit we have the following conventions - kernel is built with
 
 #ifdef CONFIG_PAGE_TABLE_ISOLATION
 
-/*
- * PAGE_TABLE_ISOLATION PGDs are 8k.  Flip bit 12 to switch between the two
- * halves:
- */
-#define PTI_USER_PGTABLE_BIT		PAGE_SHIFT
-#define PTI_USER_PGTABLE_MASK		(1 << PTI_USER_PGTABLE_BIT)
-#define PTI_USER_PCID_BIT		X86_CR3_PTI_PCID_USER_BIT
-#define PTI_USER_PCID_MASK		(1 << PTI_USER_PCID_BIT)
-#define PTI_USER_PGTABLE_AND_PCID_MASK  (PTI_USER_PCID_MASK | PTI_USER_PGTABLE_MASK)
-
 .macro SET_NOFLUSH_BIT	reg:req
 	bts	$X86_CR3_PCID_NOFLUSH_BIT, \reg
 .endm
diff --git a/arch/x86/include/asm/processor-flags.h b/arch/x86/include/asm/processor-flags.h
index 02c2cbda4a74..4dd2fbbc861a 100644
--- a/arch/x86/include/asm/processor-flags.h
+++ b/arch/x86/include/asm/processor-flags.h
@@ -4,6 +4,7 @@
 
 #include <uapi/asm/processor-flags.h>
 #include <linux/mem_encrypt.h>
+#include <asm/page_types.h>
 
 #ifdef CONFIG_VM86
 #define X86_VM_MASK	X86_EFLAGS_VM
@@ -50,7 +51,21 @@
 #endif
 
 #ifdef CONFIG_PAGE_TABLE_ISOLATION
+
 # define X86_CR3_PTI_PCID_USER_BIT	11
+
+#ifdef CONFIG_X86_64
+/*
+ * PAGE_TABLE_ISOLATION PGDs are 8k.  Flip bit 12 to switch between the two
+ * halves:
+ */
+#define PTI_USER_PGTABLE_BIT		PAGE_SHIFT
+#define PTI_USER_PGTABLE_MASK		(1 << PTI_USER_PGTABLE_BIT)
+#define PTI_USER_PCID_BIT		X86_CR3_PTI_PCID_USER_BIT
+#define PTI_USER_PCID_MASK		(1 << PTI_USER_PCID_BIT)
+#define PTI_USER_PGTABLE_AND_PCID_MASK  (PTI_USER_PCID_MASK | PTI_USER_PGTABLE_MASK)
+#endif
+
 #endif
 
 #endif /* _ASM_X86_PROCESSOR_FLAGS_H */
-- 
2.19.1.6.gb485710b



* [PATCH V2 06/41] x86: Mark __native_read_cr3() & native_write_cr3() as __always_inline
  2021-09-26 15:07 [PATCH V2 00/41] x86/entry/64: Convert a bunch of ASM entry code into C code Lai Jiangshan
                   ` (4 preceding siblings ...)
  2021-09-26 15:08 ` [PATCH V2 05/41] x86/entry: Move PTI_USER_* to arch/x86/include/asm/processor-flags.h Lai Jiangshan
@ 2021-09-26 15:08 ` Lai Jiangshan
  2021-09-26 15:08 ` [PATCH V2 07/41] x86/traps: Move the declaration of native_irq_return_iret into proto.h Lai Jiangshan
                   ` (34 subsequent siblings)
  40 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, H. Peter Anvin, Dave Jiang, Ben Widawsky, Dan Williams,
	Tony Luck, Peter Zijlstra, Arvind Sankar

From: Lai Jiangshan <laijs@linux.alibaba.com>

__native_read_cr3() and native_write_cr3() need to be guaranteed
non-instrumentable, so mark them __always_inline.

This prepares for later patches which implement entry code in a C file.
Some of that code needs to handle KPTI and has to read/write CR3.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/include/asm/special_insns.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index f3fbb84ff8a7..058995bb153c 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -42,14 +42,14 @@ static __always_inline void native_write_cr2(unsigned long val)
 	asm volatile("mov %0,%%cr2": : "r" (val) : "memory");
 }
 
-static inline unsigned long __native_read_cr3(void)
+static __always_inline unsigned long __native_read_cr3(void)
 {
 	unsigned long val;
 	asm volatile("mov %%cr3,%0\n\t" : "=r" (val) : __FORCE_ORDER);
 	return val;
 }
 
-static inline void native_write_cr3(unsigned long val)
+static __always_inline void native_write_cr3(unsigned long val)
 {
 	asm volatile("mov %0,%%cr3": : "r" (val) : "memory");
 }
-- 
2.19.1.6.gb485710b



* [PATCH V2 07/41] x86/traps: Move the declaration of native_irq_return_iret into proto.h
  2021-09-26 15:07 [PATCH V2 00/41] x86/entry/64: Convert a bunch of ASM entry code into C code Lai Jiangshan
                   ` (5 preceding siblings ...)
  2021-09-26 15:08 ` [PATCH V2 06/41] x86: Mark __native_read_cr3() & native_write_cr3() as __always_inline Lai Jiangshan
@ 2021-09-26 15:08 ` Lai Jiangshan
  2021-09-26 15:08 ` [PATCH V2 08/41] x86/entry: Add arch/x86/entry/entry64.c for C entry code Lai Jiangshan
                   ` (33 subsequent siblings)
  40 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, H. Peter Anvin, Jan Kiszka, Joerg Roedel, Peter Zijlstra,
	Sean Christopherson

From: Lai Jiangshan <laijs@linux.alibaba.com>

The declaration of native_irq_return_iret is currently used only in
exc_double_fault().  It will be used in other places later, so move the
declaration to a header file in preparation.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/include/asm/proto.h | 1 +
 arch/x86/kernel/traps.c      | 2 --
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/proto.h b/arch/x86/include/asm/proto.h
index 8c5d1910a848..ee07b3cae213 100644
--- a/arch/x86/include/asm/proto.h
+++ b/arch/x86/include/asm/proto.h
@@ -13,6 +13,7 @@ void syscall_init(void);
 #ifdef CONFIG_X86_64
 void entry_SYSCALL_64(void);
 void entry_SYSCALL_64_safe_stack(void);
+extern unsigned char native_irq_return_iret[];
 long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2);
 #endif
 
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index cc6de3a01293..cf852b5e347f 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -359,8 +359,6 @@ DEFINE_IDTENTRY_DF(exc_double_fault)
 #endif
 
 #ifdef CONFIG_X86_ESPFIX64
-	extern unsigned char native_irq_return_iret[];
-
 	/*
 	 * If IRET takes a non-IST fault on the espfix64 stack, then we
 	 * end up promoting it to a doublefault.  In that case, take
-- 
2.19.1.6.gb485710b



* [PATCH V2 08/41] x86/entry: Add arch/x86/entry/entry64.c for C entry code
  2021-09-26 15:07 [PATCH V2 00/41] x86/entry/64: Convert a bunch of ASM entry code into C code Lai Jiangshan
                   ` (6 preceding siblings ...)
  2021-09-26 15:08 ` [PATCH V2 07/41] x86/traps: Move the declaration of native_irq_return_iret into proto.h Lai Jiangshan
@ 2021-09-26 15:08 ` Lai Jiangshan
  2021-09-26 15:08 ` [PATCH V2 09/41] x86/entry: Expose the address of .Lgs_change to entry64.c Lai Jiangshan
                   ` (32 subsequent siblings)
  40 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin

From: Lai Jiangshan <laijs@linux.alibaba.com>

Add a C file, entry64.c, to hold C entry code for traps and faults,
which will follow the same logic as the existing ASM code in entry_64.S.

The file is as low-level as entry_64.S, and its code may run in
environments where the GS base is a user-controlled value, or CR3 is the
PTI user CR3, or both.

None of the code in this file may be instrumented.  Most instrumentation
facilities can be disabled by the per-function attributes included in
__noinstr_section, but stack-protector cannot be disabled at function
granularity by many GCC versions supported for compiling the kernel, so
stack-protector is disabled for the whole file in the Makefile.

This prepares for later patches that implement C versions of the entry
code in entry64.c.

Suggested-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/Makefile  |  5 ++++-
 arch/x86/entry/entry64.c | 11 +++++++++++
 2 files changed, 15 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/entry/entry64.c

diff --git a/arch/x86/entry/Makefile b/arch/x86/entry/Makefile
index 7fec5dcf6438..492e0b113bd0 100644
--- a/arch/x86/entry/Makefile
+++ b/arch/x86/entry/Makefile
@@ -11,12 +11,15 @@ CFLAGS_REMOVE_common.o		= $(CC_FLAGS_FTRACE)
 
 CFLAGS_common.o			+= -fno-stack-protector
 
+CFLAGS_REMOVE_entry64.o		= -fstack-protector -fstack-protector-strong
+CFLAGS_entry64.o		+= -fno-stack-protector
+
 obj-y				:= entry_$(BITS).o thunk_$(BITS).o syscall_$(BITS).o
 obj-y				+= common.o
+obj-$(CONFIG_X86_64)		+= entry64.o
 
 obj-y				+= vdso/
 obj-y				+= vsyscall/
 
 obj-$(CONFIG_IA32_EMULATION)	+= entry_64_compat.o syscall_32.o
 obj-$(CONFIG_X86_X32_ABI)	+= syscall_x32.o
-
diff --git a/arch/x86/entry/entry64.c b/arch/x86/entry/entry64.c
new file mode 100644
index 000000000000..3a6d70367940
--- /dev/null
+++ b/arch/x86/entry/entry64.c
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ *  Copyright (C) 1991, 1992  Linus Torvalds
+ *  Copyright (C) 2000, 2001, 2002  Andi Kleen SuSE Labs
+ *  Copyright (C) 2000  Pavel Machek <pavel@suse.cz>
+ *  Copyright (C) 2021 Lai Jiangshan, Alibaba
+ *
+ * Handle entries and exits for hardware traps and faults.
+ *
+ * It is as low level as entry_64.S and its code can be running in the
+ * environments that the GS base is user controlled value, or the CR3
+ * is PTI user CR3 or both.
+ */
+#include <asm/traps.h>
-- 
2.19.1.6.gb485710b



* [PATCH V2 09/41] x86/entry: Expose the address of .Lgs_change to entry64.c
  2021-09-26 15:07 [PATCH V2 00/41] x86/entry/64: Convert a bunch of ASM entry code into C code Lai Jiangshan
                   ` (7 preceding siblings ...)
  2021-09-26 15:08 ` [PATCH V2 08/41] x86/entry: Add arch/x86/entry/entry64.c for C entry code Lai Jiangshan
@ 2021-09-26 15:08 ` Lai Jiangshan
  2021-09-26 15:08 ` [PATCH V2 10/41] x86/entry: Add C version of SWITCH_TO_KERNEL_CR3 as switch_to_kernel_cr3() Lai Jiangshan
                   ` (31 subsequent siblings)
  40 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin

From: Lai Jiangshan <laijs@linux.alibaba.com>

The address of .Lgs_change will be used in entry64.c in a later patch
when some entry code is implemented there, so expose it to entry64.c in
preparation.

The label .Lgs_change is still needed in the ASM code for the extable,
since the extable cannot use asm_load_gs_index_gs_change.  Otherwise:

	warning: objtool: __ex_table+0x0: don't know how to handle
	non-section reloc symbol asm_load_gs_index_gs_change

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry64.c  | 2 ++
 arch/x86/entry/entry_64.S | 3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/entry/entry64.c b/arch/x86/entry/entry64.c
index 3a6d70367940..7272266a3726 100644
--- a/arch/x86/entry/entry64.c
+++ b/arch/x86/entry/entry64.c
@@ -9,3 +9,5 @@
  * is PTI user CR3 or both.
  */
 #include <asm/traps.h>
+
+extern unsigned char asm_load_gs_index_gs_change[];
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 95d85b16710b..291732f571a7 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -729,6 +729,7 @@ _ASM_NOKPROBE(common_interrupt_return)
 SYM_FUNC_START(asm_load_gs_index)
 	FRAME_BEGIN
 	swapgs
+SYM_INNER_LABEL(asm_load_gs_index_gs_change, SYM_L_GLOBAL)
 .Lgs_change:
 	movl	%edi, %gs
 2:	ALTERNATIVE "", "mfence", X86_BUG_SWAPGS_FENCE
@@ -1006,7 +1007,7 @@ SYM_CODE_START_LOCAL(error_entry)
 	movl	%ecx, %eax			/* zero extend */
 	cmpq	%rax, RIP+8(%rsp)
 	je	.Lbstep_iret
-	cmpq	$.Lgs_change, RIP+8(%rsp)
+	cmpq	$asm_load_gs_index_gs_change, RIP+8(%rsp)
 	jne	.Lerror_entry_done_lfence
 
 	/*
-- 
2.19.1.6.gb485710b



* [PATCH V2 10/41] x86/entry: Add C version of SWITCH_TO_KERNEL_CR3 as switch_to_kernel_cr3()
  2021-09-26 15:07 [PATCH V2 00/41] x86/entry/64: Convert a bunch of ASM entry code into C code Lai Jiangshan
                   ` (8 preceding siblings ...)
  2021-09-26 15:08 ` [PATCH V2 09/41] x86/entry: Expose the address of .Lgs_change to entry64.c Lai Jiangshan
@ 2021-09-26 15:08 ` Lai Jiangshan
  2021-09-26 15:08 ` [PATCH V2 11/41] x86/entry: Add C user_entry_swapgs_and_fence() and kernel_entry_fence_no_swapgs() Lai Jiangshan
                   ` (30 subsequent siblings)
  40 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin

From: Lai Jiangshan <laijs@linux.alibaba.com>

The C version switch_to_kernel_cr3() implements SWITCH_TO_KERNEL_CR3().

No functional difference intended.
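
For comparison, the ASM side being mirrored looks like this (abbreviated
from arch/x86/entry/calling.h):

	.macro SWITCH_TO_KERNEL_CR3 scratch_reg:req
	ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
	mov	%cr3, \scratch_reg
	/* ADJUST_KERNEL_CR3: set the NOFLUSH bit if PCID is available,
	 * then clear the PTI pagetable/PCID bits to point CR3 at the
	 * kernel pagetables. */
	ADJUST_KERNEL_CR3 \scratch_reg
	mov	\scratch_reg, %cr3
	.Lend_\@:
	.endm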

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry64.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/arch/x86/entry/entry64.c b/arch/x86/entry/entry64.c
index 7272266a3726..77838e19f1ac 100644
--- a/arch/x86/entry/entry64.c
+++ b/arch/x86/entry/entry64.c
@@ -11,3 +11,27 @@
 #include <asm/traps.h>
 
 extern unsigned char asm_load_gs_index_gs_change[];
+
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+static __always_inline void pti_switch_to_kernel_cr3(unsigned long user_cr3)
+{
+	/*
+	 * Clear PCID and "PAGE_TABLE_ISOLATION bit", point CR3
+	 * at kernel pagetables:
+	 */
+	unsigned long cr3 = user_cr3 & ~PTI_USER_PGTABLE_AND_PCID_MASK;
+
+	if (static_cpu_has(X86_FEATURE_PCID))
+		cr3 |= X86_CR3_PCID_NOFLUSH;
+
+	native_write_cr3(cr3);
+}
+
+static __always_inline void switch_to_kernel_cr3(void)
+{
+	if (static_cpu_has(X86_FEATURE_PTI))
+		pti_switch_to_kernel_cr3(__native_read_cr3());
+}
+#else
+static __always_inline void switch_to_kernel_cr3(void) {}
+#endif
-- 
2.19.1.6.gb485710b



* [PATCH V2 11/41] x86/entry: Add C user_entry_swapgs_and_fence() and kernel_entry_fence_no_swapgs()
  2021-09-26 15:07 [PATCH V2 00/41] x86/entry/64: Convert a bunch of ASM entry code into C code Lai Jiangshan
                   ` (9 preceding siblings ...)
  2021-09-26 15:08 ` [PATCH V2 10/41] x86/entry: Add C version of SWITCH_TO_KERNEL_CR3 as switch_to_kernel_cr3() Lai Jiangshan
@ 2021-09-26 15:08 ` Lai Jiangshan
  2021-09-26 15:08 ` [PATCH V2 12/41] x86/traps: Move pt_regs only in fixup_bad_iret() Lai Jiangshan
                   ` (29 subsequent siblings)
  40 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Josh Poimboeuf, Andy Lutomirski, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin

From: Lai Jiangshan <laijs@linux.alibaba.com>

The C user_entry_swapgs_and_fence() implements the ASM code:
	swapgs
	FENCE_SWAPGS_USER_ENTRY

It will be used in the user entry swapgs code path, doing the swapgs and
lfence to prevent a speculative swapgs when coming from kernel space.

The C kernel_entry_fence_no_swapgs() implements the ASM code:
	FENCE_SWAPGS_KERNEL_ENTRY

It will be used in the kernel entry non-swapgs code path to prevent the
swapgs from getting speculatively skipped when coming from user space.

Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry64.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/arch/x86/entry/entry64.c b/arch/x86/entry/entry64.c
index 77838e19f1ac..dafae60d31f9 100644
--- a/arch/x86/entry/entry64.c
+++ b/arch/x86/entry/entry64.c
@@ -35,3 +35,24 @@ static __always_inline void switch_to_kernel_cr3(void)
 #else
 static __always_inline void switch_to_kernel_cr3(void) {}
 #endif
+
+/*
+ * Mitigate Spectre v1 for conditional swapgs code paths.
+ *
+ * user_entry_swapgs_and_fence is used in the user entry swapgs code path,
+ * to prevent a speculative swapgs when coming from kernel space.
+ *
+ * kernel_entry_fence_no_swapgs is used in the kernel entry non-swapgs code
+ * path, to prevent the swapgs from getting speculatively skipped when coming
+ * from user space.
+ */
+static __always_inline void user_entry_swapgs_and_fence(void)
+{
+	native_swapgs();
+	alternative("", "lfence", X86_FEATURE_FENCE_SWAPGS_USER);
+}
+
+static __always_inline void kernel_entry_fence_no_swapgs(void)
+{
+	alternative("", "lfence", X86_FEATURE_FENCE_SWAPGS_KERNEL);
+}
-- 
2.19.1.6.gb485710b



* [PATCH V2 12/41] x86/traps: Move pt_regs only in fixup_bad_iret()
  2021-09-26 15:07 [PATCH V2 00/41] x86/entry/64: Convert a bunch of ASM entry code into C code Lai Jiangshan
                   ` (10 preceding siblings ...)
  2021-09-26 15:08 ` [PATCH V2 11/41] x86/entry: Add C user_entry_swapgs_and_fence() and kernel_entry_fence_no_swapgs() Lai Jiangshan
@ 2021-09-26 15:08 ` Lai Jiangshan
  2021-09-26 15:08 ` [PATCH V2 13/41] x86/entry: Switch the stack after error_entry() returns Lai Jiangshan
                   ` (28 subsequent siblings)
  40 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, Youquan Song,
	Peter Zijlstra, Tony Luck, Sean Christopherson

From: Lai Jiangshan <laijs@linux.alibaba.com>

Make fixup_bad_iret() work like sync_regs(), which does not move the
return address of error_entry().

This prepares for a later patch which implements the body of
error_entry() in C code; fixup_bad_iret() cannot handle the return
address when it is called from C code.
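
With the return address out of the way, the later C error_entry()
(patch 16) can simply chain the two helpers:

	/* Fix up the bad IRET frame and put pt_regs onto the task stack. */
	return sync_regs(fixup_bad_iret(eregs));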

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry_64.S    |  5 ++++-
 arch/x86/include/asm/traps.h |  2 +-
 arch/x86/kernel/traps.c      | 17 ++++++-----------
 3 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 291732f571a7..9921a823b2c6 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1037,9 +1037,12 @@ SYM_CODE_START_LOCAL(error_entry)
 	 * Pretend that the exception came from user mode: set up pt_regs
 	 * as if we faulted immediately after IRET.
 	 */
-	mov	%rsp, %rdi
+	popq	%r12				/* save return addr in %r12 */
+	movq	%rsp, %rdi			/* arg0 = pt_regs pointer */
 	call	fixup_bad_iret
 	mov	%rax, %rsp
+	ENCODE_FRAME_POINTER
+	pushq	%r12
 	jmp	.Lerror_entry_from_usermode_after_swapgs
 SYM_CODE_END(error_entry)
 
diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 6221be7cafc3..1cdd7e8bcba7 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -13,7 +13,7 @@
 #ifdef CONFIG_X86_64
 asmlinkage __visible notrace struct pt_regs *sync_regs(struct pt_regs *eregs);
 asmlinkage __visible notrace
-struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s);
+struct pt_regs *fixup_bad_iret(struct pt_regs *bad_regs);
 void __init trap_init(void);
 asmlinkage __visible noinstr struct pt_regs *vc_switch_off_ist(struct pt_regs *eregs);
 #endif
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index cf852b5e347f..0afa16ea3702 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -759,13 +759,8 @@ asmlinkage __visible noinstr struct pt_regs *vc_switch_off_ist(struct pt_regs *r
 }
 #endif
 
-struct bad_iret_stack {
-	void *error_entry_ret;
-	struct pt_regs regs;
-};
-
 asmlinkage __visible noinstr
-struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s)
+struct pt_regs *fixup_bad_iret(struct pt_regs *bad_regs)
 {
 	/*
 	 * This is called from entry_64.S early in handling a fault
@@ -775,19 +770,19 @@ struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s)
 	 * just below the IRET frame) and we want to pretend that the
 	 * exception came from the IRET target.
 	 */
-	struct bad_iret_stack tmp, *new_stack =
-		(struct bad_iret_stack *)__this_cpu_read(cpu_tss_rw.x86_tss.sp0) - 1;
+	struct pt_regs tmp, *new_stack =
+		(struct pt_regs *)__this_cpu_read(cpu_tss_rw.x86_tss.sp0) - 1;
 
 	/* Copy the IRET target to the temporary storage. */
-	__memcpy(&tmp.regs.ip, (void *)s->regs.sp, 5*8);
+	__memcpy(&tmp.ip, (void *)bad_regs->sp, 5*8);
 
 	/* Copy the remainder of the stack from the current stack. */
-	__memcpy(&tmp, s, offsetof(struct bad_iret_stack, regs.ip));
+	__memcpy(&tmp, bad_regs, offsetof(struct pt_regs, ip));
 
 	/* Update the entry stack */
 	__memcpy(new_stack, &tmp, sizeof(tmp));
 
-	BUG_ON(!user_mode(&new_stack->regs));
+	BUG_ON(!user_mode(new_stack));
 	return new_stack;
 }
 #endif
-- 
2.19.1.6.gb485710b



* [PATCH V2 13/41] x86/entry: Switch the stack after error_entry() returns
  2021-09-26 15:07 [PATCH V2 00/41] x86/entry/64: Convert a bunch of ASM entry code into C code Lai Jiangshan
                   ` (11 preceding siblings ...)
  2021-09-26 15:08 ` [PATCH V2 12/41] x86/traps: Move pt_regs only in fixup_bad_iret() Lai Jiangshan
@ 2021-09-26 15:08 ` Lai Jiangshan
  2021-09-26 15:08 ` [PATCH V2 14/41] x86/entry: move PUSH_AND_CLEAR_REGS out of error_entry Lai Jiangshan
                   ` (27 subsequent siblings)
  40 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin

From: Lai Jiangshan <laijs@linux.alibaba.com>

error_entry() calls sync_regs() to settle/copy the pt_regs and switches
the stack directly after sync_regs().  But because error_entry() is
called from the ASM entry code, the stack switch also has to move
error_entry()'s return address along with it, which makes the behavior
tangled.

Switching the stack after error_entry() returns makes the code simpler
and more intuitive.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry_64.S | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 9921a823b2c6..dd453a8e7317 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -323,6 +323,8 @@ SYM_CODE_END(ret_from_fork)
 .macro idtentry_body cfunc has_error_code:req
 
 	call	error_entry
+	movq	%rax, %rsp			/* switch stack settled by sync_regs() */
+	ENCODE_FRAME_POINTER
 	UNWIND_HINT_REGS
 
 	movq	%rsp, %rdi			/* pt_regs pointer into 1st argument*/
@@ -979,19 +981,16 @@ SYM_CODE_START_LOCAL(error_entry)
 	/* We have user CR3.  Change to kernel CR3. */
 	SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
 
+	leaq	8(%rsp), %rdi			/* arg0 = pt_regs pointer */
 .Lerror_entry_from_usermode_after_swapgs:
 	/* Put us onto the real thread stack. */
-	popq	%r12				/* save return addr in %r12 */
-	movq	%rsp, %rdi			/* arg0 = pt_regs pointer */
 	call	sync_regs
-	movq	%rax, %rsp			/* switch stack */
-	ENCODE_FRAME_POINTER
-	pushq	%r12
 	ret
 
 .Lerror_entry_done_lfence:
 	FENCE_SWAPGS_KERNEL_ENTRY
 .Lerror_entry_done:
+	leaq	8(%rsp), %rax			/* return pt_regs pointer */
 	ret
 
 	/*
@@ -1037,12 +1036,9 @@ SYM_CODE_START_LOCAL(error_entry)
 	 * Pretend that the exception came from user mode: set up pt_regs
 	 * as if we faulted immediately after IRET.
 	 */
-	popq	%r12				/* save return addr in %r12 */
-	movq	%rsp, %rdi			/* arg0 = pt_regs pointer */
+	leaq	8(%rsp), %rdi			/* arg0 = pt_regs pointer */
 	call	fixup_bad_iret
-	mov	%rax, %rsp
-	ENCODE_FRAME_POINTER
-	pushq	%r12
+	mov	%rax, %rdi
 	jmp	.Lerror_entry_from_usermode_after_swapgs
 SYM_CODE_END(error_entry)
 
-- 
2.19.1.6.gb485710b



* [PATCH V2 14/41] x86/entry: move PUSH_AND_CLEAR_REGS out of error_entry
  2021-09-26 15:07 [PATCH V2 00/41] x86/entry/64: Convert a bunch of ASM entry code into C code Lai Jiangshan
                   ` (12 preceding siblings ...)
  2021-09-26 15:08 ` [PATCH V2 13/41] x86/entry: Switch the stack after error_entry() returns Lai Jiangshan
@ 2021-09-26 15:08 ` Lai Jiangshan
  2021-09-26 15:08 ` [PATCH V2 15/41] objtool: Allow .entry.text function using CLD instruction Lai Jiangshan
                   ` (26 subsequent siblings)
  40 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin

From: Lai Jiangshan <laijs@linux.alibaba.com>

Moving PUSH_AND_CLEAR_REGS out of error_entry doesn't change any
functionality, but it enlarges the code size:

size arch/x86/entry/entry_64.o.before:
   text	   data	    bss	    dec	    hex	filename
  17916	    384	      0	  18300	   477c	arch/x86/entry/entry_64.o

size --format=SysV arch/x86/entry/entry_64.o.before:
.entry.text                      5528      0
.orc_unwind                      6456      0
.orc_unwind_ip                   4304      0

size arch/x86/entry/entry_64.o.after:
   text	   data	    bss	    dec	    hex	filename
  26868	    384	      0	  27252	   6a74	arch/x86/entry/entry_64.o

size --format=SysV arch/x86/entry/entry_64.o.after:
.entry.text                      8200      0
.orc_unwind                     10224      0
.orc_unwind_ip                   6816      0

But .entry.text on x86_64 is 2M-aligned, so enlarging it to 8.2k does
not enlarge the final text size.

The .orc_unwind[_ip] tables grow because the change adds many push
instructions.

This prepares for converting the whole error_entry() into C code.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry_64.S | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index dd453a8e7317..757e7155670e 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -322,6 +322,9 @@ SYM_CODE_END(ret_from_fork)
  */
 .macro idtentry_body cfunc has_error_code:req
 
+	PUSH_AND_CLEAR_REGS
+	ENCODE_FRAME_POINTER
+
 	call	error_entry
 	movq	%rax, %rsp			/* switch stack settled by sync_regs() */
 	ENCODE_FRAME_POINTER
@@ -967,8 +970,6 @@ SYM_CODE_END(paranoid_exit)
 SYM_CODE_START_LOCAL(error_entry)
 	UNWIND_HINT_FUNC
 	cld
-	PUSH_AND_CLEAR_REGS save_ret=1
-	ENCODE_FRAME_POINTER 8
 	testb	$3, CS+8(%rsp)
 	jz	.Lerror_kernelspace
 
-- 
2.19.1.6.gb485710b



* [PATCH V2 15/41] objtool: Allow .entry.text function using CLD instruction
  2021-09-26 15:07 [PATCH V2 00/41] x86/entry/64: Convert a bunch of ASM entry code into C code Lai Jiangshan
                   ` (13 preceding siblings ...)
  2021-09-26 15:08 ` [PATCH V2 14/41] x86/entry: move PUSH_AND_CLEAR_REGS out of error_entry Lai Jiangshan
@ 2021-09-26 15:08 ` Lai Jiangshan
  2021-09-26 15:08 ` [PATCH V2 16/41] x86/entry: Implement the whole error_entry() as C code Lai Jiangshan
                   ` (25 subsequent siblings)
  40 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel; +Cc: Lai Jiangshan, Josh Poimboeuf, Peter Zijlstra

From: Lai Jiangshan <laijs@linux.alibaba.com>

The whole error_entry() will be implemented in C, and it issues a CLD
instruction, so allow .entry.text functions to use CLD without objtool
warning about it being redundant.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 tools/objtool/check.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 84e59a97bab6..2c775317b864 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -3103,7 +3103,7 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,
 			break;
 
 		case INSN_CLD:
-			if (!state.df && func) {
+			if (!state.df && func && strcmp(sec->name, ".entry.text")) {
 				WARN_FUNC("redundant CLD", sec, insn->offset);
 				return 1;
 			}
-- 
2.19.1.6.gb485710b



* [PATCH V2 16/41] x86/entry: Implement the whole error_entry() as C code
  2021-09-26 15:07 [PATCH V2 00/41] x86/entry/64: Convert a bunch of ASM entry code into C code Lai Jiangshan
                   ` (14 preceding siblings ...)
  2021-09-26 15:08 ` [PATCH V2 15/41] objtool: Allow .entry.text function using CLD instruction Lai Jiangshan
@ 2021-09-26 15:08 ` Lai Jiangshan
  2021-09-28 21:34   ` Brian Gerst
  2021-09-26 15:08 ` [PATCH V2 17/41] x86/entry: Make paranoid_exit() callable Lai Jiangshan
                   ` (24 subsequent siblings)
  40 siblings, 1 reply; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, Youquan Song,
	Peter Zijlstra, Tony Luck

From: Lai Jiangshan <laijs@linux.alibaba.com>

All the needed facilities are now in place in entry64.c, so the whole
error_entry() can be implemented in C there.  The C version is generally
more readable and easier to update/improve.

No functional change intended, except that a check for X86_FEATURE_XENPV
is added because the new error_entry() does not use the paravirtualized
SWAPGS but rather native_swapgs().  For XENPV, error_entry() has nothing
to do, so it can return directly.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry64.c     | 76 ++++++++++++++++++++++++++++++++++
 arch/x86/entry/entry_64.S    | 80 +-----------------------------------
 arch/x86/include/asm/traps.h |  1 +
 3 files changed, 78 insertions(+), 79 deletions(-)

diff --git a/arch/x86/entry/entry64.c b/arch/x86/entry/entry64.c
index dafae60d31f9..5f2be4c3f333 100644
--- a/arch/x86/entry/entry64.c
+++ b/arch/x86/entry/entry64.c
@@ -56,3 +56,78 @@ static __always_inline void kernel_entry_fence_no_swapgs(void)
 {
 	alternative("", "lfence", X86_FEATURE_FENCE_SWAPGS_KERNEL);
 }
+
+/*
+ * Put pt_regs onto the task stack and switch GS and CR3 if needed.
+ * The actual stack switch is done in entry_64.S.
+ *
+ * Be careful, it might be in the user CR3 and user GS base at the start
+ * of the function.
+ */
+asmlinkage __visible __entry_text
+struct pt_regs *error_entry(struct pt_regs *eregs)
+{
+	unsigned long iret_ip = (unsigned long)native_irq_return_iret;
+
+	asm volatile ("cld");
+
+	/*
+	 * When XENPV, it is already in the task stack, and it can't fault
+	 * from native_irq_return_iret and asm_load_gs_index_gs_change()
+	 * since XENPV uses its own pvops for iret and load_gs_index, and
+	 * also it doesn't use PTI.  So it can directly return and
+	 * native_swapgs() can be used in the following code.
+	 */
+	if (static_cpu_has(X86_FEATURE_XENPV))
+		return eregs;
+
+	if (user_mode(eregs)) {
+		/*
+		 * We entered from user mode.
+		 * Switch to kernel gsbase and CR3.
+		 */
+		user_entry_swapgs_and_fence();
+		switch_to_kernel_cr3();
+
+		/* Put pt_regs onto the task stack. */
+		return sync_regs(eregs);
+	}
+
+	/*
+	 * There are two places in the kernel that can potentially fault with
+	 * usergs. Handle them here.  B stepping K8s sometimes report a
+	 * truncated RIP for IRET exceptions returning to compat mode. Check
+	 * for these here too.
+	 */
+	if ((eregs->ip == iret_ip) || (eregs->ip == (unsigned int)iret_ip)) {
+		eregs->ip = iret_ip; /* Fix truncated RIP */
+
+		/*
+		 * We came from an IRET to user mode, so we have user
+		 * gsbase and CR3.  Switch to kernel gsbase and CR3:
+		 */
+		user_entry_swapgs_and_fence();
+		switch_to_kernel_cr3();
+
+		/*
+		 * Pretend that the exception came from user mode: set up
+		 * pt_regs as if we faulted immediately after IRET and put
+		 * pt_regs onto the real task stack.
+		 */
+		return sync_regs(fixup_bad_iret(eregs));
+	}
+
+	/*
+	 * Hack: asm_load_gs_index_gs_change can fail with user gsbase.
+	 * If this happens, fix up gsbase and proceed.  We'll fix up the
+	 * exception and land in asm_load_gs_index_gs_change's error
+	 * handler with kernel gsbase.
+	 */
+	if (eregs->ip == (unsigned long)asm_load_gs_index_gs_change)
+		user_entry_swapgs_and_fence();
+	else
+		kernel_entry_fence_no_swapgs();
+
+	/* Enter from kernel, don't move pt_regs */
+	return eregs;
+}
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 757e7155670e..169ee14cc2d6 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -325,6 +325,7 @@ SYM_CODE_END(ret_from_fork)
 	PUSH_AND_CLEAR_REGS
 	ENCODE_FRAME_POINTER
 
+	movq	%rsp, %rdi
 	call	error_entry
 	movq	%rax, %rsp			/* switch stack settled by sync_regs() */
 	ENCODE_FRAME_POINTER
@@ -964,85 +965,6 @@ SYM_CODE_START_LOCAL(paranoid_exit)
 	jmp		restore_regs_and_return_to_kernel
 SYM_CODE_END(paranoid_exit)
 
-/*
- * Save all registers in pt_regs, and switch GS if needed.
- */
-SYM_CODE_START_LOCAL(error_entry)
-	UNWIND_HINT_FUNC
-	cld
-	testb	$3, CS+8(%rsp)
-	jz	.Lerror_kernelspace
-
-	/*
-	 * We entered from user mode or we're pretending to have entered
-	 * from user mode due to an IRET fault.
-	 */
-	SWAPGS
-	FENCE_SWAPGS_USER_ENTRY
-	/* We have user CR3.  Change to kernel CR3. */
-	SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
-
-	leaq	8(%rsp), %rdi			/* arg0 = pt_regs pointer */
-.Lerror_entry_from_usermode_after_swapgs:
-	/* Put us onto the real thread stack. */
-	call	sync_regs
-	ret
-
-.Lerror_entry_done_lfence:
-	FENCE_SWAPGS_KERNEL_ENTRY
-.Lerror_entry_done:
-	leaq	8(%rsp), %rax			/* return pt_regs pointer */
-	ret
-
-	/*
-	 * There are two places in the kernel that can potentially fault with
-	 * usergs. Handle them here.  B stepping K8s sometimes report a
-	 * truncated RIP for IRET exceptions returning to compat mode. Check
-	 * for these here too.
-	 */
-.Lerror_kernelspace:
-	leaq	native_irq_return_iret(%rip), %rcx
-	cmpq	%rcx, RIP+8(%rsp)
-	je	.Lerror_bad_iret
-	movl	%ecx, %eax			/* zero extend */
-	cmpq	%rax, RIP+8(%rsp)
-	je	.Lbstep_iret
-	cmpq	$asm_load_gs_index_gs_change, RIP+8(%rsp)
-	jne	.Lerror_entry_done_lfence
-
-	/*
-	 * hack: .Lgs_change can fail with user gsbase.  If this happens, fix up
-	 * gsbase and proceed.  We'll fix up the exception and land in
-	 * .Lgs_change's error handler with kernel gsbase.
-	 */
-	SWAPGS
-	FENCE_SWAPGS_USER_ENTRY
-	jmp .Lerror_entry_done
-
-.Lbstep_iret:
-	/* Fix truncated RIP */
-	movq	%rcx, RIP+8(%rsp)
-	/* fall through */
-
-.Lerror_bad_iret:
-	/*
-	 * We came from an IRET to user mode, so we have user
-	 * gsbase and CR3.  Switch to kernel gsbase and CR3:
-	 */
-	SWAPGS
-	FENCE_SWAPGS_USER_ENTRY
-	SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
-
-	/*
-	 * Pretend that the exception came from user mode: set up pt_regs
-	 * as if we faulted immediately after IRET.
-	 */
-	leaq	8(%rsp), %rdi			/* arg0 = pt_regs pointer */
-	call	fixup_bad_iret
-	mov	%rax, %rdi
-	jmp	.Lerror_entry_from_usermode_after_swapgs
-SYM_CODE_END(error_entry)
-
 SYM_CODE_START_LOCAL(error_return)
 	UNWIND_HINT_REGS
 	DEBUG_ENTRY_ASSERT_IRQS_OFF
diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 1cdd7e8bcba7..686461ac9803 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -14,6 +14,7 @@
 asmlinkage __visible notrace struct pt_regs *sync_regs(struct pt_regs *eregs);
 asmlinkage __visible notrace
 struct pt_regs *fixup_bad_iret(struct pt_regs *bad_regs);
+asmlinkage __visible notrace struct pt_regs *error_entry(struct pt_regs *eregs);
 void __init trap_init(void);
 asmlinkage __visible noinstr struct pt_regs *vc_switch_off_ist(struct pt_regs *eregs);
 #endif
-- 
2.19.1.6.gb485710b



* [PATCH V2 17/41] x86/entry: Make paranoid_exit() callable
  2021-09-26 15:07 [PATCH V2 00/41] x86/entry/64: Convert a bunch of ASM entry code into C code Lai Jiangshan
                   ` (15 preceding siblings ...)
  2021-09-26 15:08 ` [PATCH V2 16/41] x86/entry: Implement the whole error_entry() as C code Lai Jiangshan
@ 2021-09-26 15:08 ` Lai Jiangshan
  2021-09-26 15:08 ` [PATCH V2 18/41] x86/entry: Call paranoid_exit() in asm_exc_nmi() Lai Jiangshan
                   ` (23 subsequent siblings)
  40 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin

From: Lai Jiangshan <laijs@linux.alibaba.com>

Move the last JMP out of paranoid_exit() and make it callable.

This allows paranoid_exit() to be rewritten in C later and also allows
asm_exc_nmi() to call it to avoid duplicated code.

No functional change intended.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry_64.S | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 169ee14cc2d6..202253c9a4f2 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -443,7 +443,8 @@ SYM_CODE_START(\asmsym)
 
 	call	\cfunc
 
-	jmp	paranoid_exit
+	call	paranoid_exit
+	jmp	restore_regs_and_return_to_kernel
 
 	/* Switch to the regular task stack and use the noist entry point */
 .Lfrom_usermode_switch_stack_\@:
@@ -520,7 +521,8 @@ SYM_CODE_START(\asmsym)
 	 * identical to the stack in the IRET frame or the VC fall-back stack,
 	 * so it is definitely mapped even with PTI enabled.
 	 */
-	jmp	paranoid_exit
+	call	paranoid_exit
+	jmp	restore_regs_and_return_to_kernel
 
 	/* Switch to the regular task stack */
 .Lfrom_usermode_switch_stack_\@:
@@ -550,7 +552,8 @@ SYM_CODE_START(\asmsym)
 	movq	$-1, ORIG_RAX(%rsp)	/* no syscall to restart */
 	call	\cfunc
 
-	jmp	paranoid_exit
+	call	paranoid_exit
+	jmp	restore_regs_and_return_to_kernel
 
 _ASM_NOKPROBE(\asmsym)
 SYM_CODE_END(\asmsym)
@@ -937,7 +940,7 @@ SYM_CODE_END(paranoid_entry)
  *     Y        User space GSBASE, must be restored unconditionally
  */
 SYM_CODE_START_LOCAL(paranoid_exit)
-	UNWIND_HINT_REGS
+	UNWIND_HINT_REGS offset=8
 	/*
 	 * The order of operations is important. RESTORE_CR3 requires
 	 * kernel GSBASE.
@@ -953,16 +956,17 @@ SYM_CODE_START_LOCAL(paranoid_exit)
 
 	/* With FSGSBASE enabled, unconditionally restore GSBASE */
 	wrgsbase	%rbx
-	jmp		restore_regs_and_return_to_kernel
+	ret
 
 .Lparanoid_exit_checkgs:
 	/* On non-FSGSBASE systems, conditionally do SWAPGS */
 	testl		%ebx, %ebx
-	jnz		restore_regs_and_return_to_kernel
+	jnz		.Lparanoid_exit_done
 
 	/* We are returning to a context with user GSBASE */
 	swapgs
-	jmp		restore_regs_and_return_to_kernel
+.Lparanoid_exit_done:
+	ret
 SYM_CODE_END(paranoid_exit)
 
 SYM_CODE_START_LOCAL(error_return)
-- 
2.19.1.6.gb485710b



* [PATCH V2 18/41] x86/entry: Call paranoid_exit() in asm_exc_nmi()
  2021-09-26 15:07 [PATCH V2 00/41] x86/entry/64: Convert a bunch of ASM entry code into C code Lai Jiangshan
                   ` (16 preceding siblings ...)
  2021-09-26 15:08 ` [PATCH V2 17/41] x86/entry: Make paranoid_exit() callable Lai Jiangshan
@ 2021-09-26 15:08 ` Lai Jiangshan
  2021-09-26 15:08 ` [PATCH V2 19/41] x86/entry: move PUSH_AND_CLEAR_REGS out of paranoid_entry Lai Jiangshan
                   ` (22 subsequent siblings)
  40 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin

From: Lai Jiangshan <laijs@linux.alibaba.com>

The code between "call exc_nmi" and nmi_restore is the same as
paranoid_exit(), so just call paranoid_exit() instead of keeping the
open-coded duplicate.

No functional change intended.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry_64.S | 34 +++++-----------------------------
 1 file changed, 5 insertions(+), 29 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 202253c9a4f2..a0d73dc0d2f3 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -922,8 +922,7 @@ SYM_CODE_END(paranoid_entry)
 
 /*
  * "Paranoid" exit path from exception stack.  This is invoked
- * only on return from non-NMI IST interrupts that came
- * from kernel space.
+ * only on return from IST interrupts that came from kernel space.
  *
  * We may be returning to very strange contexts (e.g. very early
  * in syscall entry), so checking for preemption here would
@@ -1271,11 +1270,7 @@ end_repeat_nmi:
 	pushq	$-1				/* ORIG_RAX: no syscall to restart */
 
 	/*
-	 * Use paranoid_entry to handle SWAPGS, but no need to use paranoid_exit
-	 * as we should not be calling schedule in NMI context.
-	 * Even with normal interrupts enabled. An NMI should not be
-	 * setting NEED_RESCHED or anything that normal interrupts and
-	 * exceptions might do.
+	 * Use paranoid_entry to handle SWAPGS and CR3.
 	 */
 	call	paranoid_entry
 	UNWIND_HINT_REGS
@@ -1284,31 +1279,12 @@ end_repeat_nmi:
 	movq	$-1, %rsi
 	call	exc_nmi
 
-	/* Always restore stashed CR3 value (see paranoid_entry) */
-	RESTORE_CR3 scratch_reg=%r15 save_reg=%r14
-
 	/*
-	 * The above invocation of paranoid_entry stored the GSBASE
-	 * related information in R/EBX depending on the availability
-	 * of FSGSBASE.
-	 *
-	 * If FSGSBASE is enabled, restore the saved GSBASE value
-	 * unconditionally, otherwise take the conditional SWAPGS path.
+	 * Use paranoid_exit to handle SWAPGS and CR3, but no need to use
+	 * restore_regs_and_return_to_kernel as we must handle nested NMI.
 	 */
-	ALTERNATIVE "jmp nmi_no_fsgsbase", "", X86_FEATURE_FSGSBASE
-
-	wrgsbase	%rbx
-	jmp	nmi_restore
-
-nmi_no_fsgsbase:
-	/* EBX == 0 -> invoke SWAPGS */
-	testl	%ebx, %ebx
-	jnz	nmi_restore
-
-nmi_swapgs:
-	swapgs
+	call	paranoid_exit
 
-nmi_restore:
 	POP_REGS
 
 	/*
-- 
2.19.1.6.gb485710b


* [PATCH V2 19/41] x86/entry: move PUSH_AND_CLEAR_REGS out of paranoid_entry
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin

From: Lai Jiangshan <laijs@linux.alibaba.com>

This prepares for converting the whole paranoid_entry() into C code.
Since PUSH_AND_CLEAR_REGS now runs before the user-mode check, the full
pt_regs frame is already pushed at that point, so the CS test offset
changes from CS-ORIG_RAX to CS.

No functional change intended.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry_64.S | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index a0d73dc0d2f3..bd6bce341360 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -322,9 +322,6 @@ SYM_CODE_END(ret_from_fork)
  */
 .macro idtentry_body cfunc has_error_code:req
 
-	PUSH_AND_CLEAR_REGS
-	ENCODE_FRAME_POINTER
-
 	movq	%rsp, %rdi
 	call	error_entry
 	movq	%rax, %rsp			/* switch stack settled by sync_regs() */
@@ -376,6 +373,9 @@ SYM_CODE_START(\asmsym)
 .Lfrom_usermode_no_gap_\@:
 	.endif
 
+	PUSH_AND_CLEAR_REGS
+	ENCODE_FRAME_POINTER
+
 	idtentry_body \cfunc \has_error_code
 
 _ASM_NOKPROBE(\asmsym)
@@ -427,11 +427,14 @@ SYM_CODE_START(\asmsym)
 
 	pushq	$-1			/* ORIG_RAX: no syscall to restart */
 
+	PUSH_AND_CLEAR_REGS
+	ENCODE_FRAME_POINTER
+
 	/*
 	 * If the entry is from userspace, switch stacks and treat it as
 	 * a normal entry.
 	 */
-	testb	$3, CS-ORIG_RAX(%rsp)
+	testb	$3, CS(%rsp)
 	jnz	.Lfrom_usermode_switch_stack_\@
 
 	/* paranoid_entry returns GS information for paranoid_exit in EBX. */
@@ -481,11 +484,14 @@ SYM_CODE_START(\asmsym)
 	UNWIND_HINT_IRET_REGS
 	ASM_CLAC
 
+	PUSH_AND_CLEAR_REGS
+	ENCODE_FRAME_POINTER
+
 	/*
 	 * If the entry is from userspace, switch stacks and treat it as
 	 * a normal entry.
 	 */
-	testb	$3, CS-ORIG_RAX(%rsp)
+	testb	$3, CS(%rsp)
 	jnz	.Lfrom_usermode_switch_stack_\@
 
 	/*
@@ -543,6 +549,9 @@ SYM_CODE_START(\asmsym)
 	UNWIND_HINT_IRET_REGS offset=8
 	ASM_CLAC
 
+	PUSH_AND_CLEAR_REGS
+	ENCODE_FRAME_POINTER
+
 	/* paranoid_entry returns GS information for paranoid_exit in EBX. */
 	call	paranoid_entry
 	UNWIND_HINT_REGS
@@ -855,8 +864,6 @@ SYM_CODE_END(xen_failsafe_callback)
 SYM_CODE_START_LOCAL(paranoid_entry)
 	UNWIND_HINT_FUNC
 	cld
-	PUSH_AND_CLEAR_REGS save_ret=1
-	ENCODE_FRAME_POINTER 8
 
 	/*
 	 * Always stash CR3 in %r14.  This value will be restored,
@@ -1269,6 +1276,9 @@ end_repeat_nmi:
 	 */
 	pushq	$-1				/* ORIG_RAX: no syscall to restart */
 
+	PUSH_AND_CLEAR_REGS
+	ENCODE_FRAME_POINTER
+
 	/*
 	 * Use paranoid_entry to handle SWAPGS and CR3.
 	 */
-- 
2.19.1.6.gb485710b


* [PATCH V2 20/41] x86/entry: Add the C version ist_switch_to_kernel_cr3()
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin

From: Lai Jiangshan <laijs@linux.alibaba.com>

It switches to the kernel CR3 and returns the original CR3; the caller
should save the returned value and pass it back when exiting.

It is the C version of SAVE_AND_SWITCH_TO_KERNEL_CR3.

No functional difference intended.
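
For illustration only (an editor's sketch, not part of the patch), the
intended pairing with the restore side added in the next patch is:

	unsigned long cr3;

	cr3 = ist_switch_to_kernel_cr3();	/* stash the entry CR3 */
	/* ... run the IST handler with the kernel CR3 active ... */
	ist_restore_cr3(cr3);			/* restore it verbatim */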

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry64.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/arch/x86/entry/entry64.c b/arch/x86/entry/entry64.c
index 5f2be4c3f333..faee44a3d1d8 100644
--- a/arch/x86/entry/entry64.c
+++ b/arch/x86/entry/entry64.c
@@ -32,8 +32,23 @@ static __always_inline void switch_to_kernel_cr3(void)
 	if (static_cpu_has(X86_FEATURE_PTI))
 		pti_switch_to_kernel_cr3(__native_read_cr3());
 }
+
+static __always_inline unsigned long ist_switch_to_kernel_cr3(void)
+{
+	unsigned long cr3 = 0;
+
+	if (static_cpu_has(X86_FEATURE_PTI)) {
+		cr3 = __native_read_cr3();
+
+		if (cr3 & PTI_USER_PGTABLE_MASK)
+			pti_switch_to_kernel_cr3(cr3);
+	}
+
+	return cr3;
+}
 #else
 static __always_inline void switch_to_kernel_cr3(void) {}
+static __always_inline unsigned long ist_switch_to_kernel_cr3(void) { return 0; }
 #endif
 
 /*
-- 
2.19.1.6.gb485710b


* [PATCH V2 21/41] x86/entry: Add the C version ist_restore_cr3()
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin

From: Lai Jiangshan <laijs@linux.alibaba.com>

It implements the C version of RESTORE_CR3().

No functional difference intended, except that the ASM code uses
bit-test-and-clear operations while the C version uses mask checks and
'AND' operations.  The resulting assembly of the two versions is very
similar.
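
For example (an editor's illustration, not part of the patch), the ASM
user-pagetable check

	bt	$PTI_USER_PGTABLE_BIT, \save_reg

corresponds to the C test

	if (unlikely(cr3 & PTI_USER_PGTABLE_MASK))

and the bt/btr pair on THIS_CPU_user_pcid_flush_mask corresponds to the
this_cpu_read()/this_cpu_and() pair on cpu_tlbstate.user_pcid_flush_mask.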

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry64.c | 46 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/arch/x86/entry/entry64.c b/arch/x86/entry/entry64.c
index faee44a3d1d8..2db9ae3508f1 100644
--- a/arch/x86/entry/entry64.c
+++ b/arch/x86/entry/entry64.c
@@ -8,6 +8,7 @@
  * environments that the GS base is user controlled value, or the CR3
  * is PTI user CR3 or both.
  */
+#include <asm/tlbflush.h>
 #include <asm/traps.h>
 
 extern unsigned char asm_load_gs_index_gs_change[];
@@ -27,6 +28,26 @@ static __always_inline void pti_switch_to_kernel_cr3(unsigned long user_cr3)
 	native_write_cr3(cr3);
 }
 
+static __always_inline void pti_switch_to_user_cr3(unsigned long user_cr3)
+{
+#define KERN_PCID_MASK (CR3_PCID_MASK & ~PTI_USER_PCID_MASK)
+
+	if (static_cpu_has(X86_FEATURE_PCID)) {
+		int pcid = user_cr3 & KERN_PCID_MASK;
+		unsigned short pcid_mask = 1ull << pcid;
+
+		/*
+		 * Check if there's a pending flush for the user ASID we're
+		 * about to set.
+		 */
+		if (!(this_cpu_read(cpu_tlbstate.user_pcid_flush_mask) & pcid_mask))
+			user_cr3 |= X86_CR3_PCID_NOFLUSH;
+		else
+			this_cpu_and(cpu_tlbstate.user_pcid_flush_mask, ~pcid_mask);
+	}
+	native_write_cr3(user_cr3);
+}
+
 static __always_inline void switch_to_kernel_cr3(void)
 {
 	if (static_cpu_has(X86_FEATURE_PTI))
@@ -46,9 +67,34 @@ static __always_inline unsigned long ist_switch_to_kernel_cr3(void)
 
 	return cr3;
 }
+
+static __always_inline void ist_restore_cr3(unsigned long cr3)
+{
+	if (!static_cpu_has(X86_FEATURE_PTI))
+		return;
+
+	if (unlikely(cr3 & PTI_USER_PGTABLE_MASK)) {
+		pti_switch_to_user_cr3(cr3);
+		return;
+	}
+
+	/*
+	 * KERNEL pages can always resume with NOFLUSH as we do
+	 * explicit flushes.
+	 */
+	if (static_cpu_has(X86_FEATURE_PCID))
+		cr3 |= X86_CR3_PCID_NOFLUSH;
+
+	/*
+	 * The CR3 write could be avoided when not changing its value,
+	 * but would require a CR3 read.
+	 */
+	native_write_cr3(cr3);
+}
 #else
 static __always_inline void switch_to_kernel_cr3(void) {}
 static __always_inline unsigned long ist_switch_to_kernel_cr3(void) { return 0; }
+static __always_inline void ist_restore_cr3(unsigned long cr3) {}
 #endif
 
 /*
-- 
2.19.1.6.gb485710b


* [PATCH V2 22/41] x86/entry: Add the C version get_percpu_base()
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin

From: Lai Jiangshan <laijs@linux.alibaba.com>

It implements the C version of the asm macro GET_PERCPU_BASE().

No functional difference intended.
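
As a worked example (an editor's note, not part of the patch): the
limit field of the __CPUNODE_SEG descriptor packs the CPU and node
numbers, so LSL can recover both without any memory access:

	unsigned int p, cpu, node;

	asm ("lsl %[seg],%[p]" : [p] "=a" (p) : [seg] "r" (__CPUNODE_SEG));
	cpu  = p & VDSO_CPUNODE_MASK;	/* low 12 bits: CPU number */
	node = p >> VDSO_CPUNODE_BITS;	/* remaining bits: node number */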

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry64.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/arch/x86/entry/entry64.c b/arch/x86/entry/entry64.c
index 2db9ae3508f1..b939b56d985d 100644
--- a/arch/x86/entry/entry64.c
+++ b/arch/x86/entry/entry64.c
@@ -193,3 +193,39 @@ struct pt_regs *error_entry(struct pt_regs *eregs)
 	/* Enter from kernel, don't move pt_regs */
 	return eregs;
 }
+
+#ifdef CONFIG_SMP
+/*
+ * CPU/node NR is loaded from the limit (size) field of a special segment
+ * descriptor entry in GDT.
+ *
+ * Do not use RDPID, because KVM loads guest's TSC_AUX on vm-entry and
+ * may not restore the host's value until the CPU returns to userspace.
+ * Thus the kernel would consume a guest's TSC_AUX if an NMI arrives
+ * while running KVM's run loop.
+ */
+static __always_inline unsigned int gdt_get_cpu(void)
+{
+	unsigned int p;
+
+	asm ("lsl %[seg],%[p]" : [p] "=a" (p) : [seg] "r" (__CPUNODE_SEG));
+
+	return p & VDSO_CPUNODE_MASK;
+}
+
+/*
+ * Fetch the per-CPU GSBASE value for this processor.
+ *
+ * We normally use %gs for accessing per-CPU data, but we are setting up
+ * %gs here and obviously can not use %gs itself to access per-CPU data.
+ */
+static __always_inline unsigned long get_percpu_base(void)
+{
+	return __per_cpu_offset[gdt_get_cpu()];
+}
+#else
+static __always_inline unsigned long get_percpu_base(void)
+{
+	return pcpu_unit_offsets;
+}
+#endif
-- 
2.19.1.6.gb485710b


* [PATCH V2 23/41] x86/entry: Add the C version ist_switch_to_kernel_gsbase()
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin

From: Lai Jiangshan <laijs@linux.alibaba.com>

It implements the second half of paranoid_entry(), whose job is to
switch to the kernel GSBASE.

No functional difference intended.
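
Why the sign check works (an editor's note): kernel GSBASE values point
into the kernel half of the address space, so the sign bit alone is
enough to distinguish them:

	gsbase = __rdmsr(MSR_GS_BASE);
	if ((long)gsbase < 0)	/* 0xffff... kernel address */
		return 1;	/* kernel GSBASE active, no SWAPGS on exit */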

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry64.c | 44 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/arch/x86/entry/entry64.c b/arch/x86/entry/entry64.c
index b939b56d985d..1a0d5d703ad6 100644
--- a/arch/x86/entry/entry64.c
+++ b/arch/x86/entry/entry64.c
@@ -229,3 +229,47 @@ static __always_inline unsigned long get_percpu_base(void)
 	return pcpu_unit_offsets;
 }
 #endif
+
+/*
+ * Handle GSBASE depends on the availability of FSGSBASE.
+ *
+ * Without FSGSBASE the kernel enforces that negative GSBASE
+ * values indicate kernel GSBASE. With FSGSBASE no assumptions
+ * can be made about the GSBASE value when entering from user
+ * space.
+ */
+static __always_inline unsigned long ist_switch_to_kernel_gsbase(void)
+{
+	unsigned long gsbase;
+
+	if (static_cpu_has(X86_FEATURE_FSGSBASE)) {
+		/*
+		 * Read the current GSBASE for return.
+		 * Retrieve and set the current CPUs kernel GSBASE.
+		 *
+		 * The unconditional write to GS base below ensures that
+		 * no subsequent loads based on a mispredicted GS base can
+		 * happen, therefore no LFENCE is needed here.
+		 */
+		gsbase = rdgsbase();
+		wrgsbase(get_percpu_base());
+		return gsbase;
+	}
+
+	gsbase = __rdmsr(MSR_GS_BASE);
+
+	/*
+	 * The kernel-enforced convention is a negative GSBASE indicates
+	 * a kernel value. No SWAPGS needed on entry and exit.
+	 */
+	if ((long)gsbase < 0) {
+		kernel_entry_fence_no_swapgs();
+		/* no SWAPGS required on exit */
+		return 1;
+	}
+
+	user_entry_swapgs_and_fence();
+
+	/* SWAPGS required on exit */
+	return 0;
+}
-- 
2.19.1.6.gb485710b


* [PATCH V2 24/41] x86/entry: Implement the C version ist_paranoid_entry()
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, Juergen Gross,
	Peter Zijlstra (Intel),
	Joerg Roedel, Mike Travis

From: Lai Jiangshan <laijs@linux.alibaba.com>

It implements the whole of the ASM paranoid_entry() in C.

No functional difference intended.
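
Callers are expected to bracket the actual handler between entry and
exit (an editor's sketch; handle_exception() is a hypothetical
stand-in, and patch 26 adds a macro for exactly this pattern):

	unsigned long cr3, gsbase;

	ist_paranoid_entry(&cr3, &gsbase);
	handle_exception(regs);		/* hypothetical IST handler */
	ist_paranoid_exit(cr3, gsbase);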

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry64.c        | 39 +++++++++++++++++++++++++++++++++
 arch/x86/include/asm/idtentry.h |  3 +++
 2 files changed, 42 insertions(+)

diff --git a/arch/x86/entry/entry64.c b/arch/x86/entry/entry64.c
index 1a0d5d703ad6..67f13aebd948 100644
--- a/arch/x86/entry/entry64.c
+++ b/arch/x86/entry/entry64.c
@@ -273,3 +273,42 @@ static __always_inline unsigned long ist_switch_to_kernel_gsbase(void)
 	/* SWAPGS required on exit */
 	return 0;
 }
+
+/*
+ * Switch and save CR3 in *@cr3 if PTI enabled. Return GSBASE related
+ * information in *@gsbase depending on the availability of the FSGSBASE
+ * instructions:
+ *
+ * FSGSBASE	*@gsbase
+ *     N        0 -> SWAPGS on exit
+ *              1 -> no SWAPGS on exit
+ *
+ *     Y        GSBASE value at entry, must be restored in ist_paranoid_exit
+ */
+__visible __entry_text
+void ist_paranoid_entry(unsigned long *cr3, unsigned long *gsbase)
+{
+	asm volatile ("cld");
+
+	/*
+	 * Always stash CR3 in *@cr3.  This value will be restored,
+	 * verbatim, at exit.  Needed if ist_paranoid_entry interrupted
+	 * another entry that already switched to the user CR3 value
+	 * but has not yet returned to userspace.
+	 *
+	 * This is also why CS (stashed in the "iret frame" by the
+	 * hardware at entry) can not be used: this may be a return
+	 * to kernel code, but with a user CR3 value.
+	 *
+	 * Switching CR3 does not depend on kernel GSBASE so it can
+	 * be done before switching to the kernel GSBASE. This is
+	 * required for FSGSBASE because the kernel GSBASE has to
+	 * be retrieved from a kernel internal table.
+	 */
+	*cr3 = ist_switch_to_kernel_cr3();
+
+	barrier();
+
+	/* Handle GSBASE, store the return value in *@gsbase for exit. */
+	*gsbase = ist_switch_to_kernel_gsbase();
+}
diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 6779def97591..fa8d73cfd8d6 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -293,6 +293,9 @@ static __always_inline void __##func(struct pt_regs *regs)
 	DECLARE_IDTENTRY(vector, func)
 
 #ifdef CONFIG_X86_64
+__visible __entry_text
+void ist_paranoid_entry(unsigned long *cr3, unsigned long *gsbase);
+
 /**
  * DECLARE_IDTENTRY_IST - Declare functions for IST handling IDT entry points
  * @vector:	Vector number (ignored for C)
-- 
2.19.1.6.gb485710b


* [PATCH V2 25/41] x86/entry: Implement the C version ist_paranoid_exit()
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, Juergen Gross,
	Peter Zijlstra (Intel),
	Joerg Roedel, Mike Travis

From: Lai Jiangshan <laijs@linux.alibaba.com>

It implements the whole of the ASM paranoid_exit() in C.

No functional difference intended.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry64.c        | 40 +++++++++++++++++++++++++++++++++
 arch/x86/include/asm/idtentry.h |  2 ++
 2 files changed, 42 insertions(+)

diff --git a/arch/x86/entry/entry64.c b/arch/x86/entry/entry64.c
index 67f13aebd948..017a7f94e3a4 100644
--- a/arch/x86/entry/entry64.c
+++ b/arch/x86/entry/entry64.c
@@ -312,3 +312,43 @@ void ist_paranoid_entry(unsigned long *cr3, unsigned long *gsbase)
 	/* Handle GSBASE, store the return value in *@gsbase for exit. */
 	*gsbase = ist_switch_to_kernel_gsbase();
 }
+
+/*
+ * "Paranoid" exit path from exception stack.  This is invoked
+ * only on return from IST interrupts that came from kernel space.
+ *
+ * We may be returning to very strange contexts (e.g. very early
+ * in syscall entry), so checking for preemption here would
+ * be complicated.  Fortunately, there's no good reason to try
+ * to handle preemption here.
+ */
+__visible __entry_text
+void ist_paranoid_exit(unsigned long cr3, unsigned long gsbase)
+{
+	/*
+	 * Restore CR3 at first, it can use kernel GSBASE.
+	 */
+	ist_restore_cr3(cr3);
+
+	barrier();
+
+	/*
+	 * Handle the three GSBASE cases.
+	 *
+	 * @gsbase contains the GSBASE related information depending
+	 * on the availability of the FSGSBASE instructions:
+	 *
+	 * FSGSBASE	@gsbase
+	 *     N        0 -> SWAPGS on exit
+	 *              1 -> no SWAPGS on exit
+	 *
+	 *     Y        User space GSBASE, must be restored unconditionally
+	 */
+	if (static_cpu_has(X86_FEATURE_FSGSBASE)) {
+		wrgsbase(gsbase);
+		return;
+	}
+
+	if (gsbase)
+		native_swapgs();
+}
diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index fa8d73cfd8d6..b144ea05b859 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -295,6 +295,8 @@ static __always_inline void __##func(struct pt_regs *regs)
 #ifdef CONFIG_X86_64
 __visible __entry_text
 void ist_paranoid_entry(unsigned long *cr3, unsigned long *gsbase);
+__visible __entry_text
+void ist_paranoid_exit(unsigned long cr3, unsigned long gsbase);
 
 /**
  * DECLARE_IDTENTRY_IST - Declare functions for IST handling IDT entry points
-- 
2.19.1.6.gb485710b


* [PATCH V2 26/41] x86/entry: Add a C macro to define the function body for IST in .entry.text
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, H. Peter Anvin, Juergen Gross, Peter Zijlstra (Intel),
	Joerg Roedel, Mike Travis

From: Lai Jiangshan <laijs@linux.alibaba.com>

Add the DEFINE_IDTENTRY_IST_ENTRY() macro to define the C code that
implements the ASM sequence calling paranoid_entry(), cfunc() and
paranoid_exit() in series for IST exceptions without an error code.

No functional difference intended.
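
For instance (an editor's illustration),
DEFINE_IDTENTRY_IST_ENTRY(exc_debug) expands to roughly:

	__visible __entry_text void ist_exc_debug(struct pt_regs *regs)
	{
		unsigned long cr3, gsbase;

		ist_paranoid_entry(&cr3, &gsbase);
		exc_debug(regs);
		ist_paranoid_exit(cr3, gsbase);
	}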

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/include/asm/idtentry.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index b144ea05b859..b33e96e983c0 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -323,6 +323,20 @@ void ist_paranoid_exit(unsigned long cr3, unsigned long gsbase);
 	__visible noinstr void kernel_##func(struct pt_regs *regs, unsigned long error_code);	\
 	__visible noinstr void   user_##func(struct pt_regs *regs, unsigned long error_code)
 
+/**
+ * DEFINE_IDTENTRY_IST_ENTRY - Emit __entry_text code for IST entry points
+ * @func:	Function name of the entry point
+ */
+#define DEFINE_IDTENTRY_IST_ENTRY(func)					\
+__visible __entry_text void ist_##func(struct pt_regs *regs)		\
+{									\
+	unsigned long cr3, gsbase;					\
+									\
+	ist_paranoid_entry(&cr3, &gsbase);				\
+	func(regs);							\
+	ist_paranoid_exit(cr3, gsbase);					\
+}
+
 /**
  * DEFINE_IDTENTRY_IST - Emit code for IST entry points
  * @func:	Function name of the entry point
-- 
2.19.1.6.gb485710b


* [PATCH V2 27/41] x86/mce: Remove stack protector from mce/core.c
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Tony Luck, Borislav Petkov, Thomas Gleixner,
	Ingo Molnar, x86, H. Peter Anvin, linux-edac

From: Lai Jiangshan <laijs@linux.alibaba.com>

mce/core.c is going to contain __entry_text code which cannot be
instrumented by the stack protector.
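
The reason (an editor's note): a stack-protector prologue loads the
canary through %gs, which on this path may still hold the user GSBASE.
A typical x86_64 prologue emitted by GCC looks roughly like:

	mov    %gs:40, %rax		/* load the stack canary */
	mov    %rax, 0x8(%rsp)		/* store it in the stack frame */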

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/kernel/cpu/mce/Makefile | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kernel/cpu/mce/Makefile b/arch/x86/kernel/cpu/mce/Makefile
index 015856abdbb1..ce192c5344fc 100644
--- a/arch/x86/kernel/cpu/mce/Makefile
+++ b/arch/x86/kernel/cpu/mce/Makefile
@@ -1,4 +1,8 @@
 # SPDX-License-Identifier: GPL-2.0
+
+CFLAGS_REMOVE_core.o		= -fstack-protector -fstack-protector-strong
+CFLAGS_core.o			+= -fno-stack-protector
+
 obj-y				=  core.o severity.o genpool.o
 
 obj-$(CONFIG_X86_ANCIENT_MCE)	+= winchip.o p5.o
-- 
2.19.1.6.gb485710b


* [PATCH V2 28/41] x86/debug, mce: Use C entry code
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, Juergen Gross,
	Peter Zijlstra (Intel),
	Joerg Roedel, Mike Travis

From: Lai Jiangshan <laijs@linux.alibaba.com>

Use DEFINE_IDTENTRY_IST_ENTRY to emit the C entry function and call
that function directly from entry_64.S.
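
The resulting call chain for e.g. a machine check from kernel mode
becomes (an editor's sketch):

	asm_exc_machine_check		/* ASM stub in entry_64.S */
	  -> ist_exc_machine_check	/* C, __entry_text: paranoid entry/exit */
	       -> exc_machine_check	/* C, noinstr handler */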

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry_64.S       | 10 +---------
 arch/x86/include/asm/idtentry.h |  1 +
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index bd6bce341360..0ba788bb9857 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -437,16 +437,8 @@ SYM_CODE_START(\asmsym)
 	testb	$3, CS(%rsp)
 	jnz	.Lfrom_usermode_switch_stack_\@
 
-	/* paranoid_entry returns GS information for paranoid_exit in EBX. */
-	call	paranoid_entry
-
-	UNWIND_HINT_REGS
-
 	movq	%rsp, %rdi		/* pt_regs pointer */
-
-	call	\cfunc
-
-	call	paranoid_exit
+	call	ist_\cfunc
 	jmp	restore_regs_and_return_to_kernel
 
 	/* Switch to the regular task stack and use the noist entry point */
diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index b33e96e983c0..babe530cfa77 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -344,6 +344,7 @@ __visible __entry_text void ist_##func(struct pt_regs *regs)		\
  * Maps to DEFINE_IDTENTRY_RAW
  */
 #define DEFINE_IDTENTRY_IST(func)					\
+	DEFINE_IDTENTRY_IST_ENTRY(func)					\
 	DEFINE_IDTENTRY_RAW(func)
 
 /**
-- 
2.19.1.6.gb485710b


* [PATCH V2 29/41] x86/idtentry.h: Move the definitions *IDTENTRY_{MCE|DEBUG}* up
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, H. Peter Anvin, Juergen Gross, Peter Zijlstra (Intel),
	Joerg Roedel, Mike Travis

From: Lai Jiangshan <laijs@linux.alibaba.com>

Move them closer to the related definitions and drop an #ifdef block.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/include/asm/idtentry.h | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index babe530cfa77..49c0ebe374ae 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -358,6 +358,14 @@ __visible __entry_text void ist_##func(struct pt_regs *regs)		\
 #define DEFINE_IDTENTRY_NOIST(func)					\
 	DEFINE_IDTENTRY_RAW(noist_##func)
 
+#define DECLARE_IDTENTRY_MCE		DECLARE_IDTENTRY_IST
+#define DEFINE_IDTENTRY_MCE		DEFINE_IDTENTRY_IST
+#define DEFINE_IDTENTRY_MCE_USER	DEFINE_IDTENTRY_NOIST
+
+#define DECLARE_IDTENTRY_DEBUG		DECLARE_IDTENTRY_IST
+#define DEFINE_IDTENTRY_DEBUG		DEFINE_IDTENTRY_IST
+#define DEFINE_IDTENTRY_DEBUG_USER	DEFINE_IDTENTRY_NOIST
+
 /**
  * DECLARE_IDTENTRY_DF - Declare functions for double fault
  * @vector:	Vector number (ignored for C)
@@ -432,16 +440,6 @@ __visible noinstr void func(struct pt_regs *regs,			\
 #define DECLARE_IDTENTRY_NMI		DECLARE_IDTENTRY_RAW
 #define DEFINE_IDTENTRY_NMI		DEFINE_IDTENTRY_RAW
 
-#ifdef CONFIG_X86_64
-#define DECLARE_IDTENTRY_MCE		DECLARE_IDTENTRY_IST
-#define DEFINE_IDTENTRY_MCE		DEFINE_IDTENTRY_IST
-#define DEFINE_IDTENTRY_MCE_USER	DEFINE_IDTENTRY_NOIST
-
-#define DECLARE_IDTENTRY_DEBUG		DECLARE_IDTENTRY_IST
-#define DEFINE_IDTENTRY_DEBUG		DEFINE_IDTENTRY_IST
-#define DEFINE_IDTENTRY_DEBUG_USER	DEFINE_IDTENTRY_NOIST
-#endif
-
 #else /* !__ASSEMBLY__ */
 
 /*
-- 
2.19.1.6.gb485710b


* [PATCH V2 30/41] x86/nmi: Use DEFINE_IDTENTRY_NMI for nmi
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, H. Peter Anvin, Joerg Roedel, Ira Weiny, Brijesh Singh,
	Libing Zhou

From: Lai Jiangshan <laijs@linux.alibaba.com>

DEFINE_IDTENTRY_NMI is defined but not used, so use it.

This also prepares for a later patch that defines DEFINE_IDTENTRY_NMI
differently on 32-bit and 64-bit.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/kernel/nmi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index 4bce802d25fb..44c3adb68282 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -473,7 +473,7 @@ static DEFINE_PER_CPU(enum nmi_states, nmi_state);
 static DEFINE_PER_CPU(unsigned long, nmi_cr2);
 static DEFINE_PER_CPU(unsigned long, nmi_dr7);
 
-DEFINE_IDTENTRY_RAW(exc_nmi)
+DEFINE_IDTENTRY_NMI(exc_nmi)
 {
 	irqentry_state_t irq_state;
 
-- 
2.19.1.6.gb485710b


* [PATCH V2 31/41] x86/nmi: Remove stack protector from nmi.c
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, H. Peter Anvin, Joerg Roedel, Javier Martinez Canillas,
	Daniel Bristot de Oliveira, Brijesh Singh, Andy Shevchenko,
	Arvind Sankar, Juergen Gross, Chester Lin

From: Lai Jiangshan <laijs@linux.alibaba.com>

nmi.c is going to contain __entry_text code which cannot be
instrumented by the stack protector.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/kernel/Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 0e054e2304c6..f56e8088c85d 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -50,6 +50,8 @@ CFLAGS_head$(BITS).o	+= -fno-stack-protector
 
 CFLAGS_REMOVE_traps.o		= -fstack-protector -fstack-protector-strong
 CFLAGS_traps.o			+= -fno-stack-protector
+CFLAGS_REMOVE_nmi.o		= -fstack-protector -fstack-protector-strong
+CFLAGS_nmi.o			+= -fno-stack-protector
 
 CFLAGS_irq.o := -I $(srctree)/$(src)/../include/asm/trace
 
-- 
2.19.1.6.gb485710b


* [PATCH V2 32/41] x86/nmi: Use C entry code
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, Juergen Gross,
	Peter Zijlstra (Intel),
	Joerg Roedel, Mike Travis

From: Lai Jiangshan <laijs@linux.alibaba.com>

Use DEFINE_IDTENTRY_IST_ENTRY to emit the C entry function and call
that function directly from entry_64.S.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry_64.S       | 17 ++---------------
 arch/x86/include/asm/idtentry.h |  5 ++++-
 2 files changed, 6 insertions(+), 16 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 0ba788bb9857..72a1610bb540 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1271,21 +1271,8 @@ end_repeat_nmi:
 	PUSH_AND_CLEAR_REGS
 	ENCODE_FRAME_POINTER
 
-	/*
-	 * Use paranoid_entry to handle SWAPGS and CR3.
-	 */
-	call	paranoid_entry
-	UNWIND_HINT_REGS
-
-	movq	%rsp, %rdi
-	movq	$-1, %rsi
-	call	exc_nmi
-
-	/*
-	 * Use paranoid_exit to handle SWAPGS and CR3, but no need to use
-	 * restore_regs_and_return_to_kernel as we must handle nested NMI.
-	 */
-	call	paranoid_exit
+	movq	%rsp, %rdi		/* pt_regs pointer */
+	call	ist_exc_nmi
 
 	POP_REGS
 
diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 49c0ebe374ae..c99c58bc179a 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -358,6 +358,8 @@ __visible __entry_text void ist_##func(struct pt_regs *regs)		\
 #define DEFINE_IDTENTRY_NOIST(func)					\
 	DEFINE_IDTENTRY_RAW(noist_##func)
 
+#define DEFINE_IDTENTRY_NMI		DEFINE_IDTENTRY_IST
+
 #define DECLARE_IDTENTRY_MCE		DECLARE_IDTENTRY_IST
 #define DEFINE_IDTENTRY_MCE		DEFINE_IDTENTRY_IST
 #define DEFINE_IDTENTRY_MCE_USER	DEFINE_IDTENTRY_NOIST
@@ -407,6 +409,8 @@ __visible __entry_text void ist_##func(struct pt_regs *regs)		\
 
 #else	/* CONFIG_X86_64 */
 
+#define DEFINE_IDTENTRY_NMI		DEFINE_IDTENTRY_RAW
+
 /**
  * DECLARE_IDTENTRY_DF - Declare functions for double fault 32bit variant
  * @vector:	Vector number (ignored for C)
@@ -438,7 +442,6 @@ __visible noinstr void func(struct pt_regs *regs,			\
 
 /* C-Code mapping */
 #define DECLARE_IDTENTRY_NMI		DECLARE_IDTENTRY_RAW
-#define DEFINE_IDTENTRY_NMI		DEFINE_IDTENTRY_RAW
 
 #else /* !__ASSEMBLY__ */
 
-- 
2.19.1.6.gb485710b


* [PATCH V2 33/41] x86/entry: Add a C macro to define the function body for IST in .entry.text with an error code
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, H. Peter Anvin, Juergen Gross, Peter Zijlstra (Intel),
	Joerg Roedel, Mike Travis

From: Lai Jiangshan <laijs@linux.alibaba.com>

Add the DEFINE_IDTENTRY_IST_ENTRY_ERRORCODE() macro to define the C
code that implements the ASM sequence calling paranoid_entry(),
modifying orig_ax, and calling cfunc() and paranoid_exit() in series
for IST exceptions with an error code.

No functional difference intended.
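
For instance (an editor's illustration),
DEFINE_IDTENTRY_IST_ENTRY_ERRORCODE(exc_double_fault) expands to
roughly:

	__visible __entry_text void ist_exc_double_fault(struct pt_regs *regs)
	{
		unsigned long cr3, gsbase, error_code = regs->orig_ax;

		ist_paranoid_entry(&cr3, &gsbase);
		regs->orig_ax = -1;	/* no syscall to restart */
		exc_double_fault(regs, error_code);
		ist_paranoid_exit(cr3, gsbase);
	}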

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/include/asm/idtentry.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index c99c58bc179a..7935b0abc65d 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -337,6 +337,22 @@ __visible __entry_text void ist_##func(struct pt_regs *regs)		\
 	ist_paranoid_exit(cr3, gsbase);					\
 }
 
+/**
+ * DEFINE_IDTENTRY_IST_ENTRY_ERRORCODE - Emit __entry_text code for IST
+ *					 entry points with an error code
+ * @func:	Function name of the entry point
+ */
+#define DEFINE_IDTENTRY_IST_ENTRY_ERRORCODE(func)			\
+__visible __entry_text void ist_##func(struct pt_regs *regs)		\
+{									\
+	unsigned long cr3, gsbase, error_code = regs->orig_ax;		\
+									\
+	ist_paranoid_entry(&cr3, &gsbase);				\
+	regs->orig_ax = -1;	/* no syscall to restart */		\
+	func(regs, error_code);						\
+	ist_paranoid_exit(cr3, gsbase);					\
+}
+
 /**
  * DEFINE_IDTENTRY_IST - Emit code for IST entry points
  * @func:	Function name of the entry point
-- 
2.19.1.6.gb485710b


* [PATCH V2 34/41] x86/doublefault: Use C entry code
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, Juergen Gross,
	Peter Zijlstra (Intel),
	Joerg Roedel, Mike Travis

From: Lai Jiangshan <laijs@linux.alibaba.com>

Use DEFINE_IDTENTRY_IST_ENTRY_ERRORCODE to emit the C entry function
and call that function directly from entry_64.S.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry_64.S       | 12 ++----------
 arch/x86/include/asm/idtentry.h |  1 +
 2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 72a1610bb540..db108f8cd554 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -544,16 +544,8 @@ SYM_CODE_START(\asmsym)
 	PUSH_AND_CLEAR_REGS
 	ENCODE_FRAME_POINTER
 
-	/* paranoid_entry returns GS information for paranoid_exit in EBX. */
-	call	paranoid_entry
-	UNWIND_HINT_REGS
-
-	movq	%rsp, %rdi		/* pt_regs pointer into first argument */
-	movq	ORIG_RAX(%rsp), %rsi	/* get error code into 2nd argument*/
-	movq	$-1, ORIG_RAX(%rsp)	/* no syscall to restart */
-	call	\cfunc
-
-	call	paranoid_exit
+	movq	%rsp, %rdi		/* pt_regs pointer */
+	call	ist_\cfunc
 	jmp	restore_regs_and_return_to_kernel
 
 _ASM_NOKPROBE(\asmsym)
diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 7935b0abc65d..99e1ae3f5c7d 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -401,6 +401,7 @@ __visible __entry_text void ist_##func(struct pt_regs *regs)		\
  * Maps to DEFINE_IDTENTRY_RAW_ERRORCODE
  */
 #define DEFINE_IDTENTRY_DF(func)					\
+	DEFINE_IDTENTRY_IST_ENTRY_ERRORCODE(func)			\
 	DEFINE_IDTENTRY_RAW_ERRORCODE(func)
 
 /**
-- 
2.19.1.6.gb485710b


* [PATCH V2 35/41] x86/sev: Add and use ist_vc_switch_off_ist()
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, Youquan Song,
	Peter Zijlstra, Tony Luck, Sean Christopherson

From: Lai Jiangshan <laijs@linux.alibaba.com>

ist_vc_switch_off_ist() is the same as vc_switch_off_ist(), but it is
called before CR3 or GSBASE has been fixed, so it has to call
ist_paranoid_entry() on its own.

This prepares for using C code for the other part of idtentry_vc and
for removing the ASM paranoid_entry() and paranoid_exit().

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry_64.S    | 20 ++++++++++----------
 arch/x86/include/asm/traps.h |  3 ++-
 arch/x86/kernel/traps.c      | 14 +++++++++++++-
 3 files changed, 25 insertions(+), 12 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index db108f8cd554..8871f8ccf117 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -486,26 +486,26 @@ SYM_CODE_START(\asmsym)
 	testb	$3, CS(%rsp)
 	jnz	.Lfrom_usermode_switch_stack_\@
 
-	/*
-	 * paranoid_entry returns SWAPGS flag for paranoid_exit in EBX.
-	 * EBX == 0 -> SWAPGS, EBX == 1 -> no SWAPGS
-	 */
-	call	paranoid_entry
-
-	UNWIND_HINT_REGS
-
 	/*
 	 * Switch off the IST stack to make it free for nested exceptions. The
-	 * vc_switch_off_ist() function will switch back to the interrupted
+	 * ist_vc_switch_off_ist() function will switch back to the interrupted
 	 * stack if it is safe to do so. If not it switches to the VC fall-back
 	 * stack.
 	 */
 	movq	%rsp, %rdi		/* pt_regs pointer */
-	call	vc_switch_off_ist
+	call	ist_vc_switch_off_ist
 	movq	%rax, %rsp		/* Switch to new stack */
 
 	UNWIND_HINT_REGS
 
+	/*
+	 * paranoid_entry returns SWAPGS flag for paranoid_exit in EBX.
+	 * EBX == 0 -> SWAPGS, EBX == 1 -> no SWAPGS
+	 */
+	call	paranoid_entry
+
+	UNWIND_HINT_REGS
+
 	/* Update pt_regs */
 	movq	ORIG_RAX(%rsp), %rsi	/* get error code into 2nd argument*/
 	movq	$-1, ORIG_RAX(%rsp)	/* no syscall to restart */
diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 686461ac9803..1aefc081d763 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -16,7 +16,8 @@ asmlinkage __visible notrace
 struct pt_regs *fixup_bad_iret(struct pt_regs *bad_regs);
 asmlinkage __visible notrace struct pt_regs *error_entry(struct pt_regs *eregs);
 void __init trap_init(void);
-asmlinkage __visible noinstr struct pt_regs *vc_switch_off_ist(struct pt_regs *eregs);
+asmlinkage __visible __entry_text
+struct pt_regs *ist_vc_switch_off_ist(struct pt_regs *eregs);
 #endif
 
 #ifdef CONFIG_X86_F00F_BUG
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 0afa16ea3702..03347db4c2c4 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -717,7 +717,7 @@ asmlinkage __visible noinstr struct pt_regs *sync_regs(struct pt_regs *eregs)
 }
 
 #ifdef CONFIG_AMD_MEM_ENCRYPT
-asmlinkage __visible noinstr struct pt_regs *vc_switch_off_ist(struct pt_regs *regs)
+static noinstr struct pt_regs *vc_switch_off_ist(struct pt_regs *regs)
 {
 	unsigned long sp, *stack;
 	struct stack_info info;
@@ -757,6 +757,18 @@ asmlinkage __visible noinstr struct pt_regs *vc_switch_off_ist(struct pt_regs *r
 
 	return regs_ret;
 }
+
+asmlinkage __visible __entry_text
+struct pt_regs *ist_vc_switch_off_ist(struct pt_regs *regs)
+{
+	unsigned long cr3, gsbase;
+
+	ist_paranoid_entry(&cr3, &gsbase);
+	regs = vc_switch_off_ist(regs);
+	ist_paranoid_exit(cr3, gsbase);
+
+	return regs;
+}
 #endif
 
 asmlinkage __visible noinstr
-- 
2.19.1.6.gb485710b


* [PATCH V2 36/41] x86/sev: Remove stack protector from sev.c
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86, H. Peter Anvin, Joerg Roedel, Javier Martinez Canillas,
	Daniel Bristot de Oliveira, Brijesh Singh, Andy Shevchenko,
	Arvind Sankar, Juergen Gross, Chester Lin

From: Lai Jiangshan <laijs@linux.alibaba.com>

sev.c is going to contain __entry_text code which cannot be
instrumented by the stack protector.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/kernel/Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index f56e8088c85d..88bbfeeab929 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -52,6 +52,8 @@ CFLAGS_REMOVE_traps.o		= -fstack-protector -fstack-protector-strong
 CFLAGS_traps.o			+= -fno-stack-protector
 CFLAGS_REMOVE_nmi.o		= -fstack-protector -fstack-protector-strong
 CFLAGS_nmi.o			+= -fno-stack-protector
+CFLAGS_REMOVE_sev.o		= -fstack-protector -fstack-protector-strong
+CFLAGS_sev.o			+= -fno-stack-protector
 
 CFLAGS_irq.o := -I $(srctree)/$(src)/../include/asm/trace
 
-- 
2.19.1.6.gb485710b


* [PATCH V2 37/41] x86/sev: Use C entry code
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, Juergen Gross,
	Peter Zijlstra (Intel),
	Joerg Roedel, Mike Travis

From: Lai Jiangshan <laijs@linux.alibaba.com>

Use DEFINE_IDTENTRY_IST_ENTRY_ERRORCODE to emit the C entry function
and call that function directly from entry_64.S.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry_64.S       | 22 +---------------------
 arch/x86/include/asm/idtentry.h |  1 +
 2 files changed, 2 insertions(+), 21 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 8871f8ccf117..63cafeeaf27d 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -498,28 +498,8 @@ SYM_CODE_START(\asmsym)
 
 	UNWIND_HINT_REGS
 
-	/*
-	 * paranoid_entry returns SWAPGS flag for paranoid_exit in EBX.
-	 * EBX == 0 -> SWAPGS, EBX == 1 -> no SWAPGS
-	 */
-	call	paranoid_entry
-
-	UNWIND_HINT_REGS
-
-	/* Update pt_regs */
-	movq	ORIG_RAX(%rsp), %rsi	/* get error code into 2nd argument*/
-	movq	$-1, ORIG_RAX(%rsp)	/* no syscall to restart */
-
 	movq	%rsp, %rdi		/* pt_regs pointer */
-
-	call	kernel_\cfunc
-
-	/*
-	 * No need to switch back to the IST stack. The current stack is either
-	 * identical to the stack in the IRET frame or the VC fall-back stack,
-	 * so it is definitely mapped even with PTI enabled.
-	 */
-	call	paranoid_exit
+	call	ist_kernel_\cfunc
 	jmp	restore_regs_and_return_to_kernel
 
 	/* Switch to the regular task stack */
diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 99e1ae3f5c7d..c8837bb3991f 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -412,6 +412,7 @@ __visible __entry_text void ist_##func(struct pt_regs *regs)		\
  * Maps to DEFINE_IDTENTRY_RAW_ERRORCODE
  */
 #define DEFINE_IDTENTRY_VC_KERNEL(func)				\
+	DEFINE_IDTENTRY_IST_ENTRY_ERRORCODE(kernel_##func)	\
 	DEFINE_IDTENTRY_RAW_ERRORCODE(kernel_##func)
 
 /**
-- 
2.19.1.6.gb485710b


* [PATCH V2 38/41] x86/entry: Remove ASM function paranoid_entry() and paranoid_exit()
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin

From: Lai Jiangshan <laijs@linux.alibaba.com>

IST exceptions have been changed to use C entry code built on the C
functions ist_paranoid_entry() and ist_paranoid_exit().  The ASM
functions paranoid_entry() and paranoid_exit() are no longer used.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry_64.S | 124 --------------------------------------
 1 file changed, 124 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 63cafeeaf27d..260be3c9da7d 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -815,130 +815,6 @@ SYM_CODE_START(xen_failsafe_callback)
 SYM_CODE_END(xen_failsafe_callback)
 #endif /* CONFIG_XEN_PV */
 
-/*
- * Save all registers in pt_regs. Return GSBASE related information
- * in EBX depending on the availability of the FSGSBASE instructions:
- *
- * FSGSBASE	R/EBX
- *     N        0 -> SWAPGS on exit
- *              1 -> no SWAPGS on exit
- *
- *     Y        GSBASE value at entry, must be restored in paranoid_exit
- */
-SYM_CODE_START_LOCAL(paranoid_entry)
-	UNWIND_HINT_FUNC
-	cld
-
-	/*
-	 * Always stash CR3 in %r14.  This value will be restored,
-	 * verbatim, at exit.  Needed if paranoid_entry interrupted
-	 * another entry that already switched to the user CR3 value
-	 * but has not yet returned to userspace.
-	 *
-	 * This is also why CS (stashed in the "iret frame" by the
-	 * hardware at entry) can not be used: this may be a return
-	 * to kernel code, but with a user CR3 value.
-	 *
-	 * Switching CR3 does not depend on kernel GSBASE so it can
-	 * be done before switching to the kernel GSBASE. This is
-	 * required for FSGSBASE because the kernel GSBASE has to
-	 * be retrieved from a kernel internal table.
-	 */
-	SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg=%rax save_reg=%r14
-
-	/*
-	 * Handling GSBASE depends on the availability of FSGSBASE.
-	 *
-	 * Without FSGSBASE the kernel enforces that negative GSBASE
-	 * values indicate kernel GSBASE. With FSGSBASE no assumptions
-	 * can be made about the GSBASE value when entering from user
-	 * space.
-	 */
-	ALTERNATIVE "jmp .Lparanoid_entry_checkgs", "", X86_FEATURE_FSGSBASE
-
-	/*
-	 * Read the current GSBASE and store it in %rbx unconditionally,
-	 * retrieve and set the current CPUs kernel GSBASE. The stored value
-	 * has to be restored in paranoid_exit unconditionally.
-	 *
-	 * The unconditional write to GS base below ensures that no subsequent
-	 * loads based on a mispredicted GS base can happen, therefore no LFENCE
-	 * is needed here.
-	 */
-	SAVE_AND_SET_GSBASE scratch_reg=%rax save_reg=%rbx
-	ret
-
-.Lparanoid_entry_checkgs:
-	/* EBX = 1 -> kernel GSBASE active, no restore required */
-	movl	$1, %ebx
-	/*
-	 * The kernel-enforced convention is a negative GSBASE indicates
-	 * a kernel value. No SWAPGS needed on entry and exit.
-	 */
-	movl	$MSR_GS_BASE, %ecx
-	rdmsr
-	testl	%edx, %edx
-	jns	.Lparanoid_entry_swapgs
-	FENCE_SWAPGS_KERNEL_ENTRY
-	ret
-
-.Lparanoid_entry_swapgs:
-	swapgs
-	FENCE_SWAPGS_USER_ENTRY
-
-	/* EBX = 0 -> SWAPGS required on exit */
-	xorl	%ebx, %ebx
-	ret
-SYM_CODE_END(paranoid_entry)
-
-/*
- * "Paranoid" exit path from exception stack.  This is invoked
- * only on return from IST interrupts that came from kernel space.
- *
- * We may be returning to very strange contexts (e.g. very early
- * in syscall entry), so checking for preemption here would
- * be complicated.  Fortunately, there's no good reason to try
- * to handle preemption here.
- *
- * R/EBX contains the GSBASE related information depending on the
- * availability of the FSGSBASE instructions:
- *
- * FSGSBASE	R/EBX
- *     N        0 -> SWAPGS on exit
- *              1 -> no SWAPGS on exit
- *
- *     Y        User space GSBASE, must be restored unconditionally
- */
-SYM_CODE_START_LOCAL(paranoid_exit)
-	UNWIND_HINT_REGS offset=8
-	/*
-	 * The order of operations is important. RESTORE_CR3 requires
-	 * kernel GSBASE.
-	 *
-	 * NB to anyone to try to optimize this code: this code does
-	 * not execute at all for exceptions from user mode. Those
-	 * exceptions go through error_exit instead.
-	 */
-	RESTORE_CR3	scratch_reg=%rax save_reg=%r14
-
-	/* Handle the three GSBASE cases */
-	ALTERNATIVE "jmp .Lparanoid_exit_checkgs", "", X86_FEATURE_FSGSBASE
-
-	/* With FSGSBASE enabled, unconditionally restore GSBASE */
-	wrgsbase	%rbx
-	ret
-
-.Lparanoid_exit_checkgs:
-	/* On non-FSGSBASE systems, conditionally do SWAPGS */
-	testl		%ebx, %ebx
-	jnz		.Lparanoid_exit_done
-
-	/* We are returning to a context with user GSBASE */
-	swapgs
-.Lparanoid_exit_done:
-	ret
-SYM_CODE_END(paranoid_exit)
-
 SYM_CODE_START_LOCAL(error_return)
 	UNWIND_HINT_REGS
 	DEBUG_ENTRY_ASSERT_IRQS_OFF
-- 
2.19.1.6.gb485710b


* [PATCH V2 39/41] x86/entry: Remove the unused ASM macros
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin

From: Lai Jiangshan <laijs@linux.alibaba.com>

They are now implemented and used in C code, so the ASM versions are not
needed any more.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/calling.h | 106 ---------------------------------------
 1 file changed, 106 deletions(-)

diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index 996b041e92d2..d42012fc694d 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -210,60 +210,6 @@ For 32-bit we have the following conventions - kernel is built with
 	popq	%rax
 .endm
 
-.macro SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg:req save_reg:req
-	ALTERNATIVE "jmp .Ldone_\@", "", X86_FEATURE_PTI
-	movq	%cr3, \scratch_reg
-	movq	\scratch_reg, \save_reg
-	/*
-	 * Test the user pagetable bit. If set, then the user page tables
-	 * are active. If clear CR3 already has the kernel page table
-	 * active.
-	 */
-	bt	$PTI_USER_PGTABLE_BIT, \scratch_reg
-	jnc	.Ldone_\@
-
-	ADJUST_KERNEL_CR3 \scratch_reg
-	movq	\scratch_reg, %cr3
-
-.Ldone_\@:
-.endm
-
-.macro RESTORE_CR3 scratch_reg:req save_reg:req
-	ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
-
-	ALTERNATIVE "jmp .Lwrcr3_\@", "", X86_FEATURE_PCID
-
-	/*
-	 * KERNEL pages can always resume with NOFLUSH as we do
-	 * explicit flushes.
-	 */
-	bt	$PTI_USER_PGTABLE_BIT, \save_reg
-	jnc	.Lnoflush_\@
-
-	/*
-	 * Check if there's a pending flush for the user ASID we're
-	 * about to set.
-	 */
-	movq	\save_reg, \scratch_reg
-	andq	$(0x7FF), \scratch_reg
-	bt	\scratch_reg, THIS_CPU_user_pcid_flush_mask
-	jnc	.Lnoflush_\@
-
-	btr	\scratch_reg, THIS_CPU_user_pcid_flush_mask
-	jmp	.Lwrcr3_\@
-
-.Lnoflush_\@:
-	SET_NOFLUSH_BIT \save_reg
-
-.Lwrcr3_\@:
-	/*
-	 * The CR3 write could be avoided when not changing its value,
-	 * but would require a CR3 read *and* a scratch register.
-	 */
-	movq	\save_reg, %cr3
-.Lend_\@:
-.endm
-
 #else /* CONFIG_PAGE_TABLE_ISOLATION=n: */
 
 .macro SWITCH_TO_KERNEL_CR3 scratch_reg:req
@@ -272,10 +218,6 @@ For 32-bit we have the following conventions - kernel is built with
 .endm
 .macro SWITCH_TO_USER_CR3_STACK scratch_reg:req
 .endm
-.macro SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg:req save_reg:req
-.endm
-.macro RESTORE_CR3 scratch_reg:req save_reg:req
-.endm
 
 #endif
 
@@ -284,17 +226,10 @@ For 32-bit we have the following conventions - kernel is built with
  *
  * FENCE_SWAPGS_USER_ENTRY is used in the user entry swapgs code path, to
  * prevent a speculative swapgs when coming from kernel space.
- *
- * FENCE_SWAPGS_KERNEL_ENTRY is used in the kernel entry non-swapgs code path,
- * to prevent the swapgs from getting speculatively skipped when coming from
- * user space.
  */
 .macro FENCE_SWAPGS_USER_ENTRY
 	ALTERNATIVE "", "lfence", X86_FEATURE_FENCE_SWAPGS_USER
 .endm
-.macro FENCE_SWAPGS_KERNEL_ENTRY
-	ALTERNATIVE "", "lfence", X86_FEATURE_FENCE_SWAPGS_KERNEL
-.endm
 
 .macro STACKLEAK_ERASE_NOCLOBBER
 #ifdef CONFIG_GCC_PLUGIN_STACKLEAK
@@ -304,12 +239,6 @@ For 32-bit we have the following conventions - kernel is built with
 #endif
 .endm
 
-.macro SAVE_AND_SET_GSBASE scratch_reg:req save_reg:req
-	rdgsbase \save_reg
-	GET_PERCPU_BASE \scratch_reg
-	wrgsbase \scratch_reg
-.endm
-
 #else /* CONFIG_X86_64 */
 # undef		UNWIND_HINT_IRET_REGS
 # define	UNWIND_HINT_IRET_REGS
@@ -320,38 +249,3 @@ For 32-bit we have the following conventions - kernel is built with
 	call stackleak_erase
 #endif
 .endm
-
-#ifdef CONFIG_SMP
-
-/*
- * CPU/node NR is loaded from the limit (size) field of a special segment
- * descriptor entry in GDT.
- */
-.macro LOAD_CPU_AND_NODE_SEG_LIMIT reg:req
-	movq	$__CPUNODE_SEG, \reg
-	lsl	\reg, \reg
-.endm
-
-/*
- * Fetch the per-CPU GSBASE value for this processor and put it in @reg.
- * We normally use %gs for accessing per-CPU data, but we are setting up
- * %gs here and obviously can not use %gs itself to access per-CPU data.
- *
- * Do not use RDPID, because KVM loads guest's TSC_AUX on vm-entry and
- * may not restore the host's value until the CPU returns to userspace.
- * Thus the kernel would consume a guest's TSC_AUX if an NMI arrives
- * while running KVM's run loop.
- */
-.macro GET_PERCPU_BASE reg:req
-	LOAD_CPU_AND_NODE_SEG_LIMIT \reg
-	andq	$VDSO_CPUNODE_MASK, \reg
-	movq	__per_cpu_offset(, \reg, 8), \reg
-.endm
-
-#else
-
-.macro GET_PERCPU_BASE reg:req
-	movq	pcpu_unit_offsets(%rip), \reg
-.endm
-
-#endif /* CONFIG_SMP */
-- 
2.19.1.6.gb485710b
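
For comparison, the C replacement for GET_PERCPU_BASE that the series adds
(patch 22, "Add the C version get_percpu_base()") can be sketched like this
for the CONFIG_SMP case; the exact shape below is an assumption, and the
!SMP case would simply load pcpu_unit_offsets as the removed ASM did.

/* Sketch: fetch this CPU's per-CPU base without going through %gs */
static __always_inline unsigned long get_percpu_base(void)
{
	unsigned long cpunode;

	/*
	 * LSL reads the segment limit, which encodes the CPU/node NR in the
	 * special __CPUNODE_SEG GDT entry; RDPID is avoided because of the
	 * KVM TSC_AUX issue described in the removed comment.
	 */
	asm ("lsl %[seg], %[base]"
	     : [base] "=r" (cpunode)
	     : [seg] "r" ((unsigned long)__CPUNODE_SEG));

	return __per_cpu_offset[cpunode & VDSO_CPUNODE_MASK];
}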



* [PATCH V2 40/41] x86/entry: Remove save_ret from PUSH_AND_CLEAR_REGS
  2021-09-26 15:07 [PATCH V2 00/41] x86/entry/64: Convert a bunch of ASM entry code into C code Lai Jiangshan
                   ` (38 preceding siblings ...)
  2021-09-26 15:08 ` [PATCH V2 39/41] x86/entry: Remove the unused ASM macros Lai Jiangshan
@ 2021-09-26 15:08 ` Lai Jiangshan
  2021-09-26 15:08 ` [PATCH V2 41/41] x86/syscall/64: Move the checking for sysret to C code Lai Jiangshan
  40 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin

From: Lai Jiangshan <laijs@linux.alibaba.com>

PUSH_AND_CLEAR_REGS is never used with save_ret anymore: its only such
users, error_entry() and paranoid_entry(), have been converted to C, and
the registers are now pushed before calling them.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/calling.h | 16 +++-------------
 1 file changed, 3 insertions(+), 13 deletions(-)

diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index d42012fc694d..6f9de1c6da73 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -63,15 +63,9 @@ For 32-bit we have the following conventions - kernel is built with
  * for assembly code:
  */
 
-.macro PUSH_REGS rdx=%rdx rax=%rax save_ret=0
-	.if \save_ret
-	pushq	%rsi		/* pt_regs->si */
-	movq	8(%rsp), %rsi	/* temporarily store the return address in %rsi */
-	movq	%rdi, 8(%rsp)	/* pt_regs->di (overwriting original return address) */
-	.else
+.macro PUSH_REGS rdx=%rdx rax=%rax
 	pushq   %rdi		/* pt_regs->di */
 	pushq   %rsi		/* pt_regs->si */
-	.endif
 	pushq	\rdx		/* pt_regs->dx */
 	pushq   %rcx		/* pt_regs->cx */
 	pushq   \rax		/* pt_regs->ax */
@@ -86,10 +80,6 @@ For 32-bit we have the following conventions - kernel is built with
 	pushq	%r14		/* pt_regs->r14 */
 	pushq	%r15		/* pt_regs->r15 */
 	UNWIND_HINT_REGS
-
-	.if \save_ret
-	pushq	%rsi		/* return address on top of stack */
-	.endif
 .endm
 
 .macro CLEAR_REGS
@@ -114,8 +104,8 @@ For 32-bit we have the following conventions - kernel is built with
 
 .endm
 
-.macro PUSH_AND_CLEAR_REGS rdx=%rdx rax=%rax save_ret=0
-	PUSH_REGS rdx=\rdx, rax=\rax, save_ret=\save_ret
+.macro PUSH_AND_CLEAR_REGS rdx=%rdx rax=%rax
+	PUSH_REGS rdx=\rdx, rax=\rax
 	CLEAR_REGS
 .endm
 
-- 
2.19.1.6.gb485710b



* [PATCH V2 41/41] x86/syscall/64: Move the checking for sysret to C code
  2021-09-26 15:07 [PATCH V2 00/41] x86/entry/64: Convert a bunch of ASM entry code into C code Lai Jiangshan
                   ` (39 preceding siblings ...)
  2021-09-26 15:08 ` [PATCH V2 40/41] x86/entry: Remove save_ret from PUSH_AND_CLEAR_REGS Lai Jiangshan
@ 2021-09-26 15:08 ` Lai Jiangshan
  40 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-26 15:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin

From: Lai Jiangshan <laijs@linux.alibaba.com>

Like do_fast_syscall_32(), which checks whether it can return to userspace
via fast instructions before it returns, do_syscall_64() now also checks,
in C, whether it can use sysret to return to userspace before it returns.
This allows a bunch of ASM code to be removed.

No functional change intended.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/calling.h       | 10 +----
 arch/x86/entry/common.c        | 73 ++++++++++++++++++++++++++++++-
 arch/x86/entry/entry_64.S      | 78 ++--------------------------------
 arch/x86/include/asm/syscall.h |  2 +-
 4 files changed, 78 insertions(+), 85 deletions(-)

diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index 6f9de1c6da73..05da3ef48ee4 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -109,27 +109,19 @@ For 32-bit we have the following conventions - kernel is built with
 	CLEAR_REGS
 .endm
 
-.macro POP_REGS pop_rdi=1 skip_r11rcx=0
+.macro POP_REGS pop_rdi=1
 	popq %r15
 	popq %r14
 	popq %r13
 	popq %r12
 	popq %rbp
 	popq %rbx
-	.if \skip_r11rcx
-	popq %rsi
-	.else
 	popq %r11
-	.endif
 	popq %r10
 	popq %r9
 	popq %r8
 	popq %rax
-	.if \skip_r11rcx
-	popq %rsi
-	.else
 	popq %rcx
-	.endif
 	popq %rdx
 	popq %rsi
 	.if \pop_rdi
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 6c2826417b33..718045b7a53c 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -70,7 +70,77 @@ static __always_inline bool do_syscall_x32(struct pt_regs *regs, int nr)
 	return false;
 }
 
-__visible noinstr void do_syscall_64(struct pt_regs *regs, int nr)
+/*
+ * Change top bits to match the most significant bit (47th or 56th bit
+ * depending on paging mode) in the address to get canonical address.
+ *
+ * If width of "canonical tail" ever becomes variable, this will need
+ * to be updated to remain correct on both old and new CPUs.
+ */
+static __always_inline u64 canonical_address(u64 vaddr)
+{
+	if (IS_ENABLED(CONFIG_X86_5LEVEL) && static_cpu_has(X86_FEATURE_LA57))
+		return ((s64)vaddr << (64 - 57)) >> (64 - 57);
+	else
+		return ((s64)vaddr << (64 - 48)) >> (64 - 48);
+}
+
+/*
+ * Check if it can use SYSRET.
+ *
+ * Try to use SYSRET instead of IRET if we're returning to
+ * a completely clean 64-bit userspace context.
+ *
+ * Returns 0 to return using IRET or 1 to return using SYSRET.
+ */
+static __always_inline int can_sysret(struct pt_regs *regs)
+{
+	/* In the Xen PV case we must use iret anyway. */
+	if (static_cpu_has(X86_FEATURE_XENPV))
+		return 0;
+
+	/* SYSRET requires RCX == RIP && R11 == RFLAGS */
+	if (regs->ip != regs->cx || regs->flags != regs->r11)
+		return 0;
+
+	/* CS and SS must match SYSRET */
+	if (regs->cs != __USER_CS || regs->ss != __USER_DS)
+		return 0;
+
+	/*
+	 * On Intel CPUs, SYSRET with non-canonical RCX/RIP will #GP
+	 * in kernel space.  This essentially lets the user take over
+	 * the kernel, since userspace controls RSP.
+	 */
+	if (regs->cx != canonical_address(regs->cx))
+		return 0;
+
+	/*
+	 * SYSCALL clears RF when it saves RFLAGS in R11 and SYSRET cannot
+	 * restore RF properly. If the slowpath sets it for whatever reason, we
+	 * need to restore it correctly.
+	 *
+	 * SYSRET can restore TF, but unlike IRET, restoring TF results in a
+	 * trap from userspace immediately after SYSRET.  This would cause an
+	 * infinite loop whenever #DB happens with register state that satisfies
+	 * the opportunistic SYSRET conditions.  For example, single-stepping
+	 * this user code:
+	 *
+	 *           movq	$stuck_here, %rcx
+	 *           pushfq
+	 *           popq %r11
+	 *   stuck_here:
+	 *
+	 * would never get past 'stuck_here'.
+	 */
+	if (regs->r11 & (X86_EFLAGS_RF | X86_EFLAGS_TF))
+		return 0;
+
+	return 1;
+}
+
+/* Returns 0 to return using IRET or 1 to return using SYSRET. */
+__visible noinstr int do_syscall_64(struct pt_regs *regs, int nr)
 {
 	add_random_kstack_offset();
 	nr = syscall_enter_from_user_mode(regs, nr);
@@ -84,6 +154,7 @@ __visible noinstr void do_syscall_64(struct pt_regs *regs, int nr)
 
 	instrumentation_end();
 	syscall_exit_to_user_mode(regs);
+	return can_sysret(regs);
 }
 #endif
 
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 260be3c9da7d..777fbf7c3939 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -112,85 +112,15 @@ SYM_INNER_LABEL(entry_SYSCALL_64_after_hwframe, SYM_L_GLOBAL)
 	movslq	%eax, %rsi
 	call	do_syscall_64		/* returns with IRQs disabled */
 
-	/*
-	 * Try to use SYSRET instead of IRET if we're returning to
-	 * a completely clean 64-bit userspace context.  If we're not,
-	 * go to the slow exit path.
-	 * In the Xen PV case we must use iret anyway.
-	 */
-
-	ALTERNATIVE "", "jmp	swapgs_restore_regs_and_return_to_usermode", \
-		X86_FEATURE_XENPV
-
-	movq	RCX(%rsp), %rcx
-	movq	RIP(%rsp), %r11
-
-	cmpq	%rcx, %r11	/* SYSRET requires RCX == RIP */
-	jne	swapgs_restore_regs_and_return_to_usermode
+	testl	%eax, %eax
+	jz swapgs_restore_regs_and_return_to_usermode
 
 	/*
-	 * On Intel CPUs, SYSRET with non-canonical RCX/RIP will #GP
-	 * in kernel space.  This essentially lets the user take over
-	 * the kernel, since userspace controls RSP.
-	 *
-	 * If width of "canonical tail" ever becomes variable, this will need
-	 * to be updated to remain correct on both old and new CPUs.
-	 *
-	 * Change top bits to match most significant bit (47th or 56th bit
-	 * depending on paging mode) in the address.
-	 */
-#ifdef CONFIG_X86_5LEVEL
-	ALTERNATIVE "shl $(64 - 48), %rcx; sar $(64 - 48), %rcx", \
-		"shl $(64 - 57), %rcx; sar $(64 - 57), %rcx", X86_FEATURE_LA57
-#else
-	shl	$(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx
-	sar	$(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx
-#endif
-
-	/* If this changed %rcx, it was not canonical */
-	cmpq	%rcx, %r11
-	jne	swapgs_restore_regs_and_return_to_usermode
-
-	cmpq	$__USER_CS, CS(%rsp)		/* CS must match SYSRET */
-	jne	swapgs_restore_regs_and_return_to_usermode
-
-	movq	R11(%rsp), %r11
-	cmpq	%r11, EFLAGS(%rsp)		/* R11 == RFLAGS */
-	jne	swapgs_restore_regs_and_return_to_usermode
-
-	/*
-	 * SYSCALL clears RF when it saves RFLAGS in R11 and SYSRET cannot
-	 * restore RF properly. If the slowpath sets it for whatever reason, we
-	 * need to restore it correctly.
-	 *
-	 * SYSRET can restore TF, but unlike IRET, restoring TF results in a
-	 * trap from userspace immediately after SYSRET.  This would cause an
-	 * infinite loop whenever #DB happens with register state that satisfies
-	 * the opportunistic SYSRET conditions.  For example, single-stepping
-	 * this user code:
-	 *
-	 *           movq	$stuck_here, %rcx
-	 *           pushfq
-	 *           popq %r11
-	 *   stuck_here:
-	 *
-	 * would never get past 'stuck_here'.
-	 */
-	testq	$(X86_EFLAGS_RF|X86_EFLAGS_TF), %r11
-	jnz	swapgs_restore_regs_and_return_to_usermode
-
-	/* nothing to check for RSP */
-
-	cmpq	$__USER_DS, SS(%rsp)		/* SS must match SYSRET */
-	jne	swapgs_restore_regs_and_return_to_usermode
-
-	/*
-	 * We win! This label is here just for ease of understanding
+	 * This label is here just for ease of understanding
 	 * perf profiles. Nothing jumps here.
 	 */
 syscall_return_via_sysret:
-	/* rcx and r11 are already restored (see code above) */
-	POP_REGS pop_rdi=0 skip_r11rcx=1
+	POP_REGS pop_rdi=0
 
 	/*
 	 * Now all regs are restored except RSP and RDI.
diff --git a/arch/x86/include/asm/syscall.h b/arch/x86/include/asm/syscall.h
index f7e2d82d24fb..477adea7bac0 100644
--- a/arch/x86/include/asm/syscall.h
+++ b/arch/x86/include/asm/syscall.h
@@ -159,7 +159,7 @@ static inline int syscall_get_arch(struct task_struct *task)
 		? AUDIT_ARCH_I386 : AUDIT_ARCH_X86_64;
 }
 
-void do_syscall_64(struct pt_regs *regs, int nr);
+int do_syscall_64(struct pt_regs *regs, int nr);
 void do_int80_syscall_32(struct pt_regs *regs);
 long do_fast_syscall_32(struct pt_regs *regs);
 
-- 
2.19.1.6.gb485710b
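
A quick worked example of the canonical_address() arithmetic above,
assuming 48-bit (4-level) paging, i.e. a shift of 64 - 48 = 16 bits:

/*
 * vaddr              = 0x0000800000000000  (bit 47 set: non-canonical)
 * (s64)vaddr << 16   = 0x8000000000000000
 * arithmetic   >> 16 = 0xffff800000000000  (bit 47 replicated into 63..48)
 *
 * The result differs from vaddr, so can_sysret() returns 0 and the exit
 * takes the IRET path.  A canonical address (bits 63..47 all equal) is
 * returned unchanged.
 */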



* Re: [PATCH V2 01/41] x86/entry: Fix swapgs fence
  2021-09-26 15:07 ` [PATCH V2 01/41] x86/entry: Fix swapgs fence Lai Jiangshan
@ 2021-09-26 20:43   ` Thomas Gleixner
  2021-09-27  1:10     ` Lai Jiangshan
  0 siblings, 1 reply; 54+ messages in thread
From: Thomas Gleixner @ 2021-09-26 20:43 UTC (permalink / raw)
  To: Lai Jiangshan, linux-kernel
  Cc: Lai Jiangshan, Josh Poimboeuf, Chang S . Bae, Sasha Levin,
	Andy Lutomirski, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin

Lai,

On Sun, Sep 26 2021 at 23:07, Lai Jiangshan wrote:
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -898,17 +898,12 @@ SYM_CODE_START_LOCAL(paranoid_entry)
>  	rdmsr
>  	testl	%edx, %edx
>  	jns	.Lparanoid_entry_swapgs
> +	FENCE_SWAPGS_KERNEL_ENTRY

Good catch.

>  	ret
>  
>  .Lparanoid_entry_swapgs:
>  	swapgs
> -
> -	/*
> -	 * The above SAVE_AND_SWITCH_TO_KERNEL_CR3 macro doesn't do an
> -	 * unconditional CR3 write, even in the PTI case.  So do an lfence
> -	 * to prevent GS speculation, regardless of whether PTI is enabled.
> -	 */
> -	FENCE_SWAPGS_KERNEL_ENTRY
> +	FENCE_SWAPGS_USER_ENTRY

This change is wrong.

In the paranoid entry path even if user GS base is set then the entry
does not necessarily come from user space so there is no guarantee that
there was a CR3 write on PTI enabled systems before the SWAPGS.

FENCE_SWAPGS_USER_ENTRY does not emit a LFENCE when PTI is enabled, so
both the comment and FENCE_SWAPGS_KERNEL_ENTRY which emits LFENCE on
affected CPUs unconditionaly are correct. Though the comment could do
with some polishing to make this entirely clear.

Before adding support for FSGSBASE both the swapgs and non-swapgs cases
issued the LFENCE unconditionally with FENCE_SWAPGS_KERNEL_ENTRY. The
commit you identified split the code paths and failed to add the
FENCE_SWAPGS_KERNEL_ENTRY into the non-swapgs path.

Thanks,

        tglx


* Re: [PATCH V2 01/41] x86/entry: Fix swapgs fence
  2021-09-26 20:43   ` Thomas Gleixner
@ 2021-09-27  1:10     ` Lai Jiangshan
  2021-09-27  3:27       ` Lai Jiangshan
  0 siblings, 1 reply; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-27  1:10 UTC (permalink / raw)
  To: Thomas Gleixner, Lai Jiangshan, linux-kernel
  Cc: Josh Poimboeuf, Chang S . Bae, Sasha Levin, Andy Lutomirski,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin



On 2021/9/27 04:43, Thomas Gleixner wrote:
> Lai,
> 
> On Sun, Sep 26 2021 at 23:07, Lai Jiangshan wrote:
>> --- a/arch/x86/entry/entry_64.S
>> +++ b/arch/x86/entry/entry_64.S
>> @@ -898,17 +898,12 @@ SYM_CODE_START_LOCAL(paranoid_entry)
>>   	rdmsr
>>   	testl	%edx, %edx
>>   	jns	.Lparanoid_entry_swapgs
>> +	FENCE_SWAPGS_KERNEL_ENTRY
> 
> Good catch.
> 
>>   	ret
>>   
>>   .Lparanoid_entry_swapgs:
>>   	swapgs
>> -
>> -	/*
>> -	 * The above SAVE_AND_SWITCH_TO_KERNEL_CR3 macro doesn't do an
>> -	 * unconditional CR3 write, even in the PTI case.  So do an lfence
>> -	 * to prevent GS speculation, regardless of whether PTI is enabled.
>> -	 */
>> -	FENCE_SWAPGS_KERNEL_ENTRY
>> +	FENCE_SWAPGS_USER_ENTRY
> 
> This change is wrong.
> 
> In the paranoid entry path even if user GS base is set then the entry
> does not necessarily come from user space so there is no guarantee that
> there was a CR3 write on PTI enabled systems before the SWAPGS.
> 
> FENCE_SWAPGS_USER_ENTRY does not emit a LFENCE when PTI is enabled, so
> both the comment and FENCE_SWAPGS_KERNEL_ENTRY which emits LFENCE on
> affected CPUs unconditionaly are correct. Though the comment could do
> with some polishing to make this entirely clear.


I didn't notice FENCE_SWAPGS_USER_ENTRY depends on PTI.

I will add FENCE_SWAPGS_KERNEL_ENTRY only on the kernel path.

Thanks
Lai


* Re: [PATCH V2 01/41] x86/entry: Fix swapgs fence
  2021-09-27  1:10     ` Lai Jiangshan
@ 2021-09-27  3:27       ` Lai Jiangshan
  2021-09-27  7:50         ` Thomas Gleixner
  0 siblings, 1 reply; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-27  3:27 UTC (permalink / raw)
  To: Thomas Gleixner, Lai Jiangshan, linux-kernel
  Cc: Josh Poimboeuf, Chang S . Bae, Sasha Levin, Andy Lutomirski,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin



On 2021/9/27 09:10, Lai Jiangshan wrote:

>>
>> This change is wrong.
>>
>> In the paranoid entry path even if user GS base is set then the entry
>> does not necessarily come from user space so there is no guarantee that
>> there was a CR3 write on PTI enabled systems before the SWAPGS.
>>
>> FENCE_SWAPGS_USER_ENTRY does not emit a LFENCE when PTI is enabled, so
>> both the comment and FENCE_SWAPGS_KERNEL_ENTRY which emits LFENCE on
>> affected CPUs unconditionaly are correct. Though the comment could do
>> with some polishing to make this entirely clear.
> 
> 
> I didn't notice FENCE_SWAPGS_USER_ENTRY depends on PTI.

The commit c75890700455 ("x86/entry/64: Remove unneeded kernel CR3 switching")
( https://lore.kernel.org/all/20200419144049.1906-2-laijs@linux.alibaba.com/ )
also made it wrong.

When that commit removed the SWITCH_TO_KERNEL_CR3 in the path,
FENCE_SWAPGS_USER_ENTRY should also have been changed to
FENCE_SWAPGS_KERNEL_ENTRY. (Or just jmp to .Lerror_entry_done_lfence,
which already has FENCE_SWAPGS_KERNEL_ENTRY.)

And FENCE_SWAPGS_USER_ENTRY could be documented with "it should be followed
by a serializing operation such as SWITCH_TO_KERNEL_CR3".  Or we can add a
SWAPGS_AND_SWITCH_TO_KERNEL_CR3 to combine them.

I will fix it in v3. (Or should I do it separately before v3?)

Sorry, that was my fault.
Lai

> 
> I will add FENCE_SWAPGS_KERNEL_ENTRY only on the kernel path.
> 
> Thanks
> Lai


* Re: [PATCH V2 01/41] x86/entry: Fix swapgs fence
  2021-09-27  3:27       ` Lai Jiangshan
@ 2021-09-27  7:50         ` Thomas Gleixner
  0 siblings, 0 replies; 54+ messages in thread
From: Thomas Gleixner @ 2021-09-27  7:50 UTC (permalink / raw)
  To: Lai Jiangshan, Lai Jiangshan, linux-kernel
  Cc: Josh Poimboeuf, Chang S . Bae, Sasha Levin, Andy Lutomirski,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin

Lai,

On Mon, Sep 27 2021 at 11:27, Lai Jiangshan wrote:
> On 2021/9/27 09:10, Lai Jiangshan wrote:
>
> The commit c75890700455 ("x86/entry/64: Remove unneeded kernel CR3 switching")
> ( https://lore.kernel.org/all/20200419144049.1906-2-laijs@linux.alibaba.com/ )
> also made it wrong.

Duh, did not spot that either.

> When the SWITCH_TO_KERNEL_CR3 in the path is removed, FENCE_SWAPGS_USER_ENTRY
> should also be changed to FENCE_SWAPGS_KERNEL_ENTRY. (Or just jmp to
> .Lerror_entry_done_lfence which has FENCE_SWAPGS_KERNEL_ENTRY already.)

Yes.

> And FENCE_SWAPGS_USER_ENTRY could be documented with "it should be followed with
> serializing operations such as SWITCH_TO_KERNEL_CR3".

It does not matter whether the serializing operation comes before or after
the SWAPGS. The problem is:

    if (from_user)
    	swapgs();

can take the wrong path speculatively which means the speculation is
then based on the wrong GS.

We have these sequences in the non-paranoid entries:

    if (from_user) {
       pti_switch_cr3();
       swapgs();
    }

    if (from_user) {
       swapgs();
       pti_switch_cr3();
    }

and with mitigation these become:

    if (from_user) {
       pti_switch_cr3();
       swapgs();
       lfence_if_not_pti();
    } else {
       lfence();
    }

    if (from_user) {
       swapgs();
       lfence_if_not_pti();
       pti_switch_cr3();
    } else {
       lfence();
    }

When PTI is enabled then the CR3 write is sufficient because it's fully
serializing. If PTI is off, the LFENCE is required. Whether the CR3 write
comes before or after SWAPGS does not matter.
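
(For reference, the series' C helpers from patch 11 encode exactly this
split.  kernel_entry_fence_no_swapgs() is quoted verbatim later in this
thread; its user-side counterpart presumably looks like the sketch below.)

static __always_inline void user_entry_swapgs_and_fence(void)
{
	native_swapgs();
	/*
	 * X86_FEATURE_FENCE_SWAPGS_USER is not set when PTI is enabled,
	 * since the PTI CR3 write is already fully serializing.
	 */
	alternative("", "lfence", X86_FEATURE_FENCE_SWAPGS_USER);
}

static __always_inline void kernel_entry_fence_no_swapgs(void)
{
	alternative("", "lfence", X86_FEATURE_FENCE_SWAPGS_KERNEL);
}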

>  Or we can add a SWAPGS_AND_SWITCH_TO_KERNEL_CR3 to combine them.

No. We really don't want to go there.

Thanks,

        tglx



* Re: [PATCH V2 02/41] x86/traps: Remove stack-protector from traps.c
  2021-09-26 15:07 ` [PATCH V2 02/41] x86/traps: Remove stack-protector from traps.c Lai Jiangshan
@ 2021-09-27 10:19   ` Borislav Petkov
  2021-09-27 10:49     ` Lai Jiangshan
  0 siblings, 1 reply; 54+ messages in thread
From: Borislav Petkov @ 2021-09-27 10:19 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: linux-kernel, Lai Jiangshan, Thomas Gleixner, Ingo Molnar, x86,
	H. Peter Anvin, Joerg Roedel, Javier Martinez Canillas,
	Daniel Bristot de Oliveira, Brijesh Singh, Andy Shevchenko,
	Arvind Sankar, Juergen Gross, Chester Lin

On Sun, Sep 26, 2021 at 11:07:59PM +0800, Lai Jiangshan wrote:
> diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
> index 8f4e8fa6ed75..0e054e2304c6 100644
> --- a/arch/x86/kernel/Makefile
> +++ b/arch/x86/kernel/Makefile
> @@ -48,6 +48,9 @@ KCOV_INSTRUMENT		:= n
>  
>  CFLAGS_head$(BITS).o	+= -fno-stack-protector
>  
> +CFLAGS_REMOVE_traps.o		= -fstack-protector -fstack-protector-strong

Why this too?

> +CFLAGS_traps.o			+= -fno-stack-protector

Isn't this enough to disable stack protector for this file?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH V2 02/41] x86/traps: Remove stack-protector from traps.c
  2021-09-27 10:19   ` Borislav Petkov
@ 2021-09-27 10:49     ` Lai Jiangshan
  2021-09-27 11:01       ` Borislav Petkov
  0 siblings, 1 reply; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-27 10:49 UTC (permalink / raw)
  To: Borislav Petkov, Lai Jiangshan
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, x86, H. Peter Anvin,
	Joerg Roedel, Javier Martinez Canillas,
	Daniel Bristot de Oliveira, Brijesh Singh, Andy Shevchenko,
	Arvind Sankar, Juergen Gross, Chester Lin



On 2021/9/27 18:19, Borislav Petkov wrote:
> On Sun, Sep 26, 2021 at 11:07:59PM +0800, Lai Jiangshan wrote:
>> diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
>> index 8f4e8fa6ed75..0e054e2304c6 100644
>> --- a/arch/x86/kernel/Makefile
>> +++ b/arch/x86/kernel/Makefile
>> @@ -48,6 +48,9 @@ KCOV_INSTRUMENT		:= n
>>   
>>   CFLAGS_head$(BITS).o	+= -fno-stack-protector
>>   
>> +CFLAGS_REMOVE_traps.o		= -fstack-protector -fstack-protector-strong
> 
> Why this too?
> 
>> +CFLAGS_traps.o			+= -fno-stack-protector
> 
> Isn't this enough to disable stack protector for this file?
> 

I did not investigate deeply enough.  I reviewed the generated code, found
that %gs is accessed early in the C entry function, searched for a solution,
and chose to copy the code that I thought was the most complete:
kernel/entry/Makefile

Using only "-fno-stack-protector" is enough to disable the stack protector
with my .config; I'm not so sure about other configurations.

Thanks
Lai


* Re: [PATCH V2 02/41] x86/traps: Remove stack-protector from traps.c
  2021-09-27 10:49     ` Lai Jiangshan
@ 2021-09-27 11:01       ` Borislav Petkov
  2021-09-27 14:38         ` Lai Jiangshan
  0 siblings, 1 reply; 54+ messages in thread
From: Borislav Petkov @ 2021-09-27 11:01 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: Lai Jiangshan, linux-kernel, Thomas Gleixner, Ingo Molnar, x86,
	H. Peter Anvin, Joerg Roedel, Javier Martinez Canillas,
	Daniel Bristot de Oliveira, Brijesh Singh, Andy Shevchenko,
	Arvind Sankar, Juergen Gross, Chester Lin

On Mon, Sep 27, 2021 at 06:49:16PM +0800, Lai Jiangshan wrote:
> Using only "-fno-stack-protector" is enough to disable stack protector with
> my .config, I'm not so sure about other configuration.

What does the gcc manpage say about it?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH V2 02/41] x86/traps: Remove stack-protector from traps.c
  2021-09-27 11:01       ` Borislav Petkov
@ 2021-09-27 14:38         ` Lai Jiangshan
  0 siblings, 0 replies; 54+ messages in thread
From: Lai Jiangshan @ 2021-09-27 14:38 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Lai Jiangshan, linux-kernel, Thomas Gleixner, Ingo Molnar, x86,
	H. Peter Anvin, Joerg Roedel, Javier Martinez Canillas,
	Daniel Bristot de Oliveira, Brijesh Singh, Andy Shevchenko,
	Arvind Sankar, Juergen Gross, Chester Lin



On 2021/9/27 19:01, Borislav Petkov wrote:
> On Mon, Sep 27, 2021 at 06:49:16PM +0800, Lai Jiangshan wrote:
>> Using only "-fno-stack-protector" is enough to disable stack protector with
>> my .config, I'm not so sure about other configuration.
> 
> What does the gcc manpage say about it?
>

In gcc's code, all the -f[no-]stack-protector* arguments write to the same
flag_stack_protect variable, so the last one specified takes effect.

 > fstack-protector
 > Common Var(flag_stack_protect, 1) Init(-1) Optimization
 > Use propolice as a stack protection method.
 >
 > fstack-protector-all
 > Common RejectNegative Var(flag_stack_protect, 2) Init(-1) Optimization
 > Use a stack protection method for every function.
 >
 > fstack-protector-strong
 > Common RejectNegative Var(flag_stack_protect, 3) Init(-1) Optimization
 > Use a smart stack protection method for certain functions.
 >
 > fstack-protector-explicit
 > Common RejectNegative Var(flag_stack_protect, 4) Optimization
 > Use stack protection method only for functions with the stack_protect attribute.

In the Linux kernel's scripts/Makefile.lib, CFLAGS_traps.o comes last in the
gcc invocation, so "CFLAGS_traps.o += -fno-stack-protector" alone should be
enough.

 > _c_flags       = $(filter-out $(CFLAGS_REMOVE_$(target-stem).o), \
 >                     $(filter-out $(ccflags-remove-y), \
 >                         $(KBUILD_CPPFLAGS) $(KBUILD_CFLAGS) $(ccflags-y)) \
 >                     $(CFLAGS_$(target-stem).o))


* Re: [PATCH V2 03/41] compiler_types.h: Add __noinstr_section() for noinstr
  2021-09-26 15:08 ` [PATCH V2 03/41] compiler_types.h: Add __noinstr_section() for noinstr Lai Jiangshan
@ 2021-09-27 18:09   ` Kees Cook
  0 siblings, 0 replies; 54+ messages in thread
From: Kees Cook @ 2021-09-27 18:09 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: linux-kernel, Lai Jiangshan, Nathan Chancellor, Miguel Ojeda,
	Nick Desaulniers, Peter Zijlstra (Intel),
	Sami Tolvanen, Masahiro Yamada, Marco Elver, Arnd Bergmann,
	Ard Biesheuvel

On Sun, Sep 26, 2021 at 11:08:00PM +0800, Lai Jiangshan wrote:
> From: Lai Jiangshan <laijs@linux.alibaba.com>
> 
> And it will be extended for C entry code.
> 
> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
> ---
>  include/linux/compiler_types.h | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> index b6ff83a714ca..3c77631c68bd 100644
> --- a/include/linux/compiler_types.h
> +++ b/include/linux/compiler_types.h
> @@ -208,10 +208,12 @@ struct ftrace_likely_data {
>  #endif
>  
>  /* Section for code which can't be instrumented at all */
> -#define noinstr								\
> -	noinline notrace __attribute((__section__(".noinstr.text")))	\
> +#define __noinstr_section(section)				\

bikeshed: this could be just __noinstr(section) instead
of __noinstr_section(section) just to avoid semi-redundant
information. *shrug*

Reviewed-by: Kees Cook <keescook@chromium.org>

> +	noinline notrace __attribute((__section__(section)))	\
>  	__no_kcsan __no_sanitize_address __no_profile __no_sanitize_coverage
>  
> +#define noinstr __noinstr_section(".noinstr.text")
> +
>  #endif /* __KERNEL__ */
>  
>  #endif /* __ASSEMBLY__ */
> -- 
> 2.19.1.6.gb485710b
> 

-- 
Kees Cook


* Re: [PATCH V2 16/41] x86/entry: Implement the whole error_entry() as C code
  2021-09-26 15:08 ` [PATCH V2 16/41] x86/entry: Implement the whole error_entry() as C code Lai Jiangshan
@ 2021-09-28 21:34   ` Brian Gerst
  2021-09-29  8:45     ` Peter Zijlstra
  0 siblings, 1 reply; 54+ messages in thread
From: Brian Gerst @ 2021-09-28 21:34 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: Linux Kernel Mailing List, Lai Jiangshan, Andy Lutomirski,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	the arch/x86 maintainers, H. Peter Anvin, Youquan Song,
	Peter Zijlstra, Tony Luck

On Sun, Sep 26, 2021 at 11:13 AM Lai Jiangshan <jiangshanlai@gmail.com> wrote:
>
> From: Lai Jiangshan <laijs@linux.alibaba.com>
>
> All the needed facilities are set in entry64.c, the whole error_entry()
> can be implemented in C in entry64.c.  The C version generally has better
> readability and easier to be updated/improved.
>
> No function change intended. Only a check for X86_FEATURE_XENPV is added
> because the new error_entry() does not use the pv SWAPGS, rather it uses
> native_swapgs().  And for XENPV, error_entry() has nothing to do, so it
> can return directly.
>
> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
> ---
>  arch/x86/entry/entry64.c     | 76 ++++++++++++++++++++++++++++++++++
>  arch/x86/entry/entry_64.S    | 80 +-----------------------------------
>  arch/x86/include/asm/traps.h |  1 +
>  3 files changed, 78 insertions(+), 79 deletions(-)
>
> diff --git a/arch/x86/entry/entry64.c b/arch/x86/entry/entry64.c
> index dafae60d31f9..5f2be4c3f333 100644
> --- a/arch/x86/entry/entry64.c
> +++ b/arch/x86/entry/entry64.c
> @@ -56,3 +56,78 @@ static __always_inline void kernel_entry_fence_no_swapgs(void)
>  {
>         alternative("", "lfence", X86_FEATURE_FENCE_SWAPGS_KERNEL);
>  }
> +
> +/*
> + * Put pt_regs onto the task stack and switch GS and CR3 if needed.
> + * The actual stack switch is done in entry_64.S.
> + *
> + * Be careful, it might be in the user CR3 and user GS base at the start
> + * of the function.
> + */
> +asmlinkage __visible __entry_text
> +struct pt_regs *error_entry(struct pt_regs *eregs)
> +{
> +       unsigned long iret_ip = (unsigned long)native_irq_return_iret;
> +
> +       asm volatile ("cld");

The C ABI states that the direction flag must be clear on function
entry and exit, so the CLD instruction needs to remain in the asm
code.

https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf#subsection.3.2.1

--
Brian Gerst


* Re: [PATCH V2 16/41] x86/entry: Implement the whole error_entry() as C code
  2021-09-28 21:34   ` Brian Gerst
@ 2021-09-29  8:45     ` Peter Zijlstra
  0 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-09-29  8:45 UTC (permalink / raw)
  To: Brian Gerst
  Cc: Lai Jiangshan, Linux Kernel Mailing List, Lai Jiangshan,
	Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	the arch/x86 maintainers, H. Peter Anvin, Youquan Song,
	Tony Luck

On Tue, Sep 28, 2021 at 05:34:02PM -0400, Brian Gerst wrote:
> On Sun, Sep 26, 2021 at 11:13 AM Lai Jiangshan <jiangshanlai@gmail.com> wrote:
> > +asmlinkage __visible __entry_text
> > +struct pt_regs *error_entry(struct pt_regs *eregs)
> > +{
> > +       unsigned long iret_ip = (unsigned long)native_irq_return_iret;
> > +
> > +       asm volatile ("cld");
> 
> The C ABI states that the direction flag must be clear on function
> entry and exit, so the CLD instruction needs to remain in the asm
> code.

Right, also, one of my pet peeves with our entry code is that CLD and
CLAC are not next to one another.


* Re: [PATCH V2 04/41] x86/entry: Introduce __entry_text for entry code written in C
  2021-09-26 15:08 ` [PATCH V2 04/41] x86/entry: Introduce __entry_text for entry code written in C Lai Jiangshan
@ 2021-09-30 11:49   ` Borislav Petkov
  0 siblings, 0 replies; 54+ messages in thread
From: Borislav Petkov @ 2021-09-30 11:49 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: linux-kernel, Lai Jiangshan, Thomas Gleixner, Ingo Molnar, x86,
	H. Peter Anvin, Juergen Gross, Peter Zijlstra (Intel),
	Joerg Roedel, Mike Travis

On Sun, Sep 26, 2021 at 11:08:01PM +0800, Lai Jiangshan wrote:
> From: Lai Jiangshan <laijs@linux.alibaba.com>
> 
> Some entry code will be implemented in C files.  We need __entry_text

Who's "we"?

> to set them in .entry.text section.  __entry_text disables instruments

s/instruments/instrumentation/

> like noinstr, but it doesn't disable stack protector since not all
> compiler supported by kernel supporting function level granular
> attribute to disable stack protector.  It will be disabled by C file
> level.
> 
> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
> ---
>  arch/x86/include/asm/idtentry.h | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
> index 1345088e9902..6779def97591 100644
> --- a/arch/x86/include/asm/idtentry.h
> +++ b/arch/x86/include/asm/idtentry.h
> @@ -11,6 +11,9 @@
>  
>  #include <asm/irq_stack.h>
>  
> +/* Entry code written in C. */
> +#define __entry_text __noinstr_section(".entry.text")

I'm assuming that __noinstr_section() is defined somewhere, maybe in
patch 3, which I don't have in my mbox.

Yah, the 0th message says:

"  compiler_types.h: Add __noinstr_section() for noinstr"

Aha, I see why: you haven't CCed me on that one so I don't have it:

https://lkml.kernel.org/r/20210926150838.197719-4-jiangshanlai@gmail.com

I have all the remaining 40 but not that one.

On your next submission, please make sure you CC x86@kernel.org so that
all x86 people get the whole patchset.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


end of thread

Thread overview: 54+ messages
2021-09-26 15:07 [PATCH V2 00/41] x86/entry/64: Convert a bunch of ASM entry code into C code Lai Jiangshan
2021-09-26 15:07 ` [PATCH V2 01/41] x86/entry: Fix swapgs fence Lai Jiangshan
2021-09-26 20:43   ` Thomas Gleixner
2021-09-27  1:10     ` Lai Jiangshan
2021-09-27  3:27       ` Lai Jiangshan
2021-09-27  7:50         ` Thomas Gleixner
2021-09-26 15:07 ` [PATCH V2 02/41] x86/traps: Remove stack-protector from traps.c Lai Jiangshan
2021-09-27 10:19   ` Borislav Petkov
2021-09-27 10:49     ` Lai Jiangshan
2021-09-27 11:01       ` Borislav Petkov
2021-09-27 14:38         ` Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 03/41] compiler_types.h: Add __noinstr_section() for noinstr Lai Jiangshan
2021-09-27 18:09   ` Kees Cook
2021-09-26 15:08 ` [PATCH V2 04/41] x86/entry: Introduce __entry_text for entry code written in C Lai Jiangshan
2021-09-30 11:49   ` Borislav Petkov
2021-09-26 15:08 ` [PATCH V2 05/41] x86/entry: Move PTI_USER_* to arch/x86/include/asm/processor-flags.h Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 06/41] x86: Mark __native_read_cr3() & native_write_cr3() as __always_inline Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 07/41] x86/traps: Move the declaration of native_irq_return_iret into proto.h Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 08/41] x86/entry: Add arch/x86/entry/entry64.c for C entry code Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 09/41] x86/entry: Expose the address of .Lgs_change to entry64.c Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 10/41] x86/entry: Add C verion of SWITCH_TO_KERNEL_CR3 as switch_to_kernel_cr3() Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 11/41] x86/entry: Add C user_entry_swapgs_and_fence() and kernel_entry_fence_no_swapgs() Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 12/41] x86/traps: Move pt_regs only in fixup_bad_iret() Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 13/41] x86/entry: Switch the stack after error_entry() returns Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 14/41] x86/entry: move PUSH_AND_CLEAR_REGS out of error_entry Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 15/41] objtool: Allow .entry.text function using CLD instruction Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 16/41] x86/entry: Implement the whole error_entry() as C code Lai Jiangshan
2021-09-28 21:34   ` Brian Gerst
2021-09-29  8:45     ` Peter Zijlstra
2021-09-26 15:08 ` [PATCH V2 17/41] x86/entry: Make paranoid_exit() callable Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 18/41] x86/entry: Call paranoid_exit() in asm_exc_nmi() Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 19/41] x86/entry: move PUSH_AND_CLEAR_REGS out of paranoid_entry Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 20/41] x86/entry: Add the C version ist_switch_to_kernel_cr3() Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 21/41] x86/entry: Add the C version ist_restore_cr3() Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 22/41] x86/entry: Add the C version get_percpu_base() Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 23/41] x86/entry: Add the C version ist_switch_to_kernel_gsbase() Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 24/41] x86/entry: Implement the C version ist_paranoid_entry() Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 25/41] x86/entry: Implement the C version ist_paranoid_exit() Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 26/41] x86/entry: Add a C macro to define the function body for IST in .entry.text Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 27/41] x86/mce: Remove stack protector from mce/core.c Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 28/41] x86/debug, mce: Use C entry code Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 29/41] x86/idtentry.h: Move the definitions *IDTENTRY_{MCE|DEBUG}* up Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 30/41] x86/nmi: Use DEFINE_IDTENTRY_NMI for nmi Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 31/41] x86/nmi: Remove stack protector from nmi.c Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 32/41] x86/nmi: Use C entry code Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 33/41] x86/entry: Add a C macro to define the function body for IST in .entry.text with an error code Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 34/41] x86/doublefault: Use C entry code Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 35/41] x86/sev: Add and use ist_vc_switch_off_ist() Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 36/41] x86/sev: Remove stack protector from sev.c Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 37/41] x86/sev: Use C entry code Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 38/41] x86/entry: Remove ASM function paranoid_entry() and paranoid_exit() Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 39/41] x86/entry: Remove the unused ASM macros Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 40/41] x86/entry: Remove save_ret from PUSH_AND_CLEAR_REGS Lai Jiangshan
2021-09-26 15:08 ` [PATCH V2 41/41] x86/syscall/64: Move the checking for sysret to C code Lai Jiangshan
