* [PATCH 00/11] x86/entry: Clean up entry code
@ 2021-12-08 11:08 Lai Jiangshan
2021-12-08 11:08 ` [PATCH 01/11] x86/entry: Use swapgs and native_iret directly in swapgs_restore_regs_and_return_to_usermode Lai Jiangshan
` (10 more replies)
0 siblings, 11 replies; 13+ messages in thread
From: Lai Jiangshan @ 2021-12-08 11:08 UTC (permalink / raw)
To: linux-kernel; +Cc: x86, xen-devel, Lai Jiangshan
From: Lai Jiangshan <laijs@linux.alibaba.com>
This patchset moves the stack-switch code to the place where
error_entry() returns, disentangles error_entry() from XENPV, and makes
entry_INT80_compat use the idtentry macro.
This patchset is closely related to XENPV because it does the extra
cleanup of converting SWAPGS to swapgs after the major cleanup is done.
After error_entry() is disentangled from XENPV, a branch in sync_regs()
is removed too.
Patches 1-10 are picked from the patchset
https://lore.kernel.org/lkml/20211126101209.8613-1-jiangshanlai@gmail.com/
which converts ASM code to C code; they are preparation for that
purpose. But this patchset has value of its own: it simplifies the
stack switch, avoids leaving the old stack inside a function, and
separates XENPV code from native code without adding new code, which
can lead to more cleanups.
Patch 11 is new in this patchset.
Lai Jiangshan (11):
x86/entry: Use swapgs and native_iret directly in
swapgs_restore_regs_and_return_to_usermode
x86/traps: Move pt_regs only in fixup_bad_iret()
x86/entry: Switch the stack after error_entry() returns
x86/entry: move PUSH_AND_CLEAR_REGS out of error_entry
x86/entry: Move cld to the start of idtentry
x86/entry: Don't call error_entry for XENPV
x86/entry: Convert SWAPGS to swapgs in error_entry()
x86/entry: Use idtentry macro for entry_INT80_compat
x86/entry: Convert SWAPGS to swapgs in entry_SYSENTER_compat()
x86: Remove the definition of SWAPGS
x86/entry: Remove the branch in sync_regs()
arch/x86/entry/entry_64.S | 65 +++++++++++++------
arch/x86/entry/entry_64_compat.S | 104 +------------------------------
arch/x86/include/asm/idtentry.h | 47 ++++++++++++++
arch/x86/include/asm/irqflags.h | 2 -
arch/x86/include/asm/proto.h | 4 --
arch/x86/include/asm/traps.h | 2 +-
arch/x86/kernel/traps.c | 25 ++++----
7 files changed, 107 insertions(+), 142 deletions(-)
--
2.19.1.6.gb485710b
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 01/11] x86/entry: Use swapgs and native_iret directly in swapgs_restore_regs_and_return_to_usermode
2021-12-08 11:08 [PATCH 00/11] x86/entry: Clean up entry code Lai Jiangshan
@ 2021-12-08 11:08 ` Lai Jiangshan
2021-12-13 12:13 ` Juergen Gross
2021-12-08 11:08 ` [PATCH 02/11] x86/traps: Move pt_regs only in fixup_bad_iret() Lai Jiangshan
` (9 subsequent siblings)
10 siblings, 1 reply; 13+ messages in thread
From: Lai Jiangshan @ 2021-12-08 11:08 UTC (permalink / raw)
To: linux-kernel
Cc: x86, xen-devel, Lai Jiangshan, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin
From: Lai Jiangshan <laijs@linux.alibaba.com>
swapgs_restore_regs_and_return_to_usermode() is now used only in native
(non-XENPV) code, so it doesn't need the PV-aware SWAPGS and
INTERRUPT_RETURN.
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
arch/x86/entry/entry_64.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 97b1f84bb53f..d95a02ecee16 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -608,8 +608,8 @@ SYM_INNER_LABEL(swapgs_restore_regs_and_return_to_usermode, SYM_L_GLOBAL)
/* Restore RDI. */
popq %rdi
- SWAPGS
- INTERRUPT_RETURN
+ swapgs
+ jmp native_iret
SYM_INNER_LABEL(restore_regs_and_return_to_kernel, SYM_L_GLOBAL)
--
2.19.1.6.gb485710b
* [PATCH 02/11] x86/traps: Move pt_regs only in fixup_bad_iret()
2021-12-08 11:08 [PATCH 00/11] x86/entry: Clean up entry code Lai Jiangshan
2021-12-08 11:08 ` [PATCH 01/11] x86/entry: Use swapgs and native_iret directly in swapgs_restore_regs_and_return_to_usermode Lai Jiangshan
@ 2021-12-08 11:08 ` Lai Jiangshan
2021-12-08 11:08 ` [PATCH 03/11] x86/entry: Switch the stack after error_entry() returns Lai Jiangshan
` (8 subsequent siblings)
10 siblings, 0 replies; 13+ messages in thread
From: Lai Jiangshan @ 2021-12-08 11:08 UTC (permalink / raw)
To: linux-kernel
Cc: x86, xen-devel, Lai Jiangshan, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin,
Peter Zijlstra, Joerg Roedel, Chang S. Bae
From: Lai Jiangshan <laijs@linux.alibaba.com>
fixup_bad_iret() and sync_regs() have similar arguments and do similar
work: both copy a full or partial pt_regs to a new location so that the
stack can be switched after they return. They are nearly the same,
except that fixup_bad_iret() copies not only the pt_regs but also the
return address of error_entry(), while sync_regs() copies the pt_regs
only; the return address of error_entry() is preserved and handled in
ASM code.
Make fixup_bad_iret() work like sync_regs() and move the handling of
the return address of error_entry() into ASM code. This removes the
need for struct bad_iret_stack, simplifies fixup_bad_iret(), and makes
the ASM error_entry() call fixup_bad_iret() exactly the same way it
calls sync_regs(), which adds readability because the calling patterns
become identical.
This prepares for a later patch that does the stack switch after
error_entry() returns, simplifying the code further.
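The copy pattern of the reworked fixup_bad_iret() can be sketched in
user space. This is a simplified simulation, not the kernel code:
`fake_pt_regs` is a stand-in for the kernel's `struct pt_regs`, with
only the layout property that matters here (the five hardware IRET
words ip/cs/flags/sp/ss sit at the end of the struct):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Stand-in for struct pt_regs: a "remainder" part followed by the
 * 5-word hardware IRET frame, mirroring the x86-64 layout.
 */
struct fake_pt_regs {
	unsigned long r15, r14, di;		/* remainder of the regs */
	unsigned long ip, cs, flags, sp, ss;	/* hardware IRET frame */
};

static struct fake_pt_regs fixup_bad_iret_sim(struct fake_pt_regs *bad_regs,
					      const unsigned long *iret_frame)
{
	struct fake_pt_regs tmp;

	/* Copy the IRET target (5 words) from where bad_regs->sp points. */
	memcpy(&tmp.ip, iret_frame, 5 * sizeof(unsigned long));
	/* Copy the remainder of the regs from the current frame. */
	memcpy(&tmp, bad_regs, offsetof(struct fake_pt_regs, ip));
	return tmp;
}
```

With struct bad_iret_stack gone, the two memcpy() calls operate
directly on pt_regs offsets, which is what lets the ASM caller treat
fixup_bad_iret() like sync_regs().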
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
arch/x86/entry/entry_64.S | 5 ++++-
arch/x86/include/asm/traps.h | 2 +-
arch/x86/kernel/traps.c | 17 ++++++-----------
3 files changed, 11 insertions(+), 13 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index d95a02ecee16..f8faff3ca61e 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1040,9 +1040,12 @@ SYM_CODE_START_LOCAL(error_entry)
* Pretend that the exception came from user mode: set up pt_regs
* as if we faulted immediately after IRET.
*/
- mov %rsp, %rdi
popq %r12 /* save return addr in %r12 */
+ movq %rsp, %rdi /* arg0 = pt_regs pointer */
call fixup_bad_iret
mov %rax, %rsp
+ ENCODE_FRAME_POINTER
+ pushq %r12
jmp .Lerror_entry_from_usermode_after_swapgs
SYM_CODE_END(error_entry)
diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 6221be7cafc3..1cdd7e8bcba7 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -13,7 +13,7 @@
#ifdef CONFIG_X86_64
asmlinkage __visible notrace struct pt_regs *sync_regs(struct pt_regs *eregs);
asmlinkage __visible notrace
-struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s);
+struct pt_regs *fixup_bad_iret(struct pt_regs *bad_regs);
void __init trap_init(void);
asmlinkage __visible noinstr struct pt_regs *vc_switch_off_ist(struct pt_regs *eregs);
#endif
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index c9d566dcf89a..a6c0bc9ee36f 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -761,13 +761,8 @@ asmlinkage __visible noinstr struct pt_regs *vc_switch_off_ist(struct pt_regs *r
}
#endif
-struct bad_iret_stack {
- void *error_entry_ret;
- struct pt_regs regs;
-};
-
asmlinkage __visible noinstr
-struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s)
+struct pt_regs *fixup_bad_iret(struct pt_regs *bad_regs)
{
/*
* This is called from entry_64.S early in handling a fault
@@ -777,19 +772,19 @@ struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s)
* just below the IRET frame) and we want to pretend that the
* exception came from the IRET target.
*/
- struct bad_iret_stack tmp, *new_stack =
- (struct bad_iret_stack *)__this_cpu_read(cpu_tss_rw.x86_tss.sp0) - 1;
+ struct pt_regs tmp, *new_stack =
+ (struct pt_regs *)__this_cpu_read(cpu_tss_rw.x86_tss.sp0) - 1;
/* Copy the IRET target to the temporary storage. */
- __memcpy(&tmp.regs.ip, (void *)s->regs.sp, 5*8);
+ __memcpy(&tmp.ip, (void *)bad_regs->sp, 5*8);
/* Copy the remainder of the stack from the current stack. */
- __memcpy(&tmp, s, offsetof(struct bad_iret_stack, regs.ip));
+ __memcpy(&tmp, bad_regs, offsetof(struct pt_regs, ip));
/* Update the entry stack */
__memcpy(new_stack, &tmp, sizeof(tmp));
- BUG_ON(!user_mode(&new_stack->regs));
+ BUG_ON(!user_mode(new_stack));
return new_stack;
}
#endif
--
2.19.1.6.gb485710b
* [PATCH 03/11] x86/entry: Switch the stack after error_entry() returns
2021-12-08 11:08 [PATCH 00/11] x86/entry: Clean up entry code Lai Jiangshan
2021-12-08 11:08 ` [PATCH 01/11] x86/entry: Use swapgs and native_iret directly in swapgs_restore_regs_and_return_to_usermode Lai Jiangshan
2021-12-08 11:08 ` [PATCH 02/11] x86/traps: Move pt_regs only in fixup_bad_iret() Lai Jiangshan
@ 2021-12-08 11:08 ` Lai Jiangshan
2021-12-08 11:08 ` [PATCH 04/11] x86/entry: move PUSH_AND_CLEAR_REGS out of error_entry Lai Jiangshan
` (7 subsequent siblings)
10 siblings, 0 replies; 13+ messages in thread
From: Lai Jiangshan @ 2021-12-08 11:08 UTC (permalink / raw)
To: linux-kernel
Cc: x86, xen-devel, Lai Jiangshan, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin
From: Lai Jiangshan <laijs@linux.alibaba.com>
error_entry() calls sync_regs() to settle/copy the pt_regs, and the
stack is switched directly after sync_regs() returns. But error_entry()
is itself a function call, so the stack switch also has to carry
error_entry()'s own return address along, which makes the work
complicated and tangled.
Switching the stack after error_entry() returns makes the code simpler
and more intuitive.
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
arch/x86/entry/entry_64.S | 16 ++++++----------
1 file changed, 6 insertions(+), 10 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index f8faff3ca61e..6baa7413f7eb 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -323,6 +323,8 @@ SYM_CODE_END(ret_from_fork)
.macro idtentry_body cfunc has_error_code:req
call error_entry
+ movq %rax, %rsp /* switch stack settled by sync_regs() */
+ ENCODE_FRAME_POINTER
UNWIND_HINT_REGS
movq %rsp, %rdi /* pt_regs pointer into 1st argument*/
@@ -981,14 +983,10 @@ SYM_CODE_START_LOCAL(error_entry)
/* We have user CR3. Change to kernel CR3. */
SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
+ leaq 8(%rsp), %rdi /* arg0 = pt_regs pointer */
.Lerror_entry_from_usermode_after_swapgs:
/* Put us onto the real thread stack. */
- popq %r12 /* save return addr in %12 */
- movq %rsp, %rdi /* arg0 = pt_regs pointer */
call sync_regs
- movq %rax, %rsp /* switch stack */
- ENCODE_FRAME_POINTER
- pushq %r12
ret
/*
@@ -1020,6 +1018,7 @@ SYM_CODE_START_LOCAL(error_entry)
*/
.Lerror_entry_done_lfence:
FENCE_SWAPGS_KERNEL_ENTRY
+ leaq 8(%rsp), %rax /* return pt_regs pointer */
ret
.Lbstep_iret:
@@ -1040,12 +1039,9 @@ SYM_CODE_START_LOCAL(error_entry)
* Pretend that the exception came from user mode: set up pt_regs
* as if we faulted immediately after IRET.
*/
- popq %r12 /* save return addr in %r12 */
- movq %rsp, %rdi /* arg0 = pt_regs pointer */
+ leaq 8(%rsp), %rdi /* arg0 = pt_regs pointer */
call fixup_bad_iret
- mov %rax, %rsp
- ENCODE_FRAME_POINTER
- pushq %r12
+ mov %rax, %rdi
jmp .Lerror_entry_from_usermode_after_swapgs
SYM_CODE_END(error_entry)
--
2.19.1.6.gb485710b
* [PATCH 04/11] x86/entry: move PUSH_AND_CLEAR_REGS out of error_entry
2021-12-08 11:08 [PATCH 00/11] x86/entry: Clean up entry code Lai Jiangshan
` (2 preceding siblings ...)
2021-12-08 11:08 ` [PATCH 03/11] x86/entry: Switch the stack after error_entry() returns Lai Jiangshan
@ 2021-12-08 11:08 ` Lai Jiangshan
2021-12-08 11:08 ` [PATCH 05/11] x86/entry: Move cld to the start of idtentry Lai Jiangshan
` (6 subsequent siblings)
10 siblings, 0 replies; 13+ messages in thread
From: Lai Jiangshan @ 2021-12-08 11:08 UTC (permalink / raw)
To: linux-kernel
Cc: x86, xen-devel, Lai Jiangshan, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin
From: Lai Jiangshan <laijs@linux.alibaba.com>
Moving PUSH_AND_CLEAR_REGS out of error_entry doesn't change any
functionality, but it does enlarge the object size:
size arch/x86/entry/entry_64.o.before:
text data bss dec hex filename
17916 384 0 18300 477c arch/x86/entry/entry_64.o
size --format=SysV arch/x86/entry/entry_64.o.before:
.entry.text 5528 0
.orc_unwind 6456 0
.orc_unwind_ip 4304 0
size arch/x86/entry/entry_64.o.after:
text data bss dec hex filename
26868 384 0 27252 6a74 arch/x86/entry/entry_64.o
size --format=SysV arch/x86/entry/entry_64.o.after:
.entry.text 8200 0
.orc_unwind 10224 0
.orc_unwind_ip 6816 0
But .entry.text on x86_64 is 2M-aligned, so enlarging it to 8.2k
doesn't enlarge the final text size.
The .orc_unwind[_ip] tables are enlarged because the change adds many
push instructions.
This prepares for not calling error_entry() from XENPV in a later
patch, and for converting the whole of error_entry() into C code in
the future.
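The alignment argument can be checked with a little arithmetic sketch.
Assuming, as stated above, that .entry.text is padded up to a 2M
boundary in the final image, growing the section from ~5.5k to ~8.2k
does not change its rounded-up footprint; the numbers come from the
size(1) output quoted in this changelog:

```c
#include <assert.h>

/* Round x up to the next multiple of a; a must be a power of two. */
static unsigned long align_up(unsigned long x, unsigned long a)
{
	return (x + a - 1) & ~(a - 1);
}
```

With a 2M alignment, both 5528 bytes and 8200 bytes round up to the
same 2M footprint, so the visible text size is unchanged.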
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
arch/x86/entry/entry_64.S | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 6baa7413f7eb..72ceb4b70822 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -322,6 +322,9 @@ SYM_CODE_END(ret_from_fork)
*/
.macro idtentry_body cfunc has_error_code:req
+ PUSH_AND_CLEAR_REGS
+ ENCODE_FRAME_POINTER
+
call error_entry
movq %rax, %rsp /* switch stack settled by sync_regs() */
ENCODE_FRAME_POINTER
@@ -969,8 +972,6 @@ SYM_CODE_END(paranoid_exit)
SYM_CODE_START_LOCAL(error_entry)
UNWIND_HINT_FUNC
cld
- PUSH_AND_CLEAR_REGS save_ret=1
- ENCODE_FRAME_POINTER 8
testb $3, CS+8(%rsp)
jz .Lerror_kernelspace
--
2.19.1.6.gb485710b
* [PATCH 05/11] x86/entry: Move cld to the start of idtentry
2021-12-08 11:08 [PATCH 00/11] x86/entry: Clean up entry code Lai Jiangshan
` (3 preceding siblings ...)
2021-12-08 11:08 ` [PATCH 04/11] x86/entry: move PUSH_AND_CLEAR_REGS out of error_entry Lai Jiangshan
@ 2021-12-08 11:08 ` Lai Jiangshan
2021-12-08 11:08 ` [PATCH 06/11] x86/entry: Don't call error_entry for XENPV Lai Jiangshan
` (5 subsequent siblings)
10 siblings, 0 replies; 13+ messages in thread
From: Lai Jiangshan @ 2021-12-08 11:08 UTC (permalink / raw)
To: linux-kernel
Cc: x86, xen-devel, Lai Jiangshan, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin
From: Lai Jiangshan <laijs@linux.alibaba.com>
Make it sit right next to ASM_CLAC.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
arch/x86/entry/entry_64.S | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 72ceb4b70822..ee1d4adcdab0 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -356,6 +356,7 @@ SYM_CODE_END(ret_from_fork)
SYM_CODE_START(\asmsym)
UNWIND_HINT_IRET_REGS offset=\has_error_code*8
ASM_CLAC
+ cld
.if \has_error_code == 0
pushq $-1 /* ORIG_RAX: no syscall to restart */
@@ -423,6 +424,7 @@ SYM_CODE_END(\asmsym)
SYM_CODE_START(\asmsym)
UNWIND_HINT_IRET_REGS
ASM_CLAC
+ cld
pushq $-1 /* ORIG_RAX: no syscall to restart */
@@ -478,6 +480,7 @@ SYM_CODE_END(\asmsym)
SYM_CODE_START(\asmsym)
UNWIND_HINT_IRET_REGS
ASM_CLAC
+ cld
/*
* If the entry is from userspace, switch stacks and treat it as
@@ -539,6 +542,7 @@ SYM_CODE_END(\asmsym)
SYM_CODE_START(\asmsym)
UNWIND_HINT_IRET_REGS offset=8
ASM_CLAC
+ cld
/* paranoid_entry returns GS information for paranoid_exit in EBX. */
call paranoid_entry
@@ -853,7 +857,6 @@ SYM_CODE_END(xen_failsafe_callback)
*/
SYM_CODE_START_LOCAL(paranoid_entry)
UNWIND_HINT_FUNC
- cld
PUSH_AND_CLEAR_REGS save_ret=1
ENCODE_FRAME_POINTER 8
@@ -971,7 +974,6 @@ SYM_CODE_END(paranoid_exit)
*/
SYM_CODE_START_LOCAL(error_entry)
UNWIND_HINT_FUNC
- cld
testb $3, CS+8(%rsp)
jz .Lerror_kernelspace
@@ -1104,6 +1106,7 @@ SYM_CODE_START(asm_exc_nmi)
*/
ASM_CLAC
+ cld
/* Use %rdx as our temp variable throughout */
pushq %rdx
@@ -1123,7 +1126,6 @@ SYM_CODE_START(asm_exc_nmi)
*/
swapgs
- cld
FENCE_SWAPGS_USER_ENTRY
SWITCH_TO_KERNEL_CR3 scratch_reg=%rdx
movq %rsp, %rdx
--
2.19.1.6.gb485710b
* [PATCH 06/11] x86/entry: Don't call error_entry for XENPV
2021-12-08 11:08 [PATCH 00/11] x86/entry: Clean up entry code Lai Jiangshan
` (4 preceding siblings ...)
2021-12-08 11:08 ` [PATCH 05/11] x86/entry: Move cld to the start of idtentry Lai Jiangshan
@ 2021-12-08 11:08 ` Lai Jiangshan
2021-12-08 11:08 ` [PATCH 07/11] x86/entry: Convert SWAPGS to swapgs in error_entry() Lai Jiangshan
` (4 subsequent siblings)
10 siblings, 0 replies; 13+ messages in thread
From: Lai Jiangshan @ 2021-12-08 11:08 UTC (permalink / raw)
To: linux-kernel
Cc: x86, xen-devel, Lai Jiangshan, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin
From: Lai Jiangshan <laijs@linux.alibaba.com>
When running on XENPV, the kernel is already on the task stack, and it
can't fault on native_iret() or native_load_gs_index() since XENPV uses
its own pvops for iret and load_gs_index(). It doesn't need to switch
CR3 either. So there is no reason to call error_entry() on XENPV.
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
arch/x86/entry/entry_64.S | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index ee1d4adcdab0..383070aa9272 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -325,8 +325,17 @@ SYM_CODE_END(ret_from_fork)
PUSH_AND_CLEAR_REGS
ENCODE_FRAME_POINTER
- call error_entry
- movq %rax, %rsp /* switch stack settled by sync_regs() */
+ /*
+ * Call error_entry and switch stack settled by sync_regs().
+ *
+ * When running on XENPV, we are already on the task stack, and we
+ * can't fault on native_iret() or native_load_gs_index() since XENPV
+ * uses its own pvops for iret and load_gs_index(). We don't need to
+ * switch CR3 either. So XENPV can skip invoking error_entry().
+ */
+ ALTERNATIVE "call error_entry; movq %rax, %rsp", \
+ "", X86_FEATURE_XENPV
+
ENCODE_FRAME_POINTER
UNWIND_HINT_REGS
--
2.19.1.6.gb485710b
* [PATCH 07/11] x86/entry: Convert SWAPGS to swapgs in error_entry()
2021-12-08 11:08 [PATCH 00/11] x86/entry: Clean up entry code Lai Jiangshan
` (5 preceding siblings ...)
2021-12-08 11:08 ` [PATCH 06/11] x86/entry: Don't call error_entry for XENPV Lai Jiangshan
@ 2021-12-08 11:08 ` Lai Jiangshan
2021-12-08 11:08 ` [PATCH 08/11] x86/entry: Use idtentry macro for entry_INT80_compat Lai Jiangshan
` (3 subsequent siblings)
10 siblings, 0 replies; 13+ messages in thread
From: Lai Jiangshan @ 2021-12-08 11:08 UTC (permalink / raw)
To: linux-kernel
Cc: x86, xen-devel, Lai Jiangshan, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin
From: Lai Jiangshan <laijs@linux.alibaba.com>
XENPV doesn't use error_entry() anymore, so the pv-aware SWAPGS can be
changed to native swapgs.
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
arch/x86/entry/entry_64.S | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 383070aa9272..6100007fd03f 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -990,7 +990,7 @@ SYM_CODE_START_LOCAL(error_entry)
* We entered from user mode or we're pretending to have entered
* from user mode due to an IRET fault.
*/
- SWAPGS
+ swapgs
FENCE_SWAPGS_USER_ENTRY
/* We have user CR3. Change to kernel CR3. */
SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
@@ -1022,7 +1022,7 @@ SYM_CODE_START_LOCAL(error_entry)
* gsbase and proceed. We'll fix up the exception and land in
* .Lgs_change's error handler with kernel gsbase.
*/
- SWAPGS
+ swapgs
/*
* Issue an LFENCE to prevent GS speculation, regardless of whether it is a
@@ -1043,7 +1043,7 @@ SYM_CODE_START_LOCAL(error_entry)
* We came from an IRET to user mode, so we have user
* gsbase and CR3. Switch to kernel gsbase and CR3:
*/
- SWAPGS
+ swapgs
FENCE_SWAPGS_USER_ENTRY
SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
--
2.19.1.6.gb485710b
* [PATCH 08/11] x86/entry: Use idtentry macro for entry_INT80_compat
2021-12-08 11:08 [PATCH 00/11] x86/entry: Clean up entry code Lai Jiangshan
` (6 preceding siblings ...)
2021-12-08 11:08 ` [PATCH 07/11] x86/entry: Convert SWAPGS to swapgs in error_entry() Lai Jiangshan
@ 2021-12-08 11:08 ` Lai Jiangshan
2021-12-08 11:08 ` [PATCH 09/11] x86/entry: Convert SWAPGS to swapgs in entry_SYSENTER_compat() Lai Jiangshan
` (2 subsequent siblings)
10 siblings, 0 replies; 13+ messages in thread
From: Lai Jiangshan @ 2021-12-08 11:08 UTC (permalink / raw)
To: linux-kernel
Cc: x86, xen-devel, Lai Jiangshan, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin,
Juergen Gross, Peter Zijlstra (Intel),
Joerg Roedel, Chang S. Bae, Jan Kiszka
From: Lai Jiangshan <laijs@linux.alibaba.com>
entry_INT80_compat is identical to the idtentry macro except for some
special handling of %rax in the prolog.
Add that prolog handling to idtentry and use idtentry for
entry_INT80_compat.
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
arch/x86/entry/entry_64.S | 18 ++++++
arch/x86/entry/entry_64_compat.S | 102 -------------------------------
arch/x86/include/asm/idtentry.h | 47 ++++++++++++++
arch/x86/include/asm/proto.h | 4 --
4 files changed, 65 insertions(+), 106 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 6100007fd03f..1fd5efaa99b3 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -371,6 +371,24 @@ SYM_CODE_START(\asmsym)
pushq $-1 /* ORIG_RAX: no syscall to restart */
.endif
+ .if \vector == IA32_SYSCALL_VECTOR
+ /*
+ * User tracing code (ptrace or signal handlers) might assume
+ * that the saved RAX contains a 32-bit number when we're
+ * invoking a 32-bit syscall. Just in case the high bits are
+ * nonzero, zero-extend the syscall number. (This could almost
+ * certainly be deleted with no ill effects.)
+ */
+ movl %eax, %eax
+
+ /*
+ * do_int80_syscall_32() expects regs->orig_ax to be user ax,
+ * and regs->ax to be $-ENOSYS.
+ */
+ movq %rax, (%rsp)
+ movq $-ENOSYS, %rax
+ .endif
+
.if \vector == X86_TRAP_BP
/*
* If coming from kernel space, create a 6-word gap to allow the
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index 0051cf5c792d..a4fcea0cab14 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -311,105 +311,3 @@ sysret32_from_system_call:
swapgs
sysretl
SYM_CODE_END(entry_SYSCALL_compat)
-
-/*
- * 32-bit legacy system call entry.
- *
- * 32-bit x86 Linux system calls traditionally used the INT $0x80
- * instruction. INT $0x80 lands here.
- *
- * This entry point can be used by 32-bit and 64-bit programs to perform
- * 32-bit system calls. Instances of INT $0x80 can be found inline in
- * various programs and libraries. It is also used by the vDSO's
- * __kernel_vsyscall fallback for hardware that doesn't support a faster
- * entry method. Restarted 32-bit system calls also fall back to INT
- * $0x80 regardless of what instruction was originally used to do the
- * system call.
- *
- * This is considered a slow path. It is not used by most libc
- * implementations on modern hardware except during process startup.
- *
- * Arguments:
- * eax system call number
- * ebx arg1
- * ecx arg2
- * edx arg3
- * esi arg4
- * edi arg5
- * ebp arg6
- */
-SYM_CODE_START(entry_INT80_compat)
- UNWIND_HINT_EMPTY
- /*
- * Interrupts are off on entry.
- */
- ASM_CLAC /* Do this early to minimize exposure */
- SWAPGS
-
- /*
- * User tracing code (ptrace or signal handlers) might assume that
- * the saved RAX contains a 32-bit number when we're invoking a 32-bit
- * syscall. Just in case the high bits are nonzero, zero-extend
- * the syscall number. (This could almost certainly be deleted
- * with no ill effects.)
- */
- movl %eax, %eax
-
- /* switch to thread stack expects orig_ax and rdi to be pushed */
- pushq %rax /* pt_regs->orig_ax */
- pushq %rdi /* pt_regs->di */
-
- /* Need to switch before accessing the thread stack. */
- SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
-
- /* In the Xen PV case we already run on the thread stack. */
- ALTERNATIVE "", "jmp .Lint80_keep_stack", X86_FEATURE_XENPV
-
- movq %rsp, %rdi
- movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
-
- pushq 6*8(%rdi) /* regs->ss */
- pushq 5*8(%rdi) /* regs->rsp */
- pushq 4*8(%rdi) /* regs->eflags */
- pushq 3*8(%rdi) /* regs->cs */
- pushq 2*8(%rdi) /* regs->ip */
- pushq 1*8(%rdi) /* regs->orig_ax */
- pushq (%rdi) /* pt_regs->di */
-.Lint80_keep_stack:
-
- pushq %rsi /* pt_regs->si */
- xorl %esi, %esi /* nospec si */
- pushq %rdx /* pt_regs->dx */
- xorl %edx, %edx /* nospec dx */
- pushq %rcx /* pt_regs->cx */
- xorl %ecx, %ecx /* nospec cx */
- pushq $-ENOSYS /* pt_regs->ax */
- pushq %r8 /* pt_regs->r8 */
- xorl %r8d, %r8d /* nospec r8 */
- pushq %r9 /* pt_regs->r9 */
- xorl %r9d, %r9d /* nospec r9 */
- pushq %r10 /* pt_regs->r10*/
- xorl %r10d, %r10d /* nospec r10 */
- pushq %r11 /* pt_regs->r11 */
- xorl %r11d, %r11d /* nospec r11 */
- pushq %rbx /* pt_regs->rbx */
- xorl %ebx, %ebx /* nospec rbx */
- pushq %rbp /* pt_regs->rbp */
- xorl %ebp, %ebp /* nospec rbp */
- pushq %r12 /* pt_regs->r12 */
- xorl %r12d, %r12d /* nospec r12 */
- pushq %r13 /* pt_regs->r13 */
- xorl %r13d, %r13d /* nospec r13 */
- pushq %r14 /* pt_regs->r14 */
- xorl %r14d, %r14d /* nospec r14 */
- pushq %r15 /* pt_regs->r15 */
- xorl %r15d, %r15d /* nospec r15 */
-
- UNWIND_HINT_REGS
-
- cld
-
- movq %rsp, %rdi
- call do_int80_syscall_32
- jmp swapgs_restore_regs_and_return_to_usermode
-SYM_CODE_END(entry_INT80_compat)
diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 1345088e9902..38cb2e0dc2c7 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -204,6 +204,20 @@ __visible noinstr void func(struct pt_regs *regs, \
\
static noinline void __##func(struct pt_regs *regs, u32 vector)
+/**
+ * DECLARE_IDTENTRY_IA32_EMULATION - Declare functions for int80
+ * @vector: Vector number (ignored for C)
+ * @asm_func: Function name of the entry point
+ * @cfunc: The C handler called from the ASM entry point (ignored for C)
+ *
+ * Declares two functions:
+ * - The ASM entry point: asm_func
+ * - The XEN PV trap entry point: xen_##asm_func (maybe unused)
+ */
+#define DECLARE_IDTENTRY_IA32_EMULATION(vector, asm_func, cfunc) \
+ asmlinkage void asm_func(void); \
+ asmlinkage void xen_##asm_func(void)
+
/**
* DECLARE_IDTENTRY_SYSVEC - Declare functions for system vector entry points
* @vector: Vector number (ignored for C)
@@ -430,6 +444,35 @@ __visible noinstr void func(struct pt_regs *regs, \
#define DECLARE_IDTENTRY_ERRORCODE(vector, func) \
idtentry vector asm_##func func has_error_code=1
+/*
+ * 32-bit legacy system call entry.
+ *
+ * 32-bit x86 Linux system calls traditionally used the INT $0x80
+ * instruction. INT $0x80 lands here.
+ *
+ * This entry point can be used by 32-bit and 64-bit programs to perform
+ * 32-bit system calls. Instances of INT $0x80 can be found inline in
+ * various programs and libraries. It is also used by the vDSO's
+ * __kernel_vsyscall fallback for hardware that doesn't support a faster
+ * entry method. Restarted 32-bit system calls also fall back to INT
+ * $0x80 regardless of what instruction was originally used to do the
+ * system call.
+ *
+ * This is considered a slow path. It is not used by most libc
+ * implementations on modern hardware except during process startup.
+ *
+ * Arguments:
+ * eax system call number
+ * ebx arg1
+ * ecx arg2
+ * edx arg3
+ * esi arg4
+ * edi arg5
+ * ebp arg6
+ */
+#define DECLARE_IDTENTRY_IA32_EMULATION(vector, asm_func, cfunc) \
+ idtentry vector asm_func cfunc has_error_code=0
+
/* Special case for 32bit IRET 'trap'. Do not emit ASM code */
#define DECLARE_IDTENTRY_SW(vector, func)
@@ -631,6 +674,10 @@ DECLARE_IDTENTRY_IRQ(X86_TRAP_OTHER, common_interrupt);
DECLARE_IDTENTRY_IRQ(X86_TRAP_OTHER, spurious_interrupt);
#endif
+#ifdef CONFIG_IA32_EMULATION
+DECLARE_IDTENTRY_IA32_EMULATION(IA32_SYSCALL_VECTOR, entry_INT80_compat, do_int80_syscall_32);
+#endif
+
/* System vector entry points */
#ifdef CONFIG_X86_LOCAL_APIC
DECLARE_IDTENTRY_SYSVEC(ERROR_APIC_VECTOR, sysvec_error_interrupt);
diff --git a/arch/x86/include/asm/proto.h b/arch/x86/include/asm/proto.h
index feed36d44d04..c4d331fe65ff 100644
--- a/arch/x86/include/asm/proto.h
+++ b/arch/x86/include/asm/proto.h
@@ -28,10 +28,6 @@ void entry_SYSENTER_compat(void);
void __end_entry_SYSENTER_compat(void);
void entry_SYSCALL_compat(void);
void entry_SYSCALL_compat_safe_stack(void);
-void entry_INT80_compat(void);
-#ifdef CONFIG_XEN_PV
-void xen_entry_INT80_compat(void);
-#endif
#endif
void x86_configure_nx(void);
--
2.19.1.6.gb485710b
* [PATCH 09/11] x86/entry: Convert SWAPGS to swapgs in entry_SYSENTER_compat()
2021-12-08 11:08 [PATCH 00/11] x86/entry: Clean up entry code Lai Jiangshan
` (7 preceding siblings ...)
2021-12-08 11:08 ` [PATCH 08/11] x86/entry: Use idtentry macro for entry_INT80_compat Lai Jiangshan
@ 2021-12-08 11:08 ` Lai Jiangshan
2021-12-08 11:08 ` [PATCH 10/11] x86: Remove the definition of SWAPGS Lai Jiangshan
2021-12-08 11:08 ` [PATCH 11/11] x86/entry: Remove the branch in sync_regs() Lai Jiangshan
10 siblings, 0 replies; 13+ messages in thread
From: Lai Jiangshan @ 2021-12-08 11:08 UTC (permalink / raw)
To: linux-kernel
Cc: x86, xen-devel, Lai Jiangshan, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin
From: Lai Jiangshan <laijs@linux.alibaba.com>
XENPV has its own entry point for SYSENTER and doesn't use
entry_SYSENTER_compat, so the pv-aware SWAPGS can be changed to native
swapgs.
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
arch/x86/entry/entry_64_compat.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index a4fcea0cab14..72e017c3941f 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -49,7 +49,7 @@
SYM_CODE_START(entry_SYSENTER_compat)
UNWIND_HINT_EMPTY
/* Interrupts are off on entry. */
- SWAPGS
+ swapgs
pushq %rax
SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
--
2.19.1.6.gb485710b
* [PATCH 10/11] x86: Remove the definition of SWAPGS
2021-12-08 11:08 [PATCH 00/11] x86/entry: Clean up entry code Lai Jiangshan
` (8 preceding siblings ...)
2021-12-08 11:08 ` [PATCH 09/11] x86/entry: Convert SWAPGS to swapgs in entry_SYSENTER_compat() Lai Jiangshan
@ 2021-12-08 11:08 ` Lai Jiangshan
2021-12-08 11:08 ` [PATCH 11/11] x86/entry: Remove the branch in sync_regs() Lai Jiangshan
10 siblings, 0 replies; 13+ messages in thread
From: Lai Jiangshan @ 2021-12-08 11:08 UTC (permalink / raw)
To: linux-kernel
Cc: x86, xen-devel, Lai Jiangshan, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, H. Peter Anvin, Juergen Gross,
Peter Zijlstra (Intel),
Kirill A. Shutemov
From: Lai Jiangshan <laijs@linux.alibaba.com>
There is no user of the pv-aware SWAPGS anymore.
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
arch/x86/include/asm/irqflags.h | 2 --
1 file changed, 2 deletions(-)
diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
index 87761396e8cc..ac2e4cc47210 100644
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -140,13 +140,11 @@ static __always_inline void arch_local_irq_restore(unsigned long flags)
#else
#ifdef CONFIG_X86_64
#ifdef CONFIG_XEN_PV
-#define SWAPGS ALTERNATIVE "swapgs", "", X86_FEATURE_XENPV
#define INTERRUPT_RETURN \
ANNOTATE_RETPOLINE_SAFE; \
ALTERNATIVE_TERNARY("jmp *paravirt_iret(%rip);", \
X86_FEATURE_XENPV, "jmp xen_iret;", "jmp native_iret;")
#else
-#define SWAPGS swapgs
#define INTERRUPT_RETURN jmp native_iret
#endif
#endif
--
2.19.1.6.gb485710b
* [PATCH 11/11] x86/entry: Remove the branch in sync_regs()
2021-12-08 11:08 [PATCH 00/11] x86/entry: Clean up entry code Lai Jiangshan
` (9 preceding siblings ...)
2021-12-08 11:08 ` [PATCH 10/11] x86: Remove the definition of SWAPGS Lai Jiangshan
@ 2021-12-08 11:08 ` Lai Jiangshan
10 siblings, 0 replies; 13+ messages in thread
From: Lai Jiangshan @ 2021-12-08 11:08 UTC (permalink / raw)
To: linux-kernel
Cc: x86, xen-devel, Lai Jiangshan, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, H. Peter Anvin, Peter Zijlstra,
Joerg Roedel, Chang S. Bae
From: Lai Jiangshan <laijs@linux.alibaba.com>
On non-XENPV, sp0 is the trampoline stack. Since error_entry() is not
called on XENPV, sync_regs() is only reached on non-XENPV, so the entry
stack must be either the trampoline stack or one of the IST stacks; it
can never already be the thread stack, and the regs != eregs check in
sync_regs() is unneeded.
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
arch/x86/kernel/traps.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index a6c0bc9ee36f..63cdb7aedd67 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -708,13 +708,15 @@ DEFINE_IDTENTRY_RAW(exc_int3)
/*
* Help handler running on a per-cpu (IST or entry trampoline) stack
* to switch to the normal thread stack if the interrupted code was in
- * user mode. The actual stack switch is done in entry_64.S
+ * user mode. The actual stack switch is done in entry_64.S. It is not
+ * called from XENPV and the caller has ensured it is not on the thread
+ * stack.
*/
asmlinkage __visible noinstr struct pt_regs *sync_regs(struct pt_regs *eregs)
{
struct pt_regs *regs = (struct pt_regs *)this_cpu_read(cpu_current_top_of_stack) - 1;
- if (regs != eregs)
- *regs = *eregs;
+
+ *regs = *eregs;
return regs;
}
--
2.19.1.6.gb485710b
* Re: [PATCH 01/11] x86/entry: Use swapgs and native_iret directly in swapgs_restore_regs_and_return_to_usermode
2021-12-08 11:08 ` [PATCH 01/11] x86/entry: Use swapgs and native_iret directly in swapgs_restore_regs_and_return_to_usermode Lai Jiangshan
@ 2021-12-13 12:13 ` Juergen Gross
0 siblings, 0 replies; 13+ messages in thread
From: Juergen Gross @ 2021-12-13 12:13 UTC (permalink / raw)
To: Lai Jiangshan, linux-kernel
Cc: x86, xen-devel, Lai Jiangshan, Andy Lutomirski, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin
On 08.12.21 12:08, Lai Jiangshan wrote:
> From: Lai Jiangshan <laijs@linux.alibaba.com>
>
> swapgs_restore_regs_and_return_to_usermode() is now used only in native
> (non-XENPV) code, so it doesn't need the PV-aware SWAPGS and
> INTERRUPT_RETURN.
>
> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Juergen
Thread overview: 13+ messages
2021-12-08 11:08 [PATCH 00/11] x86/entry: Clean up entry code Lai Jiangshan
2021-12-08 11:08 ` [PATCH 01/11] x86/entry: Use swapgs and native_iret directly in swapgs_restore_regs_and_return_to_usermode Lai Jiangshan
2021-12-13 12:13 ` Juergen Gross
2021-12-08 11:08 ` [PATCH 02/11] x86/traps: Move pt_regs only in fixup_bad_iret() Lai Jiangshan
2021-12-08 11:08 ` [PATCH 03/11] x86/entry: Switch the stack after error_entry() returns Lai Jiangshan
2021-12-08 11:08 ` [PATCH 04/11] x86/entry: move PUSH_AND_CLEAR_REGS out of error_entry Lai Jiangshan
2021-12-08 11:08 ` [PATCH 05/11] x86/entry: Move cld to the start of idtentry Lai Jiangshan
2021-12-08 11:08 ` [PATCH 06/11] x86/entry: Don't call error_entry for XENPV Lai Jiangshan
2021-12-08 11:08 ` [PATCH 07/11] x86/entry: Convert SWAPGS to swapgs in error_entry() Lai Jiangshan
2021-12-08 11:08 ` [PATCH 08/11] x86/entry: Use idtentry macro for entry_INT80_compat Lai Jiangshan
2021-12-08 11:08 ` [PATCH 09/11] x86/entry: Convert SWAPGS to swapgs in entry_SYSENTER_compat() Lai Jiangshan
2021-12-08 11:08 ` [PATCH 10/11] x86: Remove the definition of SWAPGS Lai Jiangshan
2021-12-08 11:08 ` [PATCH 11/11] x86/entry: Remove the branch in sync_regs() Lai Jiangshan