* [PATCH V5 1/7] x86/traps: Move pt_regs only in fixup_bad_iret()
2022-04-12 12:15 [PATCH V5 0/7] x86/entry: Clean up entry code Lai Jiangshan
@ 2022-04-12 12:15 ` Lai Jiangshan
2022-04-12 12:15 ` [PATCH V5 2/7] x86/entry: Switch the stack after error_entry() returns Lai Jiangshan
` (5 subsequent siblings)
6 siblings, 0 replies; 16+ messages in thread
From: Lai Jiangshan @ 2022-04-12 12:15 UTC (permalink / raw)
To: linux-kernel
Cc: Borislav Petkov, Peter Zijlstra, Josh Poimboeuf, Andy Lutomirski,
Thomas Gleixner, x86, Lai Jiangshan, Ingo Molnar, Dave Hansen,
H. Peter Anvin, Kirill A. Shutemov, Fenghua Yu, Chang S. Bae
From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Always stash the address error_entry() is going to return to in %r12,
and get rid of the void *error_entry_ret; slot in struct bad_iret_stack,
which was supposed to account for it alongside the pt_regs pushed on the
stack.
After this, both fixup_bad_iret() and sync_regs() can work on a
struct pt_regs pointer directly.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
---
arch/x86/entry/entry_64.S | 5 ++++-
arch/x86/include/asm/traps.h | 2 +-
arch/x86/kernel/traps.c | 18 ++++++------------
3 files changed, 11 insertions(+), 14 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 4faac48ebec5..e9d896717ab4 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1058,9 +1058,12 @@ SYM_CODE_START_LOCAL(error_entry)
* Pretend that the exception came from user mode: set up pt_regs
* as if we faulted immediately after IRET.
*/
- mov %rsp, %rdi
+ popq %r12 /* save return addr in %r12 */
+ movq %rsp, %rdi /* arg0 = pt_regs pointer */
call fixup_bad_iret
mov %rax, %rsp
+ ENCODE_FRAME_POINTER
+ pushq %r12
jmp .Lerror_entry_from_usermode_after_swapgs
SYM_CODE_END(error_entry)
diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 35317c5c551d..47ecfff2c83d 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -13,7 +13,7 @@
#ifdef CONFIG_X86_64
asmlinkage __visible notrace struct pt_regs *sync_regs(struct pt_regs *eregs);
asmlinkage __visible notrace
-struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s);
+struct pt_regs *fixup_bad_iret(struct pt_regs *bad_regs);
void __init trap_init(void);
asmlinkage __visible noinstr struct pt_regs *vc_switch_off_ist(struct pt_regs *eregs);
#endif
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index a4e2efde5d1f..111b18d57a54 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -898,13 +898,7 @@ asmlinkage __visible noinstr struct pt_regs *vc_switch_off_ist(struct pt_regs *r
}
#endif
-struct bad_iret_stack {
- void *error_entry_ret;
- struct pt_regs regs;
-};
-
-asmlinkage __visible noinstr
-struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s)
+asmlinkage __visible noinstr struct pt_regs *fixup_bad_iret(struct pt_regs *bad_regs)
{
/*
* This is called from entry_64.S early in handling a fault
@@ -914,19 +908,19 @@ struct bad_iret_stack *fixup_bad_iret(struct bad_iret_stack *s)
* just below the IRET frame) and we want to pretend that the
* exception came from the IRET target.
*/
- struct bad_iret_stack tmp, *new_stack =
- (struct bad_iret_stack *)__this_cpu_read(cpu_tss_rw.x86_tss.sp0) - 1;
+ struct pt_regs tmp, *new_stack =
+ (struct pt_regs *)__this_cpu_read(cpu_tss_rw.x86_tss.sp0) - 1;
/* Copy the IRET target to the temporary storage. */
- __memcpy(&tmp.regs.ip, (void *)s->regs.sp, 5*8);
+ __memcpy(&tmp.ip, (void *)bad_regs->sp, 5*8);
/* Copy the remainder of the stack from the current stack. */
- __memcpy(&tmp, s, offsetof(struct bad_iret_stack, regs.ip));
+ __memcpy(&tmp, bad_regs, offsetof(struct pt_regs, ip));
/* Update the entry stack */
__memcpy(new_stack, &tmp, sizeof(tmp));
- BUG_ON(!user_mode(&new_stack->regs));
+ BUG_ON(!user_mode(new_stack));
return new_stack;
}
#endif
--
2.19.1.6.gb485710b
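The overlap-safe copy order that fixup_bad_iret() uses above can be sketched as a userspace mock. Everything below is illustrative: struct mock_regs and mock_fixup_bad_iret() are hypothetical, simplified stand-ins for struct pt_regs and the kernel function, not the real definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct pt_regs: a few "general
 * registers" followed by the five-word IRET frame (ip, cs, flags, sp, ss). */
struct mock_regs {
	uint64_t r12, di, ax;		/* "general registers" */
	uint64_t ip, cs, flags, sp, ss;	/* IRET frame */
};

/*
 * Mirror of the copy sequence in fixup_bad_iret(): stage everything in a
 * temporary first (in the kernel the bad-IRET frame and the entry-stack
 * destination may overlap), then write the destination in one go.
 */
static struct mock_regs *mock_fixup_bad_iret(struct mock_regs *bad_regs,
					     struct mock_regs *new_stack)
{
	struct mock_regs tmp;

	/* Copy the IRET target (5 words at bad_regs->sp) to temporary storage. */
	memcpy(&tmp.ip, (void *)(uintptr_t)bad_regs->sp, 5 * 8);
	/* Copy the remainder of the frame from the current stack. */
	memcpy(&tmp, bad_regs, offsetof(struct mock_regs, ip));
	/* Update the "entry stack". */
	memcpy(new_stack, &tmp, sizeof(tmp));
	return new_stack;
}
```

After the patch, both this helper and sync_regs() traffic purely in pt_regs pointers, which is what lets the asm caller keep the return address out of the stack image.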
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH V5 2/7] x86/entry: Switch the stack after error_entry() returns
2022-04-12 12:15 [PATCH V5 0/7] x86/entry: Clean up entry code Lai Jiangshan
2022-04-12 12:15 ` [PATCH V5 1/7] x86/traps: Move pt_regs only in fixup_bad_iret() Lai Jiangshan
@ 2022-04-12 12:15 ` Lai Jiangshan
2022-04-12 12:15 ` [PATCH V5 3/7] x86/entry: Move PUSH_AND_CLEAR_REGS out of error_entry() Lai Jiangshan
` (4 subsequent siblings)
6 siblings, 0 replies; 16+ messages in thread
From: Lai Jiangshan @ 2022-04-12 12:15 UTC (permalink / raw)
To: linux-kernel
Cc: Borislav Petkov, Peter Zijlstra, Josh Poimboeuf, Andy Lutomirski,
Thomas Gleixner, x86, Lai Jiangshan, Ingo Molnar, Dave Hansen,
H. Peter Anvin
From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
error_entry() calls sync_regs() to copy the pt_regs onto the kernel
stack, calling fixup_bad_iret() first if the fault came from a bad IRET,
and switches to the kernel stack directly after sync_regs() returns.
But error_entry() itself is also a function call, so the code has to
stash the address error_entry() is going to return to in %r12, which
complicates the work.
Move the stack-switching code to after error_entry() returns and get
rid of the need to handle the return address.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
---
arch/x86/entry/entry_64.S | 16 ++++++----------
1 file changed, 6 insertions(+), 10 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index e9d896717ab4..e1efc56fbcd4 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -326,6 +326,8 @@ SYM_CODE_END(ret_from_fork)
.macro idtentry_body cfunc has_error_code:req
call error_entry
+ movq %rax, %rsp /* switch to the task stack if from userspace */
+ ENCODE_FRAME_POINTER
UNWIND_HINT_REGS
movq %rsp, %rdi /* pt_regs pointer into 1st argument*/
@@ -999,14 +1001,10 @@ SYM_CODE_START_LOCAL(error_entry)
/* We have user CR3. Change to kernel CR3. */
SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
+ leaq 8(%rsp), %rdi /* arg0 = pt_regs pointer */
.Lerror_entry_from_usermode_after_swapgs:
/* Put us onto the real thread stack. */
- popq %r12 /* save return addr in %r12 */
- movq %rsp, %rdi /* arg0 = pt_regs pointer */
call sync_regs
- movq %rax, %rsp /* switch stack */
- ENCODE_FRAME_POINTER
- pushq %r12
RET
/*
@@ -1038,6 +1036,7 @@ SYM_CODE_START_LOCAL(error_entry)
*/
.Lerror_entry_done_lfence:
FENCE_SWAPGS_KERNEL_ENTRY
+ leaq 8(%rsp), %rax /* return pt_regs pointer */
RET
.Lbstep_iret:
@@ -1058,12 +1057,9 @@ SYM_CODE_START_LOCAL(error_entry)
* Pretend that the exception came from user mode: set up pt_regs
* as if we faulted immediately after IRET.
*/
- popq %r12 /* save return addr in %r12 */
- movq %rsp, %rdi /* arg0 = pt_regs pointer */
+ leaq 8(%rsp), %rdi /* arg0 = pt_regs pointer */
call fixup_bad_iret
- mov %rax, %rsp
- ENCODE_FRAME_POINTER
- pushq %r12
+ mov %rax, %rdi
jmp .Lerror_entry_from_usermode_after_swapgs
SYM_CODE_END(error_entry)
--
2.19.1.6.gb485710b
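After this change, sync_regs() only returns the pt_regs pointer on the task stack and the asm caller performs the movq %rax, %rsp itself. A userspace sketch of that contract (struct mock_regs is a hypothetical, simplified stand-in for struct pt_regs):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified pt_regs stand-in. */
struct mock_regs { unsigned long ax, sp, ip; };

/*
 * Sketch of the sync_regs() contract after this patch: the C helper
 * only copies the register frame to the top of the task stack and
 * returns the new pointer; the asm caller does the actual stack switch
 * itself, so no return address needs to be stashed around the call.
 */
static struct mock_regs *mock_sync_regs(struct mock_regs *eregs,
					struct mock_regs *task_stack_top)
{
	struct mock_regs *regs = task_stack_top - 1;	/* room for one frame */

	if (regs != eregs)
		memcpy(regs, eregs, sizeof(*regs));
	return regs;
}
```

The same shape is what allows fixup_bad_iret() to simply hand its result to sync_regs() as the next argument, as the diff above does with %rax/%rdi.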
* [PATCH V5 3/7] x86/entry: Move PUSH_AND_CLEAR_REGS out of error_entry()
2022-04-12 12:15 [PATCH V5 0/7] x86/entry: Clean up entry code Lai Jiangshan
2022-04-12 12:15 ` [PATCH V5 1/7] x86/traps: Move pt_regs only in fixup_bad_iret() Lai Jiangshan
2022-04-12 12:15 ` [PATCH V5 2/7] x86/entry: Switch the stack after error_entry() returns Lai Jiangshan
@ 2022-04-12 12:15 ` Lai Jiangshan
2022-04-12 13:26 ` Borislav Petkov
2022-04-12 12:15 ` [PATCH V5 4/7] x86/entry: Move cld to the start of idtentry macro Lai Jiangshan
` (3 subsequent siblings)
6 siblings, 1 reply; 16+ messages in thread
From: Lai Jiangshan @ 2022-04-12 12:15 UTC (permalink / raw)
To: linux-kernel
Cc: Borislav Petkov, Peter Zijlstra, Josh Poimboeuf, Andy Lutomirski,
Thomas Gleixner, x86, Lai Jiangshan, Ingo Molnar, Dave Hansen,
H. Peter Anvin
From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
error_entry() doesn't handle the stack balanced. It includes
PUSH_AND_CLEAR_REGS which is commonly needed for all IDT entries and
can't pop the regs before it returns.
Move PUSH_AND_CLEAR_REGS out of error_entry() and make error_entry()
works on the stack normally.
After this, XENPV doesn't need error_entry() since PUSH_AND_CLEAR_REGS
is moved out and error_entry() can be converted to C code in future
since it doesn't fiddle the stack.
This enlarges the text size:
size arch/x86/entry/entry_64.o.before:
text data bss dec hex filename
17916 384 0 18300 477c arch/x86/entry/entry_64.o
size --format=SysV arch/x86/entry/entry_64.o.before:
.entry.text 5528 0
.orc_unwind 6456 0
.orc_unwind_ip 4304 0
size arch/x86/entry/entry_64.o.after:
text data bss dec hex filename
26868 384 0 27252 6a74 arch/x86/entry/entry_64.o
size --format=SysV arch/x86/entry/entry_64.o.after:
.entry.text 8200 0
.orc_unwind 10224 0
.orc_unwind_ip 6816 0
The .orc_unwind[_ip] tables are enlarged because many pushes are added.
But .entry.text on x86_64 is 2M-aligned, so enlarging it to 8.2k does
not enlarge the final mapped text size.
The cache footprint only increases in the unlikely case that many
different interrupts and exceptions happen heavily at the same time.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
---
arch/x86/entry/entry_64.S | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index e1efc56fbcd4..835b798556fb 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -325,6 +325,9 @@ SYM_CODE_END(ret_from_fork)
*/
.macro idtentry_body cfunc has_error_code:req
+ PUSH_AND_CLEAR_REGS
+ ENCODE_FRAME_POINTER
+
call error_entry
movq %rax, %rsp /* switch to the task stack if from userspace */
ENCODE_FRAME_POINTER
@@ -987,8 +990,6 @@ SYM_CODE_END(paranoid_exit)
SYM_CODE_START_LOCAL(error_entry)
UNWIND_HINT_FUNC
cld
- PUSH_AND_CLEAR_REGS save_ret=1
- ENCODE_FRAME_POINTER 8
testb $3, CS+8(%rsp)
jz .Lerror_kernelspace
--
2.19.1.6.gb485710b
* Re: [PATCH V5 3/7] x86/entry: Move PUSH_AND_CLEAR_REGS out of error_entry()
2022-04-12 12:15 ` [PATCH V5 3/7] x86/entry: Move PUSH_AND_CLEAR_REGS out of error_entry() Lai Jiangshan
@ 2022-04-12 13:26 ` Borislav Petkov
2022-04-12 13:52 ` Lai Jiangshan
0 siblings, 1 reply; 16+ messages in thread
From: Borislav Petkov @ 2022-04-12 13:26 UTC (permalink / raw)
To: Lai Jiangshan
Cc: linux-kernel, Peter Zijlstra, Josh Poimboeuf, Andy Lutomirski,
Thomas Gleixner, x86, Lai Jiangshan, Ingo Molnar, Dave Hansen,
H. Peter Anvin
On Tue, Apr 12, 2022 at 08:15:37PM +0800, Lai Jiangshan wrote:
> From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
>
> error_entry() doesn't handle the stack balanced.
What does that mean?
> It includes
> PUSH_AND_CLEAR_REGS which is commonly needed for all IDT entries and
> can't pop the regs before it returns.
>
> Move PUSH_AND_CLEAR_REGS out of error_entry() and make error_entry()
> works on the stack normally.
>
> After this, XENPV doesn't need error_entry() since PUSH_AND_CLEAR_REGS
> is moved out and error_entry() can be converted to C code in future
> since it doesn't fiddle the stack.
This is not a justification for this size increase:
text data bss dec hex filename
16060616 128131358 36384888 180576862 ac3625e vmlinux.before
16065626 128131358 36380792 180577776 ac365f0 vmlinux.after
~5K text increase already with my tailored config.
You can have a asm_error_entry(), written in asm, which does the regs
pushing and which calls error_entry() - the latter being the C version.
And no need for the size increase.
For example.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH V5 3/7] x86/entry: Move PUSH_AND_CLEAR_REGS out of error_entry()
2022-04-12 13:26 ` Borislav Petkov
@ 2022-04-12 13:52 ` Lai Jiangshan
2022-04-12 14:30 ` Borislav Petkov
0 siblings, 1 reply; 16+ messages in thread
From: Lai Jiangshan @ 2022-04-12 13:52 UTC (permalink / raw)
To: Borislav Petkov
Cc: LKML, Peter Zijlstra, Josh Poimboeuf, Andy Lutomirski,
Thomas Gleixner, X86 ML, Lai Jiangshan, Ingo Molnar, Dave Hansen,
H. Peter Anvin
On Tue, Apr 12, 2022 at 9:26 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Tue, Apr 12, 2022 at 08:15:37PM +0800, Lai Jiangshan wrote:
> > From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
> >
> > error_entry() doesn't handle the stack balanced.
>
> What does that mean?
For a normal function, e.g. a function compiled from C, the stack
pointer returns to its original place when the function returns: the
amounts of memory pushed and popped within the function are the same.
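That "balanced" property can be illustrated with a toy push/pop model (purely illustrative C, not kernel code):

```c
#include <assert.h>

/* Toy stack machine: a routine is "balanced" when it pops exactly as
 * many words as it pushes, so the stack pointer is unchanged across
 * the call. */
struct toy_stack { unsigned long slots[64]; int sp; };

static void push(struct toy_stack *s, unsigned long v) { s->slots[s->sp++] = v; }
static unsigned long pop(struct toy_stack *s) { return s->slots[--s->sp]; }

/* Balanced: net stack movement is zero. */
static void balanced_routine(struct toy_stack *s)
{
	push(s, 1); push(s, 2);
	pop(s); pop(s);
}

/* Unbalanced, like error_entry() before the patch: the pushed register
 * frame is deliberately left behind on the stack when it returns. */
static void unbalanced_routine(struct toy_stack *s)
{
	push(s, 1); push(s, 2);	/* "PUSH_AND_CLEAR_REGS" stays behind */
}
```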
>
> > It includes
> > PUSH_AND_CLEAR_REGS which is commonly needed for all IDT entries and
> > can't pop the regs before it returns.
> >
> > Move PUSH_AND_CLEAR_REGS out of error_entry() and make error_entry()
> > works on the stack normally.
> >
> > After this, XENPV doesn't need error_entry() since PUSH_AND_CLEAR_REGS
> > is moved out and error_entry() can be converted to C code in future
> > since it doesn't fiddle the stack.
>
> This is not a justification for this size increase:
>
> text data bss dec hex filename
> 16060616 128131358 36384888 180576862 ac3625e vmlinux.before
> 16065626 128131358 36380792 180577776 ac365f0 vmlinux.after
>
> ~5K text increase already with my tailored config.
>
> You can have a asm_error_entry(), written in asm, which does the regs
> pushing and which calls error_entry() - the latter being the C version.
> And no need for the size increase.
The mapped size for the text is always 2M when the kernel is booted
since it is 2M-aligned. So I don't think the size is a concern.
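The rounding behind that claim can be checked directly: with 2 MiB alignment, both the old 5.5k and the new 8.2k .entry.text round up to the same mapped size. A small illustrative userspace check:

```c
#include <assert.h>

/* Round a section size up to its mapping alignment (a 2 MiB huge page). */
#define ALIGN_UP(x, a)	(((x) + (a) - 1) & ~((a) - 1))

static unsigned long mapped_size(unsigned long text_size)
{
	return ALIGN_UP(text_size, 2UL << 20);
}
```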
The only concern is the footprint when different interrupts and
exceptions happen heavily at the same time. In this case, different
copies of PUSH_AND_CLEAR_REGS in the text will be touched.
For example, a heavy page fault and IPI or timer at the same time.
I'm not sure if it is a real case.
I'm Okay with asm_error_entry(). And also we can use an ASM function
containing PUSH_AND_CLEAR_REGS only.
Thanks
Lai
* Re: [PATCH V5 3/7] x86/entry: Move PUSH_AND_CLEAR_REGS out of error_entry()
2022-04-12 13:52 ` Lai Jiangshan
@ 2022-04-12 14:30 ` Borislav Petkov
2022-04-13 3:48 ` Lai Jiangshan
0 siblings, 1 reply; 16+ messages in thread
From: Borislav Petkov @ 2022-04-12 14:30 UTC (permalink / raw)
To: Lai Jiangshan
Cc: LKML, Peter Zijlstra, Josh Poimboeuf, Andy Lutomirski,
Thomas Gleixner, X86 ML, Lai Jiangshan, Ingo Molnar, Dave Hansen,
H. Peter Anvin
On Tue, Apr 12, 2022 at 09:52:44PM +0800, Lai Jiangshan wrote:
> The mapped size for the text is always 2M when the kernel is booted
> since it is 2M-aligned. So I don't think the size is a concern.
This is not how this argumentation works: you add 0.01% size increase
here, 0.01% slowdown there and it all soon adds up into a bloated
kernel. Besides, nowadays the kernel runs a lot as a guest and guest
kernel size does matter.
So no, just because you want to turn error_entry() into C code and for
that you'll add some bloat doesn't justify it in my book. All changes to
core code need to be absolutely needed and justifiable.
> For example, a heavy page fault and IPI or timer at the same time.
> I'm not sure if it is a real case.
I wouldn't be surprised if it shows in some perf profiles.
> I'm Okay with asm_error_entry(). And also we can use an ASM function
> containing PUSH_AND_CLEAR_REGS only.
Just do the necessary and minimal thing - this is core code and in asm
for that matter, so be conservative pls. This is not some random driver.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH V5 3/7] x86/entry: Move PUSH_AND_CLEAR_REGS out of error_entry()
2022-04-12 14:30 ` Borislav Petkov
@ 2022-04-13 3:48 ` Lai Jiangshan
2022-04-13 8:39 ` Borislav Petkov
0 siblings, 1 reply; 16+ messages in thread
From: Lai Jiangshan @ 2022-04-13 3:48 UTC (permalink / raw)
To: Borislav Petkov
Cc: LKML, Peter Zijlstra, Josh Poimboeuf, Andy Lutomirski,
Thomas Gleixner, X86 ML, Lai Jiangshan, Ingo Molnar, Dave Hansen,
H. Peter Anvin
On Tue, Apr 12, 2022 at 10:31 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Tue, Apr 12, 2022 at 09:52:44PM +0800, Lai Jiangshan wrote:
> > The mapped size for the text is always 2M when the kernel is booted
> > since it is 2M-aligned. So I don't think the size is a concern.
>
> This is not how this argumentation works: you add 0.01% size increase
> here, 0.01% slowdown there and it all soon adds up into a bloated
> kernel. Besides, nowadays the kernel runs a lot as a guest and guest
> kernel size does matter.
I agree with you.
I meant the real .entry.text size is unchanged. No matter what
the compiled .entry.text size is, 5.5k before the patch or 8.2k
after the patch, the size of .entry.text in the kernel is always 2M.
But the overall size is enlarged which is a concern.
>
> So no, just because you want to turn error_entry() into C code and for
> that you'll add some bloat doesn't justify it in my book. All changes to
> core code need to be absolutely needed and justifiable.
>
> > For example, a heavy page fault and IPI or timer at the same time.
> > I'm not sure if it is a real case.
>
> I wouldn't be surprised if it shows in some perf profiles.
>
> > I'm Okay with asm_error_entry(). And also we can use an ASM function
> > containing PUSH_AND_CLEAR_REGS only.
>
> Just do the necessary and minimal thing - this is core code and in asm
> for that matter, so be conservative pls. This is not some random driver.
>
I will follow your instructions and add an extra ASM function.
And maybe add this to the changelog:
A function call is added for every interrupt or exception, but it
is a low-level asm function call and actually, only a "call" and a
"ret" instructions are added. It will enable error_entry() to be
C code that can use inlined sync_regs() and save a C function call
which will reduce more instructions in return.
I'm also waiting for reviews for other patches and I'm not going to
send this new code too soon since I don't want to send a new revision
with only this patch updated.
Thanks
Lai
* Re: [PATCH V5 3/7] x86/entry: Move PUSH_AND_CLEAR_REGS out of error_entry()
2022-04-13 3:48 ` Lai Jiangshan
@ 2022-04-13 8:39 ` Borislav Petkov
0 siblings, 0 replies; 16+ messages in thread
From: Borislav Petkov @ 2022-04-13 8:39 UTC (permalink / raw)
To: Lai Jiangshan
Cc: LKML, Peter Zijlstra, Josh Poimboeuf, Andy Lutomirski,
Thomas Gleixner, X86 ML, Lai Jiangshan, Ingo Molnar, Dave Hansen,
H. Peter Anvin
On Wed, Apr 13, 2022 at 11:48:21AM +0800, Lai Jiangshan wrote:
> I meant the real .entry.text size is unchanged. No matter what
> the compiled .entry.text size is, 5.5k before the patch or 8.2k
> after the patch, the size of .entry.text in the kernel is always 2M.
Yes, unless the bloat I'm talking about above continues and then it
becomes 4M. So we need to be careful.
> I will follow your instructions and add an extra ASM function.
Thx.
> And maybe add this to the changelog:
>
> A function call is added for every interrupt or exception, but it
I think you need to read submitting-patches.rst again:
"Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
to do frotz", as if you are giving orders to the codebase to change
its behaviour."
In your case:
"Add a function call..."
> is a low-level asm function call and actually, only a "call" and a
> "ret" instructions are added.
CALL, RET - all x86 insns in caps.
> It will enable error_entry() to be
It will allow for sync_regs() to be converted to C...
> C code that can use inlined sync_regs() and save a C function call
> which will reduce more instructions in return.
>
> I'm also waiting for reviews for other patches and I'm not going to
> send this new code too soon since I don't want to send a new revision
> with only this patch updated.
Again, submitting-patches.rst:
"After you have submitted your change, be patient and wait. Reviewers are
busy people and may not get to your patch right away.
...
You should receive comments within a week or so; if that does not
happen, make sure that you have sent your patches to the right place.
Wait for a minimum of one week before resubmitting or pinging reviewers
- possibly longer during busy times like merge windows"
So you send, wait for feedback, incorporate it and resend after a week,
but not earlier.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* [PATCH V5 4/7] x86/entry: Move cld to the start of idtentry macro
2022-04-12 12:15 [PATCH V5 0/7] x86/entry: Clean up entry code Lai Jiangshan
` (2 preceding siblings ...)
2022-04-12 12:15 ` [PATCH V5 3/7] x86/entry: Move PUSH_AND_CLEAR_REGS out of error_entry() Lai Jiangshan
@ 2022-04-12 12:15 ` Lai Jiangshan
2022-04-12 12:15 ` [PATCH V5 5/7] x86/entry: Don't call error_entry() for XENPV Lai Jiangshan
` (2 subsequent siblings)
6 siblings, 0 replies; 16+ messages in thread
From: Lai Jiangshan @ 2022-04-12 12:15 UTC (permalink / raw)
To: linux-kernel
Cc: Borislav Petkov, Peter Zijlstra, Josh Poimboeuf, Andy Lutomirski,
Thomas Gleixner, x86, Lai Jiangshan, Ingo Molnar, Dave Hansen,
H. Peter Anvin
From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Move it so that it sits right next to CLAC.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
---
arch/x86/entry/entry_64.S | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 835b798556fb..7b6a0f15bb20 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -360,6 +360,7 @@ SYM_CODE_START(\asmsym)
UNWIND_HINT_IRET_REGS offset=\has_error_code*8
ENDBR
ASM_CLAC
+ cld
.if \has_error_code == 0
pushq $-1 /* ORIG_RAX: no syscall to restart */
@@ -428,6 +429,7 @@ SYM_CODE_START(\asmsym)
UNWIND_HINT_IRET_REGS
ENDBR
ASM_CLAC
+ cld
pushq $-1 /* ORIG_RAX: no syscall to restart */
@@ -484,6 +486,7 @@ SYM_CODE_START(\asmsym)
UNWIND_HINT_IRET_REGS
ENDBR
ASM_CLAC
+ cld
/*
* If the entry is from userspace, switch stacks and treat it as
@@ -546,6 +549,7 @@ SYM_CODE_START(\asmsym)
UNWIND_HINT_IRET_REGS offset=8
ENDBR
ASM_CLAC
+ cld
/* paranoid_entry returns GS information for paranoid_exit in EBX. */
call paranoid_entry
@@ -871,7 +875,6 @@ SYM_CODE_END(xen_failsafe_callback)
*/
SYM_CODE_START_LOCAL(paranoid_entry)
UNWIND_HINT_FUNC
- cld
PUSH_AND_CLEAR_REGS save_ret=1
ENCODE_FRAME_POINTER 8
@@ -989,7 +992,6 @@ SYM_CODE_END(paranoid_exit)
*/
SYM_CODE_START_LOCAL(error_entry)
UNWIND_HINT_FUNC
- cld
testb $3, CS+8(%rsp)
jz .Lerror_kernelspace
@@ -1123,6 +1125,7 @@ SYM_CODE_START(asm_exc_nmi)
*/
ASM_CLAC
+ cld
/* Use %rdx as our temp variable throughout */
pushq %rdx
@@ -1142,7 +1145,6 @@ SYM_CODE_START(asm_exc_nmi)
*/
swapgs
- cld
FENCE_SWAPGS_USER_ENTRY
SWITCH_TO_KERNEL_CR3 scratch_reg=%rdx
movq %rsp, %rdx
--
2.19.1.6.gb485710b
* [PATCH V5 5/7] x86/entry: Don't call error_entry() for XENPV
2022-04-12 12:15 [PATCH V5 0/7] x86/entry: Clean up entry code Lai Jiangshan
` (3 preceding siblings ...)
2022-04-12 12:15 ` [PATCH V5 4/7] x86/entry: Move cld to the start of idtentry macro Lai Jiangshan
@ 2022-04-12 12:15 ` Lai Jiangshan
2022-04-20 16:32 ` Borislav Petkov
2022-04-12 12:15 ` [PATCH V5 6/7] x86/entry: Convert SWAPGS to swapgs and remove the definition of SWAPGS Lai Jiangshan
2022-04-12 12:15 ` [PATCH V5 7/7] x86/entry: Use idtentry macro for entry_INT80_compat Lai Jiangshan
6 siblings, 1 reply; 16+ messages in thread
From: Lai Jiangshan @ 2022-04-12 12:15 UTC (permalink / raw)
To: linux-kernel
Cc: Borislav Petkov, Peter Zijlstra, Josh Poimboeuf, Andy Lutomirski,
Thomas Gleixner, x86, Lai Jiangshan, Ingo Molnar, Dave Hansen,
H. Peter Anvin
From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
When in XENPV, it is already in the task stack, and it can't fault for
native_iret() nor native_load_gs_index() since XENPV uses its own pvops
for IRET and load_gs_index(). And it doesn't need to switch the CR3.
So there is no reason to call error_entry() in XENPV.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
---
arch/x86/entry/entry_64.S | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 7b6a0f15bb20..3aca7815fe79 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -328,8 +328,17 @@ SYM_CODE_END(ret_from_fork)
PUSH_AND_CLEAR_REGS
ENCODE_FRAME_POINTER
- call error_entry
- movq %rax, %rsp /* switch to the task stack if from userspace */
+ /*
+ * Call error_entry() and switch to the task stack if from userspace.
+ *
+ * When in XENPV, it is already in the task stack, and it can't fault
+ * for native_iret() nor native_load_gs_index() since XENPV uses its
+ * own pvops for IRET and load_gs_index(). And it doesn't need to
+ * switch the CR3. So it can skip invoking error_entry().
+ */
+ ALTERNATIVE "call error_entry; movq %rax, %rsp", \
+ "", X86_FEATURE_XENPV
+
ENCODE_FRAME_POINTER
UNWIND_HINT_REGS
--
2.19.1.6.gb485710b
* Re: [PATCH V5 5/7] x86/entry: Don't call error_entry() for XENPV
2022-04-12 12:15 ` [PATCH V5 5/7] x86/entry: Don't call error_entry() for XENPV Lai Jiangshan
@ 2022-04-20 16:32 ` Borislav Petkov
0 siblings, 0 replies; 16+ messages in thread
From: Borislav Petkov @ 2022-04-20 16:32 UTC (permalink / raw)
To: Lai Jiangshan, Jürgen Gross
Cc: linux-kernel, Peter Zijlstra, Josh Poimboeuf, Andy Lutomirski,
Thomas Gleixner, x86, Lai Jiangshan, Ingo Molnar, Dave Hansen,
H. Peter Anvin
On Tue, Apr 12, 2022 at 08:15:39PM +0800, Lai Jiangshan wrote:
> From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
>
> When in XENPV, it is already in the task stack, and it can't fault for
> native_iret() nor native_load_gs_index() since XENPV uses its own pvops
> for IRET and load_gs_index(). And it doesn't need to switch the CR3.
>
> So there is no reason to call error_entry() in XENPV.
>
> Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
> ---
> arch/x86/entry/entry_64.S | 13 +++++++++++--
> 1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 7b6a0f15bb20..3aca7815fe79 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -328,8 +328,17 @@ SYM_CODE_END(ret_from_fork)
> PUSH_AND_CLEAR_REGS
> ENCODE_FRAME_POINTER
>
> - call error_entry
> - movq %rax, %rsp /* switch to the task stack if from userspace */
> + /*
> + * Call error_entry() and switch to the task stack if from userspace.
> + *
> + * When in XENPV, it is already in the task stack, and it can't fault
> + * for native_iret() nor native_load_gs_index() since XENPV uses its
> + * own pvops for IRET and load_gs_index(). And it doesn't need to
> + * switch the CR3. So it can skip invoking error_entry().
> + */
> + ALTERNATIVE "call error_entry; movq %rax, %rsp", \
> + "", X86_FEATURE_XENPV
> +
> ENCODE_FRAME_POINTER
> UNWIND_HINT_REGS
>
> --
This one needs Jürgen. CCed.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* [PATCH V5 6/7] x86/entry: Convert SWAPGS to swapgs and remove the definition of SWAPGS
2022-04-12 12:15 [PATCH V5 0/7] x86/entry: Clean up entry code Lai Jiangshan
` (4 preceding siblings ...)
2022-04-12 12:15 ` [PATCH V5 5/7] x86/entry: Don't call error_entry() for XENPV Lai Jiangshan
@ 2022-04-12 12:15 ` Lai Jiangshan
2022-04-12 12:15 ` [PATCH V5 7/7] x86/entry: Use idtentry macro for entry_INT80_compat Lai Jiangshan
6 siblings, 0 replies; 16+ messages in thread
From: Lai Jiangshan @ 2022-04-12 12:15 UTC (permalink / raw)
To: linux-kernel
Cc: Borislav Petkov, Peter Zijlstra, Josh Poimboeuf, Andy Lutomirski,
Thomas Gleixner, x86, Lai Jiangshan, Ingo Molnar, Dave Hansen,
H. Peter Anvin, Juergen Gross, Kirill A. Shutemov
From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
XENPV doesn't use swapgs_restore_regs_and_return_to_usermode(),
error_entry() and entry_SYSENTER_compat().
Change the PV-compatible SWAPGS to the ASM instruction swapgs in these
functions.
Also remove the definition of SWAPGS since there are no users of
SWAPGS anymore.
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
---
arch/x86/entry/entry_64.S | 6 +++---
arch/x86/entry/entry_64_compat.S | 2 +-
arch/x86/include/asm/irqflags.h | 8 --------
3 files changed, 4 insertions(+), 12 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 3aca7815fe79..6611032979d9 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1008,7 +1008,7 @@ SYM_CODE_START_LOCAL(error_entry)
* We entered from user mode or we're pretending to have entered
* from user mode due to an IRET fault.
*/
- SWAPGS
+ swapgs
FENCE_SWAPGS_USER_ENTRY
/* We have user CR3. Change to kernel CR3. */
SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
@@ -1040,7 +1040,7 @@ SYM_CODE_START_LOCAL(error_entry)
* gsbase and proceed. We'll fix up the exception and land in
* .Lgs_change's error handler with kernel gsbase.
*/
- SWAPGS
+ swapgs
/*
* Issue an LFENCE to prevent GS speculation, regardless of whether it is a
@@ -1061,7 +1061,7 @@ SYM_CODE_START_LOCAL(error_entry)
* We came from an IRET to user mode, so we have user
* gsbase and CR3. Switch to kernel gsbase and CR3:
*/
- SWAPGS
+ swapgs
FENCE_SWAPGS_USER_ENTRY
SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index 4fdb007cddbd..c5aeb0819707 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -50,7 +50,7 @@ SYM_CODE_START(entry_SYSENTER_compat)
UNWIND_HINT_EMPTY
ENDBR
/* Interrupts are off on entry. */
- SWAPGS
+ swapgs
pushq %rax
SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
index 111104d1c2cd..7793e52d6237 100644
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -137,14 +137,6 @@ static __always_inline void arch_local_irq_restore(unsigned long flags)
if (!arch_irqs_disabled_flags(flags))
arch_local_irq_enable();
}
-#else
-#ifdef CONFIG_X86_64
-#ifdef CONFIG_XEN_PV
-#define SWAPGS ALTERNATIVE "swapgs", "", X86_FEATURE_XENPV
-#else
-#define SWAPGS swapgs
-#endif
-#endif
#endif /* !__ASSEMBLY__ */
#endif
--
2.19.1.6.gb485710b
* [PATCH V5 7/7] x86/entry: Use idtentry macro for entry_INT80_compat
2022-04-12 12:15 [PATCH V5 0/7] x86/entry: Clean up entry code Lai Jiangshan
` (5 preceding siblings ...)
2022-04-12 12:15 ` [PATCH V5 6/7] x86/entry: Convert SWAPGS to swapgs and remove the definition of SWAPGS Lai Jiangshan
@ 2022-04-12 12:15 ` Lai Jiangshan
2022-04-25 10:24 ` Thomas Gleixner
6 siblings, 1 reply; 16+ messages in thread
From: Lai Jiangshan @ 2022-04-12 12:15 UTC (permalink / raw)
To: linux-kernel
Cc: Borislav Petkov, Peter Zijlstra, Josh Poimboeuf, Andy Lutomirski,
Thomas Gleixner, x86, Lai Jiangshan, Ingo Molnar, Dave Hansen,
H. Peter Anvin, Joerg Roedel, Kirill A. Shutemov, Chang S. Bae,
Kees Cook
From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
entry_INT80_compat is identical to the idtentry macro except for some
special handling of %rax in the prologue.
Add that prologue handling to idtentry and use the idtentry macro for
entry_INT80_compat.
Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
---
arch/x86/entry/entry_64.S | 18 ++++++
arch/x86/entry/entry_64_compat.S | 103 -------------------------------
arch/x86/include/asm/idtentry.h | 47 ++++++++++++++
arch/x86/include/asm/proto.h | 4 --
4 files changed, 65 insertions(+), 107 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 6611032979d9..d49b93f494df 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -375,6 +375,24 @@ SYM_CODE_START(\asmsym)
pushq $-1 /* ORIG_RAX: no syscall to restart */
.endif
+ .if \vector == IA32_SYSCALL_VECTOR
+ /*
+ * User tracing code (ptrace or signal handlers) might assume
+ * that the saved RAX contains a 32-bit number when we're
+ * invoking a 32-bit syscall. Just in case the high bits are
+ * nonzero, zero-extend the syscall number. (This could almost
+ * certainly be deleted with no ill effects.)
+ */
+ movl %eax, %eax
+
+ /*
+ * do_int80_syscall_32() expects regs->orig_ax to be user ax,
+ * and regs->ax to be $-ENOSYS.
+ */
+ movq %rax, (%rsp)
+ movq $-ENOSYS, %rax
+ .endif
+
.if \vector == X86_TRAP_BP
/*
* If coming from kernel space, create a 6-word gap to allow the
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index c5aeb0819707..6866151bbef3 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -315,106 +315,3 @@ sysret32_from_system_call:
swapgs
sysretl
SYM_CODE_END(entry_SYSCALL_compat)
-
-/*
- * 32-bit legacy system call entry.
- *
- * 32-bit x86 Linux system calls traditionally used the INT $0x80
- * instruction. INT $0x80 lands here.
- *
- * This entry point can be used by 32-bit and 64-bit programs to perform
- * 32-bit system calls. Instances of INT $0x80 can be found inline in
- * various programs and libraries. It is also used by the vDSO's
- * __kernel_vsyscall fallback for hardware that doesn't support a faster
- * entry method. Restarted 32-bit system calls also fall back to INT
- * $0x80 regardless of what instruction was originally used to do the
- * system call.
- *
- * This is considered a slow path. It is not used by most libc
- * implementations on modern hardware except during process startup.
- *
- * Arguments:
- * eax system call number
- * ebx arg1
- * ecx arg2
- * edx arg3
- * esi arg4
- * edi arg5
- * ebp arg6
- */
-SYM_CODE_START(entry_INT80_compat)
- UNWIND_HINT_EMPTY
- ENDBR
- /*
- * Interrupts are off on entry.
- */
- ASM_CLAC /* Do this early to minimize exposure */
- SWAPGS
-
- /*
- * User tracing code (ptrace or signal handlers) might assume that
- * the saved RAX contains a 32-bit number when we're invoking a 32-bit
- * syscall. Just in case the high bits are nonzero, zero-extend
- * the syscall number. (This could almost certainly be deleted
- * with no ill effects.)
- */
- movl %eax, %eax
-
- /* switch to thread stack expects orig_ax and rdi to be pushed */
- pushq %rax /* pt_regs->orig_ax */
- pushq %rdi /* pt_regs->di */
-
- /* Need to switch before accessing the thread stack. */
- SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
-
- /* In the Xen PV case we already run on the thread stack. */
- ALTERNATIVE "", "jmp .Lint80_keep_stack", X86_FEATURE_XENPV
-
- movq %rsp, %rdi
- movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
-
- pushq 6*8(%rdi) /* regs->ss */
- pushq 5*8(%rdi) /* regs->rsp */
- pushq 4*8(%rdi) /* regs->eflags */
- pushq 3*8(%rdi) /* regs->cs */
- pushq 2*8(%rdi) /* regs->ip */
- pushq 1*8(%rdi) /* regs->orig_ax */
- pushq (%rdi) /* pt_regs->di */
-.Lint80_keep_stack:
-
- pushq %rsi /* pt_regs->si */
- xorl %esi, %esi /* nospec si */
- pushq %rdx /* pt_regs->dx */
- xorl %edx, %edx /* nospec dx */
- pushq %rcx /* pt_regs->cx */
- xorl %ecx, %ecx /* nospec cx */
- pushq $-ENOSYS /* pt_regs->ax */
- pushq %r8 /* pt_regs->r8 */
- xorl %r8d, %r8d /* nospec r8 */
- pushq %r9 /* pt_regs->r9 */
- xorl %r9d, %r9d /* nospec r9 */
- pushq %r10 /* pt_regs->r10*/
- xorl %r10d, %r10d /* nospec r10 */
- pushq %r11 /* pt_regs->r11 */
- xorl %r11d, %r11d /* nospec r11 */
- pushq %rbx /* pt_regs->rbx */
- xorl %ebx, %ebx /* nospec rbx */
- pushq %rbp /* pt_regs->rbp */
- xorl %ebp, %ebp /* nospec rbp */
- pushq %r12 /* pt_regs->r12 */
- xorl %r12d, %r12d /* nospec r12 */
- pushq %r13 /* pt_regs->r13 */
- xorl %r13d, %r13d /* nospec r13 */
- pushq %r14 /* pt_regs->r14 */
- xorl %r14d, %r14d /* nospec r14 */
- pushq %r15 /* pt_regs->r15 */
- xorl %r15d, %r15d /* nospec r15 */
-
- UNWIND_HINT_REGS
-
- cld
-
- movq %rsp, %rdi
- call do_int80_syscall_32
- jmp swapgs_restore_regs_and_return_to_usermode
-SYM_CODE_END(entry_INT80_compat)
diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 72184b0b2219..5bf8a01d31f3 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -206,6 +206,20 @@ __visible noinstr void func(struct pt_regs *regs, \
\
static noinline void __##func(struct pt_regs *regs, u32 vector)
+/**
+ * DECLARE_IDTENTRY_IA32_EMULATION - Declare functions for int80
+ * @vector: Vector number (ignored for C)
+ * @asm_func: Function name of the entry point
+ * @cfunc: The C handler called from the ASM entry point (ignored for C)
+ *
+ * Declares two functions:
+ * - The ASM entry point: asm_func
+ * - The XEN PV trap entry point: xen_##asm_func (maybe unused)
+ */
+#define DECLARE_IDTENTRY_IA32_EMULATION(vector, asm_func, cfunc) \
+ asmlinkage void asm_func(void); \
+ asmlinkage void xen_##asm_func(void)
+
/**
* DECLARE_IDTENTRY_SYSVEC - Declare functions for system vector entry points
* @vector: Vector number (ignored for C)
@@ -432,6 +446,35 @@ __visible noinstr void func(struct pt_regs *regs, \
#define DECLARE_IDTENTRY_ERRORCODE(vector, func) \
idtentry vector asm_##func func has_error_code=1
+/*
+ * 32-bit legacy system call entry.
+ *
+ * 32-bit x86 Linux system calls traditionally used the INT $0x80
+ * instruction. INT $0x80 lands here.
+ *
+ * This entry point can be used by 32-bit and 64-bit programs to perform
+ * 32-bit system calls. Instances of INT $0x80 can be found inline in
+ * various programs and libraries. It is also used by the vDSO's
+ * __kernel_vsyscall fallback for hardware that doesn't support a faster
+ * entry method. Restarted 32-bit system calls also fall back to INT
+ * $0x80 regardless of what instruction was originally used to do the
+ * system call.
+ *
+ * This is considered a slow path. It is not used by most libc
+ * implementations on modern hardware except during process startup.
+ *
+ * Arguments:
+ * eax system call number
+ * ebx arg1
+ * ecx arg2
+ * edx arg3
+ * esi arg4
+ * edi arg5
+ * ebp arg6
+ */
+#define DECLARE_IDTENTRY_IA32_EMULATION(vector, asm_func, cfunc) \
+ idtentry vector asm_func cfunc has_error_code=0
+
/* Special case for 32bit IRET 'trap'. Do not emit ASM code */
#define DECLARE_IDTENTRY_SW(vector, func)
@@ -642,6 +685,10 @@ DECLARE_IDTENTRY_IRQ(X86_TRAP_OTHER, common_interrupt);
DECLARE_IDTENTRY_IRQ(X86_TRAP_OTHER, spurious_interrupt);
#endif
+#ifdef CONFIG_IA32_EMULATION
+DECLARE_IDTENTRY_IA32_EMULATION(IA32_SYSCALL_VECTOR, entry_INT80_compat, do_int80_syscall_32);
+#endif
+
/* System vector entry points */
#ifdef CONFIG_X86_LOCAL_APIC
DECLARE_IDTENTRY_SYSVEC(ERROR_APIC_VECTOR, sysvec_error_interrupt);
diff --git a/arch/x86/include/asm/proto.h b/arch/x86/include/asm/proto.h
index 0f899c8d7a4e..4700d1650410 100644
--- a/arch/x86/include/asm/proto.h
+++ b/arch/x86/include/asm/proto.h
@@ -28,10 +28,6 @@ void entry_SYSENTER_compat(void);
void __end_entry_SYSENTER_compat(void);
void entry_SYSCALL_compat(void);
void entry_SYSCALL_compat_safe_stack(void);
-void entry_INT80_compat(void);
-#ifdef CONFIG_XEN_PV
-void xen_entry_INT80_compat(void);
-#endif
#endif
void x86_configure_nx(void);
--
2.19.1.6.gb485710b
* Re: [PATCH V5 7/7] x86/entry: Use idtentry macro for entry_INT80_compat
2022-04-12 12:15 ` [PATCH V5 7/7] x86/entry: Use idtentry macro for entry_INT80_compat Lai Jiangshan
@ 2022-04-25 10:24 ` Thomas Gleixner
2022-04-25 13:25 ` Lai Jiangshan
0 siblings, 1 reply; 16+ messages in thread
From: Thomas Gleixner @ 2022-04-25 10:24 UTC (permalink / raw)
To: Lai Jiangshan, linux-kernel
Cc: Borislav Petkov, Peter Zijlstra, Josh Poimboeuf, Andy Lutomirski,
x86, Lai Jiangshan, Ingo Molnar, Dave Hansen, H. Peter Anvin,
Joerg Roedel, Kirill A. Shutemov, Chang S. Bae, Kees Cook
On Tue, Apr 12 2022 at 20:15, Lai Jiangshan wrote:
> From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
>
> entry_INT80_compat is identical to the idtentry macro except for special
> handling of %rax in the prolog.
Seriously?
> - pushq %rsi /* pt_regs->si */
> - xorl %esi, %esi /* nospec si */
esi is not cleared in CLEAR_REGS. So much for identical.
Thanks,
tglx
* Re: [PATCH V5 7/7] x86/entry: Use idtentry macro for entry_INT80_compat
2022-04-25 10:24 ` Thomas Gleixner
@ 2022-04-25 13:25 ` Lai Jiangshan
0 siblings, 0 replies; 16+ messages in thread
From: Lai Jiangshan @ 2022-04-25 13:25 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Borislav Petkov, Peter Zijlstra, Josh Poimboeuf,
Andy Lutomirski, X86 ML, Lai Jiangshan, Ingo Molnar, Dave Hansen,
H. Peter Anvin, Joerg Roedel, Kirill A. Shutemov, Chang S. Bae,
Kees Cook
On Mon, Apr 25, 2022 at 6:24 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Tue, Apr 12 2022 at 20:15, Lai Jiangshan wrote:
> > From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
> >
> > entry_INT80_compat is identical to the idtentry macro except for special
> > handling of %rax in the prolog.
>
> Seriously?
>
> > - pushq %rsi /* pt_regs->si */
> > - xorl %esi, %esi /* nospec si */
>
> esi is not cleared in CLEAR_REGS. So much for identical.
>
>
Hello, Thomas
Thank you for the review.
They (the old entry_INT80_compat() and the new version using the
idtentry macro) are not identical in ASM code.
The idtentry macro pushes %rsi onto the entry stack, copies it from
the entry stack to the kernel stack, and then switches stacks.
The original entry_INT80_compat() is much more straightforward
and efficient: it switches the stack as soon as possible and
then pushes %rsi directly onto the kernel stack.
So they differ in this respect.
I compared the idtentry macro with the original entry_INT80_compat()
to check that the macro covers all the behaviors the INT80 path
has, and that anything the macro does which the INT80 path doesn't
is a no-op (like the handling of bad IRET).
In my view, nothing failed those checks except for
regs->ax and regs->orig_ax.
As for CLEAR_REGS, I have also reviewed it many times. To me, it
is equivalent to clearing all registers, even though it doesn't
clear ax, sp, di, or si. As its comment says:
The lower registers are likely clobbered well before they could
be put to use in a speculative execution gadget.
When CLEAR_REGS is used for the INT80 path, %rsi is overwritten
anyway when syscall_enter_from_user_mode(), which takes two
arguments, is called.
"identical" is overstated. I will change the changelog to say their
behaviors are almost similar and the final result are the same when
the macro idtentry has the prolog.
Thanks
Lai