linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v2 0/5] x86/entry/64: interrupt entry size reduction
@ 2018-02-18 12:31 Dominik Brodowski
  2018-02-18 12:31 ` [RFC PATCH v2 1/5] x86/entry/64: move PUSH_AND_CLEAR_REGS from interrupt macro to helper function Dominik Brodowski
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Dominik Brodowski @ 2018-02-18 12:31 UTC (permalink / raw)
  To: linux-kernel, mingo, x86, brgerst, luto, jpoimboe
  Cc: torvalds, ak, tglx, dan.j.williams

Here is a re-spin of the interrupt entry size reduction patchset, which
applies on top of tip/pti and tries to implement what Linus suggested a
few days ago.[*]

Patch 1/5 provides the most significant cuttings (-3k) and gets us below
the text size of entry_64.o (by about 2k) compared to before I started
meddling with it. This patch is unchanged to v1 and seems to be the most
trivial of this patchset (famous last words), but still requires
stringent review before being applied.

The other four patches still need more discussion on whether each additional
step and the code complexity it adds is really worth it. And stringent
review, of course.

Overall, these patches provide for a sizeable cutting of up to 4.1k compared
to tip/pti:

   text	   data	    bss	    dec	    hex	filename
  19500	      0	      0	  19500	   4c2c	entry_64.o-orig (before 2e3f0098bc45)
  21088	      0	      0	  21088	   5260	entry_64.o-pti (as of ced5d0bf603f)
  18006	      0	      0	  18006	   4656	entry_64.o (patch 1/5)
  16882	      0	      0	  16882	   41f2	entry_64.o (patch 5/5)

[*] http://lkml.kernel.org/r/CA+55aFwLTF3EtaQ4OpDv2UM41J=EU7gfemv=eVq+uQi31-usSg@mail.gmail.com

Changes since v1:
- patch 1 (unchanged)
- patch 2:
	- use leaq 8() instead of mov+addq in ENTER_IRQ_STACK (suggested
	  by Brian Gerst)
	- add UNWIND_HINT_REGS indirect=1 after call to interrupt_helper;
	  update size information accordingly
- patch 3:
	- add cld to switch_to_thread_stack wrapper function
	- improve commit message
	- update size information
- patch 4:
	- add UNWIND_HINT_REGS indirect=1 to the now-removed interrupt
	  macro
- patch 5 (NEW):
	- open-code the DO_SWITCH_TO_THREAD_STACK macro (suggested by
	  Brian Gerst)
	- improve (?) UNWIND hints

Dominik Brodowski (5):
  x86/entry/64: move PUSH_AND_CLEAR_REGS from interrupt macro to helper
    function
  x86/entry/64: move ENTER_IRQ_STACK from interrupt macro to helper
    function
  x86/entry/64: move switch_to_thread_stack to interrupt helper function
  x86/entry/64: remove interrupt macro
  x86/entry/64: open-code switch_to_thread_stack

 arch/x86/entry/entry_64.S        | 105 +++++++++++++++++++++++----------------
 arch/x86/entry/entry_64_compat.S |  19 ++++++-
 2 files changed, 81 insertions(+), 43 deletions(-)

-- 
2.16.1

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [RFC PATCH v2 1/5] x86/entry/64: move PUSH_AND_CLEAR_REGS from interrupt macro to helper function
  2018-02-18 12:31 [RFC PATCH v2 0/5] x86/entry/64: interrupt entry size reduction Dominik Brodowski
@ 2018-02-18 12:31 ` Dominik Brodowski
  2018-02-18 12:31 ` [RFC PATCH v2 2/5] x86/entry/64: move ENTER_IRQ_STACK " Dominik Brodowski
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Dominik Brodowski @ 2018-02-18 12:31 UTC (permalink / raw)
  To: linux-kernel, mingo, x86, brgerst, luto, jpoimboe
  Cc: torvalds, ak, tglx, dan.j.williams

The PUSH_AND_CLEAR_REGS macro is able to insert the GP registers
"above" the original return address. This allows us to move a sizeable
part of the interrupt entry macro to an interrupt entry helper function:

   text	   data	    bss	    dec	    hex	filename
  21088	      0	      0	  21088	   5260	entry_64.o-orig
  18006	      0	      0	  18006	   4656	entry_64.o

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 arch/x86/entry/entry_64.S | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 77edc2390868..5690391cb7c7 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -526,6 +526,15 @@ END(irq_entries_start)
  *
  * Entry runs with interrupts off.
  */
+ENTRY(interrupt_helper)
+	UNWIND_HINT_FUNC
+	cld
+
+	PUSH_AND_CLEAR_REGS save_ret=1
+	ENCODE_FRAME_POINTER 8
+
+	ret
+END(interrupt_helper)
 
 /* 0(%rsp): ~(interrupt number) */
 	.macro interrupt func
@@ -537,8 +546,7 @@ END(irq_entries_start)
 	call	switch_to_thread_stack
 1:
 
-	PUSH_AND_CLEAR_REGS
-	ENCODE_FRAME_POINTER
+	call	interrupt_helper
 
 	testb	$3, CS(%rsp)
 	jz	1f
-- 
2.16.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC PATCH v2 2/5] x86/entry/64: move ENTER_IRQ_STACK from interrupt macro to helper function
  2018-02-18 12:31 [RFC PATCH v2 0/5] x86/entry/64: interrupt entry size reduction Dominik Brodowski
  2018-02-18 12:31 ` [RFC PATCH v2 1/5] x86/entry/64: move PUSH_AND_CLEAR_REGS from interrupt macro to helper function Dominik Brodowski
@ 2018-02-18 12:31 ` Dominik Brodowski
  2018-02-18 12:31 ` [RFC PATCH v2 3/5] x86/entry/64: move switch_to_thread_stack to interrupt " Dominik Brodowski
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Dominik Brodowski @ 2018-02-18 12:31 UTC (permalink / raw)
  To: linux-kernel, mingo, x86, brgerst, luto, jpoimboe
  Cc: torvalds, ak, tglx, dan.j.williams

Moving the switch to IRQ stack from the interrupt macro to the helper
function requires some trickery: All ENTER_IRQ_STACK really cares about
is where the "original" stack -- meaning the GP registers etc. -- is
stored. Therefore, we need to offset the stored RSP value by 8 whenever
ENTER_IRQ_STACK is called from within a function. In such cases, and
after switching to the IRQ stack, we need to push the "original" return
address (i.e. the return address from the call to the interrupt entry
function) to the IRQ stack.

This trickery allows us to carve another .85k from the text size (it
would be more except for the additional unwind hints):

   text	   data	    bss	    dec	    hex	filename
  18006	      0	      0	  18006	   4656	entry_64.o-orig
  17158	      0	      0	  17158	   4306	entry_64.o

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 arch/x86/entry/entry_64.S | 56 ++++++++++++++++++++++++++++++++---------------
 1 file changed, 38 insertions(+), 18 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 5690391cb7c7..93a9bea6b969 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -449,9 +449,19 @@ END(irq_entries_start)
  *
  * The invariant is that, if irq_count != -1, then the IRQ stack is in use.
  */
-.macro ENTER_IRQ_STACK regs=1 old_rsp
+.macro ENTER_IRQ_STACK regs=1 old_rsp save_ret=0
 	DEBUG_ENTRY_ASSERT_IRQS_OFF
+
+	.if \save_ret
+	/*
+	 * If save_ret is set, the original stack contains one additional
+	 * entry -- the return address. Therefore, move the address one
+	 * entry below %rsp to \old_rsp.
+	 */
+	leaq	8(%rsp), \old_rsp
+	.else
 	movq	%rsp, \old_rsp
+	.endif
 
 	.if \regs
 	UNWIND_HINT_REGS base=\old_rsp
@@ -497,6 +507,15 @@ END(irq_entries_start)
 	.if \regs
 	UNWIND_HINT_REGS indirect=1
 	.endif
+
+	.if \save_ret
+	/*
+	 * Push the return address to the stack. This return address can
+	 * be found at the "real" original RSP, which was offset by 8 at
+	 * the beginning of this macro.
+	 */
+	pushq	-8(\old_rsp)
+	.endif
 .endm
 
 /*
@@ -533,22 +552,7 @@ ENTRY(interrupt_helper)
 	PUSH_AND_CLEAR_REGS save_ret=1
 	ENCODE_FRAME_POINTER 8
 
-	ret
-END(interrupt_helper)
-
-/* 0(%rsp): ~(interrupt number) */
-	.macro interrupt func
-	cld
-
-	testb	$3, CS-ORIG_RAX(%rsp)
-	jz	1f
-	SWAPGS
-	call	switch_to_thread_stack
-1:
-
-	call	interrupt_helper
-
-	testb	$3, CS(%rsp)
+	testb	$3, CS+8(%rsp)
 	jz	1f
 
 	/*
@@ -566,10 +570,26 @@ END(interrupt_helper)
 	CALL_enter_from_user_mode
 
 1:
-	ENTER_IRQ_STACK old_rsp=%rdi
+	ENTER_IRQ_STACK old_rsp=%rdi save_ret=1
 	/* We entered an interrupt context - irqs are off: */
 	TRACE_IRQS_OFF
 
+	ret
+END(interrupt_helper)
+
+/* 0(%rsp): ~(interrupt number) */
+	.macro interrupt func
+	cld
+
+	testb	$3, CS-ORIG_RAX(%rsp)
+	jz	1f
+	SWAPGS
+	call	switch_to_thread_stack
+1:
+
+	call	interrupt_helper
+
+	UNWIND_HINT_REGS indirect=1
 	call	\func	/* rdi points to pt_regs */
 	.endm
 
-- 
2.16.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC PATCH v2 3/5] x86/entry/64: move switch_to_thread_stack to interrupt helper function
  2018-02-18 12:31 [RFC PATCH v2 0/5] x86/entry/64: interrupt entry size reduction Dominik Brodowski
  2018-02-18 12:31 ` [RFC PATCH v2 1/5] x86/entry/64: move PUSH_AND_CLEAR_REGS from interrupt macro to helper function Dominik Brodowski
  2018-02-18 12:31 ` [RFC PATCH v2 2/5] x86/entry/64: move ENTER_IRQ_STACK " Dominik Brodowski
@ 2018-02-18 12:31 ` Dominik Brodowski
  2018-02-18 12:31 ` [RFC PATCH v2 4/5] x86/entry/64: remove interrupt macro Dominik Brodowski
  2018-02-18 12:31 ` [RFC PATCH v2 5/5] x86/entry/64: open-code switch_to_thread_stack Dominik Brodowski
  4 siblings, 0 replies; 8+ messages in thread
From: Dominik Brodowski @ 2018-02-18 12:31 UTC (permalink / raw)
  To: linux-kernel, mingo, x86, brgerst, luto, jpoimboe
  Cc: torvalds, ak, tglx, dan.j.williams

We can also move the SWAPGS and the switch_to_thread_stack to the
interrupt helper function. As we do not want call depths of two,
convert switch_to_thread_stack to a macro.

However, switch_to_thread_stack has another user in entry_64_compat.S,
which currently expects it to be a function. To keep the code changes
in this patch minimal, create a wrapper function.

The switch to a macro means that there is some binary code duplication
if CONFIG_IA32_EMULATION is enabled. Therefore, the size reduction
differs whether CONFIG_IA32_EMULATION is enabled or not:

CONFIG_IA32_EMULATION=y (-0.13k):
   text	   data	    bss	    dec	    hex	filename
  17158	      0	      0	  17158	   4306	entry_64.o-orig
  17028	      0	      0	  17028	   4284	entry_64.o

CONFIG_IA32_EMULATION=n (-0.27k):
   text	   data	    bss	    dec	    hex	filename
  17158	      0	      0	  17158	   4306	entry_64.o-orig
  16883	      0	      0	  16883	   41f3	entry_64.o

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 arch/x86/entry/entry_64.S | 64 +++++++++++++++++++++++++++--------------------
 1 file changed, 37 insertions(+), 27 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 93a9bea6b969..266bfcba3bb5 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -538,6 +538,31 @@ END(irq_entries_start)
 	decl	PER_CPU_VAR(irq_count)
 .endm
 
+/*
+ * Switch to the thread stack.  This is called with the IRET frame and
+ * orig_ax on the stack.  (That is, RDI..R12 are not on the stack and
+ * space has not been allocated for them.)
+ */
+.macro DO_SWITCH_TO_THREAD_STACK
+	pushq	%rdi
+	/* Need to switch before accessing the thread stack. */
+	SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
+	movq	%rsp, %rdi
+	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp
+	UNWIND_HINT sp_offset=16 sp_reg=ORC_REG_DI
+
+	pushq	7*8(%rdi)		/* regs->ss */
+	pushq	6*8(%rdi)		/* regs->rsp */
+	pushq	5*8(%rdi)		/* regs->eflags */
+	pushq	4*8(%rdi)		/* regs->cs */
+	pushq	3*8(%rdi)		/* regs->ip */
+	pushq	2*8(%rdi)		/* regs->orig_ax */
+	pushq	8(%rdi)			/* return address */
+	UNWIND_HINT_FUNC
+
+	movq	(%rdi), %rdi
+.endm
+
 /*
  * Interrupt entry/exit.
  *
@@ -545,10 +570,17 @@ END(irq_entries_start)
  *
  * Entry runs with interrupts off.
  */
+/* 8(%rsp): ~(interrupt number) */
 ENTRY(interrupt_helper)
 	UNWIND_HINT_FUNC
 	cld
 
+	testb	$3, CS-ORIG_RAX+8(%rsp)
+	jz	1f
+	SWAPGS
+	DO_SWITCH_TO_THREAD_STACK
+1:
+
 	PUSH_AND_CLEAR_REGS save_ret=1
 	ENCODE_FRAME_POINTER 8
 
@@ -581,12 +613,6 @@ END(interrupt_helper)
 	.macro interrupt func
 	cld
 
-	testb	$3, CS-ORIG_RAX(%rsp)
-	jz	1f
-	SWAPGS
-	call	switch_to_thread_stack
-1:
-
 	call	interrupt_helper
 
 	UNWIND_HINT_REGS indirect=1
@@ -860,33 +886,17 @@ apicinterrupt IRQ_WORK_VECTOR			irq_work_interrupt		smp_irq_work_interrupt
  */
 #define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss_rw) + (TSS_ist + ((x) - 1) * 8)
 
-/*
- * Switch to the thread stack.  This is called with the IRET frame and
- * orig_ax on the stack.  (That is, RDI..R12 are not on the stack and
- * space has not been allocated for them.)
- */
+#if defined(CONFIG_IA32_EMULATION)
+/* entry_64_compat.S::entry_INT80_compat expects this to be an ASM function */
 ENTRY(switch_to_thread_stack)
 	UNWIND_HINT_FUNC
+	cld
 
-	pushq	%rdi
-	/* Need to switch before accessing the thread stack. */
-	SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
-	movq	%rsp, %rdi
-	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp
-	UNWIND_HINT sp_offset=16 sp_reg=ORC_REG_DI
-
-	pushq	7*8(%rdi)		/* regs->ss */
-	pushq	6*8(%rdi)		/* regs->rsp */
-	pushq	5*8(%rdi)		/* regs->eflags */
-	pushq	4*8(%rdi)		/* regs->cs */
-	pushq	3*8(%rdi)		/* regs->ip */
-	pushq	2*8(%rdi)		/* regs->orig_ax */
-	pushq	8(%rdi)			/* return address */
-	UNWIND_HINT_FUNC
+	DO_SWITCH_TO_THREAD_STACK
 
-	movq	(%rdi), %rdi
 	ret
 END(switch_to_thread_stack)
+#endif
 
 .macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
 ENTRY(\sym)
-- 
2.16.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC PATCH v2 4/5] x86/entry/64: remove interrupt macro
  2018-02-18 12:31 [RFC PATCH v2 0/5] x86/entry/64: interrupt entry size reduction Dominik Brodowski
                   ` (2 preceding siblings ...)
  2018-02-18 12:31 ` [RFC PATCH v2 3/5] x86/entry/64: move switch_to_thread_stack to interrupt " Dominik Brodowski
@ 2018-02-18 12:31 ` Dominik Brodowski
  2018-02-18 12:31 ` [RFC PATCH v2 5/5] x86/entry/64: open-code switch_to_thread_stack Dominik Brodowski
  4 siblings, 0 replies; 8+ messages in thread
From: Dominik Brodowski @ 2018-02-18 12:31 UTC (permalink / raw)
  To: linux-kernel, mingo, x86, brgerst, luto, jpoimboe
  Cc: torvalds, ak, tglx, dan.j.williams

It is now trivial to call the interrupt helper function and then the
actual worker. Therefore, remove the interrupt macro.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 arch/x86/entry/entry_64.S | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 266bfcba3bb5..0ac2b4865d58 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -609,16 +609,6 @@ ENTRY(interrupt_helper)
 	ret
 END(interrupt_helper)
 
-/* 0(%rsp): ~(interrupt number) */
-	.macro interrupt func
-	cld
-
-	call	interrupt_helper
-
-	UNWIND_HINT_REGS indirect=1
-	call	\func	/* rdi points to pt_regs */
-	.endm
-
 	/*
 	 * The interrupt stubs push (~vector+0x80) onto the stack and
 	 * then jump to common_interrupt.
@@ -627,7 +617,9 @@ END(interrupt_helper)
 common_interrupt:
 	ASM_CLAC
 	addq	$-0x80, (%rsp)			/* Adjust vector to [-256, -1] range */
-	interrupt do_IRQ
+	call	interrupt_helper
+	UNWIND_HINT_REGS indirect=1
+	call	do_IRQ	/* rdi points to pt_regs */
 	/* 0(%rsp): old RSP */
 ret_from_intr:
 	DISABLE_INTERRUPTS(CLBR_ANY)
@@ -823,7 +815,9 @@ ENTRY(\sym)
 	ASM_CLAC
 	pushq	$~(\num)
 .Lcommon_\sym:
-	interrupt \do_sym
+	call	interrupt_helper
+	UNWIND_HINT_REGS indirect=1
+	call	\do_sym	/* rdi points to pt_regs */
 	jmp	ret_from_intr
 END(\sym)
 .endm
-- 
2.16.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC PATCH v2 5/5] x86/entry/64: open-code switch_to_thread_stack
  2018-02-18 12:31 [RFC PATCH v2 0/5] x86/entry/64: interrupt entry size reduction Dominik Brodowski
                   ` (3 preceding siblings ...)
  2018-02-18 12:31 ` [RFC PATCH v2 4/5] x86/entry/64: remove interrupt macro Dominik Brodowski
@ 2018-02-18 12:31 ` Dominik Brodowski
  2018-02-19 14:34   ` Josh Poimboeuf
  4 siblings, 1 reply; 8+ messages in thread
From: Dominik Brodowski @ 2018-02-18 12:31 UTC (permalink / raw)
  To: linux-kernel, mingo, x86, brgerst, luto, jpoimboe
  Cc: torvalds, ak, tglx, dan.j.williams

Open-code the two instances which used switch_to_thread_stack. This
allows us to remove the wrapper around DO_SWITCH_TO_THREAD_STACK.

While at it, update the UNWIND hint to reflect where the IRET frame is.

Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
---
 arch/x86/entry/entry_64.S        | 59 ++++++++++++++++------------------------
 arch/x86/entry/entry_64_compat.S | 19 ++++++++++++-
 2 files changed, 42 insertions(+), 36 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 0ac2b4865d58..fa61be457082 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -539,17 +539,36 @@ END(irq_entries_start)
 .endm
 
 /*
- * Switch to the thread stack.  This is called with the IRET frame and
- * orig_ax on the stack.  (That is, RDI..R12 are not on the stack and
- * space has not been allocated for them.)
+ * Interrupt entry/exit.
+ *
+ * Interrupt entry points save only callee clobbered registers in fast path.
+ *
+ * Entry runs with interrupts off.
  */
-.macro DO_SWITCH_TO_THREAD_STACK
+/* 8(%rsp): ~(interrupt number) */
+ENTRY(interrupt_helper)
+	UNWIND_HINT_FUNC
+	cld
+
+	testb	$3, CS-ORIG_RAX+8(%rsp)
+	jz	1f
+	SWAPGS
+
+	/*
+	 * Switch to the thread stack.  The IRET frame and orig_ax are
+	 * on the stack, as well as the return address. RDI..R12 are
+	 * not (yet) on the stack and space has not (yet) been
+	 * allocated for them.
+	 */
 	pushq	%rdi
+
 	/* Need to switch before accessing the thread stack. */
 	SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
 	movq	%rsp, %rdi
 	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp
-	UNWIND_HINT sp_offset=16 sp_reg=ORC_REG_DI
+	 /* We have RDI, return address, and orig_ax on the stack on
+	  * top of the IRET frame. That means offset=24 */
+	UNWIND_HINT_IRET_REGS base=%rdi offset=24
 
 	pushq	7*8(%rdi)		/* regs->ss */
 	pushq	6*8(%rdi)		/* regs->rsp */
@@ -561,24 +580,6 @@ END(irq_entries_start)
 	UNWIND_HINT_FUNC
 
 	movq	(%rdi), %rdi
-.endm
-
-/*
- * Interrupt entry/exit.
- *
- * Interrupt entry points save only callee clobbered registers in fast path.
- *
- * Entry runs with interrupts off.
- */
-/* 8(%rsp): ~(interrupt number) */
-ENTRY(interrupt_helper)
-	UNWIND_HINT_FUNC
-	cld
-
-	testb	$3, CS-ORIG_RAX+8(%rsp)
-	jz	1f
-	SWAPGS
-	DO_SWITCH_TO_THREAD_STACK
 1:
 
 	PUSH_AND_CLEAR_REGS save_ret=1
@@ -880,18 +881,6 @@ apicinterrupt IRQ_WORK_VECTOR			irq_work_interrupt		smp_irq_work_interrupt
  */
 #define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss_rw) + (TSS_ist + ((x) - 1) * 8)
 
-#if defined(CONFIG_IA32_EMULATION)
-/* entry_64_compat.S::entry_INT80_compat expects this to be an ASM function */
-ENTRY(switch_to_thread_stack)
-	UNWIND_HINT_FUNC
-	cld
-
-	DO_SWITCH_TO_THREAD_STACK
-
-	ret
-END(switch_to_thread_stack)
-#endif
-
 .macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
 ENTRY(\sym)
 	UNWIND_HINT_IRET_REGS offset=\has_error_code*8
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index 364ea4a207be..5b9503c7148d 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -350,7 +350,24 @@ ENTRY(entry_INT80_compat)
 	pushq	%rax			/* pt_regs->orig_ax */
 
 	/* switch to thread stack expects orig_ax to be pushed */
-	call	switch_to_thread_stack
+	pushq	%rdi
+
+	/* Need to switch before accessing the thread stack. */
+	SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
+	movq	%rsp, %rdi
+	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp
+	 /* We have RDI and orig_ax on the stack on top of the IRET frame.
+	  * That means offset=16 */
+	UNWIND_HINT_IRET_REGS base=%rdi offset=16
+
+	pushq	6*8(%rdi)		/* regs->ss */
+	pushq	5*8(%rdi)		/* regs->rsp */
+	pushq	4*8(%rdi)		/* regs->eflags */
+	pushq	3*8(%rdi)		/* regs->cs */
+	pushq	2*8(%rdi)		/* regs->ip */
+	pushq	1*8(%rdi)		/* regs->orig_ax */
+
+	movq	(%rdi), %rdi		/* restore %rdi */
 
 	pushq	%rdi			/* pt_regs->di */
 	pushq	%rsi			/* pt_regs->si */
-- 
2.16.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH v2 5/5] x86/entry/64: open-code switch_to_thread_stack
  2018-02-18 12:31 ` [RFC PATCH v2 5/5] x86/entry/64: open-code switch_to_thread_stack Dominik Brodowski
@ 2018-02-19 14:34   ` Josh Poimboeuf
  2018-02-20  7:37     ` Ingo Molnar
  0 siblings, 1 reply; 8+ messages in thread
From: Josh Poimboeuf @ 2018-02-19 14:34 UTC (permalink / raw)
  To: Dominik Brodowski
  Cc: linux-kernel, mingo, x86, brgerst, luto, torvalds, ak, tglx,
	dan.j.williams

On Sun, Feb 18, 2018 at 01:31:16PM +0100, Dominik Brodowski wrote:
> Open-code the two instances which used switch_to_thread_stack. This
> allows us to remove the wrapper around DO_SWITCH_TO_THREAD_STACK.
> 
> While at it, update the UNWIND hint to reflect where the IRET frame is.
> 
> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
> ---
>  arch/x86/entry/entry_64.S        | 59 ++++++++++++++++------------------------
>  arch/x86/entry/entry_64_compat.S | 19 ++++++++++++-
>  2 files changed, 42 insertions(+), 36 deletions(-)
> 
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 0ac2b4865d58..fa61be457082 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -539,17 +539,36 @@ END(irq_entries_start)
>  .endm
>  
>  /*
> - * Switch to the thread stack.  This is called with the IRET frame and
> - * orig_ax on the stack.  (That is, RDI..R12 are not on the stack and
> - * space has not been allocated for them.)
> + * Interrupt entry/exit.
> + *
> + * Interrupt entry points save only callee clobbered registers in fast path.
> + *
> + * Entry runs with interrupts off.
>   */

These comments aren't really accurate anymore:

- The function only handles interrupt entry, not exit.

- All registers are saved, there is no fast path.

> -.macro DO_SWITCH_TO_THREAD_STACK
> +/* 8(%rsp): ~(interrupt number) */
> +ENTRY(interrupt_helper)

"interrupt_helper" is a bit vague, how about "interrupt_entry"?  That
better describes its purpose and also makes it more analagous to
paranoid_entry() and error_entry().

> +	UNWIND_HINT_FUNC
> +	cld
> +
> +	testb	$3, CS-ORIG_RAX+8(%rsp)
> +	jz	1f
> +	SWAPGS
> +
> +	/*
> +	 * Switch to the thread stack.  The IRET frame and orig_ax are
> +	 * on the stack, as well as the return address. RDI..R12 are
> +	 * not (yet) on the stack and space has not (yet) been
> +	 * allocated for them.
> +	 */
>  	pushq	%rdi
> +
>  	/* Need to switch before accessing the thread stack. */
>  	SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
>  	movq	%rsp, %rdi
>  	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp
> -	UNWIND_HINT sp_offset=16 sp_reg=ORC_REG_DI
> +	 /* We have RDI, return address, and orig_ax on the stack on
> +	  * top of the IRET frame. That means offset=24 */
> +	UNWIND_HINT_IRET_REGS base=%rdi offset=24

The hint looks right, though the comment formatting isn't in the kernel
style.

> diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
> index 364ea4a207be..5b9503c7148d 100644
> --- a/arch/x86/entry/entry_64_compat.S
> +++ b/arch/x86/entry/entry_64_compat.S
> @@ -350,7 +350,24 @@ ENTRY(entry_INT80_compat)
>  	pushq	%rax			/* pt_regs->orig_ax */
>  
>  	/* switch to thread stack expects orig_ax to be pushed */
> -	call	switch_to_thread_stack
> +	pushq	%rdi
> +
> +	/* Need to switch before accessing the thread stack. */
> +	SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
> +	movq	%rsp, %rdi
> +	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp
> +	 /* We have RDI and orig_ax on the stack on top of the IRET frame.
> +	  * That means offset=16 */
> +	UNWIND_HINT_IRET_REGS base=%rdi offset=16

Objtool doesn't scan this file yet, so no hints needed here yet.

Adding it to my TODO list...

-- 
Josh

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH v2 5/5] x86/entry/64: open-code switch_to_thread_stack
  2018-02-19 14:34   ` Josh Poimboeuf
@ 2018-02-20  7:37     ` Ingo Molnar
  0 siblings, 0 replies; 8+ messages in thread
From: Ingo Molnar @ 2018-02-20  7:37 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Dominik Brodowski, linux-kernel, x86, brgerst, luto, torvalds,
	ak, tglx, dan.j.williams


* Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> > -.macro DO_SWITCH_TO_THREAD_STACK
> > +/* 8(%rsp): ~(interrupt number) */
> > +ENTRY(interrupt_helper)
> 
> "interrupt_helper" is a bit vague, how about "interrupt_entry"?  That
> better describes its purpose and also makes it more analagous to
> paranoid_entry() and error_entry().

Yeah, interrupt_entry() sounds better - and the renaming should be propagated to 
the other patches as well.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-02-20  7:37 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-18 12:31 [RFC PATCH v2 0/5] x86/entry/64: interrupt entry size reduction Dominik Brodowski
2018-02-18 12:31 ` [RFC PATCH v2 1/5] x86/entry/64: move PUSH_AND_CLEAR_REGS from interrupt macro to helper function Dominik Brodowski
2018-02-18 12:31 ` [RFC PATCH v2 2/5] x86/entry/64: move ENTER_IRQ_STACK " Dominik Brodowski
2018-02-18 12:31 ` [RFC PATCH v2 3/5] x86/entry/64: move switch_to_thread_stack to interrupt " Dominik Brodowski
2018-02-18 12:31 ` [RFC PATCH v2 4/5] x86/entry/64: remove interrupt macro Dominik Brodowski
2018-02-18 12:31 ` [RFC PATCH v2 5/5] x86/entry/64: open-code switch_to_thread_stack Dominik Brodowski
2018-02-19 14:34   ` Josh Poimboeuf
2018-02-20  7:37     ` Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).