* [PATCH 1/2] x86/asm/entry/64: Enable interrupts *after* we fetch PER_CPU_VAR(old_rsp)
@ 2015-03-17 13:42 Denys Vlasenko
  2015-03-17 13:42 ` [PATCH 2/2] x86/asm/entry/64: Remove unused thread_struct::usersp Denys Vlasenko
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Denys Vlasenko @ 2015-03-17 13:42 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Denys Vlasenko, Linus Torvalds, Steven Rostedt, Borislav Petkov,
	H. Peter Anvin, Andy Lutomirski, Oleg Nesterov,
	Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook,
	x86, linux-kernel

Without this change, it is still not possible to get rid of
PER_CPU_VAR(old_rsp) usage in switch_to: if preemption happens
before we have fetched PER_CPU_VAR(old_rsp) and stored it in pt_regs->sp,
PER_CPU_VAR(old_rsp) gets corrupted by another task's user sp.
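
To illustrate the window (a sketch of the pre-patch ordering; tasks A
and B are hypothetical, and the corruption assumes switch_to no longer
saves/restores old_rsp):

	# task A enters SYSCALL:
	movq	%rsp,PER_CPU_VAR(old_rsp)	# A's user rsp parked in old_rsp
	ENABLE_INTERRUPTS(CLBR_NONE)		# preemption is possible from here on
	#   ... switch to task B; B enters SYSCALL and overwrites old_rsp
	#   with its own user rsp; eventually we switch back to task A ...
	movq	PER_CPU_VAR(old_rsp),%rcx	# A now reads B's user rsp
	movq	%rcx,RSP(%rsp)			# and records it in pt_regs->sp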

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ingo Molnar <mingo@kernel.org>
CC: Borislav Petkov <bp@alien8.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Alexei Starovoitov <ast@plumgrid.com>
CC: Will Drewry <wad@chromium.org>
CC: Kees Cook <keescook@chromium.org>
CC: x86@kernel.org
CC: linux-kernel@vger.kernel.org
---

Run-tested, including with PARAVIRT on.

 arch/x86/kernel/entry_64.S | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index d86788c..3054a9d 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -241,17 +241,17 @@ GLOBAL(system_call_after_swapgs)
 	movq	%rsp,PER_CPU_VAR(old_rsp)
 	/* kernel_stack is set so that 5 slots (iret frame) are preallocated */
 	movq	PER_CPU_VAR(kernel_stack),%rsp
-	/*
-	 * No need to follow this irqs off/on section - it's straight
-	 * and short:
-	 */
-	ENABLE_INTERRUPTS(CLBR_NONE)
 	ALLOC_PT_GPREGS_ON_STACK 8		/* +8: space for orig_ax */
 	movq	%rcx,RIP(%rsp)
 	movq	PER_CPU_VAR(old_rsp),%rcx
 	movq	%r11,EFLAGS(%rsp)
 	movq	%rcx,RSP(%rsp)
 	movq_cfi rax,ORIG_RAX
+	/*
+	 * No need to follow this irqs off/on section - it's straight
+	 * and short:
+	 */
+	ENABLE_INTERRUPTS(CLBR_RAX)
 	SAVE_C_REGS_EXCEPT_RAX_RCX_R11
 	movq	$-ENOSYS,RAX(%rsp)
 	CFI_REL_OFFSET rip,RIP
-- 
1.8.1.4


* [PATCH 2/2] x86/asm/entry/64: Remove unused thread_struct::usersp
  2015-03-17 13:42 [PATCH 1/2] x86/asm/entry/64: Enable interrupts *after* we fetch PER_CPU_VAR(old_rsp) Denys Vlasenko
@ 2015-03-17 13:42 ` Denys Vlasenko
  2015-03-17 14:41   ` Ingo Molnar
  2015-03-17 16:42   ` [tip:x86/asm] x86/asm/entry/64: Simplify 'old_rsp' usage tip-bot for Ingo Molnar
  2015-03-17 13:51 ` [PATCH 1/2] x86/asm/entry/64: Enable interrupts *after* we fetch PER_CPU_VAR(old_rsp) Denys Vlasenko
  2015-03-17 18:04 ` Andy Lutomirski
  2 siblings, 2 replies; 6+ messages in thread
From: Denys Vlasenko @ 2015-03-17 13:42 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Denys Vlasenko, Linus Torvalds, Steven Rostedt, Borislav Petkov,
	H. Peter Anvin, Andy Lutomirski, Oleg Nesterov,
	Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook,
	x86, linux-kernel

All manipulations of PER_CPU(old_rsp) in C code are removed:
it is not used on SYSRET return, so storing anything there is
pointless.

This also allows us to get rid of thread_struct::usersp,
which was needed only to set PER_CPU(old_rsp) for correct
return from fork/clone.

Tweak a few comments as well: we no longer have "partial stack frame",
ever.
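
For reference, the per-task shuttling that becomes unnecessary
(a sketch consolidating the lines removed below):

	/* __switch_to(): old_rsp no longer holds a live value here */
	prev->usersp = this_cpu_read(old_rsp);
	this_cpu_write(old_rsp, next->usersp);

	/* copy_thread()/start_thread_common(): pt_regs->sp is enough */
	p->thread.usersp = me->thread.usersp;
	this_cpu_write(old_rsp, new_sp);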

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ingo Molnar <mingo@kernel.org>
CC: Borislav Petkov <bp@alien8.de>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Oleg Nesterov <oleg@redhat.com>
CC: Frederic Weisbecker <fweisbec@gmail.com>
CC: Alexei Starovoitov <ast@plumgrid.com>
CC: Will Drewry <wad@chromium.org>
CC: Kees Cook <keescook@chromium.org>
CC: x86@kernel.org
CC: linux-kernel@vger.kernel.org
---

Run-tested, including with PARAVIRT on.

 arch/x86/include/asm/processor.h |  6 ------
 arch/x86/kernel/entry_64.S       | 14 ++++++--------
 arch/x86/kernel/process_64.c     |  5 -----
 3 files changed, 6 insertions(+), 19 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 997e6a1..66a1954 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -496,7 +496,6 @@ struct thread_struct {
 #ifdef CONFIG_X86_32
 	unsigned long		sysenter_cs;
 #else
-	unsigned long		usersp;	/* Copy from PDA */
 	unsigned short		es;
 	unsigned short		ds;
 	unsigned short		fsindex;
@@ -908,11 +907,6 @@ extern unsigned long thread_saved_pc(struct task_struct *tsk);
 #define task_pt_regs(tsk)	((struct pt_regs *)(tsk)->thread.sp0 - 1)
 extern unsigned long KSTK_ESP(struct task_struct *task);
 
-/*
- * User space RSP while inside the SYSCALL fast path
- */
-DECLARE_PER_CPU(unsigned long, old_rsp);
-
 #endif /* CONFIG_X86_64 */
 
 extern void start_thread(struct pt_regs *regs, unsigned long new_ip,
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 3054a9d..cb86db0 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -15,10 +15,8 @@
  * after an interrupt and after each system call.
  *
  * A note on terminology:
- * - top of stack: Architecture defined interrupt frame from SS to RIP
+ * - iret frame: Architecture defined interrupt frame from SS to RIP
  * at the top of the kernel process stack.
- * - partial stack frame: partially saved registers up to R11.
- * - full stack frame: Like partial stack frame, but all register saved.
  *
  * Some macro usage:
  * - CFI macros are used to generate dwarf2 unwind information for better
@@ -219,7 +217,7 @@ ENDPROC(native_usergs_sysret64)
  * Interrupts are off on entry.
  * Only called from user space.
  *
- * When user can change the frames always force IRET. That is because
+ * When user can change pt_regs->foo always force IRET. That is because
  * it deals with uncanonical addresses better. SYSRET has trouble
  * with them due to bugs in both AMD and Intel CPUs.
  */
@@ -303,7 +301,7 @@ int_ret_from_sys_call_fixup:
 	FIXUP_TOP_OF_STACK %r11
 	jmp int_ret_from_sys_call
 
-	/* Do syscall tracing */
+	/* Do syscall entry tracing */
 tracesys:
 	movq %rsp, %rdi
 	movq $AUDIT_ARCH_X86_64, %rsi
@@ -339,11 +337,11 @@ tracesys_phase2:
 	movq %r10,%rcx	/* fixup for C */
 	call *sys_call_table(,%rax,8)
 	movq %rax,RAX(%rsp)
-	/* Use IRET because user could have changed frame */
+	/* Use IRET because user could have changed pt_regs->foo */
 
 /*
  * Syscall return path ending with IRET.
- * Has correct top of stack, but partial stack frame.
+ * Has correct iret frame.
  */
 GLOBAL(int_ret_from_sys_call)
 	DISABLE_INTERRUPTS(CLBR_NONE)
@@ -374,7 +372,7 @@ int_careful:
 	TRACE_IRQS_OFF
 	jmp int_with_check
 
-	/* handle signals and tracing -- both require a full stack frame */
+	/* handle signals and tracing -- both require a full pt_regs */
 int_very_careful:
 	TRACE_IRQS_ON
 	ENABLE_INTERRUPTS(CLBR_NONE)
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index e8c124a..14df2be 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -161,7 +161,6 @@ int copy_thread(unsigned long clone_flags, unsigned long sp,
 	p->thread.sp0 = (unsigned long)task_stack_page(p) + THREAD_SIZE;
 	childregs = task_pt_regs(p);
 	p->thread.sp = (unsigned long) childregs;
-	p->thread.usersp = me->thread.usersp;
 	set_tsk_thread_flag(p, TIF_FORK);
 	p->thread.io_bitmap_ptr = NULL;
 
@@ -235,10 +234,8 @@ start_thread_common(struct pt_regs *regs, unsigned long new_ip,
 	loadsegment(es, _ds);
 	loadsegment(ds, _ds);
 	load_gs_index(0);
-	current->thread.usersp	= new_sp;
 	regs->ip		= new_ip;
 	regs->sp		= new_sp;
-	this_cpu_write(old_rsp, new_sp);
 	regs->cs		= _cs;
 	regs->ss		= _ss;
 	regs->flags		= X86_EFLAGS_IF;
@@ -398,8 +395,6 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	/*
 	 * Switch the PDA and FPU contexts.
 	 */
-	prev->usersp = this_cpu_read(old_rsp);
-	this_cpu_write(old_rsp, next->usersp);
 	this_cpu_write(current_task, next_p);
 
 	/*
-- 
1.8.1.4


* Re: [PATCH 1/2] x86/asm/entry/64: Enable interrupts *after* we fetch PER_CPU_VAR(old_rsp)
  2015-03-17 13:42 [PATCH 1/2] x86/asm/entry/64: Enable interrupts *after* we fetch PER_CPU_VAR(old_rsp) Denys Vlasenko
  2015-03-17 13:42 ` [PATCH 2/2] x86/asm/entry/64: Remove unused thread_struct::usersp Denys Vlasenko
@ 2015-03-17 13:51 ` Denys Vlasenko
  2015-03-17 18:04 ` Andy Lutomirski
  2 siblings, 0 replies; 6+ messages in thread
From: Denys Vlasenko @ 2015-03-17 13:51 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Denys Vlasenko, Linus Torvalds, Steven Rostedt, Borislav Petkov,
	H. Peter Anvin, Andy Lutomirski, Oleg Nesterov,
	Frederic Weisbecker, Alexei Starovoitov, Will Drewry, Kees Cook,
	x86, linux-kernel

On 03/17/2015 02:42 PM, Denys Vlasenko wrote:
> Without this change, it is still not possible to get rid of
> PER_CPU_VAR(old_rsp) usage in switch_to: if preemption happens
> before we have fetched PER_CPU_VAR(old_rsp) and stored it in pt_regs->sp,
> PER_CPU_VAR(old_rsp) gets corrupted by another task's user sp.
> 
> Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
> CC: Linus Torvalds <torvalds@linux-foundation.org>
> CC: Steven Rostedt <rostedt@goodmis.org>
> CC: Ingo Molnar <mingo@kernel.org>
> CC: Borislav Petkov <bp@alien8.de>
> CC: "H. Peter Anvin" <hpa@zytor.com>
> CC: Andy Lutomirski <luto@amacapital.net>
> CC: Oleg Nesterov <oleg@redhat.com>
> CC: Frederic Weisbecker <fweisbec@gmail.com>
> CC: Alexei Starovoitov <ast@plumgrid.com>
> CC: Will Drewry <wad@chromium.org>
> CC: Kees Cook <keescook@chromium.org>
> CC: x86@kernel.org
> CC: linux-kernel@vger.kernel.org
> ---
> 
> Run-tested, including with PARAVIRT on.

Well. The testing with PARAVIRT did work; however, I don't know why it did.

>  	movq_cfi rax,ORIG_RAX
> +	/*
> +	 * No need to follow this irqs off/on section - it's straight
> +	 * and short:
> +	 */
> +	ENABLE_INTERRUPTS(CLBR_RAX)

Here I wrongly assumed that rax can now be clobbered, since it has
already been saved in pt_regs->orig_ax. Wrong: the code below still
uses the rax register directly.
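
Concretely, %rax still holds the syscall number when the fast path
dispatches a few instructions later (a sketch; the dispatch instruction
is the same one visible in the tracesys_phase2 hunk of patch 2):

	ENABLE_INTERRUPTS(CLBR_RAX)		# paravirt may clobber %rax here...
	SAVE_C_REGS_EXCEPT_RAX_RCX_R11
	movq	$-ENOSYS,RAX(%rsp)
	...
	call	*sys_call_table(,%rax,8)	# ...but %rax must still be the syscall nr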

Looks like I was "lucky" and the paravirt call happened not to change rax.
I'll send a v2 patch without this ill-advised attempt at optimization.
Sorry.


* Re: [PATCH 2/2] x86/asm/entry/64: Remove unused thread_struct::usersp
  2015-03-17 13:42 ` [PATCH 2/2] x86/asm/entry/64: Remove unused thread_struct::usersp Denys Vlasenko
@ 2015-03-17 14:41   ` Ingo Molnar
  2015-03-17 16:42   ` [tip:x86/asm] x86/asm/entry/64: Simplify 'old_rsp' usage tip-bot for Ingo Molnar
  1 sibling, 0 replies; 6+ messages in thread
From: Ingo Molnar @ 2015-03-17 14:41 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Linus Torvalds, Steven Rostedt, Borislav Petkov, H. Peter Anvin,
	Andy Lutomirski, Oleg Nesterov, Frederic Weisbecker,
	Alexei Starovoitov, Will Drewry, Kees Cook, x86, linux-kernel


* Denys Vlasenko <dvlasenk@redhat.com> wrote:

> All manipulations of PER_CPU(old_rsp) in C code are removed:
> it is not used on SYSRET return, so storing anything there is
> pointless.
> 
> This also allows us to get rid of thread_struct::usersp,
> which was needed only to set PER_CPU(old_rsp) for correct
> return from fork/clone.
> 
> Tweak a few comments as well: we no longer have "partial stack frame",
> ever.

So this doesn't really explain it very well, plus it still 
unnecessarily mixes up two changes in a single patch: please
don't do that!

I've fixed it all up and split the patch into two
easier-to-bisect parts.

Thanks,

	Ingo

* [tip:x86/asm] x86/asm/entry/64: Simplify 'old_rsp' usage
  2015-03-17 13:42 ` [PATCH 2/2] x86/asm/entry/64: Remove unused thread_struct::usersp Denys Vlasenko
  2015-03-17 14:41   ` Ingo Molnar
@ 2015-03-17 16:42   ` tip-bot for Ingo Molnar
  1 sibling, 0 replies; 6+ messages in thread
From: tip-bot for Ingo Molnar @ 2015-03-17 16:42 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, bp, hpa, luto, oleg, dvlasenk, tglx, fweisbec,
	mingo, ast, torvalds, keescook, wad, rostedt

Commit-ID:  9854dd74c3f6af8d9d527de86c6074b7ed0495f1
Gitweb:     http://git.kernel.org/tip/9854dd74c3f6af8d9d527de86c6074b7ed0495f1
Author:     Ingo Molnar <mingo@kernel.org>
AuthorDate: Tue, 17 Mar 2015 14:42:59 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 17 Mar 2015 16:01:41 +0100

x86/asm/entry/64: Simplify 'old_rsp' usage

Remove all manipulations of PER_CPU(old_rsp) in C code:

 - it is not used on SYSRET return anymore, and system entries
   are atomic, so updating it from the fork and context switch
   paths is pointless.

 - Tweak a few related comments as well: we no longer have a
   "partial stack frame" on entry, ever.

Based on (split out of) patch from Denys Vlasenko.

Originally-from: Denys Vlasenko <dvlasenk@redhat.com>
Tested-by: Borislav Petkov <bp@alien8.de>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1426599779-8010-2-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/processor.h | 5 -----
 arch/x86/kernel/process_64.c     | 2 --
 2 files changed, 7 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 5abd9a5..3ac5092 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -905,11 +905,6 @@ extern unsigned long thread_saved_pc(struct task_struct *tsk);
 #define task_pt_regs(tsk)	((struct pt_regs *)(tsk)->thread.sp0 - 1)
 extern unsigned long KSTK_ESP(struct task_struct *task);
 
-/*
- * User space RSP while inside the SYSCALL fast path
- */
-DECLARE_PER_CPU(unsigned long, old_rsp);
-
 #endif /* CONFIG_X86_64 */
 
 extern void start_thread(struct pt_regs *regs, unsigned long new_ip,
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index e8c124a..59696d7 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -238,7 +238,6 @@ start_thread_common(struct pt_regs *regs, unsigned long new_ip,
 	current->thread.usersp	= new_sp;
 	regs->ip		= new_ip;
 	regs->sp		= new_sp;
-	this_cpu_write(old_rsp, new_sp);
 	regs->cs		= _cs;
 	regs->ss		= _ss;
 	regs->flags		= X86_EFLAGS_IF;
@@ -399,7 +398,6 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	 * Switch the PDA and FPU contexts.
 	 */
 	prev->usersp = this_cpu_read(old_rsp);
-	this_cpu_write(old_rsp, next->usersp);
 	this_cpu_write(current_task, next_p);
 
 	/*

* Re: [PATCH 1/2] x86/asm/entry/64: Enable interrupts *after* we fetch PER_CPU_VAR(old_rsp)
  2015-03-17 13:42 [PATCH 1/2] x86/asm/entry/64: Enable interrupts *after* we fetch PER_CPU_VAR(old_rsp) Denys Vlasenko
  2015-03-17 13:42 ` [PATCH 2/2] x86/asm/entry/64: Remove unused thread_struct::usersp Denys Vlasenko
  2015-03-17 13:51 ` [PATCH 1/2] x86/asm/entry/64: Enable interrupts *after* we fetch PER_CPU_VAR(old_rsp) Denys Vlasenko
@ 2015-03-17 18:04 ` Andy Lutomirski
  2 siblings, 0 replies; 6+ messages in thread
From: Andy Lutomirski @ 2015-03-17 18:04 UTC (permalink / raw)
  To: Denys Vlasenko
  Cc: Ingo Molnar, Linus Torvalds, Steven Rostedt, Borislav Petkov,
	H. Peter Anvin, Oleg Nesterov, Frederic Weisbecker,
	Alexei Starovoitov, Will Drewry, Kees Cook, X86 ML, linux-kernel

On Tue, Mar 17, 2015 at 6:42 AM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
> Without this change, it is still not possible to get rid of
> PER_CPU_VAR(old_rsp) usage in switch_to: if preemption happens
> before we have fetched PER_CPU_VAR(old_rsp) and stored it in pt_regs->sp,
> PER_CPU_VAR(old_rsp) gets corrupted by another task's user sp.
>

[...]

> -       ENABLE_INTERRUPTS(CLBR_NONE)

[...]

> +       ENABLE_INTERRUPTS(CLBR_RAX)

In future patches, please don't sneak these things in.  It makes it
much harder to review.  If you need to move code and change it like
this, I think it should be clearly and explicitly mentioned in the
changelog.

--Andy

