All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3] m68k: improved switch stack handling
@ 2021-06-20  8:14 Michael Schmitz
  2021-06-20  8:14 ` [PATCH v3 1/3] m68k: save extra registers on more syscall entry points Michael Schmitz
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Michael Schmitz @ 2021-06-20  8:14 UTC (permalink / raw)
  To: geert, linux-arch, linux-m68k; +Cc: ebiederm, torvalds, schwab

m68k version of Eric's 'improved switch stack handling' patch for alpha. 

The first two patches address m68k missing saving switch_stack on
those syscalls that can may call ptrace_stop(), and adding a full stack
frame in kernel threads.

The last patch adds a 'status' field to m68k thread_info struct, and
stores information about whether a syscall trace is in progress, and
a full stack frame has been saved, in that field. This information can
be used in arch/m68k/kernel/ptrace.c by get/put_reg() to guard against
accessing incorrect information on the stack (haven't got around to
that bit yet). I'm quite certain I haven't picked the most efficient
implementation here - suggestions to optimize most welcome!

Cheers,

   Michael




^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v3 1/3] m68k: save extra registers on more syscall entry points
  2021-06-20  8:14 [PATCH v3] m68k: improved switch stack handling Michael Schmitz
@ 2021-06-20  8:14 ` Michael Schmitz
  2021-06-20  8:14 ` [PATCH v3 2/3] m68k: correctly handle IO worker stack frame set-up Michael Schmitz
  2021-06-20  8:14 ` [PATCH v3 3/3] m68k: track syscalls being traced with shallow user context stack Michael Schmitz
  2 siblings, 0 replies; 6+ messages in thread
From: Michael Schmitz @ 2021-06-20  8:14 UTC (permalink / raw)
  To: geert, linux-arch, linux-m68k; +Cc: ebiederm, torvalds, schwab, Michael Schmitz

Multiple syscalls are liable to PTRACE_EVENT_* tracing and thus
require full user context saved on the kernel stack. We only
save those registers not preserved by C code currently.

do_exit() calls ptrace_stop() which may require access to all
saved registers. Add code to save additional registers in the
switch_stack struct for exit and exit_group syscalls (similar
to what is already done for fork, vfork and clone3). According
to Eric's analysis, execve and execveat can be traced as well,
so have been given the same treatment.

Tested on both ARAnyM and Falcon hardware.

CC: Eric W. Biederman <ebiederm@xmission.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>

--
Changes from v2:
- drop handling of io_uring_setup syscall

Changes from v1:

- added exec, execve and io_uring_setup syscalls
- save extra registers around kworker thread calls

drop io_uring_setup handling
---
 arch/m68k/kernel/entry.S              | 28 ++++++++++++++++++++++++++++
 arch/m68k/kernel/process.c            | 33 +++++++++++++++++++++++++++++++++
 arch/m68k/kernel/syscalls/syscall.tbl |  8 ++++----
 3 files changed, 65 insertions(+), 4 deletions(-)

diff --git a/arch/m68k/kernel/entry.S b/arch/m68k/kernel/entry.S
index 9dd76fb..275452a 100644
--- a/arch/m68k/kernel/entry.S
+++ b/arch/m68k/kernel/entry.S
@@ -76,6 +76,34 @@ ENTRY(__sys_clone3)
 	lea	%sp@(28),%sp
 	rts
 
+ENTRY(__sys_exit)
+	SAVE_SWITCH_STACK
+	pea	%sp@(SWITCH_STACK_SIZE)
+	jbsr	m68k_exit
+	lea	%sp@(28),%sp
+	rts
+
+ENTRY(__sys_exit_group)
+	SAVE_SWITCH_STACK
+	pea	%sp@(SWITCH_STACK_SIZE)
+	jbsr	m68k_exit_group
+	lea	%sp@(28),%sp
+	rts
+
+ENTRY(__sys_execve)
+	SAVE_SWITCH_STACK
+	pea	%sp@(SWITCH_STACK_SIZE)
+	jbsr	m68k_execve
+	lea	%sp@(28),%sp
+	rts
+
+ENTRY(__sys_execveat)
+	SAVE_SWITCH_STACK
+	pea	%sp@(SWITCH_STACK_SIZE)
+	jbsr	m68k_execveat
+	lea	%sp@(28),%sp
+	rts
+
 ENTRY(sys_sigreturn)
 	SAVE_SWITCH_STACK
 	movel	%sp,%sp@-		  | switch_stack pointer
diff --git a/arch/m68k/kernel/process.c b/arch/m68k/kernel/process.c
index da83cc8..7dd0eea 100644
--- a/arch/m68k/kernel/process.c
+++ b/arch/m68k/kernel/process.c
@@ -138,6 +138,39 @@ asmlinkage int m68k_clone3(struct pt_regs *regs)
 	return sys_clone3((struct clone_args __user *)regs->d1, regs->d2);
 }
 
+/*
+ * Because extra registers are saved on the stack after the sys_exit()
+ * arguments, this C wrapper extracts them from pt_regs * and then calls the
+ * generic sys_exit() implementation.
+ */
+asmlinkage int m68k_exit(struct pt_regs *regs)
+{
+	return sys_exit(regs->d1);
+}
+
+/* Same for sys_exit_group ... */
+asmlinkage int m68k_exit_group(struct pt_regs *regs)
+{
+	return sys_exit_group(regs->d1);
+}
+
+/* Same for sys_exit_group ... */
+asmlinkage int m68k_execve(struct pt_regs *regs)
+{
+	return sys_execve((const char __user *)regs->d1,
+			(const char __user *const __user*)regs->d2,
+			(const char __user *const __user*)regs->d3);
+}
+
+/* Same for sys_exit_group ... */
+asmlinkage int m68k_execveat(struct pt_regs *regs)
+{
+	return sys_execveat(regs->d1, (const char __user *)regs->d2,
+			(const char __user *const __user*)regs->d3,
+			(const char __user *const __user*)regs->d4,
+			regs->d5);
+}
+
 int copy_thread(unsigned long clone_flags, unsigned long usp, unsigned long arg,
 		struct task_struct *p, unsigned long tls)
 {
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index 0dd019d..13dd02e 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -8,7 +8,7 @@
 # The <abi> is always "common" for this file
 #
 0	common	restart_syscall			sys_restart_syscall
-1	common	exit				sys_exit
+1	common	exit				__sys_exit
 2	common	fork				__sys_fork
 3	common	read				sys_read
 4	common	write				sys_write
@@ -18,7 +18,7 @@
 8	common	creat				sys_creat
 9	common	link				sys_link
 10	common	unlink				sys_unlink
-11	common	execve				sys_execve
+11	common	execve				__sys_execve
 12	common	chdir				sys_chdir
 13	common	time				sys_time32
 14	common	mknod				sys_mknod
@@ -254,7 +254,7 @@
 244	common	io_submit			sys_io_submit
 245	common	io_cancel			sys_io_cancel
 246	common	fadvise64			sys_fadvise64
-247	common	exit_group			sys_exit_group
+247	common	exit_group			__sys_exit_group
 248	common	lookup_dcookie			sys_lookup_dcookie
 249	common	epoll_create			sys_epoll_create
 250	common	epoll_ctl			sys_epoll_ctl
@@ -362,7 +362,7 @@
 352	common	getrandom			sys_getrandom
 353	common	memfd_create			sys_memfd_create
 354	common	bpf				sys_bpf
-355	common	execveat			sys_execveat
+355	common	execveat			__sys_execveat
 356	common	socket				sys_socket
 357	common	socketpair			sys_socketpair
 358	common	bind				sys_bind
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v3 2/3] m68k: correctly handle IO worker stack frame set-up
  2021-06-20  8:14 [PATCH v3] m68k: improved switch stack handling Michael Schmitz
  2021-06-20  8:14 ` [PATCH v3 1/3] m68k: save extra registers on more syscall entry points Michael Schmitz
@ 2021-06-20  8:14 ` Michael Schmitz
  2021-06-21  3:57   ` Finn Thain
  2021-06-20  8:14 ` [PATCH v3 3/3] m68k: track syscalls being traced with shallow user context stack Michael Schmitz
  2 siblings, 1 reply; 6+ messages in thread
From: Michael Schmitz @ 2021-06-20  8:14 UTC (permalink / raw)
  To: geert, linux-arch, linux-m68k; +Cc: ebiederm, torvalds, schwab, Michael Schmitz

Create full stack frame plus switch stack frame in copy_thread()
when creating a kernel worker thread. The switch stack frame will
then be consumed in resume(), leaving a full stack frame of zero
content for ptrace to play with.

Change ret_from_exception switch stack handling to restore the
switch stack (and pop the return address restored by that)after
return from kernel worker threads.

Patch suggested by Linus (authored, actually - replace my signoff
if that's not appropriate, please).

Tested on both ARAnyM and Falcon hardware.

CC: Eric W. Biederman <ebiederm@xmission.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
---
 arch/m68k/kernel/entry.S   | 11 +++++++++++
 arch/m68k/kernel/process.c | 13 +++++++++----
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/arch/m68k/kernel/entry.S b/arch/m68k/kernel/entry.S
index 275452a..0c25038 100644
--- a/arch/m68k/kernel/entry.S
+++ b/arch/m68k/kernel/entry.S
@@ -147,6 +147,15 @@ ENTRY(ret_from_fork)
 	addql	#4,%sp
 	jra	ret_from_exception
 
+	| A kernel thread will jump here directly from resume,
+	| with the stack containing the full register state
+	| (pt_regs and switch_stack).
+	|
+	| The argument will be in d7, and the kernel function
+	| to call will be in a3.
+	|
+	| If the kernel function returns, we want to return
+	| to user space - it has done a kernel_execve().
 ENTRY(ret_from_kernel_thread)
 	| a3 contains the kernel thread payload, d7 - its argument
 	movel	%d1,%sp@-
@@ -154,6 +163,8 @@ ENTRY(ret_from_kernel_thread)
 	movel	%d7,(%sp)
 	jsr	%a3@
 	addql	#4,%sp
+	RESTORE_SWITCH_STACK
+	addql	#4,%sp
 	jra	ret_from_exception
 
 #if defined(CONFIG_COLDFIRE) || !defined(CONFIG_MMU)
diff --git a/arch/m68k/kernel/process.c b/arch/m68k/kernel/process.c
index 7dd0eea..f9598e0 100644
--- a/arch/m68k/kernel/process.c
+++ b/arch/m68k/kernel/process.c
@@ -192,12 +192,17 @@ int copy_thread(unsigned long clone_flags, unsigned long usp, unsigned long arg,
 
 	if (unlikely(p->flags & (PF_KTHREAD | PF_IO_WORKER))) {
 		/* kernel thread */
-		memset(frame, 0, sizeof(struct fork_frame));
+		struct switch_stack *kstp = &frame->sw - 1;
+
+		/* kernel thread - a kernel-side switch-stack and the full user fork_frame */
+		memset(kstp, 0, sizeof(struct switch_stack) + sizeof(struct fork_frame));
+
 		frame->regs.sr = PS_S;
-		frame->sw.a3 = usp; /* function */
-		frame->sw.d7 = arg;
-		frame->sw.retpc = (unsigned long)ret_from_kernel_thread;
+		kstp->a3 = usp; /* function */
+		kstp->d7 = arg;
+		kstp->retpc = (unsigned long)ret_from_kernel_thread;
 		p->thread.usp = 0;
+		p->thread.ksp = (unsigned long)kstp;
 		return 0;
 	}
 	memcpy(frame, container_of(current_pt_regs(), struct fork_frame, regs),
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v3 3/3] m68k: track syscalls being traced with shallow user context stack
  2021-06-20  8:14 [PATCH v3] m68k: improved switch stack handling Michael Schmitz
  2021-06-20  8:14 ` [PATCH v3 1/3] m68k: save extra registers on more syscall entry points Michael Schmitz
  2021-06-20  8:14 ` [PATCH v3 2/3] m68k: correctly handle IO worker stack frame set-up Michael Schmitz
@ 2021-06-20  8:14 ` Michael Schmitz
  2 siblings, 0 replies; 6+ messages in thread
From: Michael Schmitz @ 2021-06-20  8:14 UTC (permalink / raw)
  To: geert, linux-arch, linux-m68k; +Cc: ebiederm, torvalds, schwab, Michael Schmitz

Add 'status' field to thread_info struct to hold syscall trace
status info.

Set flag bit in thread_info->status at syscall trace entry, clear
flag bit on trace exit.

Set another flag bit on entering syscall where the full stack
frame has been saved. These flags can be checked whenever a
syscall calls ptrace_stop().

Tested on ARAnyM only - boots and survives running strace on a
binary, nothing fancy.

CC: Eric W. Biederman <ebiederm@xmission.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
---
 arch/m68k/include/asm/entry.h       | 5 +++++
 arch/m68k/include/asm/thread_info.h | 1 +
 arch/m68k/kernel/asm-offsets.c      | 1 +
 arch/m68k/kernel/entry.S            | 8 ++++++++
 4 files changed, 15 insertions(+)

diff --git a/arch/m68k/include/asm/entry.h b/arch/m68k/include/asm/entry.h
index 9b52b06..e6a5318 100644
--- a/arch/m68k/include/asm/entry.h
+++ b/arch/m68k/include/asm/entry.h
@@ -41,6 +41,11 @@
 #define ALLOWINT	(~0x700)
 #endif /* machine compilation types */
 
+#define TIS_TRACE_ON	(0x1)
+#define TIS_TRACE_OFF	(0xfe)
+#define TIS_SWITCH_STACK	(0x2)
+#define TIS_NO_SWITCH_STACK	(0xfd)
+
 #ifdef __ASSEMBLY__
 /*
  * This defines the normal kernel pt-regs layout.
diff --git a/arch/m68k/include/asm/thread_info.h b/arch/m68k/include/asm/thread_info.h
index 15a7570..a88b48b 100644
--- a/arch/m68k/include/asm/thread_info.h
+++ b/arch/m68k/include/asm/thread_info.h
@@ -29,6 +29,7 @@ struct thread_info {
 	unsigned long		flags;
 	mm_segment_t		addr_limit;	/* thread address space */
 	int			preempt_count;	/* 0 => preemptable, <0 => BUG */
+	unsigned int		status;		/* thread-synchronous flags */
 	__u32			cpu;		/* should always be 0 on m68k */
 	unsigned long		tp_value;	/* thread pointer */
 };
diff --git a/arch/m68k/kernel/asm-offsets.c b/arch/m68k/kernel/asm-offsets.c
index ccea355..ac1ec8f 100644
--- a/arch/m68k/kernel/asm-offsets.c
+++ b/arch/m68k/kernel/asm-offsets.c
@@ -41,6 +41,7 @@ int main(void)
 	/* offsets into the thread_info struct */
 	DEFINE(TINFO_PREEMPT, offsetof(struct thread_info, preempt_count));
 	DEFINE(TINFO_FLAGS, offsetof(struct thread_info, flags));
+	DEFINE(TINFO_STATUS, offsetof(struct thread_info, status));
 
 	/* offsets into the pt_regs */
 	DEFINE(PT_OFF_D0, offsetof(struct pt_regs, d0));
diff --git a/arch/m68k/kernel/entry.S b/arch/m68k/kernel/entry.S
index 0c25038..7fe0cdf 100644
--- a/arch/m68k/kernel/entry.S
+++ b/arch/m68k/kernel/entry.S
@@ -77,10 +77,14 @@ ENTRY(__sys_clone3)
 	rts
 
 ENTRY(__sys_exit)
+	movel	%curptr@(TASK_STACK),%a1
+	orb	#TIS_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	SAVE_SWITCH_STACK
 	pea	%sp@(SWITCH_STACK_SIZE)
 	jbsr	m68k_exit
 	lea	%sp@(28),%sp
+	movel	%curptr@(TASK_STACK),%a1
+	andb	#TIS_NO_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	rts
 
 ENTRY(__sys_exit_group)
@@ -200,6 +204,7 @@ ENTRY(ret_from_user_rt_signal)
 #else
 
 do_trace_entry:
+	orb	#TIS_TRACE_ON, %a1@(TINFO_STATUS+3)
 	movel	#-ENOSYS,%sp@(PT_OFF_D0)| needed for strace
 	subql	#4,%sp
 	SAVE_SWITCH_STACK
@@ -210,6 +215,7 @@ do_trace_entry:
 	cmpl	#NR_syscalls,%d0
 	jcs	syscall
 badsys:
+	andb	#TIS_TRACE_OFF, %a1@(TINFO_STATUS+3)
 	movel	#-ENOSYS,%sp@(PT_OFF_D0)
 	jra	ret_from_syscall
 
@@ -219,6 +225,8 @@ do_trace_exit:
 	jbsr	syscall_trace
 	RESTORE_SWITCH_STACK
 	addql	#4,%sp
+	movel	%curptr@(TASK_STACK),%a1
+	andb	#TIS_TRACE_OFF, %a1@(TINFO_STATUS+3)
 	jra	.Lret_from_exception
 
 ENTRY(ret_from_signal)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v3 2/3] m68k: correctly handle IO worker stack frame set-up
  2021-06-20  8:14 ` [PATCH v3 2/3] m68k: correctly handle IO worker stack frame set-up Michael Schmitz
@ 2021-06-21  3:57   ` Finn Thain
  2021-06-21 16:06     ` Linus Torvalds
  0 siblings, 1 reply; 6+ messages in thread
From: Finn Thain @ 2021-06-21  3:57 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: geert, linux-arch, linux-m68k, ebiederm, torvalds, schwab

On Sun, 20 Jun 2021, Michael Schmitz wrote:

> Patch suggested by Linus (authored, actually - replace my signoff if 
> that's not appropriate, please).
> 

When the patch author is not the message sender, the message (commit log 
entry) would normally begin with "From: Author <author@example.com>".

That way, 'git am' will automatically attribute the commit to the actual 
author instead of the message sender.

'git send-email' should generate the extra From: line automatically if you 
set the authorship on the commit appropriately (e.g. using 'git commit 
--reset-author').

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3 2/3] m68k: correctly handle IO worker stack frame set-up
  2021-06-21  3:57   ` Finn Thain
@ 2021-06-21 16:06     ` Linus Torvalds
  0 siblings, 0 replies; 6+ messages in thread
From: Linus Torvalds @ 2021-06-21 16:06 UTC (permalink / raw)
  To: Finn Thain
  Cc: Michael Schmitz, Geert Uytterhoeven, linux-arch, linux-m68k,
	Eric W. Biederman, Andreas Schwab

On Sun, Jun 20, 2021 at 8:57 PM Finn Thain <fthain@linux-m68k.org> wrote:
>
> When the patch author is not the message sender, the message (commit log
> entry) would normally begin with "From: Author <author@example.com>".

Yes,. But in this case (as, honestly, with most of the trial patches I
send out) I'm perfectly happy to not get authorship attribution.
That's particularly true for something that I can't even test, and is
to code that I don't really know all that well.

I'm a big believer in trying to make sure people get proper credit and
attribution, but I'm also the one exception to the rule.

I get too much credit already, I don't need it for the patches that I
send out that are "I think something like this should work" and then
others do the heavy lifting of actually testing them and making sure
everything is good.

                 Linus

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-06-21 16:06 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-20  8:14 [PATCH v3] m68k: improved switch stack handling Michael Schmitz
2021-06-20  8:14 ` [PATCH v3 1/3] m68k: save extra registers on more syscall entry points Michael Schmitz
2021-06-20  8:14 ` [PATCH v3 2/3] m68k: correctly handle IO worker stack frame set-up Michael Schmitz
2021-06-21  3:57   ` Finn Thain
2021-06-21 16:06     ` Linus Torvalds
2021-06-20  8:14 ` [PATCH v3 3/3] m68k: track syscalls being traced with shallow user context stack Michael Schmitz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.