* [PATCH v4 0/3] m68k: Improved switch stack handling
@ 2021-06-23  0:21 Michael Schmitz
  2021-06-23  0:21 ` [PATCH v4 1/3] m68k: save extra registers on more syscall entry points Michael Schmitz
                   ` (3 more replies)
  0 siblings, 4 replies; 37+ messages in thread
From: Michael Schmitz @ 2021-06-23  0:21 UTC (permalink / raw)
  To: geert, linux-arch, linux-m68k; +Cc: ebiederm, torvalds, schwab

m68k version of Eric's patch series 'alpha/ptrace: Improved
switch_stack handling'.

Registers d6, d7, a3-a6 are not saved on the stack by default
on every syscall entry by the m68k kernel. A separate switch
stack frame is pushed to save those registers as needed.
This leaves the majority of syscalls with only a subset of
registers on the stack, and access to unsaved registers in
those would expose or modify random stack addresses.  
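
For illustration only, the layout problem can be modelled in plain user-space C. This is a simplified sketch, not the kernel's definitions: the struct names, the SW_OFF/PT_OFF macros and sw_reg_address() are made up for the example, every slot is assumed to be 32 bits wide, and the switch stack is assumed to sit directly below pt_regs whenever it has been pushed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the extra frame: six registers plus a return PC. */
struct switch_stack_model {
	uint32_t d6, d7;
	uint32_t a3, a4, a5, a6;
	uint32_t retpc;
};

/* Hypothetical, shortened model of the registers saved on every syscall. */
struct pt_regs_model {
	uint32_t d1, d2, d3, d4, d5;
	uint32_t a0, a1, a2;
	uint32_t d0, orig_d0;
};

/* Offset of a switch-stack register relative to the start of pt_regs. */
#define SW_OFF(reg) \
	((long)offsetof(struct switch_stack_model, reg) - \
	 (long)sizeof(struct switch_stack_model))

/* Offset of a register that is always part of pt_regs. */
#define PT_OFF(reg) ((long)offsetof(struct pt_regs_model, reg))

/*
 * Address a tracer would compute for a switch-stack register. If no
 * switch stack was pushed, this address holds unrelated stack contents.
 */
static long sw_reg_address(long pt_regs_addr, long sw_off)
{
	assert(sw_off < 0);	/* switch-stack registers live below pt_regs */
	return pt_regs_addr + sw_off;
}
```

In this model SW_OFF(d6) is negative while PT_OFF(d1) is zero, so any access to d6, d7 or a3-a6 reaches below the saved pt_regs and is only meaningful when the extra frame is really there.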

Patches 1 and 2 add a switch stack for all syscalls that were
found to need one to allow ptrace access to all registers
outside of syscall entry/exit tracing, as well as kernel
worker threads. This ought to protect against accidents.

Patch 3 adds safety checks and debug output to m68k get_reg()
and put_reg() functions. Any unsafe register access during
process tracing will be prevented and reported. 

Suggestions for optimizations or improvements welcome!

Cheers,

   Michael
   
Link: https://lore.kernel.org/r/<87pmwlek8d.fsf_-_@disp2133>  




* [PATCH v4 1/3] m68k: save extra registers on more syscall entry points
  2021-06-23  0:21 [PATCH v4 0/3] m68k: Improved switch stack handling Michael Schmitz
@ 2021-06-23  0:21 ` Michael Schmitz
  2021-06-23  0:21 ` [PATCH v4 2/3] m68k: correctly handle IO worker stack frame set-up Michael Schmitz
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 37+ messages in thread
From: Michael Schmitz @ 2021-06-23  0:21 UTC (permalink / raw)
  To: geert, linux-arch, linux-m68k; +Cc: ebiederm, torvalds, schwab, Michael Schmitz

Multiple syscalls are liable to PTRACE_EVENT_* tracing and thus
require full user context saved on the kernel stack. Currently we
only save those registers that are not preserved by C code.

do_exit() calls ptrace_stop() which may require access to all
saved registers. Add code to save additional registers in the
switch_stack struct for exit and exit_group syscalls (similar
to what is already done for fork, vfork and clone3). According
to Eric's analysis, execve and execveat can be traced as well,
so have been given the same treatment.
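
The stack arithmetic used by these wrappers can be sketched in user-space C as follows. This is only a model under stated assumptions: pt_regs is assumed to sit directly above the wrapper's return address at entry, SWITCH_STACK_SIZE is assumed to cover the six saved registers plus that return address, and the names wrapper_trace and run_wrapper are invented for the example.

```c
#include <assert.h>

enum {
	REG_BYTES     = 4,	/* each saved register slot is 32 bits */
	SWITCH_REGS   = 6,	/* d6, d7, a3, a4, a5, a6 */
	RETADDR_BYTES = 4,	/* return address of the wrapper itself */
};

/* Stack positions, all relative to the stack pointer at wrapper entry. */
struct wrapper_trace {
	int pt_regs_arg;	/* pointer value pushed by pea */
	int sp_at_rts;		/* stack pointer just before rts */
};

static struct wrapper_trace run_wrapper(void)
{
	struct wrapper_trace t;
	int sp = 0;			/* points at the return address */

	sp -= SWITCH_REGS * REG_BYTES;	/* SAVE_SWITCH_STACK */

	/* pea %sp@(SWITCH_STACK_SIZE): skip six registers and the return address */
	t.pt_regs_arg = sp + SWITCH_REGS * REG_BYTES + RETADDR_BYTES;
	sp -= REG_BYTES;		/* pea pushes that pointer */

	/* jbsr to the C helper and its rts leave sp unchanged overall */

	sp += 28;			/* lea %sp@(28),%sp */
	t.sp_at_rts = sp;
	return t;
}
```

Under these assumptions the 28 in the lea instruction is the 4-byte pt_regs pointer plus the 24 bytes of saved registers, which leaves the stack pointer back at the return address for rts.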

Tested on both ARAnyM and Falcon hardware.

CC: Eric W. Biederman <ebiederm@xmission.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>

--
Changes from v2:
- drop handling of io_uring_setup syscall

Changes from v1:

- added exec, execve and io_uring_setup syscalls
- save extra registers around kworker thread calls
---
 arch/m68k/kernel/entry.S              | 28 ++++++++++++++++++++++++++++
 arch/m68k/kernel/process.c            | 33 +++++++++++++++++++++++++++++++++
 arch/m68k/kernel/syscalls/syscall.tbl |  8 ++++----
 3 files changed, 65 insertions(+), 4 deletions(-)

diff --git a/arch/m68k/kernel/entry.S b/arch/m68k/kernel/entry.S
index 9dd76fb..275452a 100644
--- a/arch/m68k/kernel/entry.S
+++ b/arch/m68k/kernel/entry.S
@@ -76,6 +76,34 @@ ENTRY(__sys_clone3)
 	lea	%sp@(28),%sp
 	rts
 
+ENTRY(__sys_exit)
+	SAVE_SWITCH_STACK
+	pea	%sp@(SWITCH_STACK_SIZE)
+	jbsr	m68k_exit
+	lea	%sp@(28),%sp
+	rts
+
+ENTRY(__sys_exit_group)
+	SAVE_SWITCH_STACK
+	pea	%sp@(SWITCH_STACK_SIZE)
+	jbsr	m68k_exit_group
+	lea	%sp@(28),%sp
+	rts
+
+ENTRY(__sys_execve)
+	SAVE_SWITCH_STACK
+	pea	%sp@(SWITCH_STACK_SIZE)
+	jbsr	m68k_execve
+	lea	%sp@(28),%sp
+	rts
+
+ENTRY(__sys_execveat)
+	SAVE_SWITCH_STACK
+	pea	%sp@(SWITCH_STACK_SIZE)
+	jbsr	m68k_execveat
+	lea	%sp@(28),%sp
+	rts
+
 ENTRY(sys_sigreturn)
 	SAVE_SWITCH_STACK
 	movel	%sp,%sp@-		  | switch_stack pointer
diff --git a/arch/m68k/kernel/process.c b/arch/m68k/kernel/process.c
index da83cc8..6f2f2ab 100644
--- a/arch/m68k/kernel/process.c
+++ b/arch/m68k/kernel/process.c
@@ -138,6 +138,39 @@ asmlinkage int m68k_clone3(struct pt_regs *regs)
 	return sys_clone3((struct clone_args __user *)regs->d1, regs->d2);
 }
 
+/*
+ * Because extra registers are saved on the stack after the sys_exit()
+ * arguments, this C wrapper extracts them from pt_regs * and then calls the
+ * generic sys_exit() implementation.
+ */
+asmlinkage int m68k_exit(struct pt_regs *regs)
+{
+	return sys_exit(regs->d1);
+}
+
+/* Same for sys_exit_group ... */
+asmlinkage int m68k_exit_group(struct pt_regs *regs)
+{
+	return sys_exit_group(regs->d1);
+}
+
+/* Same for sys_execve ... */
+asmlinkage int m68k_execve(struct pt_regs *regs)
+{
+	return sys_execve((const char __user *)regs->d1,
+			(const char __user *const __user *)regs->d2,
+			(const char __user *const __user *)regs->d3);
+}
+
+/* Same for sys_execveat ... */
+asmlinkage int m68k_execveat(struct pt_regs *regs)
+{
+	return sys_execveat(regs->d1, (const char __user *)regs->d2,
+			(const char __user *const __user *)regs->d3,
+			(const char __user *const __user *)regs->d4,
+			regs->d5);
+}
+
 int copy_thread(unsigned long clone_flags, unsigned long usp, unsigned long arg,
 		struct task_struct *p, unsigned long tls)
 {
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index 0dd019d..13dd02e 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -8,7 +8,7 @@
 # The <abi> is always "common" for this file
 #
 0	common	restart_syscall			sys_restart_syscall
-1	common	exit				sys_exit
+1	common	exit				__sys_exit
 2	common	fork				__sys_fork
 3	common	read				sys_read
 4	common	write				sys_write
@@ -18,7 +18,7 @@
 8	common	creat				sys_creat
 9	common	link				sys_link
 10	common	unlink				sys_unlink
-11	common	execve				sys_execve
+11	common	execve				__sys_execve
 12	common	chdir				sys_chdir
 13	common	time				sys_time32
 14	common	mknod				sys_mknod
@@ -254,7 +254,7 @@
 244	common	io_submit			sys_io_submit
 245	common	io_cancel			sys_io_cancel
 246	common	fadvise64			sys_fadvise64
-247	common	exit_group			sys_exit_group
+247	common	exit_group			__sys_exit_group
 248	common	lookup_dcookie			sys_lookup_dcookie
 249	common	epoll_create			sys_epoll_create
 250	common	epoll_ctl			sys_epoll_ctl
@@ -362,7 +362,7 @@
 352	common	getrandom			sys_getrandom
 353	common	memfd_create			sys_memfd_create
 354	common	bpf				sys_bpf
-355	common	execveat			sys_execveat
+355	common	execveat			__sys_execveat
 356	common	socket				sys_socket
 357	common	socketpair			sys_socketpair
 358	common	bind				sys_bind
-- 
2.7.4



* [PATCH v4 2/3] m68k: correctly handle IO worker stack frame set-up
  2021-06-23  0:21 [PATCH v4 0/3] m68k: Improved switch stack handling Michael Schmitz
  2021-06-23  0:21 ` [PATCH v4 1/3] m68k: save extra registers on more syscall entry points Michael Schmitz
@ 2021-06-23  0:21 ` Michael Schmitz
  2021-06-23  0:21 ` [PATCH v4 3/3] m68k: track syscalls being traced with shallow user context stack Michael Schmitz
  2021-07-15 13:29 ` [PATCH v4 0/3] m68k: Improved switch stack handling Eric W. Biederman
  3 siblings, 0 replies; 37+ messages in thread
From: Michael Schmitz @ 2021-06-23  0:21 UTC (permalink / raw)
  To: geert, linux-arch, linux-m68k; +Cc: ebiederm, torvalds, schwab, Michael Schmitz

Create full stack frame plus switch stack frame in copy_thread()
when creating a kernel worker thread. The switch stack frame will
then be consumed in resume(), leaving a full stack frame of zero
content for ptrace to play with.

Change ret_from_exception switch stack handling to restore the
switch stack (and pop the return address restored by that) after
return from kernel worker threads.
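
A rough user-space model of the resulting kernel thread stack is shown below. It is a sketch under assumptions, not the copy_thread() code: the struct names and kthread_layout_init() are hypothetical, registers are modelled as 32-bit values, and heap memory stands in for the top of the kernel stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical 32-bit models of the frames involved. */
struct sw_frame { uint32_t d6, d7, a3, a4, a5, a6, retpc; };
struct regs_frame { uint32_t d1, d2, d3, d4, d5, a0, a1, a2, d0, orig_d0; };
struct fork_frame_sketch {
	struct sw_frame sw;
	struct regs_frame regs;
};

struct kthread_layout {
	unsigned char *base;		/* start of the modelled stack area */
	struct sw_frame *kstp;		/* extra frame, consumed on first switch */
	struct fork_frame_sketch *frame;/* stays zeroed for ptrace to look at */
};

/* Lay out an extra switch stack directly below a zeroed fork frame. */
static int kthread_layout_init(struct kthread_layout *l, uint32_t fn,
			       uint32_t arg, uint32_t retpc)
{
	size_t total = sizeof(struct sw_frame) +
		       sizeof(struct fork_frame_sketch);

	l->base = malloc(total);
	if (!l->base)
		return -1;
	memset(l->base, 0, total);	/* zero both frames in one go */

	l->kstp = (struct sw_frame *)l->base;
	l->frame = (struct fork_frame_sketch *)
			(l->base + sizeof(struct sw_frame));

	l->kstp->a3 = fn;		/* thread function */
	l->kstp->d7 = arg;		/* its argument */
	l->kstp->retpc = retpc;		/* where the first switch returns to */
	return 0;
}
```

The point of the model is that the function, argument and return PC live only in the lower frame, so once that frame has been popped the fork frame above it is entirely zero.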

Patch as suggested by Linus (comments added by me).

Tested on both ARAnyM and Falcon hardware.

CC: Eric W. Biederman <ebiederm@xmission.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
---
 arch/m68k/kernel/entry.S   | 11 +++++++++++
 arch/m68k/kernel/process.c | 17 +++++++++++++----
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/arch/m68k/kernel/entry.S b/arch/m68k/kernel/entry.S
index 275452a..0c25038 100644
--- a/arch/m68k/kernel/entry.S
+++ b/arch/m68k/kernel/entry.S
@@ -147,6 +147,15 @@ ENTRY(ret_from_fork)
 	addql	#4,%sp
 	jra	ret_from_exception
 
+	| A kernel thread will jump here directly from resume,
+	| with the stack containing the full register state
+	| (pt_regs and switch_stack).
+	|
+	| The argument will be in d7, and the kernel function
+	| to call will be in a3.
+	|
+	| If the kernel function returns, we want to return
+	| to user space - it has done a kernel_execve().
 ENTRY(ret_from_kernel_thread)
 	| a3 contains the kernel thread payload, d7 - its argument
 	movel	%d1,%sp@-
@@ -154,6 +163,8 @@ ENTRY(ret_from_kernel_thread)
 	movel	%d7,(%sp)
 	jsr	%a3@
 	addql	#4,%sp
+	RESTORE_SWITCH_STACK
+	addql	#4,%sp
 	jra	ret_from_exception
 
 #if defined(CONFIG_COLDFIRE) || !defined(CONFIG_MMU)
diff --git a/arch/m68k/kernel/process.c b/arch/m68k/kernel/process.c
index 6f2f2ab..d371edf 100644
--- a/arch/m68k/kernel/process.c
+++ b/arch/m68k/kernel/process.c
@@ -190,14 +190,23 @@ int copy_thread(unsigned long clone_flags, unsigned long usp, unsigned long arg,
 	 */
 	p->thread.fs = get_fs().seg;
 
+	/* kernel threads require an additional switch stack,
+	 * which is then consumed by resume() once we switch to
+	 * the new thread!
+	 */
 	if (unlikely(p->flags & (PF_KTHREAD | PF_IO_WORKER))) {
 		/* kernel thread */
-		memset(frame, 0, sizeof(struct fork_frame));
+		struct switch_stack *kstp = &frame->sw - 1;
+
+		/* kernel thread - a kernel-side switch-stack and the full user fork_frame */
+		memset(kstp, 0, sizeof(struct switch_stack) + sizeof(struct fork_frame));
+
 		frame->regs.sr = PS_S;
-		frame->sw.a3 = usp; /* function */
-		frame->sw.d7 = arg;
-		frame->sw.retpc = (unsigned long)ret_from_kernel_thread;
+		kstp->a3 = usp; /* function */
+		kstp->d7 = arg;
+		kstp->retpc = (unsigned long)ret_from_kernel_thread;
 		p->thread.usp = 0;
+		p->thread.ksp = (unsigned long)kstp;
 		return 0;
 	}
 	memcpy(frame, container_of(current_pt_regs(), struct fork_frame, regs),
-- 
2.7.4



* [PATCH v4 3/3] m68k: track syscalls being traced with shallow user context stack
  2021-06-23  0:21 [PATCH v4 0/3] m68k: Improved switch stack handling Michael Schmitz
  2021-06-23  0:21 ` [PATCH v4 1/3] m68k: save extra registers on more syscall entry points Michael Schmitz
  2021-06-23  0:21 ` [PATCH v4 2/3] m68k: correctly handle IO worker stack frame set-up Michael Schmitz
@ 2021-06-23  0:21 ` Michael Schmitz
  2021-07-25 10:05   ` Geert Uytterhoeven
  2021-07-15 13:29 ` [PATCH v4 0/3] m68k: Improved switch stack handling Eric W. Biederman
  3 siblings, 1 reply; 37+ messages in thread
From: Michael Schmitz @ 2021-06-23  0:21 UTC (permalink / raw)
  To: geert, linux-arch, linux-m68k; +Cc: ebiederm, torvalds, schwab, Michael Schmitz

Add 'status' field to thread_info struct to hold syscall trace
status info.

Set flag bit in thread_info->status at syscall trace entry, clear
flag bit on trace exit.

Set another flag bit on entering syscall where the full stack
frame has been saved. These flags can be checked whenever a
syscall calls ptrace_stop().

Check flag bits in get_reg()/put_reg() and prevent access to
registers that are saved on the switch stack, in case the
syscall did not actually save these registers on the switch
stack.
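
The intended decision can be sketched as a small user-space C predicate. This is a simplified model rather than the ptrace.c code: reg_access_allowed() is a hypothetical helper, the bit numbers follow the definitions added to entry.h, and switch-stack registers are assumed to be the ones with a negative offset relative to pt_regs.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit numbers as introduced by this patch. */
enum {
	TIS_TRACING       = 0,	/* set between syscall trace entry and exit */
	TIS_ALLREGS_SAVED = 1,	/* set while a switch stack is on the stack */
};

/*
 * Hypothetical helper: may the tracer touch the register at offset 'off'
 * (relative to pt_regs) given the thread's status word?
 */
static bool reg_access_allowed(unsigned int status, long off)
{
	bool in_switch_stack = off < 0;	/* assumption: pt_regs starts at 0 */
	bool tracing = status & (1u << TIS_TRACING);
	bool all_saved = status & (1u << TIS_ALLREGS_SAVED);

	/* Refuse only a traced syscall that never saved the extra registers. */
	return !(in_switch_stack && tracing && !all_saved);
}
```

With this predicate, registers held in pt_regs are always accessible, and switch-stack registers are refused only while a traced syscall is running without the extra frame.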

Tested on ARAnyM only - boots and survives running strace on a
binary, nothing fancy.

CC: Eric W. Biederman <ebiederm@xmission.com>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>

--

Changes from v3:

- complete flag bit handling for all syscalls that use a m68k
  wrapper
- add flag checking code to get_reg()/put_reg() in m68k ptrace.c
---
 arch/m68k/include/asm/entry.h       | 10 +++++++
 arch/m68k/include/asm/thread_info.h |  1 +
 arch/m68k/kernel/asm-offsets.c      |  1 +
 arch/m68k/kernel/entry.S            | 54 +++++++++++++++++++++++++++++++++++++
 arch/m68k/kernel/ptrace.c           | 44 +++++++++++++++++++++++++-----
 5 files changed, 104 insertions(+), 6 deletions(-)

diff --git a/arch/m68k/include/asm/entry.h b/arch/m68k/include/asm/entry.h
index 9b52b06..37ba65b 100644
--- a/arch/m68k/include/asm/entry.h
+++ b/arch/m68k/include/asm/entry.h
@@ -41,6 +41,16 @@
 #define ALLOWINT	(~0x700)
 #endif /* machine compilation types */
 
+#define TIS_TRACING		0
+#define TIS_ALLREGS_SAVED	1
+#define _TIS_TRACING		(1<<TIS_TRACING)
+#define _TIS_ALLREGS_SAVED	(1<<TIS_ALLREGS_SAVED)
+
+#define TIS_TRACE_ON		_TIS_TRACING
+#define TIS_TRACE_OFF		(~(_TIS_TRACING))
+#define TIS_SWITCH_STACK	_TIS_ALLREGS_SAVED
+#define TIS_NO_SWITCH_STACK	(~(_TIS_ALLREGS_SAVED))
+
 #ifdef __ASSEMBLY__
 /*
  * This defines the normal kernel pt-regs layout.
diff --git a/arch/m68k/include/asm/thread_info.h b/arch/m68k/include/asm/thread_info.h
index 15a7570..a88b48b 100644
--- a/arch/m68k/include/asm/thread_info.h
+++ b/arch/m68k/include/asm/thread_info.h
@@ -29,6 +29,7 @@ struct thread_info {
 	unsigned long		flags;
 	mm_segment_t		addr_limit;	/* thread address space */
 	int			preempt_count;	/* 0 => preemptable, <0 => BUG */
+	unsigned int		status;		/* thread-synchronous flags */
 	__u32			cpu;		/* should always be 0 on m68k */
 	unsigned long		tp_value;	/* thread pointer */
 };
diff --git a/arch/m68k/kernel/asm-offsets.c b/arch/m68k/kernel/asm-offsets.c
index ccea355..ac1ec8f 100644
--- a/arch/m68k/kernel/asm-offsets.c
+++ b/arch/m68k/kernel/asm-offsets.c
@@ -41,6 +41,7 @@ int main(void)
 	/* offsets into the thread_info struct */
 	DEFINE(TINFO_PREEMPT, offsetof(struct thread_info, preempt_count));
 	DEFINE(TINFO_FLAGS, offsetof(struct thread_info, flags));
+	DEFINE(TINFO_STATUS, offsetof(struct thread_info, status));
 
 	/* offsets into the pt_regs */
 	DEFINE(PT_OFF_D0, offsetof(struct pt_regs, d0));
diff --git a/arch/m68k/kernel/entry.S b/arch/m68k/kernel/entry.S
index 0c25038..4cc24d5 100644
--- a/arch/m68k/kernel/entry.S
+++ b/arch/m68k/kernel/entry.S
@@ -51,75 +51,115 @@
 
 .text
 ENTRY(__sys_fork)
+	movel	%curptr@(TASK_STACK),%a1
+	orb	#TIS_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	SAVE_SWITCH_STACK
 	jbsr	sys_fork
 	lea     %sp@(24),%sp
+	movel	%curptr@(TASK_STACK),%a1
+	andb	#TIS_NO_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	rts
 
 ENTRY(__sys_clone)
+	movel	%curptr@(TASK_STACK),%a1
+	orb	#TIS_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	SAVE_SWITCH_STACK
 	pea	%sp@(SWITCH_STACK_SIZE)
 	jbsr	m68k_clone
 	lea     %sp@(28),%sp
+	movel	%curptr@(TASK_STACK),%a1
+	andb	#TIS_NO_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	rts
 
 ENTRY(__sys_vfork)
+	movel	%curptr@(TASK_STACK),%a1
+	orb	#TIS_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	SAVE_SWITCH_STACK
 	jbsr	sys_vfork
 	lea     %sp@(24),%sp
+	movel	%curptr@(TASK_STACK),%a1
+	andb	#TIS_NO_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	rts
 
 ENTRY(__sys_clone3)
+	movel	%curptr@(TASK_STACK),%a1
+	orb	#TIS_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	SAVE_SWITCH_STACK
 	pea	%sp@(SWITCH_STACK_SIZE)
 	jbsr	m68k_clone3
 	lea	%sp@(28),%sp
+	movel	%curptr@(TASK_STACK),%a1
+	andb	#TIS_NO_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	rts
 
 ENTRY(__sys_exit)
+	movel	%curptr@(TASK_STACK),%a1
+	orb	#TIS_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	SAVE_SWITCH_STACK
 	pea	%sp@(SWITCH_STACK_SIZE)
 	jbsr	m68k_exit
 	lea	%sp@(28),%sp
+	movel	%curptr@(TASK_STACK),%a1
+	andb	#TIS_NO_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	rts
 
 ENTRY(__sys_exit_group)
+	movel	%curptr@(TASK_STACK),%a1
+	orb	#TIS_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	SAVE_SWITCH_STACK
 	pea	%sp@(SWITCH_STACK_SIZE)
 	jbsr	m68k_exit_group
 	lea	%sp@(28),%sp
+	movel	%curptr@(TASK_STACK),%a1
+	andb	#TIS_NO_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	rts
 
 ENTRY(__sys_execve)
+	movel	%curptr@(TASK_STACK),%a1
+	orb	#TIS_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	SAVE_SWITCH_STACK
 	pea	%sp@(SWITCH_STACK_SIZE)
 	jbsr	m68k_execve
 	lea	%sp@(28),%sp
+	movel	%curptr@(TASK_STACK),%a1
+	andb	#TIS_NO_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	rts
 
 ENTRY(__sys_execveat)
+	movel	%curptr@(TASK_STACK),%a1
+	orb	#TIS_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	SAVE_SWITCH_STACK
 	pea	%sp@(SWITCH_STACK_SIZE)
 	jbsr	m68k_execveat
 	lea	%sp@(28),%sp
+	movel	%curptr@(TASK_STACK),%a1
+	andb	#TIS_NO_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	rts
 
 ENTRY(sys_sigreturn)
+	movel	%curptr@(TASK_STACK),%a1
+	orb	#TIS_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	SAVE_SWITCH_STACK
 	movel	%sp,%sp@-		  | switch_stack pointer
 	pea	%sp@(SWITCH_STACK_SIZE+4) | pt_regs pointer
 	jbsr	do_sigreturn
 	addql	#8,%sp
 	RESTORE_SWITCH_STACK
+	movel	%curptr@(TASK_STACK),%a1
+	andb	#TIS_NO_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	rts
 
 ENTRY(sys_rt_sigreturn)
+	movel	%curptr@(TASK_STACK),%a1
+	orb	#TIS_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	SAVE_SWITCH_STACK
 	movel	%sp,%sp@-		  | switch_stack pointer
 	pea	%sp@(SWITCH_STACK_SIZE+4) | pt_regs pointer
 	jbsr	do_rt_sigreturn
 	addql	#8,%sp
 	RESTORE_SWITCH_STACK
+	movel	%curptr@(TASK_STACK),%a1
+	andb	#TIS_NO_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	rts
 
 ENTRY(buserr)
@@ -200,25 +240,33 @@ ENTRY(ret_from_user_rt_signal)
 #else
 
 do_trace_entry:
+	orb	#TIS_TRACE_ON, %a1@(TINFO_STATUS+3)
 	movel	#-ENOSYS,%sp@(PT_OFF_D0)| needed for strace
 	subql	#4,%sp
+	orb	#TIS_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	SAVE_SWITCH_STACK
 	jbsr	syscall_trace
 	RESTORE_SWITCH_STACK
+	andb	#TIS_NO_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	addql	#4,%sp
 	movel	%sp@(PT_OFF_ORIG_D0),%d0
 	cmpl	#NR_syscalls,%d0
 	jcs	syscall
 badsys:
+	andb	#TIS_TRACE_OFF, %a1@(TINFO_STATUS+3)
 	movel	#-ENOSYS,%sp@(PT_OFF_D0)
 	jra	ret_from_syscall
 
 do_trace_exit:
 	subql	#4,%sp
+	orb	#TIS_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	SAVE_SWITCH_STACK
 	jbsr	syscall_trace
 	RESTORE_SWITCH_STACK
+	andb	#TIS_NO_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	addql	#4,%sp
+	movel	%curptr@(TASK_STACK),%a1
+	andb	#TIS_TRACE_OFF, %a1@(TINFO_STATUS+3)
 	jra	.Lret_from_exception
 
 ENTRY(ret_from_signal)
@@ -227,6 +275,8 @@ ENTRY(ret_from_signal)
 	jge	1f
 	jbsr	syscall_trace
 1:	RESTORE_SWITCH_STACK
+	movel	%curptr@(TASK_STACK),%a1
+	andb	#TIS_TRACE_OFF, %a1@(TINFO_STATUS+3)
 	addql	#4,%sp
 /* on 68040 complete pending writebacks if any */
 #ifdef CONFIG_M68040
@@ -303,11 +353,15 @@ exit_work:
 do_signal_return:
 	|andw	#ALLOWINT,%sr
 	subql	#4,%sp			| dummy return address
+	movel	%curptr@(TASK_STACK),%a1
+	orb	#TIS_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	SAVE_SWITCH_STACK
 	pea	%sp@(SWITCH_STACK_SIZE)
 	bsrl	do_notify_resume
 	addql	#4,%sp
 	RESTORE_SWITCH_STACK
+	movel	%curptr@(TASK_STACK),%a1
+	andb	#TIS_NO_SWITCH_STACK, %a1@(TINFO_STATUS+3)
 	addql	#4,%sp
 	jbra	resume_userspace
 
diff --git a/arch/m68k/kernel/ptrace.c b/arch/m68k/kernel/ptrace.c
index 94b3b27..ae4ef61 100644
--- a/arch/m68k/kernel/ptrace.c
+++ b/arch/m68k/kernel/ptrace.c
@@ -68,6 +68,12 @@ static const int regoff[] = {
 	[18]	= PT_REG(pc),
 };
 
+static inline int test_ti_thread_status(struct thread_info *ti, int flag)
+{
+	return test_bit(flag, (unsigned long *)&ti->status);
+}
+
+
 /*
  * Get contents of register REGNO in task TASK.
  */
@@ -77,9 +83,22 @@ static inline long get_reg(struct task_struct *task, int regno)
 
 	if (regno == PT_USP)
 		addr = &task->thread.usp;
-	else if (regno < ARRAY_SIZE(regoff))
-		addr = (unsigned long *)(task->thread.esp0 + regoff[regno]);
-	else
+	else if (regno < ARRAY_SIZE(regoff)) {
+		int off = regoff[regno];
+
+		if (WARN_ON_ONCE((off < PT_REG(d1)) &&
+			test_ti_thread_status(task_thread_info(task), TIS_TRACING) &&
+			!test_ti_thread_status(task_thread_info(task),
+					     TIS_ALLREGS_SAVED))) {
+			unsigned long *addr_d0;
+
+			addr_d0 = (unsigned long *)(task->thread.esp0 + regoff[16]);
+			pr_err("register read from incomplete stack, regno %d offs %d orig_d0 %lx\n",
+				regno, off, *addr_d0);
+			return 0;
+		}
+		addr = (unsigned long *)(task->thread.esp0 + off);
+	} else
 		return 0;
 	/* Need to take stkadj into account. */
 	if (regno == PT_SR || regno == PT_PC) {
@@ -102,9 +121,22 @@ static inline int put_reg(struct task_struct *task, int regno,
 
 	if (regno == PT_USP)
 		addr = &task->thread.usp;
-	else if (regno < ARRAY_SIZE(regoff))
-		addr = (unsigned long *)(task->thread.esp0 + regoff[regno]);
-	else
+	else if (regno < ARRAY_SIZE(regoff)) {
+		int off = regoff[regno];
+
+		if (WARN_ON_ONCE((off < PT_REG(d1)) &&
+			test_ti_thread_status(task_thread_info(task), TIS_TRACING) &&
+			!test_ti_thread_status(task_thread_info(task),
+					     TIS_ALLREGS_SAVED))) {
+			unsigned long *addr_d0;
+
+			addr_d0 = (unsigned long *)(task->thread.esp0 + regoff[16]);
+			pr_err("register write to incomplete stack, regno %d offs %d orig_d0 %lx\n",
+				regno, off, *addr_d0);
+			return -1;
+		}
+		addr = (unsigned long *)(task->thread.esp0 + off);
+	} else
 		return -1;
 	/* Need to take stkadj into account. */
 	if (regno == PT_SR || regno == PT_PC) {
-- 
2.7.4



* Re: [PATCH v4 0/3] m68k: Improved switch stack handling
  2021-06-23  0:21 [PATCH v4 0/3] m68k: Improved switch stack handling Michael Schmitz
                   ` (2 preceding siblings ...)
  2021-06-23  0:21 ` [PATCH v4 3/3] m68k: track syscalls being traced with shallow user context stack Michael Schmitz
@ 2021-07-15 13:29 ` Eric W. Biederman
  2021-07-15 23:10   ` Michael Schmitz
  3 siblings, 1 reply; 37+ messages in thread
From: Eric W. Biederman @ 2021-07-15 13:29 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: geert, linux-arch, linux-m68k, torvalds, schwab

Michael Schmitz <schmitzmic@gmail.com> writes:

> m68k version of Eric's patch series 'alpha/ptrace: Improved
> switch_stack handling'.
>
> Registers d6, d7, a3-a6 are not saved on the stack by default
> on every syscall entry by the m68k kernel. A separate switch
> stack frame is pushed to save those registers as needed.
> This leaves the majority of syscalls with only a subset of
> registers on the stack, and access to unsaved registers in
> those would expose or modify random stack addresses.  
>
> Patch 1 and 2 add a switch stack for all syscalls that were
> found to need one to allow ptrace access to all registers
> outside of syscall entry/exit tracing, as well as kernel
> worker threads. This ought to protect against accidents.
>
> Patch 3 adds safety checks and debug output to m68k get_reg()
> and put_reg() functions. Any unsafe register access during
> process tracing will be prevented and reported. 
>
> Suggestions for optimizations or improvements welcome!
>
> Cheers,
>
>    Michael
>    
> Link: https://lore.kernel.org/r/<87pmwlek8d.fsf_-_@disp2133>

I have been digging into this some more and I have found one place
that I am having a challenge dealing with.

In arch/m68k/fpsp040/skeleton.S there is an assembly version of
copy_from_user that calls fpsp040_die when the bytes can not be read.

Now fpsp040_die is just:

/*
 * This function is called if an error occur while accessing
 * user-space from the fpsp040 code.
 */
asmlinkage void fpsp040_die(void)
{
	do_exit(SIGSEGV);
}


The problem here is the instruction emulation performed in the fpsp040
code performs a very minimal saving of registers.  I don't think even
the normal system call entry point registers that are saved are present
at that point.

Is there any chance you can help me figure out how to get a stack frame
with all of the registers present before fpsp040_die is called?

Eric


* Re: [PATCH v4 0/3] m68k: Improved switch stack handling
  2021-07-15 13:29 ` [PATCH v4 0/3] m68k: Improved switch stack handling Eric W. Biederman
@ 2021-07-15 23:10   ` Michael Schmitz
  2021-07-17  5:38     ` Michael Schmitz
  0 siblings, 1 reply; 37+ messages in thread
From: Michael Schmitz @ 2021-07-15 23:10 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: geert, linux-arch, linux-m68k, torvalds, schwab

Eric,

On 16/07/21 1:29 am, Eric W. Biederman wrote:
>
> I have been digging into this some more and I have found one place
> that I am having a challenge dealing with.
>
> In arch/m68k/fpsp040/skeleton.S there is an assembly version of
> copy_from_user that calls fpsp040_die when the bytes can not be read.
>
> Now fpsp040_die is just:
>
> /*
>   * This function is called if an error occur while accessing
>   * user-space from the fpsp040 code.
>   */
> asmlinkage void fpsp040_die(void)
> {
> 	do_exit(SIGSEGV);
> }

In other places (bus error handlers) we have

force_sig(SIGSEGV);

or

force_sig_fault(sig, si_code, addr);

(the latter for floating point traps from FPU hardware). Would that be 
any better?

>
> The problem here is the instruction emulation performed in the fpsp040
> code performs a very minimal saving of registers.  I don't think even
> the normal system call entry point registers that are saved are present
> at that point.
>
> Is there any chance you can help me figure out how to get a stack frame
> with all of the registers present before fpsp040_die is called?

I suppose adding the following code (untested) to entry.S:

ENTRY(fpsp040_die)
         SAVE_ALL_INT
         jbsr    fpsp040_die_c
         jra     ret_from_exception

along with renaming above C entry point to fpsp040_die_c would add the 
basic saved registers, but these would not necessarily reflect the state 
of the processor when the fpsp040 trap was called. Is that what you're 
after?

To add the rest of the switch stack (again, won't reflect state before 
entering fpsp040), try:

ENTRY(fpsp040_die)
         SAVE_ALL_INT

         SAVE_SWITCH_STACK

         jbsr    fpsp040_die_c

         addql   #24,%sp

         jra     ret_from_exception


If you need the registers saved at fpsp040 entry, the only way I can see 
is to change the code in arch/m68k/kernel/vectors.c to use a common fpsp 
trap entry point that saves state, before jumping to the desired fpsp040 
entry point using a FPU trap table. Just like we do for system calls.

Cheers,

     Michael


> Eric


* Re: [PATCH v4 0/3] m68k: Improved switch stack handling
  2021-07-15 23:10   ` Michael Schmitz
@ 2021-07-17  5:38     ` Michael Schmitz
  2021-07-17 18:52       ` Eric W. Biederman
  0 siblings, 1 reply; 37+ messages in thread
From: Michael Schmitz @ 2021-07-17  5:38 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: geert, linux-arch, linux-m68k, torvalds, schwab

On 16.07.2021 at 11:10, Michael Schmitz wrote:
> Eric,
>
> On 16/07/21 1:29 am, Eric W. Biederman wrote:
>>
>> I have been digging into this some more and I have found one place
>> that I am having a challenge dealing with.
>>
>> In arch/m68k/fpsp040/skeleton.S there is an assembly version of
>> copy_from_user that calls fpsp040_die when the bytes can not be read.
>>
>> Now fpsp040_die is just:
>>
>> /*
>>   * This function is called if an error occur while accessing
>>   * user-space from the fpsp040 code.
>>   */
>> asmlinkage void fpsp040_die(void)
>> {
>>     do_exit(SIGSEGV);
>> }
>> The problem here is the instruction emulation performed in the fpsp040
>> code performs a very minimal saving of registers.  I don't think even
>> the normal system call entry point registers that are saved are present
>> at that point.
>>
>> Is there any chance you can help me figure out how to get a stack frame
>> with all of the registers present before fpsp040_die is called?
>
> I suppose adding the following code (untested) to entry.S:
>
> ENTRY(fpsp040_die)
>         SAVE_ALL_INT
>         jbsr    fpsp040_die_c
>         jra     ret_from_exception
>
> along with renaming above C entry point to fpsp040_die_c would add the
> basic saved registers, but these would not necessarily reflect the state
> of the processor when the fpsp040 trap was called. Is that what you're
> after?

I should have looked more closely at skeleton.S - most FPU exceptions 
handled there call trap_c the same way as is done for generic traps, 
i.e. SAVE_ALL_INT before, ret_from_exception after.

Instead of adding code to entry.S, much better to add it in skeleton.S. 
I'll try to come up with a way to test this code path (calling 
fpsp040_die from the dz exception handler seems much the easiest way) to
make sure this doesn't have side effects.

Does do_exit() ever return?

Cheers,

	Michael


* Re: [PATCH v4 0/3] m68k: Improved switch stack handling
  2021-07-17  5:38     ` Michael Schmitz
@ 2021-07-17 18:52       ` Eric W. Biederman
  2021-07-17 20:09         ` Michael Schmitz
  0 siblings, 1 reply; 37+ messages in thread
From: Eric W. Biederman @ 2021-07-17 18:52 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: geert, linux-arch, linux-m68k, torvalds, schwab

Michael Schmitz <schmitzmic@gmail.com> writes:

> On 16.07.2021 at 11:10, Michael Schmitz wrote:
>> Eric,
>>
>> On 16/07/21 1:29 am, Eric W. Biederman wrote:
>>>
>>> I have been digging into this some more and I have found one place
>>> that I am having a challenge dealing with.
>>>
>>> In arch/m68k/fpsp040/skeleton.S there is an assembly version of
>>> copy_from_user that calls fpsp040_die when the bytes can not be read.
>>>
>>> Now fpsp040_die is just:
>>>
>>> /*
>>>   * This function is called if an error occur while accessing
>>>   * user-space from the fpsp040 code.
>>>   */
>>> asmlinkage void fpsp040_die(void)
>>> {
>>>     do_exit(SIGSEGV);
>>> }
>>> The problem here is the instruction emulation performed in the fpsp040
>>> code performs a very minimal saving of registers.  I don't think even
>>> the normal system call entry point registers that are saved are present
>>> at that point.
>>>
>>> Is there any chance you can help me figure out how to get a stack frame
>>> with all of the registers present before fpsp040_die is called?
>>
>> I suppose adding the following code (untested) to entry.S:
>>
>> ENTRY(fpsp040_die)
>>         SAVE_ALL_INT
>>         jbsr    fpsp040_die_c
>>         jra     ret_from_exception
>>
>> along with renaming above C entry point to fpsp040_die_c would add the
>> basic saved registers, but these would not necessarily reflect the state
>> of the processor when the fpsp040 trap was called. Is that what you're
>> after?
>
> I should have looked more closely at skeleton.S - most FPU exceptions
> handled there call trap_c the same way as is done for generic traps,
> i.e. SAVE_ALL_INT before, ret_from_exception after.
>
> Instead of adding code to entry.S, much better to add it in
> skeleton.S. I'll try to come up with a way to test this code path
> (calling fpsp040_die from the dz exception handler seems much the
> easiest way) to make sure this doesn't have side effects.
>
> Does do_exit() ever return?

No.  The function do_exit never returns.

If it is not too much difficulty I would be in favor of having the code
do force_sigsegv(SIGSEGV), instead of calling do_exit directly.

Looking at that code I have not been able to figure out the call paths
that get into skeleton.S.  I am not certain saving all of the registers
on the exceptions that reach there makes sense.  In practice I suspect
taking an exception is much more expensive than saving the registers so it
might not make any difference.  But this definitely looks like code that
is performance sensitive.

My sense when I was reading through skeleton.S was just one or two
registers were saved before the instruction emulation was called.

Eric



* Re: [PATCH v4 0/3] m68k: Improved switch stack handling
  2021-07-17 18:52       ` Eric W. Biederman
@ 2021-07-17 20:09         ` Michael Schmitz
  2021-07-17 23:04           ` Michael Schmitz
  0 siblings, 1 reply; 37+ messages in thread
From: Michael Schmitz @ 2021-07-17 20:09 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: geert, linux-arch, linux-m68k, torvalds, schwab

Hi Eric,

On 18.07.2021 at 06:52, Eric W. Biederman wrote:
>> I should have looked more closely at skeleton.S - most FPU exceptions
>> handled there call trap_c the same way as is done for generic traps,
>> i.e. SAVE_ALL_INT before, ret_from_exception after.
>>
>> Instead of adding code to entry.S, much better to add it in
>> skeleton.S. I'll try to come up with a way to test this code path
>> (calling fpsp040_die from the dz exception handler seems much the
>> easiest way) to make sure this doesn't have side effects.
>>
>> Does do_exit() ever return?
>
> No.  The function do_exit never returns.

Fine - nothing to worry about as regards restoring the stack pointer 
correctly then.

> If it is not too much difficulty I would be in favor of having the code
> do force_sigsegv(SIGSEGV), instead of calling do_exit directly.

That _would_ force a return, right? The exception handling in skeleton.S 
won't be set up for that.

> Looking at that code I have not been able to figure out the call paths
> that get into skeleton.S.  I am not certain saving all of the registers
> on the exceptions that reach there makes sense.  In practice I suspect

The registers are saved only so trap_c has a stack frame to work with. 
In that sense, adding a stack frame before calling fpsp040_die is no 
different.

> taking an exception is much more expensive than saving the registers so it
> might not make any difference.  But this definitely looks like code that
> is performance sensitive.

We're only planning to add a stack frame save before calling out of the 
user access exception handler, right? I doubt that will be called very 
often.

> My sense when I was reading through skeleton.S was just one or two
> registers were saved before the instruction emulation was called.

skeleton.S only contains the entry points for code to handle FPU 
exceptions, from what I've seen (plus the user space access code).

Wherever that exception handling requires calling into the C exception 
handler (trap_c), a stack frame is added.

Cheers,

	Michael

>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v4 0/3] m68k: Improved switch stack handling
  2021-07-17 20:09         ` Michael Schmitz
@ 2021-07-17 23:04           ` Michael Schmitz
  2021-07-18 10:47             ` Andreas Schwab
  2021-07-20 20:32             ` Eric W. Biederman
  0 siblings, 2 replies; 37+ messages in thread
From: Michael Schmitz @ 2021-07-17 23:04 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: geert, linux-arch, linux-m68k, torvalds, schwab

[-- Attachment #1: Type: text/plain, Size: 2680 bytes --]

On 18.07.2021 at 08:09, Michael Schmitz wrote:
> Hi Eric,
>
> On 18.07.2021 at 06:52, Eric W. Biederman wrote:
>>> I should have looked more closely at skeleton.S - most FPU exceptions
>>> handled there call trap_c the same way as is done for generic traps,
>>> i.e. SAVE_ALL_INT before, ret_from_exception after.
>>>
>>> Instead of adding code to entry.S, much better to add it in
>>> skeleton.S. I'll try to come up with a way to test this code path
>>> (calling fpsp040_die from the dz exception handler seems much the
>>> easiest way) to make sure this doesn't have side effects.
>>>
>>> Does do_exit() ever return?
>>
>> No.  The function do_exit never returns.
>
> Fine - nothing to worry about as regards restoring the stack pointer
> correctly then.
>
>> If it is not too much difficulty I would be in favor of having the code
>> do force_sigsegv(SIGSEGV), instead of calling do_exit directly.
>
> That _would_ force a return, right? The exception handling in skeleton.S
> won't be set up for that.

See attached patch - note that when you change fpsp040_die() to call 
force_sig(SIGSEGV), the access exception handler will return to whatever 
function in fpsp040 attempted the user space access, and continue that 
operation with quite likely bogus data. That may well force another FPU 
trap before the SIGSEGV is delivered (will force_sig() immediately force 
a trap, or wait until returning to user space?).

Compile tested - haven't found an easy way to execute that code path yet.

Cheers,

	Michael


>
>> Looking at that code I have not been able to figure out the call paths
>> that get into skeleton.S.  I am not certain saving all of the registers
>> on the exceptions that reach there makes sense.  In practice I suspect
>
> The registers are saved only so trap_c has a stack frame to work with.
> In that sense, adding a stack frame before calling fpsp040_die is no
> different.
>
>> taking an exception is much more expensive than saving the registers
>> so it
>> might not make any difference.  But this definitely looks like code that
>> is performance sensitive.
>
> We're only planning to add a stack frame save before calling out of the
> user access exception handler, right? I doubt that will be called very
> often.
>
>> My sense when I was reading through skeleton.S was just one or two
>> registers were saved before the instruction emulation was called.
>
> skeleton.S only contains the entry points for code to handle FPU
> exceptions, from what I've seen (plus the user space access code).
>
> Wherever that exception handling requires calling into the C exception
> handler (trap_c), a stack frame is added.
>
> Cheers,
>
>     Michael
>
>>

[-- Attachment #2: 0001-m68k-fpsp040-save-full-stack-frame-before-calling-fp.patch --]
[-- Type: text/x-diff, Size: 1376 bytes --]

From 1e9be9238fb88dc0b87a7ffdd48068f944d8626c Mon Sep 17 00:00:00 2001
From: Michael Schmitz <schmitzmic@gmail.com>
Date: Sun, 18 Jul 2021 10:31:42 +1200
Subject: [PATCH] m68k/fpsp040 - save full stack frame before calling
 fpsp040_die

The FPSP040 floating point support code does not know how to
handle user space access faults gracefully, and just calls
do_exit(SIGSEGV) indirectly on these faults to abort.

do_exit() may stop if traced, and needs a full stack frame
available to avoid exposing kernel data.

Add the current stack frame before calling do_exit() from the
fpsp040 user access exception handler. Unwind the stack frame
and return to caller once done, in case do_exit() is replaced
by force_sig() later on.

CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
---
 arch/m68k/fpsp040/skeleton.S | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/m68k/fpsp040/skeleton.S b/arch/m68k/fpsp040/skeleton.S
index a8f4161..6c92d38 100644
--- a/arch/m68k/fpsp040/skeleton.S
+++ b/arch/m68k/fpsp040/skeleton.S
@@ -502,7 +502,17 @@ in_ea:
 	.section .fixup,#alloc,#execinstr
 	.even
 1:
+
+	SAVE_ALL_INT
+	SAVE_SWITCH_STACK
 	jbra	fpsp040_die
+	addql   #8,%sp
+	addql   #8,%sp
+	addql   #8,%sp
+	addql   #8,%sp
+	addql   #8,%sp
+	addql   #4,%sp
+	rts
 
 	.section __ex_table,#alloc
 	.align	4
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v4 0/3] m68k: Improved switch stack handling
  2021-07-17 23:04           ` Michael Schmitz
@ 2021-07-18 10:47             ` Andreas Schwab
  2021-07-18 19:47               ` Michael Schmitz
  2021-07-20 20:32             ` Eric W. Biederman
  1 sibling, 1 reply; 37+ messages in thread
From: Andreas Schwab @ 2021-07-18 10:47 UTC (permalink / raw)
  To: Michael Schmitz
  Cc: Eric W. Biederman, geert, linux-arch, linux-m68k, torvalds

On Jul 18 2021, Michael Schmitz wrote:

> +	addql   #8,%sp
> +	addql   #8,%sp
> +	addql   #8,%sp
> +	addql   #8,%sp
> +	addql   #8,%sp
> +	addql   #4,%sp

aka     lea     44(%sp),%sp

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v4 0/3] m68k: Improved switch stack handling
  2021-07-18 10:47             ` Andreas Schwab
@ 2021-07-18 19:47               ` Michael Schmitz
  2021-07-18 20:59                 ` Brad Boyer
  0 siblings, 1 reply; 37+ messages in thread
From: Michael Schmitz @ 2021-07-18 19:47 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Eric W. Biederman, geert, linux-arch, linux-m68k, torvalds

Hi Andreas,

On 18/07/21 10:47 pm, Andreas Schwab wrote:
> On Jul 18 2021, Michael Schmitz wrote:
>
>> +	addql   #8,%sp
>> +	addql   #8,%sp
>> +	addql   #8,%sp
>> +	addql   #8,%sp
>> +	addql   #8,%sp
>> +	addql   #4,%sp
> aka     lea     44(%sp),%sp

Thanks - I knew there should be a better way.

Somewhere in entry.S is

addql   #8,%sp
addql   #4,%sp

- is that faster than

lea     12(%sp),%sp ?

Cheers,

	Michael


  


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v4 0/3] m68k: Improved switch stack handling
  2021-07-18 19:47               ` Michael Schmitz
@ 2021-07-18 20:59                 ` Brad Boyer
  2021-07-19  3:15                   ` Michael Schmitz
  0 siblings, 1 reply; 37+ messages in thread
From: Brad Boyer @ 2021-07-18 20:59 UTC (permalink / raw)
  To: Michael Schmitz
  Cc: Andreas Schwab, Eric W. Biederman, geert, linux-arch, linux-m68k,
	torvalds

On Mon, Jul 19, 2021 at 07:47:19AM +1200, Michael Schmitz wrote:
> Somewhere in entry.S is
> 
> addql   #8,%sp
> addql   #4,%sp
> 
> - is that faster than
> 
> lea     12(%sp),%sp ?

On the 68040 the timing can depend on the other instructions around
it. Each of those addql instructions is listed as 1 and 1 for
fetch/execute, while that lea is listed as 2 and 1L+1 meaning that
it could potentially be faster depending on the behavior of the
instruction that preceded it through the execute stage. That one
free cycle if the stage is busy (due to the 1L) could make it
effectively faster since the first addql would have to wait that
extra cycle in that case.

On the 68060, it looks like the lea version is the clear winner,
although the timing description is obviously much more complicated
and thus I might have missed something. From a quick look, it
seems that lea takes the same time as just the first addql.

On CPU32, the lea version loses due to the extra 3 cycles from
the addressing mode, even though the base cycles of lea are the
same as for addql (2 cycles each). The lea might be even worse
if it can't take advantage of overlapping the surrounding
instructions (1 cycle before and 1 after).

Those are the only ones I already have the documentation in my
hands. I haven't checked older classic cores or coldfire, but
it does seem like it is specific to each chip which is faster.

Obviously both versions would be the same size (2 words).

	Brad Boyer
	flar@allandria.com


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v4 0/3] m68k: Improved switch stack handling
  2021-07-18 20:59                 ` Brad Boyer
@ 2021-07-19  3:15                   ` Michael Schmitz
  0 siblings, 0 replies; 37+ messages in thread
From: Michael Schmitz @ 2021-07-19  3:15 UTC (permalink / raw)
  To: Brad Boyer
  Cc: Andreas Schwab, Eric W. Biederman, geert, linux-arch, linux-m68k,
	torvalds

Hi Brad,

On 19.07.2021 at 08:59, Brad Boyer wrote:
> On Mon, Jul 19, 2021 at 07:47:19AM +1200, Michael Schmitz wrote:
>> Somewhere in entry.S is
>>
>> addql   #8,%sp
>> addql   #4,%sp
>>
>> - is that faster than
>>
>> lea     12(%sp),%sp ?
>
> On the 68040 the timing can depend on the other instructions around
> it. Each of those addql instructions is listed as 1 and 1 for
> fetch/execute, while that lea is listed as 2 and 1L+1 meaning that
> it could potentially be faster depending on the behavior of the
> instruction that preceded it through the execute stage. That one
> free cycle if the stage is busy (due to the 1L) could make it
> effectively faster since the first addql would have to wait that
> extra cycle in that case.
>
> On the 68060, it looks like the lea version is the clear winner,
> although the timing description is obviously much more complicated
> and thus I might have missed something. From a quick look, it
> seems that lea takes the same time as just the first addql.
>
> On CPU32, the lea version loses due to the extra 3 cycles from
> the addressing mode, even though the base cycles of lea are the
> same as for addql (2 cycles each). The lea might be even worse
> if it can't take advantage of overlapping the surrounding
> instructions (1 cycle before and 1 after).
>
> Those are the only ones I already have the documentation in my
> hands. I haven't checked older classic cores or coldfire, but
> it does seem like it is specific to each chip which is faster.
>
> Obviously both versions would be the same size (2 words).

Thanks, best leave it as is then.

Cheers,

	Michael

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v4 0/3] m68k: Improved switch stack handling
  2021-07-17 23:04           ` Michael Schmitz
  2021-07-18 10:47             ` Andreas Schwab
@ 2021-07-20 20:32             ` Eric W. Biederman
  2021-07-20 22:16               ` Michael Schmitz
  1 sibling, 1 reply; 37+ messages in thread
From: Eric W. Biederman @ 2021-07-20 20:32 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: geert, linux-arch, linux-m68k, torvalds, schwab

Michael Schmitz <schmitzmic@gmail.com> writes:

> On 18.07.2021 at 08:09, Michael Schmitz wrote:
>> Hi Eric,
>>
>> On 18.07.2021 at 06:52, Eric W. Biederman wrote:
>>>> I should have looked more closely at skeleton.S - most FPU exceptions
>>>> handled there call trap_c the same way as is done for generic traps,
>>>> i.e. SAVE_ALL_INT before, ret_from_exception after.
>>>>
>>>> Instead of adding code to entry.S, much better to add it in
>>>> skeleton.S. I'll try to come up with a way to test this code path
>>>> (calling fpsp040_die from the dz exception handler seems much the
>>>> easiest way) to make sure this doesn't have side effects.
>>>>
>>>> Does do_exit() ever return?
>>>
>>> No.  The function do_exit never returns.
>>
>> Fine - nothing to worry about as regards restoring the stack pointer
>> correctly then.
>>
>>> If it is not too much difficulty I would be in favor of having the code
>>> do force_sigsegv(SIGSEGV), instead of calling do_exit directly.
>>
>> That _would_ force a return, right? The exception handling in skeleton.S
>> won't be set up for that.
>
> See attached patch - note that when you change fpsp040_die() to call
> force_sig(SIGSEGV), the access exception handler will return to whatever
> function in fpsp040 attempted the user space access, and continue that operation
> with quite likely bogus data. That may well force another FPU trap before the
> SIGSEGV is delivered (will force_sig() immediately force a trap, or wait until
> returning to user space?).
>
> Compile tested - haven't found an easy way to execute that code path yet.
>
> Cheers,
>
> 	Michael
>
>
>>
>>> Looking at that code I have not been able to figure out the call paths
>>> that get into skeleton.S.  I am not certain saving all of the registers
>>> on the exceptions that reach there makes sense.  In practice I suspect
>>
>> The registers are saved only so trap_c has a stack frame to work with.
>> In that sense, adding a stack frame before calling fpsp040_die is no
>> different.
>>
>>> taking an exception is much more expensive than saving the registers
>>> so it
>>> might not make any difference.  But this definitely looks like code that
>>> is performance sensitive.
>>
>> We're only planning to add a stack frame save before calling out of the
>> user access exception handler, right? I doubt that will be called very
>> often.
>>
>>> My sense when I was reading through skeleton.S was just one or two
>>> registers were saved before the instruction emulation was called.
>>
>> skeleton.S only contains the entry points for code to handle FPU
>> exceptions, from what I've seen (plus the user space access code).
>>
>> Wherever that exception handling requires calling into the C exception
>> handler (trap_c), a stack frame is added.
>>
>> Cheers,
>>
>>     Michael
>>
>>>
>
> From 1e9be9238fb88dc0b87a7ffdd48068f944d8626c Mon Sep 17 00:00:00 2001
> From: Michael Schmitz <schmitzmic@gmail.com>
> Date: Sun, 18 Jul 2021 10:31:42 +1200
> Subject: [PATCH] m68k/fpsp040 - save full stack frame before calling
>  fpsp040_die
>
> The FPSP040 floating point support code does not know how to
> handle user space access faults gracefully, and just calls
> do_exit(SIGSEGV) indirectly on these faults to abort.
>
> do_exit() may stop if traced, and needs a full stack frame
> available to avoid exposing kernel data.
>
> Add the current stack frame before calling do_exit() from the
> fpsp040 user access exception handler. Unwind the stack frame
> and return to caller once done, in case do_exit() is replaced
> by force_sig() later on.
>
> CC: Eric W. Biederman <ebiederm@xmission.com>
> Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
> ---
>  arch/m68k/fpsp040/skeleton.S | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/arch/m68k/fpsp040/skeleton.S b/arch/m68k/fpsp040/skeleton.S
> index a8f4161..6c92d38 100644
> --- a/arch/m68k/fpsp040/skeleton.S
> +++ b/arch/m68k/fpsp040/skeleton.S
> @@ -502,7 +502,17 @@ in_ea:
>  	.section .fixup,#alloc,#execinstr
>  	.even
>  1:
> +
> +	SAVE_ALL_INT
> +	SAVE_SWITCH_STACK
        ^^^^^^^^^^

I don't think this saves the registers in the well known fixed location
on the stack because some registers are saved at the exception entry
point.

Without being saved at the well known fixed location if some process
stops in PTRACE_EVENT_EXIT in do_exit we likely get some complete
gibberish.

That is probably safe.

>  	jbra	fpsp040_die
> +	addql   #8,%sp
> +	addql   #8,%sp
> +	addql   #8,%sp
> +	addql   #8,%sp
> +	addql   #8,%sp
> +	addql   #4,%sp
> +	rts

Especially as everything after jumping to fpsp040_die does not execute.

Eric


>  
>  	.section __ex_table,#alloc
>  	.align	4

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v4 0/3] m68k: Improved switch stack handling
  2021-07-20 20:32             ` Eric W. Biederman
@ 2021-07-20 22:16               ` Michael Schmitz
  2021-07-22 14:49                 ` Eric W. Biederman
  0 siblings, 1 reply; 37+ messages in thread
From: Michael Schmitz @ 2021-07-20 22:16 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: geert, linux-arch, linux-m68k, torvalds, schwab

Hi Eric,

On 21/07/21 8:32 am, Eric W. Biederman wrote:
>
>> diff --git a/arch/m68k/fpsp040/skeleton.S b/arch/m68k/fpsp040/skeleton.S
>> index a8f4161..6c92d38 100644
>> --- a/arch/m68k/fpsp040/skeleton.S
>> +++ b/arch/m68k/fpsp040/skeleton.S
>> @@ -502,7 +502,17 @@ in_ea:
>>   	.section .fixup,#alloc,#execinstr
>>   	.even
>>   1:
>> +
>> +	SAVE_ALL_INT
>> +	SAVE_SWITCH_STACK
>          ^^^^^^^^^^
>
> I don't think this saves the registers in the well known fixed location
> on the stack because some registers are saved at the exception entry
> point.

The FPU exception entry points are not using the exception entry code in 
head.S. These entry points are stored in the exception vector table 
directly. No saving of a syscall stack frame happens there. The FPU 
places its exception frame on the stack, and that is what the FPU 
exception handlers use.

(If these have to call out to the generic exception handlers again, they 
will build a minimal stack frame, see code in skeleton.S.)

Calling fpsp040_die() is no different from calling a syscall that may
need to have access to the full stack frame. The 'fixed location' is
just 'on the stack before calling fpsp040_die()'; again, this is no
different from calling e.g. sys_fork(), which does not take a pointer to
the beginning of the stack frame as an argument.

I must admit I never looked at how do_exit() figures out where the stack 
frame containing the saved registers is stored, I just assumed it 
unwinds the stack up to the point where the caller syscall was made, and 
works from there. The same strategy ought to work here.

>
> Without being saved at the well known fixed location if some process
> stops in PTRACE_EVENT_EXIT in do_exit we likely get some complete
> gibberish.
>
> That is probably safe.
>
>>   	jbra	fpsp040_die
>> +	addql   #8,%sp
>> +	addql   #8,%sp
>> +	addql   #8,%sp
>> +	addql   #8,%sp
>> +	addql   #8,%sp
>> +	addql   #4,%sp
>> +	rts
> Especially as everything after jumping to fpsp040_die does not execute.

Unless we change fpsp040_die() to call force_sig(SIGSEGV).

Cheers,

     Michael


>
> Eric
>
>
>>   
>>   	.section __ex_table,#alloc
>>   	.align	4

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v4 0/3] m68k: Improved switch stack handling
  2021-07-20 22:16               ` Michael Schmitz
@ 2021-07-22 14:49                 ` Eric W. Biederman
  2021-07-23  4:23                   ` Michael Schmitz
  0 siblings, 1 reply; 37+ messages in thread
From: Eric W. Biederman @ 2021-07-22 14:49 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: geert, linux-arch, linux-m68k, torvalds, schwab

Michael Schmitz <schmitzmic@gmail.com> writes:

> Hi Eric,
>
> On 21/07/21 8:32 am, Eric W. Biederman wrote:
>>
>>> diff --git a/arch/m68k/fpsp040/skeleton.S b/arch/m68k/fpsp040/skeleton.S
>>> index a8f4161..6c92d38 100644
>>> --- a/arch/m68k/fpsp040/skeleton.S
>>> +++ b/arch/m68k/fpsp040/skeleton.S
>>> @@ -502,7 +502,17 @@ in_ea:
>>>   	.section .fixup,#alloc,#execinstr
>>>   	.even
>>>   1:
>>> +
>>> +	SAVE_ALL_INT
>>> +	SAVE_SWITCH_STACK
>>          ^^^^^^^^^^
>>
>> I don't think this saves the registers in the well known fixed location
>> on the stack because some registers are saved at the exception entry
>> point.
>
> The FPU exception entry points are not using the exception entry code in
> head.S. These entry points are stored in the exception vector table directly. No
> saving of a syscall stack frame happens there. The FPU places its exception
> frame on the stack, and that is what the FPU exception handlers use.
>
> (If these have to call out to the generic exception handlers again, they will
> build a minimal stack frame, see code in skeleton.S.)
>
> Calling fpsp040_die() is no different from calling a syscall that may need to
> have access to the full stack frame. The 'fixed location' is just 'on the stack
> before calling  fpsp040_die()', again this is no different from calling
> e.g. sys_fork() which does not take a pointer to the beginning of the stack frame as
> an argument.
>
> I must admit I never looked at how do_exit() figures out where the stack frame
> containing the saved registers is stored, I just assumed it unwinds the stack up
> to the point where the caller syscall was made, and works from there. The same
> strategy ought to work here.

For do_exit the part we need to be careful with is PTRACE_EVENT_EXIT,
which means it is ptrace that we need to look at.

For m68k the code in put_reg and get_reg finds the registers by looking
at task->thread.esp0.

I was expecting m68k to use the same technique as alpha which expects a
fixed offset from task_stack_page(task).

So your code will work if you add code to update task->thread.esp0, which
is also known as THREAD_ESP0 in entry.S.

>> Without being saved at the well known fixed location if some process
>> stops in PTRACE_EVENT_EXIT in do_exit we likely get some complete
>> gibberish.
>>
>> That is probably safe.
>>
>>>   	jbra	fpsp040_die
>>> +	addql   #8,%sp
>>> +	addql   #8,%sp
>>> +	addql   #8,%sp
>>> +	addql   #8,%sp
>>> +	addql   #8,%sp
>>> +	addql   #4,%sp
>>> +	rts
>> Especially as everything after jumping to fpsp040_die does not execute.
>
> Unless we change fpsp040_die() to call force_sig(SIGSEGV).

Yes.  I think we would probably need to have it also call get_signal and
all of that, because I don't think the very light call path for that
exception includes testing if signals are pending.

The way the code is structured it is actively incorrect to return from
fpsp040_die, as the code does not know what to do if it reads a byte
from userspace and there is nothing there.

So instead of handling -EFAULT like most pieces of kernel code the code
just immediately calls do_exit, and does not even attempt to handle
the error.

That is not my favorite strategy at all, but I suspect it isn't worth
it, or safe, to update skeleton.S to handle errors.  Especially as we
have not even figured out how to test that code yet.

Eric

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v4 0/3] m68k: Improved switch stack handling
  2021-07-22 14:49                 ` Eric W. Biederman
@ 2021-07-23  4:23                   ` Michael Schmitz
  2021-07-23 22:31                     ` Eric W. Biederman
  0 siblings, 1 reply; 37+ messages in thread
From: Michael Schmitz @ 2021-07-23  4:23 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: geert, linux-arch, linux-m68k, torvalds, schwab

Hi Eric,

On 23.07.2021 at 02:49, Eric W. Biederman wrote:
> Michael Schmitz <schmitzmic@gmail.com> writes:
>
>> Hi Eric,
>>
>> On 21/07/21 8:32 am, Eric W. Biederman wrote:
>>>
>>>> diff --git a/arch/m68k/fpsp040/skeleton.S b/arch/m68k/fpsp040/skeleton.S
>>>> index a8f4161..6c92d38 100644
>>>> --- a/arch/m68k/fpsp040/skeleton.S
>>>> +++ b/arch/m68k/fpsp040/skeleton.S
>>>> @@ -502,7 +502,17 @@ in_ea:
>>>>   	.section .fixup,#alloc,#execinstr
>>>>   	.even
>>>>   1:
>>>> +
>>>> +	SAVE_ALL_INT
>>>> +	SAVE_SWITCH_STACK
>>>          ^^^^^^^^^^
>>>
>>> I don't think this saves the registers in the well known fixed location
>>> on the stack because some registers are saved at the exception entry
>>> point.
>>
>> The FPU exception entry points are not using the exception entry code in
>> head.S. These entry points are stored in the exception vector table directly. No
>> saving of a syscall stack frame happens there. The FPU places its exception
>> frame on the stack, and that is what the FPU exception handlers use.
>>
>> (If these have to call out to the generic exception handlers again, they will
>> build a minimal stack frame, see code in skeleton.S.)
>>
>> Calling fpsp040_die() is no different from calling a syscall that may need to
>> have access to the full stack frame. The 'fixed location' is just 'on the stack
>> before calling  fpsp040_die()', again this is no different from calling
>> e.g. sys_fork() which does not take a pointer to the beginning of the stack frame as
>> an argument.
>>
>> I must admit I never looked at how do_exit() figures out where the stack frame
>> containing the saved registers is stored, I just assumed it unwinds the stack up
>> to the point where the caller syscall was made, and works from there. The same
>> strategy ought to work here.
>
> For do_exit the part we need to be careful with is PTRACE_EVENT_EXIT,
> which means it is ptrace that we need to look at.
>
> For m68k the code in put_reg and get_reg finds the registers by looking
> at task->thread.esp0.

Thanks, that's what I was missing here.
>
> I was expecting m68k to use the same technique as alpha which expects a
> fixed offset from task_stack_page(task).
>
> So your code will work if you add code to update task->thread.esp0 which
> is also known as THREAD_ESP0 in entry.S

Shoving

movel   %sp,%curptr@(TASK_THREAD+THREAD_ESP0)

in between the SAVE_ALL_INT and SAVE_SWITCH_STACK ought to do the trick 
there.

>
>>> Without being saved at the well known fixed location if some process
>>> stops in PTRACE_EVENT_EXIT in do_exit we likely get some complete
>>> gibberish.
>>>
>>> That is probably safe.
>>>
>>>>   	jbra	fpsp040_die
>>>> +	addql   #8,%sp
>>>> +	addql   #8,%sp
>>>> +	addql   #8,%sp
>>>> +	addql   #8,%sp
>>>> +	addql   #8,%sp
>>>> +	addql   #4,%sp
>>>> +	rts
>>> Especially as everything after jumping to fpsp040_die does not execute.
>>
>> Unless we change fpsp040_die() to call force_sig(SIGSEGV).
>
> Yes.  I think we would probably need to have it also call get_signal and
> all of that, because I don't think the very light call path for that
> exception includes testing if signals are pending.

As far as I can see, there is a test for pending signals:

ENTRY(ret_from_exception)
.Lret_from_exception:
         btst    #5,%sp@(PT_OFF_SR)      | check if returning to kernel
         bnes    1f                      | if so, skip resched, signals
         | only allow interrupts when we are really the last one on the
         | kernel stack, otherwise stack overflow can occur during
         | heavy interrupt load
         andw    #ALLOWINT,%sr

resume_userspace:
         movel   %curptr@(TASK_STACK),%a1
         moveb   %a1@(TINFO_FLAGS+3),%d0	| bits 0-7 of TINFO_FLAGS
         jne     exit_work		| any bit set? -> exit_work
1:      RESTORE_ALL

exit_work:
         | save top of frame
         movel   %sp,%curptr@(TASK_THREAD+THREAD_ESP0)
         lslb    #1,%d0			| shift out TIF_NEED_RESCHED
         jne     do_signal_return	| any remaining bit (signal/notify_resume)? -> do_signal_return
         pea     resume_userspace
         jra     schedule

As long as TIF_NOTIFY_SIGNAL or TIF_SIGPENDING are set, do_signal_return 
will be called.


>
> The way the code is structured it is actively incorrect to return from
> fpsp040_die, as the code does not know what to do if it reads a byte
> from userspace and there is nothing there.

Correct - my hope is that upon return from the FPU exception (that 
continued after a dodgy read or write), we get the signal delivered and 
will die then.

>
> So instead of handling -EFAULT like most pieces of kernel code the code
> just immediately calls do_exit, and does not even attempt to handle
> the error.
>
> That is not my favorite strategy at all, but I suspect it isn't worth
> it, or safe to update the skeleton.S to handle errors.  Especially as we
> have not even figured out how to test that code yet.

That's bothering me more than a little, but I need to find out whether 
the emulator even handles FPU exceptions correctly ...

Cheers,

	Michael

>
> Eric
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v4 0/3] m68k: Improved switch stack handling
  2021-07-23  4:23                   ` Michael Schmitz
@ 2021-07-23 22:31                     ` Eric W. Biederman
  2021-07-23 23:52                       ` Michael Schmitz
  0 siblings, 1 reply; 37+ messages in thread
From: Eric W. Biederman @ 2021-07-23 22:31 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: geert, linux-arch, linux-m68k, torvalds, schwab

Michael Schmitz <schmitzmic@gmail.com> writes:

> Hi Eric,
>
> On 23.07.2021 at 02:49, Eric W. Biederman wrote:
>> Michael Schmitz <schmitzmic@gmail.com> writes:
>>
>>> Hi Eric,
>>>
>>> On 21/07/21 8:32 am, Eric W. Biederman wrote:
>>>>
>>>>> diff --git a/arch/m68k/fpsp040/skeleton.S b/arch/m68k/fpsp040/skeleton.S
>>>>> index a8f4161..6c92d38 100644
>>>>> --- a/arch/m68k/fpsp040/skeleton.S
>>>>> +++ b/arch/m68k/fpsp040/skeleton.S
>>>>> @@ -502,7 +502,17 @@ in_ea:
>>>>>   	.section .fixup,#alloc,#execinstr
>>>>>   	.even
>>>>>   1:
>>>>> +
>>>>> +	SAVE_ALL_INT
>>>>> +	SAVE_SWITCH_STACK
>>>>          ^^^^^^^^^^
>>>>
>>>> I don't think this saves the registers in the well known fixed location
>>>> on the stack because some registers are saved at the exception entry
>>>> point.
>>>
>>> The FPU exception entry points are not using the exception entry code in
>>> head.S. These entry points are stored in the exception vector table directly. No
>>> saving of a syscall stack frame happens there. The FPU places its exception
>>> frame on the stack, and that is what the FPU exception handlers use.
>>>
>>> (If these have to call out to the generic exception handlers again, they will
>>> build a minimal stack frame, see code in skeleton.S.)
>>>
>>> Calling fpsp040_die() is no different from calling a syscall that may need to
>>> have access to the full stack frame. The 'fixed location' is just 'on the stack
>>> before calling fpsp040_die()', again this is no different from calling
>>> e.g. sys_fork() which does not take a pointer to the beginning of the stack
>>> frame as an argument.
>>>
>>> I must admit I never looked at how do_exit() figures out where the stack frame
>>> containing the saved registers is stored, I just assumed it unwinds the stack up
>>> to the point where the caller syscall was made, and works from there. The same
>>> strategy ought to work here.
>>
>> For do_exit the part we need to be careful with is PTRACE_EVENT_EXIT,
>> which means it is ptrace that we need to look at.
>>
>> For m68k the code in put_reg and get_reg finds the registers by looking
>> at task->thread.esp0.
>
> Thanks, that's what I was missing here.
>>
>> I was expecting m68k to use the same technique as alpha which expects a
>> fixed offset from task_stack_page(task).
>>
>> So your code will work if you add code to update task->thread.esp0 which
>> is also known as THREAD_ESP0 in entry.S
>
> Shoving
>
> movel   %sp,%curptr@(TASK_THREAD+THREAD_ESP0)
>
> in between the SAVE_ALL_INT and SAVE_SWITCH_STACK ought to do the
> trick there.
>
>>
>>>> Without being saved at the well known fixed location if some process
>>>> stops in PTRACE_EVENT_EXIT in do_exit we likely get some complete
>>>> gibberish.
>>>>
>>>> That is probably safe.
>>>>
>>>>>   	jbra	fpsp040_die
>>>>> +	addql   #8,%sp
>>>>> +	addql   #8,%sp
>>>>> +	addql   #8,%sp
>>>>> +	addql   #8,%sp
>>>>> +	addql   #8,%sp
>>>>> +	addql   #4,%sp
>>>>> +	rts
>>>> Especially as everything after jumping to fpsp040_die does not execute.
>>>
>>> Unless we change fpsp040_die() to call force_sig(SIGSEGV).
>>
>> Yes.  I think we would probably need to have it also call get_signal and
>> all of that, because I don't think the very light call path for that
>> exception includes testing if signals are pending.
>
> As far as I can see, there is a test for pending signals:
>
> ENTRY(ret_from_exception)
> .Lret_from_exception:
>         btst    #5,%sp@(PT_OFF_SR)      | check if returning to kernel
>         bnes    1f                      | if so, skip resched, signals
>         | only allow interrupts when we are really the last one on the
>         | kernel stack, otherwise stack overflow can occur during
>         | heavy interrupt load
>         andw    #ALLOWINT,%sr
>
> resume_userspace:
>         movel   %curptr@(TASK_STACK),%a1
>         moveb   %a1@(TINFO_FLAGS+3),%d0	| bits 0-7 of TINFO_FLAGS
>         jne     exit_work		| any bit set? -> exit_work
> 1:      RESTORE_ALL
>
> exit_work:
>         | save top of frame
>         movel   %sp,%curptr@(TASK_THREAD+THREAD_ESP0)
>         lslb    #1,%d0			| shift out TIF_NEED_RESCHED
>         jne     do_signal_return	| any remaining bit (signal/notify_resume)? -> do_signal_return
>         pea     resume_userspace
>         jra     schedule
>
> As long as TIF_NOTIFY_SIGNAL or TIF_SIGPENDING are set,
> do_signal_return will be called.

I was going to say I don't think so, as my tracing of
the code led in a couple of different directions.  Upon closer
inspection all those paths either lead to fpsp_done or more
directly to ret_from_exception.
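
A rough C model of the flag test in the quoted exit_work code may help
when re-reading this later. It is a sketch only: the bit positions are
my reading of the m68k thread_info.h of that era and should be treated
as assumptions, and exit_work_model() and the enum names are made up
for illustration.

```c
#include <stdint.h>

/* Assumed bit numbers in the low byte of thread_info->flags. */
enum {
    TIF_NOTIFY_SIGNAL = 4,
    TIF_NOTIFY_RESUME = 5,
    TIF_SIGPENDING    = 6,
    TIF_NEED_RESCHED  = 7
};

/* Hypothetical names for the three ways out of resume_userspace. */
enum exit_action {
    EXIT_RESTORE_ALL,       /* no work flags set: return to user space */
    EXIT_SCHEDULE,          /* only TIF_NEED_RESCHED set */
    EXIT_DO_SIGNAL_RETURN   /* any signal/notify flag set */
};

static enum exit_action exit_work_model(uint32_t tinfo_flags)
{
    /* moveb %a1@(TINFO_FLAGS+3),%d0: low byte of a big-endian long */
    uint8_t d0 = (uint8_t)(tinfo_flags & 0xff);

    if (d0 == 0)
        return EXIT_RESTORE_ALL;

    /* lslb #1,%d0: bit 7 (TIF_NEED_RESCHED) is shifted out of the byte */
    d0 = (uint8_t)(d0 << 1);

    if (d0 != 0)
        return EXIT_DO_SIGNAL_RETURN;

    return EXIT_SCHEDULE;
}
```

Under these assumptions a forced SIGSEGV (TIF_SIGPENDING) always ends
up in do_signal_return, whether or not a reschedule is also pending.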

For anyone else who might want to trace the code, or for myself later on
when I forget: as best as I can figure, the hardware exception vector
table is set up in arch/m68k/kernel/vector.c

For the vectors in question it appears to be this chunk of code:

	if (CPU_IS_040 && !FPU_IS_EMU) {
		/* set up FPSP entry points */
		asmlinkage void dz_vec(void) asm ("dz");
		asmlinkage void inex_vec(void) asm ("inex");
		asmlinkage void ovfl_vec(void) asm ("ovfl");
		asmlinkage void unfl_vec(void) asm ("unfl");
		asmlinkage void snan_vec(void) asm ("snan");
		asmlinkage void operr_vec(void) asm ("operr");
		asmlinkage void bsun_vec(void) asm ("bsun");
		asmlinkage void fline_vec(void) asm ("fline");
		asmlinkage void unsupp_vec(void) asm ("unsupp");

		vectors[VEC_FPDIVZ] = dz_vec;
		vectors[VEC_FPIR] = inex_vec;
		vectors[VEC_FPOVER] = ovfl_vec;
		vectors[VEC_FPUNDER] = unfl_vec;
		vectors[VEC_FPNAN] = snan_vec;
		vectors[VEC_FPOE] = operr_vec;
		vectors[VEC_FPBRUC] = bsun_vec;
		vectors[VEC_LINE11] = fline_vec;
		vectors[VEC_FPUNSUP] = unsupp_vec;
	}


Which leads me to call traces that look like this:

hw
  fline
    fpsp_fline
       mem_read
          user_read
             copyin
               in_ea
                  <page-fault>
                     fpsp040_die

If that mem_read returns it can be followed by
       not_mvcr
          real_fline
            ret_from_exception

Or it can be followed by
       fix_con
          uni_2
             gen_except
                 do_clean
                    finish_up
                       fpsp_done
                          ret_from_exception
                          


>> The way the code is structured it is actively incorrect to return from
>> fpsp040_die, as the code does not know what to do if it reads a byte
>> from userspace and there is nothing there.
>
> Correct - my hope is that upon return from the FPU exception (that
> continued after a dodgy read or write), we get the signal delivered
> and will die then.

Yes.  That does look like a good strategy.

I am wondering if there are values we can return that will make the
path out of the exit routine more deterministic.

I have played with that a little bit today, but it doesn't look like I
am going to have time to put together any kind of real patch today.

Simply modifying fpsp040_die to call force_sigsegv(SIGSEGV) should be
enough to trigger a signal (no call stack work needed if we remove
do_exit).  The tricky bit is what value do we want to fake when we
can not read anything from userspace.   For a write fault we should just
be able to skip the write entirely.

In both cases we probably should break out of the loop prematurely.  But
I don't know if that is necessary.
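
To make the options above concrete, here is a small user-space model of
the byte-wise copyin loop with a fault in the middle. It is only a
sketch under stated assumptions: struct task_model, copyin_model() and
the first_bad parameter are hypothetical stand-ins, the pending flag
stands in for whatever force_sigsegv() would do, and zero-filling is
just one arbitrary choice of a faked value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-task state: stands in for a forced, pending SIGSEGV. */
struct task_model {
    bool sigsegv_pending;
};

/*
 * Model of the copyin loop.  Bytes at index >= first_bad are treated as
 * unreadable user memory (pass first_bad >= len for "no fault").
 * Returns the number of bytes actually copied.
 */
static size_t copyin_model(struct task_model *task, uint8_t *dst,
                           const uint8_t *src, size_t len, size_t first_bad)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (i >= first_bad) {
            task->sigsegv_pending = true;   /* analogous to forcing SIGSEGV */
            break;                          /* leave the loop prematurely */
        }
        dst[i] = src[i];
    }

    /* Fake a value for everything that could not be read. */
    for (size_t j = i; j < len; j++)
        dst[j] = 0;

    return i;
}
```

The caller carries on with a deterministic (if meaningless) buffer, and
the pending signal is expected to be acted on at return from exception.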


The lazy strategy would be to copy the ifpsp060 code and simply oops
the kernel if the read or write of userspace gets a page fault.

>> So instead of handling -EFAULT like most pieces of kernel code the code
>> just immediately calls do_exit, and does not even attempt to handle
>> the error.
>>
>> That is not my favorite strategy at all, but I suspect it isn't worth
>> it, or safe to update the skeleton.S to handle errors.  Especially as we
>> have not even figured out how to test that code yet.
>
> That's bothering me more than a little, but I need to find out whether
> the emulator even handles FPU exceptions correctly ...


As a fallback plan we can follow the lead of ifpsp060/os.S and simply
not catch the kernel triggered page fault, and let
arch/m68k/mm/fault.c:send_fault_sig() return a kernel oops.  It is not
ideal as it allows userspace to trigger a kernel oops, but it does at
least keep the kernel in a consistent state.

diff --git a/arch/m68k/fpsp040/skeleton.S b/arch/m68k/fpsp040/skeleton.S
index a8f41615d94a..4c6c4b07ef38 100644
--- a/arch/m68k/fpsp040/skeleton.S
+++ b/arch/m68k/fpsp040/skeleton.S
@@ -479,7 +479,6 @@ copyout:
 |	movec	%d1,%DFC		| set dfc for user data space
 moreout:
 	moveb	(%a0)+,%d1	| fetch supervisor byte
-out_ea:
 	movesb	%d1,(%a1)+	| write user byte
 	dbf	%d0,moreout
 	rts
@@ -493,21 +492,9 @@ copyin:
 |	SFC is already set
 |	movec	%d1,%SFC		| set sfc for user space
 morein:
-in_ea:
 	movesb	(%a0)+,%d1	| fetch user byte
 	moveb	%d1,(%a1)+	| write supervisor byte
 	dbf	%d0,morein
 	rts
 
-	.section .fixup,#alloc,#execinstr
-	.even
-1:
-	jbra	fpsp040_die
-
-	.section __ex_table,#alloc
-	.align	4
-
-	.long	in_ea,1b
-	.long	out_ea,1b
-
 	|end
diff --git a/arch/m68k/kernel/traps.c b/arch/m68k/kernel/traps.c
index e2a6f3556211..3ec6ae1bdaf9 100644
--- a/arch/m68k/kernel/traps.c
+++ b/arch/m68k/kernel/traps.c
@@ -1144,15 +1144,6 @@ asmlinkage void set_esp0(unsigned long ssp)
 	current->thread.esp0 = ssp;
 }
 
-/*
- * This function is called if an error occur while accessing
- * user-space from the fpsp040 code.
- */
-asmlinkage void fpsp040_die(void)
-{
-	do_exit(SIGSEGV);
-}
-
 #ifdef CONFIG_M68KFPU_EMU
 asmlinkage void fpemu_signal(int signal, int code, void *addr)
 {

Eric


* Re: [PATCH v4 0/3] m68k: Improved switch stack handling
  2021-07-23 22:31                     ` Eric W. Biederman
@ 2021-07-23 23:52                       ` Michael Schmitz
  2021-07-24 12:05                         ` Andreas Schwab
  0 siblings, 1 reply; 37+ messages in thread
From: Michael Schmitz @ 2021-07-23 23:52 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: geert, linux-arch, linux-m68k, torvalds, schwab

[-- Attachment #1: Type: text/plain, Size: 7832 bytes --]

Hi Eric,

On 24.07.2021 at 10:31, Eric W. Biederman wrote:

>>>> Unless we change fpsp040_die() to call force_sig(SIGSEGV).
>>>
>>> Yes.  I think we would probably need to have it also call get_signal and
>>> all of that, because I don't think the very light call path for that
>>> exception includes testing if signals are pending.
>>
>> As far as I can see, there is a test for pending signals:
>>
>> ENTRY(ret_from_exception)
>> .Lret_from_exception:
>>         btst    #5,%sp@(PT_OFF_SR)      | check if returning to kernel
>>         bnes    1f                      | if so, skip resched, signals
>>         | only allow interrupts when we are really the last one on the
>>         | kernel stack, otherwise stack overflow can occur during
>>         | heavy interrupt load
>>         andw    #ALLOWINT,%sr
>>
>> resume_userspace:
>>         movel   %curptr@(TASK_STACK),%a1
>>         moveb   %a1@(TINFO_FLAGS+3),%d0	| bits 0-7 of TINFO_FLAGS
>>         jne     exit_work		| any bit set? -> exit_work
>> 1:      RESTORE_ALL
>>
>> exit_work:
>>         | save top of frame
>>         movel   %sp,%curptr@(TASK_THREAD+THREAD_ESP0)
>>         lslb    #1,%d0			| shift out TIF_NEED_RESCHED
>>         jne     do_signal_return	| any remaining bit (signal/notify_resume)? -> do_signal_return
>>         pea     resume_userspace
>>         jra     schedule
>>
>> As long as TIF_NOTIFY_SIGNAL or TIF_SIGPENDING are set,
>> do_signal_return will be called.
>
> I was going to say I don't think so, as my tracing of
> the code led in a couple of different directions.  Upon closer
> inspection all those paths either lead to fpsp_done or more
> directly to ret_from_exception.
>
> For anyone else who might want to trace the code, or for myself later on
> when I forget: as best as I can figure, the hardware exception vector
> table is set up in arch/m68k/kernel/vector.c
>
> For the vectors in question it appears to be this chunk of code:
>
> 	if (CPU_IS_040 && !FPU_IS_EMU) {
> 		/* set up FPSP entry points */
> 		asmlinkage void dz_vec(void) asm ("dz");
> 		asmlinkage void inex_vec(void) asm ("inex");
> 		asmlinkage void ovfl_vec(void) asm ("ovfl");
> 		asmlinkage void unfl_vec(void) asm ("unfl");
> 		asmlinkage void snan_vec(void) asm ("snan");
> 		asmlinkage void operr_vec(void) asm ("operr");
> 		asmlinkage void bsun_vec(void) asm ("bsun");
> 		asmlinkage void fline_vec(void) asm ("fline");
> 		asmlinkage void unsupp_vec(void) asm ("unsupp");
>
> 		vectors[VEC_FPDIVZ] = dz_vec;
> 		vectors[VEC_FPIR] = inex_vec;
> 		vectors[VEC_FPOVER] = ovfl_vec;
> 		vectors[VEC_FPUNDER] = unfl_vec;
> 		vectors[VEC_FPNAN] = snan_vec;
> 		vectors[VEC_FPOE] = operr_vec;
> 		vectors[VEC_FPBRUC] = bsun_vec;
> 		vectors[VEC_LINE11] = fline_vec;
> 		vectors[VEC_FPUNSUP] = unsupp_vec;
> 	}
>

Correct.

> Which leads me to call traces that look like this:
>
> hw
>   fline
>     fpsp_fline
>        mem_read
>           user_read
>              copyin
>                in_ea
>                   <page-fault>
>                      fpsp040_die

According to my understanding, you can't get an F-line exception on 
68040. F-line is a coprocessor protocol violation, only raised when 
there is no coprocessor present on the bus.

What we expect to get is any of the arithmetic exceptions, and the 
'unsupported opcode' one (for those floating point instructions that the 
68040 FPU does not implement).

In reality, it's probably the 'unsupported' exception we expect to hit 
most often.

>
> If that mem_read returns it can be followed by
>        not_mvcr
>           real_fline
>             ret_from_exception
>
> Or it can be followed by
>        fix_con
>           uni_2
>              gen_except
>                  do_clean
>                     finish_up
>                        fpsp_done
>                           ret_from_exception
>
>
>
>>> The way the code is structured it is actively incorrect to return from
>>> fpsp040_die, as the code does not know what to do if it reads a byte
>>> from userspace and there is nothing there.
>>
>> Correct - my hope is that upon return from the FPU exception (that
>> continued after a dodgy read or write), we get the signal delivered
>> and will die then.
>
> Yes.  That does look like a good strategy.
>
> I am wondering if there are values we can return that will make the
> path out of the exit routine more deterministic.

I doubt it - maybe 'preloading' the register used for the read with 
something invalid as floating point instruction or data might force 
another exception more readily, but that's just speculation on my part.

> I have played with that a little bit today, but it doesn't look like I
> am going to have time to put together any kind of real patch today.

I've attached a corrected version of my patch to supply the required 
stack frame - this ought to make use of do_exit() safe. Still working on 
a way to exercise this code path. Let's think about ways to use signals 
once I've succeeded in doing that.

Cheers,

	Michael



[-- Attachment #2: 0001-m68k-fpsp040-save-full-stack-frame-before-calling-fp.patch --]
[-- Type: text/x-diff, Size: 1643 bytes --]

From 737b74a376f0b3da09ba7cb088e99c2c85b7405c Mon Sep 17 00:00:00 2001
From: Michael Schmitz <schmitzmic@gmail.com>
Date: Sun, 18 Jul 2021 10:31:42 +1200
Subject: [PATCH] m68k/fpsp040 - save full stack frame before calling
 fpsp040_die

The FPSP040 floating point support code does not know how to
handle user space access faults gracefully, and just calls
do_exit(SIGSEGV) indirectly on these faults to abort.

do_exit() may stop if traced, and needs a full stack frame
available to avoid exposing kernel data.

Add the current stack frame before calling do_exit() from the
fpsp040 user access exception handler. Top of stack frame saved
to task->thread.esp0 as is done for system calls.

Unwind the stack frame and return to caller once done, in case
do_exit() is replaced by force_sig() later on. Note that this
will allow the current exception handler to continue with
incorrect state, but the results will never make it to the
calling user program which is terminated by SIGSEGV upon return
from exception.

CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
---
 arch/m68k/fpsp040/skeleton.S | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/m68k/fpsp040/skeleton.S b/arch/m68k/fpsp040/skeleton.S
index a8f4161..1cbc52b 100644
--- a/arch/m68k/fpsp040/skeleton.S
+++ b/arch/m68k/fpsp040/skeleton.S
@@ -502,7 +502,14 @@ in_ea:
 	.section .fixup,#alloc,#execinstr
 	.even
 1:
+
+	SAVE_ALL_INT
+	| save top of frame
+	movel	%sp,%curptr@(TASK_THREAD+THREAD_ESP0)
+	SAVE_SWITCH_STACK
 	jbra	fpsp040_die
+	lea	44(%sp),%sp
+	rts
 
 	.section __ex_table,#alloc
 	.align	4
-- 
2.7.4



* Re: [PATCH v4 0/3] m68k: Improved switch stack handling
  2021-07-23 23:52                       ` Michael Schmitz
@ 2021-07-24 12:05                         ` Andreas Schwab
  2021-07-25  7:44                           ` Michael Schmitz
  0 siblings, 1 reply; 37+ messages in thread
From: Andreas Schwab @ 2021-07-24 12:05 UTC (permalink / raw)
  To: Michael Schmitz
  Cc: Eric W. Biederman, geert, linux-arch, linux-m68k, torvalds

On Jul 24 2021, Michael Schmitz wrote:

> According to my understanding, you can't get an F-line exception on
> 68040.

The F-line exception vector is used for all FPU illegal and
unimplemented insns.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


* Re: [PATCH v4 0/3] m68k: Improved switch stack handling
  2021-07-24 12:05                         ` Andreas Schwab
@ 2021-07-25  7:44                           ` Michael Schmitz
  2021-07-25 10:12                             ` Brad Boyer
  2021-07-25 11:53                             ` [PATCH v4 0/3] m68k: Improved switch stack handling Andreas Schwab
  0 siblings, 2 replies; 37+ messages in thread
From: Michael Schmitz @ 2021-07-25  7:44 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Eric W. Biederman, geert, linux-arch, linux-m68k, torvalds

Hi Andreas,

On 25.07.2021 at 00:05, Andreas Schwab wrote:
> On Jul 24 2021, Michael Schmitz wrote:
>
>> According to my understanding, you can't get an F-line exception on
>> 68040.
>
> The F-line exception vector is used for all FPU illegal and
> unimplemented insns.

Thanks - now from my reading of the fpsp040 code (which has misled me 
in the past), it would seem that operations like sin() and exp() ought 
to raise that exception then. I don't see that in ARAnyM.
Is there any emulator that correctly emulates the 68040 FPU in that regard?

Cheers,

	Michael



* Re: [PATCH v4 3/3] m68k: track syscalls being traced with shallow user context stack
  2021-06-23  0:21 ` [PATCH v4 3/3] m68k: track syscalls being traced with shallow user context stack Michael Schmitz
@ 2021-07-25 10:05   ` Geert Uytterhoeven
  2021-07-25 20:48     ` Michael Schmitz
  0 siblings, 1 reply; 37+ messages in thread
From: Geert Uytterhoeven @ 2021-07-25 10:05 UTC (permalink / raw)
  To: Michael Schmitz
  Cc: Linux-Arch, linux-m68k, Eric W. Biederman, Linus Torvalds,
	Andreas Schwab

Hi Michael,

On Wed, Jun 23, 2021 at 2:21 AM Michael Schmitz <schmitzmic@gmail.com> wrote:
> Add 'status' field to thread_info struct to hold syscall trace
> status info.
>
> Set flag bit in thread_info->status at syscall trace entry, clear
> flag bit on trace exit.
>
> Set another flag bit on entering syscall where the full stack
> frame has been saved. These flags can be checked whenever a
> syscall calls ptrace_stop().
>
> Check flag bits in get_reg()/put_reg() and prevent access to
> registers that are saved on the switch stack, in case the
> syscall did not actually save these registers on the switch
> stack.
>
> Tested on ARAnyM only - boots and survives running strace on a
> binary, nothing fancy.
>
> CC: Eric W. Biederman <ebiederm@xmission.com>
> CC: Linus Torvalds <torvalds@linux-foundation.org>
> CC: Andreas Schwab <schwab@linux-m68k.org>
> Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>

Thanks for your patch!

> --- a/arch/m68k/kernel/entry.S
> +++ b/arch/m68k/kernel/entry.S
> @@ -51,75 +51,115 @@
>
>  .text
>  ENTRY(__sys_fork)
> +       movel   %curptr@(TASK_STACK),%a1
> +       orb     #TIS_SWITCH_STACK, %a1@(TINFO_STATUS+3)

This doesn't work on Coldfire:

arch/m68k/kernel/entry.S:55: Error: invalid instruction for this
architecture; needs 68000 or higher (68000 [68ec000, 68hc000, 68hc001,
68008, 68302, 68306, 68307, 68322, 68356], 68010, 68020 [68k,
68ec020], 68030 [68ec030], 68040 [68ec040], 68060 [68ec060], cpu32
[68330, 68331, 68332, 68333, 68334, 68336, 68340, 68341, 68349,
68360], fidoa [fido]) -- statement `orb #(1<<1),%a1@(16+3)' ignored

>         SAVE_SWITCH_STACK
>         jbsr    sys_fork
>         lea     %sp@(24),%sp
> +       movel   %curptr@(TASK_STACK),%a1
> +       andb    #TIS_NO_SWITCH_STACK, %a1@(TINFO_STATUS+3)

arch/m68k/kernel/entry.S:60: Error: invalid instruction for this
architecture; needs 68000 or higher (68000 [68ec000, 68hc000, 68hc001,
68008, 68302, 68306, 68307, 68322, 68356], 68010, 68020 [68k,
68ec020], 68030 [68ec030], 68040 [68ec040], 68060 [68ec060], cpu32
[68330, 68331, 68332, 68333, 68334, 68336, 68340, 68341, 68349,
68360], fidoa [fido]) -- statement `andb #(~((1<<1))),%a1@(16+3)'
ignored

>         rts

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [PATCH v4 0/3] m68k: Improved switch stack handling
  2021-07-25  7:44                           ` Michael Schmitz
@ 2021-07-25 10:12                             ` Brad Boyer
  2021-07-26  2:00                               ` Michael Schmitz
  2021-07-25 11:53                             ` [PATCH v4 0/3] m68k: Improved switch stack handling Andreas Schwab
  1 sibling, 1 reply; 37+ messages in thread
From: Brad Boyer @ 2021-07-25 10:12 UTC (permalink / raw)
  To: Michael Schmitz
  Cc: Andreas Schwab, Eric W. Biederman, geert, linux-arch, linux-m68k,
	torvalds

On Sun, Jul 25, 2021 at 07:44:11PM +1200, Michael Schmitz wrote:
> On 25.07.2021 at 00:05, Andreas Schwab wrote:
> >On Jul 24 2021, Michael Schmitz wrote:
> >
> >>According to my understanding, you can't get an F-line exception on
> >>68040.
> >
> >The F-line exception vector is used for all FPU illegal and
> >unimplemented insns.
> 
> Thanks - now from my reading of the fpsp040 code (which has misled me in
> the past), it would seem that operations like sin() and exp() ought to raise
> that exception then. I don't see that in ARAnyM.

Yes, according to the 68040 user's manual, unimplemented and illegal F-line
instructions trigger the standard F-line exception vector (11) but have
separate stack frame formats so the fpsp040 code gets some extra data.
The CPU does a bunch of the prep work so that part doesn't need to be
emulated in software.

The ARAnyM docs appear to claim a strange combination that wouldn't
exist in hardware by implementing a full 68882 instead of the limited
subset found on a real 68040. Strangely, that might have been easier to
implement. However, it would also completely bypass any use of fpsp040.

	Brad Boyer
	flar@allandria.com



* Re: [PATCH v4 0/3] m68k: Improved switch stack handling
  2021-07-25  7:44                           ` Michael Schmitz
  2021-07-25 10:12                             ` Brad Boyer
@ 2021-07-25 11:53                             ` Andreas Schwab
  1 sibling, 0 replies; 37+ messages in thread
From: Andreas Schwab @ 2021-07-25 11:53 UTC (permalink / raw)
  To: Michael Schmitz
  Cc: Eric W. Biederman, geert, linux-arch, linux-m68k, torvalds

On Jul 25 2021, Michael Schmitz wrote:

> I don't see that in ARAnyM.

ARAnyM lacks a lot in its fpu emulation.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


* Re: [PATCH v4 3/3] m68k: track syscalls being traced with shallow user context stack
  2021-07-25 10:05   ` Geert Uytterhoeven
@ 2021-07-25 20:48     ` Michael Schmitz
  2021-07-25 21:00       ` Linus Torvalds
  0 siblings, 1 reply; 37+ messages in thread
From: Michael Schmitz @ 2021-07-25 20:48 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Linux-Arch, linux-m68k, Eric W. Biederman, Linus Torvalds,
	Andreas Schwab

Hi Geert,

thanks for the feedback!

As far as I understand, Eric's 'refactor exit()' patch series has 
obsoleted this band-aid fix of mine. The last remnant of code using 
do_exit() is our fpsp040 copyin/copyout exception handling, and there's 
another patch in testing for that. (I'd need access to a 040 hardware 
setup to properly test that one, but that's a different matter.)

Eric, Andreas - please correct me if I'm wrong (again).

Just out of interest - what would be the correct way to set/clear a 
single bit on Coldfire? Add/subtract the 1<<bit value?

Cheers,

     Michael




* Re: [PATCH v4 3/3] m68k: track syscalls being traced with shallow user context stack
  2021-07-25 20:48     ` Michael Schmitz
@ 2021-07-25 21:00       ` Linus Torvalds
  2021-07-26 14:27         ` Greg Ungerer
  0 siblings, 1 reply; 37+ messages in thread
From: Linus Torvalds @ 2021-07-25 21:00 UTC (permalink / raw)
  To: Michael Schmitz
  Cc: Geert Uytterhoeven, Linux-Arch, linux-m68k, Eric W. Biederman,
	Andreas Schwab

On Sun, Jul 25, 2021 at 1:48 PM Michael Schmitz <schmitzmic@gmail.com> wrote:
>
> Just out of interest - what would be the correct way to set/clear a
> single bit on Coldfire? Add/subtract the 1<<bit value?

I think BSET/BCLR are the way to go.

Or, alternatively, put the constant in a register, and use a longword
memory access. The arithmetic ops don't do immediates _and_ memory
operands in Coldfire, and they don't do byte ops.

Or something like that.
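
As a sketch of why the two approaches end up equivalent, here is a
small C model of a big-endian 32-bit status word. The value (1<<1) and
the "+3" byte offset come from the assembler error quoted earlier in
the thread; struct status_word and the helper names are hypothetical
and only there for illustration.

```c
#include <stdint.h>

#define TINFO_STATUS_BYTE 3          /* the "+3" in %a1@(TINFO_STATUS+3) */
#define TIS_SWITCH_STACK  (1u << 1)  /* value shown in the assembler error */

/* A 32-bit status word laid out big-endian in memory, as on m68k. */
struct status_word {
    uint8_t byte[4];
};

static uint32_t status_load(const struct status_word *s)
{
    return ((uint32_t)s->byte[0] << 24) | ((uint32_t)s->byte[1] << 16) |
           ((uint32_t)s->byte[2] << 8)  |  (uint32_t)s->byte[3];
}

static void status_store(struct status_word *s, uint32_t v)
{
    s->byte[0] = (uint8_t)(v >> 24);
    s->byte[1] = (uint8_t)(v >> 16);
    s->byte[2] = (uint8_t)(v >> 8);
    s->byte[3] = (uint8_t)v;
}

/* Option 1: set/clear one bit in the low byte, in the spirit of BSET/BCLR. */
static void bset_byte(struct status_word *s, unsigned bit)
{
    s->byte[TINFO_STATUS_BYTE] |= (uint8_t)(1u << bit);
}

static void bclr_byte(struct status_word *s, unsigned bit)
{
    s->byte[TINFO_STATUS_BYTE] &= (uint8_t)~(1u << bit);
}

/* Option 2: mask held in a register, longword read-modify-write. */
static void or_long(struct status_word *s, uint32_t mask)
{
    status_store(s, status_load(s) | mask);
}

static void and_long(struct status_word *s, uint32_t mask)
{
    status_store(s, status_load(s) & mask);
}
```

Both routes leave the same value in the word; which one is cheaper on a
given Coldfire core is not something this model says anything about.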

              Linus


* Re: [PATCH v4 0/3] m68k: Improved switch stack handling
  2021-07-25 10:12                             ` Brad Boyer
@ 2021-07-26  2:00                               ` Michael Schmitz
  2021-07-26 19:36                                 ` [RFC][PATCH] signal/m68k: Use force_sigsegv(SIGSEGV) in fpsp040_die Eric W. Biederman
  0 siblings, 1 reply; 37+ messages in thread
From: Michael Schmitz @ 2021-07-26  2:00 UTC (permalink / raw)
  To: Brad Boyer
  Cc: Andreas Schwab, Eric W. Biederman, geert, linux-arch, linux-m68k,
	torvalds

[-- Attachment #1: Type: text/plain, Size: 1873 bytes --]

Thanks Brad, Andreas,

I won't rely on ARAnyM for these tests any longer then.

I would be much obliged if one of the m68k kernel crowd with access to a 
68040 could apply the two attached patches, on top of Eric's 
'refactoring exit' series for preference, and check that any program 
attempting a simple sin() or exp() operation exits with SEGV.

If you know of a way to trace said program and set a breakpoint in 
do_exit(), please also try to inspect saved registers at that point 
(though I'm not sure how to create a dump of the actual registers from 
inside the exception handler to compare with).

Cheers,

     Michael


On 25/07/21 10:12 pm, Brad Boyer wrote:

> On Sun, Jul 25, 2021 at 07:44:11PM +1200, Michael Schmitz wrote:
>> Am 25.07.2021 um 00:05 schrieb Andreas Schwab:
>>> On Jul 24 2021, Michael Schmitz wrote:
>>>
>>>> According to my understanding, you can't get a F-line exception on
>>>> 68040.
>>> The F-line exception vector is used for all FPU illegal and
>>> unimplemented insns.
>> Thanks - now from my reading of the fpsp040 code (which has misled me in
>> the past), it would seem that operations like sin() and exp() ought to raise
>> that exception then. I don't see that in ARAnyM.
> Yes, according to the 68040 user's manual, unimplemented and illegal F-line
> instructions trigger the standard F-line exception vector (11) but have
> separate stack frame formats so the fpsp040 code gets some extra data.
> The CPU does a bunch of the prep work so that part doesn't need to be
> emulated in software.
>
> The ARAnyM docs appear to claim a strange combination that wouldn't
> exist in hardware by implementing a full 68882 instead of the limited
> subset found on a real 68040. Strangely, that might have been easier to
> implement. However, it would also completely bypass any use of fpsp040.
>
> 	Brad Boyer
> 	flar@allandria.com
>

[-- Attachment #2: 0002-m68k-fpsp040-test-changes-to-copyin-out-exception-ha.patch --]
[-- Type: text/x-patch, Size: 1120 bytes --]

From 3df3164dd0f34f3ef7cfaccd079e83a7d146ee5f Mon Sep 17 00:00:00 2001
From: Michael Schmitz <schmitzmic@gmail.com>
Date: Sat, 24 Jul 2021 15:22:58 +1200
Subject: [PATCH 2/2] m68k/fpsp040 - test changes to copyin/out exception
 handling

Call the exception handler in fpsp040/skeleton.S on each f-line
trap. This ought to allow verifying that the added stack frame
is accessible and contains useful data by just tracing a simple
program using one of the floating point operations not supported
by the 68040 FPU.

Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
---
 arch/m68k/fpsp040/skeleton.S | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/m68k/fpsp040/skeleton.S b/arch/m68k/fpsp040/skeleton.S
index 1cbc52b..1ca04bd 100644
--- a/arch/m68k/fpsp040/skeleton.S
+++ b/arch/m68k/fpsp040/skeleton.S
@@ -302,7 +302,8 @@ real_bsun:
 	.global	real_fline
 	.global	fline
 fline:
-	jmp	fpsp_fline
+	jmp	test_fpsp040_die
+	|jmp	fpsp_fline
 real_fline:
 
 	SAVE_ALL_INT
@@ -501,6 +502,7 @@ in_ea:
 
 	.section .fixup,#alloc,#execinstr
 	.even
+test_fpsp040_die:
 1:
 
 	SAVE_ALL_INT
-- 
2.7.4


[-- Attachment #3: 0001-m68k-fpsp040-save-full-stack-frame-before-calling-fp.patch --]
[-- Type: text/x-patch, Size: 1647 bytes --]

From 737b74a376f0b3da09ba7cb088e99c2c85b7405c Mon Sep 17 00:00:00 2001
From: Michael Schmitz <schmitzmic@gmail.com>
Date: Sun, 18 Jul 2021 10:31:42 +1200
Subject: [PATCH 1/2] m68k/fpsp040 - save full stack frame before calling
 fpsp040_die

The FPSP040 floating point support code does not know how to
handle user space access faults gracefully, and just calls
do_exit(SIGSEGV) indirectly on these faults to abort.

do_exit() may stop if traced, and needs a full stack frame
available to avoid exposing kernel data.

Add the current stack frame before calling do_exit() from the
fpsp040 user access exception handler. The top of the stack frame
is saved to task->thread.esp0, as is done for system calls.

Unwind the stack frame and return to caller once done, in case
do_exit() is replaced by force_sig() later on. Note that this
will allow the current exception handler to continue with
incorrect state, but the results will never make it to the
calling user program, which is terminated by SIGSEGV upon return
from exception.

CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
---
 arch/m68k/fpsp040/skeleton.S | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/m68k/fpsp040/skeleton.S b/arch/m68k/fpsp040/skeleton.S
index a8f4161..1cbc52b 100644
--- a/arch/m68k/fpsp040/skeleton.S
+++ b/arch/m68k/fpsp040/skeleton.S
@@ -502,7 +502,14 @@ in_ea:
 	.section .fixup,#alloc,#execinstr
 	.even
 1:
+
+	SAVE_ALL_INT
+	| save top of frame
+	movel	%sp,%curptr@(TASK_THREAD+THREAD_ESP0)
+	SAVE_SWITCH_STACK
 	jbra	fpsp040_die
+	lea	44(%sp),%sp
+	rts
 
 	.section __ex_table,#alloc
 	.align	4
-- 
2.7.4



* Re: [PATCH v4 3/3] m68k: track syscalls being traced with shallow user context stack
  2021-07-25 21:00       ` Linus Torvalds
@ 2021-07-26 14:27         ` Greg Ungerer
  0 siblings, 0 replies; 37+ messages in thread
From: Greg Ungerer @ 2021-07-26 14:27 UTC (permalink / raw)
  To: Linus Torvalds, Michael Schmitz
  Cc: Geert Uytterhoeven, Linux-Arch, linux-m68k, Eric W. Biederman,
	Andreas Schwab


On 26/7/21 7:00 am, Linus Torvalds wrote:
> On Sun, Jul 25, 2021 at 1:48 PM Michael Schmitz <schmitzmic@gmail.com> wrote:
>>
>> Just out of interest - what would be the correct way to set/clear a
>> single bit on Coldfire? Add/subtract the 1<<bit value?
> 
> I think BSET/BCLR are the way to go.

Yep, they are available on all ColdFire revisions.
I think they are the best choice here.


> Or, alternatively, put the constant in a register, and use a longword
> memory access. The arithmetic ops don't do immediates _and_ memory
> operands in Coldfire, and they don't do byte ops.
> 
> Or something like that.

Yes, that is right with one exception. You can use addq with immediates
of value 1 to 8 with memory operands.
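
To make the BSET/BCLR semantics concrete, here is a minimal C model. It
is a sketch only, not kernel code. It assumes the usual m68k/ColdFire
behaviour for a memory destination: the operation is byte-sized, the bit
number is taken modulo 8, and Z reflects the state of the bit before it
was changed. The helper names (bset_mem, bclr_mem, set_flag_bit_be) are
made up for this example.

```c
#include <stdint.h>

/*
 * Model of BSET with a memory destination: byte-sized, bit number
 * taken modulo 8. Returns the Z condition code, i.e. 1 if the bit
 * was clear before the operation.
 */
static int bset_mem(uint8_t *byte, unsigned bit)
{
	uint8_t mask = (uint8_t)(1u << (bit & 7));
	int z = (*byte & mask) == 0;

	*byte |= mask;
	return z;
}

/* Model of BCLR with a memory destination, same conventions as above. */
static int bclr_mem(uint8_t *byte, unsigned bit)
{
	uint8_t mask = (uint8_t)(1u << (bit & 7));
	int z = (*byte & mask) == 0;

	*byte &= (uint8_t)~mask;
	return z;
}

/*
 * Hypothetical helper: set bit n (0..31) of a 32-bit flags word laid
 * out big-endian in memory, as on m68k. Byte 0 holds bits 31..24, so
 * bit n lives in byte 3 - n/8.
 */
static void set_flag_bit_be(uint8_t field[4], unsigned n)
{
	bset_mem(&field[3 - (n >> 3)], n & 7);
}
```

The point of set_flag_bit_be() is that a byte-sized BSET on a longword
flag needs the byte offset adjusted for big-endian layout, which is easy
to get wrong when converting from a longword OR.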

Regards
Greg



* [RFC][PATCH] signal/m68k: Use force_sigsegv(SIGSEGV) in fpsp040_die
  2021-07-26  2:00                               ` Michael Schmitz
@ 2021-07-26 19:36                                 ` Eric W. Biederman
  2021-07-26 20:13                                   ` Andreas Schwab
  2021-07-26 20:29                                   ` Michael Schmitz
  0 siblings, 2 replies; 37+ messages in thread
From: Eric W. Biederman @ 2021-07-26 19:36 UTC (permalink / raw)
  To: Michael Schmitz
  Cc: Brad Boyer, Andreas Schwab, geert, linux-arch, linux-m68k, torvalds


In the fpsp040 code when copyin or copyout fails call
force_sigsegv(SIGSEGV) instead of do_exit(SIGSEGV).

This solves a couple of problems.  Because do_exit embeds the ptrace
stop PTRACE_EVENT_EXIT a complete stack frame needs to be present for
that to work correctly.  There is always the information needed for a
ptrace stop where get_signal is called.  So exiting with a signal
solves the ptrace issue.

Further exiting with a signal ensures that all of the threads in a
process are killed not just the thread that malfunctioned.  Which
avoids confusing userspace.

To make force_sigsegv(SIGSEGV) work in fpsp040_die modify the code to
save all of the registers and jump to ret_from_exception (which
ultimately calls get_signal) after fpsp040_die returns.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---

Can someone please check my m68k assembly changes?

I think I have them correct, and the code assembles but I don't
understand the fine points of when the different branch instructions
should be used.

 arch/m68k/fpsp040/skeleton.S | 3 ++-
 arch/m68k/kernel/traps.c     | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/m68k/fpsp040/skeleton.S b/arch/m68k/fpsp040/skeleton.S
index a8f41615d94a..ec767523c012 100644
--- a/arch/m68k/fpsp040/skeleton.S
+++ b/arch/m68k/fpsp040/skeleton.S
@@ -502,7 +502,8 @@ in_ea:
 	.section .fixup,#alloc,#execinstr
 	.even
 1:
-	jbra	fpsp040_die
+	bsrl	fpsp040_die
+	jmp	.Lnotkern
 
 	.section __ex_table,#alloc
 	.align	4
diff --git a/arch/m68k/kernel/traps.c b/arch/m68k/kernel/traps.c
index 9e1261462bcc..5b19fcdcd69e 100644
--- a/arch/m68k/kernel/traps.c
+++ b/arch/m68k/kernel/traps.c
@@ -1150,7 +1150,7 @@ asmlinkage void set_esp0(unsigned long ssp)
  */
 asmlinkage void fpsp040_die(void)
 {
-	do_exit(SIGSEGV);
+	force_sigsegv(SIGSEGV);
 }
 
 #ifdef CONFIG_M68KFPU_EMU
-- 
2.20.1



* Re: [RFC][PATCH] signal/m68k: Use force_sigsegv(SIGSEGV) in fpsp040_die
  2021-07-26 19:36                                 ` [RFC][PATCH] signal/m68k: Use force_sigsegv(SIGSEGV) in fpsp040_die Eric W. Biederman
@ 2021-07-26 20:13                                   ` Andreas Schwab
  2021-07-26 20:29                                     ` Eric W. Biederman
  2021-07-26 20:29                                   ` Michael Schmitz
  1 sibling, 1 reply; 37+ messages in thread
From: Andreas Schwab @ 2021-07-26 20:13 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Michael Schmitz, Brad Boyer, geert, linux-arch, linux-m68k, torvalds

On Jul 26 2021, Eric W. Biederman wrote:

> diff --git a/arch/m68k/fpsp040/skeleton.S b/arch/m68k/fpsp040/skeleton.S
> index a8f41615d94a..ec767523c012 100644
> --- a/arch/m68k/fpsp040/skeleton.S
> +++ b/arch/m68k/fpsp040/skeleton.S
> @@ -502,7 +502,8 @@ in_ea:
>  	.section .fixup,#alloc,#execinstr
>  	.even
>  1:
> -	jbra	fpsp040_die
> +	bsrl	fpsp040_die
> +	jmp	.Lnotkern

That should be jbra instead of jmp.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


* Re: [RFC][PATCH] signal/m68k: Use force_sigsegv(SIGSEGV) in fpsp040_die
  2021-07-26 20:13                                   ` Andreas Schwab
@ 2021-07-26 20:29                                     ` Eric W. Biederman
  2021-07-26 21:25                                       ` Andreas Schwab
  0 siblings, 1 reply; 37+ messages in thread
From: Eric W. Biederman @ 2021-07-26 20:29 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Michael Schmitz, Brad Boyer, geert, linux-arch, linux-m68k, torvalds

Andreas Schwab <schwab@linux-m68k.org> writes:

> On Jul 26 2021, Eric W. Biederman wrote:
>
>> diff --git a/arch/m68k/fpsp040/skeleton.S b/arch/m68k/fpsp040/skeleton.S
>> index a8f41615d94a..ec767523c012 100644
>> --- a/arch/m68k/fpsp040/skeleton.S
>> +++ b/arch/m68k/fpsp040/skeleton.S
>> @@ -502,7 +502,8 @@ in_ea:
>>  	.section .fixup,#alloc,#execinstr
>>  	.even
>>  1:
>> -	jbra	fpsp040_die
>> +	bsrl	fpsp040_die
>> +	jmp	.Lnotkern
>
> That should be jbra instead of jmp.

I will update my patch.  Mind if I ask what the difference is?

I could not find a reference mentioning jbra.  Do I need to look in the
gas source or do you know if there is a better source?

Eric



* Re: [RFC][PATCH] signal/m68k: Use force_sigsegv(SIGSEGV) in fpsp040_die
  2021-07-26 19:36                                 ` [RFC][PATCH] signal/m68k: Use force_sigsegv(SIGSEGV) in fpsp040_die Eric W. Biederman
  2021-07-26 20:13                                   ` Andreas Schwab
@ 2021-07-26 20:29                                   ` Michael Schmitz
  2021-07-26 21:08                                     ` [PATCH] " Eric W. Biederman
  1 sibling, 1 reply; 37+ messages in thread
From: Michael Schmitz @ 2021-07-26 20:29 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Brad Boyer, Andreas Schwab, geert, linux-arch, linux-m68k, torvalds

Hi Eric,

looks good to me!

On 27/07/21 7:36 am, Eric W. Biederman wrote:
> In the fpsp040 code when copyin or copyout fails call
> force_sigsegv(SIGSEGV) instead of do_exit(SIGSEGV).
>
> This solves a couple of problems.  Because do_exit embeds the ptrace
> stop PTRACE_EVENT_EXIT a complete stack frame needs to be present for
> that to work correctly.  There is always the information needed for a
> ptrace stop where get_signal is called.  So exiting with a signal
> solves the ptrace issue.
>
> Further exiting with a signal ensures that all of the threads in a
> process are killed not just the thread that malfunctioned.  Which
> avoids confusing userspace.
>
> To make force_sigsegv(SIGSEGV) work in fpsp040_die modify the code to
> save all of the registers and jump to ret_from_exception (which
> ultimately calls get_signal) after fpsp040_die returns.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>
> Can someone please check my m68k assembly changes?
>
> I think I have them correct, and the code assembles but I don't
> understand the fine points of when the different branch instructions
> should be used.

Since the exception handler ends up in a different text section from the 
actual code, long offsets are in use for jumps there.

According to the gas manual (and pointed out by Andreas just now), 'jmp' 
is used only for longword offsets on 68000/010. Use 'bral' for 020 etc. 
The pseudo-ops 'jra' or 'jbra' will pick the correct version (shortest 
offset possible). Similar for 'jbsr' when calling a subroutine.

  1:
-    jbra    fpsp040_die
+    jbsr    fpsp040_die
+    jbra    .Lnotkern

would be the most generic version to write this (but as this code is 
never used on 68000, 'bsrl' and 'jbra' are perfectly OK).
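
The size selection can be modelled in a few lines of C. This is only a
sketch of the branch table in the gas manual, assuming the displacement
is already known; the real assembler also has to cope with symbols whose
distance is not known until later. The names relax_jbra and enum cpu are
hypothetical, chosen for this example.

```c
#include <stdint.h>

enum cpu { CPU_68000, CPU_68020_UP };

/*
 * Return the mnemonic that the 'jbra'/'jra' pseudo-op would be
 * relaxed to for a given branch displacement (assumed known).
 */
static const char *relax_jbra(int64_t disp, enum cpu cpu)
{
	/*
	 * The 8-bit displacement field values 0x00 and 0xff are escape
	 * codes selecting the word and long forms, so 0 and -1 cannot
	 * be encoded as short branches.
	 */
	if (disp >= -128 && disp <= 127 && disp != 0 && disp != -1)
		return "bras";
	if (disp >= -32768 && disp <= 32767)
		return "braw";
	/* 32-bit displacements need 68020 or later; else absolute jmp. */
	return cpu == CPU_68020_UP ? "bral" : "jmp";
}
```

For the fixup code above, the target sits in a different section, so on
020 and later one would expect the long form to be picked.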

Cheers,

     Michael

>
>   arch/m68k/fpsp040/skeleton.S | 3 ++-
>   arch/m68k/kernel/traps.c     | 2 +-
>   2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/m68k/fpsp040/skeleton.S b/arch/m68k/fpsp040/skeleton.S
> index a8f41615d94a..ec767523c012 100644
> --- a/arch/m68k/fpsp040/skeleton.S
> +++ b/arch/m68k/fpsp040/skeleton.S
> @@ -502,7 +502,8 @@ in_ea:
>   	.section .fixup,#alloc,#execinstr
>   	.even
>   1:
> -	jbra	fpsp040_die
> +	bsrl	fpsp040_die
> +	jmp	.Lnotkern
>   
>   	.section __ex_table,#alloc
>   	.align	4
> diff --git a/arch/m68k/kernel/traps.c b/arch/m68k/kernel/traps.c
> index 9e1261462bcc..5b19fcdcd69e 100644
> --- a/arch/m68k/kernel/traps.c
> +++ b/arch/m68k/kernel/traps.c
> @@ -1150,7 +1150,7 @@ asmlinkage void set_esp0(unsigned long ssp)
>    */
>   asmlinkage void fpsp040_die(void)
>   {
> -	do_exit(SIGSEGV);
> +	force_sigsegv(SIGSEGV);
>   }
>   
>   #ifdef CONFIG_M68KFPU_EMU


* [PATCH] signal/m68k: Use force_sigsegv(SIGSEGV) in fpsp040_die
  2021-07-26 20:29                                   ` Michael Schmitz
@ 2021-07-26 21:08                                     ` Eric W. Biederman
  2021-08-25 15:56                                       ` Eric W. Biederman
  2021-08-26 12:15                                       ` Geert Uytterhoeven
  0 siblings, 2 replies; 37+ messages in thread
From: Eric W. Biederman @ 2021-07-26 21:08 UTC (permalink / raw)
  To: Michael Schmitz
  Cc: Brad Boyer, Andreas Schwab, geert, linux-arch, linux-m68k, torvalds


In the fpsp040 code when copyin or copyout fails call
force_sigsegv(SIGSEGV) instead of do_exit(SIGSEGV).

This solves a couple of problems.  Because do_exit embeds the ptrace
stop PTRACE_EVENT_EXIT a complete stack frame needs to be present for
that to work correctly.  There is always the information needed for a
ptrace stop where get_signal is called.  So exiting with a signal
solves the ptrace issue.

Further exiting with a signal ensures that all of the threads in a
process are killed not just the thread that malfunctioned.  Which
avoids confusing userspace.

To make force_sigsegv(SIGSEGV) work in fpsp040_die modify the code to
save all of the registers and jump to ret_from_exception (which
ultimately calls get_signal) after fpsp040_die returns.

v2: Updated the branches to use gas's pseudo ops that automatically
    calculate the best branch instruction to use for the purpose.

v1: Link: https://lkml.kernel.org/r/87a6m8kgtx.fsf_-_@disp2133
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 arch/m68k/fpsp040/skeleton.S | 3 ++-
 arch/m68k/kernel/traps.c     | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/m68k/fpsp040/skeleton.S b/arch/m68k/fpsp040/skeleton.S
index a8f41615d94a..439395aa6fb4 100644
--- a/arch/m68k/fpsp040/skeleton.S
+++ b/arch/m68k/fpsp040/skeleton.S
@@ -502,7 +502,8 @@ in_ea:
 	.section .fixup,#alloc,#execinstr
 	.even
 1:
-	jbra	fpsp040_die
+	jbsr	fpsp040_die
+	jbra	.Lnotkern
 
 	.section __ex_table,#alloc
 	.align	4
diff --git a/arch/m68k/kernel/traps.c b/arch/m68k/kernel/traps.c
index 9e1261462bcc..5b19fcdcd69e 100644
--- a/arch/m68k/kernel/traps.c
+++ b/arch/m68k/kernel/traps.c
@@ -1150,7 +1150,7 @@ asmlinkage void set_esp0(unsigned long ssp)
  */
 asmlinkage void fpsp040_die(void)
 {
-	do_exit(SIGSEGV);
+	force_sigsegv(SIGSEGV);
 }
 
 #ifdef CONFIG_M68KFPU_EMU
-- 
2.20.1



* Re: [RFC][PATCH] signal/m68k: Use force_sigsegv(SIGSEGV) in fpsp040_die
  2021-07-26 20:29                                     ` Eric W. Biederman
@ 2021-07-26 21:25                                       ` Andreas Schwab
  0 siblings, 0 replies; 37+ messages in thread
From: Andreas Schwab @ 2021-07-26 21:25 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Michael Schmitz, Brad Boyer, geert, linux-arch, linux-m68k, torvalds

On Jul 26 2021, Eric W. Biederman wrote:

> I could not find a reference mentioning jbra.  Do I need to look in the
> gas source or do you know if there is a better source?

It's a pseudo insn that is relaxed to the optimal branch insn.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


* Re: [PATCH] signal/m68k: Use force_sigsegv(SIGSEGV) in fpsp040_die
  2021-07-26 21:08                                     ` [PATCH] " Eric W. Biederman
@ 2021-08-25 15:56                                       ` Eric W. Biederman
  2021-08-26 12:15                                       ` Geert Uytterhoeven
  1 sibling, 0 replies; 37+ messages in thread
From: Eric W. Biederman @ 2021-08-25 15:56 UTC (permalink / raw)
  To: Michael Schmitz
  Cc: Brad Boyer, Andreas Schwab, geert, linux-arch, linux-m68k, torvalds

ebiederm@xmission.com (Eric W. Biederman) writes:

> In the fpsp040 code when copyin or copyout fails call
> force_sigsegv(SIGSEGV) instead of do_exit(SIGSEGV).
>
> This solves a couple of problems.  Because do_exit embeds the ptrace
> stop PTRACE_EVENT_EXIT a complete stack frame needs to be present for
> that to work correctly.  There is always the information needed for a
> ptrace stop where get_signal is called.  So exiting with a signal
> solves the ptrace issue.
>
> Further exiting with a signal ensures that all of the threads in a
> process are killed not just the thread that malfunctioned.  Which
> avoids confusing userspace.
>
> To make force_sigsegv(SIGSEGV) work in fpsp040_die modify the code to
> save all of the registers and jump to ret_from_exception (which
> ultimately calls get_signal) after fpsp040_die returns.
>
> v2: Updated the branches to use gas's pseudo ops that automatically
>     calculate the best branch instruction to use for the purpose.
>
> v1: Link: https://lkml.kernel.org/r/87a6m8kgtx.fsf_-_@disp2133
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Any chance I can get an ack on this patch?

Eric

> ---
>  arch/m68k/fpsp040/skeleton.S | 3 ++-
>  arch/m68k/kernel/traps.c     | 2 +-
>  2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/m68k/fpsp040/skeleton.S b/arch/m68k/fpsp040/skeleton.S
> index a8f41615d94a..439395aa6fb4 100644
> --- a/arch/m68k/fpsp040/skeleton.S
> +++ b/arch/m68k/fpsp040/skeleton.S
> @@ -502,7 +502,8 @@ in_ea:
>  	.section .fixup,#alloc,#execinstr
>  	.even
>  1:
> -	jbra	fpsp040_die
> +	jbsr	fpsp040_die
> +	jbra	.Lnotkern
>  
>  	.section __ex_table,#alloc
>  	.align	4
> diff --git a/arch/m68k/kernel/traps.c b/arch/m68k/kernel/traps.c
> index 9e1261462bcc..5b19fcdcd69e 100644
> --- a/arch/m68k/kernel/traps.c
> +++ b/arch/m68k/kernel/traps.c
> @@ -1150,7 +1150,7 @@ asmlinkage void set_esp0(unsigned long ssp)
>   */
>  asmlinkage void fpsp040_die(void)
>  {
> -	do_exit(SIGSEGV);
> +	force_sigsegv(SIGSEGV);
>  }
>  
>  #ifdef CONFIG_M68KFPU_EMU


* Re: [PATCH] signal/m68k: Use force_sigsegv(SIGSEGV) in fpsp040_die
  2021-07-26 21:08                                     ` [PATCH] " Eric W. Biederman
  2021-08-25 15:56                                       ` Eric W. Biederman
@ 2021-08-26 12:15                                       ` Geert Uytterhoeven
  1 sibling, 0 replies; 37+ messages in thread
From: Geert Uytterhoeven @ 2021-08-26 12:15 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Michael Schmitz, Brad Boyer, Andreas Schwab, Linux-Arch,
	linux-m68k, Linus Torvalds

On Mon, Jul 26, 2021 at 11:08 PM Eric W. Biederman
<ebiederm@xmission.com> wrote:
> In the fpsp040 code when copyin or copyout fails call
> force_sigsegv(SIGSEGV) instead of do_exit(SIGSEGV).
>
> This solves a couple of problems.  Because do_exit embeds the ptrace
> stop PTRACE_EVENT_EXIT a complete stack frame needs to be present for
> that to work correctly.  There is always the information needed for a
> ptrace stop where get_signal is called.  So exiting with a signal
> solves the ptrace issue.
>
> Further exiting with a signal ensures that all of the threads in a
> process are killed not just the thread that malfunctioned.  Which
> avoids confusing userspace.
>
> To make force_sigsegv(SIGSEGV) work in fpsp040_die modify the code to
> save all of the registers and jump to ret_from_exception (which
> ultimately calls get_signal) after fpsp040_die returns.
>
> v2: Updated the branches to use gas's pseudo ops that automatically
>     calculate the best branch instruction to use for the purpose.
>
> v1: Link: https://lkml.kernel.org/r/87a6m8kgtx.fsf_-_@disp2133
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


Thread overview: 37+ messages
2021-06-23  0:21 [PATCH v4 0/3] m68k: Improved switch stack handling Michael Schmitz
2021-06-23  0:21 ` [PATCH v4 1/3] m68k: save extra registers on more syscall entry points Michael Schmitz
2021-06-23  0:21 ` [PATCH v4 2/3] m68k: correctly handle IO worker stack frame set-up Michael Schmitz
2021-06-23  0:21 ` [PATCH v4 3/3] m68k: track syscalls being traced with shallow user context stack Michael Schmitz
2021-07-25 10:05   ` Geert Uytterhoeven
2021-07-25 20:48     ` Michael Schmitz
2021-07-25 21:00       ` Linus Torvalds
2021-07-26 14:27         ` Greg Ungerer
2021-07-15 13:29 ` [PATCH v4 0/3] m68k: Improved switch stack handling Eric W. Biederman
2021-07-15 23:10   ` Michael Schmitz
2021-07-17  5:38     ` Michael Schmitz
2021-07-17 18:52       ` Eric W. Biederman
2021-07-17 20:09         ` Michael Schmitz
2021-07-17 23:04           ` Michael Schmitz
2021-07-18 10:47             ` Andreas Schwab
2021-07-18 19:47               ` Michael Schmitz
2021-07-18 20:59                 ` Brad Boyer
2021-07-19  3:15                   ` Michael Schmitz
2021-07-20 20:32             ` Eric W. Biederman
2021-07-20 22:16               ` Michael Schmitz
2021-07-22 14:49                 ` Eric W. Biederman
2021-07-23  4:23                   ` Michael Schmitz
2021-07-23 22:31                     ` Eric W. Biederman
2021-07-23 23:52                       ` Michael Schmitz
2021-07-24 12:05                         ` Andreas Schwab
2021-07-25  7:44                           ` Michael Schmitz
2021-07-25 10:12                             ` Brad Boyer
2021-07-26  2:00                               ` Michael Schmitz
2021-07-26 19:36                                 ` [RFC][PATCH] signal/m68k: Use force_sigsegv(SIGSEGV) in fpsp040_die Eric W. Biederman
2021-07-26 20:13                                   ` Andreas Schwab
2021-07-26 20:29                                     ` Eric W. Biederman
2021-07-26 21:25                                       ` Andreas Schwab
2021-07-26 20:29                                   ` Michael Schmitz
2021-07-26 21:08                                     ` [PATCH] " Eric W. Biederman
2021-08-25 15:56                                       ` Eric W. Biederman
2021-08-26 12:15                                       ` Geert Uytterhoeven
2021-07-25 11:53                             ` [PATCH v4 0/3] m68k: Improved switch stack handling Andreas Schwab
