linux-m68k.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC][CFT] signal handling fixes
@ 2021-07-25 17:18 Al Viro
  2021-07-25 17:19 ` [PATCH 1/3] m68k: handle arrivals of multiple signals correctly Al Viro
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Al Viro @ 2021-07-25 17:18 UTC (permalink / raw)
  To: linux-m68k; +Cc: Geert Uytterhoeven, Greg Ungerer, linux-kernel

	Back in 2012 or so I'd found a bunch of fun issues with multiple
pending signals on a lot of architectures.  m68k looked scarier than
usual (due to the combination of variable-sized exception frames with the
way kernel stack pointer is handled by the hardware), but I'd convinced
myself that it had been correct.

	Unfortunately, I was wrong - handling of multiple pending signals
does *not* work correctly there.

	Some background: wrt exception stack frames m68k variants fall
into 3 groups -
	1) 68000 and near relatives (all non-MMU): push 32bit PC, then
push 16bit SR.
	2) everything later than that, except for coldfire: push a
variable amount of auxillary data (used for insn restart, among other
things), then 16bit value encoding the format (upper 4 bits) and vector
(lower 12), then same as for (1) - 32bit PC and 16bit SR.  Size of
variable part depends upon the frame type (upper 4 bits of frame/vector
word).	Note that CPU32 falls into that group, even though it's non-MMU.
	3) coldfire (both MMU and non-MMU): push 32bit PC, then 16bit SR,
then 16bit word superficially similar to format/vector combination on (2).

	Handling of (2) is complicated, since we need the right frame
type when we go from kernel to userland.  In particular, we want format 0
(8-byte) for entering a signal handler, no matter how did we enter the
kernel when we caught the signal.  Conversely, when we return from signal
handler, we have format 0 on kernel entry (sigreturn(2) is a syscall) and
we need whatever frame we used to have back when we'd caught the signal.

	The monumentally unpleasant part is that we *must* leave the
kernel mode with the same value of kernel stack pointer we had on entry.
Crossing from user to kernel mode does not set the kernel stack pointer
to known value - its value is kept since the last time we'd left the
kernel mode.

	The sigreturn part is ugly as hell.  Signal delivery avoids
quite that level of nastiness by the following trick: there's an int
(stkadj, initially 0) between the exception frame and the rest of pt_regs.
On the way back from exception we pop the registers, then add stkadj +
4 to stack pointer before doing RTE.  Normally stkadj remains zero;
if we need to shrink the exception stack frame, we put the minimal one
over the aux data and store the offset into stkadj.  When on the way out
we pop our way through the registers, we'll end up with stack pointer
pointing to stkadj (4 bytes below the original exception stack frame)
and once we add 4 + stkadj to stack pointer, we have the minimal exception
stack frame on top of stack and RTE does the right thing.

	The problem with that trick is that exception stack frame in
pt_regs is no longer valid; in the best case regs->format will match
the original exception stack frame format and in the worst case it'll
be overwritten by bits 31..28 of signal handler address (if aux data
used to be 4 bytes long).

	ptrace get_regs()/set_regs() takes that effect into account when
accessing PC and SR; unfortunately, setup_frame() and setup_rt_frame()
do not.  That's not a problem for the first signal - ->stkadj is still
0 at that point.  However, if we end up building multiple sigframes,
we might get screwed.  Not hard to fix, thankfully...

	Another bug is on the sigreturn side of things, and that one is
my fault - in bd6f56a75bb2 ("m68k: Missing syscall_trace() on sigreturn")
I'd missed the fact that we'd just relocated pt_regs, without having
updated ->thread.esp0.

	These two are, IMO, -stable fodder.  The third one isn't -
it cleans sigreturn, hopefully making it less convoluted.  Instead of
doing unnatural things to C stack frames, call chain, etc. we let the
asm wrapper of {rt_,}sigreturn(2) do the following:
	reserve a gap on stack, sufficiently large to hold any aux data
	then call the C side of things, passing pt_regs and switch_stack
pointers to it, same as we do now if C part decides to insert aux data,
	it can simply use memmove
to shift switch_stack + pt_regs and memcpy whatever's needed into the
created gap.  Then we can return without any kind of magic - the C part
of stack is intact.  Just return the new location of switch_stack +
pt_regs to the (asm) caller, so it could set stack pointer to it.

	The series is on top of 5.14-rc1; it lives in
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git #untested.m68k
Individual patches in followups...

	_Very_ lightly tested on aranym; no real hardware to test it on.
Any help with review and testing would be very welcome.

PS:  FWIW, ifdefs in arch/m68k/kernel/signal.c are wrong - it's not !MMU
vs. coldfire/MMU vs. classic/MMU.  It's actually 68000 vs. coldfire vs.
everything else.  These days it's nearly correct, but only because on MMU
variants of coldfire we never see exception stack frames with type other
than 4 - it's controlled by alignment of kernel stack pointer on those,
and it's under the kernel control, so it's always 32bit-aligned.  It used
to be more serious back when we had 68360 support - that's !MMU and exception
stack frames are like those on 68020, unless I'm misreading their manual...

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/3] m68k: handle arrivals of multiple signals correctly
  2021-07-25 17:18 [RFC][CFT] signal handling fixes Al Viro
@ 2021-07-25 17:19 ` Al Viro
  2021-09-15 22:08   ` Michael Schmitz
  2021-07-25 17:19 ` [PATCH 2/3] m68k: update ->thread.esp0 before calling syscall_trace() in ret_from_signal Al Viro
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Al Viro @ 2021-07-25 17:19 UTC (permalink / raw)
  To: linux-m68k; +Cc: Geert Uytterhoeven, Greg Ungerer, linux-kernel

When we have several pending signals, have entered with the kernel
with large exception frame *and* have already built at least one
sigframe, regs->stkadj is going to be non-zero and regs->format/sr/pc
are going to be junk - the real values are in shifted exception stack
frame we'd built when putting together the first sigframe.

If that happens, subsequent sigframes are going to be garbage.
Not hard to fix - just need to find the "adjusted" frame first
and look for format/vector/sr/pc in it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/m68k/kernel/signal.c | 88 ++++++++++++++++++++++-------------------------
 1 file changed, 42 insertions(+), 46 deletions(-)

diff --git a/arch/m68k/kernel/signal.c b/arch/m68k/kernel/signal.c
index 8f215e79e70e..cd11eb101eac 100644
--- a/arch/m68k/kernel/signal.c
+++ b/arch/m68k/kernel/signal.c
@@ -447,7 +447,7 @@ static inline void save_fpu_state(struct sigcontext *sc, struct pt_regs *regs)
 
 	if (CPU_IS_060 ? sc->sc_fpstate[2] : sc->sc_fpstate[0]) {
 		fpu_version = sc->sc_fpstate[0];
-		if (CPU_IS_020_OR_030 &&
+		if (CPU_IS_020_OR_030 && !regs->stkadj &&
 		    regs->vector >= (VEC_FPBRUC * 4) &&
 		    regs->vector <= (VEC_FPNAN * 4)) {
 			/* Clear pending exception in 68882 idle frame */
@@ -510,7 +510,7 @@ static inline int rt_save_fpu_state(struct ucontext __user *uc, struct pt_regs *
 		if (!(CPU_IS_060 || CPU_IS_COLDFIRE))
 			context_size = fpstate[1];
 		fpu_version = fpstate[0];
-		if (CPU_IS_020_OR_030 &&
+		if (CPU_IS_020_OR_030 && !regs->stkadj &&
 		    regs->vector >= (VEC_FPBRUC * 4) &&
 		    regs->vector <= (VEC_FPNAN * 4)) {
 			/* Clear pending exception in 68882 idle frame */
@@ -832,18 +832,24 @@ asmlinkage int do_rt_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
 	return 0;
 }
 
+static inline struct pt_regs *rte_regs(struct pt_regs *regs)
+{
+	return (void *)regs + regs->stkadj;
+}
+
 static void setup_sigcontext(struct sigcontext *sc, struct pt_regs *regs,
 			     unsigned long mask)
 {
+	struct pt_regs *tregs = rte_regs(regs);
 	sc->sc_mask = mask;
 	sc->sc_usp = rdusp();
 	sc->sc_d0 = regs->d0;
 	sc->sc_d1 = regs->d1;
 	sc->sc_a0 = regs->a0;
 	sc->sc_a1 = regs->a1;
-	sc->sc_sr = regs->sr;
-	sc->sc_pc = regs->pc;
-	sc->sc_formatvec = regs->format << 12 | regs->vector;
+	sc->sc_sr = tregs->sr;
+	sc->sc_pc = tregs->pc;
+	sc->sc_formatvec = tregs->format << 12 | tregs->vector;
 	save_a5_state(sc, regs);
 	save_fpu_state(sc, regs);
 }
@@ -851,6 +857,7 @@ static void setup_sigcontext(struct sigcontext *sc, struct pt_regs *regs,
 static inline int rt_setup_ucontext(struct ucontext __user *uc, struct pt_regs *regs)
 {
 	struct switch_stack *sw = (struct switch_stack *)regs - 1;
+	struct pt_regs *tregs = rte_regs(regs);
 	greg_t __user *gregs = uc->uc_mcontext.gregs;
 	int err = 0;
 
@@ -871,9 +878,9 @@ static inline int rt_setup_ucontext(struct ucontext __user *uc, struct pt_regs *
 	err |= __put_user(sw->a5, &gregs[13]);
 	err |= __put_user(sw->a6, &gregs[14]);
 	err |= __put_user(rdusp(), &gregs[15]);
-	err |= __put_user(regs->pc, &gregs[16]);
-	err |= __put_user(regs->sr, &gregs[17]);
-	err |= __put_user((regs->format << 12) | regs->vector, &uc->uc_formatvec);
+	err |= __put_user(tregs->pc, &gregs[16]);
+	err |= __put_user(tregs->sr, &gregs[17]);
+	err |= __put_user((tregs->format << 12) | tregs->vector, &uc->uc_formatvec);
 	err |= rt_save_fpu_state(uc, regs);
 	return err;
 }
@@ -890,13 +897,14 @@ static int setup_frame(struct ksignal *ksig, sigset_t *set,
 			struct pt_regs *regs)
 {
 	struct sigframe __user *frame;
-	int fsize = frame_extra_sizes(regs->format);
+	struct pt_regs *tregs = rte_regs(regs);
+	int fsize = frame_extra_sizes(tregs->format);
 	struct sigcontext context;
 	int err = 0, sig = ksig->sig;
 
 	if (fsize < 0) {
 		pr_debug("setup_frame: Unknown frame format %#x\n",
-			 regs->format);
+			 tregs->format);
 		return -EFAULT;
 	}
 
@@ -907,7 +915,7 @@ static int setup_frame(struct ksignal *ksig, sigset_t *set,
 
 	err |= __put_user(sig, &frame->sig);
 
-	err |= __put_user(regs->vector, &frame->code);
+	err |= __put_user(tregs->vector, &frame->code);
 	err |= __put_user(&frame->sc, &frame->psc);
 
 	if (_NSIG_WORDS > 1)
@@ -934,33 +942,27 @@ static int setup_frame(struct ksignal *ksig, sigset_t *set,
 	push_cache ((unsigned long) &frame->retcode);
 
 	/*
-	 * Set up registers for signal handler.  All the state we are about
-	 * to destroy is successfully copied to sigframe.
-	 */
-	wrusp ((unsigned long) frame);
-	regs->pc = (unsigned long) ksig->ka.sa.sa_handler;
-	adjustformat(regs);
-
-	/*
 	 * This is subtle; if we build more than one sigframe, all but the
 	 * first one will see frame format 0 and have fsize == 0, so we won't
 	 * screw stkadj.
 	 */
-	if (fsize)
+	if (fsize) {
 		regs->stkadj = fsize;
-
-	/* Prepare to skip over the extra stuff in the exception frame.  */
-	if (regs->stkadj) {
-		struct pt_regs *tregs =
-			(struct pt_regs *)((ulong)regs + regs->stkadj);
+		tregs = rte_regs(regs);
 		pr_debug("Performing stackadjust=%04lx\n", regs->stkadj);
-		/* This must be copied with decreasing addresses to
-                   handle overlaps.  */
 		tregs->vector = 0;
 		tregs->format = 0;
-		tregs->pc = regs->pc;
 		tregs->sr = regs->sr;
 	}
+
+	/*
+	 * Set up registers for signal handler.  All the state we are about
+	 * to destroy is successfully copied to sigframe.
+	 */
+	wrusp ((unsigned long) frame);
+	tregs->pc = (unsigned long) ksig->ka.sa.sa_handler;
+	adjustformat(regs);
+
 	return 0;
 }
 
@@ -968,7 +970,8 @@ static int setup_rt_frame(struct ksignal *ksig, sigset_t *set,
 			   struct pt_regs *regs)
 {
 	struct rt_sigframe __user *frame;
-	int fsize = frame_extra_sizes(regs->format);
+	struct pt_regs *tregs = rte_regs(regs);
+	int fsize = frame_extra_sizes(tregs->format);
 	int err = 0, sig = ksig->sig;
 
 	if (fsize < 0) {
@@ -1019,33 +1022,26 @@ static int setup_rt_frame(struct ksignal *ksig, sigset_t *set,
 	push_cache ((unsigned long) &frame->retcode);
 
 	/*
-	 * Set up registers for signal handler.  All the state we are about
-	 * to destroy is successfully copied to sigframe.
-	 */
-	wrusp ((unsigned long) frame);
-	regs->pc = (unsigned long) ksig->ka.sa.sa_handler;
-	adjustformat(regs);
-
-	/*
 	 * This is subtle; if we build more than one sigframe, all but the
 	 * first one will see frame format 0 and have fsize == 0, so we won't
 	 * screw stkadj.
 	 */
-	if (fsize)
+	if (fsize) {
 		regs->stkadj = fsize;
-
-	/* Prepare to skip over the extra stuff in the exception frame.  */
-	if (regs->stkadj) {
-		struct pt_regs *tregs =
-			(struct pt_regs *)((ulong)regs + regs->stkadj);
+		tregs = rte_regs(regs);
 		pr_debug("Performing stackadjust=%04lx\n", regs->stkadj);
-		/* This must be copied with decreasing addresses to
-                   handle overlaps.  */
 		tregs->vector = 0;
 		tregs->format = 0;
-		tregs->pc = regs->pc;
 		tregs->sr = regs->sr;
 	}
+
+	/*
+	 * Set up registers for signal handler.  All the state we are about
+	 * to destroy is successfully copied to sigframe.
+	 */
+	wrusp ((unsigned long) frame);
+	tregs->pc = (unsigned long) ksig->ka.sa.sa_handler;
+	adjustformat(regs);
 	return 0;
 }
 
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/3] m68k: update ->thread.esp0 before calling syscall_trace() in ret_from_signal
  2021-07-25 17:18 [RFC][CFT] signal handling fixes Al Viro
  2021-07-25 17:19 ` [PATCH 1/3] m68k: handle arrivals of multiple signals correctly Al Viro
@ 2021-07-25 17:19 ` Al Viro
  2021-09-15 22:19   ` Michael Schmitz
  2021-07-25 17:20 ` [PATCH 3/3] m68k: leave stack mangling to asm wrapper of sigreturn() Al Viro
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Al Viro @ 2021-07-25 17:19 UTC (permalink / raw)
  To: linux-m68k; +Cc: Geert Uytterhoeven, Greg Ungerer, linux-kernel

We get there when sigreturn has performed obscene acts on kernel stack;
in particular, the location of pt_regs has shifted.  We are about to call
syscall_trace(), which might stop for tracer.  If that happens, we'd better
have task_pt_regs() returning correct result...

Fucked-up-by: Al Viro <viro@zeniv.linux.org.uk>
Fixes: bd6f56a75bb2 ("m68k: Missing syscall_trace() on sigreturn")
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/m68k/kernel/entry.S | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/m68k/kernel/entry.S b/arch/m68k/kernel/entry.S
index 9dd76fbb7c6b..ff9e842cec0f 100644
--- a/arch/m68k/kernel/entry.S
+++ b/arch/m68k/kernel/entry.S
@@ -186,6 +186,8 @@ ENTRY(ret_from_signal)
 	movel	%curptr@(TASK_STACK),%a1
 	tstb	%a1@(TINFO_FLAGS+2)
 	jge	1f
+	lea	%sp@(SWITCH_STACK_SIZE),%a1
+	movel	%a1,%curptr@(TASK_THREAD+THREAD_ESP0)
 	jbsr	syscall_trace
 1:	RESTORE_SWITCH_STACK
 	addql	#4,%sp
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/3] m68k: leave stack mangling to asm wrapper of sigreturn()
  2021-07-25 17:18 [RFC][CFT] signal handling fixes Al Viro
  2021-07-25 17:19 ` [PATCH 1/3] m68k: handle arrivals of multiple signals correctly Al Viro
  2021-07-25 17:19 ` [PATCH 2/3] m68k: update ->thread.esp0 before calling syscall_trace() in ret_from_signal Al Viro
@ 2021-07-25 17:20 ` Al Viro
  2021-09-15 23:35   ` Michael Schmitz
  2021-07-27 10:21 ` [RFC][CFT] signal handling fixes Finn Thain
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Al Viro @ 2021-07-25 17:20 UTC (permalink / raw)
  To: linux-m68k; +Cc: Geert Uytterhoeven, Greg Ungerer, linux-kernel

sigreturn has to deal with an unpleasant problem - exception stack frames
have different sizes, depending upon the exception (and processor model, as
well) and variable-sized part of exception frame may contain information
needed for instruction restart.  So when signal handler terminates and calls
sigreturn to resume the execution at the place where we'd been when we caught
the signal, it has to rearrange the frame at the bottom of kernel stack.
Worse, it might need to open a gap in the kernel stack, shifting pt_regs
towards lower addresses.

Doing that from C is insane - we'd need to shift stack frames (return addresses,
local variables, etc.) of C call chain, right under the nose of compiler and
hope it won't fall apart horribly.  What had been actually done is only slightly
less insane - an inline asm in mangle_kernel_stack() moved the stuff around,
then reset stack pointer and jumped to label in asm glue.

However, we can avoid all that mess if the asm wrapper we have to use anyway
would reserve some space on the stack between switch_stack and the C stack
frame of do_{rt_,}sigreturn().   Then C part can simply memmove() pt_regs +
switch_stack, memcpy() the variable part of exception frame into the opened
gap - all of that without inline asm, buggering C call chain, magical jumps
to asm labels, etc.

Asm wrapper would need to know where the moved switch_stack has ended up -
it might have been shifted into the gap we'd reserved before do_rt_sigreturn()
call.  That's where it needs to set the stack pointer to.  So let the C part
return just that and be done with that.

While we are at it, the call of berr_040cleanup() we need to do when
returning via 68040 bus error exception frame can be moved into C part
as well.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/m68k/68000/entry.S       |   3 --
 arch/m68k/coldfire/entry.S    |   3 --
 arch/m68k/include/asm/traps.h |   4 ++
 arch/m68k/kernel/entry.S      |  55 ++++++++++-----------
 arch/m68k/kernel/signal.c     | 111 ++++++++++++++++--------------------------
 5 files changed, 71 insertions(+), 105 deletions(-)

diff --git a/arch/m68k/68000/entry.S b/arch/m68k/68000/entry.S
index 259b3661b614..cce465e850fe 100644
--- a/arch/m68k/68000/entry.S
+++ b/arch/m68k/68000/entry.S
@@ -25,7 +25,6 @@
 .globl system_call
 .globl resume
 .globl ret_from_exception
-.globl ret_from_signal
 .globl sys_call_table
 .globl bad_interrupt
 .globl inthandler1
@@ -59,8 +58,6 @@ do_trace:
 	subql	#4,%sp			/* dummy return address */
 	SAVE_SWITCH_STACK
 	jbsr	syscall_trace_leave
-
-ret_from_signal:
 	RESTORE_SWITCH_STACK
 	addql	#4,%sp
 	jra	ret_from_exception
diff --git a/arch/m68k/coldfire/entry.S b/arch/m68k/coldfire/entry.S
index d43a02795a4a..68adb7b5b296 100644
--- a/arch/m68k/coldfire/entry.S
+++ b/arch/m68k/coldfire/entry.S
@@ -51,7 +51,6 @@ sw_usp:
 .globl system_call
 .globl resume
 .globl ret_from_exception
-.globl ret_from_signal
 .globl sys_call_table
 .globl inthandler
 
@@ -98,8 +97,6 @@ ENTRY(system_call)
 	subql	#4,%sp			/* dummy return address */
 	SAVE_SWITCH_STACK
 	jbsr	syscall_trace_leave
-
-ret_from_signal:
 	RESTORE_SWITCH_STACK
 	addql	#4,%sp
 
diff --git a/arch/m68k/include/asm/traps.h b/arch/m68k/include/asm/traps.h
index 4aff3358fbaf..a9d5c1c870d3 100644
--- a/arch/m68k/include/asm/traps.h
+++ b/arch/m68k/include/asm/traps.h
@@ -267,6 +267,10 @@ struct frame {
     } un;
 };
 
+#ifdef CONFIG_M68040
+asmlinkage void berr_040cleanup(struct frame *fp);
+#endif
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _M68K_TRAPS_H */
diff --git a/arch/m68k/kernel/entry.S b/arch/m68k/kernel/entry.S
index ff9e842cec0f..8fa9822b5922 100644
--- a/arch/m68k/kernel/entry.S
+++ b/arch/m68k/kernel/entry.S
@@ -78,20 +78,38 @@ ENTRY(__sys_clone3)
 
 ENTRY(sys_sigreturn)
 	SAVE_SWITCH_STACK
-	movel	%sp,%sp@-		  | switch_stack pointer
-	pea	%sp@(SWITCH_STACK_SIZE+4) | pt_regs pointer
+	movel	%sp,%a1			  	| switch_stack pointer
+	lea	%sp@(SWITCH_STACK_SIZE),%a0	| pt_regs pointer
+	lea     %sp@(-84),%sp			| leave a gap
+	movel	%a1,%sp@-
+	movel	%a0,%sp@-
 	jbsr	do_sigreturn
-	addql	#8,%sp
-	RESTORE_SWITCH_STACK
-	rts
+	jra	1f				| shared with rt_sigreturn()
 
 ENTRY(sys_rt_sigreturn)
 	SAVE_SWITCH_STACK
-	movel	%sp,%sp@-		  | switch_stack pointer
-	pea	%sp@(SWITCH_STACK_SIZE+4) | pt_regs pointer
+	movel	%sp,%a1			  	| switch_stack pointer
+	lea	%sp@(SWITCH_STACK_SIZE),%a0	| pt_regs pointer
+	lea     %sp@(-84),%sp			| leave a gap
+	movel	%a1,%sp@-
+	movel	%a0,%sp@-
+	| stack contents:
+	|   [original pt_regs address] [original switch_stack address]
+	|   [gap] [switch_stack] [pt_regs] [exception frame]
 	jbsr	do_rt_sigreturn
-	addql	#8,%sp
+
+1:
+	| stack contents now:
+	|   [original pt_regs address] [original switch_stack address]
+	|   [unused part of the gap] [moved switch_stack] [moved pt_regs]
+	|   [replacement exception frame]
+	| return value of do_{rt_,}sigreturn() points to moved switch_stack.
+
+	movel	%d0,%sp				| discard the leftover junk
 	RESTORE_SWITCH_STACK
+	| stack contents now is just [syscall return address] [pt_regs] [frame]
+	| return pt_regs.d0
+	movel	%sp@(PT_OFF_D0+4),%d0
 	rts
 
 ENTRY(buserr)
@@ -182,27 +200,6 @@ do_trace_exit:
 	addql	#4,%sp
 	jra	.Lret_from_exception
 
-ENTRY(ret_from_signal)
-	movel	%curptr@(TASK_STACK),%a1
-	tstb	%a1@(TINFO_FLAGS+2)
-	jge	1f
-	lea	%sp@(SWITCH_STACK_SIZE),%a1
-	movel	%a1,%curptr@(TASK_THREAD+THREAD_ESP0)
-	jbsr	syscall_trace
-1:	RESTORE_SWITCH_STACK
-	addql	#4,%sp
-/* on 68040 complete pending writebacks if any */
-#ifdef CONFIG_M68040
-	bfextu	%sp@(PT_OFF_FORMATVEC){#0,#4},%d0
-	subql	#7,%d0				| bus error frame ?
-	jbne	1f
-	movel	%sp,%sp@-
-	jbsr	berr_040cleanup
-	addql	#4,%sp
-1:
-#endif
-	jra	.Lret_from_exception
-
 ENTRY(system_call)
 	SAVE_ALL_SYS
 
diff --git a/arch/m68k/kernel/signal.c b/arch/m68k/kernel/signal.c
index cd11eb101eac..338817d0cb3f 100644
--- a/arch/m68k/kernel/signal.c
+++ b/arch/m68k/kernel/signal.c
@@ -641,56 +641,35 @@ static inline void siginfo_build_tests(void)
 static int mangle_kernel_stack(struct pt_regs *regs, int formatvec,
 			       void __user *fp)
 {
-	int fsize = frame_extra_sizes(formatvec >> 12);
-	if (fsize < 0) {
+	int extra = frame_extra_sizes(formatvec >> 12);
+	char buf[sizeof_field(struct frame, un)];
+
+	if (extra < 0) {
 		/*
 		 * user process trying to return with weird frame format
 		 */
 		pr_debug("user process returning with weird frame format\n");
-		return 1;
+		return -1;
 	}
-	if (!fsize) {
-		regs->format = formatvec >> 12;
-		regs->vector = formatvec & 0xfff;
-	} else {
-		struct switch_stack *sw = (struct switch_stack *)regs - 1;
-		/* yes, twice as much as max(sizeof(frame.un.fmt<x>)) */
-		unsigned long buf[sizeof_field(struct frame, un) / 2];
-
-		/* that'll make sure that expansion won't crap over data */
-		if (copy_from_user(buf + fsize / 4, fp, fsize))
-			return 1;
-
-		/* point of no return */
-		regs->format = formatvec >> 12;
-		regs->vector = formatvec & 0xfff;
-#define frame_offset (sizeof(struct pt_regs)+sizeof(struct switch_stack))
-		__asm__ __volatile__ (
-#ifdef CONFIG_COLDFIRE
-			 "   movel %0,%/sp\n\t"
-			 "   bra ret_from_signal\n"
-#else
-			 "   movel %0,%/a0\n\t"
-			 "   subl %1,%/a0\n\t"     /* make room on stack */
-			 "   movel %/a0,%/sp\n\t"  /* set stack pointer */
-			 /* move switch_stack and pt_regs */
-			 "1: movel %0@+,%/a0@+\n\t"
-			 "   dbra %2,1b\n\t"
-			 "   lea %/sp@(%c3),%/a0\n\t" /* add offset of fmt */
-			 "   lsrl  #2,%1\n\t"
-			 "   subql #1,%1\n\t"
-			 /* copy to the gap we'd made */
-			 "2: movel %4@+,%/a0@+\n\t"
-			 "   dbra %1,2b\n\t"
-			 "   bral ret_from_signal\n"
+	if (extra && copy_from_user(buf, fp, extra))
+		return -1;
+	regs->format = formatvec >> 12;
+	regs->vector = formatvec & 0xfff;
+	if (extra) {
+		void *p = (struct switch_stack *)regs - 1;
+		struct frame *new = (void *)regs - extra;
+		int size = sizeof(struct pt_regs)+sizeof(struct switch_stack);
+
+		memmove(p - extra, p, size);
+		memcpy(p - extra + size, buf, extra);
+		current->thread.esp0 = (unsigned long)&new->ptregs;
+#ifdef CONFIG_M68040
+		/* on 68040 complete pending writebacks if any */
+		if (new->ptregs.format == 7) // bus error frame
+			berr_040cleanup(new);
 #endif
-			 : /* no outputs, it doesn't ever return */
-			 : "a" (sw), "d" (fsize), "d" (frame_offset/4-1),
-			   "n" (frame_offset), "a" (buf + fsize/4)
-			 : "a0");
-#undef frame_offset
 	}
-	return 0;
+	return extra;
 }
 
 static inline int
@@ -698,7 +677,6 @@ restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *usc, void __u
 {
 	int formatvec;
 	struct sigcontext context;
-	int err = 0;
 
 	siginfo_build_tests();
 
@@ -707,7 +685,7 @@ restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *usc, void __u
 
 	/* get previous context */
 	if (copy_from_user(&context, usc, sizeof(context)))
-		goto badframe;
+		return -1;
 
 	/* restore passed registers */
 	regs->d0 = context.sc_d0;
@@ -720,15 +698,10 @@ restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *usc, void __u
 	wrusp(context.sc_usp);
 	formatvec = context.sc_formatvec;
 
-	err = restore_fpu_state(&context);
-
-	if (err || mangle_kernel_stack(regs, formatvec, fp))
-		goto badframe;
-
-	return 0;
+	if (restore_fpu_state(&context))
+		return -1;
 
-badframe:
-	return 1;
+	return mangle_kernel_stack(regs, formatvec, fp);
 }
 
 static inline int
@@ -745,7 +718,7 @@ rt_restore_ucontext(struct pt_regs *regs, struct switch_stack *sw,
 
 	err = __get_user(temp, &uc->uc_mcontext.version);
 	if (temp != MCONTEXT_VERSION)
-		goto badframe;
+		return -1;
 	/* restore passed registers */
 	err |= __get_user(regs->d0, &gregs[0]);
 	err |= __get_user(regs->d1, &gregs[1]);
@@ -774,22 +747,17 @@ rt_restore_ucontext(struct pt_regs *regs, struct switch_stack *sw,
 	err |= restore_altstack(&uc->uc_stack);
 
 	if (err)
-		goto badframe;
-
-	if (mangle_kernel_stack(regs, temp, &uc->uc_extra))
-		goto badframe;
+		return -1;
 
-	return 0;
-
-badframe:
-	return 1;
+	return mangle_kernel_stack(regs, temp, &uc->uc_extra);
 }
 
-asmlinkage int do_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
+asmlinkage void *do_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
 {
 	unsigned long usp = rdusp();
 	struct sigframe __user *frame = (struct sigframe __user *)(usp - 4);
 	sigset_t set;
+	int size;
 
 	if (!access_ok(frame, sizeof(*frame)))
 		goto badframe;
@@ -801,20 +769,22 @@ asmlinkage int do_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
 
 	set_current_blocked(&set);
 
-	if (restore_sigcontext(regs, &frame->sc, frame + 1))
+	size = restore_sigcontext(regs, &frame->sc, frame + 1);
+	if (size < 0)
 		goto badframe;
-	return regs->d0;
+	return (void *)sw - size;
 
 badframe:
 	force_sig(SIGSEGV);
-	return 0;
+	return sw;
 }
 
-asmlinkage int do_rt_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
+asmlinkage void *do_rt_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
 {
 	unsigned long usp = rdusp();
 	struct rt_sigframe __user *frame = (struct rt_sigframe __user *)(usp - 4);
 	sigset_t set;
+	int size;
 
 	if (!access_ok(frame, sizeof(*frame)))
 		goto badframe;
@@ -823,13 +793,14 @@ asmlinkage int do_rt_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
 
 	set_current_blocked(&set);
 
-	if (rt_restore_ucontext(regs, sw, &frame->uc))
+	size = rt_restore_ucontext(regs, sw, &frame->uc);
+	if (size < 0)
 		goto badframe;
-	return regs->d0;
+	return (void *)sw - size;
 
 badframe:
 	force_sig(SIGSEGV);
-	return 0;
+	return sw;
 }
 
 static inline struct pt_regs *rte_regs(struct pt_regs *regs)
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [RFC][CFT] signal handling fixes
  2021-07-25 17:18 [RFC][CFT] signal handling fixes Al Viro
                   ` (2 preceding siblings ...)
  2021-07-25 17:20 ` [PATCH 3/3] m68k: leave stack mangling to asm wrapper of sigreturn() Al Viro
@ 2021-07-27 10:21 ` Finn Thain
  2021-07-27 14:42   ` Al Viro
  2021-09-16  9:03 ` Finn Thain
  2021-09-23 14:45 ` Geert Uytterhoeven
  5 siblings, 1 reply; 18+ messages in thread
From: Finn Thain @ 2021-07-27 10:21 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-m68k, Geert Uytterhoeven, Greg Ungerer, linux-kernel

On Sun, 25 Jul 2021, Al Viro wrote:

> 
> 	The series is on top of 5.14-rc1; it lives in
> git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git #untested.m68k
> Individual patches in followups...
> 
> 	_Very_ lightly tested on aranym; no real hardware to test it on.
> Any help with review and testing would be very welcome.
> 

I can test this branch on a Motorola 68040 machine I have here. Can you 
advise how to get decent code coverage? Maybe there's a package out there 
with a signal-heavy test suite? Maybe I need a break point in a signal 
handler? Or perhaps just send ^C to a process running under strace?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC][CFT] signal handling fixes
  2021-07-27 10:21 ` [RFC][CFT] signal handling fixes Finn Thain
@ 2021-07-27 14:42   ` Al Viro
  2021-07-28  1:23     ` Finn Thain
  0 siblings, 1 reply; 18+ messages in thread
From: Al Viro @ 2021-07-27 14:42 UTC (permalink / raw)
  To: Finn Thain; +Cc: linux-m68k, Geert Uytterhoeven, Greg Ungerer, linux-kernel

On Tue, Jul 27, 2021 at 08:21:52PM +1000, Finn Thain wrote:
> On Sun, 25 Jul 2021, Al Viro wrote:
> 
> > 
> > 	The series is on top of 5.14-rc1; it lives in
> > git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git #untested.m68k
> > Individual patches in followups...
> > 
> > 	_Very_ lightly tested on aranym; no real hardware to test it on.
> > Any help with review and testing would be very welcome.
> > 
> 
> I can test this branch on a Motorola 68040 machine I have here. Can you 
> advise how to get decent code coverage? Maybe there's a package out there 
> with a signal-heavy test suite? Maybe I need a break point in a signal 
> handler? Or perhaps just send ^C to a process running under strace?

Generally, SIGINT is not the best insertion vector...

Set a handler of e.g. SIGALRM with sigaction(), with a couple of other signals
in sa_mask (e.g. SIGUSR1 and SIGUSR2).  With raise() on those inside the
SIGALRM handler - then they will become deliverable on return from handler.
And have SIGUSR1 and SIGUSR2 handlers print siginfo and ucontext contents
(have them set with SA_SIGINFO in sa_flags, look at the second and third
arguments of sighandler).

Use alarm(2) to arrange for SIGALRM and sit in a tight loop - that'll give you
delivery on return from interrupt.  Alternatively, raise(SIGALRM) will give
you delivery on return from trap.  And making that a SIGBUS handler instead,
mmapping a file, truncating it to 0 and dereferencing something in mmapped
area will give you delivery on return from access error trap.  Division by
zero (and insertion handler on SIGFPE) ought to give you a type 2 exception
stack frame (4 bytes of aux data, that makes shifted exception frame bugger
format and vector fields of the original).

FWIW, the third argument of handler points to
struct ucontext {
        unsigned long     uc_flags;
        struct ucontext  *uc_link;
        stack_t           uc_stack;
        struct mcontext   uc_mcontext;
        unsigned long     uc_filler[80];
        sigset_t          uc_sigmask;   /* mask last for extensibility */
};
and type/vector is stored in uc_filler[54] (216 bytes into the array), with
aux data from exception stack frame starting from uc_filler[55].

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC][CFT] signal handling fixes
  2021-07-27 14:42   ` Al Viro
@ 2021-07-28  1:23     ` Finn Thain
  0 siblings, 0 replies; 18+ messages in thread
From: Finn Thain @ 2021-07-28  1:23 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-m68k, Geert Uytterhoeven, Greg Ungerer, linux-kernel

On Tue, 27 Jul 2021, Al Viro wrote:

> On Tue, Jul 27, 2021 at 08:21:52PM +1000, Finn Thain wrote:
> > On Sun, 25 Jul 2021, Al Viro wrote:
> > 
> > > 
> > > 	The series is on top of 5.14-rc1; it lives in
> > > git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git #untested.m68k
> > > Individual patches in followups...
> > > 
> > > 	_Very_ lightly tested on aranym; no real hardware to test it on.
> > > Any help with review and testing would be very welcome.
> > > 
> > 
> > I can test this branch on a Motorola 68040 machine I have here. Can you 
> > advise how to get decent code coverage? Maybe there's a package out there 
> > with a signal-heavy test suite? Maybe I need a break point in a signal 
> > handler? Or perhaps just send ^C to a process running under strace?
> 
> Generally, SIGINT is not the best insertion vector...
> 

True. I see that 'man 7 signal' says that SIGQUIT will produce a coredump. 
Would that contain anything of interest?

> Set a handler of e.g. SIGALRM with sigaction(), with a couple of other signals
> in sa_mask (e.g. SIGUSR1 and SIGUSR2).  With raise() on those inside the
> SIGALRM handler - then they will become deliverable on return from handler.
> And have SIGUSR1 and SIGUSR2 handlers print siginfo and ucontext contents
> (have them set with SA_SIGINFO in sa_flags, look at the second and third
> arguments of sighandler).
> 
> Use alarm(2) to arrange for SIGALRM and sit in a tight loop - that'll give you
> delivery on return from interrupt.  Alternatively, raise(SIGALRM) will give
> you delivery on return from trap.  And making that a SIGBUS handler instead,
> mmapping a file, truncating it to 0 and dereferencing something in mmapped
> area will give you delivery on return from access error trap.  Division by
> zero (and insertion handler on SIGFPE) ought to give you a type 2 exception
> stack frame (4 bytes of aux data, that makes shifted exception frame bugger
> format and vector fields of the original).
> 
> FWIW, the third argument of handler points to
> struct ucontext {
>         unsigned long     uc_flags;
>         struct ucontext  *uc_link;
>         stack_t           uc_stack;
>         struct mcontext   uc_mcontext;
>         unsigned long     uc_filler[80];
>         sigset_t          uc_sigmask;   /* mask last for extensibility */
> };
> and type/vector is stored in uc_filler[54] (216 bytes into the array), with
> aux data from exception stack frame starting from uc_filler[55].
> 

OK, give me a week or so and I'll see what I can come up with. 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/3] m68k: handle arrivals of multiple signals correctly
  2021-07-25 17:19 ` [PATCH 1/3] m68k: handle arrivals of multiple signals correctly Al Viro
@ 2021-09-15 22:08   ` Michael Schmitz
  0 siblings, 0 replies; 18+ messages in thread
From: Michael Schmitz @ 2021-09-15 22:08 UTC (permalink / raw)
  To: Al Viro, linux-m68k; +Cc: Geert Uytterhoeven, Greg Ungerer, linux-kernel

Hi Al,

On 26/07/21 5:19 am, Al Viro wrote:
> When we have several pending signals, have entered with the kernel
> with large exception frame *and* have already built at least one
> sigframe, regs->stkadj is going to be non-zero and regs->format/sr/pc
> are going to be junk - the real values are in shifted exception stack
> frame we'd built when putting together the first sigframe.
>
> If that happens, subsequent sigframes are going to be garbage.
> Not hard to fix - just need to find the "adjusted" frame first
> and look for format/vector/sr/pc in it.
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Looks good to me. What's more, it fixes a number of long standing issues 
dating back to the 4.10 ages - see discussions at:

https://lore.kernel.org/r/7517d306-21ad-daa1-a2fb-b273211cb588@gmail.com

https://lore.kernel.org/r/ed2ca322-b957-cd52-8d2f-a8edd2785625@linux-m68k.org

- so should be applied to -stable IMO.

Tested by me on 68030 - Finn Thain did some testing on 68040 and might 
add his own tag.

Tested-by: Michael Schmitz <schmitzmic@gmail.com>

Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>

> ---
>   arch/m68k/kernel/signal.c | 88 ++++++++++++++++++++++-------------------------
>   1 file changed, 42 insertions(+), 46 deletions(-)
>
> diff --git a/arch/m68k/kernel/signal.c b/arch/m68k/kernel/signal.c
> index 8f215e79e70e..cd11eb101eac 100644
> --- a/arch/m68k/kernel/signal.c
> +++ b/arch/m68k/kernel/signal.c
> @@ -447,7 +447,7 @@ static inline void save_fpu_state(struct sigcontext *sc, struct pt_regs *regs)
>   
>   	if (CPU_IS_060 ? sc->sc_fpstate[2] : sc->sc_fpstate[0]) {
>   		fpu_version = sc->sc_fpstate[0];
> -		if (CPU_IS_020_OR_030 &&
> +		if (CPU_IS_020_OR_030 && !regs->stkadj &&
>   		    regs->vector >= (VEC_FPBRUC * 4) &&
>   		    regs->vector <= (VEC_FPNAN * 4)) {
>   			/* Clear pending exception in 68882 idle frame */
> @@ -510,7 +510,7 @@ static inline int rt_save_fpu_state(struct ucontext __user *uc, struct pt_regs *
>   		if (!(CPU_IS_060 || CPU_IS_COLDFIRE))
>   			context_size = fpstate[1];
>   		fpu_version = fpstate[0];
> -		if (CPU_IS_020_OR_030 &&
> +		if (CPU_IS_020_OR_030 && !regs->stkadj &&
>   		    regs->vector >= (VEC_FPBRUC * 4) &&
>   		    regs->vector <= (VEC_FPNAN * 4)) {
>   			/* Clear pending exception in 68882 idle frame */
> @@ -832,18 +832,24 @@ asmlinkage int do_rt_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
>   	return 0;
>   }
>   
> +static inline struct pt_regs *rte_regs(struct pt_regs *regs)
> +{
> +	return (void *)regs + regs->stkadj;
> +}
> +
>   static void setup_sigcontext(struct sigcontext *sc, struct pt_regs *regs,
>   			     unsigned long mask)
>   {
> +	struct pt_regs *tregs = rte_regs(regs);
>   	sc->sc_mask = mask;
>   	sc->sc_usp = rdusp();
>   	sc->sc_d0 = regs->d0;
>   	sc->sc_d1 = regs->d1;
>   	sc->sc_a0 = regs->a0;
>   	sc->sc_a1 = regs->a1;
> -	sc->sc_sr = regs->sr;
> -	sc->sc_pc = regs->pc;
> -	sc->sc_formatvec = regs->format << 12 | regs->vector;
> +	sc->sc_sr = tregs->sr;
> +	sc->sc_pc = tregs->pc;
> +	sc->sc_formatvec = tregs->format << 12 | tregs->vector;
>   	save_a5_state(sc, regs);
>   	save_fpu_state(sc, regs);
>   }
> @@ -851,6 +857,7 @@ static void setup_sigcontext(struct sigcontext *sc, struct pt_regs *regs,
>   static inline int rt_setup_ucontext(struct ucontext __user *uc, struct pt_regs *regs)
>   {
>   	struct switch_stack *sw = (struct switch_stack *)regs - 1;
> +	struct pt_regs *tregs = rte_regs(regs);
>   	greg_t __user *gregs = uc->uc_mcontext.gregs;
>   	int err = 0;
>   
> @@ -871,9 +878,9 @@ static inline int rt_setup_ucontext(struct ucontext __user *uc, struct pt_regs *
>   	err |= __put_user(sw->a5, &gregs[13]);
>   	err |= __put_user(sw->a6, &gregs[14]);
>   	err |= __put_user(rdusp(), &gregs[15]);
> -	err |= __put_user(regs->pc, &gregs[16]);
> -	err |= __put_user(regs->sr, &gregs[17]);
> -	err |= __put_user((regs->format << 12) | regs->vector, &uc->uc_formatvec);
> +	err |= __put_user(tregs->pc, &gregs[16]);
> +	err |= __put_user(tregs->sr, &gregs[17]);
> +	err |= __put_user((tregs->format << 12) | tregs->vector, &uc->uc_formatvec);
>   	err |= rt_save_fpu_state(uc, regs);
>   	return err;
>   }
> @@ -890,13 +897,14 @@ static int setup_frame(struct ksignal *ksig, sigset_t *set,
>   			struct pt_regs *regs)
>   {
>   	struct sigframe __user *frame;
> -	int fsize = frame_extra_sizes(regs->format);
> +	struct pt_regs *tregs = rte_regs(regs);
> +	int fsize = frame_extra_sizes(tregs->format);
>   	struct sigcontext context;
>   	int err = 0, sig = ksig->sig;
>   
>   	if (fsize < 0) {
>   		pr_debug("setup_frame: Unknown frame format %#x\n",
> -			 regs->format);
> +			 tregs->format);
>   		return -EFAULT;
>   	}
>   
> @@ -907,7 +915,7 @@ static int setup_frame(struct ksignal *ksig, sigset_t *set,
>   
>   	err |= __put_user(sig, &frame->sig);
>   
> -	err |= __put_user(regs->vector, &frame->code);
> +	err |= __put_user(tregs->vector, &frame->code);
>   	err |= __put_user(&frame->sc, &frame->psc);
>   
>   	if (_NSIG_WORDS > 1)
> @@ -934,33 +942,27 @@ static int setup_frame(struct ksignal *ksig, sigset_t *set,
>   	push_cache ((unsigned long) &frame->retcode);
>   
>   	/*
> -	 * Set up registers for signal handler.  All the state we are about
> -	 * to destroy is successfully copied to sigframe.
> -	 */
> -	wrusp ((unsigned long) frame);
> -	regs->pc = (unsigned long) ksig->ka.sa.sa_handler;
> -	adjustformat(regs);
> -
> -	/*
>   	 * This is subtle; if we build more than one sigframe, all but the
>   	 * first one will see frame format 0 and have fsize == 0, so we won't
>   	 * screw stkadj.
>   	 */
> -	if (fsize)
> +	if (fsize) {
>   		regs->stkadj = fsize;
> -
> -	/* Prepare to skip over the extra stuff in the exception frame.  */
> -	if (regs->stkadj) {
> -		struct pt_regs *tregs =
> -			(struct pt_regs *)((ulong)regs + regs->stkadj);
> +		tregs = rte_regs(regs);
>   		pr_debug("Performing stackadjust=%04lx\n", regs->stkadj);
> -		/* This must be copied with decreasing addresses to
> -                   handle overlaps.  */
>   		tregs->vector = 0;
>   		tregs->format = 0;
> -		tregs->pc = regs->pc;
>   		tregs->sr = regs->sr;
>   	}
> +
> +	/*
> +	 * Set up registers for signal handler.  All the state we are about
> +	 * to destroy is successfully copied to sigframe.
> +	 */
> +	wrusp ((unsigned long) frame);
> +	tregs->pc = (unsigned long) ksig->ka.sa.sa_handler;
> +	adjustformat(regs);
> +
>   	return 0;
>   }
>   
> @@ -968,7 +970,8 @@ static int setup_rt_frame(struct ksignal *ksig, sigset_t *set,
>   			   struct pt_regs *regs)
>   {
>   	struct rt_sigframe __user *frame;
> -	int fsize = frame_extra_sizes(regs->format);
> +	struct pt_regs *tregs = rte_regs(regs);
> +	int fsize = frame_extra_sizes(tregs->format);
>   	int err = 0, sig = ksig->sig;
>   
>   	if (fsize < 0) {
> @@ -1019,33 +1022,26 @@ static int setup_rt_frame(struct ksignal *ksig, sigset_t *set,
>   	push_cache ((unsigned long) &frame->retcode);
>   
>   	/*
> -	 * Set up registers for signal handler.  All the state we are about
> -	 * to destroy is successfully copied to sigframe.
> -	 */
> -	wrusp ((unsigned long) frame);
> -	regs->pc = (unsigned long) ksig->ka.sa.sa_handler;
> -	adjustformat(regs);
> -
> -	/*
>   	 * This is subtle; if we build more than one sigframe, all but the
>   	 * first one will see frame format 0 and have fsize == 0, so we won't
>   	 * screw stkadj.
>   	 */
> -	if (fsize)
> +	if (fsize) {
>   		regs->stkadj = fsize;
> -
> -	/* Prepare to skip over the extra stuff in the exception frame.  */
> -	if (regs->stkadj) {
> -		struct pt_regs *tregs =
> -			(struct pt_regs *)((ulong)regs + regs->stkadj);
> +		tregs = rte_regs(regs);
>   		pr_debug("Performing stackadjust=%04lx\n", regs->stkadj);
> -		/* This must be copied with decreasing addresses to
> -                   handle overlaps.  */
>   		tregs->vector = 0;
>   		tregs->format = 0;
> -		tregs->pc = regs->pc;
>   		tregs->sr = regs->sr;
>   	}
> +
> +	/*
> +	 * Set up registers for signal handler.  All the state we are about
> +	 * to destroy is successfully copied to sigframe.
> +	 */
> +	wrusp ((unsigned long) frame);
> +	tregs->pc = (unsigned long) ksig->ka.sa.sa_handler;
> +	adjustformat(regs);
>   	return 0;
>   }
>   

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/3] m68k: update ->thread.esp0 before calling syscall_trace() in ret_from_signal
  2021-07-25 17:19 ` [PATCH 2/3] m68k: update ->thread.esp0 before calling syscall_trace() in ret_from_signal Al Viro
@ 2021-09-15 22:19   ` Michael Schmitz
  0 siblings, 0 replies; 18+ messages in thread
From: Michael Schmitz @ 2021-09-15 22:19 UTC (permalink / raw)
  To: Al Viro, linux-m68k; +Cc: Geert Uytterhoeven, Greg Ungerer, linux-kernel

Hi Al,

On 26/07/21 5:19 am, Al Viro wrote:
> We get there when sigreturn has performed obscene acts on kernel stack;
> in particular, the location of pt_regs has shifted.  We are about to call
> syscall_trace(), which might stop for tracer.  If that happens, we'd better
> have task_pt_regs() returning correct result...
>
> Fucked-up-by: Al Viro <viro@zeniv.linux.org.uk>
> Fixes: bd6f56a75bb2 ("m68k: Missing syscall_trace() on sigreturn")
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Looking good also, and should go to -stable.

Tested-by: Michael Schmitz <schmitzmic@gmail.com>

Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>

> ---
>   arch/m68k/kernel/entry.S | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/arch/m68k/kernel/entry.S b/arch/m68k/kernel/entry.S
> index 9dd76fbb7c6b..ff9e842cec0f 100644
> --- a/arch/m68k/kernel/entry.S
> +++ b/arch/m68k/kernel/entry.S
> @@ -186,6 +186,8 @@ ENTRY(ret_from_signal)
>   	movel	%curptr@(TASK_STACK),%a1
>   	tstb	%a1@(TINFO_FLAGS+2)
>   	jge	1f
> +	lea	%sp@(SWITCH_STACK_SIZE),%a1
> +	movel	%a1,%curptr@(TASK_THREAD+THREAD_ESP0)
>   	jbsr	syscall_trace
>   1:	RESTORE_SWITCH_STACK
>   	addql	#4,%sp

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/3] m68k: leave stack mangling to asm wrapper of sigreturn()
  2021-07-25 17:20 ` [PATCH 3/3] m68k: leave stack mangling to asm wrapper of sigreturn() Al Viro
@ 2021-09-15 23:35   ` Michael Schmitz
  2021-09-16  0:19     ` Al Viro
  0 siblings, 1 reply; 18+ messages in thread
From: Michael Schmitz @ 2021-09-15 23:35 UTC (permalink / raw)
  To: Al Viro, linux-m68k; +Cc: Geert Uytterhoeven, Greg Ungerer, linux-kernel

Hi Al,

On 26/07/21 5:20 am, Al Viro wrote:
> sigreturn has to deal with an unpleasant problem - exception stack frames
> have different sizes, depending upon the exception (and processor model, as
> well) and variable-sized part of exception frame may contain information
> needed for instruction restart.  So when signal handler terminates and calls
> sigreturn to resume the execution at the place where we'd been when we caught
> the signal, it has to rearrange the frame at the bottom of kernel stack.
> Worse, it might need to open a gap in the kernel stack, shifting pt_regs
> towards lower addresses.
>
> Doing that from C is insane - we'd need to shift stack frames (return addresses,
> local variables, etc.) of C call chain, right under the nose of compiler and
> hope it won't fall apart horribly.  What had been actually done is only slightly
> less insane - an inline asm in mangle_kernel_stack() moved the stuff around,
> then reset stack pointer and jumped to label in asm glue.
>
> However, we can avoid all that mess if the asm wrapper we have to use anyway
> would reserve some space on the stack between switch_stack and the C stack
> frame of do_{rt_,}sigreturn().   Then C part can simply memmove() pt_regs +
> switch_stack, memcpy() the variable part of exception frame into the opened
> gap - all of that without inline asm, buggering C call chain, magical jumps
> to asm labels, etc.
>
> Asm wrapper would need to know where the moved switch_stack has ended up -
> it might have been shifted into the gap we'd reserved before do_rt_sigreturn()
> call.  That's where it needs to set the stack pointer to.  So let the C part
> return just that and be done with that.
>
> While we are at it, the call of berr_040cleanup() we need to do when
> returning via 68040 bus error exception frame can be moved into C part
> as well.
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

This one's a little harder - you use a 84 byte gap on each sigreturn, no 
matter what the frame size we need to restore. The original 
mangle_kernel_stack() only makes room on the stack when it has no other 
option (using twice as much size - correct me if I'm wrong).

Ideally, we'd only leave a gap for mangle_kernel_stack() to use if the 
frame size requires us to do so. Working that out in asm glue would be 
sufficiently convoluted as to cancel out the benefits of cleaning up the 
C sigreturn part. Probably not worth it.

Looking at how likely it is that we'll see multiple pending signals, I'd 
say from the frequency of the resulting panic messages in the past four 
years it's quite rare. I've never seen these faults on anything but a 
fully loaded system running some sort of stress test. OTOH, the 
pathological case of exception frame (mid-instruction access fault) is 
expected when we might want to send a segfault signal, which may happen 
anytime.

On balance, I think the extra stack use will occur rare enough and the 
benefit of cleaning up mangle_kernel_stack() outweighs that.

Tested-by: Michael Schmitz <schmitzmic@gmail.com>

Reviewed-by: Michael Schmitz <schmitzmic@gmail.com>



> ---
>   arch/m68k/68000/entry.S       |   3 --
>   arch/m68k/coldfire/entry.S    |   3 --
>   arch/m68k/include/asm/traps.h |   4 ++
>   arch/m68k/kernel/entry.S      |  55 ++++++++++-----------
>   arch/m68k/kernel/signal.c     | 111 ++++++++++++++++--------------------------
>   5 files changed, 71 insertions(+), 105 deletions(-)
>
> diff --git a/arch/m68k/68000/entry.S b/arch/m68k/68000/entry.S
> index 259b3661b614..cce465e850fe 100644
> --- a/arch/m68k/68000/entry.S
> +++ b/arch/m68k/68000/entry.S
> @@ -25,7 +25,6 @@
>   .globl system_call
>   .globl resume
>   .globl ret_from_exception
> -.globl ret_from_signal
>   .globl sys_call_table
>   .globl bad_interrupt
>   .globl inthandler1
> @@ -59,8 +58,6 @@ do_trace:
>   	subql	#4,%sp			/* dummy return address */
>   	SAVE_SWITCH_STACK
>   	jbsr	syscall_trace_leave
> -
> -ret_from_signal:
>   	RESTORE_SWITCH_STACK
>   	addql	#4,%sp
>   	jra	ret_from_exception
> diff --git a/arch/m68k/coldfire/entry.S b/arch/m68k/coldfire/entry.S
> index d43a02795a4a..68adb7b5b296 100644
> --- a/arch/m68k/coldfire/entry.S
> +++ b/arch/m68k/coldfire/entry.S
> @@ -51,7 +51,6 @@ sw_usp:
>   .globl system_call
>   .globl resume
>   .globl ret_from_exception
> -.globl ret_from_signal
>   .globl sys_call_table
>   .globl inthandler
>   
> @@ -98,8 +97,6 @@ ENTRY(system_call)
>   	subql	#4,%sp			/* dummy return address */
>   	SAVE_SWITCH_STACK
>   	jbsr	syscall_trace_leave
> -
> -ret_from_signal:
>   	RESTORE_SWITCH_STACK
>   	addql	#4,%sp
>   
> diff --git a/arch/m68k/include/asm/traps.h b/arch/m68k/include/asm/traps.h
> index 4aff3358fbaf..a9d5c1c870d3 100644
> --- a/arch/m68k/include/asm/traps.h
> +++ b/arch/m68k/include/asm/traps.h
> @@ -267,6 +267,10 @@ struct frame {
>       } un;
>   };
>   
> +#ifdef CONFIG_M68040
> +asmlinkage void berr_040cleanup(struct frame *fp);
> +#endif
> +
>   #endif /* __ASSEMBLY__ */
>   
>   #endif /* _M68K_TRAPS_H */
> diff --git a/arch/m68k/kernel/entry.S b/arch/m68k/kernel/entry.S
> index ff9e842cec0f..8fa9822b5922 100644
> --- a/arch/m68k/kernel/entry.S
> +++ b/arch/m68k/kernel/entry.S
> @@ -78,20 +78,38 @@ ENTRY(__sys_clone3)
>   
>   ENTRY(sys_sigreturn)
>   	SAVE_SWITCH_STACK
> -	movel	%sp,%sp@-		  | switch_stack pointer
> -	pea	%sp@(SWITCH_STACK_SIZE+4) | pt_regs pointer
> +	movel	%sp,%a1			  	| switch_stack pointer
> +	lea	%sp@(SWITCH_STACK_SIZE),%a0	| pt_regs pointer
> +	lea     %sp@(-84),%sp			| leave a gap
> +	movel	%a1,%sp@-
> +	movel	%a0,%sp@-
>   	jbsr	do_sigreturn
> -	addql	#8,%sp
> -	RESTORE_SWITCH_STACK
> -	rts
> +	jra	1f				| shared with rt_sigreturn()
>   
>   ENTRY(sys_rt_sigreturn)
>   	SAVE_SWITCH_STACK
> -	movel	%sp,%sp@-		  | switch_stack pointer
> -	pea	%sp@(SWITCH_STACK_SIZE+4) | pt_regs pointer
> +	movel	%sp,%a1			  	| switch_stack pointer
> +	lea	%sp@(SWITCH_STACK_SIZE),%a0	| pt_regs pointer
> +	lea     %sp@(-84),%sp			| leave a gap
> +	movel	%a1,%sp@-
> +	movel	%a0,%sp@-
> +	| stack contents:
> +	|   [original pt_regs address] [original switch_stack address]
> +	|   [gap] [switch_stack] [pt_regs] [exception frame]
>   	jbsr	do_rt_sigreturn
> -	addql	#8,%sp
> +
> +1:
> +	| stack contents now:
> +	|   [original pt_regs address] [original switch_stack address]
> +	|   [unused part of the gap] [moved switch_stack] [moved pt_regs]
> +	|   [replacement exception frame]
> +	| return value of do_{rt_,}sigreturn() points to moved switch_stack.
> +
> +	movel	%d0,%sp				| discard the leftover junk
>   	RESTORE_SWITCH_STACK
> +	| stack contents now is just [syscall return address] [pt_regs] [frame]
> +	| return pt_regs.d0
> +	movel	%sp@(PT_OFF_D0+4),%d0
>   	rts
>   
>   ENTRY(buserr)
> @@ -182,27 +200,6 @@ do_trace_exit:
>   	addql	#4,%sp
>   	jra	.Lret_from_exception
>   
> -ENTRY(ret_from_signal)
> -	movel	%curptr@(TASK_STACK),%a1
> -	tstb	%a1@(TINFO_FLAGS+2)
> -	jge	1f
> -	lea	%sp@(SWITCH_STACK_SIZE),%a1
> -	movel	%a1,%curptr@(TASK_THREAD+THREAD_ESP0)
> -	jbsr	syscall_trace
> -1:	RESTORE_SWITCH_STACK
> -	addql	#4,%sp
> -/* on 68040 complete pending writebacks if any */
> -#ifdef CONFIG_M68040
> -	bfextu	%sp@(PT_OFF_FORMATVEC){#0,#4},%d0
> -	subql	#7,%d0				| bus error frame ?
> -	jbne	1f
> -	movel	%sp,%sp@-
> -	jbsr	berr_040cleanup
> -	addql	#4,%sp
> -1:
> -#endif
> -	jra	.Lret_from_exception
> -
>   ENTRY(system_call)
>   	SAVE_ALL_SYS
>   
> diff --git a/arch/m68k/kernel/signal.c b/arch/m68k/kernel/signal.c
> index cd11eb101eac..338817d0cb3f 100644
> --- a/arch/m68k/kernel/signal.c
> +++ b/arch/m68k/kernel/signal.c
> @@ -641,56 +641,35 @@ static inline void siginfo_build_tests(void)
>   static int mangle_kernel_stack(struct pt_regs *regs, int formatvec,
>   			       void __user *fp)
>   {
> -	int fsize = frame_extra_sizes(formatvec >> 12);
> -	if (fsize < 0) {
> +	int extra = frame_extra_sizes(formatvec >> 12);
> +	char buf[sizeof_field(struct frame, un)];
> +
> +	if (extra < 0) {
>   		/*
>   		 * user process trying to return with weird frame format
>   		 */
>   		pr_debug("user process returning with weird frame format\n");
> -		return 1;
> +		return -1;
>   	}
> -	if (!fsize) {
> -		regs->format = formatvec >> 12;
> -		regs->vector = formatvec & 0xfff;
> -	} else {
> -		struct switch_stack *sw = (struct switch_stack *)regs - 1;
> -		/* yes, twice as much as max(sizeof(frame.un.fmt<x>)) */
> -		unsigned long buf[sizeof_field(struct frame, un) / 2];
> -
> -		/* that'll make sure that expansion won't crap over data */
> -		if (copy_from_user(buf + fsize / 4, fp, fsize))
> -			return 1;
> -
> -		/* point of no return */
> -		regs->format = formatvec >> 12;
> -		regs->vector = formatvec & 0xfff;
> -#define frame_offset (sizeof(struct pt_regs)+sizeof(struct switch_stack))
> -		__asm__ __volatile__ (
> -#ifdef CONFIG_COLDFIRE
> -			 "   movel %0,%/sp\n\t"
> -			 "   bra ret_from_signal\n"
> -#else
> -			 "   movel %0,%/a0\n\t"
> -			 "   subl %1,%/a0\n\t"     /* make room on stack */
> -			 "   movel %/a0,%/sp\n\t"  /* set stack pointer */
> -			 /* move switch_stack and pt_regs */
> -			 "1: movel %0@+,%/a0@+\n\t"
> -			 "   dbra %2,1b\n\t"
> -			 "   lea %/sp@(%c3),%/a0\n\t" /* add offset of fmt */
> -			 "   lsrl  #2,%1\n\t"
> -			 "   subql #1,%1\n\t"
> -			 /* copy to the gap we'd made */
> -			 "2: movel %4@+,%/a0@+\n\t"
> -			 "   dbra %1,2b\n\t"
> -			 "   bral ret_from_signal\n"
> +	if (extra && copy_from_user(buf, fp, extra))
> +		return -1;
> +	regs->format = formatvec >> 12;
> +	regs->vector = formatvec & 0xfff;
> +	if (extra) {
> +		void *p = (struct switch_stack *)regs - 1;
> +		struct frame *new = (void *)regs - extra;
> +		int size = sizeof(struct pt_regs)+sizeof(struct switch_stack);
> +
> +		memmove(p - extra, p, size);
> +		memcpy(p - extra + size, buf, extra);
> +		current->thread.esp0 = (unsigned long)&new->ptregs;
> +#ifdef CONFIG_M68040
> +		/* on 68040 complete pending writebacks if any */
> +		if (new->ptregs.format == 7) // bus error frame
> +			berr_040cleanup(new);
>   #endif
> -			 : /* no outputs, it doesn't ever return */
> -			 : "a" (sw), "d" (fsize), "d" (frame_offset/4-1),
> -			   "n" (frame_offset), "a" (buf + fsize/4)
> -			 : "a0");
> -#undef frame_offset
>   	}
> -	return 0;
> +	return extra;
>   }
>   
>   static inline int
> @@ -698,7 +677,6 @@ restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *usc, void __u
>   {
>   	int formatvec;
>   	struct sigcontext context;
> -	int err = 0;
>   
>   	siginfo_build_tests();
>   
> @@ -707,7 +685,7 @@ restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *usc, void __u
>   
>   	/* get previous context */
>   	if (copy_from_user(&context, usc, sizeof(context)))
> -		goto badframe;
> +		return -1;
>   
>   	/* restore passed registers */
>   	regs->d0 = context.sc_d0;
> @@ -720,15 +698,10 @@ restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *usc, void __u
>   	wrusp(context.sc_usp);
>   	formatvec = context.sc_formatvec;
>   
> -	err = restore_fpu_state(&context);
> -
> -	if (err || mangle_kernel_stack(regs, formatvec, fp))
> -		goto badframe;
> -
> -	return 0;
> +	if (restore_fpu_state(&context))
> +		return -1;
>   
> -badframe:
> -	return 1;
> +	return mangle_kernel_stack(regs, formatvec, fp);
>   }
>   
>   static inline int
> @@ -745,7 +718,7 @@ rt_restore_ucontext(struct pt_regs *regs, struct switch_stack *sw,
>   
>   	err = __get_user(temp, &uc->uc_mcontext.version);
>   	if (temp != MCONTEXT_VERSION)
> -		goto badframe;
> +		return -1;
>   	/* restore passed registers */
>   	err |= __get_user(regs->d0, &gregs[0]);
>   	err |= __get_user(regs->d1, &gregs[1]);
> @@ -774,22 +747,17 @@ rt_restore_ucontext(struct pt_regs *regs, struct switch_stack *sw,
>   	err |= restore_altstack(&uc->uc_stack);
>   
>   	if (err)
> -		goto badframe;
> -
> -	if (mangle_kernel_stack(regs, temp, &uc->uc_extra))
> -		goto badframe;
> +		return -1;
>   
> -	return 0;
> -
> -badframe:
> -	return 1;
> +	return mangle_kernel_stack(regs, temp, &uc->uc_extra);
>   }
>   
> -asmlinkage int do_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
> +asmlinkage void *do_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
>   {
>   	unsigned long usp = rdusp();
>   	struct sigframe __user *frame = (struct sigframe __user *)(usp - 4);
>   	sigset_t set;
> +	int size;
>   
>   	if (!access_ok(frame, sizeof(*frame)))
>   		goto badframe;
> @@ -801,20 +769,22 @@ asmlinkage int do_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
>   
>   	set_current_blocked(&set);
>   
> -	if (restore_sigcontext(regs, &frame->sc, frame + 1))
> +	size = restore_sigcontext(regs, &frame->sc, frame + 1);
> +	if (size < 0)
>   		goto badframe;
> -	return regs->d0;
> +	return (void *)sw - size;
>   
>   badframe:
>   	force_sig(SIGSEGV);
> -	return 0;
> +	return sw;
>   }
>   
> -asmlinkage int do_rt_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
> +asmlinkage void *do_rt_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
>   {
>   	unsigned long usp = rdusp();
>   	struct rt_sigframe __user *frame = (struct rt_sigframe __user *)(usp - 4);
>   	sigset_t set;
> +	int size;
>   
>   	if (!access_ok(frame, sizeof(*frame)))
>   		goto badframe;
> @@ -823,13 +793,14 @@ asmlinkage int do_rt_sigreturn(struct pt_regs *regs, struct switch_stack *sw)
>   
>   	set_current_blocked(&set);
>   
> -	if (rt_restore_ucontext(regs, sw, &frame->uc))
> +	size = rt_restore_ucontext(regs, sw, &frame->uc);
> +	if (size < 0)
>   		goto badframe;
> -	return regs->d0;
> +	return (void *)sw - size;
>   
>   badframe:
>   	force_sig(SIGSEGV);
> -	return 0;
> +	return sw;
>   }
>   
>   static inline struct pt_regs *rte_regs(struct pt_regs *regs)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/3] m68k: leave stack mangling to asm wrapper of sigreturn()
  2021-09-15 23:35   ` Michael Schmitz
@ 2021-09-16  0:19     ` Al Viro
  2021-09-16  0:53       ` Michael Schmitz
  0 siblings, 1 reply; 18+ messages in thread
From: Al Viro @ 2021-09-16  0:19 UTC (permalink / raw)
  To: Michael Schmitz
  Cc: linux-m68k, Geert Uytterhoeven, Greg Ungerer, linux-kernel

On Thu, Sep 16, 2021 at 11:35:05AM +1200, Michael Schmitz wrote:

> This one's a little harder - you use a 84 byte gap on each sigreturn, no
> matter what the frame size we need to restore. The original
> mangle_kernel_stack() only makes room on the stack when it has no other
> option (using twice as much size - correct me if I'm wrong).
> 
> Ideally, we'd only leave a gap for mangle_kernel_stack() to use if the frame
> size requires us to do so. Working that out in asm glue would be
> sufficiently convoluted as to cancel out the benefits of cleaning up the C
> sigreturn part. Probably not worth it.

You'd need to
	* load the frame type from sigcontext (and deal with EFAULT, etc.)
	* make decision based on that
	* pass the type down into sigreturn(), so we wouldn't run into
mismatches.

And all that just to avoid a single "subtract a constant from stack pointer"
insn.  We are on a very shallow kernel stack here - it's a syscall entry,
after all.  And the stack footprint of do_sigreturn() is fairly small - e.g.
stat(2) eats a lot more.

We are not initializing the gap either - it's just reserved on stack; we only
access it if we need to enlarge the stack frame.

IOW, what would be the benefit of trying to avoid unconditional gap there?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/3] m68k: leave stack mangling to asm wrapper of sigreturn()
  2021-09-16  0:19     ` Al Viro
@ 2021-09-16  0:53       ` Michael Schmitz
  2021-09-16  3:21         ` Al Viro
  0 siblings, 1 reply; 18+ messages in thread
From: Michael Schmitz @ 2021-09-16  0:53 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-m68k, Geert Uytterhoeven, Greg Ungerer, linux-kernel

Hi Al,

On 16/09/21 12:19, Al Viro wrote:
> On Thu, Sep 16, 2021 at 11:35:05AM +1200, Michael Schmitz wrote:
>
>> This one's a little harder - you use a 84 byte gap on each sigreturn, no
>> matter what the frame size we need to restore. The original
>> mangle_kernel_stack() only makes room on the stack when it has no other
>> option (using twice as much size - correct me if I'm wrong).
>>
>> Ideally, we'd only leave a gap for mangle_kernel_stack() to use if the frame
>> size requires us to do so. Working that out in asm glue would be
>> sufficiently convoluted as to cancel out the benefits of cleaning up the C
>> sigreturn part. Probably not worth it.
>
> You'd need to
> 	* load the frame type from sigcontext (and deal with EFAULT, etc.)
> 	* make decision based on that
> 	* pass the type down into sigreturn(), so we wouldn't run into
> mismatches.
>
> And all that just to avoid a single "subtract a constant from stack pointer"
> insn.  We are on a very shallow kernel stack here - it's a syscall entry,
> after all.  And the stack footprint of do_sigreturn() is fairly small - e.g.
> stat(2) eats a lot more.

Thanks, that's what I was wondering. Not worth the extra complexity then.

>
> We are not initializing the gap either - it's just reserved on stack; we only
> access it if we need to enlarge the stack frame.
>
> IOW, what would be the benefit of trying to avoid unconditional gap there?

Avoiding a kernel stack overflow - there are comments in the code that 
warn against that, but those may be largely historic...

Cheers,

	Michael

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/3] m68k: leave stack mangling to asm wrapper of sigreturn()
  2021-09-16  0:53       ` Michael Schmitz
@ 2021-09-16  3:21         ` Al Viro
  2021-09-16  5:02           ` Michael Schmitz
  0 siblings, 1 reply; 18+ messages in thread
From: Al Viro @ 2021-09-16  3:21 UTC (permalink / raw)
  To: Michael Schmitz
  Cc: linux-m68k, Geert Uytterhoeven, Greg Ungerer, linux-kernel

On Thu, Sep 16, 2021 at 12:53:53PM +1200, Michael Schmitz wrote:
> > You'd need to
> > 	* load the frame type from sigcontext (and deal with EFAULT, etc.)
> > 	* make decision based on that
> > 	* pass the type down into sigreturn(), so we wouldn't run into
> > mismatches.
> > 
> > And all that just to avoid a single "subtract a constant from stack pointer"
> > insn.  We are on a very shallow kernel stack here - it's a syscall entry,
> > after all.  And the stack footprint of do_sigreturn() is fairly small - e.g.
> > stat(2) eats a lot more.
> 
> Thanks, that's what I was wondering. Not worth the extra complexity then.
> 
> > 
> > We are not initializing the gap either - it's just reserved on stack; we only
> > access it if we need to enlarge the stack frame.
> > 
> > IOW, what would be the benefit of trying to avoid unconditional gap there?
> 
> Avoiding a kernel stack overflow - there are comments in the code that warn
> against that, but those may be largely historic...

This is syscall entry; moreover, it critically relies upon the fixed stack
layout - type 0 exception frame + pt_regs + switch_stack + (now) gap.
Followed by fairly shallow C call chain.  I suspect that the deepest you
can get there is when you get an unmapped page when reading the sigframe
and go into page fault handling, with call chain going into some filesystem's
->readpage().  If it was that close to stack overflow, we'd see them all
the time in e.g. random net ioctl doing copy_from_user() - that's going
to be deeper.  Or in stat(2), for that matter.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/3] m68k: leave stack mangling to asm wrapper of sigreturn()
  2021-09-16  3:21         ` Al Viro
@ 2021-09-16  5:02           ` Michael Schmitz
  2021-09-16 16:14             ` Al Viro
  0 siblings, 1 reply; 18+ messages in thread
From: Michael Schmitz @ 2021-09-16  5:02 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-m68k, Geert Uytterhoeven, Greg Ungerer, linux-kernel

Hi Al,

On 16/09/21 15:21, Al Viro wrote:
> On Thu, Sep 16, 2021 at 12:53:53PM +1200, Michael Schmitz wrote:
>>> IOW, what would be the benefit of trying to avoid unconditional gap there?
>>
>> Avoiding a kernel stack overflow - there are comments in the code that warn
>> against that, but those may be largely historic...
>
> This is syscall entry; moreover, it critically relies upon the fixed stack
> layout - type 0 exception frame + pt_regs + switch_stack + (now) gap.

AFAIR, the concerns in the comments I saw were about interrupts - come 
to think of it, back in the early days, we used to have 'fast' and 
'slow' interrupt handlers, with much of the heavy lifting done in the 
handler, and slow interrupts allowed to lower the IPL. Probably no 
longer relevant.

> Followed by fairly shallow C call chain.  I suspect that the deepest you
> can get there is when you get an unmapped page when reading the sigframe
> and go into page fault handling, with call chain going into some filesystem's
> ->readpage().  If it was that close to stack overflow, we'd see them all
> the time in e.g. random net ioctl doing copy_from_user() - that's going
> to be deeper.  Or in stat(2), for that matter.

Your points are well taken - I can see now that my concerns are without 
merit.

The only question that remains is whether the third patch can also go to 
-stable. Most of my testing was with all three patches applied, I can 
drop the third one and retest if you're worries the third one is not 
appropriate for -stable.

Cheers,

	Michael


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC][CFT] signal handling fixes
  2021-07-25 17:18 [RFC][CFT] signal handling fixes Al Viro
                   ` (3 preceding siblings ...)
  2021-07-27 10:21 ` [RFC][CFT] signal handling fixes Finn Thain
@ 2021-09-16  9:03 ` Finn Thain
  2021-09-23 14:43   ` Geert Uytterhoeven
  2021-09-23 14:45 ` Geert Uytterhoeven
  5 siblings, 1 reply; 18+ messages in thread
From: Finn Thain @ 2021-09-16  9:03 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-m68k, Geert Uytterhoeven, Greg Ungerer, linux-kernel

On Sun, 25 Jul 2021, Al Viro wrote:

> ...
> 
> PS:  FWIW, ifdefs in arch/m68k/kernel/signal.c are wrong - it's not !MMU 
> vs. coldfire/MMU vs. classic/MMU.  It's actually 68000 vs. coldfire vs. 
> everything else.  These days it's nearly correct, but only because on 
> MMU variants of coldfire we never see exception stack frames with type 
> other than 4 - it's controlled by alignment of kernel stack pointer on 
> those, and it's under the kernel control, so it's always 32bit-aligned.  
> It used to be more serious back when we had 68360 support - that's !MMU 
> and exception stack frames are like those on 68020, unless I'm 
> misreading their manual...
> 

I don't claim to understand this code but CPU32 cores appear to be 
unsupported on either #ifdef branch: the MMU branch due to CACR and CAAR 
used in push_cache(), and the !MMU branch due to frame format $4 used in 
adjust_format().

The CPU32 Reference Manual appendix says these chips only supports control 
registers SFC, DFC, VBR and stack frame formats $0, $2, $C. 
https://www.nxp.com/files-static/microcontrollers/doc/ref_manual/CPU32RM.pdf

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/3] m68k: leave stack mangling to asm wrapper of sigreturn()
  2021-09-16  5:02           ` Michael Schmitz
@ 2021-09-16 16:14             ` Al Viro
  0 siblings, 0 replies; 18+ messages in thread
From: Al Viro @ 2021-09-16 16:14 UTC (permalink / raw)
  To: Michael Schmitz
  Cc: linux-m68k, Geert Uytterhoeven, Greg Ungerer, linux-kernel

On Thu, Sep 16, 2021 at 05:02:22PM +1200, Michael Schmitz wrote:

> The only question that remains is whether the third patch can also go to
> -stable. Most of my testing was with all three patches applied, I can drop
> the third one and retest if you're worries the third one is not appropriate
> for -stable.

	Up to m68k folks, really.  The current mainline mangle_kernel_stack()
is, er, not nice and the entire area is delicate enough as it is (witness the
bugs dealt with in the rest of the series), but strictly speaking the third
patch is not fixing any functional bugs.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC][CFT] signal handling fixes
  2021-09-16  9:03 ` Finn Thain
@ 2021-09-23 14:43   ` Geert Uytterhoeven
  0 siblings, 0 replies; 18+ messages in thread
From: Geert Uytterhoeven @ 2021-09-23 14:43 UTC (permalink / raw)
  To: Finn Thain; +Cc: Al Viro, linux-m68k, Greg Ungerer, Linux Kernel Mailing List

Hi Finn,

On Thu, Sep 16, 2021 at 11:03 AM Finn Thain <fthain@linux-m68k.org> wrote:
> On Sun, 25 Jul 2021, Al Viro wrote:
> > ...
> >
> > PS:  FWIW, ifdefs in arch/m68k/kernel/signal.c are wrong - it's not !MMU
> > vs. coldfire/MMU vs. classic/MMU.  It's actually 68000 vs. coldfire vs.
> > everything else.  These days it's nearly correct, but only because on
> > MMU variants of coldfire we never see exception stack frames with type
> > other than 4 - it's controlled by alignment of kernel stack pointer on
> > those, and it's under the kernel control, so it's always 32bit-aligned.
> > It used to be more serious back when we had 68360 support - that's !MMU
> > and exception stack frames are like those on 68020, unless I'm
> > misreading their manual...
> >
>
> I don't claim to understand this code but CPU32 cores appear to be
> unsupported on either #ifdef branch: the MMU branch due to CACR and CAAR
> used in push_cache(), and the !MMU branch due to frame format $4 used in
> adjust_format().
>
> The CPU32 Reference Manual appendix says these chips only supports control
> registers SFC, DFC, VBR and stack frame formats $0, $2, $C.
> https://www.nxp.com/files-static/microcontrollers/doc/ref_manual/CPU32RM.pdf

As of commit a3595962d82495f5 ("m68knommu: remove obsolete 68360
support"), nothing selects MCPU32 anymore.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC][CFT] signal handling fixes
  2021-07-25 17:18 [RFC][CFT] signal handling fixes Al Viro
                   ` (4 preceding siblings ...)
  2021-09-16  9:03 ` Finn Thain
@ 2021-09-23 14:45 ` Geert Uytterhoeven
  5 siblings, 0 replies; 18+ messages in thread
From: Geert Uytterhoeven @ 2021-09-23 14:45 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-m68k, Greg Ungerer, Linux Kernel Mailing List

On Sun, Jul 25, 2021 at 7:18 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>         Back in 2012 or so I'd found a bunch of fun issues with multiple
> pending signals on a lot of architectures.  m68k looked scarier than
> usual (due to the combination of variable-sized exception frames with the
> way kernel stack pointer is handled by the hardware), but I'd convinced
> myself that it had been correct.
>
>         Unfortunately, I was wrong - handling of multiple pending signals
> does *not* work correctly there.

[...]

Thank you, queuing in the m68k branch as fixes.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2021-09-23 14:45 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-25 17:18 [RFC][CFT] signal handling fixes Al Viro
2021-07-25 17:19 ` [PATCH 1/3] m68k: handle arrivals of multiple signals correctly Al Viro
2021-09-15 22:08   ` Michael Schmitz
2021-07-25 17:19 ` [PATCH 2/3] m68k: update ->thread.esp0 before calling syscall_trace() in ret_from_signal Al Viro
2021-09-15 22:19   ` Michael Schmitz
2021-07-25 17:20 ` [PATCH 3/3] m68k: leave stack mangling to asm wrapper of sigreturn() Al Viro
2021-09-15 23:35   ` Michael Schmitz
2021-09-16  0:19     ` Al Viro
2021-09-16  0:53       ` Michael Schmitz
2021-09-16  3:21         ` Al Viro
2021-09-16  5:02           ` Michael Schmitz
2021-09-16 16:14             ` Al Viro
2021-07-27 10:21 ` [RFC][CFT] signal handling fixes Finn Thain
2021-07-27 14:42   ` Al Viro
2021-07-28  1:23     ` Finn Thain
2021-09-16  9:03 ` Finn Thain
2021-09-23 14:43   ` Geert Uytterhoeven
2021-09-23 14:45 ` Geert Uytterhoeven

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).