linux-m68k.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC v2 0/2] m68k uaccess fault handling fixes
@ 2024-04-22  2:29 Michael Schmitz
  2024-04-22  2:29 ` [PATCH RFC v2 1/2] m68k: Handle __generic_copy_to_user faults more carefully Michael Schmitz
  2024-04-22  2:29 ` [PATCH RFC v2 2/2] m68k: improve __constant_copy_to_user_asm() fault handling Michael Schmitz
  0 siblings, 2 replies; 20+ messages in thread
From: Michael Schmitz @ 2024-04-22  2:29 UTC (permalink / raw)
  To: linux-m68k, geert; +Cc: schmitzmic

Version 2 of fixes for uaccess fault handling on 68030 -
these patches ought to work on 68040 just as well.

Patch 1 is a reworked version of my earlier RFC patch, extending
the exception table in __generic_copy_to_user by an additional
instruction past each moves instruction instead of using NOPs
to force the fault on that instruction.

Patch 2 corrects a similar problem in __constant_copy_to_user_asm
by using a combination of additional exception table entries
and a NOP placed after the final moves instruction. 

I have found only the 8 and 12 byte cases of this inline
code in use in the kernel, and have been able to test the 
8 byte case only (llseek). The 12 byte case (netdev ioctl)
may be impossible to exercise on m68k.

Testing on 040 and 060, as well as Coldfire, would be great.

Cheers,

   Michael



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH RFC v2 1/2] m68k: Handle __generic_copy_to_user faults more carefully
  2024-04-22  2:29 [PATCH RFC v2 0/2] m68k uaccess fault handling fixes Michael Schmitz
@ 2024-04-22  2:29 ` Michael Schmitz
  2024-04-25  4:16   ` Finn Thain
  2024-04-22  2:29 ` [PATCH RFC v2 2/2] m68k: improve __constant_copy_to_user_asm() fault handling Michael Schmitz
  1 sibling, 1 reply; 20+ messages in thread
From: Michael Schmitz @ 2024-04-22  2:29 UTC (permalink / raw)
  To: linux-m68k, geert; +Cc: schmitzmic, Finn Thain, linux-m68k

As mentioned by Finn Thain in his patch to improve put_user
exception handling on 040, a similar problem exists on 030
processors.

A moves instruction that crosses a page boundary from a
mapped page into an unmapped one will cause a mid-instruction
bus error exception (frame format b), with the PC pointing
(usually) two instructions past the faulting movesl instruction.

Our exception handling in __generic_copy_to_user only covers
the instruction immediately following the faulting one. As
a result, fixup_exception in send_fault_sig does not detect
this case, and cause send_fault_sig to oops.

Extend the exception table to cover one additional instruction
beyond the moves[lwb] instructions.

Tested on 68030 (Atari Falcon 030) with a transfer beginning
at a single byte at the end of a mapped page followed by
further bytes on an unmapped page (testcase derived from
stress-ng sysbadaddr stressor by Finn Thain).

A similar problem exists in __clear_user(); modify the exception
table for that function in the same way (untested)

Cc: Finn Thain <fthain@linux-m68k.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: linux-m68k@lists.linux-m68k.org
Link: https://lore.kernel.org/all/e0f23460779e6d16e2633486ac4841790ef2aca0.1713176294.git.fthain@linux-m68k.org
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>

---

Changes from RFC v1:

Michael Schmitz:

- use extended exception table instead of additional NOPs
---
 arch/m68k/lib/uaccess.c | 48 ++++++++++++++++++++++-------------------
 1 file changed, 26 insertions(+), 22 deletions(-)

diff --git a/arch/m68k/lib/uaccess.c b/arch/m68k/lib/uaccess.c
index 7646e461aa62..ef761fc10981 100644
--- a/arch/m68k/lib/uaccess.c
+++ b/arch/m68k/lib/uaccess.c
@@ -60,35 +60,37 @@ unsigned long __generic_copy_to_user(void __user *to, const void *from,
 
 	asm volatile ("\n"
 		"	tst.l	%0\n"
-		"	jeq	4f\n"
+		"	jeq	5f\n"
 		"1:	move.l	(%1)+,%3\n"
 		"2:	"MOVES".l	%3,(%2)+\n"
 		"3:	subq.l	#1,%0\n"
-		"	jne	1b\n"
-		"4:	btst	#1,%5\n"
-		"	jeq	6f\n"
+		"4:	jne	1b\n"
+		"5:	btst	#1,%5\n"
+		"	jeq	7f\n"
 		"	move.w	(%1)+,%3\n"
-		"5:	"MOVES".w	%3,(%2)+\n"
-		"6:	btst	#0,%5\n"
-		"	jeq	8f\n"
+		"6:	"MOVES".w	%3,(%2)+\n"
+		"7:	btst	#0,%5\n"
+		"8:	jeq	10f\n"
 		"	move.b	(%1)+,%3\n"
-		"7:	"MOVES".b  %3,(%2)+\n"
-		"8:\n"
+		"9:	"MOVES".b  %3,(%2)+\n"
+		"10:\n"
 		"	.section .fixup,\"ax\"\n"
 		"	.even\n"
 		"20:	lsl.l	#2,%0\n"
 		"50:	add.l	%5,%0\n"
-		"	jra	8b\n"
+		"	jra	10b\n"
 		"	.previous\n"
 		"\n"
 		"	.section __ex_table,\"a\"\n"
 		"	.align	4\n"
 		"	.long	2b,20b\n"
 		"	.long	3b,20b\n"
-		"	.long	5b,50b\n"
+		"	.long	4b,20b\n"
 		"	.long	6b,50b\n"
 		"	.long	7b,50b\n"
 		"	.long	8b,50b\n"
+		"	.long	9b,50b\n"
+		"	.long	10b,50b\n"
 		"	.previous"
 		: "=d" (res), "+a" (from), "+a" (to), "=&d" (tmp)
 		: "0" (n / 4), "d" (n & 3));
@@ -107,32 +109,34 @@ unsigned long __clear_user(void __user *to, unsigned long n)
 
 	asm volatile ("\n"
 		"	tst.l	%0\n"
-		"	jeq	3f\n"
+		"	jeq	4f\n"
 		"1:	"MOVES".l	%2,(%1)+\n"
 		"2:	subq.l	#1,%0\n"
-		"	jne	1b\n"
-		"3:	btst	#1,%4\n"
-		"	jeq	5f\n"
-		"4:	"MOVES".w	%2,(%1)+\n"
-		"5:	btst	#0,%4\n"
-		"	jeq	7f\n"
-		"6:	"MOVES".b	%2,(%1)\n"
-		"7:\n"
+		"3:	jne	1b\n"
+		"4:	btst	#1,%4\n"
+		"	jeq	6f\n"
+		"5:	"MOVES".w	%2,(%1)+\n"
+		"6:	btst	#0,%4\n"
+		"7:	jeq	9f\n"
+		"8:	"MOVES".b	%2,(%1)\n"
+		"9:\n"
 		"	.section .fixup,\"ax\"\n"
 		"	.even\n"
 		"10:	lsl.l	#2,%0\n"
 		"40:	add.l	%4,%0\n"
-		"	jra	7b\n"
+		"	jra	9b\n"
 		"	.previous\n"
 		"\n"
 		"	.section __ex_table,\"a\"\n"
 		"	.align	4\n"
 		"	.long	1b,10b\n"
 		"	.long	2b,10b\n"
-		"	.long	4b,40b\n"
+		"	.long	3b,10b\n"
 		"	.long	5b,40b\n"
 		"	.long	6b,40b\n"
 		"	.long	7b,40b\n"
+		"	.long	8b,40b\n"
+		"	.long	9b,40b\n"
 		"	.previous"
 		: "=d" (res), "+a" (to)
 		: "d" (0), "0" (n / 4), "d" (n & 3));
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH RFC v2 2/2] m68k: improve __constant_copy_to_user_asm() fault handling
  2024-04-22  2:29 [PATCH RFC v2 0/2] m68k uaccess fault handling fixes Michael Schmitz
  2024-04-22  2:29 ` [PATCH RFC v2 1/2] m68k: Handle __generic_copy_to_user faults more carefully Michael Schmitz
@ 2024-04-22  2:29 ` Michael Schmitz
  1 sibling, 0 replies; 20+ messages in thread
From: Michael Schmitz @ 2024-04-22  2:29 UTC (permalink / raw)
  To: linux-m68k, geert; +Cc: schmitzmic, Finn Thain

A problem similar to that reported for __put_user_asm and
__generic_copy_to_user exists in __constant_copy_to_user_asm.

Address the problem by extending the exception table to cover
two instructions past each moves instruction, and adding a
single NOP at the very end to catch faults on the final
instruction (which is not guaranteed to be a movesb!).

Tested on 68030 (Atari Falcon 030) with a transfer beginning
at a single byte at the end of a mapped page followed by
seven more bytes on an unmapped page (testcase derived from
stress-ng sysbadaddr stressor by Finn Thain and modified to
use the llseek syscall).

Cc: Finn Thain <fthain@linux-m68k.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/all/e0f23460779e6d16e2633486ac4841790ef2aca0.1713176294.git.fthain@linux-m68k.org
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
---
 arch/m68k/include/asm/uaccess.h | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/m68k/include/asm/uaccess.h b/arch/m68k/include/asm/uaccess.h
index 64914872a5c9..0671df8c6b59 100644
--- a/arch/m68k/include/asm/uaccess.h
+++ b/arch/m68k/include/asm/uaccess.h
@@ -286,10 +286,11 @@ __constant_copy_from_user(void *to, const void __user *from, unsigned long n)
 		"21:	"MOVES"."#s2"	%3,(%1)+\n"			\
 		"22:\n"							\
 		"	.ifnc	\""#s3"\",\"\"\n"			\
-		"	move."#s3"	(%2)+,%3\n"			\
-		"31:	"MOVES"."#s3"	%3,(%1)+\n"			\
-		"32:\n"							\
+		"31:	move."#s3"	(%2)+,%3\n"			\
+		"32:	"MOVES"."#s3"	%3,(%1)+\n"			\
+		"33:\n"							\
 		"	.endif\n"					\
+		"34:	nop\n"						\
 		"4:\n"							\
 		"\n"							\
 		"	.section __ex_table,\"a\"\n"			\
@@ -301,7 +302,9 @@ __constant_copy_from_user(void *to, const void __user *from, unsigned long n)
 		"	.ifnc	\""#s3"\",\"\"\n"			\
 		"	.long	31b,5f\n"				\
 		"	.long	32b,5f\n"				\
+		"	.long	33b,5f\n"				\
 		"	.endif\n"					\
+		"	.long	34b,5f\n"				\
 		"	.previous\n"					\
 		"\n"							\
 		"	.section .fixup,\"ax\"\n"			\
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 1/2] m68k: Handle __generic_copy_to_user faults more carefully
  2024-04-22  2:29 ` [PATCH RFC v2 1/2] m68k: Handle __generic_copy_to_user faults more carefully Michael Schmitz
@ 2024-04-25  4:16   ` Finn Thain
  2024-04-25  5:32     ` Michael Schmitz
                       ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Finn Thain @ 2024-04-25  4:16 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: geert, linux-m68k


On Mon, 22 Apr 2024, Michael Schmitz wrote:

> As mentioned by Finn Thain in his patch to improve put_user exception 
> handling on 040, a similar problem exists on 030 processors.
> 
> A moves instruction that crosses a page boundary from a mapped page into 
> an unmapped one will cause a mid-instruction bus error exception (frame 
> format b), with the PC pointing (usually) two instructions past the 
> faulting movesl instruction.
> 
> Our exception handling in __generic_copy_to_user only covers the 
> instruction immediately following the faulting one. As a result, 
> fixup_exception in send_fault_sig does not detect this case, and cause 
> send_fault_sig to oops.
> 
> Extend the exception table to cover one additional instruction beyond 
> the moves[lwb] instructions.
> 
> Tested on 68030 (Atari Falcon 030) with a transfer beginning at a single 
> byte at the end of a mapped page followed by further bytes on an 
> unmapped page (testcase derived from stress-ng sysbadaddr stressor by 
> Finn Thain).
> 
> A similar problem exists in __clear_user(); modify the exception table 
> for that function in the same way (untested)
> 

I've just tested this on a Motorola 68040 and I got an oops in 
__generic_copy_to_user(). It appears that we need more entries in the 
exception table. (I also tested in QEMU and it did not oops.)

This oops indicates that we are going to need the final NOP that was in 
the first version of your patch. My test program seems inadequate to show 
that it is safe to omit that NOP -- we would need a test which doesn't 
jump over the MOVES.B.

Unable to handle kernel access at virtual address c0029000
Oops: 00000000
Modules linked in:
PC: [<0046454c>] __generic_copy_to_user+0x48/0x5c
SR: 2000  SP: 0152df10  a2: 01502160
d0: 00000000    d1: 00000001    d2: 2f746d70    d3: c0028fff
d4: 00000400    d5: c0028fff    a0: c0029003    a1: 0081bfff
Process badaddr-3syscal (pid: 83, task=01502160)
Frame format=7 eff addr=0152df6c ssw=0c81 faddr=c0028fff
wb 1 stat/addr/data: 0001 20000046 fcae0008
wb 2 stat/addr/data: 0081 c0028fff 2f746d70
wb 3 stat/addr/data: 0045 0152df64 2f746d70
push data: fcae0008 00000005 0152df88 0046451e
Stack from 0152df78:
        c0028fff 00000005 00000005 0081b000 0152dfc4 00128cf6 c0028fff 0081bffb
        00000005 00000400 efd92d5c 8000087a 8000408c 0099dd90 00c29900 0099d910
        00c0c400 0081bffb 00000ffb efd92d48 000027a2 c0028fff 00000400 efd92d5c
        8000087a 8000408c 8000408c 00000000 efd92ea4 ffffffda 000000b7 00000000
        0000c011 a5100080
Call Trace: [<00128cf6>] sys_getcwd+0xc8/0x15a
 [<000027a2>] syscall+0x8/0xc
 [<0000c011>] mac_reset+0x16d/0x170

Code: 0001 6706 3419 0e58 2800 0801 0000 6706 <1419> 0e18 2800 242e fff8 262e fffc 4e5e 4e75 4e75 4e56 0000 48e7 2018 242e 0008
Disabling lock debugging due to kernel taint

00464504 <__generic_copy_to_user>:
  464504:       4e56 0000       linkw %fp,#0
  464508:       2f03            movel %d3,%sp@-
  46450a:       2f02            movel %d2,%sp@-
  46450c:       262e 0008       movel %fp@(8),%d3
  464510:       242e 0010       movel %fp@(16),%d2
  464514:       2f02            movel %d2,%sp@-
  464516:       2f03            movel %d3,%sp@-
  464518:       4eb9 0046 44c6  jsr 4644c6 <__clear_user>
  46451e:       2002            movel %d2,%d0
  464520:       e488            lsrl #2,%d0
  464522:       7203            moveq #3,%d1
  464524:       c282            andl %d2,%d1
  464526:       226e 000c       moveal %fp@(12),%a1
  46452a:       2043            moveal %d3,%a0
  46452c:       4a80            tstl %d0
  46452e:       670a            beqs 46453a <__generic_copy_to_user+0x36>
  464530:       2419            movel %a1@+,%d2
  464532:       0e98 2800       movesl %d2,%a0@+
  464536:       5380            subql #1,%d0
  464538:       66f6            bnes 464530 <__generic_copy_to_user+0x2c>
  46453a:       0801 0001       btst #1,%d1
  46453e:       6706            beqs 464546 <__generic_copy_to_user+0x42>
  464540:       3419            movew %a1@+,%d2
  464542:       0e58 2800       movesw %d2,%a0@+
  464546:       0801 0000       btst #0,%d1
  46454a:       6706            beqs 464552 <__generic_copy_to_user+0x4e>
  46454c:       1419            moveb %a1@+,%d2
  46454e:       0e18 2800       movesb %d2,%a0@+
  464552:       242e fff8       movel %fp@(-8),%d2
  464556:       262e fffc       movel %fp@(-4),%d3
  46455a:       4e5e            unlk %fp
  46455c:       4e75            rts
  46455e:       4e75            rts


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 1/2] m68k: Handle __generic_copy_to_user faults more carefully
  2024-04-25  4:16   ` Finn Thain
@ 2024-04-25  5:32     ` Michael Schmitz
  2024-04-25  6:32       ` Finn Thain
  2024-04-25  5:45     ` Michael Schmitz
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 20+ messages in thread
From: Michael Schmitz @ 2024-04-25  5:32 UTC (permalink / raw)
  To: Finn Thain; +Cc: geert, linux-m68k

Hi Finn,

thanks for testing!

Am 25.04.2024 um 16:16 schrieb Finn Thain:
>
> On Mon, 22 Apr 2024, Michael Schmitz wrote:
>
>> As mentioned by Finn Thain in his patch to improve put_user exception
>> handling on 040, a similar problem exists on 030 processors.
>>
>> A moves instruction that crosses a page boundary from a mapped page into
>> an unmapped one will cause a mid-instruction bus error exception (frame
>> format b), with the PC pointing (usually) two instructions past the
>> faulting movesl instruction.
>>
>> Our exception handling in __generic_copy_to_user only covers the
>> instruction immediately following the faulting one. As a result,
>> fixup_exception in send_fault_sig does not detect this case, and cause
>> send_fault_sig to oops.
>>
>> Extend the exception table to cover one additional instruction beyond
>> the moves[lwb] instructions.
>>
>> Tested on 68030 (Atari Falcon 030) with a transfer beginning at a single
>> byte at the end of a mapped page followed by further bytes on an
>> unmapped page (testcase derived from stress-ng sysbadaddr stressor by
>> Finn Thain).
>>
>> A similar problem exists in __clear_user(); modify the exception table
>> for that function in the same way (untested)
>>
>
> I've just tested this on a Motorola 68040 and I got an oops in
> __generic_copy_to_user(). It appears that we need more entries in the
> exception table. (I also tested in QEMU and it did not oops.)

I'm a bit puzzled about the location of the fault.

The values of faddr and a0 from the exception frame indicate that the 
transfer leading up to the fault was a longword transfer. Both ssw and 
wbs2 suggest the same. Yet we don't take the fault on the longword 
moves, but apparently on the word moves right after.

That can't be right either - d1 is 1 so the word moves would have been 
skipped. It appears that we only take the movesl fault the next time any 
bus cycle is initiated on 040 (the moveb at 0x46454c).

That's different from how the 030 faulted in the same situation. I 
expect we'll have to add exception table entries on the movew and moveb 
instructions, too. I'll do that next.

> This oops indicates that we are going to need the final NOP that was in
> the first version of your patch. My test program seems inadequate to show
> that it is safe to omit that NOP -- we would need a test which doesn't
> jump over the MOVES.B.

We'd need a test using any number of longword moves expected to succeed, 
followed by a byte move which is expected to fault. The current test 
would attempt to do a byte move, but faults during the longword moves.

This requires running the test program in a directory whose absolute 
path is a multiple of four characters long, and setting the start 
address for the getcwd test accordingly, so the newline at the end of 
the string is the single byte left to copy. Does that make sense?

Incidentally - what is the path this tests is run in? Any path longer 
than five characters (including the newline) would have to had looped 
back to the first movel, and faulted there?

As you said before - we'd need to know a lot more about 
microarchitectural details here.

Cheers,

	Michael


>
> Unable to handle kernel access at virtual address c0029000
> Oops: 00000000
> Modules linked in:
> PC: [<0046454c>] __generic_copy_to_user+0x48/0x5c
> SR: 2000  SP: 0152df10  a2: 01502160
> d0: 00000000    d1: 00000001    d2: 2f746d70    d3: c0028fff
> d4: 00000400    d5: c0028fff    a0: c0029003    a1: 0081bfff
> Process badaddr-3syscal (pid: 83, task=01502160)
> Frame format=7 eff addr=0152df6c ssw=0c81 faddr=c0028fff
> wb 1 stat/addr/data: 0001 20000046 fcae0008
> wb 2 stat/addr/data: 0081 c0028fff 2f746d70
> wb 3 stat/addr/data: 0045 0152df64 2f746d70
> push data: fcae0008 00000005 0152df88 0046451e
> Stack from 0152df78:
>         c0028fff 00000005 00000005 0081b000 0152dfc4 00128cf6 c0028fff 0081bffb
>         00000005 00000400 efd92d5c 8000087a 8000408c 0099dd90 00c29900 0099d910
>         00c0c400 0081bffb 00000ffb efd92d48 000027a2 c0028fff 00000400 efd92d5c
>         8000087a 8000408c 8000408c 00000000 efd92ea4 ffffffda 000000b7 00000000
>         0000c011 a5100080
> Call Trace: [<00128cf6>] sys_getcwd+0xc8/0x15a
>  [<000027a2>] syscall+0x8/0xc
>  [<0000c011>] mac_reset+0x16d/0x170
>
> Code: 0001 6706 3419 0e58 2800 0801 0000 6706 <1419> 0e18 2800 242e fff8 262e fffc 4e5e 4e75 4e75 4e56 0000 48e7 2018 242e 0008
> Disabling lock debugging due to kernel taint
>
> 00464504 <__generic_copy_to_user>:
>   464504:       4e56 0000       linkw %fp,#0
>   464508:       2f03            movel %d3,%sp@-
>   46450a:       2f02            movel %d2,%sp@-
>   46450c:       262e 0008       movel %fp@(8),%d3
>   464510:       242e 0010       movel %fp@(16),%d2
>   464514:       2f02            movel %d2,%sp@-
>   464516:       2f03            movel %d3,%sp@-
>   464518:       4eb9 0046 44c6  jsr 4644c6 <__clear_user>
>   46451e:       2002            movel %d2,%d0
>   464520:       e488            lsrl #2,%d0
>   464522:       7203            moveq #3,%d1
>   464524:       c282            andl %d2,%d1
>   464526:       226e 000c       moveal %fp@(12),%a1
>   46452a:       2043            moveal %d3,%a0
>   46452c:       4a80            tstl %d0
>   46452e:       670a            beqs 46453a <__generic_copy_to_user+0x36>
>   464530:       2419            movel %a1@+,%d2
>   464532:       0e98 2800       movesl %d2,%a0@+
>   464536:       5380            subql #1,%d0
>   464538:       66f6            bnes 464530 <__generic_copy_to_user+0x2c>
>   46453a:       0801 0001       btst #1,%d1
>   46453e:       6706            beqs 464546 <__generic_copy_to_user+0x42>
>   464540:       3419            movew %a1@+,%d2
>   464542:       0e58 2800       movesw %d2,%a0@+
>   464546:       0801 0000       btst #0,%d1
>   46454a:       6706            beqs 464552 <__generic_copy_to_user+0x4e>
>   46454c:       1419            moveb %a1@+,%d2
>   46454e:       0e18 2800       movesb %d2,%a0@+
>   464552:       242e fff8       movel %fp@(-8),%d2
>   464556:       262e fffc       movel %fp@(-4),%d3
>   46455a:       4e5e            unlk %fp
>   46455c:       4e75            rts
>   46455e:       4e75            rts
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 1/2] m68k: Handle __generic_copy_to_user faults more carefully
  2024-04-25  4:16   ` Finn Thain
  2024-04-25  5:32     ` Michael Schmitz
@ 2024-04-25  5:45     ` Michael Schmitz
  2024-04-25  6:47       ` Finn Thain
  2024-04-25  8:20     ` Michael Schmitz
  2024-04-25 19:15     ` Michael Schmitz
  3 siblings, 1 reply; 20+ messages in thread
From: Michael Schmitz @ 2024-04-25  5:45 UTC (permalink / raw)
  To: Finn Thain; +Cc: geert, linux-m68k

[-- Attachment #1: Type: text/plain, Size: 4851 bytes --]

Hi Finn,

untested patch (to go on top of my v2 patch 1/2) attached. Compiles , 
but I haven't had a chance to test on 030 yet.

Cheers,

	Michael


Am 25.04.2024 um 16:16 schrieb Finn Thain:
>
> On Mon, 22 Apr 2024, Michael Schmitz wrote:
>
>> As mentioned by Finn Thain in his patch to improve put_user exception
>> handling on 040, a similar problem exists on 030 processors.
>>
>> A moves instruction that crosses a page boundary from a mapped page into
>> an unmapped one will cause a mid-instruction bus error exception (frame
>> format b), with the PC pointing (usually) two instructions past the
>> faulting movesl instruction.
>>
>> Our exception handling in __generic_copy_to_user only covers the
>> instruction immediately following the faulting one. As a result,
>> fixup_exception in send_fault_sig does not detect this case, and cause
>> send_fault_sig to oops.
>>
>> Extend the exception table to cover one additional instruction beyond
>> the moves[lwb] instructions.
>>
>> Tested on 68030 (Atari Falcon 030) with a transfer beginning at a single
>> byte at the end of a mapped page followed by further bytes on an
>> unmapped page (testcase derived from stress-ng sysbadaddr stressor by
>> Finn Thain).
>>
>> A similar problem exists in __clear_user(); modify the exception table
>> for that function in the same way (untested)
>>
>
> I've just tested this on a Motorola 68040 and I got an oops in
> __generic_copy_to_user(). It appears that we need more entries in the
> exception table. (I also tested in QEMU and it did not oops.)
>
> This oops indicates that we are going to need the final NOP that was in
> the first version of your patch. My test program seems inadequate to show
> that it is safe to omit that NOP -- we would need a test which doesn't
> jump over the MOVES.B.
>
> Unable to handle kernel access at virtual address c0029000
> Oops: 00000000
> Modules linked in:
> PC: [<0046454c>] __generic_copy_to_user+0x48/0x5c
> SR: 2000  SP: 0152df10  a2: 01502160
> d0: 00000000    d1: 00000001    d2: 2f746d70    d3: c0028fff
> d4: 00000400    d5: c0028fff    a0: c0029003    a1: 0081bfff
> Process badaddr-3syscal (pid: 83, task=01502160)
> Frame format=7 eff addr=0152df6c ssw=0c81 faddr=c0028fff
> wb 1 stat/addr/data: 0001 20000046 fcae0008
> wb 2 stat/addr/data: 0081 c0028fff 2f746d70
> wb 3 stat/addr/data: 0045 0152df64 2f746d70
> push data: fcae0008 00000005 0152df88 0046451e
> Stack from 0152df78:
>         c0028fff 00000005 00000005 0081b000 0152dfc4 00128cf6 c0028fff 0081bffb
>         00000005 00000400 efd92d5c 8000087a 8000408c 0099dd90 00c29900 0099d910
>         00c0c400 0081bffb 00000ffb efd92d48 000027a2 c0028fff 00000400 efd92d5c
>         8000087a 8000408c 8000408c 00000000 efd92ea4 ffffffda 000000b7 00000000
>         0000c011 a5100080
> Call Trace: [<00128cf6>] sys_getcwd+0xc8/0x15a
>  [<000027a2>] syscall+0x8/0xc
>  [<0000c011>] mac_reset+0x16d/0x170
>
> Code: 0001 6706 3419 0e58 2800 0801 0000 6706 <1419> 0e18 2800 242e fff8 262e fffc 4e5e 4e75 4e75 4e56 0000 48e7 2018 242e 0008
> Disabling lock debugging due to kernel taint
>
> 00464504 <__generic_copy_to_user>:
>   464504:       4e56 0000       linkw %fp,#0
>   464508:       2f03            movel %d3,%sp@-
>   46450a:       2f02            movel %d2,%sp@-
>   46450c:       262e 0008       movel %fp@(8),%d3
>   464510:       242e 0010       movel %fp@(16),%d2
>   464514:       2f02            movel %d2,%sp@-
>   464516:       2f03            movel %d3,%sp@-
>   464518:       4eb9 0046 44c6  jsr 4644c6 <__clear_user>
>   46451e:       2002            movel %d2,%d0
>   464520:       e488            lsrl #2,%d0
>   464522:       7203            moveq #3,%d1
>   464524:       c282            andl %d2,%d1
>   464526:       226e 000c       moveal %fp@(12),%a1
>   46452a:       2043            moveal %d3,%a0
>   46452c:       4a80            tstl %d0
>   46452e:       670a            beqs 46453a <__generic_copy_to_user+0x36>
>   464530:       2419            movel %a1@+,%d2
>   464532:       0e98 2800       movesl %d2,%a0@+
>   464536:       5380            subql #1,%d0
>   464538:       66f6            bnes 464530 <__generic_copy_to_user+0x2c>
>   46453a:       0801 0001       btst #1,%d1
>   46453e:       6706            beqs 464546 <__generic_copy_to_user+0x42>
>   464540:       3419            movew %a1@+,%d2
>   464542:       0e58 2800       movesw %d2,%a0@+
>   464546:       0801 0000       btst #0,%d1
>   46454a:       6706            beqs 464552 <__generic_copy_to_user+0x4e>
>   46454c:       1419            moveb %a1@+,%d2
>   46454e:       0e18 2800       movesb %d2,%a0@+
>   464552:       242e fff8       movel %fp@(-8),%d2
>   464556:       262e fffc       movel %fp@(-4),%d3
>   46455a:       4e5e            unlk %fp
>   46455c:       4e75            rts
>   46455e:       4e75            rts
>

[-- Attachment #2: 0001-m68k-add-more-exception-handling-to-__generic_copy_t.patch --]
[-- Type: text/x-diff, Size: 1616 bytes --]

From 23b973a077467818793a7aad89fde5eaf55cbd7f Mon Sep 17 00:00:00 2001
From: Michael Schmitz <schmitzmic@gmail.com>
Date: Thu, 25 Apr 2024 17:39:07 +1200
Subject: [PATCH] m68k: add more exception handling to __generic_copy_to_user

---
 arch/m68k/lib/uaccess.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/arch/m68k/lib/uaccess.c b/arch/m68k/lib/uaccess.c
index ef761fc10981..5e27d5d3f10f 100644
--- a/arch/m68k/lib/uaccess.c
+++ b/arch/m68k/lib/uaccess.c
@@ -66,23 +66,25 @@ unsigned long __generic_copy_to_user(void __user *to, const void *from,
 		"3:	subq.l	#1,%0\n"
 		"4:	jne	1b\n"
 		"5:	btst	#1,%5\n"
-		"	jeq	7f\n"
-		"	move.w	(%1)+,%3\n"
-		"6:	"MOVES".w	%3,(%2)+\n"
-		"7:	btst	#0,%5\n"
-		"8:	jeq	10f\n"
-		"	move.b	(%1)+,%3\n"
-		"9:	"MOVES".b  %3,(%2)+\n"
-		"10:\n"
+		"	jeq	8f\n"
+		"6:	move.w	(%1)+,%3\n"
+		"7:	"MOVES".w	%3,(%2)+\n"
+		"8:	btst	#0,%5\n"
+		"9:	jeq	13f\n"
+		"10:	move.b	(%1)+,%3\n"
+		"11:	"MOVES".b  %3,(%2)+\n"
+		"12:	nop\n"
+		"13:\n"
 		"	.section .fixup,\"ax\"\n"
 		"	.even\n"
 		"20:	lsl.l	#2,%0\n"
 		"50:	add.l	%5,%0\n"
-		"	jra	10b\n"
+		"	jra	13b\n"
 		"	.previous\n"
 		"\n"
 		"	.section __ex_table,\"a\"\n"
 		"	.align	4\n"
+		"	.long	1b,20b\n"
 		"	.long	2b,20b\n"
 		"	.long	3b,20b\n"
 		"	.long	4b,20b\n"
@@ -91,6 +93,8 @@ unsigned long __generic_copy_to_user(void __user *to, const void *from,
 		"	.long	8b,50b\n"
 		"	.long	9b,50b\n"
 		"	.long	10b,50b\n"
+		"	.long	11b,50b\n"
+		"	.long	12b,50b\n"
 		"	.previous"
 		: "=d" (res), "+a" (from), "+a" (to), "=&d" (tmp)
 		: "0" (n / 4), "d" (n & 3));
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 1/2] m68k: Handle __generic_copy_to_user faults more carefully
  2024-04-25  5:32     ` Michael Schmitz
@ 2024-04-25  6:32       ` Finn Thain
  2024-04-25  7:52         ` Michael Schmitz
  0 siblings, 1 reply; 20+ messages in thread
From: Finn Thain @ 2024-04-25  6:32 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: geert, linux-m68k


On Thu, 25 Apr 2024, Michael Schmitz wrote:

> >
> > I've just tested this on a Motorola 68040 and I got an oops in 
> > __generic_copy_to_user(). It appears that we need more entries in the 
> > exception table. (I also tested in QEMU and it did not oops.)
> 
> I'm a bit puzzled about the location of the fault.
> 
> The values of faddr and a0 from the exception frame indicate that the 
> transfer leading up to the fault was a longword transfer. Both ssw and 
> wbs2 suggest the same. Yet we don't take the fault on the longword 
> moves, but apparently on the word moves right after.
> 
> That can't be right either - d1 is 1 so the word moves would have been 
> skipped. It appears that we only take the movesl fault the next time any 
> bus cycle is initiated on 040 (the moveb at 0x46454c).
> 

Seems so.

> That's different from how the 030 faulted in the same situation. I 
> expect we'll have to add exception table entries on the movew and moveb 
> instructions, too. I'll do that next.
> 
> > This oops indicates that we are going to need the final NOP that was 
> > in the first version of your patch. My test program seems inadequate 
> > to show that it is safe to omit that NOP -- we would need a test which 
> > doesn't jump over the MOVES.B.
> 
> We'd need a test using any number of longword moves expected to succeed, 
> followed by a byte move which is expected to fault. The current test 
> would attempt to do a byte move, but faults during the longword moves.
> 
> This requires running the test program in a directory whose absolute 
> path is a multiple of four characters long, and setting the start 
> address for the getcwd test accordingly, so the newline at the end of 
> the string is the single byte left to copy. Does that make sense?
> 

Yes (I take it you meant NUL instead of LF). But my concern was that the 
test program passes a pointer like 0xc0029000 - 1. That means the final 
byte will land on a word that already faulted. I'll need to add a new test 
that passes a pointer like 0xc0029000 - 5.

> Incidentally - what is the path this tests is run in? Any path longer 
> than five characters (including the newline) would have to had looped 
> back to the first movel, and faulted there?
> 

It was /tmp.

> As you said before - we'd need to know a lot more about 
> microarchitectural details here.
> 

It's hard to be certain. We just have to experiment until we find 
something that works on the CPUs we can test.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 1/2] m68k: Handle __generic_copy_to_user faults more carefully
  2024-04-25  5:45     ` Michael Schmitz
@ 2024-04-25  6:47       ` Finn Thain
  2024-04-25  7:43         ` Michael Schmitz
  0 siblings, 1 reply; 20+ messages in thread
From: Finn Thain @ 2024-04-25  6:47 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: geert, linux-m68k

On Thu, 25 Apr 2024, Michael Schmitz wrote:

> --- a/arch/m68k/lib/uaccess.c
> +++ b/arch/m68k/lib/uaccess.c
> @@ -68,23 +68,25 @@ unsigned long __generic_copy_to_user(void __user *to, const void *from,
>  		"3:	subq.l	#1,%0\n"
>  		"4:	jne	1b\n"
>  		"5:	btst	#1,%5\n"
> -		"	jeq	7f\n"
> -		"	move.w	(%1)+,%3\n"
> -		"6:	"MOVES".w	%3,(%2)+\n"
> -		"7:	btst	#0,%5\n"
> -		"8:	jeq	10f\n"
> -		"	move.b	(%1)+,%3\n"
> -		"9:	"MOVES".b  %3,(%2)+\n"
> -		"10:\n"
> +		"	jeq	8f\n"
> +		"6:	move.w	(%1)+,%3\n"
> +		"7:	"MOVES".w	%3,(%2)+\n"
> +		"8:	btst	#0,%5\n"

I understand why you put the MOVE.W and MOVE.B into the exception table.

> +		"9:	jeq	13f\n"
> +		"10:	move.b	(%1)+,%3\n"
> +		"11:	"MOVES".b  %3,(%2)+\n"
> +		"12:	nop\n"
> +		"13:\n"
>  		"	.section .fixup,\"ax\"\n"
>  		"	.even\n"
>  		"20:	lsl.l	#2,%0\n"
>  		"50:	add.l	%5,%0\n"
> -		"	jra	10b\n"
> +		"	jra	13b\n"
>  		"	.previous\n"
>  		"\n"
>  		"	.section __ex_table,\"a\"\n"
>  		"	.align	4\n"
> +		"	.long	1b,20b\n"

... but I don't see why you need the MOVE.L in there (?)

Also, it is odd to see the third JEQ added to the table but not the second.
Perhaps we don't need either one in there (?)

>  		"	.long	2b,20b\n"
>  		"	.long	3b,20b\n"
>  		"	.long	4b,20b\n"
> @@ -93,6 +95,8 @@ unsigned long __generic_copy_to_user(void __user *to, const void *from,
>  		"	.long	8b,50b\n"
>  		"	.long	9b,50b\n"
>  		"	.long	10b,50b\n"
> +		"	.long	11b,50b\n"
> +		"	.long	12b,50b\n"
>  		"	.previous"
>  		: "=d" (res), "+a" (from), "+a" (to), "=&d" (tmp)
>  		: "0" (n / 4), "d" (n & 3));

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 1/2] m68k: Handle __generic_copy_to_user faults more carefully
  2024-04-25  6:47       ` Finn Thain
@ 2024-04-25  7:43         ` Michael Schmitz
  0 siblings, 0 replies; 20+ messages in thread
From: Michael Schmitz @ 2024-04-25  7:43 UTC (permalink / raw)
  To: Finn Thain; +Cc: geert, linux-m68k

Hi Finn,

Am 25.04.2024 um 18:47 schrieb Finn Thain:
> On Thu, 25 Apr 2024, Michael Schmitz wrote:
>
>> --- a/arch/m68k/lib/uaccess.c
>> +++ b/arch/m68k/lib/uaccess.c
>> @@ -68,23 +68,25 @@ unsigned long __generic_copy_to_user(void __user *to, const void *from,
>>  		"3:	subq.l	#1,%0\n"
>>  		"4:	jne	1b\n"
>>  		"5:	btst	#1,%5\n"
>> -		"	jeq	7f\n"
>> -		"	move.w	(%1)+,%3\n"
>> -		"6:	"MOVES".w	%3,(%2)+\n"
>> -		"7:	btst	#0,%5\n"
>> -		"8:	jeq	10f\n"
>> -		"	move.b	(%1)+,%3\n"
>> -		"9:	"MOVES".b  %3,(%2)+\n"
>> -		"10:\n"
>> +		"	jeq	8f\n"
>> +		"6:	move.w	(%1)+,%3\n"
>> +		"7:	"MOVES".w	%3,(%2)+\n"
>> +		"8:	btst	#0,%5\n"
>
> I understand why you put the MOVE.W and MOVE.B into the exception table.
>
>> +		"9:	jeq	13f\n"
>> +		"10:	move.b	(%1)+,%3\n"
>> +		"11:	"MOVES".b  %3,(%2)+\n"
>> +		"12:	nop\n"
>> +		"13:\n"
>>  		"	.section .fixup,\"ax\"\n"
>>  		"	.even\n"
>>  		"20:	lsl.l	#2,%0\n"
>>  		"50:	add.l	%5,%0\n"
>> -		"	jra	10b\n"
>> +		"	jra	13b\n"
>>  		"	.previous\n"
>>  		"\n"
>>  		"	.section __ex_table,\"a\"\n"
>>  		"	.align	4\n"
>> +		"	.long	1b,20b\n"
>
> ... but I don't see why you need the MOVE.L in there (?)

It's part of a loop - if there's enough longwords to copy, that 
instruction may follow the fault instruction eventually.

> Also, it is odd to see the third JEQ added to the table but not the second.
> Perhaps we don't need either one in there (?)

Faults on 030 appear to happen two instructions after a moves. But I may 
be misremembering that...

I'll have to verify that.

Cheers,

	Michael

>>  		"	.long	2b,20b\n"
>>  		"	.long	3b,20b\n"
>>  		"	.long	4b,20b\n"
>> @@ -93,6 +95,8 @@ unsigned long __generic_copy_to_user(void __user *to, const void *from,
>>  		"	.long	8b,50b\n"
>>  		"	.long	9b,50b\n"
>>  		"	.long	10b,50b\n"
>> +		"	.long	11b,50b\n"
>> +		"	.long	12b,50b\n"
>>  		"	.previous"
>>  		: "=d" (res), "+a" (from), "+a" (to), "=&d" (tmp)
>>  		: "0" (n / 4), "d" (n & 3));

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 1/2] m68k: Handle __generic_copy_to_user faults more carefully
  2024-04-25  6:32       ` Finn Thain
@ 2024-04-25  7:52         ` Michael Schmitz
  0 siblings, 0 replies; 20+ messages in thread
From: Michael Schmitz @ 2024-04-25  7:52 UTC (permalink / raw)
  To: Finn Thain; +Cc: geert, linux-m68k

Hi Finn,

Am 25.04.2024 um 18:32 schrieb Finn Thain:
>>> This oops indicates that we are going to need the final NOP that was
>>> in the first version of your patch. My test program seems inadequate
>>> to show that it is safe to omit that NOP -- we would need a test which
>>> doesn't jump over the MOVES.B.
>>
>> We'd need a test using any number of longword moves expected to succeed,
>> followed by a byte move which is expected to fault. The current test
>> would attempt to do a byte move, but faults during the longword moves.
>>
>> This requires running the test program in a directory whose absolute
>> path is a multiple of four characters long, and setting the start
>> address for the getcwd test accordingly, so the newline at the end of
>> the string is the single byte left to copy. Does that make sense?
>>
>
> Yes (I take it you meant NUL instead of LF). But my concern was that the

Yes, my bad ...

> test program passes a pointer like 0xc0029000 - 1. That means the final
> byte will land on a word that already faulted. I'll need to add a new test
> that passes a pointer like 0xc0029000 - 5.

That's what I meant to say, yes.

>> Incidentally - what is the path this tests is run in? Any path longer
>> than five characters (including the newline) would have to had looped
>> back to the first movel, and faulted there?
>>
>
> It was /tmp.

Right - so if I'm right, running the test in /root would exercise the 
movesw path, and fault on the movew. Using /var/tmp would loop back once 
to repeat the movesl, and likely fault on the movel there.

>> As you said before - we'd need to know a lot more about
>> microarchitectural details here.
>>
>
> It's hard to be certain. We just have to experiment until we find
> something that works on the CPUs we can test.

Right - I'll try a few of these ideas on my 030.

Cheers,

	Michael

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 1/2] m68k: Handle __generic_copy_to_user faults more carefully
  2024-04-25  4:16   ` Finn Thain
  2024-04-25  5:32     ` Michael Schmitz
  2024-04-25  5:45     ` Michael Schmitz
@ 2024-04-25  8:20     ` Michael Schmitz
  2024-04-25 19:15     ` Michael Schmitz
  3 siblings, 0 replies; 20+ messages in thread
From: Michael Schmitz @ 2024-04-25  8:20 UTC (permalink / raw)
  To: Finn Thain; +Cc: geert, linux-m68k

Hi Finn,

Am 25.04.2024 um 16:16 schrieb Finn Thain:
>
> On Mon, 22 Apr 2024, Michael Schmitz wrote:
>
>> As mentioned by Finn Thain in his patch to improve put_user exception
>> handling on 040, a similar problem exists on 030 processors.
>>
>> A moves instruction that crosses a page boundary from a mapped page into
>> an unmapped one will cause a mid-instruction bus error exception (frame
>> format b), with the PC pointing (usually) two instructions past the
>> faulting movesl instruction.
>>
>> Our exception handling in __generic_copy_to_user only covers the
>> instruction immediately following the faulting one. As a result,
>> fixup_exception in send_fault_sig does not detect this case, and cause
>> send_fault_sig to oops.
>>
>> Extend the exception table to cover one additional instruction beyond
>> the moves[lwb] instructions.
>>
>> Tested on 68030 (Atari Falcon 030) with a transfer beginning at a single
>> byte at the end of a mapped page followed by further bytes on an
>> unmapped page (testcase derived from stress-ng sysbadaddr stressor by
>> Finn Thain).
>>
>> A similar problem exists in __clear_user(); modify the exception table
>> for that function in the same way (untested)
>>
> I've just tested this on a Motorola 68040 and I got an oops in
> __generic_copy_to_user(). It appears that we need more entries in the
> exception table. (I also tested in QEMU and it did not oops.)

I've tried the same access pattern (one longword access, then continue 
to word or byte access) on 030  by running the old testcase in /tmp, and 
I get a fault on the first btst instruction. The label on that one isn't 
in the exception table, as it was only used as a branch target. I'll add 
that one and retry.

Cheers,

	Michael

> This oops indicates that we are going to need the final NOP that was in
> the first version of your patch. My test program seems inadequate to show
> that it is safe to omit that NOP -- we would need a test which doesn't
> jump over the MOVES.B.
>
> Unable to handle kernel access at virtual address c0029000
> Oops: 00000000
> Modules linked in:
> PC: [<0046454c>] __generic_copy_to_user+0x48/0x5c
> SR: 2000  SP: 0152df10  a2: 01502160
> d0: 00000000    d1: 00000001    d2: 2f746d70    d3: c0028fff
> d4: 00000400    d5: c0028fff    a0: c0029003    a1: 0081bfff
> Process badaddr-3syscal (pid: 83, task=01502160)
> Frame format=7 eff addr=0152df6c ssw=0c81 faddr=c0028fff
> wb 1 stat/addr/data: 0001 20000046 fcae0008
> wb 2 stat/addr/data: 0081 c0028fff 2f746d70
> wb 3 stat/addr/data: 0045 0152df64 2f746d70
> push data: fcae0008 00000005 0152df88 0046451e
> Stack from 0152df78:
>         c0028fff 00000005 00000005 0081b000 0152dfc4 00128cf6 c0028fff 0081bffb
>         00000005 00000400 efd92d5c 8000087a 8000408c 0099dd90 00c29900 0099d910
>         00c0c400 0081bffb 00000ffb efd92d48 000027a2 c0028fff 00000400 efd92d5c
>         8000087a 8000408c 8000408c 00000000 efd92ea4 ffffffda 000000b7 00000000
>         0000c011 a5100080
> Call Trace: [<00128cf6>] sys_getcwd+0xc8/0x15a
>  [<000027a2>] syscall+0x8/0xc
>  [<0000c011>] mac_reset+0x16d/0x170
>
> Code: 0001 6706 3419 0e58 2800 0801 0000 6706 <1419> 0e18 2800 242e fff8 262e fffc 4e5e 4e75 4e75 4e56 0000 48e7 2018 242e 0008
> Disabling lock debugging due to kernel taint
>
> 00464504 <__generic_copy_to_user>:
>   464504:       4e56 0000       linkw %fp,#0
>   464508:       2f03            movel %d3,%sp@-
>   46450a:       2f02            movel %d2,%sp@-
>   46450c:       262e 0008       movel %fp@(8),%d3
>   464510:       242e 0010       movel %fp@(16),%d2
>   464514:       2f02            movel %d2,%sp@-
>   464516:       2f03            movel %d3,%sp@-
>   464518:       4eb9 0046 44c6  jsr 4644c6 <__clear_user>
>   46451e:       2002            movel %d2,%d0
>   464520:       e488            lsrl #2,%d0
>   464522:       7203            moveq #3,%d1
>   464524:       c282            andl %d2,%d1
>   464526:       226e 000c       moveal %fp@(12),%a1
>   46452a:       2043            moveal %d3,%a0
>   46452c:       4a80            tstl %d0
>   46452e:       670a            beqs 46453a <__generic_copy_to_user+0x36>
>   464530:       2419            movel %a1@+,%d2
>   464532:       0e98 2800       movesl %d2,%a0@+
>   464536:       5380            subql #1,%d0
>   464538:       66f6            bnes 464530 <__generic_copy_to_user+0x2c>
>   46453a:       0801 0001       btst #1,%d1
>   46453e:       6706            beqs 464546 <__generic_copy_to_user+0x42>
>   464540:       3419            movew %a1@+,%d2
>   464542:       0e58 2800       movesw %d2,%a0@+
>   464546:       0801 0000       btst #0,%d1
>   46454a:       6706            beqs 464552 <__generic_copy_to_user+0x4e>
>   46454c:       1419            moveb %a1@+,%d2
>   46454e:       0e18 2800       movesb %d2,%a0@+
>   464552:       242e fff8       movel %fp@(-8),%d2
>   464556:       262e fffc       movel %fp@(-4),%d3
>   46455a:       4e5e            unlk %fp
>   46455c:       4e75            rts
>   46455e:       4e75            rts
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 1/2] m68k: Handle __generic_copy_to_user faults more carefully
  2024-04-25  4:16   ` Finn Thain
                       ` (2 preceding siblings ...)
  2024-04-25  8:20     ` Michael Schmitz
@ 2024-04-25 19:15     ` Michael Schmitz
  2024-04-26  1:00       ` Finn Thain
  3 siblings, 1 reply; 20+ messages in thread
From: Michael Schmitz @ 2024-04-25 19:15 UTC (permalink / raw)
  To: Finn Thain; +Cc: geert, linux-m68k

Hi Finn,

On 25/04/24 16:16, Finn Thain wrote:
> 00464504 <__generic_copy_to_user>:
>    464504:       4e56 0000       linkw %fp,#0
>    464508:       2f03            movel %d3,%sp@-
>    46450a:       2f02            movel %d2,%sp@-
>    46450c:       262e 0008       movel %fp@(8),%d3
>    464510:       242e 0010       movel %fp@(16),%d2
>    464514:       2f02            movel %d2,%sp@-
>    464516:       2f03            movel %d3,%sp@-
>    464518:       4eb9 0046 44c6  jsr 4644c6 <__clear_user>

Not sure you noticed this - the 040 passed __clear_user without fault. 
We managed to test this one without meaning to. Exception handling in 
there appears to work OK (for the cases we're testing).

No idea why you have the __clear_user call occur within 
__generic_copy_to_user - it does not appear in my disassembly.

Cheers,

     Michael

>    46451e:       2002            movel %d2,%d0
>    464520:       e488            lsrl #2,%d0
>    464522:       7203            moveq #3,%d1
>    464524:       c282            andl %d2,%d1
>    464526:       226e 000c       moveal %fp@(12),%a1
>    46452a:       2043            moveal %d3,%a0
>    46452c:       4a80            tstl %d0
>    46452e:       670a            beqs 46453a <__generic_copy_to_user+0x36>
>    464530:       2419            movel %a1@+,%d2
>    464532:       0e98 2800       movesl %d2,%a0@+
>    464536:       5380            subql #1,%d0
>    464538:       66f6            bnes 464530 <__generic_copy_to_user+0x2c>
>    46453a:       0801 0001       btst #1,%d1
>    46453e:       6706            beqs 464546 <__generic_copy_to_user+0x42>
>    464540:       3419            movew %a1@+,%d2
>    464542:       0e58 2800       movesw %d2,%a0@+
>    464546:       0801 0000       btst #0,%d1
>    46454a:       6706            beqs 464552 <__generic_copy_to_user+0x4e>
>    46454c:       1419            moveb %a1@+,%d2
>    46454e:       0e18 2800       movesb %d2,%a0@+
>    464552:       242e fff8       movel %fp@(-8),%d2
>    464556:       262e fffc       movel %fp@(-4),%d3
>    46455a:       4e5e            unlk %fp
>    46455c:       4e75            rts
>    46455e:       4e75            rts
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 1/2] m68k: Handle __generic_copy_to_user faults more carefully
  2024-04-25 19:15     ` Michael Schmitz
@ 2024-04-26  1:00       ` Finn Thain
  2024-04-26  1:22         ` Michael Schmitz
  0 siblings, 1 reply; 20+ messages in thread
From: Finn Thain @ 2024-04-26  1:00 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: geert, linux-m68k


On Fri, 26 Apr 2024, Michael Schmitz wrote:

> Not sure you noticed this - the 040 passed __clear_user without fault. 
> We managed to test this one without meaning to. Exception handling in 
> there appears to work OK (for the cases we're testing).
> 
> No idea why you have the __clear_user call occur within 
> __generic_copy_to_user - it does not appear in my disassembly.
> 

I'm afraid I neglected to mention that I added the patch below in order to 
exercise that code path.

diff --git a/arch/m68k/lib/uaccess.c b/arch/m68k/lib/uaccess.c
index ef761fc10981..1c9a24a0b554 100644
--- a/arch/m68k/lib/uaccess.c
+++ b/arch/m68k/lib/uaccess.c
@@ -58,6 +58,8 @@ unsigned long __generic_copy_to_user(void __user *to, const void *from,
 {
 	unsigned long tmp, res;
 
+	__clear_user(to, n);
+
 	asm volatile ("\n"
 		"	tst.l	%0\n"
 		"	jeq	5f\n"

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 1/2] m68k: Handle __generic_copy_to_user faults more carefully
  2024-04-26  1:00       ` Finn Thain
@ 2024-04-26  1:22         ` Michael Schmitz
  2024-04-26  7:10           ` Michael Schmitz
  0 siblings, 1 reply; 20+ messages in thread
From: Michael Schmitz @ 2024-04-26  1:22 UTC (permalink / raw)
  To: Finn Thain; +Cc: geert, linux-m68k

Hi Finn,

yes, that would explain that.

Using a start address of badpage-4 and path '/tmp' or '/temp' in order 
to use either the movesw or movesb branches of the code (and force a 
fault on the first byte in the movesw case), I see no more Oops. Still 
have to test forcing the fault on the second byte of a movesw (making it 
a misaligned access again).

Cheers,

     Michael

On 26/04/24 13:00, Finn Thain wrote:

> On Fri, 26 Apr 2024, Michael Schmitz wrote:
>
>> Not sure you noticed this - the 040 passed __clear_user without fault.
>> We managed to test this one without meaning to. Exception handling in
>> there appears to work OK (for the cases we're testing).
>>
>> No idea why you have the __clear_user call occur within
>> __generic_copy_to_user - it does not appear in my disassembly.
>>
> I'm afraid I neglected to mention that I added the patch below in order to
> exercise that code path.
>
> diff --git a/arch/m68k/lib/uaccess.c b/arch/m68k/lib/uaccess.c
> index ef761fc10981..1c9a24a0b554 100644
> --- a/arch/m68k/lib/uaccess.c
> +++ b/arch/m68k/lib/uaccess.c
> @@ -58,6 +58,8 @@ unsigned long __generic_copy_to_user(void __user *to, const void *from,
>   {
>   	unsigned long tmp, res;
>   
> +	__clear_user(to, n);
> +
>   	asm volatile ("\n"
>   		"	tst.l	%0\n"
>   		"	jeq	5f\n"

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 1/2] m68k: Handle __generic_copy_to_user faults more carefully
  2024-04-26  1:22         ` Michael Schmitz
@ 2024-04-26  7:10           ` Michael Schmitz
  2024-04-26  7:57             ` Finn Thain
  2024-04-26  7:58             ` Finn Thain
  0 siblings, 2 replies; 20+ messages in thread
From: Michael Schmitz @ 2024-04-26  7:10 UTC (permalink / raw)
  To: Finn Thain; +Cc: geert, linux-m68k

[-- Attachment #1: Type: text/plain, Size: 838 bytes --]

Hi Finn,

Am 26.04.2024 um 13:22 schrieb Michael Schmitz:
> Hi Finn,
>
> yes, that would explain that.
>
> Using a start address of badpage-4 and path '/tmp' or '/temp' in order
> to use either the movesw or movesb branches of the code (and force a
> fault on the first byte in the movesw case), I see no more Oops. Still
> have to test forcing the fault on the second byte of a movesw (making it
> a misaligned access again).

Similar tests with start address five or six bytes before the start of 
the unmapped page, and corresponding path length to be returned by 
getcwd have shown no more Oops on 030 using the attached corrected 
version of my patch.

Please give that some testing if you can, and (hoping it won't show any 
new faults on 040) I'll post another version of the series with your 
Tested-by added.

Cheers,

	Michael


[-- Attachment #2: 0001-m68k-Handle-__generic_copy_to_user-faults-more-caref.patch --]
[-- Type: text/x-diff, Size: 4378 bytes --]

From d91cf6c8d282e61e57c03e9614ed64eecce54e10 Mon Sep 17 00:00:00 2001
From: Michael Schmitz <schmitzmic@gmail.com>
Date: Wed, 17 Apr 2024 08:47:55 +1200
Subject: [PATCH 1/2] m68k: Handle __generic_copy_to_user faults more carefully

As mentioned by Finn Thain in his patch to improve put_user
exception handling on 040, a similar problem exists on 030
processors.

A moves instruction that crosses a page boundary from a
mapped page into an unmapped one will cause a mid-instruction
bus error exception (frame format b), with the PC pointing
(usually) two instructions past the faulting movesl instruction.

Our exception handling in __generic_copy_to_user only covers
the instruction immediately following the faulting one. As
a result, fixup_exception in send_fault_sig does not detect
this case, and cause send_fault_sig to oops.

Extend the exception table to cover one additional instruction
beyond the moves[lwb] instructions.

Tested on 68030 (Atari Falcon 030) with transfers beginning
at one to six bytes offset from the end of a mapped page,
followed by further bytes on an unmapped page (testcase
derived from stress-ng sysbadaddr stressor by Finn Thain).

Tested on 68040 (Mac Quadra) by Finn Thain.

A similar problem exists in __clear_user(); modify the exception
table for that function in the same way (tested by Finn Thain).

Cc: Finn Thain <fthain@linux-m68k.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: linux-m68k@lists.linux-m68k.org
Link: https://lore.kernel.org/all/e0f23460779e6d16e2633486ac4841790ef2aca0.1713176294.git.fthain@linux-m68k.org
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>

---

Changes from RFC v2:

Finn Thain:

- add missing extension table entries and final NOP after
  faults in 040 tests

Changes from RFC v1:

Michael Schmitz:

- use extended exception table instead of additional NOPs
---
 arch/m68k/lib/uaccess.c | 55 ++++++++++++++++++++++++-----------------
 1 file changed, 32 insertions(+), 23 deletions(-)

diff --git a/arch/m68k/lib/uaccess.c b/arch/m68k/lib/uaccess.c
index 7646e461aa62..1ad4be5f90e9 100644
--- a/arch/m68k/lib/uaccess.c
+++ b/arch/m68k/lib/uaccess.c
@@ -60,35 +60,42 @@ unsigned long __generic_copy_to_user(void __user *to, const void *from,
 
 	asm volatile ("\n"
 		"	tst.l	%0\n"
-		"	jeq	4f\n"
+		"	jeq	5f\n"
 		"1:	move.l	(%1)+,%3\n"
 		"2:	"MOVES".l	%3,(%2)+\n"
 		"3:	subq.l	#1,%0\n"
-		"	jne	1b\n"
-		"4:	btst	#1,%5\n"
-		"	jeq	6f\n"
-		"	move.w	(%1)+,%3\n"
-		"5:	"MOVES".w	%3,(%2)+\n"
-		"6:	btst	#0,%5\n"
+		"4:	jne	1b\n"
+		"5:	btst	#1,%5\n"
 		"	jeq	8f\n"
-		"	move.b	(%1)+,%3\n"
-		"7:	"MOVES".b  %3,(%2)+\n"
-		"8:\n"
+		"6:	move.w	(%1)+,%3\n"
+		"7:	"MOVES".w	%3,(%2)+\n"
+		"8:	btst	#0,%5\n"
+		"9:	jeq	13f\n"
+		"10:	move.b	(%1)+,%3\n"
+		"11:	"MOVES".b  %3,(%2)+\n"
+		"12:	nop\n"
+		"13:\n"
 		"	.section .fixup,\"ax\"\n"
 		"	.even\n"
 		"20:	lsl.l	#2,%0\n"
 		"50:	add.l	%5,%0\n"
-		"	jra	8b\n"
+		"	jra	13b\n"
 		"	.previous\n"
 		"\n"
 		"	.section __ex_table,\"a\"\n"
 		"	.align	4\n"
+		"	.long	1b,20b\n"
 		"	.long	2b,20b\n"
 		"	.long	3b,20b\n"
-		"	.long	5b,50b\n"
+		"	.long	4b,20b\n"
+		"	.long	5b,20b\n"
 		"	.long	6b,50b\n"
 		"	.long	7b,50b\n"
 		"	.long	8b,50b\n"
+		"	.long	9b,50b\n"
+		"	.long	10b,50b\n"
+		"	.long	11b,50b\n"
+		"	.long	12b,50b\n"
 		"	.previous"
 		: "=d" (res), "+a" (from), "+a" (to), "=&d" (tmp)
 		: "0" (n / 4), "d" (n & 3));
@@ -107,32 +114,34 @@ unsigned long __clear_user(void __user *to, unsigned long n)
 
 	asm volatile ("\n"
 		"	tst.l	%0\n"
-		"	jeq	3f\n"
+		"	jeq	4f\n"
 		"1:	"MOVES".l	%2,(%1)+\n"
 		"2:	subq.l	#1,%0\n"
-		"	jne	1b\n"
-		"3:	btst	#1,%4\n"
-		"	jeq	5f\n"
-		"4:	"MOVES".w	%2,(%1)+\n"
-		"5:	btst	#0,%4\n"
-		"	jeq	7f\n"
-		"6:	"MOVES".b	%2,(%1)\n"
-		"7:\n"
+		"3:	jne	1b\n"
+		"4:	btst	#1,%4\n"
+		"	jeq	6f\n"
+		"5:	"MOVES".w	%2,(%1)+\n"
+		"6:	btst	#0,%4\n"
+		"7:	jeq	9f\n"
+		"8:	"MOVES".b	%2,(%1)\n"
+		"9:\n"
 		"	.section .fixup,\"ax\"\n"
 		"	.even\n"
 		"10:	lsl.l	#2,%0\n"
 		"40:	add.l	%4,%0\n"
-		"	jra	7b\n"
+		"	jra	9b\n"
 		"	.previous\n"
 		"\n"
 		"	.section __ex_table,\"a\"\n"
 		"	.align	4\n"
 		"	.long	1b,10b\n"
 		"	.long	2b,10b\n"
-		"	.long	4b,40b\n"
+		"	.long	3b,10b\n"
 		"	.long	5b,40b\n"
 		"	.long	6b,40b\n"
 		"	.long	7b,40b\n"
+		"	.long	8b,40b\n"
+		"	.long	9b,40b\n"
 		"	.previous"
 		: "=d" (res), "+a" (to)
 		: "d" (0), "0" (n / 4), "d" (n & 3));
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 1/2] m68k: Handle __generic_copy_to_user faults more carefully
  2024-04-26  7:10           ` Michael Schmitz
@ 2024-04-26  7:57             ` Finn Thain
  2024-04-26  8:31               ` Michael Schmitz
  2024-04-26  7:58             ` Finn Thain
  1 sibling, 1 reply; 20+ messages in thread
From: Finn Thain @ 2024-04-26  7:57 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: geert, linux-m68k


On Fri, 26 Apr 2024, Michael Schmitz wrote:

> Similar tests with start address five or six bytes before the start of 
> the unmapped page, and corresponding path length to be returned by 
> getcwd have shown no more Oops on 030 using the attached corrected 
> version of my patch.
> 
> Please give that some testing if you can, and (hoping it won't show any 
> new faults on 040) I'll post another version of the series with your 
> Tested-by added.
> 

I will test it tomorrow. I expect that a NOP is needed at the end of 
__clear_user.

BTW, since you're changing this line, I think there should be a tab here 
rather than two spaces:

+               "11:    "MOVES".b  %3,(%2)+\n"


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 1/2] m68k: Handle __generic_copy_to_user faults more carefully
  2024-04-26  7:10           ` Michael Schmitz
  2024-04-26  7:57             ` Finn Thain
@ 2024-04-26  7:58             ` Finn Thain
  2024-04-27  1:44               ` Finn Thain
  1 sibling, 1 reply; 20+ messages in thread
From: Finn Thain @ 2024-04-26  7:58 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: geert, linux-m68k


On Fri, 26 Apr 2024, Michael Schmitz wrote:

> Similar tests with start address five or six bytes before the start of 
> the unmapped page, and corresponding path length to be returned by 
> getcwd have shown no more Oops on 030 using the attached corrected 
> version of my patch.
> 
> Please give that some testing if you can, and (hoping it won't show any 
> new faults on 040) I'll post another version of the series with your 
> Tested-by added.
> 

I will test it tomorrow. I expect that a NOP is needed at the end of 
__clear_user.

BTW, since you're changing this line, I think there should be a tab here 
rather than two spaces:

+               "11:    "MOVES".b  %3,(%2)+\n"


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 1/2] m68k: Handle __generic_copy_to_user faults more carefully
  2024-04-26  7:57             ` Finn Thain
@ 2024-04-26  8:31               ` Michael Schmitz
  0 siblings, 0 replies; 20+ messages in thread
From: Michael Schmitz @ 2024-04-26  8:31 UTC (permalink / raw)
  To: Finn Thain; +Cc: geert, linux-m68k

Hi Finn,

Am 26.04.2024 um 19:57 schrieb Finn Thain:
>
> On Fri, 26 Apr 2024, Michael Schmitz wrote:
>
>> Similar tests with start address five or six bytes before the start of
>> the unmapped page, and corresponding path length to be returned by
>> getcwd have shown no more Oops on 030 using the attached corrected
>> version of my patch.
>>
>> Please give that some testing if you can, and (hoping it won't show any
>> new faults on 040) I'll post another version of the series with your
>> Tested-by added.
>>
>
> I will test it tomorrow. I expect that a NOP is needed at the end of
> __clear_user.

Thanks! I'll add __clear_user at the start of __generic_copy_to_user for 
my tests to check that.

> BTW, since you're changing this line, I think there should be a tab here
> rather than two spaces:
>
> +               "11:    "MOVES".b  %3,(%2)+\n"

Well spotted - I'll change that while I'm at it.

Cheers,

	Michael

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 1/2] m68k: Handle __generic_copy_to_user faults more carefully
  2024-04-26  7:58             ` Finn Thain
@ 2024-04-27  1:44               ` Finn Thain
  2024-04-27  4:41                 ` Michael Schmitz
  0 siblings, 1 reply; 20+ messages in thread
From: Finn Thain @ 2024-04-27  1:44 UTC (permalink / raw)
  To: Michael Schmitz; +Cc: geert, linux-m68k


On Fri, 26 Apr 2024, I wrote:

> I will test it tomorrow. I expect that a NOP is needed at the end of 
> __clear_user.
> 

Everything seems to work now.

I tested this series together with my own RFC patch on a Quadra (68040) 
and IIci (68030). My tests covered getcwd and llseek syscalls. I tried a 
build with the extra call to __clear_user() and a build without it. And I 
covered a range of working directory names.

Tested-by: Finn Thain <fthain@linux-m68k.org>

So apparently there'e no need for a NOP at the end of __clear_user() -- on 
'040 and '030 at least.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH RFC v2 1/2] m68k: Handle __generic_copy_to_user faults more carefully
  2024-04-27  1:44               ` Finn Thain
@ 2024-04-27  4:41                 ` Michael Schmitz
  0 siblings, 0 replies; 20+ messages in thread
From: Michael Schmitz @ 2024-04-27  4:41 UTC (permalink / raw)
  To: Finn Thain; +Cc: geert, linux-m68k

Hi Finn,

thanks - tests (including __clear_user) with my 030 did show one Oops in 
the same place that __generic_copy_to_user did last fault, that is, the 
btst after the movesl instruction. Adding that instruction to the 
exception table avoids the Oops.

A range of path lengths as well as offsets tested. Still some tests 
pending... Couldn't find any evidence that a NOP is needed either.

I'll post the final version as soon as I've completed testing.

Cheers,

	Michael


Am 27.04.2024 um 13:44 schrieb Finn Thain:
>
> On Fri, 26 Apr 2024, I wrote:
>
>> I will test it tomorrow. I expect that a NOP is needed at the end of
>> __clear_user.
>>
>
> Everything seems to work now.
>
> I tested this series together with my own RFC patch on a Quadra (68040)
> and IIci (68030). My tests covered getcwd and llseek syscalls. I tried a
> build with the extra call to __clear_user() and a build without it. And I
> covered a range of working directory names.
>
> Tested-by: Finn Thain <fthain@linux-m68k.org>
>
> So apparently there'e no need for a NOP at the end of __clear_user() -- on
> '040 and '030 at least.
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2024-04-27  4:41 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-22  2:29 [PATCH RFC v2 0/2] m68k uaccess fault handling fixes Michael Schmitz
2024-04-22  2:29 ` [PATCH RFC v2 1/2] m68k: Handle __generic_copy_to_user faults more carefully Michael Schmitz
2024-04-25  4:16   ` Finn Thain
2024-04-25  5:32     ` Michael Schmitz
2024-04-25  6:32       ` Finn Thain
2024-04-25  7:52         ` Michael Schmitz
2024-04-25  5:45     ` Michael Schmitz
2024-04-25  6:47       ` Finn Thain
2024-04-25  7:43         ` Michael Schmitz
2024-04-25  8:20     ` Michael Schmitz
2024-04-25 19:15     ` Michael Schmitz
2024-04-26  1:00       ` Finn Thain
2024-04-26  1:22         ` Michael Schmitz
2024-04-26  7:10           ` Michael Schmitz
2024-04-26  7:57             ` Finn Thain
2024-04-26  8:31               ` Michael Schmitz
2024-04-26  7:58             ` Finn Thain
2024-04-27  1:44               ` Finn Thain
2024-04-27  4:41                 ` Michael Schmitz
2024-04-22  2:29 ` [PATCH RFC v2 2/2] m68k: improve __constant_copy_to_user_asm() fault handling Michael Schmitz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).