All of lore.kernel.org
 help / color / mirror / Atom feed
* CONFIG_OPTIMIZE_INLINING breaks atomic64 test on arm64
@ 2019-05-28 21:42 Laura Abbott
  2019-05-29 11:21 ` Mark Rutland
  2019-05-29 11:29 ` Will Deacon
  0 siblings, 2 replies; 7+ messages in thread
From: Laura Abbott @ 2019-05-28 21:42 UTC (permalink / raw)
  To: linux-arm-kernel, Will Deacon, Ard Biesheuvel; +Cc: Masahiro Yamada

Hi,

CONFIG_OPTIMIZE_INLINING is a selectable option on arm64 now. It currently
triggers a bug on the CONFIG_ATOMIC64_SELFTEST:

[    4.521551] ------------[ cut here ]------------
[    4.521763] kernel BUG at lib/atomic64_test.c:220!
[    4.522059] Internal error: Oops - BUG: 0 [#1] SMP
[    4.522456] Modules linked in:
[    4.522860] Process swapper/0 (pid: 1, stack limit = 0x(____ptrval____))
[    4.523287] CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.2.0-rc1 #6
[    4.523462] Hardware name: linux,dummy-virt (DT)
[    4.523779] pstate: a0000005 (NzCv daif -PAN -UAO)
[    4.524231] pc : test_atomic64+0x1270/0x13a4
[    4.524392] lr : test_atomic64+0x1258/0x13a4
[    4.524571] sp : ffff000011f2fd60
[    4.524755] x29: ffff000011f2fd60 x28: 0000000000000000
[    4.525397] x27: 0000000000000000 x26: ffff000010d105c4
[    4.525660] x25: ffff000010de10a0 x24: 0000000000000006
[    4.525821] x23: aaa31337c001d00e x22: bbb42448e223f22f
[    4.526032] x21: aaa31337c001d00c x20: 999202269ddfadeb
[    4.526180] x19: aaa31337c001d00d x18: 0000000000000001
[    4.526324] x17: 0000000000000000 x16: 1111111122222221
[    4.526466] x15: ffff000010b838f0 x14: ffff80003fcf28c8
[    4.526609] x13: 0000000000000000 x12: 0000000000000004
[    4.526765] x11: 0000000000000000 x10: 0000000000000c80
[    4.526940] x9 : 0000000000000000 x8 : ffff800020f12b00
[    4.527090] x7 : deadbeefdeafcafe x6 : aaa31337c001d00d
[    4.527234] x5 : ffff000011f2fda0 x4 : deadbeefdeafcafe
[    4.527378] x3 : aaa31337c001d00d x2 : 1111111122222222
[    4.527522] x1 : 1111111122222221 x0 : ffff000011f2fda0
[    4.527759] Call trace:
[    4.527923]  test_atomic64+0x1270/0x13a4
[    4.528047]  test_atomics_init+0x10/0x28
[    4.528162]  do_one_initcall+0x54/0x230
[    4.528273]  kernel_init_freeable+0x1cc/0x278
[    4.528397]  kernel_init+0x18/0x108
[    4.528510]  ret_from_fork+0x10/0x18
[    4.528827] Code: f2c22221 f2e22221 eb01001f 54000040 (d4210000)
[    4.529647] ---[ end trace e08e679056f5b7ee ]---

Snippet of assembly from test_atomic64:

ffff000010de64cc:       d2844442        mov     x2, #0x2222                     // #8738
ffff000010de64d0:       910103e0        add     x0, sp, #0x40
ffff000010de64d4:       f2a44442        movk    x2, #0x2222, lsl #16
ffff000010de64d8:       f2c22222        movk    x2, #0x1111, lsl #32
ffff000010de64dc:       f2e22222        movk    x2, #0x1111, lsl #48
ffff000010de64e0:       f90023e2        str     x2, [sp, #64]
ffff000010de64e4:       97daa73b        bl      ffff0000104901d0 <arch_atomic64_dec_if_positive>
ffff000010de64e8:       d2844421        mov     x1, #0x2221                     // #8737
ffff000010de64ec:       f2a44441        movk    x1, #0x2222, lsl #16
ffff000010de64f0:       f2c22221        movk    x1, #0x1111, lsl #32
ffff000010de64f4:       f2e22221        movk    x1, #0x1111, lsl #48
ffff000010de64f8:       eb01001f        cmp     x0, x1
ffff000010de64fc:       54000040        b.eq    ffff000010de6504 <test_atomic64+0x1274>  // b.none
ffff000010de6500:       d4210000        brk     #0x800

The value returned from arch_atomic64_dec_if_positive looks like
a stack value

ffff0000104901d0 <arch_atomic64_dec_if_positive>:
ffff0000104901d0:       a9bf7bfd        stp     x29, x30, [sp, #-16]!
ffff0000104901d4:       aa0003e1        mov     x1, x0
ffff0000104901d8:       910003fd        mov     x29, sp
ffff0000104901dc:       9412c9d5        bl      ffff000010942930 <__ll_sc_arch_atomic64_dec_if_positive>
ffff0000104901e0:       d503201f        nop
ffff0000104901e4:       d503201f        nop
ffff0000104901e8:       d503201f        nop
ffff0000104901ec:       d503201f        nop
ffff0000104901f0:       d503201f        nop
ffff0000104901f4:       d503201f        nop
ffff0000104901f8:       aa0103e0        mov     x0, x1
ffff0000104901fc:       a8c17bfd        ldp     x29, x30, [sp], #16
ffff000010490200:       d65f03c0        ret

...which seems to be coming from this buggy looking code.

I can generate this with just defconfig + CONFIG_OPTIMIZE_INLINING + CONFIG_ATOMIC64_SELFTEST
on gcc8 and gcc9. This is also incredibly easy to work around and I expect
there are other issues with CONFIG_OPTIMIZE_INLINING but I figured I'd
report this for good measure.

Thanks,
Laura


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: CONFIG_OPTIMIZE_INLINING breaks atomic64 test on arm64
  2019-05-28 21:42 CONFIG_OPTIMIZE_INLINING breaks atomic64 test on arm64 Laura Abbott
@ 2019-05-29 11:21 ` Mark Rutland
  2019-05-29 11:28   ` Mark Rutland
  2019-05-29 11:29 ` Will Deacon
  1 sibling, 1 reply; 7+ messages in thread
From: Mark Rutland @ 2019-05-29 11:21 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Masahiro Yamada, Will Deacon, linux-arm-kernel, Ard Biesheuvel

Hi Laura,

On Tue, May 28, 2019 at 05:42:04PM -0400, Laura Abbott wrote:
> Hi,
> 
> CONFIG_OPTIMIZE_INLINING is a selectable option on arm64 now. It currently
> triggers a bug on the CONFIG_ATOMIC64_SELFTEST:
> 
> [    4.521551] ------------[ cut here ]------------
> [    4.521763] kernel BUG at lib/atomic64_test.c:220!
> [    4.522059] Internal error: Oops - BUG: 0 [#1] SMP
> [    4.522456] Modules linked in:
> [    4.522860] Process swapper/0 (pid: 1, stack limit = 0x(____ptrval____))
> [    4.523287] CPU: 4 PID: 1 Comm: swapper/0 Not tainted 5.2.0-rc1 #6
> [    4.523462] Hardware name: linux,dummy-virt (DT)
> [    4.523779] pstate: a0000005 (NzCv daif -PAN -UAO)
> [    4.524231] pc : test_atomic64+0x1270/0x13a4
> [    4.524392] lr : test_atomic64+0x1258/0x13a4
> [    4.524571] sp : ffff000011f2fd60
> [    4.524755] x29: ffff000011f2fd60 x28: 0000000000000000
> [    4.525397] x27: 0000000000000000 x26: ffff000010d105c4
> [    4.525660] x25: ffff000010de10a0 x24: 0000000000000006
> [    4.525821] x23: aaa31337c001d00e x22: bbb42448e223f22f
> [    4.526032] x21: aaa31337c001d00c x20: 999202269ddfadeb
> [    4.526180] x19: aaa31337c001d00d x18: 0000000000000001
> [    4.526324] x17: 0000000000000000 x16: 1111111122222221
> [    4.526466] x15: ffff000010b838f0 x14: ffff80003fcf28c8
> [    4.526609] x13: 0000000000000000 x12: 0000000000000004
> [    4.526765] x11: 0000000000000000 x10: 0000000000000c80
> [    4.526940] x9 : 0000000000000000 x8 : ffff800020f12b00
> [    4.527090] x7 : deadbeefdeafcafe x6 : aaa31337c001d00d
> [    4.527234] x5 : ffff000011f2fda0 x4 : deadbeefdeafcafe
> [    4.527378] x3 : aaa31337c001d00d x2 : 1111111122222222
> [    4.527522] x1 : 1111111122222221 x0 : ffff000011f2fda0
> [    4.527759] Call trace:
> [    4.527923]  test_atomic64+0x1270/0x13a4
> [    4.528047]  test_atomics_init+0x10/0x28
> [    4.528162]  do_one_initcall+0x54/0x230
> [    4.528273]  kernel_init_freeable+0x1cc/0x278
> [    4.528397]  kernel_init+0x18/0x108
> [    4.528510]  ret_from_fork+0x10/0x18
> [    4.528827] Code: f2c22221 f2e22221 eb01001f 54000040 (d4210000)
> [    4.529647] ---[ end trace e08e679056f5b7ee ]---

Thanks for the report.

FWIW I can reproduce this locally, after hacking cpufeature.c to not
detect the LSE atomics on my test platform.

> Snippet of assembly from test_atomic64:
> 
> ffff000010de64cc:       d2844442        mov     x2, #0x2222                     // #8738
> ffff000010de64d0:       910103e0        add     x0, sp, #0x40
> ffff000010de64d4:       f2a44442        movk    x2, #0x2222, lsl #16
> ffff000010de64d8:       f2c22222        movk    x2, #0x1111, lsl #32
> ffff000010de64dc:       f2e22222        movk    x2, #0x1111, lsl #48
> ffff000010de64e0:       f90023e2        str     x2, [sp, #64]
> ffff000010de64e4:       97daa73b        bl      ffff0000104901d0 <arch_atomic64_dec_if_positive>
> ffff000010de64e8:       d2844421        mov     x1, #0x2221                     // #8737
> ffff000010de64ec:       f2a44441        movk    x1, #0x2222, lsl #16
> ffff000010de64f0:       f2c22221        movk    x1, #0x1111, lsl #32
> ffff000010de64f4:       f2e22221        movk    x1, #0x1111, lsl #48
> ffff000010de64f8:       eb01001f        cmp     x0, x1
> ffff000010de64fc:       54000040        b.eq    ffff000010de6504 <test_atomic64+0x1274>  // b.none
> ffff000010de6500:       d4210000        brk     #0x800
> 
> The value returned from arch_atomic64_dec_if_positive looks like
> a stack value
>
> ffff0000104901d0 <arch_atomic64_dec_if_positive>:
> ffff0000104901d0:       a9bf7bfd        stp     x29, x30, [sp, #-16]!
> ffff0000104901d4:       aa0003e1        mov     x1, x0
> ffff0000104901d8:       910003fd        mov     x29, sp
> ffff0000104901dc:       9412c9d5        bl      ffff000010942930 <__ll_sc_arch_atomic64_dec_if_positive>
> ffff0000104901e0:       d503201f        nop
> ffff0000104901e4:       d503201f        nop
> ffff0000104901e8:       d503201f        nop
> ffff0000104901ec:       d503201f        nop
> ffff0000104901f0:       d503201f        nop
> ffff0000104901f4:       d503201f        nop
> ffff0000104901f8:       aa0103e0        mov     x0, x1
> ffff0000104901fc:       a8c17bfd        ldp     x29, x30, [sp], #16
> ffff000010490200:       d65f03c0        ret
> 
> ...which seems to be coming from this buggy looking code.

Ouch, so it's trying to save/restore x0, and returning the original
value of the atomic64_t *v argument.

AFAICT, GCC *shouldn't* be doing that, since we have a "=&r" constraint
on a register variable in x0, which we use as the return value:
 
static inline long arch_atomic64_dec_if_positive(atomic64_t *v)
{
	register long x0 asm ("x0") = (long)v;

	asm volatile(ARM64_LSE_ATOMIC_INSN(
	/* LL/SC */
	__LL_SC_ATOMIC64(dec_if_positive)
	__nops(6),
	/* LSE atomics */
	"1:     ldr     x30, %[v]\n"
	"       subs    %[ret], x30, #1\n"
	"       b.lt    2f\n"
	"       casal   x30, %[ret], %[v]\n"
	"       sub     x30, x30, #1\n"
	"       sub     x30, x30, %[ret]\n"
	"       cbnz    x30, 1b\n"
	"2:")
	: [ret] "+&r" (x0), [v] "+Q" (v->counter)
	:
	: __LL_SC_CLOBBERS, "cc", "memory");

	return x0;
}

... so this smells like a GCC bug.

I *think* GCC places the "x0" variable into x1 despite it being a local
register variable that should be in x0, and GCC places v in x0.

That would explain why this works for LSE even with the shuffle back,
since the asm would use x1 for ret.

I haven't managed to come up with a test-case that proves that, though,
and I don't knwo how we could bodge around that.

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: CONFIG_OPTIMIZE_INLINING breaks atomic64 test on arm64
  2019-05-29 11:21 ` Mark Rutland
@ 2019-05-29 11:28   ` Mark Rutland
  0 siblings, 0 replies; 7+ messages in thread
From: Mark Rutland @ 2019-05-29 11:28 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Masahiro Yamada, Will Deacon, linux-arm-kernel, Ard Biesheuvel

On Wed, May 29, 2019 at 12:21:42PM +0100, Mark Rutland wrote:
> On Tue, May 28, 2019 at 05:42:04PM -0400, Laura Abbott wrote:
> > The value returned from arch_atomic64_dec_if_positive looks like
> > a stack value
> >
> > ffff0000104901d0 <arch_atomic64_dec_if_positive>:
> > ffff0000104901d0:       a9bf7bfd        stp     x29, x30, [sp, #-16]!
> > ffff0000104901d4:       aa0003e1        mov     x1, x0
> > ffff0000104901d8:       910003fd        mov     x29, sp
> > ffff0000104901dc:       9412c9d5        bl      ffff000010942930 <__ll_sc_arch_atomic64_dec_if_positive>
> > ffff0000104901e0:       d503201f        nop
> > ffff0000104901e4:       d503201f        nop
> > ffff0000104901e8:       d503201f        nop
> > ffff0000104901ec:       d503201f        nop
> > ffff0000104901f0:       d503201f        nop
> > ffff0000104901f4:       d503201f        nop
> > ffff0000104901f8:       aa0103e0        mov     x0, x1
> > ffff0000104901fc:       a8c17bfd        ldp     x29, x30, [sp], #16
> > ffff000010490200:       d65f03c0        ret
> > 
> > ...which seems to be coming from this buggy looking code.
> 
> Ouch, so it's trying to save/restore x0, and returning the original
> value of the atomic64_t *v argument.
> 
> AFAICT, GCC *shouldn't* be doing that, since we have a "=&r" constraint
> on a register variable in x0, which we use as the return value:
>  
> static inline long arch_atomic64_dec_if_positive(atomic64_t *v)
> {
> 	register long x0 asm ("x0") = (long)v;
> 
> 	asm volatile(ARM64_LSE_ATOMIC_INSN(
> 	/* LL/SC */
> 	__LL_SC_ATOMIC64(dec_if_positive)
> 	__nops(6),
> 	/* LSE atomics */
> 	"1:     ldr     x30, %[v]\n"
> 	"       subs    %[ret], x30, #1\n"
> 	"       b.lt    2f\n"
> 	"       casal   x30, %[ret], %[v]\n"
> 	"       sub     x30, x30, #1\n"
> 	"       sub     x30, x30, %[ret]\n"
> 	"       cbnz    x30, 1b\n"
> 	"2:")
> 	: [ret] "+&r" (x0), [v] "+Q" (v->counter)
> 	:
> 	: __LL_SC_CLOBBERS, "cc", "memory");
> 
> 	return x0;
> }
> 
> ... so this smells like a GCC bug.
> 
> I *think* GCC places the "x0" variable into x1 despite it being a local
> register variable that should be in x0, and GCC places v in x0.
> 
> That would explain why this works for LSE even with the shuffle back,
> since the asm would use x1 for ret.
> 
> I haven't managed to come up with a test-case that proves that, though,
> and I don't knwo how we could bodge around that.

With an __asmeq(), it looks like "x0" isn't in x0:

[mark@lakrids:~/src/linux]% git diff
diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h
index 9256a3921e4b..c3549f87dfe8 100644
--- a/arch/arm64/include/asm/atomic_lse.h
+++ b/arch/arm64/include/asm/atomic_lse.h
@@ -25,6 +25,8 @@
 #error "please don't include this file directly"
 #endif
 
+#define __asmeq(x, y)  ".ifnc " x "," y " ; .err ; .endif\n\t"
+
 #define __LL_SC_ATOMIC(op)     __LL_SC_CALL(arch_atomic_##op)
 #define ATOMIC_OP(op, asm_op)                                          \
 static inline void arch_atomic_##op(int i, atomic_t *v)                        \
@@ -423,6 +425,7 @@ static inline long arch_atomic64_dec_if_positive(atomic64_t *v)
        register long x0 asm ("x0") = (long)v;
 
        asm volatile(ARM64_LSE_ATOMIC_INSN(
+       __asmeq("%[ret]", "x0")
        /* LL/SC */
        __LL_SC_ATOMIC64(dec_if_positive)
        __nops(6),
[mark@lakrids:~/src/linux]% usekorg 8.1.0 make ARCH=arm64 CROSS_COMPILE=aarch64-linux- lib/atomic64_test.o
  CC      kernel/bounds.s
  CC      arch/arm64/kernel/asm-offsets.s
  CALL    scripts/checksyscalls.sh
  CALL    scripts/atomic/check-atomics.sh
  CC      lib/atomic64_test.o
/tmp/ccQ5P7I7.s: Assembler messages:
/tmp/ccQ5P7I7.s:39: Error: .err encountered
scripts/Makefile.build:278: recipe for target 'lib/atomic64_test.o' failed
make[1]: *** [lib/atomic64_test.o] Error 1
Makefile:1747: recipe for target 'lib/atomic64_test.o' failed
make: *** [lib/atomic64_test.o] Error 2

... and is actuall in x1:

[mark@lakrids:~/src/linux]% git diff                                                         
diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h
index 9256a3921e4b..d8b38c168a04 100644
--- a/arch/arm64/include/asm/atomic_lse.h
+++ b/arch/arm64/include/asm/atomic_lse.h
@@ -25,6 +25,8 @@
 #error "please don't include this file directly"
 #endif
 
+#define __asmeq(x, y)  ".ifnc " x "," y " ; .err ; .endif\n\t"
+
 #define __LL_SC_ATOMIC(op)     __LL_SC_CALL(arch_atomic_##op)
 #define ATOMIC_OP(op, asm_op)                                          \
 static inline void arch_atomic_##op(int i, atomic_t *v)                        \
@@ -423,6 +425,7 @@ static inline long arch_atomic64_dec_if_positive(atomic64_t *v)
        register long x0 asm ("x0") = (long)v;
 
        asm volatile(ARM64_LSE_ATOMIC_INSN(
+       __asmeq("%[ret]", "x1")
        /* LL/SC */
        __LL_SC_ATOMIC64(dec_if_positive)
        __nops(6),
[mark@lakrids:~/src/linux]% usekorg 8.1.0 make ARCH=arm64 CROSS_COMPILE=aarch64-linux- lib/atomic64_test.o
  CC      kernel/bounds.s
  CC      arch/arm64/kernel/asm-offsets.s
  CALL    scripts/checksyscalls.sh
  CALL    scripts/atomic/check-atomics.sh
  CC      lib/atomic64_test.o


... which is not fantastic. :/

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: CONFIG_OPTIMIZE_INLINING breaks atomic64 test on arm64
  2019-05-28 21:42 CONFIG_OPTIMIZE_INLINING breaks atomic64 test on arm64 Laura Abbott
  2019-05-29 11:21 ` Mark Rutland
@ 2019-05-29 11:29 ` Will Deacon
  2019-05-29 11:53   ` Russell King - ARM Linux admin
  2019-06-11 15:50   ` Will Deacon
  1 sibling, 2 replies; 7+ messages in thread
From: Will Deacon @ 2019-05-29 11:29 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Masahiro Yamada, Andrew.Murray, linux-arm-kernel, Ard Biesheuvel

Hi Laura,

Thanks for the report -- I've managed to reproduce the issue locally.

On Tue, May 28, 2019 at 05:42:04PM -0400, Laura Abbott wrote:
> CONFIG_OPTIMIZE_INLINING is a selectable option on arm64 now. It currently
> triggers a bug on the CONFIG_ATOMIC64_SELFTEST:

[...]

> ffff0000104901d0 <arch_atomic64_dec_if_positive>:
> ffff0000104901d0:       a9bf7bfd        stp     x29, x30, [sp, #-16]!
> ffff0000104901d4:       aa0003e1        mov     x1, x0
> ffff0000104901d8:       910003fd        mov     x29, sp
> ffff0000104901dc:       9412c9d5        bl      ffff000010942930 <__ll_sc_arch_atomic64_dec_if_positive>
> ffff0000104901e0:       d503201f        nop
> ffff0000104901e4:       d503201f        nop
> ffff0000104901e8:       d503201f        nop
> ffff0000104901ec:       d503201f        nop
> ffff0000104901f0:       d503201f        nop
> ffff0000104901f4:       d503201f        nop
> ffff0000104901f8:       aa0103e0        mov     x0, x1
> ffff0000104901fc:       a8c17bfd        ldp     x29, x30, [sp], #16
> ffff000010490200:       d65f03c0        ret
> 
> ...which seems to be coming from this buggy looking code.

The problem here is that GCC has allocated the '%[ret]' input operand in x1
for the out-of-line version of the function, despite us using the register
keyword explicitly:

static inline long arch_atomic64_dec_if_positive(atomic64_t *v)
{
	register long x0 asm ("x0") = (long)v;

	asm volatile(ARM64_LSE_ATOMIC_INSN(
	/* LL/SC */
	__LL_SC_ATOMIC64(dec_if_positive)
	__nops(6),
	/* LSE atomics */
	"1:	ldr	x30, %[v]\n"
	"	subs	%[ret], x30, #1\n"
	"	b.lt	2f\n"
	"	casal	x30, %[ret], %[v]\n"
	"	sub	x30, x30, #1\n"
	"	sub	x30, x30, %[ret]\n"
	"	cbnz	x30, 1b\n"
	"2:")
	: [ret] "+&r" (x0), [v] "+Q" (v->counter)
	:
	: __LL_SC_CLOBBERS, "cc", "memory");

	return x0;
}

My reading of:

  https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables

is that this should work, although it's fair to say that we are pulling
a bunch of other tricks in this code.

At this point, I'd be inclined to raise something in the GCC bugzilla
with the result of --save-temps for the failing atomic64_test.c. Maybe
they will have a better opinion as to whether we're doing something wrong
here. Is that something you're willing to do, or shall I do it?

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: CONFIG_OPTIMIZE_INLINING breaks atomic64 test on arm64
  2019-05-29 11:29 ` Will Deacon
@ 2019-05-29 11:53   ` Russell King - ARM Linux admin
  2019-06-11 15:50   ` Will Deacon
  1 sibling, 0 replies; 7+ messages in thread
From: Russell King - ARM Linux admin @ 2019-05-29 11:53 UTC (permalink / raw)
  To: Will Deacon
  Cc: Ard Biesheuvel, Masahiro Yamada, Laura Abbott, linux-arm-kernel,
	Andrew.Murray

On Wed, May 29, 2019 at 12:29:56PM +0100, Will Deacon wrote:
> Hi Laura,
> 
> Thanks for the report -- I've managed to reproduce the issue locally.
> 
> On Tue, May 28, 2019 at 05:42:04PM -0400, Laura Abbott wrote:
> > CONFIG_OPTIMIZE_INLINING is a selectable option on arm64 now. It currently
> > triggers a bug on the CONFIG_ATOMIC64_SELFTEST:
> 
> [...]
> 
> > ffff0000104901d0 <arch_atomic64_dec_if_positive>:
> > ffff0000104901d0:       a9bf7bfd        stp     x29, x30, [sp, #-16]!
> > ffff0000104901d4:       aa0003e1        mov     x1, x0
> > ffff0000104901d8:       910003fd        mov     x29, sp
> > ffff0000104901dc:       9412c9d5        bl      ffff000010942930 <__ll_sc_arch_atomic64_dec_if_positive>
> > ffff0000104901e0:       d503201f        nop
> > ffff0000104901e4:       d503201f        nop
> > ffff0000104901e8:       d503201f        nop
> > ffff0000104901ec:       d503201f        nop
> > ffff0000104901f0:       d503201f        nop
> > ffff0000104901f4:       d503201f        nop
> > ffff0000104901f8:       aa0103e0        mov     x0, x1
> > ffff0000104901fc:       a8c17bfd        ldp     x29, x30, [sp], #16
> > ffff000010490200:       d65f03c0        ret
> > 
> > ...which seems to be coming from this buggy looking code.
> 
> The problem here is that GCC has allocated the '%[ret]' input operand in x1
> for the out-of-line version of the function, despite us using the register
> keyword explicitly:
> 
> static inline long arch_atomic64_dec_if_positive(atomic64_t *v)
> {
> 	register long x0 asm ("x0") = (long)v;
> 
> 	asm volatile(ARM64_LSE_ATOMIC_INSN(
> 	/* LL/SC */
> 	__LL_SC_ATOMIC64(dec_if_positive)
> 	__nops(6),
> 	/* LSE atomics */
> 	"1:	ldr	x30, %[v]\n"
> 	"	subs	%[ret], x30, #1\n"
> 	"	b.lt	2f\n"
> 	"	casal	x30, %[ret], %[v]\n"
> 	"	sub	x30, x30, #1\n"
> 	"	sub	x30, x30, %[ret]\n"
> 	"	cbnz	x30, 1b\n"
> 	"2:")
> 	: [ret] "+&r" (x0), [v] "+Q" (v->counter)
> 	:
> 	: __LL_SC_CLOBBERS, "cc", "memory");
> 
> 	return x0;
> }
> 
> My reading of:
> 
>   https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables
> 
> is that this should work, although it's fair to say that we are pulling
> a bunch of other tricks in this code.

This is why 32-bit ARM has the __asmeq() macro, after a very similar bug
afflicted that toolchain.  Nico decided it wasn't worth the pain of
debugging these issues, so we added checks to inline assembly that
depends on registers in certain hardware registers.  That's kept us
safe.

I would encourage aarch64 to adopt this approach, especially as you
now have a similar instance of this.  It guarantees that the register
allocation for assembly code is in fact what you expect it to be.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: CONFIG_OPTIMIZE_INLINING breaks atomic64 test on arm64
  2019-05-29 11:29 ` Will Deacon
  2019-05-29 11:53   ` Russell King - ARM Linux admin
@ 2019-06-11 15:50   ` Will Deacon
  2019-06-11 16:13     ` Laura Abbott
  1 sibling, 1 reply; 7+ messages in thread
From: Will Deacon @ 2019-06-11 15:50 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Masahiro Yamada, Andrew.Murray, linux-arm-kernel, Ard Biesheuvel

Hi Laura,

On Wed, May 29, 2019 at 12:30:00PM +0100, Will Deacon wrote:
> On Tue, May 28, 2019 at 05:42:04PM -0400, Laura Abbott wrote:
> > CONFIG_OPTIMIZE_INLINING is a selectable option on arm64 now. It currently
> > triggers a bug on the CONFIG_ATOMIC64_SELFTEST:
> 
> At this point, I'd be inclined to raise something in the GCC bugzilla
> with the result of --save-temps for the failing atomic64_test.c. Maybe
> they will have a better opinion as to whether we're doing something wrong
> here. Is that something you're willing to do, or shall I do it?

Did you file a GCC bugzilla entry for this, or shall I go ahead and make
a new one (assuming I can remember my credentials for that thing)?

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: CONFIG_OPTIMIZE_INLINING breaks atomic64 test on arm64
  2019-06-11 15:50   ` Will Deacon
@ 2019-06-11 16:13     ` Laura Abbott
  0 siblings, 0 replies; 7+ messages in thread
From: Laura Abbott @ 2019-06-11 16:13 UTC (permalink / raw)
  To: Will Deacon
  Cc: Masahiro Yamada, Ard Biesheuvel, linux-arm-kernel, Andrew.Murray

On 6/11/19 11:50 AM, Will Deacon wrote:
> Hi Laura,
> 
> On Wed, May 29, 2019 at 12:30:00PM +0100, Will Deacon wrote:
>> On Tue, May 28, 2019 at 05:42:04PM -0400, Laura Abbott wrote:
>>> CONFIG_OPTIMIZE_INLINING is a selectable option on arm64 now. It currently
>>> triggers a bug on the CONFIG_ATOMIC64_SELFTEST:
>>
>> At this point, I'd be inclined to raise something in the GCC bugzilla
>> with the result of --save-temps for the failing atomic64_test.c. Maybe
>> they will have a better opinion as to whether we're doing something wrong
>> here. Is that something you're willing to do, or shall I do it?
> 
> Did you file a GCC bugzilla entry for this, or shall I go ahead and make
> a new one (assuming I can remember my credentials for that thing)?
> 
> Will
> 

I didn't file one, if you already have the account I'm happy to let you
do that since I don't actually have one (maybe I should though at the
rate I seem to stumble across bugs...)

Thanks,
Laura

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-06-11 16:13 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-28 21:42 CONFIG_OPTIMIZE_INLINING breaks atomic64 test on arm64 Laura Abbott
2019-05-29 11:21 ` Mark Rutland
2019-05-29 11:28   ` Mark Rutland
2019-05-29 11:29 ` Will Deacon
2019-05-29 11:53   ` Russell King - ARM Linux admin
2019-06-11 15:50   ` Will Deacon
2019-06-11 16:13     ` Laura Abbott

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.