From: Vincenzo Frascino <vincenzo.frascino@arm.com>
To: Dave Martin <Dave.Martin@arm.com>
Cc: linux-arch@vger.kernel.org, Shuah Khan <shuah@kernel.org>,
linux-mips@vger.kernel.org,
Andre Przywara <andre.przywara@arm.com>,
Arnd Bergmann <arnd@arndb.de>, Huw Davies <huw@codeweavers.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Daniel Lezcano <daniel.lezcano@linaro.org>,
Will Deacon <will.deacon@arm.com>,
linux-kernel@vger.kernel.org, Ralf Baechle <ralf@linux-mips.org>,
Mark Salyzyn <salyzyn@android.com>,
Paul Burton <paul.burton@mips.com>,
linux-kselftest@vger.kernel.org,
Rasmus Villemoes <linux@rasmusvillemoes.dk>,
Russell King <linux@armlinux.org.uk>,
Dmitry Safonov <0x7f454c46@gmail.com>,
Shijith Thotton <sthotton@marvell.com>,
Peter Collingbourne <pcc@google.com>,
Thomas Gleixner <tglx@linutronix.de>,
linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v7 04/25] arm64: Substitute gettimeofday with C implementation
Date: Thu, 27 Jun 2019 12:59:07 +0100 [thread overview]
Message-ID: <a07b66cb-186f-a743-4f1d-41227f23db74@arm.com> (raw)
In-Reply-To: <20190627112731.GF2790@e103592.cambridge.arm.com>
On 6/27/19 12:27 PM, Dave Martin wrote:
> On Thu, Jun 27, 2019 at 11:57:36AM +0100, Vincenzo Frascino wrote:
>> Hi Dave,
>>
>> Overall, I want to thank you for bringing out the topic. It helped me to
>> question some decisions and make sure that we have no holes left in
>> the approach.
>
> Fair enough.
>
> This is really just a nasty compiler corner-case... the validity of the
> overall approach isn't affected.
>
>>>>
>>>> vDSO library is a shared object not compiled with LTO as far as I can
>>>> see, hence if this involved LTO should not applicable in this case.
>>>
>>> That turned to be a spurious hypothesis on my part -- LTO isn't the
>>> smoking gun. (See below.)
>>>
>>
>> Ok.
>>
>>>>> The classic example of this (triggered directly and not due to inlining)
>>>>> would be something like:
>>>>>
>>>>> int bar(int, int);
>>>>>
>>>>> void foo(int x, int y)
>>>>> {
>>>>> register int x_ asm("r0") = x;
>>>>> register int y_ asm("r1") = bar(x, y);
>>>>>
>>>>> asm volatile (
>>>>> "svc #0"
>>>>> :: "r" (x_), "r" (y_)
>>>>> : "memory"
>>>>> );
>>>>> }
>>>>>
>>>>> ->
>>>>>
>>>>> 0000000000000000 <foo>:
>>>>> 0: a9bf7bfd stp x29, x30, [sp, #-16]!
>>>>> 4: 910003fd mov x29, sp
>>>>> 8: 94000000 bl 0 <bar>
>>>>> c: 2a0003e1 mov w1, w0
>>>>> 10: d4000001 svc #0x0
>>>>> 14: a8c17bfd ldp x29, x30, [sp], #16
>>>>> 18: d65f03c0 ret
>>>>>
>>>>
>>>> Contextualized to what my vdso fallback functions do, this should not be a
>>>> concern because in no case a function result is directly set to a variable
>>>> declared as register.
>>>>
>>>> Since the vdso fallback functions serve a very specific and limited purpose, I
>>>> do not expect that that code is going to change much in future.
>>>>
>>>> The only thing that can happen is something similar to what I wrote in my
>>>> example, which as I empirically proved does not trigger the problematic behavior.
>>>>
>>>>>
>>>>> The gcc documentation is vague and ambiguous about precisely whan this
>>>>> can happen and about how to avoid it.
>>>>>
>>>>
>>>> On this I agree, it is not very clear, but this seems more something to raise
>>>> with the gcc folks in order to have a more "explicit" description that leaves no
>>>> room to the interpretation.
>>>>
>>>> ...
>>>>
>>>>>
>>>>> However, the workaround is cheap, and to avoid the chance of subtle
>>>>> intermittent code gen bugs it may be worth it:
>>>>>
>>>>> void foo(int x, int y)
>>>>> {
>>>>> asm volatile (
>>>>> "mov x0, %0\n\t"
>>>>> "mov x1, %1\n\t"
>>>>> "svc #0"
>>>>> :: "r" (x), "r" (bar(x, y))
>>>>> : "r0", "r1", "memory"
>>>>> );
>>>>> }
>>>>>
>>>>> ->
>>>>>
>>>>> 0000000000000000 <foo>:
>>>>> 0: a9be7bfd stp x29, x30, [sp, #-32]!
>>>>> 4: 910003fd mov x29, sp
>>>>> 8: f9000bf3 str x19, [sp, #16]
>>>>> c: 2a0003f3 mov w19, w0
>>>>> 10: 94000000 bl 0 <bar>
>>>>> 14: 2a0003e2 mov w2, w0
>>>>> 18: aa1303e0 mov x0, x19
>>>>> 1c: aa0203e1 mov x1, x2
>>>>> 20: d4000001 svc #0x0
>>>>> 24: f9400bf3 ldr x19, [sp, #16]
>>>>> 28: a8c27bfd ldp x29, x30, [sp], #32
>>>>> 2c: d65f03c0 ret
>>>>>
>>>>>
>>>>> What do you think?
>>>>>
>>>>
>>>> The solution seems ok, thanks for providing it, but IMHO I think we
>>>> should find a workaround for something that is broken, which, unless
>>>> I am missing something major, this seems not the case.
>>>
>>> So, after a bit of further experimentation, I found that I could trigger
>>> it with implicit function calls on an older compiler. I couldn't show
>>> it with explicit function calls (as in your example).
>>>
>>> With the following code, inlining if an expression that causes an
>>> implicit call to a libgcc helper can trigger this issue, but I had to
>>> try an older compiler:
>>>
>>> int foo(int x, int y)
>>> {
>>> register int res asm("r0");
>>> register const int x_ asm("r0") = x;
>>> register const int y_ asm("r1") = y;
>>>
>>> asm volatile (
>>> "svc #0"
>>> : "=r" (res)
>>> : "r" (x_), "r" (y_)
>>> : "memory"
>>> );
>>>
>>> return res;
>>> }
>>>
>>> int bar(int x, int y)
>>> {
>>> return foo(x, x / y);
>>> }
>>>
>>> -> (arm-linux-gnueabihf-gcc 9.1 -O2)
>>>
>>> 00000000 <foo>:
>>> 0: df00 svc 0
>>> 2: 4770 bx lr
>>>
>>> 00000004 <bar>:
>>> 4: b510 push {r4, lr}
>>> 6: 4604 mov r4, r0
>>> 8: f7ff fffe bl 0 <__aeabi_idiv>
>>> c: 4601 mov r1, r0
>>> e: 4620 mov r0, r4
>>> 10: df00 svc 0
>>> 12: bd10 pop {r4, pc}
>>>
>>> -> (arm-linux-gnueabihf-gcc 5.1 -O2)
>>>
>>> 00000000 <foo>:
>>> 0: df00 svc 0
>>> 2: 4770 bx lr
>>>
>>> 00000004 <bar>:
>>> 4: b508 push {r3, lr}
>>> 6: f7ff fffe bl 0 <__aeabi_idiv>
>>> a: 4601 mov r1, r0
>>> c: df00 svc 0
>>> e: bd08 pop {r3, pc}
>>>
>>
>> Thanks for reporting this. I had a go with gcc-5.1 on the vDSO library and seems
>> Ok, but it was worth trying.
>>
>> For obvious reasons I am not reporting the objdump here :)
>>
>>> I was struggling to find a way to emit an implicit function call for
>>> AArch64, except for 128-bit divide, which would complicate things since
>>> uint128_t doesn't fit in a single register anyway.
>>>
>>> Maybe this was considered a bug and fixed sometime after GCC 5, but I
>>> think the GCC documentation is still quite unclear on the semantics of
>>> register asm vars that alias call-clobbered registers in the PCS.
>>>
>>> If we can get a promise out of the GCC folks that this will not happen
>>> with any future compiler, then maybe we could just require a new enough
>>> compiler to be used.
>>>
>>
>> On this I fully agree, the compiler should never change an "expected" behavior.
>>
>> If the issue comes from a gray area in the documentation, we have to address it
>> and have it fixed there.
>>
>> The minimum version of the compiler from linux-4.19 is 4.6, hence I had to try
>> that the vDSO lib does not break with 5.1 [1].
>>
>> [1]
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cafa0010cd51fb711fdcb50fc55f394c5f167a0a
>
> OK
>
>>> Then of course there is clang.
>>>
>>
>> I could not help myself and I tried clang.8 and clang.7 as well with my example,
>> just to make sure that we are fine even in that case. Please find below the
>> results (pretty identical).
>>
>> main.clang.7.o: file format ELF64-aarch64-little
>>
>> Disassembly of section .text:
>> 0000000000000000 show_it:
>> 0: e8 03 1f aa mov x8, xzr
>> 4: 09 68 68 38 ldrb w9, [x0, x8]
>> 8: 08 05 00 91 add x8, x8, #1
>> c: c9 ff ff 34 cbz w9, #-8 <show_it+0x4>
>> 10: 02 05 00 51 sub w2, w8, #1
>> 14: e1 03 00 aa mov x1, x0
>> 18: 08 08 80 d2 mov x8, #64
>> 1c: 01 00 00 d4 svc #0
>> 20: c0 03 5f d6 ret
>>
>> main.clang.8.o: file format ELF64-aarch64-little
>>
>> Disassembly of section .text:
>> 0000000000000000 show_it:
>> 0: e8 03 1f aa mov x8, xzr
>> 4: 09 68 68 38 ldrb w9, [x0, x8]
>> 8: 08 05 00 91 add x8, x8, #1
>> c: c9 ff ff 34 cbz w9, #-8 <show_it+0x4>
>> 10: 02 05 00 51 sub w2, w8, #1
>> 14: e1 03 00 aa mov x1, x0
>> 18: 08 08 80 d2 mov x8, #64
>> 1c: 01 00 00 d4 svc #0
>> 20: c0 03 5f d6 ret
>>
>> Commands used:
>>
>> $ clang -target aarch64-linux-gnueabi main.c -O -c -o main.clang.<x>.o
>> $ llvm-objdump -d main.clang.<x>.o
>
> Actually, I'm not sure this is comparable with the reproducer I quoted
> in my last reply.
>
As explained in my previous email, this is the only case that can realistically
happen. vDSO has no dependency on any other library (i.e. libgcc you were
mentioning) and we are referring to the fallbacks which fall in this category.
> The compiler can see the definition of strlen and fully inlines it.
> I only ever saw the problem when the compiler emits an out-of-line
> implicit function call.
> > What does clang do with my example on 32-bit?
When clang is selected compat vDSOs are currently disabled on arm64, will be
introduced with a future patch series.
Anyway since I am curious as well, this is what happens with your example with
clang.8 target=arm-linux-gnueabihf:
dave-code.clang.8.o: file format ELF32-arm-little
Disassembly of section .text:
0000000000000000 foo:
0: 00 00 00 ef svc #0
4: 1e ff 2f e1 bx lr
0000000000000008 bar:
8: 10 4c 2d e9 push {r4, r10, r11, lr}
c: 08 b0 8d e2 add r11, sp, #8
10: 00 40 a0 e1 mov r4, r0
14: fe ff ff eb bl #-8 <bar+0xc>
18: 00 10 a0 e1 mov r1, r0
1c: 04 00 a0 e1 mov r0, r4
20: 00 00 00 ef svc #0
24: 10 8c bd e8 pop {r4, r10, r11, pc}
Compiled with -O2, -O3, -Os never inlines.
Same thing happens for aarch64-linux-gnueabi:
dave-code.clang.8.o: file format ELF64-aarch64-little
Disassembly of section .text:
0000000000000000 foo:
0: e0 03 00 2a mov w0, w0
4: e1 03 01 2a mov w1, w1
8: 01 00 00 d4 svc #0
c: c0 03 5f d6 ret
0000000000000010 bar:
10: 01 0c c1 1a sdiv w1, w0, w1
14: e0 03 00 2a mov w0, w0
18: 01 00 00 d4 svc #0
1c: c0 03 5f d6 ret
Based on this I think we can conclude our investigation.
>
> Cheers
> ---Dave
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
--
Regards,
Vincenzo
next prev parent reply other threads:[~2019-06-27 11:58 UTC|newest]
Thread overview: 108+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-06-21 9:52 [PATCH v7 00/25] Unify vDSOs across more architectures Vincenzo Frascino
2019-06-21 9:52 ` [PATCH v7 01/25] kernel: Standardize vdso_datapage Vincenzo Frascino
2019-06-24 13:56 ` Catalin Marinas
2019-06-21 9:52 ` [PATCH v7 02/25] kernel: Define gettimeofday vdso common code Vincenzo Frascino
2019-06-21 9:52 ` [PATCH v7 03/25] kernel: Unify update_vsyscall implementation Vincenzo Frascino
2019-06-21 10:49 ` Huw Davies
2019-06-21 9:52 ` [PATCH v7 04/25] arm64: Substitute gettimeofday with C implementation Vincenzo Frascino
2019-06-24 13:36 ` Will Deacon
2019-06-24 13:59 ` Vincenzo Frascino
2019-06-25 16:18 ` [PATCH 1/3] lib/vdso: Delay mask application in do_hres() Vincenzo Frascino
2019-06-25 16:18 ` [PATCH 2/3] arm64: Fix __arch_get_hw_counter() implementation Vincenzo Frascino
2019-06-25 16:18 ` [PATCH 3/3] arm64: compat: " Vincenzo Frascino
2019-06-25 17:02 ` [PATCH 1/3] lib/vdso: Delay mask application in do_hres() Thomas Gleixner
2019-06-25 18:27 ` Thomas Gleixner
2019-06-25 20:15 ` Andy Lutomirski
2019-06-25 22:24 ` Thomas Gleixner
2019-06-26 6:38 ` Thomas Gleixner
2019-06-26 9:25 ` Vincenzo Frascino
2019-06-26 10:02 ` lib/vdso: Make delta calculation work correctly Thomas Gleixner
2019-06-26 11:08 ` Vincenzo Frascino
2019-06-24 13:58 ` [PATCH v7 04/25] arm64: Substitute gettimeofday with C implementation Catalin Marinas
2019-06-25 15:33 ` Dave Martin
2019-06-26 13:27 ` Vincenzo Frascino
2019-06-26 16:14 ` Dave Martin
2019-06-26 19:01 ` Vincenzo Frascino
2019-06-27 10:01 ` Dave Martin
2019-06-27 10:57 ` Vincenzo Frascino
2019-06-27 11:27 ` Dave Martin
2019-06-27 11:59 ` Vincenzo Frascino [this message]
2019-06-27 14:38 ` Dave Martin
2019-06-27 15:34 ` Vincenzo Frascino
2019-06-25 17:43 ` [PATCH] arm64: vdso: Fix compilation with clang < 8 Vincenzo Frascino
2019-06-26 11:36 ` [PATCH v2] arm64: vdso: Fix compilation with clang older then 8 Vincenzo Frascino
[not found] ` <CGME20190628130921eucas1p239935b0771032c331911eacc1a69dd2e@eucas1p2.samsung.com>
2019-06-28 13:09 ` [PATCH v7 04/25] arm64: Substitute gettimeofday with C implementation Marek Szyprowski
2019-06-28 14:32 ` Vincenzo Frascino
2019-06-28 16:50 ` Sylwester Nawrocki
2019-06-29 6:58 ` Vincenzo Frascino
2019-07-08 12:57 ` Sylwester Nawrocki
2019-07-08 13:09 ` Vincenzo Frascino
2019-06-21 9:52 ` [PATCH v7 05/25] arm64: Build vDSO with -ffixed-x18 Vincenzo Frascino
2019-06-21 9:52 ` [PATCH v7 06/25] arm64: compat: Add missing syscall numbers Vincenzo Frascino
2019-06-21 9:52 ` [PATCH v7 07/25] arm64: compat: Expose signal related structures Vincenzo Frascino
2019-06-21 9:52 ` [PATCH v7 08/25] arm64: compat: Generate asm offsets for signals Vincenzo Frascino
2019-06-21 9:52 ` [PATCH v7 09/25] lib: vdso: Add compat support Vincenzo Frascino
2019-06-21 9:52 ` [PATCH v7 10/25] arm64: compat: Add vDSO Vincenzo Frascino
2019-06-24 14:00 ` Catalin Marinas
2019-07-10 4:02 ` John Stultz
2019-07-10 6:12 ` Thomas Gleixner
2019-07-10 9:48 ` Vincenzo Frascino
2019-07-10 8:27 ` Will Deacon
2019-07-10 8:58 ` Thomas Gleixner
2019-07-10 9:12 ` Will Deacon
2019-07-10 9:47 ` Vincenzo Frascino
2019-07-10 13:41 ` Vincenzo Frascino
2019-07-10 13:04 ` [PATCH] arm64: vdso: Fix ABI regression in compat vdso Vincenzo Frascino
2019-07-10 13:25 ` Will Deacon
2019-07-10 13:42 ` Vincenzo Frascino
2019-07-10 14:01 ` [PATCH v2] " Vincenzo Frascino
2019-07-10 15:44 ` John Stultz
2019-07-10 15:53 ` Vincenzo Frascino
2019-07-11 9:45 ` Will Deacon
2019-07-11 10:34 ` Thomas Gleixner
2019-07-11 11:32 ` Will Deacon
2019-06-21 9:52 ` [PATCH v7 11/25] arm64: Refactor vDSO code Vincenzo Frascino
2019-06-21 9:52 ` [PATCH v7 12/25] arm64: compat: vDSO setup for compat layer Vincenzo Frascino
2019-06-21 9:52 ` [PATCH v7 13/25] arm64: elf: vDSO code page discovery Vincenzo Frascino
2019-06-21 9:52 ` [PATCH v7 14/25] arm64: compat: Get sigreturn trampolines from vDSO Vincenzo Frascino
2019-06-21 9:52 ` [PATCH v7 15/25] arm64: Add vDSO compat support Vincenzo Frascino
2019-06-21 9:52 ` [PATCH v7 16/25] arm: Add support for generic vDSO Vincenzo Frascino
2019-12-04 13:51 ` [PATCH v7 16/25] arm: Add support for generic vDSO (causing crash) Guenter Roeck
2019-12-04 13:58 ` Vincenzo Frascino
2019-12-04 16:16 ` Guenter Roeck
2019-12-04 17:15 ` Vincenzo Frascino
2019-12-04 19:39 ` Guenter Roeck
2019-12-05 9:42 ` Philippe Mathieu-Daudé
2019-12-05 10:00 ` Vincenzo Frascino
2019-12-05 11:02 ` Arnd Bergmann
2019-12-05 14:56 ` Philippe Mathieu-Daudé
2019-06-21 9:52 ` [PATCH v7 17/25] arm: Add clock_getres entry point Vincenzo Frascino
2019-06-21 9:52 ` [PATCH v7 18/25] arm: Add clock_gettime64 " Vincenzo Frascino
2019-06-21 9:52 ` [PATCH v7 19/25] mips: Add support for generic vDSO Vincenzo Frascino
2019-07-26 5:15 ` Paul Burton
2019-07-26 16:29 ` [PATCH 0/2] mips: vdso: Fix Makefile Vincenzo Frascino
2019-07-26 16:29 ` [PATCH 1/2] mips: vdso: Fix source path Vincenzo Frascino
2019-07-26 16:29 ` [PATCH 2/2] mips: vdso: Fix flip/flop vdso building bug Vincenzo Frascino
2019-07-28 22:20 ` [PATCH 0/2] mips: vdso: Fix Makefile Paul Burton
2019-06-21 9:52 ` [PATCH v7 20/25] mips: Add clock_getres entry point Vincenzo Frascino
2019-07-26 5:15 ` Paul Burton
2019-06-21 9:52 ` [PATCH v7 21/25] mips: Add clock_gettime64 " Vincenzo Frascino
2019-07-26 5:15 ` Paul Burton
2019-06-21 9:52 ` [PATCH v7 22/25] x86: Add support for generic vDSO Vincenzo Frascino
2019-06-21 9:52 ` [PATCH v7 23/25] x86: Add clock_getres entry point Vincenzo Frascino
2019-06-21 9:52 ` [PATCH v7 24/25] x86: Add clock_gettime64 " Vincenzo Frascino
2019-06-21 9:52 ` [PATCH v7 25/25] kselftest: Extend vDSO selftest Vincenzo Frascino
2019-06-24 0:34 ` [PATCH v7 00/25] Unify vDSOs across more architectures Thomas Gleixner
2019-06-24 1:15 ` Andy Lutomirski
2019-06-24 7:42 ` Thomas Gleixner
2019-06-24 13:21 ` Vincenzo Frascino
2019-06-24 14:18 ` Thomas Gleixner
2019-06-24 14:23 ` Russell King - ARM Linux admin
2019-06-24 14:49 ` Catalin Marinas
2019-06-24 16:20 ` Vincenzo Frascino
2019-10-25 11:42 ` Geert Uytterhoeven
2019-06-24 18:41 ` Paul Burton
2019-06-24 23:16 ` Vincenzo Frascino
2019-06-25 17:11 ` Paul Burton
2019-06-25 17:17 ` Vincenzo Frascino
2019-06-24 12:50 ` Andre Przywara
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a07b66cb-186f-a743-4f1d-41227f23db74@arm.com \
--to=vincenzo.frascino@arm.com \
--cc=0x7f454c46@gmail.com \
--cc=Dave.Martin@arm.com \
--cc=andre.przywara@arm.com \
--cc=arnd@arndb.de \
--cc=catalin.marinas@arm.com \
--cc=daniel.lezcano@linaro.org \
--cc=huw@codeweavers.com \
--cc=linux-arch@vger.kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=linux-mips@vger.kernel.org \
--cc=linux@armlinux.org.uk \
--cc=linux@rasmusvillemoes.dk \
--cc=paul.burton@mips.com \
--cc=pcc@google.com \
--cc=ralf@linux-mips.org \
--cc=salyzyn@android.com \
--cc=shuah@kernel.org \
--cc=sthotton@marvell.com \
--cc=tglx@linutronix.de \
--cc=will.deacon@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).