From: Nicolas Pitre <nico@fluxnic.net> To: Ard Biesheuvel <ardb@kernel.org>, Antony Yu <swpenim@gmail.com>, Nick Desaulniers <ndesaulniers@google.com>, Russell King <linux@armlinux.org.uk> Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, clang-built-linux <clang-built-linux@googlegroups.com>, Nathan Chancellor <natechancellor@gmail.com>, Linux ARM <linux-arm-kernel@lists.infradead.org> Subject: [PATCH] __div64_32(): straighten up inline asm constraints Date: Mon, 30 Nov 2020 14:05:27 -0500 (EST) [thread overview] Message-ID: <pr6q9q72-6n62-236q-s59n-7osq71o285r9@syhkavp.arg> (raw) The ARM version of __div64_32() encapsulates a call to __do_div64 with non-standard argument passing. In particular, __n is a 64-bit input argument assigned to r0-r1 and __rem is an output argument sharing half of that 40-r1 register pair. With __n being an input argument, the compiler is in its right to presume that r0-r1 would still hold the value of __n past the inline assembly statement. Normally, the compiler would have assigned non overlapping registers to __n and __rem if the value for __n is needed again. However, here we enforce our own register assignment and gcc fails to notice the conflict. In practice this doesn't cause any problem as __n is considered dead after the asm statement and *n is overwritten. However this is not always guaranteed and clang rightfully complains. Let's fix it properly by making __n into an input-output variable. This makes it clear that those registers representing __n have been modified. Then we can extract __rem as the high part of __n with plain C code. This asm constraint "abuse" was likely relied upon back when gcc didn't handle 64-bit values optimally Turns out that gcc is now able to optimize things and produces the same code with this patch applied. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Ard Biesheuvel <ardb@kernel.org> --- This is related to the thread titled "[RESEND,PATCH] ARM: fix __div64_32() error when compiling with clang". My limited compile test with clang appears to make it happy. If no more comments I'll push this to RMK's patch system. diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h index 898e9c78a7..595e538f5b 100644 --- a/arch/arm/include/asm/div64.h +++ b/arch/arm/include/asm/div64.h @@ -21,29 +21,20 @@ * assembly implementation with completely non standard calling convention * for arguments and results (beware). */ - -#ifdef __ARMEB__ -#define __xh "r0" -#define __xl "r1" -#else -#define __xl "r0" -#define __xh "r1" -#endif - static inline uint32_t __div64_32(uint64_t *n, uint32_t base) { register unsigned int __base asm("r4") = base; register unsigned long long __n asm("r0") = *n; register unsigned long long __res asm("r2"); - register unsigned int __rem asm(__xh); - asm( __asmeq("%0", __xh) + unsigned int __rem; + asm( __asmeq("%0", "r0") __asmeq("%1", "r2") - __asmeq("%2", "r0") - __asmeq("%3", "r4") + __asmeq("%2", "r4") "bl __do_div64" - : "=r" (__rem), "=r" (__res) - : "r" (__n), "r" (__base) + : "+r" (__n), "=r" (__res) + : "r" (__base) : "ip", "lr", "cc"); + __rem = __n >> 32; *n = __res; return __rem; }
WARNING: multiple messages have this Message-ID (diff)
From: Nicolas Pitre <nico@fluxnic.net> To: Ard Biesheuvel <ardb@kernel.org>, Antony Yu <swpenim@gmail.com>, Nick Desaulniers <ndesaulniers@google.com>, Russell King <linux@armlinux.org.uk> Cc: clang-built-linux <clang-built-linux@googlegroups.com>, Nathan Chancellor <natechancellor@gmail.com>, Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, Linux ARM <linux-arm-kernel@lists.infradead.org> Subject: [PATCH] __div64_32(): straighten up inline asm constraints Date: Mon, 30 Nov 2020 14:05:27 -0500 (EST) [thread overview] Message-ID: <pr6q9q72-6n62-236q-s59n-7osq71o285r9@syhkavp.arg> (raw) The ARM version of __div64_32() encapsulates a call to __do_div64 with non-standard argument passing. In particular, __n is a 64-bit input argument assigned to r0-r1 and __rem is an output argument sharing half of that 40-r1 register pair. With __n being an input argument, the compiler is in its right to presume that r0-r1 would still hold the value of __n past the inline assembly statement. Normally, the compiler would have assigned non overlapping registers to __n and __rem if the value for __n is needed again. However, here we enforce our own register assignment and gcc fails to notice the conflict. In practice this doesn't cause any problem as __n is considered dead after the asm statement and *n is overwritten. However this is not always guaranteed and clang rightfully complains. Let's fix it properly by making __n into an input-output variable. This makes it clear that those registers representing __n have been modified. Then we can extract __rem as the high part of __n with plain C code. This asm constraint "abuse" was likely relied upon back when gcc didn't handle 64-bit values optimally Turns out that gcc is now able to optimize things and produces the same code with this patch applied. Signed-off-by: Nicolas Pitre <nico@fluxnic.net> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Ard Biesheuvel <ardb@kernel.org> --- This is related to the thread titled "[RESEND,PATCH] ARM: fix __div64_32() error when compiling with clang". My limited compile test with clang appears to make it happy. If no more comments I'll push this to RMK's patch system. diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h index 898e9c78a7..595e538f5b 100644 --- a/arch/arm/include/asm/div64.h +++ b/arch/arm/include/asm/div64.h @@ -21,29 +21,20 @@ * assembly implementation with completely non standard calling convention * for arguments and results (beware). */ - -#ifdef __ARMEB__ -#define __xh "r0" -#define __xl "r1" -#else -#define __xl "r0" -#define __xh "r1" -#endif - static inline uint32_t __div64_32(uint64_t *n, uint32_t base) { register unsigned int __base asm("r4") = base; register unsigned long long __n asm("r0") = *n; register unsigned long long __res asm("r2"); - register unsigned int __rem asm(__xh); - asm( __asmeq("%0", __xh) + unsigned int __rem; + asm( __asmeq("%0", "r0") __asmeq("%1", "r2") - __asmeq("%2", "r0") - __asmeq("%3", "r4") + __asmeq("%2", "r4") "bl __do_div64" - : "=r" (__rem), "=r" (__res) - : "r" (__n), "r" (__base) + : "+r" (__n), "=r" (__res) + : "r" (__base) : "ip", "lr", "cc"); + __rem = __n >> 32; *n = __res; return __rem; } _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next reply other threads:[~2020-11-30 19:06 UTC|newest] Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-11-30 19:05 Nicolas Pitre [this message] 2020-11-30 19:05 ` [PATCH] __div64_32(): straighten up inline asm constraints Nicolas Pitre 2020-11-30 19:33 ` Nick Desaulniers 2020-11-30 19:33 ` Nick Desaulniers 2020-11-30 20:27 ` Nicolas Pitre 2020-11-30 20:27 ` Nicolas Pitre
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=pr6q9q72-6n62-236q-s59n-7osq71o285r9@syhkavp.arg \ --to=nico@fluxnic.net \ --cc=ardb@kernel.org \ --cc=clang-built-linux@googlegroups.com \ --cc=linux-arm-kernel@lists.infradead.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux@armlinux.org.uk \ --cc=natechancellor@gmail.com \ --cc=ndesaulniers@google.com \ --cc=swpenim@gmail.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.