* [PATCH] __div64_32(): straighten up inline asm constraints
@ 2020-11-30 19:05 Nicolas Pitre
2020-11-30 19:33 ` Nick Desaulniers
0 siblings, 1 reply; 3+ messages in thread
From: Nicolas Pitre @ 2020-11-30 19:05 UTC (permalink / raw)
To: Ard Biesheuvel, Antony Yu, Nick Desaulniers, Russell King
Cc: Linux Kernel Mailing List, clang-built-linux, Nathan Chancellor,
Linux ARM
The ARM version of __div64_32() encapsulates a call to __do_div64 with
non-standard argument passing. In particular, __n is a 64-bit input
argument assigned to r0-r1 and __rem is an output argument sharing half
of that 40-r1 register pair.
With __n being an input argument, the compiler is in its right to
presume that r0-r1 would still hold the value of __n past the inline
assembly statement. Normally, the compiler would have assigned non
overlapping registers to __n and __rem if the value for __n is needed
again.
However, here we enforce our own register assignment and gcc fails to
notice the conflict. In practice this doesn't cause any problem as __n
is considered dead after the asm statement and *n is overwritten.
However this is not always guaranteed and clang rightfully complains.
Let's fix it properly by making __n into an input-output variable. This
makes it clear that those registers representing __n have been modified.
Then we can extract __rem as the high part of __n with plain C code.
This asm constraint "abuse" was likely relied upon back when gcc didn't
handle 64-bit values optimally Turns out that gcc is now able to
optimize things and produces the same code with this patch applied.
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
---
This is related to the thread titled "[RESEND,PATCH] ARM: fix
__div64_32() error when compiling with clang". My limited compile test
with clang appears to make it happy. If no more comments I'll push this
to RMK's patch system.
diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h
index 898e9c78a7..595e538f5b 100644
--- a/arch/arm/include/asm/div64.h
+++ b/arch/arm/include/asm/div64.h
@@ -21,29 +21,20 @@
* assembly implementation with completely non standard calling convention
* for arguments and results (beware).
*/
-
-#ifdef __ARMEB__
-#define __xh "r0"
-#define __xl "r1"
-#else
-#define __xl "r0"
-#define __xh "r1"
-#endif
-
static inline uint32_t __div64_32(uint64_t *n, uint32_t base)
{
register unsigned int __base asm("r4") = base;
register unsigned long long __n asm("r0") = *n;
register unsigned long long __res asm("r2");
- register unsigned int __rem asm(__xh);
- asm( __asmeq("%0", __xh)
+ unsigned int __rem;
+ asm( __asmeq("%0", "r0")
__asmeq("%1", "r2")
- __asmeq("%2", "r0")
- __asmeq("%3", "r4")
+ __asmeq("%2", "r4")
"bl __do_div64"
- : "=r" (__rem), "=r" (__res)
- : "r" (__n), "r" (__base)
+ : "+r" (__n), "=r" (__res)
+ : "r" (__base)
: "ip", "lr", "cc");
+ __rem = __n >> 32;
*n = __res;
return __rem;
}
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] __div64_32(): straighten up inline asm constraints
2020-11-30 19:05 [PATCH] __div64_32(): straighten up inline asm constraints Nicolas Pitre
@ 2020-11-30 19:33 ` Nick Desaulniers
2020-11-30 20:27 ` Nicolas Pitre
0 siblings, 1 reply; 3+ messages in thread
From: Nick Desaulniers @ 2020-11-30 19:33 UTC (permalink / raw)
To: Nicolas Pitre
Cc: Ard Biesheuvel, Antony Yu, Russell King,
Linux Kernel Mailing List, clang-built-linux, Nathan Chancellor,
Linux ARM
On Mon, Nov 30, 2020 at 11:05 AM Nicolas Pitre <nico@fluxnic.net> wrote:
>
> The ARM version of __div64_32() encapsulates a call to __do_div64 with
> non-standard argument passing. In particular, __n is a 64-bit input
> argument assigned to r0-r1 and __rem is an output argument sharing half
> of that 40-r1 register pair.
Should `40` be `r0`?
>
> With __n being an input argument, the compiler is in its right to
> presume that r0-r1 would still hold the value of __n past the inline
> assembly statement. Normally, the compiler would have assigned non
> overlapping registers to __n and __rem if the value for __n is needed
> again.
>
> However, here we enforce our own register assignment and gcc fails to
> notice the conflict. In practice this doesn't cause any problem as __n
> is considered dead after the asm statement and *n is overwritten.
> However this is not always guaranteed and clang rightfully complains.
>
> Let's fix it properly by making __n into an input-output variable. This
> makes it clear that those registers representing __n have been modified.
> Then we can extract __rem as the high part of __n with plain C code.
>
> This asm constraint "abuse" was likely relied upon back when gcc didn't
> handle 64-bit values optimally Turns out that gcc is now able to
^ Missing punctuation (period after `optimally`).
> optimize things and produces the same code with this patch applied.
>
> Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
> Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
> Tested-by: Ard Biesheuvel <ardb@kernel.org>
Reported-by: Antony Yu <swpenim@gmail.com>
> ---
>
> This is related to the thread titled "[RESEND,PATCH] ARM: fix
> __div64_32() error when compiling with clang". My limited compile test
> with clang appears to make it happy. If no more comments I'll push this
> to RMK's patch system.
>
> diff --git a/arch/arm/include/asm/div64.h b/arch/arm/include/asm/div64.h
> index 898e9c78a7..595e538f5b 100644
> --- a/arch/arm/include/asm/div64.h
> +++ b/arch/arm/include/asm/div64.h
> @@ -21,29 +21,20 @@
> * assembly implementation with completely non standard calling convention
> * for arguments and results (beware).
> */
> -
> -#ifdef __ARMEB__
> -#define __xh "r0"
> -#define __xl "r1"
> -#else
> -#define __xl "r0"
> -#define __xh "r1"
> -#endif
> -
> static inline uint32_t __div64_32(uint64_t *n, uint32_t base)
> {
> register unsigned int __base asm("r4") = base;
> register unsigned long long __n asm("r0") = *n;
> register unsigned long long __res asm("r2");
> - register unsigned int __rem asm(__xh);
> - asm( __asmeq("%0", __xh)
> + unsigned int __rem;
> + asm( __asmeq("%0", "r0")
> __asmeq("%1", "r2")
> - __asmeq("%2", "r0")
> - __asmeq("%3", "r4")
> + __asmeq("%2", "r4")
> "bl __do_div64"
> - : "=r" (__rem), "=r" (__res)
> - : "r" (__n), "r" (__base)
> + : "+r" (__n), "=r" (__res)
> + : "r" (__base)
> : "ip", "lr", "cc");
> + __rem = __n >> 32;
> *n = __res;
> return __rem;
The above 3 statement could be:
```
*n = __res;
return __n >> 32;
```
> }
--
Thanks,
~Nick Desaulniers
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] __div64_32(): straighten up inline asm constraints
2020-11-30 19:33 ` Nick Desaulniers
@ 2020-11-30 20:27 ` Nicolas Pitre
0 siblings, 0 replies; 3+ messages in thread
From: Nicolas Pitre @ 2020-11-30 20:27 UTC (permalink / raw)
To: Nick Desaulniers
Cc: Ard Biesheuvel, Antony Yu, Russell King,
Linux Kernel Mailing List, clang-built-linux, Nathan Chancellor,
Linux ARM
On Mon, 30 Nov 2020, Nick Desaulniers wrote:
> On Mon, Nov 30, 2020 at 11:05 AM Nicolas Pitre <nico@fluxnic.net> wrote:
>
> > + __rem = __n >> 32;
> > *n = __res;
> > return __rem;
>
> The above 3 statement could be:
>
> ```
> *n = __res;
> return __n >> 32;
> ```
They could. However the compiler doesn't care, and the extra line makes
it more obvious that the reminder is the high part of __n. So,
semantically the extra line has value.
Thanks for the review.
Nicolas
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2020-11-30 20:28 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-30 19:05 [PATCH] __div64_32(): straighten up inline asm constraints Nicolas Pitre
2020-11-30 19:33 ` Nick Desaulniers
2020-11-30 20:27 ` Nicolas Pitre
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).