* [PATCH 3/4] MIPS: Reinstate platform `__div64_32' handler
[not found] <alpine.DEB.2.21.2104200044060.44318@angie.orcam.me.uk>
@ 2021-04-20 2:50 ` Maciej W. Rozycki
2021-04-22 18:36 ` Guenter Roeck
0 siblings, 1 reply; 3+ messages in thread
From: Maciej W. Rozycki @ 2021-04-20 2:50 UTC (permalink / raw)
To: Arnd Bergmann, Thomas Bogendoerfer
Cc: Huacai Chen, Huacai Chen, Jiaxun Yang, linux-arch, linux-mips,
linux-kernel, stable
Our current MIPS platform `__div64_32' handler is inactive, because it
is incorrectly only enabled for 64-bit configurations, for which generic
`do_div' code does not call it anyway.
The handler is not suitable for being called from there though as it
only calculates 32 bits of the quotient under the assumption the 64-bit
divident has been suitably reduced. Code for such reduction used to be
there, however it has been incorrectly removed with commit c21004cd5b4c
("MIPS: Rewrite <asm/div64.h> to work with gcc 4.4.0."), which should
have only updated an obsoleted constraint for an inline asm involving
$hi and $lo register outputs, while possibly wiring the original MIPS
variant of the `do_div' macro as `__div64_32' handler for the generic
`do_div' implementation
Correct the handler as follows then:
- Revert most of the commit referred, however retaining the current
formatting, except for the final two instructions of the inline asm
sequence, which the original commit missed. Omit the original 64-bit
parts though.
- Rename the original `do_div' macro to `__div64_32'. Use the combined
`x' constraint referring to the MD accumulator as a whole, replacing
the original individual `h' and `l' constraints used for $hi and $lo
registers respectively, of which `h' has been obsoleted with GCC 4.4.
Update surrounding code accordingly.
We have since removed support for GCC versions before 4.9, so no need
for a special arrangement here; GCC has supported the `x' constraint
since forever anyway, or at least going back to 1991.
- Rename the `__base' local variable in `__div64_32' to `__radix' to
avoid a conflict with a local variable in `do_div'.
- Actually enable this code for 32-bit rather than 64-bit configurations
by qualifying it with BITS_PER_LONG being 32 instead of 64. Include
<asm/bitsperlong.h> for this macro rather than <linux/types.h> as we
don't need anything else.
- Finally include <asm-generic/div64.h> last rather than first.
This has passed correctness verification with test_div64 and reduced the
module's average execution time down to 1.0668s and 0.2629s from 2.1529s
and 0.5647s respectively for an R3400 CPU @40MHz and a 5Kc CPU @160MHz.
For a reference 64-bit `do_div' code where we have the DDIVU instruction
available to do the whole calculation right away averages at 0.0660s for
the latter CPU.
Reported-by: Huacai Chen <chenhuacai@kernel.org>
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Fixes: c21004cd5b4c ("MIPS: Rewrite <asm/div64.h> to work with gcc 4.4.0.")
Cc: stable@vger.kernel.org # v2.6.30+
---
Our handcrafted handler seems to run at ~25% of the performance of the
64-bit hardware instruction; not too bad I would say. Though there's
likely some overhead from surrounding code that interferes with the
figures.
Then there are a couple of `checkpatch.pl' nits about trailing whitespace
in inline asm, which however makes it more readable. So the change stays
as it is.
---
arch/mips/include/asm/div64.h | 57 ++++++++++++++++++++++++++++++------------
1 file changed, 41 insertions(+), 16 deletions(-)
linux-mips-div64-generic-fix.diff
Index: linux-3maxp-div64/arch/mips/include/asm/div64.h
===================================================================
--- linux-3maxp-div64.orig/arch/mips/include/asm/div64.h
+++ linux-3maxp-div64/arch/mips/include/asm/div64.h
@@ -1,5 +1,5 @@
/*
- * Copyright (C) 2000, 2004 Maciej W. Rozycki
+ * Copyright (C) 2000, 2004, 2021 Maciej W. Rozycki
* Copyright (C) 2003, 07 Ralf Baechle (ralf@linux-mips.org)
*
* This file is subject to the terms and conditions of the GNU General Public
@@ -9,25 +9,18 @@
#ifndef __ASM_DIV64_H
#define __ASM_DIV64_H
-#include <asm-generic/div64.h>
-
-#if BITS_PER_LONG == 64
+#include <asm/bitsperlong.h>
-#include <linux/types.h>
+#if BITS_PER_LONG == 32
/*
* No traps on overflows for any of these...
*/
-#define __div64_32(n, base) \
-({ \
+#define do_div64_32(res, high, low, base) ({ \
unsigned long __cf, __tmp, __tmp2, __i; \
unsigned long __quot32, __mod32; \
- unsigned long __high, __low; \
- unsigned long long __n; \
\
- __high = *__n >> 32; \
- __low = __n; \
__asm__( \
" .set push \n" \
" .set noat \n" \
@@ -51,18 +44,50 @@
" subu %0, %0, %z6 \n" \
" addiu %2, %2, 1 \n" \
"3: \n" \
- " bnez %4, 0b\n\t" \
- " srl %5, %1, 0x1f\n\t" \
+ " bnez %4, 0b \n" \
+ " srl %5, %1, 0x1f \n" \
" .set pop" \
: "=&r" (__mod32), "=&r" (__tmp), \
"=&r" (__quot32), "=&r" (__cf), \
"=&r" (__i), "=&r" (__tmp2) \
- : "Jr" (base), "0" (__high), "1" (__low)); \
+ : "Jr" (base), "0" (high), "1" (low)); \
\
- (__n) = __quot32; \
+ (res) = __quot32; \
__mod32; \
})
-#endif /* BITS_PER_LONG == 64 */
+#define __div64_32(n, base) ({ \
+ unsigned long __upper, __low, __high, __radix; \
+ unsigned long long __modquot; \
+ unsigned long long __quot; \
+ unsigned long long __div; \
+ unsigned long __mod; \
+ \
+ __div = (*n); \
+ __radix = (base); \
+ \
+ __high = __div >> 32; \
+ __low = __div; \
+ __upper = __high; \
+ \
+ if (__high) { \
+ __asm__("divu $0, %z1, %z2" \
+ : "=x" (__modquot) \
+ : "Jr" (__high), "Jr" (__radix)); \
+ __upper = __modquot >> 32; \
+ __high = __modquot; \
+ } \
+ \
+ __mod = do_div64_32(__low, __upper, __low, __radix); \
+ \
+ __quot = __high; \
+ __quot = __quot << 32 | __low; \
+ (*n) = __quot; \
+ __mod; \
+})
+
+#endif /* BITS_PER_LONG == 32 */
+
+#include <asm-generic/div64.h>
#endif /* __ASM_DIV64_H */
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH 3/4] MIPS: Reinstate platform `__div64_32' handler
2021-04-20 2:50 ` [PATCH 3/4] MIPS: Reinstate platform `__div64_32' handler Maciej W. Rozycki
@ 2021-04-22 18:36 ` Guenter Roeck
2021-04-22 20:43 ` Maciej W. Rozycki
0 siblings, 1 reply; 3+ messages in thread
From: Guenter Roeck @ 2021-04-22 18:36 UTC (permalink / raw)
To: Maciej W. Rozycki
Cc: Arnd Bergmann, Thomas Bogendoerfer, Huacai Chen, Huacai Chen,
Jiaxun Yang, linux-arch, linux-mips, linux-kernel, stable
On Tue, Apr 20, 2021 at 04:50:40AM +0200, Maciej W. Rozycki wrote:
> Our current MIPS platform `__div64_32' handler is inactive, because it
> is incorrectly only enabled for 64-bit configurations, for which generic
> `do_div' code does not call it anyway.
>
> The handler is not suitable for being called from there though as it
> only calculates 32 bits of the quotient under the assumption the 64-bit
> divident has been suitably reduced. Code for such reduction used to be
> there, however it has been incorrectly removed with commit c21004cd5b4c
> ("MIPS: Rewrite <asm/div64.h> to work with gcc 4.4.0."), which should
> have only updated an obsoleted constraint for an inline asm involving
> $hi and $lo register outputs, while possibly wiring the original MIPS
> variant of the `do_div' macro as `__div64_32' handler for the generic
> `do_div' implementation
>
> Correct the handler as follows then:
>
> - Revert most of the commit referred, however retaining the current
> formatting, except for the final two instructions of the inline asm
> sequence, which the original commit missed. Omit the original 64-bit
> parts though.
>
> - Rename the original `do_div' macro to `__div64_32'. Use the combined
> `x' constraint referring to the MD accumulator as a whole, replacing
> the original individual `h' and `l' constraints used for $hi and $lo
> registers respectively, of which `h' has been obsoleted with GCC 4.4.
> Update surrounding code accordingly.
>
> We have since removed support for GCC versions before 4.9, so no need
> for a special arrangement here; GCC has supported the `x' constraint
> since forever anyway, or at least going back to 1991.
>
> - Rename the `__base' local variable in `__div64_32' to `__radix' to
> avoid a conflict with a local variable in `do_div'.
>
> - Actually enable this code for 32-bit rather than 64-bit configurations
> by qualifying it with BITS_PER_LONG being 32 instead of 64. Include
> <asm/bitsperlong.h> for this macro rather than <linux/types.h> as we
> don't need anything else.
>
> - Finally include <asm-generic/div64.h> last rather than first.
>
> This has passed correctness verification with test_div64 and reduced the
> module's average execution time down to 1.0668s and 0.2629s from 2.1529s
> and 0.5647s respectively for an R3400 CPU @40MHz and a 5Kc CPU @160MHz.
> For a reference 64-bit `do_div' code where we have the DDIVU instruction
> available to do the whole calculation right away averages at 0.0660s for
> the latter CPU.
>
This patch results in:
arch/mips/mti-malta/malta-time.c: In function 'plat_time_init':
./arch/mips/include/asm/div64.h:76:3: error: inconsistent operand constraints in an 'asm'
and similar errors when trying to compile malta_qemu_32r6_defconfig.
I tried with gcc 8.3.0, 8.4.0, 9.3.0, and 10.3.0.
Does this need some additional new compile flags ?
Thanks,
Guenter
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH 3/4] MIPS: Reinstate platform `__div64_32' handler
2021-04-22 18:36 ` Guenter Roeck
@ 2021-04-22 20:43 ` Maciej W. Rozycki
0 siblings, 0 replies; 3+ messages in thread
From: Maciej W. Rozycki @ 2021-04-22 20:43 UTC (permalink / raw)
To: Guenter Roeck
Cc: Arnd Bergmann, Thomas Bogendoerfer, Huacai Chen, Huacai Chen,
Jiaxun Yang, linux-arch, linux-mips, linux-kernel, stable
On Thu, 22 Apr 2021, Guenter Roeck wrote:
> This patch results in:
>
> arch/mips/mti-malta/malta-time.c: In function 'plat_time_init':
> ./arch/mips/include/asm/div64.h:76:3: error: inconsistent operand constraints in an 'asm'
>
> and similar errors when trying to compile malta_qemu_32r6_defconfig.
Thanks for the heads-up, however the 0-DAY bot has caught this issue
already last night and I would have addressed it earlier on if not for a
failure of my Malta board :( which disrupted my verification.
> I tried with gcc 8.3.0, 8.4.0, 9.3.0, and 10.3.0.
>
> Does this need some additional new compile flags ?
MIPSr6 doesn't have the original division instruction along with the MD
accumulator registers anymore, and consequently GCC cannot fit the
constraint requested.
We don't need that asm however. Maybe we didn't with GCC 2.95 either,
but I suspect there was something to it. Anyway I have just posted a fix.
Maciej
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2021-04-22 20:43 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <alpine.DEB.2.21.2104200044060.44318@angie.orcam.me.uk>
2021-04-20 2:50 ` [PATCH 3/4] MIPS: Reinstate platform `__div64_32' handler Maciej W. Rozycki
2021-04-22 18:36 ` Guenter Roeck
2021-04-22 20:43 ` Maciej W. Rozycki
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).