linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Maciej W. Rozycki" <macro@orcam.me.uk>
To: Arnd Bergmann <arnd@arndb.de>,
	Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Huacai Chen <chenhuacai@kernel.org>,
	Huacai Chen <chenhuacai@loongson.cn>,
	Jiaxun Yang <jiaxun.yang@flygoat.com>,
	linux-arch@vger.kernel.org, linux-mips@vger.kernel.org,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: [PATCH 3/4] MIPS: Reinstate platform `__div64_32' handler
Date: Tue, 20 Apr 2021 04:50:40 +0200 (CEST)	[thread overview]
Message-ID: <alpine.DEB.2.21.2104200212500.44318@angie.orcam.me.uk> (raw)
In-Reply-To: <alpine.DEB.2.21.2104200044060.44318@angie.orcam.me.uk>

Our current MIPS platform `__div64_32' handler is inactive, because it 
is incorrectly only enabled for 64-bit configurations, for which generic 
`do_div' code does not call it anyway.

The handler is not suitable for being called from there though as it 
only calculates 32 bits of the quotient under the assumption the 64-bit 
divident has been suitably reduced.  Code for such reduction used to be 
there, however it has been incorrectly removed with commit c21004cd5b4c 
("MIPS: Rewrite <asm/div64.h> to work with gcc 4.4.0."), which should 
have only updated an obsoleted constraint for an inline asm involving 
$hi and $lo register outputs, while possibly wiring the original MIPS 
variant of the `do_div' macro as `__div64_32' handler for the generic 
`do_div' implementation

Correct the handler as follows then:

- Revert most of the commit referred, however retaining the current 
  formatting, except for the final two instructions of the inline asm 
  sequence, which the original commit missed.  Omit the original 64-bit 
  parts though.

- Rename the original `do_div' macro to `__div64_32'.  Use the combined 
  `x' constraint referring to the MD accumulator as a whole, replacing 
  the original individual `h' and `l' constraints used for $hi and $lo 
  registers respectively, of which `h' has been obsoleted with GCC 4.4. 
  Update surrounding code accordingly.

  We have since removed support for GCC versions before 4.9, so no need 
  for a special arrangement here; GCC has supported the `x' constraint 
  since forever anyway, or at least going back to 1991.

- Rename the `__base' local variable in `__div64_32' to `__radix' to 
  avoid a conflict with a local variable in `do_div'.

- Actually enable this code for 32-bit rather than 64-bit configurations
  by qualifying it with BITS_PER_LONG being 32 instead of 64.  Include 
  <asm/bitsperlong.h> for this macro rather than <linux/types.h> as we 
  don't need anything else.

- Finally include <asm-generic/div64.h> last rather than first.

This has passed correctness verification with test_div64 and reduced the 
module's average execution time down to 1.0668s and 0.2629s from 2.1529s 
and 0.5647s respectively for an R3400 CPU @40MHz and a 5Kc CPU @160MHz.  
For a reference 64-bit `do_div' code where we have the DDIVU instruction 
available to do the whole calculation right away averages at 0.0660s for 
the latter CPU.

Reported-by: Huacai Chen <chenhuacai@kernel.org>
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Fixes: c21004cd5b4c ("MIPS: Rewrite <asm/div64.h> to work with gcc 4.4.0.")
Cc: stable@vger.kernel.org # v2.6.30+
---
Our handcrafted handler seems to run at ~25% of the performance of the 
64-bit hardware instruction; not too bad I would say.  Though there's 
likely some overhead from surrounding code that interferes with the 
figures.

Then there are a couple of `checkpatch.pl' nits about trailing whitespace 
in inline asm, which however makes it more readable.  So the change stays 
as it is.
---
 arch/mips/include/asm/div64.h |   57 ++++++++++++++++++++++++++++++------------
 1 file changed, 41 insertions(+), 16 deletions(-)

linux-mips-div64-generic-fix.diff
Index: linux-3maxp-div64/arch/mips/include/asm/div64.h
===================================================================
--- linux-3maxp-div64.orig/arch/mips/include/asm/div64.h
+++ linux-3maxp-div64/arch/mips/include/asm/div64.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2000, 2004  Maciej W. Rozycki
+ * Copyright (C) 2000, 2004, 2021  Maciej W. Rozycki
  * Copyright (C) 2003, 07 Ralf Baechle (ralf@linux-mips.org)
  *
  * This file is subject to the terms and conditions of the GNU General Public
@@ -9,25 +9,18 @@
 #ifndef __ASM_DIV64_H
 #define __ASM_DIV64_H
 
-#include <asm-generic/div64.h>
-
-#if BITS_PER_LONG == 64
+#include <asm/bitsperlong.h>
 
-#include <linux/types.h>
+#if BITS_PER_LONG == 32
 
 /*
  * No traps on overflows for any of these...
  */
 
-#define __div64_32(n, base)						\
-({									\
+#define do_div64_32(res, high, low, base) ({				\
 	unsigned long __cf, __tmp, __tmp2, __i;				\
 	unsigned long __quot32, __mod32;				\
-	unsigned long __high, __low;					\
-	unsigned long long __n;						\
 									\
-	__high = *__n >> 32;						\
-	__low = __n;							\
 	__asm__(							\
 	"	.set	push					\n"	\
 	"	.set	noat					\n"	\
@@ -51,18 +44,50 @@
 	"	subu	%0, %0, %z6				\n"	\
 	"	addiu	%2, %2, 1				\n"	\
 	"3:							\n"	\
-	"	bnez	%4, 0b\n\t"					\
-	"	 srl	%5, %1, 0x1f\n\t"				\
+	"	bnez	%4, 0b					\n"	\
+	"	 srl	%5, %1, 0x1f				\n"	\
 	"	.set	pop"						\
 	: "=&r" (__mod32), "=&r" (__tmp),				\
 	  "=&r" (__quot32), "=&r" (__cf),				\
 	  "=&r" (__i), "=&r" (__tmp2)					\
-	: "Jr" (base), "0" (__high), "1" (__low));			\
+	: "Jr" (base), "0" (high), "1" (low));				\
 									\
-	(__n) = __quot32;						\
+	(res) = __quot32;						\
 	__mod32;							\
 })
 
-#endif /* BITS_PER_LONG == 64 */
+#define __div64_32(n, base) ({						\
+	unsigned long __upper, __low, __high, __radix;			\
+	unsigned long long __modquot;					\
+	unsigned long long __quot;					\
+	unsigned long long __div;					\
+	unsigned long __mod;						\
+									\
+	__div = (*n);							\
+	__radix = (base);						\
+									\
+	__high = __div >> 32;						\
+	__low = __div;							\
+	__upper = __high;						\
+									\
+	if (__high) {							\
+		__asm__("divu	$0, %z1, %z2"				\
+		: "=x" (__modquot)					\
+		: "Jr" (__high), "Jr" (__radix));			\
+		__upper = __modquot >> 32;				\
+		__high = __modquot;					\
+	}								\
+									\
+	__mod = do_div64_32(__low, __upper, __low, __radix);		\
+									\
+	__quot = __high;						\
+	__quot = __quot << 32 | __low;					\
+	(*n) = __quot;							\
+	__mod;								\
+})
+
+#endif /* BITS_PER_LONG == 32 */
+
+#include <asm-generic/div64.h>
 
 #endif /* __ASM_DIV64_H */

  parent reply	other threads:[~2021-04-20  2:50 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-20  2:50 [PATCH 0/4] Reinstate and improve MIPS `do_div' implementation Maciej W. Rozycki
2021-04-20  2:50 ` [PATCH 1/4] lib/math: Add a `do_div' test module Maciej W. Rozycki
2021-04-20  2:50 ` [PATCH 2/4] div64: Correct inline documentation for `do_div' Maciej W. Rozycki
2021-04-20  2:50 ` Maciej W. Rozycki [this message]
2021-04-22 18:36   ` [PATCH 3/4] MIPS: Reinstate platform `__div64_32' handler Guenter Roeck
2021-04-22 20:43     ` Maciej W. Rozycki
2021-04-20  2:50 ` [PATCH 4/4] MIPS: Avoid DIVU in `__div64_32' is result would be zero Maciej W. Rozycki
2021-04-21 16:05   ` H. Nikolaus Schaller
2021-04-21 16:16     ` Maciej W. Rozycki
2021-04-22  7:56       ` Thomas Bogendoerfer
2021-04-22  9:12         ` Maciej W. Rozycki
2021-04-22 11:08           ` Thomas Bogendoerfer
2021-04-22 20:47             ` Maciej W. Rozycki
2021-04-27 12:16           ` Maciej W. Rozycki
2021-04-22 11:17   ` Andreas Schwab
2021-04-21 12:01 ` [PATCH 0/4] Reinstate and improve MIPS `do_div' implementation Thomas Bogendoerfer
2021-04-21 13:12   ` Maciej W. Rozycki
2021-04-21 16:00 ` H. Nikolaus Schaller
2021-04-21 19:04   ` Maciej W. Rozycki
2021-04-22  5:53     ` H. Nikolaus Schaller
2021-04-22 13:39       ` Jiaxun Yang
2021-04-22 15:58         ` Maciej W. Rozycki
2021-04-22 16:00         ` H. Nikolaus Schaller
2021-04-22 16:55           ` Maciej W. Rozycki
2021-04-22 17:06             ` H. Nikolaus Schaller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.21.2104200212500.44318@angie.orcam.me.uk \
    --to=macro@orcam.me.uk \
    --cc=arnd@arndb.de \
    --cc=chenhuacai@kernel.org \
    --cc=chenhuacai@loongson.cn \
    --cc=jiaxun.yang@flygoat.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=tsbogend@alpha.franken.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).