* [PATCH 0/4] Reinstate and improve MIPS `do_div' implementation
@ 2021-04-20  2:50 Maciej W. Rozycki
From: Maciej W. Rozycki @ 2021-04-20  2:50 UTC
  To: Arnd Bergmann, Thomas Bogendoerfer
  Cc: Huacai Chen, Huacai Chen, Jiaxun Yang, linux-arch, linux-mips,
	linux-kernel

Hi,

 As Huacai has recently discovered, the MIPS backend for `do_div' has been 
broken and inadvertently disabled with commit c21004cd5b4c ("MIPS: Rewrite 
<asm/div64.h> to work with gcc 4.4.0.").  As it is code I originally wrote 
myself, and Huacai had issues bringing it back to life, leading even to a 
request to discard it, I have decided to step in.

 In the end I have fixed the code and measured its performance to be ~100% 
better on average than our generic code.  I have decided it would be worth 
having the test module I have prepared for correctness evaluation as well 
as benchmarking, so I have included it with the series, also so that I can 
refer to the results easily.

 I have included four patches on this occasion: 1/4 is the test 
module, 2/4 is an inline documentation fix/clarification for the `do_div' 
wrapper, 3/4 enables the MIPS `__div64_32' backend and 4/4 adds a small 
performance improvement to it.
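
 For readers unfamiliar with the interface, the contract of `do_div' can
be modelled in portable user-space C roughly as below.  This is only an
illustrative sketch: `do_div_model' is a hypothetical name, and it takes a
pointer where the kernel macro modifies its first argument in place.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical user-space stand-in for the kernel's do_div() macro:
 * divide the 64-bit value at *n by a 32-bit base, store the 64-bit
 * quotient back into *n and return the 32-bit remainder.
 */
static uint32_t do_div_model(uint64_t *n, uint32_t base)
{
	uint32_t rem;

	assert(base != 0);	/* division by zero is the caller's bug */
	rem = (uint32_t)(*n % base);
	*n /= base;
	return rem;
}
```

 A caller holding 1000000007 in a 64-bit variable and dividing by 10 is
left with 100000000 in that variable and gets 7 back as the remainder.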

 I have investigated a fifth change as a potential improvement where I 
replaced the call to `do_div64_32' with a DIVU instruction for cases where 
the high part of the intermediate dividend is zero, but it has turned out 
to regress performance a little, so I have discarded it.
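
 The general shape of a 64-by-32 division built from two narrower steps,
and where the two shortcuts mentioned in this letter would sit, can be
sketched in portable C as follows.  This is an assumption-laden
illustration, not the MIPS code from the series: `div64_32_sketch' is a
hypothetical name, the real backend uses inline assembly, and the second
step below simply uses the compiler's 64-bit division in place of
`do_div64_32'.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch: divide *n by base, leave the quotient in *n
 * and return the remainder, working on 32-bit halves.
 */
static uint32_t div64_32_sketch(uint64_t *n, uint32_t base)
{
	uint32_t high = (uint32_t)(*n >> 32);
	uint32_t low = (uint32_t)*n;
	uint32_t q_high = 0;
	uint32_t q_low;
	uint32_t rem = high;

	assert(base != 0);

	/*
	 * First step: high word by base.  If high < base the quotient
	 * word is zero and the remainder is high itself, so the divide
	 * can be skipped (the idea assumed here for patch 4/4).
	 */
	if (high >= base) {
		q_high = high / base;
		rem = high % base;
	}

	if (rem == 0) {
		/*
		 * High part of the intermediate dividend is zero, so a
		 * plain 32-bit divide suffices (the discarded fifth
		 * change, shown for illustration only).
		 */
		q_low = low / base;
		rem = low % base;
	} else {
		/* rem < base, so this quotient always fits in 32 bits. */
		uint64_t mid = ((uint64_t)rem << 32) | low;

		q_low = (uint32_t)(mid / base);
		rem = (uint32_t)(mid % base);
	}

	*n = ((uint64_t)q_high << 32) | q_low;
	return rem;
}
```

 The extra branch in the second step is the kind of added control flow
that can cost more than the shorter divide saves, which would be
consistent with the small regression reported above.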

 Also a follow-up change might be worth having to reduce the code size and 
place `__div64_32' out of line for CC_OPTIMIZE_FOR_SIZE configurations, 
but I have not fully prepared such a change at this time.  I did use the 
WIP form I have for performance evaluation however; see the figures quoted 
with 4/4.

 These changes have been verified with a DECstation system with an R3400 
MIPS I processor @40MHz and an MTI Malta system with a 5Kc MIPS64 processor 
@160MHz.

 See individual change descriptions and any additional discussions for
further details.

 Questions, comments or concerns?  Otherwise please apply.

  Maciej


Thread overview: 25+ messages
2021-04-20  2:50 [PATCH 0/4] Reinstate and improve MIPS `do_div' implementation Maciej W. Rozycki
2021-04-20  2:50 ` [PATCH 1/4] lib/math: Add a `do_div' test module Maciej W. Rozycki
2021-04-20  2:50 ` [PATCH 2/4] div64: Correct inline documentation for `do_div' Maciej W. Rozycki
2021-04-20  2:50 ` [PATCH 3/4] MIPS: Reinstate platform `__div64_32' handler Maciej W. Rozycki
2021-04-22 18:36   ` Guenter Roeck
2021-04-22 20:43     ` Maciej W. Rozycki
2021-04-20  2:50 ` [PATCH 4/4] MIPS: Avoid DIVU in `__div64_32' is result would be zero Maciej W. Rozycki
2021-04-21 16:05   ` H. Nikolaus Schaller
2021-04-21 16:16     ` Maciej W. Rozycki
2021-04-22  7:56       ` Thomas Bogendoerfer
2021-04-22  9:12         ` Maciej W. Rozycki
2021-04-22 11:08           ` Thomas Bogendoerfer
2021-04-22 20:47             ` Maciej W. Rozycki
2021-04-27 12:16           ` Maciej W. Rozycki
2021-04-22 11:17   ` Andreas Schwab
2021-04-21 12:01 ` [PATCH 0/4] Reinstate and improve MIPS `do_div' implementation Thomas Bogendoerfer
2021-04-21 13:12   ` Maciej W. Rozycki
2021-04-21 16:00 ` H. Nikolaus Schaller
2021-04-21 19:04   ` Maciej W. Rozycki
2021-04-22  5:53     ` H. Nikolaus Schaller
2021-04-22 13:39       ` Jiaxun Yang
2021-04-22 15:58         ` Maciej W. Rozycki
2021-04-22 16:00         ` H. Nikolaus Schaller
2021-04-22 16:55           ` Maciej W. Rozycki
2021-04-22 17:06             ` H. Nikolaus Schaller
