From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751923AbcE1T6N (ORCPT ); Sat, 28 May 2016 15:58:13 -0400
Received: from science.sciencehorizons.net ([71.41.210.147]:15078 "HELO
	ns.sciencehorizons.net" rhost-flags-OK-FAIL-OK-OK) by vger.kernel.org
	with SMTP id S1751754AbcE1T6H (ORCPT ); Sat, 28 May 2016 15:58:07 -0400
From: George Spelvin
To: Linus Torvalds, lkml
Cc: "J . Bruce Fields", George Spelvin, Geert Uytterhoeven, Greg Ungerer,
	Andreas Schwab, Philippe De Muyter, linux-m68k@vger.kernel.org
Subject: [PATCH v3 08/10] m68k: Add <asm/hash.h>
Date: Sat, 28 May 2016 15:57:21 -0400
Message-Id: <1464465443-25305-9-git-send-email-linux@sciencehorizons.net>
X-Mailer: git-send-email 2.8.1
In-Reply-To: <1464465443-25305-1-git-send-email-linux@sciencehorizons.net>
References: <1464465443-25305-1-git-send-email-linux@sciencehorizons.net>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

This provides a multiply by constant GOLDEN_RATIO_32 = 0x61C88647
for the original mc68000, which lacks a 32x32-bit multiply instruction.

Yes, the amount of optimization effort put in is excessive.
:-)

Shift-add chain found by Yevgen Voronenko's Hcub algorithm at
http://spiral.ece.cmu.edu/mcm/gen.html

Signed-off-by: George Spelvin
Cc: Geert Uytterhoeven
Cc: Greg Ungerer
Cc: Andreas Schwab
Cc: Philippe De Muyter
Cc: linux-m68k@lists.linux-m68k.org
---
 arch/m68k/Kconfig.cpu        |  1 +
 arch/m68k/include/asm/hash.h | 59 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 60 insertions(+)
 create mode 100644 arch/m68k/include/asm/hash.h

diff --git a/arch/m68k/Kconfig.cpu b/arch/m68k/Kconfig.cpu
index 0dfcf128..bf3de464 100644
--- a/arch/m68k/Kconfig.cpu
+++ b/arch/m68k/Kconfig.cpu
@@ -40,6 +40,7 @@ config M68000
 	select CPU_HAS_NO_MULDIV64
 	select CPU_HAS_NO_UNALIGNED
 	select GENERIC_CSUM
+	select HAVE_ARCH_HASH
 	help
 	  The Freescale (was Motorola) 68000 CPU is the first generation of
 	  the well known M68K family of processors. The CPU core as well as
diff --git a/arch/m68k/include/asm/hash.h b/arch/m68k/include/asm/hash.h
new file mode 100644
index 00000000..6407af84
--- /dev/null
+++ b/arch/m68k/include/asm/hash.h
@@ -0,0 +1,59 @@
+#ifndef _ASM_HASH_H
+#define _ASM_HASH_H
+
+/*
+ * If CONFIG_M68000=y (original mc68000/010), this file is #included
+ * to work around the lack of a MULU.L instruction.
+ */
+
+#define HAVE_ARCH__HASH_32 1
+/*
+ * While it would be legal to substitute a different hash operation
+ * entirely, let's keep it simple and just use an optimized multiply
+ * by GOLDEN_RATIO_32 = 0x61C88647.
+ *
+ * The best way to do that appears to be to multiply by 0x8647 with
+ * shifts and adds, and use mulu.w to multiply the high half by 0x61C8.
+ *
+ * Because the 68000 has multi-cycle shifts, this addition chain is
+ * chosen to minimise the shift distances.
+ *
+ * Despite every attempt to spoon-feed it simple operations, GCC
+ * 6.1.1 doggedly insists on doing annoying things like converting
+ * "lsl.l #2," (12 cycles) to two adds (8+8 cycles).
+ *
+ * It also likes to notice two shifts in a row, like "a = x << 2" and
+ * "a <<= 7", and convert that to "a = x << 9".  But shifts longer
+ * than 8 bits are extra-slow on m68k, so that's a lose.
+ *
+ * Since the 68000 is a very simple in-order processor with no
+ * instruction scheduling effects on execution time, we can safely
+ * take it out of GCC's hands and write one big asm() block.
+ *
+ * Without calling overhead, this operation is 30 bytes (14 instructions
+ * plus one immediate constant) and 166 cycles.
+ *
+ * (Because %2 is fetched twice, it can't be postincrement, and thus it
+ * can't be a fully general "g" or "m".  Register is preferred, but
+ * offsettable memory or immediate will work.)
+ */
+static inline u32 __attribute_const__ __hash_32(u32 x)
+{
+	u32 a, b;
+
+	asm(   "move.l %2,%0"	/* a = x * 0x0001 */
+	    "\n	lsl.l #2,%0"	/* a = x * 0x0004 */
+	    "\n	move.l %0,%1"
+	    "\n	lsl.l #7,%0"	/* a = x * 0x0200 */
+	    "\n	add.l %2,%0"	/* a = x * 0x0201 */
+	    "\n	add.l %0,%1"	/* b = x * 0x0205 */
+	    "\n	add.l %0,%0"	/* a = x * 0x0402 */
+	    "\n	add.l %0,%1"	/* b = x * 0x0607 */
+	    "\n	lsl.l #5,%0"	/* a = x * 0x8040 */
+	    : "=&d,d" (a), "=&r,r" (b)
+	    : "r,roi?" (x));	/* a+b = x*0x8647 */
+
+	return ((u16)(x*0x61c8) << 16) + a + b;
+}
+
+#endif	/* _ASM_HASH_H */
-- 
2.8.1
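
For anyone who wants to sanity-check the addition chain without an m68k
toolchain, here is a rough portable C model of the asm() block above.
It is only a sketch and not part of the patch: the names
mul_8647_chain(), hash_32_model() and count_mismatches() are made up
for illustration, and the input generator is an arbitrary LCG.  It
mirrors the asm one instruction per statement and compares the result
against a plain 32-bit multiply by GOLDEN_RATIO_32.

```c
#include <stdint.h>

#define GOLDEN_RATIO_32 0x61C88647u

/* Hypothetical model of the asm() block: a + b = x * 0x8647 (mod 2^32). */
static uint32_t mul_8647_chain(uint32_t x)
{
	uint32_t a, b;

	a = x;		/* a = x * 0x0001 */
	a <<= 2;	/* a = x * 0x0004 */
	b = a;		/* b = x * 0x0004 */
	a <<= 7;	/* a = x * 0x0200 */
	a += x;		/* a = x * 0x0201 */
	b += a;		/* b = x * 0x0205 */
	a += a;		/* a = x * 0x0402 */
	b += a;		/* b = x * 0x0607 */
	a <<= 5;	/* a = x * 0x8040 */

	return a + b;	/* x * 0x8647 */
}

/*
 * Hypothetical model of __hash_32(): only the low 16 bits of x * 0x61C8
 * survive the shift into the high half, which is why a 16-bit multiply
 * is enough there.  The rest comes from the shift-add chain.
 */
static uint32_t hash_32_model(uint32_t x)
{
	uint32_t high = (uint32_t)(uint16_t)(x * 0x61c8u) << 16;

	return high + mul_8647_chain(x);
}

/* Compare the model against a full multiply for n pseudo-random inputs. */
static unsigned long count_mismatches(unsigned long n)
{
	uint32_t x = 1;
	unsigned long i, bad = 0;

	for (i = 0; i < n; i++) {
		if (hash_32_model(x) != x * GOLDEN_RATIO_32)
			bad++;
		x = x * 1664525u + 1013904223u;	/* arbitrary LCG step */
	}
	return bad;
}
```

Calling count_mismatches() from a small test harness should return 0
for any n, since 0x8040 + 0x0607 = 0x8647 and
0x61C80000 + 0x8647 = 0x61C88647, all taken modulo 2^32.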