All of lore.kernel.org
 help / color / mirror / Atom feed
From: "George Spelvin" <linux@horizon.com>
To: geert@linux-m68k.org, gerg@linux-m68k.org,
	linux-kernel@vger.kernel.org, linux-m68k@vger.kernel.org,
	linux@sciencehorizons.net
Cc: alistair.francis@xilinx.com, michal.simek@xilinx.com,
	phdm@macq.eu, schwab@linux-m68k.org,
	uclinux-h8-devel@lists.sourceforge.jp,
	ysato@users.sourceforge.jp
Subject: [PATCH v2 08/10] m68k: Add <asm/hash.h>
Date: 26 May 2016 13:19:29 -0400	[thread overview]
Message-ID: <20160526171929.24497.qmail@ns.sciencehorizons.net> (raw)
In-Reply-To: <20160525073455.5709.qmail@ns.sciencehorizons.net>

This provides a multiply by constant GOLDEN_RATIO_32 = 0x61C88647
for the original mc68000, which lacks a 32x32-bit multiply instruction.

Yes, the amount of optimization effort put in is excessive. :-)

Shift-add chain found by Yevgen Voronenko's Hcub algorithm at
http://spiral.ece.cmu.edu/mcm/gen.html

Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Philippe De Muyter <phdm@macq.eu>
Cc: linux-m68k@lists.linux-m68k.org
---
 arch/m68k/Kconfig.cpu        |  1 +
 arch/m68k/include/asm/hash.h | 59 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 60 insertions(+)
 create mode 100644 arch/m68k/include/asm/hash.h

diff --git a/arch/m68k/Kconfig.cpu b/arch/m68k/Kconfig.cpu
index 0dfcf128..bf3de464 100644
--- a/arch/m68k/Kconfig.cpu
+++ b/arch/m68k/Kconfig.cpu
@@ -40,6 +40,7 @@ config M68000
 	select CPU_HAS_NO_MULDIV64
 	select CPU_HAS_NO_UNALIGNED
 	select GENERIC_CSUM
+	select HAVE_ARCH_HASH
 	help
 	  The Freescale (was Motorola) 68000 CPU is the first generation of
 	  the well known M68K family of processors. The CPU core as well as
diff --git a/arch/m68k/include/asm/hash.h b/arch/m68k/include/asm/hash.h
new file mode 100644
index 00000000..6407af84
--- /dev/null
+++ b/arch/m68k/include/asm/hash.h
@@ -0,0 +1,59 @@
+#ifndef _ASM_HASH_H
+#define _ASM_HASH_H
+
+/*
+ * If CONFIG_M68000=y (original mc68000/010), this file is #included
+ * to work around the lack of a MULU.L instruction.
+ */
+
+#define HAVE_ARCH__HASH_32 1
+/*
+ * While it would be legal to substitute a different hash operation
+ * entirely, let's keep it simple and just use an optimized multiply
+ * by GOLDEN_RATIO_32 = 0x61C88647.
+ *
+ * The best way to do that appears to be to multiply by 0x8647 with
+ * shifts and adds, and use mulu.w to multiply the high half by 0x61C8.
+ *
+ * Because the 68000 has multi-cycle shifts, this addition chain is
+ * chosen to minimise the shift distances.
+ *
+ * Despite every attempt to spoon-feed it simple operations, GCC
+ * 6.1.1 doggedly insists on doing annoying things like converting
+ * "lsl.l #2,<reg>" (12 cycles) to two adds (8+8 cycles).
+ *
+ * It also likes to notice two shifts in a row, like "a = x << 2" and
+ * "a <<= 7", and convert that to "a = x << 9".  But shifts longer
+ * than 8 bits are extra-slow on m68k, so that's a lose.
+ *
+ * Since the 68000 is a very simple in-order processor with no
+ * instruction scheduling effects on execution time, we can safely
+ * take it out of GCC's hands and write one big asm() block.
+ *
+ * Without calling overhead, this operation is 30 bytes (14 instructions
+ * plus one immediate constant) and 166 cycles.
+ *
+ * (Because %2 is fetched twice, it can't be postincrement, and thus it
+ * can't be a fully general "g" or "m".  Register is preferred, but
+ * offsettable memory or immediate will work.)
+ */
+static inline u32 __attribute_const__ __hash_32(u32 x)
+{
+	u32 a, b;
+
+	asm(   "move.l %2,%0"	/* a = x * 0x0001 */
+	"\n	lsl.l #2,%0"	/* a = x * 0x0004 */
+	"\n	move.l %0,%1"
+	"\n	lsl.l #7,%0"	/* a = x * 0x0200 */
+	"\n	add.l %2,%0"	/* a = x * 0x0201 */
+	"\n	add.l %0,%1"	/* b = x * 0x0205 */
+	"\n	add.l %0,%0"	/* a = x * 0x0402 */
+	"\n	add.l %0,%1"	/* b = x * 0x0607 */
+	"\n	lsl.l #5,%0"	/* a = x * 0x8040 */
+	: "=&d,d" (a), "=&r,r" (b)
+	: "r,roi?" (x));	/* a+b = x*0x8647 */
+
+	return ((u16)(x*0x61c8) << 16) + a + b;
+}
+
+#endif	/* _ASM_HASH_H */
-- 
2.8.1

WARNING: multiple messages have this Message-ID (diff)
From: "George Spelvin" <linux@horizon.com>
To: geert@linux-m68k.org, gerg@linux-m68k.org,
	linux-kernel@vger.kernel.org, linux-m68k@lists.linux-m68k.org,
	linux@sciencehorizons.net
Cc: alistair.francis@xilinx.com, michal.simek@xilinx.com,
	phdm@macq.eu, schwab@linux-m68k.org,
	uclinux-h8-devel@lists.sourceforge.jp,
	ysato@users.sourceforge.jp
Subject: [PATCH v2 08/10] m68k: Add <asm/hash.h>
Date: 26 May 2016 13:19:29 -0400	[thread overview]
Message-ID: <20160526171929.24497.qmail@ns.sciencehorizons.net> (raw)
In-Reply-To: <20160525073455.5709.qmail@ns.sciencehorizons.net>

This provides a multiply by constant GOLDEN_RATIO_32 = 0x61C88647
for the original mc68000, which lacks a 32x32-bit multiply instruction.

Yes, the amount of optimization effort put in is excessive. :-)

Shift-add chain found by Yevgen Voronenko's Hcub algorithm at
http://spiral.ece.cmu.edu/mcm/gen.html

Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Philippe De Muyter <phdm@macq.eu>
Cc: linux-m68k@lists.linux-m68k.org
---
 arch/m68k/Kconfig.cpu        |  1 +
 arch/m68k/include/asm/hash.h | 59 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 60 insertions(+)
 create mode 100644 arch/m68k/include/asm/hash.h

diff --git a/arch/m68k/Kconfig.cpu b/arch/m68k/Kconfig.cpu
index 0dfcf128..bf3de464 100644
--- a/arch/m68k/Kconfig.cpu
+++ b/arch/m68k/Kconfig.cpu
@@ -40,6 +40,7 @@ config M68000
 	select CPU_HAS_NO_MULDIV64
 	select CPU_HAS_NO_UNALIGNED
 	select GENERIC_CSUM
+	select HAVE_ARCH_HASH
 	help
 	  The Freescale (was Motorola) 68000 CPU is the first generation of
 	  the well known M68K family of processors. The CPU core as well as
diff --git a/arch/m68k/include/asm/hash.h b/arch/m68k/include/asm/hash.h
new file mode 100644
index 00000000..6407af84
--- /dev/null
+++ b/arch/m68k/include/asm/hash.h
@@ -0,0 +1,59 @@
+#ifndef _ASM_HASH_H
+#define _ASM_HASH_H
+
+/*
+ * If CONFIG_M68000=y (original mc68000/010), this file is #included
+ * to work around the lack of a MULU.L instruction.
+ */
+
+#define HAVE_ARCH__HASH_32 1
+/*
+ * While it would be legal to substitute a different hash operation
+ * entirely, let's keep it simple and just use an optimized multiply
+ * by GOLDEN_RATIO_32 = 0x61C88647.
+ *
+ * The best way to do that appears to be to multiply by 0x8647 with
+ * shifts and adds, and use mulu.w to multiply the high half by 0x61C8.
+ *
+ * Because the 68000 has multi-cycle shifts, this addition chain is
+ * chosen to minimise the shift distances.
+ *
+ * Despite every attempt to spoon-feed it simple operations, GCC
+ * 6.1.1 doggedly insists on doing annoying things like converting
+ * "lsl.l #2,<reg>" (12 cycles) to two adds (8+8 cycles).
+ *
+ * It also likes to notice two shifts in a row, like "a = x << 2" and
+ * "a <<= 7", and convert that to "a = x << 9".  But shifts longer
+ * than 8 bits are extra-slow on m68k, so that's a lose.
+ *
+ * Since the 68000 is a very simple in-order processor with no
+ * instruction scheduling effects on execution time, we can safely
+ * take it out of GCC's hands and write one big asm() block.
+ *
+ * Without calling overhead, this operation is 30 bytes (14 instructions
+ * plus one immediate constant) and 166 cycles.
+ *
+ * (Because %2 is fetched twice, it can't be postincrement, and thus it
+ * can't be a fully general "g" or "m".  Register is preferred, but
+ * offsettable memory or immediate will work.)
+ */
+static inline u32 __attribute_const__ __hash_32(u32 x)
+{
+	u32 a, b;
+
+	asm(   "move.l %2,%0"	/* a = x * 0x0001 */
+	"\n	lsl.l #2,%0"	/* a = x * 0x0004 */
+	"\n	move.l %0,%1"
+	"\n	lsl.l #7,%0"	/* a = x * 0x0200 */
+	"\n	add.l %2,%0"	/* a = x * 0x0201 */
+	"\n	add.l %0,%1"	/* b = x * 0x0205 */
+	"\n	add.l %0,%0"	/* a = x * 0x0402 */
+	"\n	add.l %0,%1"	/* b = x * 0x0607 */
+	"\n	lsl.l #5,%0"	/* a = x * 0x8040 */
+	: "=&d,d" (a), "=&r,r" (b)
+	: "r,roi?" (x));	/* a+b = x*0x8647 */
+
+	return ((u16)(x*0x61c8) << 16) + a + b;
+}
+
+#endif	/* _ASM_HASH_H */
-- 
2.8.1

  parent reply	other threads:[~2016-05-27 15:06 UTC|newest]

Thread overview: 84+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CA+55aFxPSW+84KfQ1N_WmND-wtvgj2zQm8nFPkRcc+gyU=uing@mail.gmail.com>
2016-05-25  7:20 ` [PATCH 00/10] String hash improvements George Spelvin
2016-05-25  7:20   ` George Spelvin
2016-05-25  8:00   ` Geert Uytterhoeven
2016-05-25  8:11     ` George Spelvin
2016-05-25  8:11       ` George Spelvin
2016-05-25  8:50       ` Geert Uytterhoeven
2016-05-25  8:50       ` Geert Uytterhoeven
2016-05-25  9:07         ` George Spelvin
2016-05-25  9:07           ` George Spelvin
2016-05-25  8:00   ` Geert Uytterhoeven
2016-05-25 16:08   ` Linus Torvalds
2016-05-25 16:08   ` Linus Torvalds
2016-05-28 19:57     ` [PATCH v3 " George Spelvin
2016-05-28 19:57       ` [PATCH v3 01/10] Pull out string hash to <linux/stringhash.h> George Spelvin
2016-05-28 19:57       ` [PATCH v3 02/10] fs/namei.c: Add hashlen_string() function George Spelvin
2016-05-28 19:57       ` [PATCH v3 03/10] <linux/sunrpc/svcauth.h>: Define hash_str() in terms of hashlen_string() George Spelvin
2016-05-28 19:57       ` [PATCH v3 04/10] Change hash_64() return value to 32 bits George Spelvin
2016-05-28 19:57       ` [PATCH v3 05/10] Eliminate bad hash multipliers from hash_32() and hash_64() George Spelvin
2016-05-28 19:57       ` [PATCH v3 06/10] fs/namei.c: Improve dcache hash function George Spelvin
2016-05-30 15:11         ` Peter Zijlstra
2016-05-30 16:06           ` George Spelvin
2016-05-30 16:27             ` Peter Zijlstra
2016-05-30 18:10               ` George Spelvin
2016-06-02  1:18                 ` Linus Torvalds
2016-06-02  2:31                   ` George Spelvin
2016-06-02 16:35                     ` Linus Torvalds
2016-06-02 18:23                       ` George Spelvin
2016-05-28 19:57       ` [PATCH v3 07/10] <linux/hash.h>: Add support for architecture-specific functions George Spelvin
2016-05-28 19:57       ` George Spelvin
2016-05-29  7:57         ` Geert Uytterhoeven
2016-05-29  7:57           ` Geert Uytterhoeven
2016-05-28 19:57       ` [PATCH v3 08/10] m68k: Add <asm/hash.h> George Spelvin
2016-05-28 19:57       ` George Spelvin
2016-05-28 19:57       ` [PATCH v3 09/10] microblaze: " George Spelvin
2016-05-28 19:57       ` [PATCH v3 10/10] h8300: " George Spelvin
2016-05-28 20:47       ` [PATCH v3 00/10] String hash improvements Linus Torvalds
2016-05-28 20:54         ` George Spelvin
2016-06-02 22:59     ` [PATCH " Fubo Chen
2016-06-02 22:59       ` Fubo Chen
2016-05-26 17:09   ` [PATCH v2 " George Spelvin
2016-05-26 17:09   ` George Spelvin
2016-05-25  7:21 ` [PATCH 01/10] Pull out string hash to <linux/stringhash.h> George Spelvin
2016-05-25  7:22 ` [PATCH 02/10] fs/namei.c: Add hash_string() function George Spelvin
2016-05-25  7:26 ` [PATCH 03/10] <linux/sunrpc/svcauth.h>: Define hash_str() in terms of hash_string() George Spelvin
2016-05-26 18:39   ` [PATCH RESEND " George Spelvin
2016-05-26 18:45     ` J. Bruce Fields
2016-05-25  7:28 ` [PATCH 04/10] Change hash_64() return value to 32 bits George Spelvin
2016-05-25  7:29 ` [PATCH 05/10] Eliminate bad hash multipliers from hash_32() and hash_64() George Spelvin
2016-05-25  7:31 ` [PATCH 06/10] fs/namei.c: Improve dcache hash function George Spelvin
2016-05-25  7:33 ` [PATCH 07/10] <linux/hash.h>: Add support for architecture-specific functions George Spelvin
2016-05-25  7:33   ` George Spelvin
2016-05-26 17:16   ` [PATCH v2 " George Spelvin
2016-05-26 17:16     ` George Spelvin
2016-05-25  7:34 ` [PATCH 08/10] m68k: Add <asm/archhash.h> George Spelvin
2016-05-25  7:34 ` George Spelvin
2016-05-25  7:34   ` George Spelvin
2016-05-25  8:07   ` Geert Uytterhoeven
2016-05-25  8:07     ` Geert Uytterhoeven
2016-05-25  8:19     ` George Spelvin
2016-05-25  8:19     ` George Spelvin
2016-05-25  8:24     ` [PATCH 08v2/10] " George Spelvin
2016-05-25  8:24       ` George Spelvin
2016-05-25  8:48       ` Geert Uytterhoeven
2016-05-25  8:48       ` Geert Uytterhoeven
2016-05-25  8:56   ` [PATCH 08/10] " Philippe De Muyter
2016-05-25  8:56     ` Philippe De Muyter
2016-05-25  9:14     ` George Spelvin
2016-05-25  9:14       ` George Spelvin
2016-05-25  9:31       ` Andreas Schwab
2016-05-25  9:31       ` Andreas Schwab
2016-05-25  9:51       ` Philippe De Muyter
2016-05-25  9:51       ` Philippe De Muyter
2016-05-25 13:24   ` Philippe De Muyter
2016-05-25 13:24     ` Philippe De Muyter
2016-05-25 13:42     ` George Spelvin
2016-05-25 13:42       ` George Spelvin
2016-05-26 17:19   ` George Spelvin [this message]
2016-05-26 17:19     ` [PATCH v2 08/10] m68k: Add <asm/hash.h> George Spelvin
2016-05-25  7:37 ` [PATCH 09/10] microblaze: Add <asm/archhash.h> George Spelvin
2016-05-26 17:21   ` [PATCH v2 09/10] microblaze: Add <asm/hash.h> George Spelvin
2016-05-26 17:21   ` George Spelvin
2016-05-25  7:38 ` [PATCH 10/10] h8300: Add <asm/archhash.h> George Spelvin
2016-05-26 17:23   ` [PATCH v2 10/10] h8300: Add <asm/hash.h> George Spelvin
2016-05-26 17:23     ` George Spelvin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160526171929.24497.qmail@ns.sciencehorizons.net \
    --to=linux@horizon.com \
    --cc=alistair.francis@xilinx.com \
    --cc=geert@linux-m68k.org \
    --cc=gerg@linux-m68k.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-m68k@vger.kernel.org \
    --cc=linux@sciencehorizons.net \
    --cc=michal.simek@xilinx.com \
    --cc=phdm@macq.eu \
    --cc=schwab@linux-m68k.org \
    --cc=uclinux-h8-devel@lists.sourceforge.jp \
    --cc=ysato@users.sourceforge.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.