All of lore.kernel.org
 help / color / mirror / Atom feed
From: "George Spelvin" <linux@horizon.com>
To: torvalds@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, linux@horizon.com, tglx@linutronix.de
Subject: Re: [patch 2/7] lib/hashmod: Add modulo based hash mechanism
Date: 29 Apr 2016 19:31:15 -0400	[thread overview]
Message-ID: <20160429233115.8864.qmail@ns.horizon.com> (raw)
In-Reply-To: <CA+55aFxS9zZ8Gpyg9cn=xgkamx-G6dDZ3DTR2B=iQzXYcAhGoQ@mail.gmail.com>

On Fri, Apr 29, 2016 at 9:32 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
wrote:
> For example, that _long_ range of bits set ("7fffffffc" in the middle)
> is effectively just one bit set with a subtraction. And it's *right*
> in that bit area that is supposed to shuffle bits 14-40 to the high bits
> (which is what we actually *use*. So it effectively shuffles none of those
> bits around at all, and if you have a stride of 4096, your'e pretty much
> done for.

Gee, I recall saying something a lot like that.
> 64 bits is just as bad...  0x9e37fffffffc0001 becomes
> 0x7fffffffc0001, which is 2^51 - 2^18 + 1.

After researching it, I think that the "high bits of a multiply" is
in fact a decent way to do such a hash.  Interestingly, for a randomly
chosen odd multiplier A, the high k bits of the w-bit product A*x is a
universal hash function in the cryptographic sense.  See section 2.3 of
http://arxiv.org/abs/1504.06804


One thing I note is that the advice in the comments to choose a prime
number is misquoting Knuth!  Knuth says (vol. 3 section 6.4) the number
should be *relatively* prime to the word size, which for binary computers
simply means odd.

When we have a hardware multiplier, keeping the Hamming weight low is
a waste of time.  When we don't, clever organization can do
better than the very naive addition/subtraction chain in the
current hash_64().

To multiply by the 32-bit constant 1640531527 = 0x61c88647 (which is
the negative of the golden ratio, so has identical distribution
properties) can be done in 6 shifts + adds, with a critical path
length of 7 operations (3 shifts + 4 adds).

#define GOLDEN_RATIO_32 0x61c88647	/* phi^2 = 1-phi */
/* Returns x * GOLDEN_RATIO_32 without a hardware multiplier */
unsigned hash_32(unsigned x)
{
        unsigned y, z;
				/* Path length */
        y = (x << 19) + x;	/* 1 shift + 1 add */
        z = (x << 9) + y;	/* 1 shift + 2 add */
        x = (x << 23) + z;	/* 1 shift + 3 add */
        z = (z << 8) + y;	/* 2 shift + 3 add */
	x = (x << 6) - x;	/* 2 shift + 4 add */
        return (z << 3) + x;	/* 3 shift + 4 add */
}

Finding a similarly efficient chain for the 64-bit golden ratio
0x9E3779B97F4A7C15 = 11400714819323198485
or
0x61C8864680B583EB = 7046029254386353131

is a bit of a challenge, but algorithms are known.

  parent reply	other threads:[~2016-04-29 23:31 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-29  2:57 [patch 2/7] lib/hashmod: Add modulo based hash mechanism George Spelvin
2016-04-29  3:16 ` Linus Torvalds
2016-04-29  4:12   ` George Spelvin
2016-04-29 23:31   ` George Spelvin [this message]
2016-04-30  0:05     ` Linus Torvalds
2016-04-30  0:32       ` George Spelvin
2016-04-30  1:12         ` Linus Torvalds
2016-04-30  3:04           ` George Spelvin
     [not found] <CA+55aFxBWfAHQNAdBbdVr+z8ror4GVteyce3D3=vwDWxhu5KqQ@mail.gmail.com>
2016-04-30 20:52 ` George Spelvin
2016-05-01  8:35   ` Thomas Gleixner
2016-05-01  9:43     ` George Spelvin
2016-05-01 16:51       ` Linus Torvalds
2016-05-14  3:54         ` George Spelvin
2016-05-14 18:35           ` Linus Torvalds
2016-05-02  7:11       ` Thomas Gleixner
  -- strict thread matches above, loose matches on Subject: below --
2016-04-28 16:42 [patch 0/7] futex: Add support for process private hashing Thomas Gleixner
2016-04-28 16:42 ` [patch 2/7] lib/hashmod: Add modulo based hash mechanism Thomas Gleixner
2016-04-28 18:32   ` Linus Torvalds
2016-04-28 23:26     ` Thomas Gleixner
2016-04-29  2:25       ` Linus Torvalds
2016-04-30 13:02         ` Thomas Gleixner
2016-04-30 16:45           ` Eric Dumazet
2016-04-30 17:12             ` Linus Torvalds
2016-04-30 17:37               ` Eric Dumazet
2016-06-12 12:18         ` Sandy Harris
2016-04-29 21:10     ` Linus Torvalds
2016-04-29 23:51       ` Linus Torvalds
2016-04-30  1:34         ` Rik van Riel
2016-05-02  9:39         ` Torvald Riegel
2016-04-30 15:22       ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160429233115.8864.qmail@ns.horizon.com \
    --to=linux@horizon.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.