linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: George Spelvin <linux@horizon.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [patch 2/7] lib/hashmod: Add modulo based hash mechanism
Date: Fri, 29 Apr 2016 17:05:59 -0700	[thread overview]
Message-ID: <CA+55aFyEPA5jkUL14xNXqs0Qh991DhT+DQpuQQC=aBHCN5Z=ig@mail.gmail.com> (raw)
In-Reply-To: <20160429233115.8864.qmail@ns.horizon.com>

On Fri, Apr 29, 2016 at 4:31 PM, George Spelvin <linux@horizon.com> wrote:
>
> After researching it, I think that the "high bits of a multiply" is
> in fact a decent way to do such a hash.

Our emails crossed. Yes. My test harness actually likes the
multiplication more than most of the specialized "spread out bits"
versions I've found, but only if the constants are better-chosen than
the ones we have now.

> One thing I note is that the advice in the comments to choose a prime
> number is misquoting Knuth!  Knuth says (vol. 3 section 6.4) the number
> should be *relatively* prime to the word size, which for binary computers
> simply means odd.

At least for my tests, even that seems to actually be a total
non-issue. Yes, odd values *might* be better, but as mentioned in my
crossing email, it doesn't actually seem to matter for any case the
kernel cares about, since we tend to want to hash down to 10-20 bits
of data, so the least significant bit (particularly for the 64-bit
case) just doesn't matter all that much.

For the 32-bit case I suspect it's more noticeable, since we might be
using even half or more of the result.

But as mentioned: that least-order bit seems to be a *lot* less
important than the mix of the bits in the middle. Because even if your
input ends up being all zeroes in the low bits (it that worst-case
"page aligned pointers" case that Thomas had), and the hash multiplies
by an even number, you'll still end up just dropping all those zero
bits anyway.

> When we have a hardware multiplier, keeping the Hamming weight low is
> a waste of time.  When we don't, clever organization can do
> better than the very naive addition/subtraction chain in the
> current hash_64().

Yeah. gcc will actually do the clever stuff for the 32-bit case, afaik.

Nothing I know of does it for the 64-bit case, which is why our
current hand-written one isn't even the smark one.

> To multiply by the 32-bit constant 1640531527 = 0x61c88647 (which is
> the negative of the golden ratio, so has identical distribution
> properties) can be done in 6 shifts + adds, with a critical path
> length of 7 operations (3 shifts + 4 adds).

So the reason we don't do this for the 32-bit case is exactly that gcc
can already do this.

If you can do the same for the 64-bit case, that might be worth it.

                Linus

  reply	other threads:[~2016-04-30  0:06 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-29  2:57 [patch 2/7] lib/hashmod: Add modulo based hash mechanism George Spelvin
2016-04-29  3:16 ` Linus Torvalds
2016-04-29  4:12   ` George Spelvin
2016-04-29 23:31   ` George Spelvin
2016-04-30  0:05     ` Linus Torvalds [this message]
2016-04-30  0:32       ` George Spelvin
2016-04-30  1:12         ` Linus Torvalds
2016-04-30  3:04           ` George Spelvin
     [not found] <CA+55aFxBWfAHQNAdBbdVr+z8ror4GVteyce3D3=vwDWxhu5KqQ@mail.gmail.com>
2016-04-30 20:52 ` George Spelvin
2016-05-01  8:35   ` Thomas Gleixner
2016-05-01  9:43     ` George Spelvin
2016-05-01 16:51       ` Linus Torvalds
2016-05-14  3:54         ` George Spelvin
2016-05-14 18:35           ` Linus Torvalds
2016-05-02  7:11       ` Thomas Gleixner
  -- strict thread matches above, loose matches on Subject: below --
2016-04-28 16:42 [patch 0/7] futex: Add support for process private hashing Thomas Gleixner
2016-04-28 16:42 ` [patch 2/7] lib/hashmod: Add modulo based hash mechanism Thomas Gleixner
2016-04-28 18:32   ` Linus Torvalds
2016-04-28 23:26     ` Thomas Gleixner
2016-04-29  2:25       ` Linus Torvalds
2016-04-30 13:02         ` Thomas Gleixner
2016-04-30 16:45           ` Eric Dumazet
2016-04-30 17:12             ` Linus Torvalds
2016-04-30 17:37               ` Eric Dumazet
2016-06-12 12:18         ` Sandy Harris
2016-04-29 21:10     ` Linus Torvalds
2016-04-29 23:51       ` Linus Torvalds
2016-04-30  1:34         ` Rik van Riel
2016-05-02  9:39         ` Torvald Riegel
2016-04-30 15:22       ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA+55aFyEPA5jkUL14xNXqs0Qh991DhT+DQpuQQC=aBHCN5Z=ig@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@horizon.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).