linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Torvald Riegel <triegel@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Rik van Riel <riel@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Darren Hart <darren@dvhart.com>,
	Michael Kerrisk <mtk.manpages@googlemail.com>,
	Davidlohr Bueso <dave@stgolabs.net>, Chris Mason <clm@fb.com>,
	"Carlos O'Donell" <carlos@redhat.com>,
	Eric Dumazet <edumazet@google.com>
Subject: Re: [patch 2/7] lib/hashmod: Add modulo based hash mechanism
Date: Mon, 02 May 2016 11:39:28 +0200	[thread overview]
Message-ID: <1462181968.3869.670.camel@localhost.localdomain> (raw)
In-Reply-To: <CA+55aFyjYkwkTo2bNYqJ6h4mr1bbT5Vrak+EtiZmujOD-NzMOQ@mail.gmail.com>

On Fri, 2016-04-29 at 16:51 -0700, Linus Torvalds wrote:
> On Fri, Apr 29, 2016 at 2:10 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> > On Thu, Apr 28, 2016 at 11:32 AM, Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> >>
> >> hash_long/ptr really shouldn't care about the low bits being zero at
> >> all, because it should mix in all the bits (using a prime multiplier
> >> and taking the high bits).
> >
> > Looking at this assertion, it doesn't actually pan out very well at all.
> >
> > The 32-bit hash looks to be ok. Certainly not perfect, but not horrible.
> >
> > The 64-bit hash seems to be quite horribly bad with lots of values.
> 
> Ok, I have tried to research this a bit more. There's a lot of
> confusion here that causes the fact that hash_64() sucks donkey balls.
> 
> The basic method for the hashing method by multiplication is fairly
> sane. It's well-documented, and attributed to Knuth as the comment
> above it says.

Section 2.2 of this article might also be of interest:

M. Dietzfelbinger, T. Hagerup, J. Katajainen, and M. Penttonen. A Re-
liable Randomized Algorithm for the Closest-Pair Problem. Journal of
Algorithms, 25(1):19 – 51, 1997.

I don't know much about hash functions, but my understanding of this is
that one can do good hashing of words by multiplying with a random
number and using the most-significant bits of the lower-significant word
of the result.  Different random numbers may work better than others for
the input data (and some might be really awful), but the paper seems to
claim that one *can* find a good random number for all input data.  In
practice, this might mean that re-hashing with a different random number
after seeing too many conflicts in a hash table should eventually lead
to a good hashing (ie, because the *class* of such hash functions is
universal).

  parent reply	other threads:[~2016-05-02  9:39 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-28 16:42 [patch 0/7] futex: Add support for process private hashing Thomas Gleixner
2016-04-28 16:42 ` [patch 1/7] futex: Add some more function commentry Thomas Gleixner
2016-04-28 16:42 ` [patch 2/7] lib/hashmod: Add modulo based hash mechanism Thomas Gleixner
2016-04-28 18:32   ` Linus Torvalds
2016-04-28 23:26     ` Thomas Gleixner
2016-04-29  2:25       ` Linus Torvalds
2016-04-30 13:02         ` Thomas Gleixner
2016-04-30 16:45           ` Eric Dumazet
2016-04-30 17:12             ` Linus Torvalds
2016-04-30 17:37               ` Eric Dumazet
2016-06-12 12:18         ` Sandy Harris
2016-04-29 21:10     ` Linus Torvalds
2016-04-29 23:51       ` Linus Torvalds
2016-04-30  1:34         ` Rik van Riel
2016-05-02  9:39         ` Torvald Riegel [this message]
2016-04-30 15:22       ` Thomas Gleixner
2016-04-28 16:42 ` [patch 3/7] futex: Hash private futexes per process Thomas Gleixner
2016-04-28 16:42 ` [patch 4/7] futex: Add op for hash preallocation Thomas Gleixner
2016-04-28 16:42 ` [patch 5/7] futex: Add sysctl knobs for process private hash Thomas Gleixner
2016-04-28 16:42 ` [patch 6/7] perf/bench/futex-hash: Support NUMA Thomas Gleixner
2016-04-28 16:42 ` [patch 7/7] perf/bench/futex-hash: Support preallocate hash table Thomas Gleixner
2016-04-29  2:57 [patch 2/7] lib/hashmod: Add modulo based hash mechanism George Spelvin
2016-04-29  3:16 ` Linus Torvalds
2016-04-29  4:12   ` George Spelvin
2016-04-29 23:31   ` George Spelvin
2016-04-30  0:05     ` Linus Torvalds
2016-04-30  0:32       ` George Spelvin
2016-04-30  1:12         ` Linus Torvalds
2016-04-30  3:04           ` George Spelvin
     [not found] <CA+55aFxBWfAHQNAdBbdVr+z8ror4GVteyce3D3=vwDWxhu5KqQ@mail.gmail.com>
2016-04-30 20:52 ` George Spelvin
2016-05-01  8:35   ` Thomas Gleixner
2016-05-01  9:43     ` George Spelvin
2016-05-01 16:51       ` Linus Torvalds
2016-05-14  3:54         ` George Spelvin
2016-05-14 18:35           ` Linus Torvalds
2016-05-02  7:11       ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1462181968.3869.670.camel@localhost.localdomain \
    --to=triegel@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=bigeasy@linutronix.de \
    --cc=carlos@redhat.com \
    --cc=clm@fb.com \
    --cc=darren@dvhart.com \
    --cc=dave@stgolabs.net \
    --cc=edumazet@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=mtk.manpages@googlemail.com \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).