* [PATCH] dwarf_loader: use a better hashing function
@ 2021-02-10 23:23 Bill Wendling
2021-02-10 23:59 ` Andrii Nakryiko
2021-02-12 8:01 ` [PATCH v2] " Bill Wendling
0 siblings, 2 replies; 11+ messages in thread
From: Bill Wendling @ 2021-02-10 23:23 UTC (permalink / raw)
To: dwarves, bpf; +Cc: arnaldo.melo, Bill Wendling
This hashing function[1] produces better hash table bucket
distributions. The original hashing function always produced zeros in
the three least significant bits.
The new hashing function gives a modest performance boost.
         Original    New
         0:11.41     0:11.38
         0:11.36     0:11.34
         0:11.35     0:11.26
         -----------------------
    Avg: 0:11.373    0:11.327
for a performance improvement of 0.4%.
[1] From Numerical Recipes, 3rd Ed. 7.1.4 Random Hashes and Random Bytes
Signed-off-by: Bill Wendling <morbo@google.com>
---
hash.h | 25 ++++++++++---------------
1 file changed, 10 insertions(+), 15 deletions(-)
diff --git a/hash.h b/hash.h
index d3aa416..ea201ab 100644
--- a/hash.h
+++ b/hash.h
@@ -33,22 +33,17 @@
static inline uint64_t hash_64(const uint64_t val, const unsigned int bits)
{
- uint64_t hash = val;
+ uint64_t hash = val * 0x369DEA0F31A53F85UL + 0x255992D382208B61UL;
- /* Sigh, gcc can't optimise this alone like it does for 32 bits. */
- uint64_t n = hash;
- n <<= 18;
- hash -= n;
- n <<= 33;
- hash -= n;
- n <<= 3;
- hash += n;
- n <<= 3;
- hash -= n;
- n <<= 4;
- hash += n;
- n <<= 2;
- hash += n;
+ hash ^= hash >> 21;
+ hash ^= hash << 37;
+ hash ^= hash >> 4;
+
+ hash *= 0x422E19E1D95D2F0DUL;
+
+ hash ^= hash << 20;
+ hash ^= hash >> 41;
+ hash ^= hash << 5;
/* High bits are more random, so use them. */
return hash >> (64 - bits);
--
2.30.0.478.g8a0d178c01-goog
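The shift-and-add sequence removed by this diff is arithmetically a single multiplication by a fixed constant, which may make it easier to compare with the multiplicative alternatives discussed later in the thread. The sketch below works that out under unsigned 64-bit wraparound; the names old_hash_mix and old_hash_mul are hypothetical and do not appear in hash.h.

```c
#include <stdint.h>

/*
 * The removed shift-and-add sequence, without the final
 * ">> (64 - bits)" step. uint64_t arithmetic wraps modulo 2^64.
 */
static inline uint64_t old_hash_mix(uint64_t val)
{
	uint64_t hash = val;
	uint64_t n = hash;

	n <<= 18; hash -= n;	/* n = val * 2^18, subtracted */
	n <<= 33; hash -= n;	/* n = val * 2^51, subtracted */
	n <<= 3;  hash += n;	/* n = val * 2^54, added */
	n <<= 3;  hash -= n;	/* n = val * 2^57, subtracted */
	n <<= 4;  hash += n;	/* n = val * 2^61, added */
	n <<= 2;  hash += n;	/* n = val * 2^63, added */
	return hash;
}

/*
 * The same value as one multiplication:
 * 1 - 2^18 - 2^51 + 2^54 - 2^57 + 2^61 + 2^63
 *   = 0x9e37fffffffc0001 (mod 2^64)
 */
static inline uint64_t old_hash_mul(uint64_t val)
{
	return val * 0x9e37fffffffc0001ULL;
}
```

Both helpers should return the same value for every input, so the original hash_64() can be read as "multiply by 0x9e37fffffffc0001, then keep the top bits".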
* Re: [PATCH] dwarf_loader: use a better hashing function
2021-02-10 23:23 [PATCH] dwarf_loader: use a better hashing function Bill Wendling
@ 2021-02-10 23:59 ` Andrii Nakryiko
2021-02-11 1:24 ` Bill Wendling
2021-02-12 8:01 ` [PATCH v2] " Bill Wendling
1 sibling, 1 reply; 11+ messages in thread
From: Andrii Nakryiko @ 2021-02-10 23:59 UTC (permalink / raw)
To: Bill Wendling; +Cc: dwarves, bpf, Arnaldo Carvalho de Melo
On Wed, Feb 10, 2021 at 3:25 PM Bill Wendling <morbo@google.com> wrote:
>
> This hashing function[1] produces better hash table bucket
> distributions. The original hashing function always produced zeros in
> the three least significant bits.
>
> The new hashing function gives a modest performance boost.
>
> Original New
> 0:11.41 0:11.38
> 0:11.36 0:11.34
> 0:11.35 0:11.26
> -----------------------
> Avg: 0:11.373 0:11.327
>
> for a performance improvement of 0.4%.
>
> [1] From Numerical Recipes, 3rd Ed. 7.1.4 Random Hashes and Random Bytes
>
Can you please also test with the one libbpf uses internally:
return (val * 11400714819323198485llu) >> (64 - bits);
?
Thanks!
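For readers who want to try the suggested one-liner outside of hash.h, here is a minimal standalone sketch. The multiplier is the one quoted above (approximately 2^64 divided by the golden ratio, the usual "Fibonacci hashing" constant). The function name mul_hash_64, the macro name, and the bits == 0 guard are assumptions of this sketch, not something taken from the patch.

```c
#include <stdint.h>

/* Multiplier quoted in the thread. */
#define HASH_MULT_64 11400714819323198485ULL

/*
 * Return the top `bits` bits of val * HASH_MULT_64 (mod 2^64).
 * `bits` is expected to be in the range 0..64. The guard below is an
 * addition of this sketch: shifting a 64-bit value by 64 is undefined
 * behavior in C, so bits == 0 is handled separately.
 */
static inline uint64_t mul_hash_64(uint64_t val, unsigned int bits)
{
	if (bits == 0)
		return 0;
	return (val * HASH_MULT_64) >> (64 - bits);
}
```

With bits == 4, for example, the result is always in the range 0..15, so it can index a 16-bucket table directly.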
> Signed-off-by: Bill Wendling <morbo@google.com>
> ---
> hash.h | 25 ++++++++++---------------
> 1 file changed, 10 insertions(+), 15 deletions(-)
>
> diff --git a/hash.h b/hash.h
> index d3aa416..ea201ab 100644
> --- a/hash.h
> +++ b/hash.h
> @@ -33,22 +33,17 @@
>
> static inline uint64_t hash_64(const uint64_t val, const unsigned int bits)
> {
> - uint64_t hash = val;
> + uint64_t hash = val * 0x369DEA0F31A53F85UL + 0x255992D382208B61UL;
>
> - /* Sigh, gcc can't optimise this alone like it does for 32 bits. */
> - uint64_t n = hash;
> - n <<= 18;
> - hash -= n;
> - n <<= 33;
> - hash -= n;
> - n <<= 3;
> - hash += n;
> - n <<= 3;
> - hash -= n;
> - n <<= 4;
> - hash += n;
> - n <<= 2;
> - hash += n;
> + hash ^= hash >> 21;
> + hash ^= hash << 37;
> + hash ^= hash >> 4;
> +
> + hash *= 0x422E19E1D95D2F0DUL;
> +
> + hash ^= hash << 20;
> + hash ^= hash >> 41;
> + hash ^= hash << 5;
>
> /* High bits are more random, so use them. */
> return hash >> (64 - bits);
> --
> 2.30.0.478.g8a0d178c01-goog
>
* Re: [PATCH] dwarf_loader: use a better hashing function
2021-02-10 23:59 ` Andrii Nakryiko
@ 2021-02-11 1:24 ` Bill Wendling
2021-02-11 1:31 ` Andrii Nakryiko
0 siblings, 1 reply; 11+ messages in thread
From: Bill Wendling @ 2021-02-11 1:24 UTC (permalink / raw)
To: Andrii Nakryiko; +Cc: dwarves, bpf, Arnaldo Carvalho de Melo
On Wed, Feb 10, 2021 at 4:00 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Wed, Feb 10, 2021 at 3:25 PM Bill Wendling <morbo@google.com> wrote:
> >
> > This hashing function[1] produces better hash table bucket
> > distributions. The original hashing function always produced zeros in
> > the three least significant bits.
> >
> > The new hashing function gives a modest performance boost.
> >
> > Original New
> > 0:11.41 0:11.38
> > 0:11.36 0:11.34
> > 0:11.35 0:11.26
> > -----------------------
> > Avg: 0:11.373 0:11.327
> >
> > for a performance improvement of 0.4%.
> >
> > [1] From Numerical Recipes, 3rd Ed. 7.1.4 Random Hashes and Random Bytes
> >
>
> Can you please also test with the one libbpf uses internally:
>
> return (val * 11400714819323198485llu) >> (64 - bits);
>
> ?
>
> Thanks!
>
It's giving me a running time of ~11.11s, which is even better. Would
you like me to submit a patch?
-bw
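The percentages implied by the timings in this thread can be recomputed with a few lines of C. This is only a sketch of the arithmetic; the helper names mean_seconds and improvement_percent are hypothetical, and the timings are the ones posted above, converted to seconds.

```c
#include <stddef.h>

/* Arithmetic mean of n timings, in seconds. */
static double mean_seconds(const double *t, size_t n)
{
	double sum = 0.0;

	for (size_t i = 0; i < n; i++)
		sum += t[i];
	return sum / (double)n;
}

/* Time saved by the new run, as a percentage of the old time. */
static double improvement_percent(double old_s, double new_s)
{
	return (old_s - new_s) / old_s * 100.0;
}
```

Feeding in the original runs (11.41, 11.36, 11.35) and the first patch's runs (11.38, 11.34, 11.26) gives means of about 11.373 s and 11.327 s, an improvement of roughly 0.4%. Comparing the original mean against the ~11.11 s measured with the libbpf multiplier gives roughly 2.3%.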
* Re: [PATCH] dwarf_loader: use a better hashing function
2021-02-11 1:24 ` Bill Wendling
@ 2021-02-11 1:31 ` Andrii Nakryiko
2021-02-11 13:01 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 11+ messages in thread
From: Andrii Nakryiko @ 2021-02-11 1:31 UTC (permalink / raw)
To: Bill Wendling; +Cc: dwarves, bpf, Arnaldo Carvalho de Melo
On Wed, Feb 10, 2021 at 5:24 PM Bill Wendling <morbo@google.com> wrote:
>
> On Wed, Feb 10, 2021 at 4:00 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Wed, Feb 10, 2021 at 3:25 PM Bill Wendling <morbo@google.com> wrote:
> > >
> > > This hashing function[1] produces better hash table bucket
> > > distributions. The original hashing function always produced zeros in
> > > the three least significant bits.
> > >
> > > The new hashing function gives a modest performance boost.
> > >
> > > Original New
> > > 0:11.41 0:11.38
> > > 0:11.36 0:11.34
> > > 0:11.35 0:11.26
> > > -----------------------
> > > Avg: 0:11.373 0:11.327
> > >
> > > for a performance improvement of 0.4%.
> > >
> > > [1] From Numerical Recipes, 3rd Ed. 7.1.4 Random Hashes and Random Bytes
> > >
> >
> > Can you please also test with the one libbpf uses internally:
> >
> > return (val * 11400714819323198485llu) >> (64 - bits);
> >
> > ?
> >
> > Thanks!
> >
> It's giving me a running time of ~11.11s, which is even better. Would
> you like me to submit a patch?
faster is better, so yeah, why not? :)
>
> -bw
* Re: [PATCH] dwarf_loader: use a better hashing function
2021-02-11 1:31 ` Andrii Nakryiko
@ 2021-02-11 13:01 ` Arnaldo Carvalho de Melo
2021-02-12 6:55 ` Bill Wendling
0 siblings, 1 reply; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-02-11 13:01 UTC (permalink / raw)
To: Andrii Nakryiko; +Cc: Bill Wendling, dwarves, bpf, Arnaldo Carvalho de Melo
Em Wed, Feb 10, 2021 at 05:31:48PM -0800, Andrii Nakryiko escreveu:
> On Wed, Feb 10, 2021 at 5:24 PM Bill Wendling <morbo@google.com> wrote:
> > On Wed, Feb 10, 2021 at 4:00 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > > On Wed, Feb 10, 2021 at 3:25 PM Bill Wendling <morbo@google.com> wrote:
> > > > This hashing function[1] produces better hash table bucket
> > > > distributions. The original hashing function always produced zeros in
> > > > the three least significant bits.
> > > > The new hashing function gives a modest performance boost.
> > > > Original New
> > > > 0:11.41 0:11.38
> > > > 0:11.36 0:11.34
> > > > 0:11.35 0:11.26
> > > > -----------------------
> > > > Avg: 0:11.373 0:11.327
> > > > for a performance improvement of 0.4%.
> > > > [1] From Numerical Recipes, 3rd Ed. 7.1.4 Random Hashes and Random Bytes
> > > Can you please also test with the one libbpf uses internally:
> > > return (val * 11400714819323198485llu) >> (64 - bits);
> > > ?
> > It's giving me a running time of ~11.11s, which is even better. Would
> > you like me to submit a patch?
> faster is better, so yeah, why not? :)
Yeah, I agree, faster is better, please make it so :-)
- Arnaldo
* Re: [PATCH] dwarf_loader: use a better hashing function
2021-02-11 13:01 ` Arnaldo Carvalho de Melo
@ 2021-02-12 6:55 ` Bill Wendling
2021-02-12 12:35 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 11+ messages in thread
From: Bill Wendling @ 2021-02-12 6:55 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Andrii Nakryiko, dwarves, bpf, Arnaldo Carvalho de Melo
On Thu, Feb 11, 2021 at 5:01 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Em Wed, Feb 10, 2021 at 05:31:48PM -0800, Andrii Nakryiko escreveu:
> > On Wed, Feb 10, 2021 at 5:24 PM Bill Wendling <morbo@google.com> wrote:
> > > On Wed, Feb 10, 2021 at 4:00 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > > > On Wed, Feb 10, 2021 at 3:25 PM Bill Wendling <morbo@google.com> wrote:
> > > > > This hashing function[1] produces better hash table bucket
> > > > > distributions. The original hashing function always produced zeros in
> > > > > the three least significant bits.
>
> > > > > The new hashing function gives a modest performance boost.
>
> > > > > Original New
> > > > > 0:11.41 0:11.38
> > > > > 0:11.36 0:11.34
> > > > > 0:11.35 0:11.26
> > > > > -----------------------
> > > > > Avg: 0:11.373 0:11.327
>
> > > > > for a performance improvement of 0.4%.
>
> > > > > [1] From Numerical Recipes, 3rd Ed. 7.1.4 Random Hashes and Random Bytes
>
> > > > Can you please also test with the one libbpf uses internally:
>
> > > > return (val * 11400714819323198485llu) >> (64 - bits);
>
> > > > ?
>
> > > It's giving me a running time of ~11.11s, which is even better. Would
> > > you like me to submit a patch?
>
> > faster is better, so yeah, why not? :)
>
> Yeah, I agree, faster is better, please make it so :-)
>
Your wish is my command! :-) Done.
-bw
* [PATCH v2] dwarf_loader: use a better hashing function
2021-02-10 23:23 [PATCH] dwarf_loader: use a better hashing function Bill Wendling
2021-02-10 23:59 ` Andrii Nakryiko
@ 2021-02-12 8:01 ` Bill Wendling
2021-02-12 12:37 ` Arnaldo Carvalho de Melo
1 sibling, 1 reply; 11+ messages in thread
From: Bill Wendling @ 2021-02-12 8:01 UTC (permalink / raw)
To: dwarves, bpf; +Cc: arnaldo.melo, Bill Wendling
This hashing function[1] produces better hash table bucket
distributions. The original hashing function always produced zeros in
the three least significant bits. The new hashing function gives a
modest performance boost:
Original: 0:11.373s
New: 0:11.110s
for a performance improvement of ~2%.
[1] From the hash function used in libbpf.
Signed-off-by: Bill Wendling <morbo@google.com>
---
hash.h | 20 +-------------------
1 file changed, 1 insertion(+), 19 deletions(-)
diff --git a/hash.h b/hash.h
index d3aa416..6f952c7 100644
--- a/hash.h
+++ b/hash.h
@@ -33,25 +33,7 @@
static inline uint64_t hash_64(const uint64_t val, const unsigned int bits)
{
- uint64_t hash = val;
-
- /* Sigh, gcc can't optimise this alone like it does for 32 bits. */
- uint64_t n = hash;
- n <<= 18;
- hash -= n;
- n <<= 33;
- hash -= n;
- n <<= 3;
- hash += n;
- n <<= 3;
- hash -= n;
- n <<= 4;
- hash += n;
- n <<= 2;
- hash += n;
-
- /* High bits are more random, so use them. */
- return hash >> (64 - bits);
+ return (val * 11400714819323198485LLU) >> (64 - bits);
}
static inline uint32_t hash_32(uint32_t val, unsigned int bits)
--
2.30.0.478.g8a0d178c01-goog
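The commit message talks about bucket distribution, which can be checked empirically. The sketch below is one possible way to do that, not the method used to produce the numbers above: it hashes an arithmetic sequence of keys into 2^bits buckets and reports the fullest bucket. The names hash_fn_t, hash_64_new and max_bucket_load are hypothetical, and the choice of keys (a start value plus a fixed stride) is an assumption made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t (*hash_fn_t)(uint64_t val, unsigned int bits);

/* Body of hash_64() after this patch; bits must be in 1..64. */
static uint64_t hash_64_new(uint64_t val, unsigned int bits)
{
	return (val * 11400714819323198485ULL) >> (64 - bits);
}

/*
 * Hash nkeys keys of the form start + i * stride into 2^bits buckets.
 * counts[] must have room for 2^bits entries; it is zeroed and then
 * filled with the per-bucket key counts. Returns the largest count.
 */
static size_t max_bucket_load(hash_fn_t fn, uint64_t start, uint64_t stride,
			      size_t nkeys, unsigned int bits, size_t *counts)
{
	size_t nbuckets = (size_t)1 << bits;
	size_t max = 0;

	memset(counts, 0, nbuckets * sizeof(*counts));
	for (size_t i = 0; i < nkeys; i++) {
		size_t b = (size_t)fn(start + i * stride, bits);

		if (++counts[b] > max)
			max = counts[b];
	}
	return max;
}
```

A perfectly even spread of 1000 keys over 16 buckets would put at most 63 keys in any bucket, so the closer the returned maximum is to that bound, the better the distribution for that key pattern. Passing a different hash_fn_t allows the same keys to be run through another hash for comparison.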
* Re: [PATCH] dwarf_loader: use a better hashing function
2021-02-12 6:55 ` Bill Wendling
@ 2021-02-12 12:35 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-02-12 12:35 UTC (permalink / raw)
To: Bill Wendling; +Cc: Andrii Nakryiko, dwarves, bpf, Arnaldo Carvalho de Melo
Em Thu, Feb 11, 2021 at 10:55:32PM -0800, Bill Wendling escreveu:
> On Thu, Feb 11, 2021 at 5:01 AM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> >
> > Em Wed, Feb 10, 2021 at 05:31:48PM -0800, Andrii Nakryiko escreveu:
> > > On Wed, Feb 10, 2021 at 5:24 PM Bill Wendling <morbo@google.com> wrote:
> > > > On Wed, Feb 10, 2021 at 4:00 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > > > > On Wed, Feb 10, 2021 at 3:25 PM Bill Wendling <morbo@google.com> wrote:
> > > > > > This hashing function[1] produces better hash table bucket
> > > > > > distributions. The original hashing function always produced zeros in
> > > > > > the three least significant bits.
> >
> > > > > > The new hashing function gives a modest performance boost.
> >
> > > > > > Original New
> > > > > > 0:11.41 0:11.38
> > > > > > 0:11.36 0:11.34
> > > > > > 0:11.35 0:11.26
> > > > > > -----------------------
> > > > > > Avg: 0:11.373 0:11.327
> >
> > > > > > for a performance improvement of 0.4%.
> >
> > > > > > [1] From Numerical Recipes, 3rd Ed. 7.1.4 Random Hashes and Random Bytes
> >
> > > > > Can you please also test with the one libbpf uses internally:
> >
> > > > > return (val * 11400714819323198485llu) >> (64 - bits);
> >
> > > > > ?
> >
> > > > It's giving me a running time of ~11.11s, which is even better. Would
> > > > you like me to submit a patch?
> >
> > > faster is better, so yeah, why not? :)
> >
> > Yeah, I agree, faster is better, please make it so :-)
> >
> Your wish is my command! :-) Done.
Thanks, looking for the patch and applying!
Now go think about something else to make it faster 8-)
- Arnaldo
* Re: [PATCH v2] dwarf_loader: use a better hashing function
2021-02-12 8:01 ` [PATCH v2] " Bill Wendling
@ 2021-02-12 12:37 ` Arnaldo Carvalho de Melo
2021-02-12 12:39 ` Arnaldo Carvalho de Melo
2021-02-12 20:14 ` Bill Wendling
0 siblings, 2 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-02-12 12:37 UTC (permalink / raw)
To: Bill Wendling; +Cc: Andrii Nakryiko, dwarves, bpf
Em Fri, Feb 12, 2021 at 12:01:04AM -0800, Bill Wendling escreveu:
> This hashing function[1] produces better hash table bucket
> distributions. The original hashing function always produced zeros in
> the three least significant bits. The new hashing function gives a
> modest performance boost:
Some tidbits:
You forgot to CC Andrii and also to add this, which I'm doing now:
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
:-)
- Arnaldo
> Original: 0:11.373s
> New: 0:11.110s
>
> for a performance improvement of ~2%.
>
> [1] From the hash function used in libbpf.
>
> Signed-off-by: Bill Wendling <morbo@google.com>
> ---
> hash.h | 20 +-------------------
> 1 file changed, 1 insertion(+), 19 deletions(-)
>
> diff --git a/hash.h b/hash.h
> index d3aa416..6f952c7 100644
> --- a/hash.h
> +++ b/hash.h
> @@ -33,25 +33,7 @@
>
> static inline uint64_t hash_64(const uint64_t val, const unsigned int bits)
> {
> - uint64_t hash = val;
> -
> - /* Sigh, gcc can't optimise this alone like it does for 32 bits. */
> - uint64_t n = hash;
> - n <<= 18;
> - hash -= n;
> - n <<= 33;
> - hash -= n;
> - n <<= 3;
> - hash += n;
> - n <<= 3;
> - hash -= n;
> - n <<= 4;
> - hash += n;
> - n <<= 2;
> - hash += n;
> -
> - /* High bits are more random, so use them. */
> - return hash >> (64 - bits);
> + return (val * 11400714819323198485LLU) >> (64 - bits);
> }
>
> static inline uint32_t hash_32(uint32_t val, unsigned int bits)
> --
> 2.30.0.478.g8a0d178c01-goog
>
--
- Arnaldo
* Re: [PATCH v2] dwarf_loader: use a better hashing function
2021-02-12 12:37 ` Arnaldo Carvalho de Melo
@ 2021-02-12 12:39 ` Arnaldo Carvalho de Melo
2021-02-12 20:14 ` Bill Wendling
1 sibling, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-02-12 12:39 UTC (permalink / raw)
To: Bill Wendling; +Cc: Andrii Nakryiko, dwarves, bpf
Em Fri, Feb 12, 2021 at 09:37:22AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Fri, Feb 12, 2021 at 12:01:04AM -0800, Bill Wendling escreveu:
> > This hashing function[1] produces better hash table bucket
> > distributions. The original hashing function always produced zeros in
> > the three least significant bits. The new hashing function gives a
> > modest performance boost:
>
> Some tidbits:
>
> You forgot to CC Andrii and also to add this, which I'm doing now:
>
> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
>
> :-)
See below the full cset, that will go public after some more tests here.
- Arnaldo
commit 9fecc77ed82d429fd3fe49ba275465813228e617 (HEAD -> master)
Author: Bill Wendling <morbo@google.com>
Date: Fri Feb 12 00:01:04 2021 -0800
dwarf_loader: Use a better hashing function, from libbpf
This hashing function[1] produces better hash table bucket
distributions. The original hashing function always produced zeros in
the three least significant bits. The new hashing function gives a
modest performance boost:
Original: 0:11.373s
New: 0:11.110s
for a performance improvement of ~2%.
[1] From the hash function used in libbpf.
Committer notes:
Bill found the suboptimality of the hash function being used, Andrii
suggested using the libbpf one, which ended up being better.
Signed-off-by: Bill Wendling <morbo@google.com>
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Cc: bpf@vger.kernel.org
Cc: dwarves@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* Re: [PATCH v2] dwarf_loader: use a better hashing function
2021-02-12 12:37 ` Arnaldo Carvalho de Melo
2021-02-12 12:39 ` Arnaldo Carvalho de Melo
@ 2021-02-12 20:14 ` Bill Wendling
1 sibling, 0 replies; 11+ messages in thread
From: Bill Wendling @ 2021-02-12 20:14 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo; +Cc: Andrii Nakryiko, dwarves, bpf
On Fri, Feb 12, 2021 at 4:37 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Em Fri, Feb 12, 2021 at 12:01:04AM -0800, Bill Wendling escreveu:
> > This hashing function[1] produces better hash table bucket
> > distributions. The original hashing function always produced zeros in
> > the three least significant bits. The new hashing function gives a
> > modest performance boost:
>
> Some tidbits:
>
> You forgot to CC Andrii and also to add this, which I'm doing now:
>
> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
>
> :-)
>
Doh! You're right. Sorry about that. Thanks for catching it!
-bw