* [PATCH] dwarf_loader: use a better hashing function
From: Bill Wendling @ 2021-02-10 23:23 UTC
To: dwarves, bpf; +Cc: arnaldo.melo, Bill Wendling

This hashing function[1] produces better hash table bucket
distributions. The original hashing function always produced zeros in
the three least significant bits.

The new hashing function gives a modest performance boost.

        Original         New
         0:11.41       0:11.38
         0:11.36       0:11.34
         0:11.35       0:11.26
        -----------------------
  Avg:   0:11.373      0:11.327

for a performance improvement of 0.4%.

[1] From Numerical Recipes, 3rd Ed. 7.1.4 Random Hashes and Random Bytes

Signed-off-by: Bill Wendling <morbo@google.com>
---
 hash.h | 25 ++++++++++---------------
 1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/hash.h b/hash.h
index d3aa416..ea201ab 100644
--- a/hash.h
+++ b/hash.h
@@ -33,22 +33,17 @@
 
 static inline uint64_t hash_64(const uint64_t val, const unsigned int bits)
 {
-	uint64_t hash = val;
+	uint64_t hash = val * 0x369DEA0F31A53F85UL + 0x255992D382208B61UL;
 
-	/* Sigh, gcc can't optimise this alone like it does for 32 bits. */
-	uint64_t n = hash;
-	n <<= 18;
-	hash -= n;
-	n <<= 33;
-	hash -= n;
-	n <<= 3;
-	hash += n;
-	n <<= 3;
-	hash -= n;
-	n <<= 4;
-	hash += n;
-	n <<= 2;
-	hash += n;
+	hash ^= hash >> 21;
+	hash ^= hash << 37;
+	hash ^= hash >> 4;
+
+	hash *= 0x422E19E1D95D2F0DUL;
+
+	hash ^= hash << 20;
+	hash ^= hash >> 41;
+	hash ^= hash << 5;
 
 	/* High bits are more random, so use them. */
 	return hash >> (64 - bits);
-- 
2.30.0.478.g8a0d178c01-goog
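For anyone who wants to try the proposed function outside the pahole tree, here is a minimal standalone sketch of the new hash_64() body from the patch above. The name hash_64_nr and the assert on bits are additions for illustration only and are not part of the patch; the constants and the shift sequence are copied from the diff.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Standalone copy of the hash proposed in the patch (credited there to
 * Numerical Recipes, 3rd Ed., 7.1.4). The name hash_64_nr is hypothetical.
 * bits must be in 1..64, otherwise the final shift is undefined.
 */
static inline uint64_t hash_64_nr(uint64_t val, unsigned int bits)
{
	uint64_t hash;

	assert(bits >= 1 && bits <= 64);

	/* Linear step: multiply by an odd constant and add an offset. */
	hash = val * 0x369DEA0F31A53F85ULL + 0x255992D382208B61ULL;

	/* First xorshift round. */
	hash ^= hash >> 21;
	hash ^= hash << 37;
	hash ^= hash >> 4;

	/* Second multiply by an odd constant. */
	hash *= 0x422E19E1D95D2F0DULL;

	/* Second xorshift round. */
	hash ^= hash << 20;
	hash ^= hash >> 41;
	hash ^= hash << 5;

	/* Keep only the top `bits` bits, as the original code does. */
	return hash >> (64 - bits);
}
```

Calling hash_64_nr(key, 10) yields a bucket index below 1024. Every step is invertible on 64-bit values, so with bits == 64 distinct keys give distinct results.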
* Re: [PATCH] dwarf_loader: use a better hashing function
From: Andrii Nakryiko @ 2021-02-10 23:59 UTC
To: Bill Wendling; +Cc: dwarves, bpf, Arnaldo Carvalho de Melo

On Wed, Feb 10, 2021 at 3:25 PM Bill Wendling <morbo@google.com> wrote:
>
> This hashing function[1] produces better hash table bucket
> distributions. The original hashing function always produced zeros in
> the three least significant bits.
>
> The new hashing function gives a modest performance boost.
>
>         Original         New
>          0:11.41       0:11.38
>          0:11.36       0:11.34
>          0:11.35       0:11.26
>         -----------------------
>   Avg:   0:11.373      0:11.327
>
> for a performance improvement of 0.4%.
>
> [1] From Numerical Recipes, 3rd Ed. 7.1.4 Random Hashes and Random Bytes
>

Can you please also test with the one libbpf uses internally:

return (val * 11400714819323198485llu) >> (64 - bits);

?

Thanks!

> Signed-off-by: Bill Wendling <morbo@google.com>
> ---
>  hash.h | 25 ++++++++++---------------
>  1 file changed, 10 insertions(+), 15 deletions(-)
>
> diff --git a/hash.h b/hash.h
> index d3aa416..ea201ab 100644
> --- a/hash.h
> +++ b/hash.h
> @@ -33,22 +33,17 @@
>
>  static inline uint64_t hash_64(const uint64_t val, const unsigned int bits)
>  {
> -	uint64_t hash = val;
> +	uint64_t hash = val * 0x369DEA0F31A53F85UL + 0x255992D382208B61UL;
>
> -	/* Sigh, gcc can't optimise this alone like it does for 32 bits. */
> -	uint64_t n = hash;
> -	n <<= 18;
> -	hash -= n;
> -	n <<= 33;
> -	hash -= n;
> -	n <<= 3;
> -	hash += n;
> -	n <<= 3;
> -	hash -= n;
> -	n <<= 4;
> -	hash += n;
> -	n <<= 2;
> -	hash += n;
> +	hash ^= hash >> 21;
> +	hash ^= hash << 37;
> +	hash ^= hash >> 4;
> +
> +	hash *= 0x422E19E1D95D2F0DUL;
> +
> +	hash ^= hash << 20;
> +	hash ^= hash >> 41;
> +	hash ^= hash << 5;
>
>  	/* High bits are more random, so use them. */
>  	return hash >> (64 - bits);
> --
> 2.30.0.478.g8a0d178c01-goog
>
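The one-liner suggested above is a multiplicative hash: multiply the key by a fixed odd 64-bit constant and keep the top bits of the product. A minimal standalone sketch follows, assuming nothing beyond the expression quoted in the message. The function name hash_64_mult, the macro name and the assert are hypothetical additions, not libbpf or pahole identifiers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * The constant quoted in the thread. It is approximately 2^64 divided by
 * the golden ratio. The macro name is made up for this sketch.
 */
#define HASH_MULT_64 11400714819323198485ULL

/*
 * Multiplicative hash: the product wraps modulo 2^64 and the top `bits`
 * bits select the bucket. bits must be in 1..64, since shifting a 64-bit
 * value by 64 is undefined in C.
 */
static inline uint64_t hash_64_mult(uint64_t val, unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	return (val * HASH_MULT_64) >> (64 - bits);
}
```

For a table with 2^bits buckets, hash_64_mult(key, bits) can be used directly as the bucket index. A key of 0 always maps to bucket 0, which is a known property of a plain multiplicative hash.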
* Re: [PATCH] dwarf_loader: use a better hashing function
From: Bill Wendling @ 2021-02-11 1:24 UTC
To: Andrii Nakryiko; +Cc: dwarves, bpf, Arnaldo Carvalho de Melo

On Wed, Feb 10, 2021 at 4:00 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Wed, Feb 10, 2021 at 3:25 PM Bill Wendling <morbo@google.com> wrote:
> >
> > This hashing function[1] produces better hash table bucket
> > distributions. The original hashing function always produced zeros in
> > the three least significant bits.
> >
> > The new hashing function gives a modest performance boost.
> >
> >         Original         New
> >          0:11.41       0:11.38
> >          0:11.36       0:11.34
> >          0:11.35       0:11.26
> >         -----------------------
> >   Avg:   0:11.373      0:11.327
> >
> > for a performance improvement of 0.4%.
> >
> > [1] From Numerical Recipes, 3rd Ed. 7.1.4 Random Hashes and Random Bytes
> >
>
> Can you please also test with the one libbpf uses internally:
>
> return (val * 11400714819323198485llu) >> (64 - bits);
>
> ?
>
> Thanks!
>
It's giving me a running time of ~11.11s, which is even better. Would
you like me to submit a patch?

-bw
* Re: [PATCH] dwarf_loader: use a better hashing function
From: Andrii Nakryiko @ 2021-02-11 1:31 UTC
To: Bill Wendling; +Cc: dwarves, bpf, Arnaldo Carvalho de Melo

On Wed, Feb 10, 2021 at 5:24 PM Bill Wendling <morbo@google.com> wrote:
>
> On Wed, Feb 10, 2021 at 4:00 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Wed, Feb 10, 2021 at 3:25 PM Bill Wendling <morbo@google.com> wrote:
> > >
> > > This hashing function[1] produces better hash table bucket
> > > distributions. The original hashing function always produced zeros in
> > > the three least significant bits.
> > >
> > > The new hashing function gives a modest performance boost.
> > >
> > >         Original         New
> > >          0:11.41       0:11.38
> > >          0:11.36       0:11.34
> > >          0:11.35       0:11.26
> > >         -----------------------
> > >   Avg:   0:11.373      0:11.327
> > >
> > > for a performance improvement of 0.4%.
> > >
> > > [1] From Numerical Recipes, 3rd Ed. 7.1.4 Random Hashes and Random Bytes
> > >
> >
> > Can you please also test with the one libbpf uses internally:
> >
> > return (val * 11400714819323198485llu) >> (64 - bits);
> >
> > ?
> >
> > Thanks!
> >
> It's giving me a running time of ~11.11s, which is even better. Would
> you like me to submit a patch?

faster is better, so yeah, why not? :)

>
> -bw
* Re: [PATCH] dwarf_loader: use a better hashing function
From: Arnaldo Carvalho de Melo @ 2021-02-11 13:01 UTC
To: Andrii Nakryiko; +Cc: Bill Wendling, dwarves, bpf, Arnaldo Carvalho de Melo

On Wed, Feb 10, 2021 at 05:31:48PM -0800, Andrii Nakryiko wrote:
> On Wed, Feb 10, 2021 at 5:24 PM Bill Wendling <morbo@google.com> wrote:
> > On Wed, Feb 10, 2021 at 4:00 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > > On Wed, Feb 10, 2021 at 3:25 PM Bill Wendling <morbo@google.com> wrote:
> > > > This hashing function[1] produces better hash table bucket
> > > > distributions. The original hashing function always produced zeros in
> > > > the three least significant bits.

> > > > The new hashing function gives a modest performance boost.

> > > >         Original         New
> > > >          0:11.41       0:11.38
> > > >          0:11.36       0:11.34
> > > >          0:11.35       0:11.26
> > > >         -----------------------
> > > >   Avg:   0:11.373      0:11.327

> > > > for a performance improvement of 0.4%.

> > > > [1] From Numerical Recipes, 3rd Ed. 7.1.4 Random Hashes and Random Bytes

> > > Can you please also test with the one libbpf uses internally:

> > > return (val * 11400714819323198485llu) >> (64 - bits);

> > > ?

> > It's giving me a running time of ~11.11s, which is even better. Would
> > you like me to submit a patch?

> faster is better, so yeah, why not? :)

Yeah, I agree, faster is better, please make it so :-)

- Arnaldo
* Re: [PATCH] dwarf_loader: use a better hashing function
From: Bill Wendling @ 2021-02-12 6:55 UTC
To: Arnaldo Carvalho de Melo
Cc: Andrii Nakryiko, dwarves, bpf, Arnaldo Carvalho de Melo

On Thu, Feb 11, 2021 at 5:01 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Wed, Feb 10, 2021 at 05:31:48PM -0800, Andrii Nakryiko wrote:
> > On Wed, Feb 10, 2021 at 5:24 PM Bill Wendling <morbo@google.com> wrote:
> > > On Wed, Feb 10, 2021 at 4:00 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > > > On Wed, Feb 10, 2021 at 3:25 PM Bill Wendling <morbo@google.com> wrote:
> > > > > This hashing function[1] produces better hash table bucket
> > > > > distributions. The original hashing function always produced zeros in
> > > > > the three least significant bits.
>
> > > > > The new hashing function gives a modest performance boost.
>
> > > > >         Original         New
> > > > >          0:11.41       0:11.38
> > > > >          0:11.36       0:11.34
> > > > >          0:11.35       0:11.26
> > > > >         -----------------------
> > > > >   Avg:   0:11.373      0:11.327
>
> > > > > for a performance improvement of 0.4%.
>
> > > > > [1] From Numerical Recipes, 3rd Ed. 7.1.4 Random Hashes and Random Bytes
>
> > > > Can you please also test with the one libbpf uses internally:
>
> > > > return (val * 11400714819323198485llu) >> (64 - bits);
>
> > > > ?
>
> > > It's giving me a running time of ~11.11s, which is even better. Would
> > > you like me to submit a patch?
>
> > faster is better, so yeah, why not? :)
>
> Yeah, I agree, faster is better, please make it so :-)
>
Your wish is my command! :-) Done.

-bw
* Re: [PATCH] dwarf_loader: use a better hashing function
From: Arnaldo Carvalho de Melo @ 2021-02-12 12:35 UTC
To: Bill Wendling; +Cc: Andrii Nakryiko, dwarves, bpf, Arnaldo Carvalho de Melo

On Thu, Feb 11, 2021 at 10:55:32PM -0800, Bill Wendling wrote:
> On Thu, Feb 11, 2021 at 5:01 AM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> >
> > On Wed, Feb 10, 2021 at 05:31:48PM -0800, Andrii Nakryiko wrote:
> > > On Wed, Feb 10, 2021 at 5:24 PM Bill Wendling <morbo@google.com> wrote:
> > > > On Wed, Feb 10, 2021 at 4:00 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > > > > On Wed, Feb 10, 2021 at 3:25 PM Bill Wendling <morbo@google.com> wrote:
> > > > > > This hashing function[1] produces better hash table bucket
> > > > > > distributions. The original hashing function always produced zeros in
> > > > > > the three least significant bits.
> >
> > > > > > The new hashing function gives a modest performance boost.
> >
> > > > > >         Original         New
> > > > > >          0:11.41       0:11.38
> > > > > >          0:11.36       0:11.34
> > > > > >          0:11.35       0:11.26
> > > > > >         -----------------------
> > > > > >   Avg:   0:11.373      0:11.327
> >
> > > > > > for a performance improvement of 0.4%.
> >
> > > > > > [1] From Numerical Recipes, 3rd Ed. 7.1.4 Random Hashes and Random Bytes
> >
> > > > > Can you please also test with the one libbpf uses internally:
> >
> > > > > return (val * 11400714819323198485llu) >> (64 - bits);
> >
> > > > > ?
> >
> > > > It's giving me a running time of ~11.11s, which is even better. Would
> > > > you like me to submit a patch?
> >
> > > faster is better, so yeah, why not? :)
> >
> > Yeah, I agree, faster is better, please make it so :-)
> >
> Your wish is my command! :-) Done.

Thanks, looking for the patch and applying!

Now go think about something else to make it faster 8-)

- Arnaldo
* [PATCH v2] dwarf_loader: use a better hashing function
From: Bill Wendling @ 2021-02-12 8:01 UTC
To: dwarves, bpf; +Cc: arnaldo.melo, Bill Wendling

This hashing function[1] produces better hash table bucket
distributions. The original hashing function always produced zeros in
the three least significant bits. The new hashing function gives a
modest performance boost:

      Original: 0:11.373s
      New:      0:11.110s

for a performance improvement of ~2%.

[1] From the hash function used in libbpf.

Signed-off-by: Bill Wendling <morbo@google.com>
---
 hash.h | 20 +-------------------
 1 file changed, 1 insertion(+), 19 deletions(-)

diff --git a/hash.h b/hash.h
index d3aa416..6f952c7 100644
--- a/hash.h
+++ b/hash.h
@@ -33,25 +33,7 @@
 
 static inline uint64_t hash_64(const uint64_t val, const unsigned int bits)
 {
-	uint64_t hash = val;
-
-	/* Sigh, gcc can't optimise this alone like it does for 32 bits. */
-	uint64_t n = hash;
-	n <<= 18;
-	hash -= n;
-	n <<= 33;
-	hash -= n;
-	n <<= 3;
-	hash += n;
-	n <<= 3;
-	hash -= n;
-	n <<= 4;
-	hash += n;
-	n <<= 2;
-	hash += n;
-
-	/* High bits are more random, so use them. */
-	return hash >> (64 - bits);
+	return (val * 11400714819323198485LLU) >> (64 - bits);
 }
 
 static inline uint32_t hash_32(uint32_t val, unsigned int bits)
-- 
2.30.0.478.g8a0d178c01-goog
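The commit message argues in terms of bucket distribution, so a small harness for inspecting it may be useful. The sketch below is an assumption-laden illustration, not pahole code: bucket_histogram is a hypothetical helper, and hash_64 is re-declared locally with the expression from the v2 patch plus an assert that the patch does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same expression as the v2 patch; bits must be in 1..64. */
static inline uint64_t hash_64(uint64_t val, unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	return (val * 11400714819323198485ULL) >> (64 - bits);
}

/*
 * Hypothetical helper: count how many of the nkeys keys land in each of
 * the 2^bits buckets and return the largest per-bucket count.
 * counts must have room for 2^bits entries.
 */
static size_t bucket_histogram(const uint64_t *keys, size_t nkeys,
			       unsigned int bits, size_t *counts)
{
	size_t nbuckets, max = 0;

	/* Keep the table small enough for the shift below to be safe. */
	assert(bits >= 1 && bits < 32);
	nbuckets = (size_t)1 << bits;

	memset(counts, 0, nbuckets * sizeof(*counts));
	for (size_t i = 0; i < nkeys; i++) {
		size_t b = (size_t)hash_64(keys[i], bits);

		if (++counts[b] > max)
			max = counts[b];
	}
	return max;
}
```

Feeding it the keys a real run would hash and comparing the returned maximum against nkeys / 2^bits gives a rough idea of how evenly the buckets fill. The timings in the commit message remain the authoritative measurement.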
* Re: [PATCH v2] dwarf_loader: use a better hashing function
From: Arnaldo Carvalho de Melo @ 2021-02-12 12:37 UTC
To: Bill Wendling; +Cc: Andrii Nakryiko, dwarves, bpf

On Fri, Feb 12, 2021 at 12:01:04AM -0800, Bill Wendling wrote:
> This hashing function[1] produces better hash table bucket
> distributions. The original hashing function always produced zeros in
> the three least significant bits. The new hashing function gives a
> modest performance boost:

Some tidbits:

You forgot to CC Andrii and also to add this, which I'm doing now:

Suggested-by: Andrii Nakryiko <andrii@kernel.org>

:-)

- Arnaldo

>       Original: 0:11.373s
>       New:      0:11.110s
>
> for a performance improvement of ~2%.
>
> [1] From the hash function used in libbpf.
>
> Signed-off-by: Bill Wendling <morbo@google.com>
> ---
>  hash.h | 20 +-------------------
>  1 file changed, 1 insertion(+), 19 deletions(-)
>
> diff --git a/hash.h b/hash.h
> index d3aa416..6f952c7 100644
> --- a/hash.h
> +++ b/hash.h
> @@ -33,25 +33,7 @@
>
>  static inline uint64_t hash_64(const uint64_t val, const unsigned int bits)
>  {
> -	uint64_t hash = val;
> -
> -	/* Sigh, gcc can't optimise this alone like it does for 32 bits. */
> -	uint64_t n = hash;
> -	n <<= 18;
> -	hash -= n;
> -	n <<= 33;
> -	hash -= n;
> -	n <<= 3;
> -	hash += n;
> -	n <<= 3;
> -	hash -= n;
> -	n <<= 4;
> -	hash += n;
> -	n <<= 2;
> -	hash += n;
> -
> -	/* High bits are more random, so use them. */
> -	return hash >> (64 - bits);
> +	return (val * 11400714819323198485LLU) >> (64 - bits);
>  }
>
>  static inline uint32_t hash_32(uint32_t val, unsigned int bits)
> --
> 2.30.0.478.g8a0d178c01-goog
>

-- 

- Arnaldo
* Re: [PATCH v2] dwarf_loader: use a better hashing function
From: Arnaldo Carvalho de Melo @ 2021-02-12 12:39 UTC
To: Bill Wendling; +Cc: Andrii Nakryiko, dwarves, bpf

On Fri, Feb 12, 2021 at 09:37:22AM -0300, Arnaldo Carvalho de Melo wrote:
> On Fri, Feb 12, 2021 at 12:01:04AM -0800, Bill Wendling wrote:
> > This hashing function[1] produces better hash table bucket
> > distributions. The original hashing function always produced zeros in
> > the three least significant bits. The new hashing function gives a
> > modest performance boost:
>
> Some tidbits:
>
> You forgot to CC Andrii and also to add this, which I'm doing now:
>
> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
>
> :-)

See below the full cset, which will go public after some more tests here.

- Arnaldo

commit 9fecc77ed82d429fd3fe49ba275465813228e617 (HEAD -> master)
Author: Bill Wendling <morbo@google.com>
Date:   Fri Feb 12 00:01:04 2021 -0800

    dwarf_loader: Use a better hashing function, from libbpf

    This hashing function[1] produces better hash table bucket
    distributions. The original hashing function always produced zeros in
    the three least significant bits. The new hashing function gives a
    modest performance boost:

          Original: 0:11.373s
          New:      0:11.110s

    for a performance improvement of ~2%.

    [1] From the hash function used in libbpf.

    Committer notes:

    Bill found the suboptimality of the hash function being used, Andrii
    suggested using the libbpf one, which ended up being better.

    Signed-off-by: Bill Wendling <morbo@google.com>
    Suggested-by: Andrii Nakryiko <andrii@kernel.org>
    Cc: bpf@vger.kernel.org
    Cc: dwarves@vger.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* Re: [PATCH v2] dwarf_loader: use a better hashing function
From: Bill Wendling @ 2021-02-12 20:14 UTC
To: Arnaldo Carvalho de Melo; +Cc: Andrii Nakryiko, dwarves, bpf

On Fri, Feb 12, 2021 at 4:37 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Fri, Feb 12, 2021 at 12:01:04AM -0800, Bill Wendling wrote:
> > This hashing function[1] produces better hash table bucket
> > distributions. The original hashing function always produced zeros in
> > the three least significant bits. The new hashing function gives a
> > modest performance boost:
>
> Some tidbits:
>
> You forgot to CC Andrii and also to add this, which I'm doing now:
>
> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
>
> :-)
>
Doh! You're right. Sorry about that. Thanks for catching it!

-bw