From: "Yordan Karadzhov (VMware)" <y.karadz@gmail.com>
To: Steven Rostedt <rostedt@goodmis.org>,
Linux Trace Devel <linux-trace-devel@vger.kernel.org>
Subject: Re: [PATCH] kernel-shark: Increase the size of the task hash
Date: Thu, 29 Aug 2019 15:46:24 +0300 [thread overview]
Message-ID: <b651e39a-6f32-fe59-36bf-224cad5bc649@gmail.com> (raw)
In-Reply-To: <20190828140016.3ce1be4f@gandalf.local.home>
On 28.08.19 г. 21:00 ч., Steven Rostedt wrote:
> From: Steven Rostedt (VMware) <rostedt@goodmis.org>
>
> When loading a data file that contained 100,000s of tasks, using a 256
> bucket size hash crippled it. By increasing the hash to 2^16 (65536) it
> solves the issue (still small enough not to waste too much memory).
>
> In the process, I changed the knuth_hash() in libkshark.c to use the 32
> bit version and just have the key use what it needs:
If the size the hash table is 65536 then the multiplicative hashing
function has to multiply the key by a prime number which is closer to
65536 / 1.61803398875 (the so called "golden ratio" of the size).
It makes no sense to multiply by a number that is several orders of
magnitude bigger than the size of the hash table.
>
> key = knuth_hash();
> key += knuth_hash >> SHIFT;
> key &= (1 << SHIFT) - 1;
>
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
> diff --git a/kernel-shark/src/libkshark.c b/kernel-shark/src/libkshark.c
> index 4201fa02..41572e18 100644
> --- a/kernel-shark/src/libkshark.c
> +++ b/kernel-shark/src/libkshark.c
> @@ -252,19 +252,19 @@ void kshark_free(struct kshark_context *kshark_ctx)
> free(kshark_ctx);
> }
>
> -static inline uint8_t knuth_hash(uint32_t val)
> +static inline uint32_t knuth_hash(uint32_t val)
> {
> /*
> - * Small table hashing function adapted from Donald E. Knuth's 32 bit
> + * Hashing function adapted from Donald E. Knuth's 32 bit
> * multiplicative hash. See The Art of Computer Programming (TAOCP).
> * Multiplication by the Prime number, closest to the golden ratio of
> - * 2^8.
> + * 2^32.
> */
> - return UINT8_C(val) * UINT8_C(157);
> + return val * UINT32_C(2654435761);
So according to my understanding of the idea in the book, this line
should look look this:
return UINT16_C(val) * UINT8_16(65537);
and you do not need to do any additional manipulation of the key (PID).
Also the function has to return uint16_t.
Thanks!
Yordan
> }
>
> static struct kshark_task_list *
> -kshark_find_task(struct kshark_context *kshark_ctx, uint8_t key, int pid)
> +kshark_find_task(struct kshark_context *kshark_ctx, uint32_t key, int pid)
> {
> struct kshark_task_list *list;
>
> @@ -280,9 +280,12 @@ static struct kshark_task_list *
> kshark_add_task(struct kshark_context *kshark_ctx, int pid)
> {
> struct kshark_task_list *list;
> - uint8_t key;
> + uint32_t key;
>
> key = knuth_hash(pid);
> + key += key >> KS_TASK_HASH_SHIFT;
> + key &= (1 << KS_TASK_HASH_SHIFT) - 1;
> +
> list = kshark_find_task(kshark_ctx, key, pid);
> if (list)
> return list;
> diff --git a/kernel-shark/src/libkshark.h b/kernel-shark/src/libkshark.h
> index 04e9cbfc..3407db19 100644
> --- a/kernel-shark/src/libkshark.h
> +++ b/kernel-shark/src/libkshark.h
> @@ -72,7 +72,8 @@ struct kshark_entry {
> };
>
> /** Size of the task's hash table. */
> -#define KS_TASK_HASH_SIZE 256
> +#define KS_TASK_HASH_SHIFT 16
> +#define KS_TASK_HASH_SIZE (1 << KS_TASK_HASH_SHIFT)
>
> /** Linked list of tasks. */
> struct kshark_task_list {
>
next prev parent reply other threads:[~2019-08-29 12:46 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-28 18:00 [PATCH] kernel-shark: Increase the size of the task hash Steven Rostedt
2019-08-29 12:46 ` Yordan Karadzhov (VMware) [this message]
2019-08-29 13:18 ` Steven Rostedt
2019-08-29 15:49 ` Steven Rostedt
2019-09-17 15:03 ` Yordan Karadzhov (VMware)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=b651e39a-6f32-fe59-36bf-224cad5bc649@gmail.com \
--to=y.karadz@gmail.com \
--cc=linux-trace-devel@vger.kernel.org \
--cc=rostedt@goodmis.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).