All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Emilio G. Cota" <cota@braap.org>
To: Richard Henderson <rth@twiddle.net>
Cc: "QEMU Developers" <qemu-devel@nongnu.org>,
	"MTTCG Devel" <mttcg@greensocs.com>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Peter Crosthwaite" <crosthwaite.peter@gmail.com>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	"Sergey Fedorov" <serge.fdrv@gmail.com>
Subject: Re: [Qemu-devel] [PATCH v3 11/11] translate-all: add tb hash bucket info to 'info jit' dump
Date: Fri, 22 Apr 2016 15:23:24 -0400	[thread overview]
Message-ID: <20160422192324.GA15938@flamenco> (raw)
In-Reply-To: <571A6245.4070209@twiddle.net>

On Fri, Apr 22, 2016 at 10:41:25 -0700, Richard Henderson wrote:
> On 04/19/2016 04:07 PM, Emilio G. Cota wrote:
> > +    ht_avg_len = qht_avg_bucket_chain_length(&tcg_ctx.tb_ctx.htable, &ht_heads);
> > +    cpu_fprintf(f, "TB hash avg chain   %0.5f buckets\n", ht_avg_len);
> > +    cpu_fprintf(f, "TB hash size        %zu head buckets\n", ht_heads);
> 
> I think the accounting is questionable here.
> 
> Consider the following data:
> 
> TB count             230467/671088
> TB invalidate count  25915
> TB hash avg chain    1.03073 buckets
> TB hash size         131072 head buckets
> 
> This means that we've got 230467 - 25915 = 204552 active TB's, installed into a
> hash table with 131072 heads.  For a perfectly uniform distribution of TBs,
> that would be an average chain length of 204552 / 131072 = 1.56.
>
> In order to get the average down to 1.03, there need to be a substantial number
> of heads with zero entries.

Remember that each head can host up to QHT_BUCKET_ENTRIES, so that
1) the bucket fits in a cache line
2) resizing the cache is fast since the full hashes are also stored
   in the bucket
3) full comparisons (i.e. calls to the user-provided cmp() function) are only
   due to full hash collisions, not due to the hash table being
   undersized.
QHT_BUCKET_ENTRIES is 4 entries on 64-bit hosts, and 6 entries on 32-bit.

For your example hash table, assuming we're on 64-bit and a uniform
distribution, occupancy would be
  204552 / 131072 / 4 = 0.39

The confusion comes from the fact that qht_avg_bucket_chain_length()
returns the number of head buckets, not head entries:
/*
 * Returns the number of buckets that an average lookup will traverse.
 * The return value is always >= 1. With a good hashing function, this
 * value should be close to 1.
 * Note that each bucket tracks up to QHT_BUCKET_ENTRIES items.
 */
double qht_avg_bucket_chain_length(struct qht *ht, size_t *n_head_buckets)

The rationale is that all entries on the same bucket can be found in O(1):
even if a bucket is empty, a lookup will do QHT_BUCKET_ENTRIES
comparisons, since a removal simply sets bucket[entry] to NULL, without
touching other entries (on that bucket or any other chained bucket).
This is done to limit churn on removals (the corresponding bucket chain
would have to be traversed). However, although given the relatively
low amount of removals that we have, this might be worth exploring.

> I also wonder if it wouldn't be better to size the hash table as appropriate
> for the maximum number of allowable TBs.

This would hurt performance for workloads that have very low amounts of TB's
and that go to the hash table often, such as booting a minimal ARM linux image.

		Emilio

  reply	other threads:[~2016-04-22 19:23 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-19 23:07 [Qemu-devel] [PATCH v3 00/11] tb hash improvements Emilio G. Cota
2016-04-19 23:07 ` [Qemu-devel] [PATCH v3 01/11] compiler.h: add QEMU_ALIGNED() to enforce struct alignment Emilio G. Cota
2016-04-22  9:32   ` Alex Bennée
2016-04-22  9:35   ` Peter Maydell
2016-04-22 15:50     ` Emilio G. Cota
2016-04-19 23:07 ` [Qemu-devel] [PATCH v3 02/11] seqlock: remove optional mutex Emilio G. Cota
2016-04-19 23:07 ` [Qemu-devel] [PATCH v3 03/11] seqlock: rename write_lock/unlock to write_begin/end Emilio G. Cota
2016-04-19 23:07 ` [Qemu-devel] [PATCH v3 04/11] include/processor.h: define cpu_relax() Emilio G. Cota
2016-04-20 12:15   ` KONRAD Frederic
2016-04-20 17:16     ` Emilio G. Cota
2016-04-20 17:18       ` Peter Maydell
2016-04-20 17:32   ` [Qemu-devel] [UPDATED " Emilio G. Cota
2016-04-22  9:35   ` [Qemu-devel] [PATCH " Alex Bennée
2016-04-19 23:07 ` [Qemu-devel] [PATCH v3 05/11] qemu-thread: add simple test-and-set spinlock Emilio G. Cota
2016-04-20 15:18   ` Richard Henderson
2016-04-20 17:17     ` Emilio G. Cota
2016-04-20 17:55       ` Richard Henderson
2016-04-20 18:11         ` Emilio G. Cota
2016-04-20 19:39           ` Richard Henderson
2016-04-21 16:24             ` Emilio G. Cota
2016-04-20 17:35   ` [Qemu-devel] [UPDATED " Emilio G. Cota
2016-04-19 23:07 ` [Qemu-devel] [PATCH v3 06/11] exec: add tb_hash_func5, derived from xxhash Emilio G. Cota
2016-04-20 15:19   ` Richard Henderson
2016-04-22 12:58   ` Alex Bennée
2016-04-19 23:07 ` [Qemu-devel] [PATCH v3 07/11] tb hash: hash phys_pc, pc, and flags with xxhash Emilio G. Cota
2016-04-22 12:58   ` Alex Bennée
2016-04-19 23:07 ` [Qemu-devel] [PATCH v3 08/11] qht: QEMU's fast, resizable and scalable Hash Table Emilio G. Cota
2016-04-22 14:04   ` Alex Bennée
2016-04-24 20:01   ` Richard Henderson
2016-04-24 21:58     ` Emilio G. Cota
2016-04-19 23:07 ` [Qemu-devel] [PATCH v3 09/11] qht: add test program Emilio G. Cota
2016-04-22 14:35   ` Alex Bennée
2016-04-19 23:07 ` [Qemu-devel] [PATCH v3 10/11] tb hash: track translated blocks with qht Emilio G. Cota
2016-04-28 13:27   ` Alex Bennée
2016-04-19 23:07 ` [Qemu-devel] [PATCH v3 11/11] translate-all: add tb hash bucket info to 'info jit' dump Emilio G. Cota
2016-04-20 15:21   ` Richard Henderson
2016-04-22 14:38   ` Alex Bennée
2016-04-22 17:41   ` Richard Henderson
2016-04-22 19:23     ` Emilio G. Cota [this message]
2016-04-22 19:59     ` Richard Henderson
2016-04-22 23:57       ` Emilio G. Cota
2016-04-24 19:46         ` Richard Henderson
2016-04-24 22:06           ` Emilio G. Cota
2016-04-27  2:43             ` Emilio G. Cota
2016-04-28 16:37               ` Richard Henderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160422192324.GA15938@flamenco \
    --to=cota@braap.org \
    --cc=alex.bennee@linaro.org \
    --cc=crosthwaite.peter@gmail.com \
    --cc=mttcg@greensocs.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=rth@twiddle.net \
    --cc=serge.fdrv@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.