From: Richard Henderson
Date: Fri, 22 Apr 2016 12:59:52 -0700
Subject: Re: [Qemu-devel] [PATCH v3 11/11] translate-all: add tb hash bucket info to 'info jit' dump
Message-ID: <571A82B8.5080908@twiddle.net>
In-Reply-To: <571A6245.4070209@twiddle.net>
References: <1461107270-19234-1-git-send-email-cota@braap.org>
 <1461107270-19234-12-git-send-email-cota@braap.org>
 <571A6245.4070209@twiddle.net>
To: "Emilio G. Cota", QEMU Developers, MTTCG Devel
Cc: Alex Bennée, Paolo Bonzini, Peter Crosthwaite, Peter Maydell,
 Sergey Fedorov

On 04/22/2016 10:41 AM, Richard Henderson wrote:
> On 04/19/2016 04:07 PM, Emilio G. Cota wrote:
>> +    ht_avg_len = qht_avg_bucket_chain_length(&tcg_ctx.tb_ctx.htable, &ht_heads);
>> +    cpu_fprintf(f, "TB hash avg chain   %0.5f buckets\n", ht_avg_len);
>> +    cpu_fprintf(f, "TB hash size        %zu head buckets\n", ht_heads);
>
> I think the accounting is questionable here.
>
> Consider the following data:
>
> TB count             230467/671088
> TB invalidate count  25915
> TB hash avg chain    1.03073 buckets
> TB hash size         131072 head buckets
>
> This means that we've got 230467 - 25915 = 204552 active TBs, installed
> into a hash table with 131072 heads.  For a perfectly uniform distribution
> of TBs, that would be an average chain length of 204552 / 131072 = 1.56.
>
> In order to get the average down to 1.03, a substantial number of heads
> must have zero entries.
>
> I think it would be more enlightening to account separately for empty and
> non-empty heads, e.g.
>
>   TB hash buckets used  xxxx/131072
>   TB hash avg chain     yyyy
>
> where xxxx is the number of non-empty heads, and yyyy = |TBs| / xxxx.
>
> I also wonder whether it would be better to size the hash table for the
> maximum number of allowable TBs.

FWIW, to get an idea of how the stats change as we improve the hashing, I
inserted the attachment 1 patch (z1.txt) between patches 5 and 6, with
attachment 2 (z2.txt) attempting to fix the accounting for patches 9 and 10.

For booting an alpha kernel to the login prompt:

Before hashing changes (@5/11)

TB count             175363/671088
TB invalidate count  3996
TB hash buckets      31731/32768
TB hash avg chain    5.289 max=59

After xxhash patch (@7/11)

TB hash buckets      32582/32768
TB hash avg chain    5.260 max=18

So far so good!
After qht patches (@11/11)

TB hash buckets      94360/131072
TB hash avg chain    1.774 max=8

Do note that those last numbers are off: 1.774 avg chain * 94360 used buckets
= 167394 total entries, which is far from 171367, the correct number of total
entries.

I'm tempted to pull over gcc's non-chaining hash table implementation
(libiberty/hashtab.c, still GPLv2+) and compare...


r~

Content-Type: text/plain; charset=UTF-8; name="z1.txt"
Content-Disposition: attachment; filename="z1.txt"

From cdc7b3631fd78bd2e31d2823f7543e2a56681149 Mon Sep 17 00:00:00 2001
From: Richard Henderson
Date: Fri, 22 Apr 2016 11:28:52 -0700
Subject: translate-all: Add hashtable accounting to info jit

Dump hash table occupancy numbers with "info jit".

Signed-off-by: Richard Henderson

diff --git a/translate-all.c b/translate-all.c
index 1a8f68b..ed296d5 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -1671,39 +1671,55 @@ void tb_flush_jmp_cache(CPUState *cpu, target_ulong addr)

 void dump_exec_info(FILE *f, fprintf_function cpu_fprintf)
 {
-    int i, target_code_size, max_target_code_size;
-    int direct_jmp_count, direct_jmp2_count, cross_page;
+    size_t target_code_size, max_target_code_size;
+    unsigned direct_jmp_count, direct_jmp2_count, cross_page;
+    unsigned used_buckets, max_chain, hash_tbs;
     TranslationBlock *tb;
+    int i;

     target_code_size = 0;
     max_target_code_size = 0;
     cross_page = 0;
     direct_jmp_count = 0;
     direct_jmp2_count = 0;
-    for (i = 0; i < tcg_ctx.tb_ctx.nb_tbs; i++) {
-        tb = &tcg_ctx.tb_ctx.tbs[i];
-        target_code_size += tb->size;
-        if (tb->size > max_target_code_size) {
-            max_target_code_size = tb->size;
-        }
-        if (tb->page_addr[1] != -1) {
-            cross_page++;
-        }
-        if (tb->tb_next_offset[0] != 0xffff) {
-            direct_jmp_count++;
-            if (tb->tb_next_offset[1] != 0xffff) {
-                direct_jmp2_count++;
+    used_buckets = 0;
+    hash_tbs = 0;
+    max_chain = 0;
+
+    for (i = 0; i < CODE_GEN_PHYS_HASH_SIZE; i++) {
+        if (tcg_ctx.tb_ctx.tb_phys_hash[i]) {
+            unsigned this_chain = 0;
+            for (tb = tcg_ctx.tb_ctx.tb_phys_hash[i]; tb != NULL;
+                 tb = tb->phys_hash_next) {
+                this_chain++;
+                hash_tbs++;
+                target_code_size += tb->size;
+                if (tb->page_addr[1] != -1) {
+                    cross_page++;
+                }
+                if (tb->tb_next_offset[0] != 0xffff) {
+                    direct_jmp_count++;
+                    if (tb->tb_next_offset[1] != 0xffff) {
+                        direct_jmp2_count++;
+                    }
+                }
             }
+            if (this_chain > max_chain) {
+                max_chain = this_chain;
+            }
+            used_buckets++;
         }
     }
-    /* XXX: avoid using doubles ? */
+    assert(hash_tbs ==
+           tcg_ctx.tb_ctx.nb_tbs - tcg_ctx.tb_ctx.tb_phys_invalidate_count);
+
     cpu_fprintf(f, "Translation buffer state:\n");
     cpu_fprintf(f, "gen code size       %td/%zd\n",
                 tcg_ctx.code_gen_ptr - tcg_ctx.code_gen_buffer,
                 tcg_ctx.code_gen_highwater - tcg_ctx.code_gen_buffer);
     cpu_fprintf(f, "TB count            %d/%d\n",
             tcg_ctx.tb_ctx.nb_tbs, tcg_ctx.code_gen_max_blocks);
-    cpu_fprintf(f, "TB avg target size  %d max=%d bytes\n",
+    cpu_fprintf(f, "TB avg target size  %zd max=%zd bytes\n",
                 tcg_ctx.tb_ctx.nb_tbs ? target_code_size / tcg_ctx.tb_ctx.nb_tbs : 0,
                 max_target_code_size);
@@ -1717,13 +1733,18 @@ void dump_exec_info(FILE *f, fprintf_function cpu_fprintf)
     cpu_fprintf(f, "cross page TB count %d (%d%%)\n", cross_page,
             tcg_ctx.tb_ctx.nb_tbs ? (cross_page * 100) /
                     tcg_ctx.tb_ctx.nb_tbs : 0);
-    cpu_fprintf(f, "direct jump count   %d (%d%%) (2 jumps=%d %d%%)\n",
+    cpu_fprintf(f, "direct jump count   %u (%u%%) (2 jumps=%u %u%%)\n",
                 direct_jmp_count,
                 tcg_ctx.tb_ctx.nb_tbs ?
                 (direct_jmp_count * 100) / tcg_ctx.tb_ctx.nb_tbs : 0,
                 direct_jmp2_count,
                 tcg_ctx.tb_ctx.nb_tbs ?
                 (direct_jmp2_count * 100) / tcg_ctx.tb_ctx.nb_tbs : 0);
+    cpu_fprintf(f, "TB hash buckets     %u/%d\n",
+                used_buckets, CODE_GEN_PHYS_HASH_SIZE);
+    cpu_fprintf(f, "TB hash avg chain   %0.3f max=%u\n",
+                used_buckets ? (double)hash_tbs / used_buckets : 0.0,
+                max_chain);

     cpu_fprintf(f, "\nStatistics:\n");
     cpu_fprintf(f, "TB flush count      %d\n", tcg_ctx.tb_ctx.tb_flush_count);
     cpu_fprintf(f, "TB invalidate count %d\n",

Content-Type: text/plain; charset=UTF-8; name="z2.txt"
Content-Disposition: attachment; filename="z2.txt"

From 7f1d677f3d085b5891e1adbd5f602185d68ba81a Mon Sep 17 00:00:00 2001
From: Richard Henderson
Date: Fri, 22 Apr 2016 12:50:00 -0700
Subject: fixup to {09,10,11}/11

diff --git a/include/qemu/qht.h b/include/qemu/qht.h
index a0a1aa8..2d0b58f 100644
--- a/include/qemu/qht.h
+++ b/include/qemu/qht.h
@@ -49,6 +49,14 @@ void qht_grow(struct qht *ht);

 void *qht_lookup(struct qht *ht, qht_lookup_func_t func, const void *userp,
                  uint32_t hash);
-double qht_avg_bucket_chain_length(struct qht *ht, size_t *n_head_buckets);
+
+struct qht_stats {
+    size_t used_buckets;
+    size_t max_buckets;
+    size_t used_entries;
+    size_t max_chain;
+};
+
+struct qht_stats qht_statistics(struct qht *ht);

 #endif /* QHT_H */
diff --git a/translate-all.c b/translate-all.c
index 3b73b46..a9ceb0a 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -1664,8 +1664,7 @@ void dump_exec_info(FILE *f, fprintf_function cpu_fprintf)
 {
     size_t target_code_size, max_target_code_size;
     unsigned direct_jmp_count, direct_jmp2_count, cross_page;
-    unsigned used_buckets, max_chain, hash_tbs;
-    TranslationBlock *tb;
+    struct qht_stats hinfo;
     int i;

     target_code_size = 0;
@@ -1673,35 +1672,23 @@ void dump_exec_info(FILE *f, fprintf_function cpu_fprintf)
     cross_page = 0;
     direct_jmp_count = 0;
     direct_jmp2_count = 0;
-    used_buckets = 0;
-    hash_tbs = 0;
-    max_chain = 0;
-
-    for (i = 0; i < CODE_GEN_PHYS_HASH_SIZE; i++) {
-        if (tcg_ctx.tb_ctx.tb_phys_hash[i]) {
-            unsigned this_chain = 0;
-            for (tb = tcg_ctx.tb_ctx.tb_phys_hash[i]; tb != NULL;
-                 tb = tb->phys_hash_next) {
-                this_chain++;
-                hash_tbs++;
-                target_code_size += tb->size;
-                if (tb->page_addr[1] != -1) {
-                    cross_page++;
-                }
-                if (tb->tb_next_offset[0] != 0xffff) {
-                    direct_jmp_count++;
-                    if (tb->tb_next_offset[1] != 0xffff) {
-                        direct_jmp2_count++;
-                    }
-                }
-            }
-            if (this_chain > max_chain) {
-                max_chain = this_chain;
+
+    for (i = 0; i < tcg_ctx.tb_ctx.nb_tbs; i++) {
+        const TranslationBlock *tb = &tcg_ctx.tb_ctx.tbs[i];
+        target_code_size += tb->size;
+        if (tb->page_addr[1] != -1) {
+            cross_page++;
+        }
+        if (tb->tb_next_offset[0] != 0xffff) {
+            direct_jmp_count++;
+            if (tb->tb_next_offset[1] != 0xffff) {
+                direct_jmp2_count++;
             }
-            used_buckets++;
         }
     }
-    assert(hash_tbs ==
+
+    hinfo = qht_statistics(&tcg_ctx.tb_ctx.htable);
+    assert(hinfo.used_entries ==
            tcg_ctx.tb_ctx.nb_tbs - tcg_ctx.tb_ctx.tb_phys_invalidate_count);

     cpu_fprintf(f, "Translation buffer state:\n");
@@ -1731,11 +1718,12 @@ void dump_exec_info(FILE *f, fprintf_function cpu_fprintf)
                 direct_jmp2_count,
                 tcg_ctx.tb_ctx.nb_tbs ?
                 (direct_jmp2_count * 100) / tcg_ctx.tb_ctx.nb_tbs : 0);
-    cpu_fprintf(f, "TB hash buckets     %u/%d\n",
-                used_buckets, CODE_GEN_PHYS_HASH_SIZE);
-    cpu_fprintf(f, "TB hash avg chain   %0.3f max=%u\n",
-                used_buckets ? (double)hash_tbs / used_buckets : 0.0,
-                max_chain);
+    cpu_fprintf(f, "TB hash buckets     %zu/%zu\n",
+                hinfo.used_buckets, hinfo.max_buckets);
+    cpu_fprintf(f, "TB hash avg chain   %0.3f max=%zu\n",
+                hinfo.used_buckets
+                ? (double)hinfo.used_entries / hinfo.used_buckets : 0.0,
+                hinfo.max_chain);

     cpu_fprintf(f, "\nStatistics:\n");
     cpu_fprintf(f, "TB flush count      %d\n", tcg_ctx.tb_ctx.tb_flush_count);
     cpu_fprintf(f, "TB invalidate count %d\n",
diff --git a/util/qht.c b/util/qht.c
index 05ea5e8..535057b 100644
--- a/util/qht.c
+++ b/util/qht.c
@@ -556,35 +556,45 @@
  * value should be close to 1.
  * Note that each bucket tracks up to QHT_BUCKET_ENTRIES items.
  */
-double qht_avg_bucket_chain_length(struct qht *ht, size_t *n_head_buckets)
+struct qht_stats qht_statistics(struct qht *ht)
 {
     struct qht_map *map;
-    size_t count = 0;
-    size_t i;
+    struct qht_stats s = {};
+    size_t i, n;

     map = atomic_read(&ht->map);
     /* paired with smp_wmb() before setting ht->map */
     smp_rmb();
+    s.max_buckets = n = map->n;

-    for (i = 0; i < map->n; i++) {
+    for (i = 0; i < n; i++) {
         struct qht_bucket *head = &map->buckets[i];
         struct qht_bucket *b;
-        size_t bucket_count;
+        size_t this_chain;
         uint32_t version;

         do {
             version = seqlock_read_begin(&head->sequence);
-            bucket_count = 0;
+            this_chain = 0;
             b = head;
             do {
-                bucket_count++;
+                int j;
+                for (j = 0; j < QHT_BUCKET_ENTRIES; j++) {
+                    if (b->hashes[j]) {
+                        this_chain++;
+                    }
+                }
                 b = b->next;
             } while (b);
         } while (seqlock_read_retry(&head->sequence, version));
-        count += bucket_count;
-    }
-    if (n_head_buckets) {
-        *n_head_buckets = map->n;
+        if (this_chain != 0) {
+            s.used_entries += this_chain;
+            if (s.max_chain < this_chain) {
+                s.max_chain = this_chain;
+            }
+            s.used_buckets++;
+        }
     }
-    return (double)count / map->n;
+
+    return s;
 }
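
As a standalone illustration of the per-head accounting suggested above
(count only non-empty heads, and report entries / non-empty heads as the
average chain length), a minimal sketch for a plain chained hash table could
look like this; the node/table/chain_stats types and names below are
hypothetical stand-ins, not the QEMU tb_phys_hash or qht structures:

/* Minimal sketch of the per-head accounting discussed above: count only
 * non-empty heads, and report entries / non-empty heads as the average
 * chain length.  The types here are stand-ins for a plain chained hash
 * table, not the QEMU structures. */

#include <stdio.h>
#include <stddef.h>

struct node {
    struct node *next;
};

struct table {
    struct node **heads;
    size_t n_heads;
};

struct chain_stats {
    size_t used_heads;   /* heads with at least one entry */
    size_t entries;      /* total entries in the table */
    size_t max_chain;    /* longest chain seen */
};

static struct chain_stats table_stats(const struct table *t)
{
    struct chain_stats s = { 0, 0, 0 };
    size_t i;

    for (i = 0; i < t->n_heads; i++) {
        const struct node *n;
        size_t chain = 0;

        /* Walk the chain hanging off this head. */
        for (n = t->heads[i]; n != NULL; n = n->next) {
            chain++;
        }
        /* Only non-empty heads contribute to the average. */
        if (chain != 0) {
            s.used_heads++;
            s.entries += chain;
            if (chain > s.max_chain) {
                s.max_chain = chain;
            }
        }
    }
    return s;
}

int main(void)
{
    /* Tiny demo: 4 heads, 3 entries, one collision on head 2. */
    struct node n0 = { NULL }, n1 = { NULL }, n2 = { &n1 };
    struct node *heads[4] = { &n0, NULL, &n2, NULL };
    struct table t = { heads, 4 };
    struct chain_stats s = table_stats(&t);

    printf("hash buckets   %zu/%zu\n", s.used_heads, t.n_heads);
    printf("hash avg chain %0.3f max=%zu\n",
           s.used_heads ? (double)s.entries / s.used_heads : 0.0,
           s.max_chain);
    return 0;
}

Built standalone, the demo prints "hash buckets   2/4" and
"hash avg chain 1.500 max=2", the same shape of output as the
"TB hash buckets" / "TB hash avg chain" lines added by the patches above.
With this split, an average of 1.0 means every used head holds exactly one
entry, so empty heads no longer drag the average down.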