On 04/22/2016 10:41 AM, Richard Henderson wrote: > On 04/19/2016 04:07 PM, Emilio G. Cota wrote: >> + ht_avg_len = qht_avg_bucket_chain_length(&tcg_ctx.tb_ctx.htable, &ht_heads); >> + cpu_fprintf(f, "TB hash avg chain %0.5f buckets\n", ht_avg_len); >> + cpu_fprintf(f, "TB hash size %zu head buckets\n", ht_heads); > > > I think the accounting is questionable here. > > Consider the following data: > > TB count 230467/671088 > TB invalidate count 25915 > TB hash avg chain 1.03073 buckets > TB hash size 131072 head buckets > > This means that we've got 230467 - 25915 = 204552 active TB's, installed into a > hash table with 131072 heads. For a perfectly uniform distribution of TBs, > that would be an average chain length of 204552 / 131072 = 1.56. > > In order to get the average down to 1.03, there need to be a substantial number > of heads with zero entries. > > I think perhaps it might be more enlightening to separately account for empty > and non-empty heads. E.g. > > TB hash buckets used xxxx/131072 > TB hash avg chain yyyy > > where xxxx is the number of non-empty heads, and yyyy = |TBs| / xxxx. > > I also wonder if it wouldn't be better to size the hash table as appropriate > for the maximum number of allowable TBs. FWIW, so that I could get an idea of how the stats change as we improve the hashing, I inserted the attachment 1 patch between patches 5 and 6, and with attachment 2 attempting to fix the accounting for patches 9 and 10. For booting an alpha kernel to login prompt: Before hashing changes (@5/11) TB count 175363/671088 TB invalidate count 3996 TB hash buckets 31731/32768 TB hash avg chain 5.289 max=59 After xxhash patch (@7/11) TB hash buckets 32582/32768 TB hash avg chain 5.260 max=18 So far so good! After qht patches (@11/11) TB hash buckets 94360/131072 TB hash avg chain 1.774 max=8 Do note that those last numbers are off: 1.774 avg * 94360 used buckets = 167394 total entries, which is far from 171367, the correct number of total entries. I'm tempted to pull over gcc's non-chaining hash table implementation (libiberty/hashtab.c, still gplv2+) and compare... r~