From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:34090) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1auPyx-0005bW-3l for qemu-devel@nongnu.org; Sun, 24 Apr 2016 15:46:16 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1auPyt-0001Eu-T2 for qemu-devel@nongnu.org; Sun, 24 Apr 2016 15:46:15 -0400 Received: from mail-pa0-x22e.google.com ([2607:f8b0:400e:c03::22e]:34301) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1auPyt-0001EN-JG for qemu-devel@nongnu.org; Sun, 24 Apr 2016 15:46:11 -0400 Received: by mail-pa0-x22e.google.com with SMTP id r5so50731216pag.1 for ; Sun, 24 Apr 2016 12:46:11 -0700 (PDT) Sender: Richard Henderson References: <1461107270-19234-1-git-send-email-cota@braap.org> <1461107270-19234-12-git-send-email-cota@braap.org> <571A6245.4070209@twiddle.net> <571A82B8.5080908@twiddle.net> <20160422235738.GA2410@flamenco> From: Richard Henderson Message-ID: <571D2280.3040908@twiddle.net> Date: Sun, 24 Apr 2016 12:46:08 -0700 MIME-Version: 1.0 In-Reply-To: <20160422235738.GA2410@flamenco> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v3 11/11] translate-all: add tb hash bucket info to 'info jit' dump List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Emilio G. Cota" Cc: QEMU Developers , MTTCG Devel , =?UTF-8?Q?Alex_Benn=c3=a9e?= , Paolo Bonzini , Peter Crosthwaite , Peter Maydell , Sergey Fedorov On 04/22/2016 04:57 PM, Emilio G. Cota wrote: > On Fri, Apr 22, 2016 at 12:59:52 -0700, Richard Henderson wrote: >> FWIW, so that I could get an idea of how the stats change as we improve the >> hashing, I inserted the attachment 1 patch between patches 5 and 6, and with >> attachment 2 attempting to fix the accounting for patches 9 and 10. > > For qht, I dislike the approach of reporting "avg chain" per-element, > instead of per-bucket. Performance for a bucket whose entries are > all valid is virtually the same as that of a bucket that only > has one valid element; thus, with per-bucket reporting, we'd say that > the chain lenght is 1 in both cases, i.e. "perfect". With per-element > reporting, we'd report 4 (on a 64-bit host, since that's the value of > QHT_BUCKET_ENTRIES) when the bucket is full, which IMO gives the > wrong idea (users would think they're in trouble, when they're not). But otherwise you have no way of knowing how full the buckets are. The bucket size is just something that you have to keep in mind. > If those numbers are off, then either this > assert(hinfo.used_entries == > tcg_ctx.tb_ctx.nb_tbs - tcg_ctx.tb_ctx.tb_phys_invalidate_count); > should trigger, or the accounting isn't right. I think I used an NDEBUG build, so these weren't effective. > Only the second report ("after 7/11") seems good (taking into account > lack of precision of just 3 decimals): > 5.26*32582=171381.32 ~= 171367 > which leads me to believe that you've used the TB and invalidate > counts from that test. The TB and invalidate numbers are repeatable; the same every time. > You can try, but I think performance wouldn't be great, because > the comparison function would be called way too often due to the > ht using open addressing. The problem there is not only the comparisons > themselves, but the all the cache lines needed to read the fields of > the comparison. I haven't tested libiberty's htable but I did test > the htable in concurrencykit[1], which also uses open addressing. You are right that having the full hash for primary comparison is a big win, especially with how complex our comparison functions are. And you're right that we have to have two of them. > This led me to a design that had buckets with a small set of > hash & pointer pairs, all in the same cache line as the head (then > I discovered somebody else had thought of this, and that's why there's > a link to the CLHT paper in qht.c). Fair. It's a good design. r~