Message-ID: <54187B3D.8000909@twiddle.net>
Date: Tue, 16 Sep 2014 11:02:37 -0700
From: Richard Henderson
References: <1410793421-6453-1-git-send-email-pbonzini@redhat.com> <1410793421-6453-4-git-send-email-pbonzini@redhat.com> <5418716A.9080508@gmail.com>
In-Reply-To: <5418716A.9080508@gmail.com>
Subject: Re: [Qemu-devel] [PATCH 03/14] target-ppc: use separate indices for various translation modes
To: Tom Musta, Paolo Bonzini, qemu-devel@nongnu.org
Cc: agraf@suse.de

On 09/16/2014 10:20 AM, Tom Musta wrote:
> On 9/15/2014 10:03 AM, Paolo Bonzini wrote:
>> PowerPC TCG flushes the TLB on every IR/DR change, which basically
>> means on every user<->kernel context switch.  Encode IR/DR in the
>> MMU index.
>>
>> This brings the number of TLB flushes down from ~900000 to ~50000
>> for starting up the Debian installer, which is in line with x86
>> and gives a ~10% performance improvement.
>>
>> Signed-off-by: Paolo Bonzini
>> ---
>>  target-ppc/cpu.h         |  7 ++-----
>>  target-ppc/excp_helper.c |  3 ---
>>  target-ppc/helper_regs.h | 11 ++++++-----
>>  3 files changed, 8 insertions(+), 13 deletions(-)
>>
>> diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
>> index b64c652..c29ce3b 100644
>> --- a/target-ppc/cpu.h
>> +++ b/target-ppc/cpu.h
>> @@ -922,7 +922,8 @@ struct ppc_segment_page_sizes {
>>
>>  /*****************************************************************************/
>>  /* The whole PowerPC CPU context */
>> -#define NB_MMU_MODES 3
>> +#define NB_MMU_MODES 12
>> +#define MMU_USER_IDX 3 /* PR=IR=DR=1 */
>
> This doesn't build for me:
>
>   CC    ppc64-softmmu/tcg/tcg.o
> In file included from /bghome/tmusta/powerisa/qemu/qemu/tcg/tcg.c:264:
> /bghome/tmusta/powerisa/qemu/qemu/tcg/ppc/tcg-target.c: In function 'tcg_out_tlb_read':
> /bghome/tmusta/powerisa/qemu/qemu/tcg/ppc/tcg-target.c:1394: error: size of array 'qemu_build_bug_on__1396' is negative
> make[1]: *** [tcg/tcg.o] Error 1
> make: *** [subdir-ppc64-softmmu] Error 2
>
> which correlates with this:
>
> 1389     /* Compensate for very large offsets.  */
> 1390     if (add_off >= 0x8000) {
> 1391         /* Most target env are smaller than 32k; none are larger than 64k.
> 1392            Simplify the logic here merely to offset by 0x7ff0, giving us a
> 1393            range just shy of 64k.  Check this assumption.  */
> 1394         QEMU_BUILD_BUG_ON(offsetof(CPUArchState,
> 1395                                    tlb_table[NB_MMU_MODES - 1][1])
> 1396                           > 0x7ff0 + 0x7fff);
> 1397         tcg_out32(s, ADDI | TAI(TCG_REG_TMP1, base, 0x7ff0));
> 1398         base = TCG_REG_TMP1;
> 1399         cmp_off -= 0x7ff0;
> 1400         add_off -= 0x7ff0;
> 1401     }

Ouch, yes indeed.

While we could probably fix this for ppc (using addis), it's not nearly so
easily fixable for arm -- without impacting performance anyway.

Does 96k worth of TLBs really help that much?  Are all 12 of them actually
used?  Can we use a more complex encoding scheme for the mmu_idx and use less?


r~
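
For reference, the arithmetic behind the failing build assertion and the
"96k" figure can be sketched as below. This is only an illustration, not
QEMU code: the SKETCH_* names and sketch_* functions are made up, and the
entry count (256 per MMU mode) and entry size (32 bytes) are assumptions
picked because they reproduce the 96k total. The sketch also ignores
whatever fields precede tlb_table in CPUArchState, so the real offsets are
somewhat larger.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, not taken from the thread: 256 TLB entries per MMU mode
   and 32 bytes per entry.  They reproduce the "96k" total for 12 modes.  */
#define SKETCH_TLB_ENTRIES     256
#define SKETCH_TLB_ENTRY_SIZE  32

/* Largest offset the quoted code can reach: the base register is biased by
   0x7ff0 and the remaining displacement is at most 0x7fff.  */
#define SKETCH_MAX_OFFSET      (0x7ff0 + 0x7fff)

/* Total size in bytes of a tlb_table with the given number of MMU modes.  */
static size_t sketch_tlb_table_bytes(size_t nb_mmu_modes)
{
    return nb_mmu_modes * SKETCH_TLB_ENTRIES * SKETCH_TLB_ENTRY_SIZE;
}

/* Offset of tlb_table[nb_mmu_modes - 1][1], measured from the start of
   tlb_table rather than from the start of CPUArchState.  */
static size_t sketch_last_entry_offset(size_t nb_mmu_modes)
{
    return (nb_mmu_modes - 1) * SKETCH_TLB_ENTRIES * SKETCH_TLB_ENTRY_SIZE
           + SKETCH_TLB_ENTRY_SIZE;
}

/* Nonzero if the offset is within reach, i.e. the build check would pass
   under the assumptions above.  */
static int sketch_offset_fits(size_t nb_mmu_modes)
{
    return sketch_last_entry_offset(nb_mmu_modes) <= SKETCH_MAX_OFFSET;
}

/* Compile-time version of the same check for the old NB_MMU_MODES value of
   3, in the spirit of QEMU_BUILD_BUG_ON.  */
static_assert((3 - 1) * SKETCH_TLB_ENTRIES * SKETCH_TLB_ENTRY_SIZE
              + SKETCH_TLB_ENTRY_SIZE <= SKETCH_MAX_OFFSET,
              "three MMU modes stay within the reachable range");
```

Under these assumptions sketch_tlb_table_bytes(12) is 98304 bytes (96k) and
sketch_offset_fits(12) is 0, while sketch_offset_fits(3) is 1, which matches
the build breaking only once NB_MMU_MODES goes from 3 to 12.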