Hi Richard,

On Tue, Feb 09, 2016 at 09:39:55PM +1100, Richard Henderson wrote:
> @@ -1212,11 +1237,24 @@ static void tcg_out_tlb_load(TCGContext *s, TCGReg base, TCGReg addrl,
>                     : offsetof(CPUArchState, tlb_table[mem_index][0].addr_write));
>     int add_off = offsetof(CPUArchState, tlb_table[mem_index][0].addend);
>
> -    tcg_out_opc_sa(s, OPC_SRL, TCG_REG_A0, addrl,
> -                   TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
> -    tcg_out_opc_imm(s, OPC_ANDI, TCG_REG_A0, TCG_REG_A0,
> -                    (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
> -    tcg_out_opc_reg(s, OPC_ADDU, TCG_REG_A0, TCG_REG_A0, TCG_AREG0);
> +    if (use_mips32r2_instructions) {
> +        if (TCG_TARGET_REG_BITS == 32 || TARGET_LONG_BITS == 32) {
> +            tcg_out_opc_bf(s, OPC_EXT, TCG_REG_A0, addrl,
> +                           TARGET_PAGE_BITS + CPU_TLB_ENTRY_BITS - 1,
> +                           CPU_TLB_ENTRY_BITS);
> +        } else {
> +            tcg_out_opc_bf64(s, OPC_DEXT, OPC_DEXTM, OPC_DEXTU,
> +                             TCG_REG_A0, addrl,
> +                             TARGET_PAGE_BITS + CPU_TLB_ENTRY_BITS - 1,
> +                             CPU_TLB_ENTRY_BITS);
> +        }

The ext/dext here will end up with bits below bit CPU_TLB_ENTRY_BITS set,
which will result in the addend being loaded from a slightly offset
address, so things go badly wrong. You still need to either ANDI off the
low bits, or trim them off with the ext/dext and shift the result left
again.

So I don't think there's any benefit to using these instructions unless
CPU_TLB_BITS + CPU_TLB_ENTRY_BITS exceeds the 16 bits available in the
ANDI immediate field for the non-r2 case.

Cheers
James

> +    } else {
> +        tcg_out_opc_sa(s, ALIAS_TSRL, TCG_REG_A0, addrl,
> +                       TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
> +        tcg_out_opc_imm(s, OPC_ANDI, TCG_REG_A0, TCG_REG_A0,
> +                        (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS);
> +    }
> +    tcg_out_opc_reg(s, ALIAS_PADD, TCG_REG_A0, TCG_REG_A0, TCG_AREG0);
>
>     /* Compensate for very large offsets. */
>     if (add_off >= 0x8000) {