All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jani Kokkonen <Jani.Kokkonen@huawei.com>
To: Richard Henderson <rth@twiddle.net>
Cc: Laurent Desnogues <laurent.desnogues@gmail.com>,
	Peter Maydell <peter.maydell@linaro.org>,
	Claudio Fontana <claudio.fontana@huawei.com>,
	qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 4/4] tcg/aarch64: implement tlb lookup fast path
Date: Mon, 3 Jun 2013 13:21:55 +0200	[thread overview]
Message-ID: <51AC7C53.40105@huawei.com> (raw)
In-Reply-To: <51A90736.4090005@twiddle.net>

On 5/31/2013 10:25 PM, Richard Henderson wrote:
> On 05/31/2013 11:07 AM, Jani Kokkonen wrote:
>> +/* Load and compare a TLB entry, leaving the flags set.  Leaves X2 pointing
>> +   to the tlb entry.  Clobbers X0,X1,X2,X3 and TMP.  */
>> +
>> +static void tcg_out_tlb_read(TCGContext *s, TCGReg addr_reg,
>> +                             int s_bits, uint8_t **label_ptr, int tlb_offset)
>> +{
> 
> You copied the comment from ARM, and it isn't correct.  You generate branches.

I will fix the comment.
> 
>> +    TCGReg base = TCG_AREG0;
>> +
>> +    tcg_out_shr(s, 1, TCG_REG_TMP, addr_reg, TARGET_PAGE_BITS);
>> +    tcg_out_movi(s, TCG_TYPE_I64, TCG_REG_X1, tlb_offset);
>> +    tcg_out_arith(s, ARITH_ADD, 1, TCG_REG_X2, base, TCG_REG_X1, 0);
>> +    tcg_out_andi(s, 1, TCG_REG_X0, TCG_REG_TMP, CPU_TLB_BITS, 0);
>> +    tcg_out_arith(s, ARITH_ADD, 1, TCG_REG_X2, TCG_REG_X2,
>> +                             TCG_REG_X0, -CPU_TLB_ENTRY_BITS);
>> +#if TARGET_LONG_BITS == 64
>> +    tcg_out_ldst(s, LDST_64, LDST_LD, TCG_REG_X3, TCG_REG_X2, 0);
>> +#else
>> +    tcg_out_ldst(s, LDST_32, LDST_LD, TCG_REG_X3, TCG_REG_X2, 0);
>> +#endif
>> +    /* check alignment */
>> +    if (s_bits) {
>> +        tcg_out_tst(s, 1, addr_reg, s_bits, 0);
>> +        label_ptr[0] = s->code_ptr;
>> +        tcg_out_goto_cond_noaddr(s, TCG_COND_NE);
>> +    }
>> +    tcg_out_cmp(s, 1, TCG_REG_X3, TCG_REG_TMP, -TARGET_PAGE_BITS);
>> +    label_ptr[1] = s->code_ptr;
>> +    tcg_out_goto_cond_noaddr(s, TCG_COND_NE);
> 
> I'm positive that the branch predictor would be happier with a single branch
> rather than the two you generate here.  It ought to be possible to use a
> different set of insns to do this in one go.
> 
> How about something like
> 
> 	@ extract the tlb index from the address
> 	ubfm	w0, addr_reg, TARGET_PAGE_BITS, CPU_TLB_BITS
> 
> 	@ add any "high bits" from the tlb offset
> 	@ noting that env will be much smaller than 24 bits.
> 	add	x1, env, tlb_offset & 0xfff000
> 
> 	@ zap the tlb index from the address for compare
> 	@ this is all high bits plus 0-3 low bits set, so this
> 	@ should match a logical immediate.
> 	and	w/x2, addr_reg, TARGET_PAGE_MASK | ((1 << s_bits) - 1)
> 
> 	@ merge the tlb index into the env+tlb_offset
> 	add	x1, x1, x0, lsl #3
> 
> 	@ load the tlb comparator.  the 12-bit scaled offset
> 	@ form will fit the bits remaining from above, given that
> 	@ we're loading an aligned object, and so the low 2/3 bits
> 	@ will be clear.
> 	ldr	w/x0, [x1, tlb_offset & 0xfff]
> 
> 	@ load the tlb addend.  do this early to avoid stalling.
> 	@ the addend_offset differs from tlb_offset by 1-3 words.
> 	@ given that we've got overlap between the scaled 12-bit
> 	@ value and the 12-bit shifted value above, this also ought
> 	@ to always be representable.
> 	ldr	x3, [x1, (tlb_offset & 0xfff) + (addend_offset - tlb_offset)]
> 
> 	@ perform the comparison
> 	cmp	w/x0, w/x2
> 
> 	@ generate the complete host address in parallel with the cmp.
> 	add	x3, x3, addr_reg		@ 64-bit guest
> 	add	x3, x3, addr_reg, uxtw		@ 32-bit guest
> 
> 	bne	miss_label
> 
> Note that the w/x above indicates the ext setting that ought to be used,
> depending on the address size of the guest.
> 
> This is at least 2 insns shorter than your sequence.

Ok, thanks. ubfm instruction will be added and I will modify implementation based on your comments.
> 
> Have you looked at doing the out-of-line tlb miss sequence right from the
> very beginning?  It's not that much more difficult to accomplish than the
> inline tlb miss.

I have to look into this one.
> 
> See CONFIG_QEMU_LDST_OPTIMIZATION, and the implementation in tcg/arm.
> You won't need two nops after the call; aarch64 can do all the required
> extensions and data movement operations in a single insn.
> 
> 

I will take this also into account. 

> r~
> 

-Jani

  reply	other threads:[~2013-06-03 11:23 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-31 17:51 [Qemu-devel] [PATCH 0/4] ARM aarch64 TCG tlb fast lookup Jani Kokkonen
2013-05-31 17:57 ` [Qemu-devel] [PATCH 1/4] tcg/aarch64: more low level ops in preparation of tlb, lookup Jani Kokkonen
2013-05-31 19:07   ` Richard Henderson
2013-06-03  9:43     ` Claudio Fontana
2013-05-31 18:01 ` [Qemu-devel] [PATCH 2/4] tcg/aarch64: implement byte swap operations Jani Kokkonen
2013-05-31 19:11   ` Richard Henderson
2013-06-03  9:44     ` Claudio Fontana
2013-05-31 18:05 ` [Qemu-devel] [PATCH 3/4] tcg/aarch64: implement sign/zero extend operations Jani Kokkonen
2013-05-31 19:13   ` Richard Henderson
2013-06-03  9:48     ` Claudio Fontana
2013-05-31 18:07 ` [Qemu-devel] [PATCH 4/4] tcg/aarch64: implement tlb lookup fast path Jani Kokkonen
2013-05-31 20:25   ` Richard Henderson
2013-06-03 11:21     ` Jani Kokkonen [this message]
2013-06-03 15:52       ` Richard Henderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51AC7C53.40105@huawei.com \
    --to=jani.kokkonen@huawei.com \
    --cc=claudio.fontana@huawei.com \
    --cc=laurent.desnogues@gmail.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=rth@twiddle.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.