All of lore.kernel.org
 help / color / mirror / Atom feed
From: Richard Henderson <rth@twiddle.net>
To: qemu-devel@nongnu.org
Cc: claudio.fontana@huawei.com
Subject: [Qemu-devel] [PATCH v4 04/25] tcg-aarch64: Use MOVN in tcg_out_movi
Date: Fri, 11 Apr 2014 08:40:06 -0700	[thread overview]
Message-ID: <1397230827-24222-5-git-send-email-rth@twiddle.net> (raw)
In-Reply-To: <1397230827-24222-1-git-send-email-rth@twiddle.net>

When profitable, initialize the register with MOVN instead of MOVZ,
before setting the remaining lanes with MOVK.

Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/aarch64/tcg-target.c | 63 ++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 50 insertions(+), 13 deletions(-)

diff --git a/tcg/aarch64/tcg-target.c b/tcg/aarch64/tcg-target.c
index 5e6d10b..1d7612c 100644
--- a/tcg/aarch64/tcg-target.c
+++ b/tcg/aarch64/tcg-target.c
@@ -531,24 +531,61 @@ static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd,
                          tcg_target_long value)
 {
     AArch64Insn insn;
-
-    if (type == TCG_TYPE_I32) {
+    int i, wantinv, shift;
+    tcg_target_long svalue = value;
+    tcg_target_long ivalue = ~value;
+    tcg_target_long imask;
+
+    /* For 32-bit values, discard potential garbage in value.  For 64-bit
+       values within [2**31, 2**32-1], we can create smaller sequences by
+       interpreting this as a negative 32-bit number, while ensuring that
+       the high 32 bits are cleared by setting SF=0.  */
+    if (type == TCG_TYPE_I32 || (value & ~0xffffffffull) == 0) {
+        svalue = (int32_t)value;
         value = (uint32_t)value;
+        ivalue = (uint32_t)ivalue;
+        type = TCG_TYPE_I32;
+    }
+
+    /* Would it take fewer insns to begin with MOVN?  For the value and its
+       inverse, count the number of 16-bit lanes that are 0.  */
+    for (i = wantinv = imask = 0; i < 64; i += 16) {
+        tcg_target_long mask = 0xffffull << i;
+        if ((value & mask) == 0) {
+            wantinv -= 1;
+        }
+        if ((ivalue & mask) == 0) {
+            wantinv += 1;
+            imask |= mask;
+        }
     }
 
-    /* count trailing zeros in 16 bit steps, mapping 64 to 0. Emit the
-       first MOVZ with the half-word immediate skipping the zeros, with a shift
-       (LSL) equal to this number. Then all next instructions use MOVKs.
-       Zero the processed half-word in the value, continue until empty.
-       We build the final result 16bits at a time with up to 4 instructions,
-       but do not emit instructions for 16bit zero holes. */
+    /* If we had more 0xffff than 0x0000, invert VALUE and use MOVN.  */
     insn = I3405_MOVZ;
-    do {
-        unsigned shift = ctz64(value) & (63 & -16);
-        tcg_out_insn_3405(s, insn, shift >= 32, rd, value >> shift, shift);
+    if (wantinv > 0) {
+        value = ivalue;
+        insn = I3405_MOVN;
+    }
+
+    /* Find the lowest lane that is not 0x0000.  */
+    shift = ctz64(value) & (63 & -16);
+    tcg_out_insn_3405(s, insn, type, rd, value >> shift, shift);
+
+    if (wantinv > 0) {
+        /* Re-invert the value, so MOVK sees non-inverted bits.  */
+        value = ~value;
+        /* Clear out all the 0xffff lanes.  */
+        value ^= imask;
+    }
+    /* Clear out the lane that we just set.  */
+    value &= ~(0xffffUL << shift);
+
+    /* Iterate until all lanes have been set, and thus cleared from VALUE.  */
+    while (value) {
+        shift = ctz64(value) & (63 & -16);
+        tcg_out_insn(s, 3405, MOVK, type, rd, value >> shift, shift);
         value &= ~(0xffffUL << shift);
-        insn = I3405_MOVK;
-    } while (value);
+    }
 }
 
 static inline void tcg_out_ldst_r(TCGContext *s,
-- 
1.9.0

  parent reply	other threads:[~2014-04-11 15:41 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-11 15:40 [Qemu-devel] [PATCH v4 00/25] tcg-aarch64 improvments Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 01/25] tcg-aarch64: Properly detect SIGSEGV writes Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 02/25] tcg-aarch64: Use intptr_t apropriately Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 03/25] tcg-aarch64: Use TCGType and TCGMemOp constants Richard Henderson
2014-04-11 15:40 ` Richard Henderson [this message]
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 05/25] tcg-aarch64: Use ORRI in tcg_out_movi Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 06/25] tcg-aarch64: Special case small constants " Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 07/25] tcg-aarch64: Use adrp " Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 08/25] tcg-aarch64: Use symbolic names for branches Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 09/25] tcg-aarch64: Create tcg_out_brcond Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 10/25] tcg-aarch64: Use CBZ and CBNZ Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 11/25] tcg-aarch64: Reuse LR in translated code Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 12/25] tcg-aarch64: Introduce tcg_out_insn_3314 Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 13/25] tcg-aarch64: Implement tcg_register_jit Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 14/25] tcg-aarch64: Avoid add with zero in tlb load Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 15/25] tcg-aarch64: Use tcg_out_call for qemu_ld/st Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 16/25] tcg-aarch64: Use ADR to pass the return address to the ld/st helpers Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 17/25] tcg-aarch64: Use TCGMemOp in qemu_ld/st Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 18/25] tcg-aarch64: Pass qemu_ld/st arguments directly Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 19/25] tcg-aarch64: Implement TCG_TARGET_HAS_new_ldst Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 20/25] tcg-aarch64: Support stores of zero Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 21/25] tcg-aarch64: Introduce tcg_out_insn_3507 Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 22/25] tcg-aarch64: Merge aarch64_ldst_get_data/type into tcg_out_op Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 23/25] tcg-aarch64: Introduce tcg_out_insn_3312, _3310, _3313 Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 24/25] tcg-aarch64: Prefer unsigned offsets before signed offsets for ldst Richard Henderson
2014-04-11 15:40 ` [Qemu-devel] [PATCH v4 25/25] tcg-aarch64: Use tcg_out_mov in preference to tcg_out_movr Richard Henderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1397230827-24222-5-git-send-email-rth@twiddle.net \
    --to=rth@twiddle.net \
    --cc=claudio.fontana@huawei.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.