[Qemu-devel] [PATCH 00/10] A64 SIMD patchset one: ld/st, C3.6.1..C3.6.7

* [Qemu-devel] [PATCH 00/10] A64 SIMD patchset one: ld/st, C3.6.1..C3.6.7
@ 2014-01-10 17:12 Peter Maydell
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 01/10] target-arm: A64: Add SIMD ld/st multiple Peter Maydell
                   ` (9 more replies)
  0 siblings, 10 replies; 30+ messages in thread
From: Peter Maydell @ 2014-01-10 17:12 UTC
  To: qemu-devel
  Cc: patches, Michael Matz, Alexander Graf, Claudio Fontana,
	Dirk Mueller, Will Newton, Laurent Desnogues, Alex Bennée,
	kvmarm, Christoffer Dall, Richard Henderson

This is an initial set of patches which make a start on SIMD (Neon)
emulation in the A64 decoder. The patches implement all the SIMD
load/store operations, provide a decoder skeleton for the SIMD
dp instructions, and implement all the instructions in the ARM ARM's
groupings C3.6.1 through C3.6.7.

(It's more fluke than anything else that I ended up with all the
first seven groupings in this set; they happened to all be easy small
groupings. For some of the larger SIMD instruction groups I expect
that we will end up implementing only some of the instructions
in a group, in order to get more quickly to the useful milestone
of "implement all the instructions gcc happens to emit today".)

thanks
-- PMM

Alex Bennée (4):
  target-arm: A64: Add SIMD ld/st multiple
  target-arm: A64: Add decode skeleton for SIMD data processing insns
  target-arm: A64: Add SIMD copy operations
  target-arm: A64: Add SIMD modified immediate group

Michael Matz (3):
  target-arm: A64: Add SIMD TBL/TBLX
  target-arm: A64: Add SIMD ZIP/UZP/TRN
  target-arm: A64: Add SIMD across-lanes instructions

Peter Maydell (3):
  target-arm: A64: Add SIMD ld/st single
  target-arm: A64: Add SIMD EXT
  target-arm: A64: Add SIMD scalar copy instructions

 target-arm/helper-a64.c    |   31 +
 target-arm/helper-a64.h    |    1 +
 target-arm/translate-a64.c | 1440 +++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 1463 insertions(+), 9 deletions(-)

-- 
1.8.5

* [Qemu-devel] [PATCH 01/10] target-arm: A64: Add SIMD ld/st multiple
  2014-01-10 17:12 [Qemu-devel] [PATCH 00/10] A64 SIMD patchset one: ld/st, C3.6.1..C3.6.7 Peter Maydell
@ 2014-01-10 17:12 ` Peter Maydell
  2014-01-10 18:05   ` Richard Henderson
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 02/10] target-arm: A64: Add SIMD ld/st single Peter Maydell
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Peter Maydell @ 2014-01-10 17:12 UTC
  To: qemu-devel
  Cc: patches, Michael Matz, Alexander Graf, Claudio Fontana,
	Dirk Mueller, Will Newton, Laurent Desnogues, Alex Bennée,
	kvmarm, Christoffer Dall, Richard Henderson

From: Alex Bennée <alex.bennee@linaro.org>

This adds support for the SIMD load/store multiple
category of instructions.
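
As a rough illustration of what the decode in the diff below works out, here is a standalone sketch of the opcode to (rpt, selem) mapping and of the number of bytes the nested loops transfer. The type and function names (LdstMultiShape, ldst_multi_shape, ldst_multi_total_bytes) are made up for this example and are not part of QEMU; the reserved-encoding check on size/Q is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the decoded fields; not a QEMU type. */
typedef struct {
    int rpt;    /* number of repeats of the outer loop */
    int selem;  /* structure elements (registers per structure) */
} LdstMultiShape;

/* Map the 4-bit opcode field to (rpt, selem), mirroring the switch in
 * disas_ldst_multiple_struct(). Returns false for unallocated opcodes.
 */
static bool ldst_multi_shape(unsigned opcode, LdstMultiShape *out)
{
    assert(out != NULL);
    switch (opcode) {
    case 0x0: out->rpt = 1; out->selem = 4; return true;
    case 0x2: out->rpt = 4; out->selem = 1; return true;
    case 0x4: out->rpt = 1; out->selem = 3; return true;
    case 0x6: out->rpt = 3; out->selem = 1; return true;
    case 0x7: out->rpt = 1; out->selem = 1; return true;
    case 0x8: out->rpt = 1; out->selem = 2; return true;
    case 0xa: out->rpt = 2; out->selem = 1; return true;
    default:  return false;
    }
}

/* Total bytes moved: each innermost iteration transfers one element of
 * (1 << size) bytes and advances the address by the same amount.
 * Returns -1 for an unallocated opcode.
 */
static int ldst_multi_total_bytes(unsigned opcode, unsigned size, bool is_q)
{
    LdstMultiShape sh;
    int ebytes = 1 << size;
    int elements = (is_q ? 128 : 64) / (8 << size);

    if (!ldst_multi_shape(opcode, &sh)) {
        return -1;
    }
    return sh.rpt * elements * sh.selem * ebytes;
}
```

For example, opcode 0x0 with size 2 and Q set describes four Q registers of interleaved 32-bit elements, so the sketch reports 64 bytes.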

This also brings in a couple of helper functions for manipulating
sections of the SIMD registers:

  * read_vec_element - fetch value from a slice of a vector register
  * write_vec_element - set a slice of a vector register

which use vec_reg_offset for consistent processing of offsets in an
endian aware manner. There are also additional helpers:

  * do_vec_ld - load value into SIMD
  * do_vec_st - store value from SIMD

which load or store a slice of a vector register to memory.
Unlike the FP variants, these do not zero-extend the rest of the register.
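
To make the endian handling concrete, the following standalone sketch reproduces the offset arithmetic of vec_reg_offset() for a single 16-byte register. It is an illustration only: vec_elem_offset is a hypothetical name, the register base offset is omitted, and a runtime flag stands in for the HOST_WORDS_BIGENDIAN #ifdef so that both cases can be tried on any host.

```c
#include <assert.h>
#include <stdbool.h>

/* Byte offset of an element inside one 16-byte vector register.
 * 'size' is log2 of the element width in bytes (0..3). The register is
 * modelled as two 64-bit host words with the low half stored first,
 * which is the layout the comment on vec_reg_offset() describes.
 */
static int vec_elem_offset(int element, int size, bool host_big_endian)
{
    int ebytes = 1 << size;
    int offs;

    assert(size >= 0 && size <= 3);
    assert(element >= 0 && element < 16 / ebytes);

    if (host_big_endian) {
        /* Offset as if all 128 bits were big-endian ... */
        offs = 16 - (element + 1) * ebytes;
        /* ... then swap the two 64-bit halves. */
        offs ^= 8;
    } else {
        offs = element * ebytes;
    }
    return offs;
}
```

On a big-endian host the least significant byte (element 0, size 0) therefore sits at offset 7, the last byte of the low 64-bit word, and byte element 8 sits at offset 15.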

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---

v2 -> v3:
   - use extract32/sextract32 instead of get_bits and get_sbits

v3 -> v4 (ajb):
   - move into new decoder structure
   - use new API for loading temp addr
   - push various variables to local blocks
   - fix semantics of clearing V reg on load
   - tested with risu

v4 -> v5 (ajb):
   - catch more unallocated values
   - add missing returns
   - use do_fp_ld for offset==0 instead of explicit clear_reg

v5 -> v6 (ajb):
   - merge all the various vector helpers into one commit
---
 target-arm/translate-a64.c | 247 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 245 insertions(+), 2 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index cf80c46..4482e73 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -308,6 +308,28 @@ static TCGv_i64 read_cpu_reg_sp(DisasContext *s, int reg, int sf)
     return v;
 }
 
+/* Return the offset into CPUARMState of an element of specified
+ * size, 'element' places in from the least significant end of
+ * the FP/vector register Qn.
+ */
+static inline int vec_reg_offset(int regno, int element, TCGMemOp size)
+{
+    int offs = offsetof(CPUARMState, vfp.regs[regno * 2]);
+#ifdef HOST_WORDS_BIGENDIAN
+    /* This is complicated slightly because vfp.regs[2n] is
+     * still the low half and  vfp.regs[2n+1] the high half
+     * of the 128 bit vector, even on big endian systems.
+     * Calculate the offset assuming a fully bigendian 128 bits,
+     * then XOR to account for the order of the two 64 bit halves.
+     */
+    offs += (16 - ((element + 1) * (1 << size)));
+    offs ^= 8;
+#else
+    offs += element * (1 << size);
+#endif
+    return offs;
+}
+
 /* Return the offset into CPUARMState of a slice (from
  * the least significant end) of FP register Qn (ie
  * Dn, Sn, Hn or Bn).
@@ -661,6 +683,108 @@ static void do_fp_ld(DisasContext *s, int destidx, TCGv_i64 tcg_addr, int size)
 }
 
 /*
+ * Vector load/store helpers.
+ *
+ * The principal difference between this and a FP load is that we don't
+ * zero extend as we are filling a partial chunk of the vector register.
+ * These functions don't support 128 bit loads/stores, which would be
+ * normal load/store operations.
+ */
+
+/* Get value of an element within a vector register */
+static void read_vec_element(DisasContext *s, TCGv_i64 tcg_dest, int srcidx,
+                             int element, TCGMemOp memop)
+{
+    int vect_off = vec_reg_offset(srcidx, element, memop & MO_SIZE);
+    switch (memop) {
+    case MO_8:
+        tcg_gen_ld8u_i64(tcg_dest, cpu_env, vect_off);
+        break;
+    case MO_16:
+        tcg_gen_ld16u_i64(tcg_dest, cpu_env, vect_off);
+        break;
+    case MO_32:
+        tcg_gen_ld32u_i64(tcg_dest, cpu_env, vect_off);
+        break;
+    case MO_8|MO_SIGN:
+        tcg_gen_ld8s_i64(tcg_dest, cpu_env, vect_off);
+        break;
+    case MO_16|MO_SIGN:
+        tcg_gen_ld16s_i64(tcg_dest, cpu_env, vect_off);
+        break;
+    case MO_32|MO_SIGN:
+        tcg_gen_ld32s_i64(tcg_dest, cpu_env, vect_off);
+        break;
+    case MO_64:
+    case MO_64|MO_SIGN:
+        tcg_gen_ld_i64(tcg_dest, cpu_env, vect_off);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+}
+
+/* Set value of an element within a vector register */
+static void write_vec_element(DisasContext *s, TCGv_i64 tcg_src, int destidx,
+                              int element, TCGMemOp memop)
+{
+    int vect_off = vec_reg_offset(destidx, element, memop & MO_SIZE);
+    switch (memop) {
+    case MO_8:
+        tcg_gen_st8_i64(tcg_src, cpu_env, vect_off);
+        break;
+    case MO_16:
+        tcg_gen_st16_i64(tcg_src, cpu_env, vect_off);
+        break;
+    case MO_32:
+        tcg_gen_st32_i64(tcg_src, cpu_env, vect_off);
+        break;
+    case MO_64:
+        tcg_gen_st_i64(tcg_src, cpu_env, vect_off);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+}
+
+/* Clear the high 64 bits of a 128 bit vector (in general non-quad
+ * vector ops all need to do this).
+ */
+static void clear_vec_high(DisasContext *s, int rd)
+{
+    TCGv_i64 tcg_zero = tcg_const_i64(0);
+
+    write_vec_element(s, tcg_zero, rd, 1, MO_64);
+    tcg_temp_free_i64(tcg_zero);
+}
+
+/* Store from vector register to memory */
+static void do_vec_st(DisasContext *s, int srcidx, int element,
+                      TCGv_i64 tcg_addr, int size)
+{
+    TCGMemOp memop =  MO_TE + size;
+    TCGv_i64 tcg_tmp = tcg_temp_new_i64();
+
+    read_vec_element(s, tcg_tmp, srcidx, element, size);
+    tcg_gen_qemu_st_i64(tcg_tmp, tcg_addr, get_mem_index(s), memop);
+
+    tcg_temp_free_i64(tcg_tmp);
+}
+
+/* Load from memory to vector register */
+static void do_vec_ld(DisasContext *s, int destidx, int element,
+                      TCGv_i64 tcg_addr, int size)
+{
+    TCGMemOp memop =  MO_TE + size;
+    TCGv_i64 tcg_tmp = tcg_temp_new_i64();
+
+    tcg_gen_qemu_ld_i64(tcg_tmp, tcg_addr, get_mem_index(s), memop);
+    write_vec_element(s, tcg_tmp, destidx, element, size);
+
+    tcg_temp_free_i64(tcg_tmp);
+}
+
+/*
  * This utility function is for doing register extension with an
  * optional shift. You will likely want to pass a temporary for the
  * destination register. See DecodeRegExtend() in the ARM ARM.
@@ -1835,10 +1959,129 @@ static void disas_ldst_reg(DisasContext *s, uint32_t insn)
     }
 }
 
-/* AdvSIMD load/store multiple structures */
+/* C3.3.1 AdvSIMD load/store multiple structures
+ *
+ *  31  30  29           23 22  21         16 15    12 11  10 9    5 4    0
+ * +---+---+---------------+---+-------------+--------+------+------+------+
+ * | 0 | Q | 0 0 1 1 0 0 0 | L | 0 0 0 0 0 0 | opcode | size |  Rn  |  Rt  |
+ * +---+---+---------------+---+-------------+--------+------+------+------+
+ *
+ * C3.3.2 AdvSIMD load/store multiple structures (post-indexed)
+ *
+ *  31  30  29           23 22  21  20     16 15    12 11  10 9    5 4    0
+ * +---+---+---------------+---+---+---------+--------+------+------+------+
+ * | 0 | Q | 0 0 1 1 0 0 1 | L | 0 |   Rm    | opcode | size |  Rn  |  Rt  |
+ * +---+---+---------------+---+---+---------+--------+------+------+------+
+ *
+ * Rt: first (or only) SIMD&FP register to be transferred
+ * Rn: base address or SP
+ * Rm (post-index only): post-index register (when !31) or size dependent #imm
+ */
 static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int rt = extract32(insn, 0, 5);
+    int rn = extract32(insn, 5, 5);
+    int size = extract32(insn, 10, 2);
+    int opcode = extract32(insn, 12, 4);
+    bool is_store = !extract32(insn, 22, 1);
+    bool is_postidx = extract32(insn, 23, 1);
+    bool is_q = extract32(insn, 30, 1);
+    TCGv_i64 tcg_addr;
+
+    int ebytes = 1 << size;
+    int elements = (is_q ? 128 : 64) / (8 << size);
+    int rpt;    /* num iterations */
+    int selem;  /* structure elements */
+    int r;
+
+    if (extract32(insn, 31, 1) || extract32(insn, 21, 1)) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    /* From the shared decode logic */
+    switch (opcode) {
+    case 0x0:
+        rpt = 1;
+        selem = 4;
+        break;
+    case 0x2:
+        rpt = 4;
+        selem = 1;
+        break;
+    case 0x4:
+        rpt = 1;
+        selem = 3;
+        break;
+    case 0x6:
+        rpt = 3;
+        selem = 1;
+        break;
+    case 0x7:
+        rpt = 1;
+        selem = 1;
+        break;
+    case 0x8:
+        rpt = 1;
+        selem = 2;
+        break;
+    case 0xa:
+        rpt = 2;
+        selem = 1;
+        break;
+    default:
+        unallocated_encoding(s);
+        return;
+    }
+
+    if (size == 3 && !is_q && selem != 1) {
+        /* reserved */
+        unallocated_encoding(s);
+        return;
+    }
+
+    tcg_addr = read_cpu_reg_sp(s, rn, 1);
+
+    if (rn == 31) {
+        gen_check_sp_alignment(s);
+    }
+
+    for (r = 0; r < rpt; r++) {
+        int e;
+        for (e = 0; e < elements; e++) {
+            int tt = (rt + r) % 32;
+            int xs;
+            for (xs = 0; xs < selem; xs++) {
+                if (is_store) {
+                    do_vec_st(s, tt, e, tcg_addr, size);
+                } else {
+                    do_vec_ld(s, tt, e, tcg_addr, size);
+
+                    /* For non-quad operations, setting a slice of the low
+                     * 64 bits of the register clears the high 64 bits (in
+                     * the ARM ARM pseudocode this is implicit in the fact
+                     * that 'rval' is a 64 bit wide variable). We optimize
+                     * by noticing that we only need to do this the first
+                     * time we touch a register.
+                     */
+                    if (!is_q && e == 0 && (r == 0 || xs == selem - 1)) {
+                        clear_vec_high(s, tt);
+                    }
+                }
+                tcg_gen_addi_i64(tcg_addr, tcg_addr, ebytes);
+                tt = (tt + 1) % 32;
+            }
+        }
+    }
+
+    if (is_postidx) {
+        int rm = extract32(insn, 16, 5);
+        if (rm == 31) {
+            tcg_gen_mov_i64(cpu_reg_sp(s, rn), tcg_addr);
+        } else {
+            tcg_gen_add_i64(cpu_reg_sp(s, rn), cpu_reg_sp(s, rn), cpu_reg(s, rm));
+        }
+    }
 }
 
 /* AdvSIMD load/store single structure */
-- 
1.8.5

* [Qemu-devel] [PATCH 02/10] target-arm: A64: Add SIMD ld/st single
  2014-01-10 17:12 [Qemu-devel] [PATCH 00/10] A64 SIMD patchset one: ld/st, C3.6.1..C3.6.7 Peter Maydell
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 01/10] target-arm: A64: Add SIMD ld/st multiple Peter Maydell
@ 2014-01-10 17:12 ` Peter Maydell
  2014-01-10 18:12   ` Richard Henderson
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 03/10] target-arm: A64: Add decode skeleton for SIMD data processing insns Peter Maydell
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Peter Maydell @ 2014-01-10 17:12 UTC
  To: qemu-devel
  Cc: patches, Michael Matz, Alexander Graf, Claudio Fontana,
	Dirk Mueller, Will Newton, Laurent Desnogues, Alex Bennée,
	kvmarm, Christoffer Dall, Richard Henderson

Implement the SIMD ld/st single structure instructions.
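
The load-and-replicate path in the diff below fills a 64-bit lane by multiplying the zero-extended element by a constant. A minimal standalone sketch of that trick follows; replicate_elem64 is a hypothetical helper name, and the explicit masking stands in for the zero-extending load that the translator generates.

```c
#include <assert.h>
#include <stdint.h>

/* Replicate the low (8 << scale) bits of 'value' across 64 bits.
 * scale is log2 of the element size in bytes (0..3).
 */
static uint64_t replicate_elem64(uint64_t value, int scale)
{
    assert(scale >= 0 && scale <= 3);
    switch (scale) {
    case 0:
        return (value & 0xffULL) * 0x0101010101010101ULL;
    case 1:
        return (value & 0xffffULL) * 0x0001000100010001ULL;
    case 2:
        return (value & 0xffffffffULL) * 0x0000000100000001ULL;
    default:
        /* A 64-bit element already fills the lane. */
        return value;
    }
}
```

The multiplication works because each set bit of the constant adds one copy of the element shifted to a lane boundary, and the copies never overlap, so no carries occur.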

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/translate-a64.c | 141 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 139 insertions(+), 2 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 4482e73..ee56588 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -2084,10 +2084,147 @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
     }
 }
 
-/* AdvSIMD load/store single structure */
+/* C3.3.3 AdvSIMD load/store single structure
+ *
+ *  31  30  29           23 22 21 20       16 15 13 12  11  10 9    5 4    0
+ * +---+---+---------------+-----+-----------+-----+---+------+------+------+
+ * | 0 | Q | 0 0 1 1 0 1 0 | L R | 0 0 0 0 0 | opc | S | size |  Rn  |  Rt  |
+ * +---+---+---------------+-----+-----------+-----+---+------+------+------+
+ *
+ * C3.3.4 AdvSIMD load/store single structure (post-indexed)
+ *
+ *  31  30  29           23 22 21 20       16 15 13 12  11  10 9    5 4    0
+ * +---+---+---------------+-----+-----------+-----+---+------+------+------+
+ * | 0 | Q | 0 0 1 1 0 1 1 | L R |     Rm    | opc | S | size |  Rn  |  Rt  |
+ * +---+---+---------------+-----+-----------+-----+---+------+------+------+
+ *
+ * Rt: first (or only) SIMD&FP register to be transferred
+ * Rn: base address or SP
+ * Rm (post-index only): post-index register (when !31) or size dependent #imm
+ * index = encoded in Q:S:size dependent on size
+ *
+ * lane_size = encoded in R, opc
+ * transfer width = encoded in opc, S, size
+ */
 static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int rt = extract32(insn, 0, 5);
+    int rn = extract32(insn, 5, 5);
+    int size = extract32(insn, 10, 2);
+    int S = extract32(insn, 12, 1);
+    int opc = extract32(insn, 13, 3);
+    int R = extract32(insn, 21, 1);
+    int is_load = extract32(insn, 22, 1);
+    int is_postidx = extract32(insn, 23, 1);
+    int is_q = extract32(insn, 30, 1);
+
+    int scale = extract32(opc, 1, 2);
+    int selem = (extract32(opc, 0, 1) << 1 | R) + 1;
+    bool replicate = false;
+    int index = is_q << 3 | S << 2 | size;
+    int ebytes, xs;
+    TCGv_i64 tcg_addr;
+
+    switch (scale) {
+    case 3:
+        if (!is_load || S) {
+            unallocated_encoding(s);
+            return;
+        }
+        scale = size;
+        replicate = true;
+        break;
+    case 0:
+        break;
+    case 1:
+        if (extract32(size, 0, 1)) {
+            unallocated_encoding(s);
+            return;
+        }
+        index >>= 1;
+        break;
+    case 2:
+        if (extract32(size, 1, 1)) {
+            unallocated_encoding(s);
+            return;
+        }
+        if (!extract32(size, 0, 1)) {
+            index >>= 2;
+        } else {
+            if (S) {
+                unallocated_encoding(s);
+                return;
+            }
+            index >>= 3;
+            scale = 3;
+        }
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    ebytes = 1 << scale;
+
+    tcg_addr = read_cpu_reg_sp(s, rn, 1);
+
+    if (rn == 31) {
+        gen_check_sp_alignment(s);
+    }
+
+    for (xs = 0; xs < selem; xs++) {
+        if (replicate) {
+            /* Load and replicate to all elements */
+            uint64_t mulconst;
+            TCGv_i64 tcg_tmp = tcg_temp_new_i64();
+
+            tcg_gen_qemu_ld_i64(tcg_tmp, tcg_addr,
+                                get_mem_index(s), MO_TE + scale);
+            switch (scale) {
+            case 0:
+                mulconst = 0x0101010101010101ULL;
+                break;
+            case 1:
+                mulconst = 0x0001000100010001ULL;
+                break;
+            case 2:
+                mulconst = 0x0000000100000001ULL;
+                break;
+            case 3:
+                mulconst = 0;
+                break;
+            default:
+                g_assert_not_reached();
+            }
+            if (mulconst) {
+                tcg_gen_muli_i64(tcg_tmp, tcg_tmp, mulconst);
+            }
+            write_vec_element(s, tcg_tmp, rt, 0, MO_64);
+            if (is_q) {
+                write_vec_element(s, tcg_tmp, rt, 1, MO_64);
+            } else {
+                clear_vec_high(s, rt);
+            }
+            tcg_temp_free_i64(tcg_tmp);
+        } else {
+            /* Load/store one element per register */
+            if (is_load) {
+                do_vec_ld(s, rt, index, tcg_addr, MO_TE + scale);
+            } else {
+                do_vec_st(s, rt, index, tcg_addr, MO_TE + scale);
+            }
+        }
+        tcg_gen_addi_i64(tcg_addr, tcg_addr, ebytes);
+        rt = (rt + 1) % 32;
+    }
+
+    if (is_postidx) {
+        int rm = extract32(insn, 16, 5);
+        if (rm == 31) {
+            tcg_gen_mov_i64(cpu_reg_sp(s, rn), tcg_addr);
+        } else {
+            tcg_gen_add_i64(cpu_reg_sp(s, rn), cpu_reg_sp(s, rn), cpu_reg(s, rm));
+        }
+    }
 }
 
 /* C3.3 Loads and stores */
-- 
1.8.5

* [Qemu-devel] [PATCH 03/10] target-arm: A64: Add decode skeleton for SIMD data processing insns
  2014-01-10 17:12 [Qemu-devel] [PATCH 00/10] A64 SIMD patchset one: ld/st, C3.6.1..C3.6.7 Peter Maydell
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 01/10] target-arm: A64: Add SIMD ld/st multiple Peter Maydell
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 02/10] target-arm: A64: Add SIMD ld/st single Peter Maydell
@ 2014-01-10 17:12 ` Peter Maydell
  2014-01-10 18:55   ` Richard Henderson
  2014-01-10 19:05   ` Richard Henderson
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 04/10] target-arm: A64: Add SIMD EXT Peter Maydell
                   ` (6 subsequent siblings)
  9 siblings, 2 replies; 30+ messages in thread
From: Peter Maydell @ 2014-01-10 17:12 UTC
  To: qemu-devel
  Cc: patches, Michael Matz, Alexander Graf, Claudio Fontana,
	Dirk Mueller, Will Newton, Laurent Desnogues, Alex Bennée,
	kvmarm, Christoffer Dall, Richard Henderson

From: Alex Bennée <alex.bennee@linaro.org>

Add decode skeleton and function placeholders for all the SIMD data
processing instructions. Due to the complexity of this part of the
table the normal extract and switch approach gets very messy very
quickly, so we use a simple data-driven pattern-and-mask approach.
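
A cut-down sketch of the pattern-and-mask idea, for illustration only: the table holds three of the (pattern, mask) pairs from the diff below, and a string stands in for the handler function pointer. DemoDecodeEntry, demo_table and demo_lookup are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t pattern;
    uint32_t mask;
    const char *name;   /* stands in for the handler pointer */
} DemoDecodeEntry;

/* First match wins, so the more specific mod_imm entry must come
 * before shift_imm, whose mask accepts a superset of encodings.
 * The table is terminated by an entry with a zero mask.
 */
static const DemoDecodeEntry demo_table[] = {
    { 0x0f000400, 0x9ff80400, "mod_imm" },
    { 0x0f000400, 0x9f800400, "shift_imm" },
    { 0x2e000000, 0xbf208400, "ext" },
    { 0x00000000, 0x00000000, NULL }
};

/* Linear search: return the first entry's name where
 * (insn & mask) == pattern, or NULL if nothing matches.
 */
static const char *demo_lookup(const DemoDecodeEntry *table, uint32_t insn)
{
    assert(table != NULL);
    for (; table->mask; table++) {
        if ((insn & table->mask) == table->pattern) {
            return table->name;
        }
    }
    return NULL;
}
```

With this table, 0x0f000400 (immh all zero) resolves to the mod_imm entry, while 0x0f080400 (immh non-zero) falls through to shift_imm.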

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/translate-a64.c | 306 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 305 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index ee56588..fe5ad52 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -61,6 +61,17 @@ enum a64_shift_type {
     A64_SHIFT_TYPE_ROR = 3
 };
 
+/* Table based decoder typedefs - used when the relevant bits for decode
+ * are too awkwardly scattered across the instruction (eg SIMD).
+ */
+typedef void AArch64DecodeFn(DisasContext *s, uint32_t insn);
+
+typedef struct AArch64DecodeTable {
+    uint32_t pattern;
+    uint32_t mask;
+    AArch64DecodeFn *disas_fn;
+} AArch64DecodeTable;
+
 /* initialize TCG globals.  */
 void a64_translate_init(void)
 {
@@ -846,6 +857,31 @@ static inline void gen_check_sp_alignment(DisasContext *s)
 }
 
 /*
+ * This provides a simple table-based lookup decoder. It is
+ * intended to be used when the relevant bits for decode are too
+ * awkwardly placed and switch/if based logic would be confusing and
+ * deeply nested. Since it's a linear search through the table, tables
+ * should be kept small.
+ *
+ * It returns the first handler where insn & mask == pattern, or
+ * NULL if there is no match.
+ * The table is terminated by an empty mask (i.e. 0)
+ */
+static inline AArch64DecodeFn *lookup_disas_fn(AArch64DecodeTable *table,
+                                               uint32_t insn)
+{
+    AArch64DecodeTable *tptr = table;
+
+    while (tptr->mask) {
+        if ((insn & tptr->mask) == tptr->pattern) {
+            return tptr->disas_fn;
+        }
+        tptr++;
+    }
+    return NULL;
+}
+
+/*
  * the instruction disassembly implemented here matches
  * the instruction encoding classifications in chapter 3 (C3)
  * of the ARM Architecture Reference Manual (DDI0487A_a)
@@ -4604,13 +4640,281 @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
     }
 }
 
+/* C3.6.1 EXT
+ *   31  30 29         24 23 22  21 20  16 15  14  11 10  9    5 4    0
+ * +---+---+-------------+-----+---+------+---+------+---+------+------+
+ * | 0 | Q | 1 0 1 1 1 0 | op2 | 0 |  Rm  | 0 | imm4 | 0 |  Rn  |  Rd  |
+ * +---+---+-------------+-----+---+------+---+------+---+------+------+
+ */
+static void disas_simd_ext(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.2 TBL/TBX
+ *   31  30 29         24 23 22  21 20  16 15  14 13  12  11 10 9    5 4    0
+ * +---+---+-------------+-----+---+------+---+-----+----+-----+------+------+
+ * | 0 | Q | 0 0 1 1 1 0 | op2 | 0 |  Rm  | 0 | len | op | 0 0 |  Rn  |  Rd  |
+ * +---+---+-------------+-----+---+------+---+-----+----+-----+------+------+
+ */
+static void disas_simd_tb(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.3 ZIP/UZP/TRN
+ *   31  30 29         24 23  22  21 20   16 15 14 12 11 10 9    5 4    0
+ * +---+---+-------------+------+---+------+---+-----+-----+------+------+
+ * | 0 | Q | 0 0 1 1 1 0 | size | 0 |  Rm  | 0 | opc | 1 0 |  Rn  |  Rd  |
+ * +---+---+-------------+------+---+------+---+-----+-----+------+------+
+ */
+static void disas_simd_zip_trn(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.4 AdvSIMD across lanes
+ *   31  30  29 28       24 23  22 21       17 16    12 11 10 9    5 4    0
+ * +---+---+---+-----------+------+-----------+--------+-----+------+------+
+ * | 0 | Q | U | 0 1 1 1 0 | size | 1 1 0 0 0 | opcode | 1 0 |  Rn  |  Rd  |
+ * +---+---+---+-----------+------+-----------+--------+-----+------+------+
+ */
+static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.5 AdvSIMD copy
+ *   31  30  29  28             21 20  16 15  14  11 10  9    5 4    0
+ * +---+---+----+-----------------+------+---+------+---+------+------+
+ * | 0 | Q | op | 0 1 1 1 0 0 0 0 | imm5 | 0 | imm4 | 1 |  Rn  |  Rd  |
+ * +---+---+----+-----------------+------+---+------+---+------+------+
+ */
+static void disas_simd_copy(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.6 AdvSIMD modified immediate
+ *  31  30   29  28                 19 18 16 15   12  11  10  9     5 4    0
+ * +---+---+----+---------------------+-----+-------+----+---+-------+------+
+ * | 0 | Q | op | 0 1 1 1 1 0 0 0 0 0 | abc | cmode | o2 | 1 | defgh |  Rd  |
+ * +---+---+----+---------------------+-----+-------+----+---+-------+------+
+ */
+static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.7 AdvSIMD scalar copy
+ *  31 30  29  28             21 20  16 15  14  11 10  9    5 4    0
+ * +-----+----+-----------------+------+---+------+---+------+------+
+ * | 0 1 | op | 1 1 1 1 0 0 0 0 | imm5 | 0 | imm4 | 1 |  Rn  |  Rd  |
+ * +-----+----+-----------------+------+---+------+---+------+------+
+ */
+static void disas_simd_scalar_copy(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.8 AdvSIMD scalar pairwise
+ *  31 30  29 28       24 23  22 21       17 16    12 11 10 9    5 4    0
+ * +-----+---+-----------+------+-----------+--------+-----+------+------+
+ * | 0 1 | U | 1 1 1 1 0 | size | 1 1 0 0 0 | opcode | 1 0 |  Rn  |  Rd  |
+ * +-----+---+-----------+------+-----------+--------+-----+------+------+
+ */
+static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.9 AdvSIMD scalar shift by immediate
+ *  31 30  29 28         23 22  19 18  16 15    11  10 9    5 4    0
+ * +-----+---+-------------+------+------+--------+---+------+------+
+ * | 0 1 | U | 1 1 1 1 1 0 | immh | immb | opcode | 1 |  Rn  |  Rd  |
+ * +-----+---+-------------+------+------+--------+---+------+------+
+ */
+static void disas_simd_scalar_shift_imm(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.10 AdvSIMD scalar three different
+ *  31 30  29 28       24 23  22  21 20  16 15    12 11 10 9    5 4    0
+ * +-----+---+-----------+------+---+------+--------+-----+------+------+
+ * | 0 1 | U | 1 1 1 1 0 | size | 1 |  Rm  | opcode | 0 0 |  Rn  |  Rd  |
+ * +-----+---+-----------+------+---+------+--------+-----+------+------+
+ */
+static void disas_simd_scalar_three_reg_diff(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.11 AdvSIMD scalar three same
+ *  31 30  29 28       24 23  22  21 20  16 15    11  10 9    5 4    0
+ * +-----+---+-----------+------+---+------+--------+---+------+------+
+ * | 0 1 | U | 1 1 1 1 0 | size | 1 |  Rm  | opcode | 1 |  Rn  |  Rd  |
+ * +-----+---+-----------+------+---+------+--------+---+------+------+
+ */
+static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.12 AdvSIMD scalar two reg misc
+ *  31 30  29 28       24 23  22 21       17 16    12 11 10 9    5 4    0
+ * +-----+---+-----------+------+-----------+--------+-----+------+------+
+ * | 0 1 | U | 1 1 1 1 0 | size | 1 0 0 0 0 | opcode | 1 0 |  Rn  |  Rd  |
+ * +-----+---+-----------+------+-----------+--------+-----+------+------+
+ */
+static void disas_simd_scalar_two_reg_misc(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.13 AdvSIMD scalar x indexed element
+ *  31 30  29 28       24 23  22 21  20  19  16 15 12  11  10 9    5 4    0
+ * +-----+---+-----------+------+---+---+------+-----+---+---+------+------+
+ * | 0 1 | U | 1 1 1 1 1 | size | L | M |  Rm  | opc | H | 0 |  Rn  |  Rd  |
+ * +-----+---+-----------+------+---+---+------+-----+---+---+------+------+
+ */
+static void disas_simd_scalar_indexed(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.14 AdvSIMD shift by immediate
+ *  31  30   29 28         23 22  19 18  16 15    11  10 9    5 4    0
+ * +---+---+---+-------------+------+------+--------+---+------+------+
+ * | 0 | Q | U | 0 1 1 1 1 0 | immh | immb | opcode | 1 |  Rn  |  Rd  |
+ * +---+---+---+-------------+------+------+--------+---+------+------+
+ */
+static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.15 AdvSIMD three different
+ *   31  30  29 28       24 23  22  21 20  16 15    12 11 10 9    5 4    0
+ * +---+---+---+-----------+------+---+------+--------+-----+------+------+
+ * | 0 | Q | U | 0 1 1 1 0 | size | 1 |  Rm  | opcode | 0 0 |  Rn  |  Rd  |
+ * +---+---+---+-----------+------+---+------+--------+-----+------+------+
+ */
+static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.16 AdvSIMD three same
+ *   31  30  29 28       24 23  22  21 20  16 15    11  10 9    5 4    0
+ * +---+---+---+-----------+------+---+------+--------+---+------+------+
+ * | 0 | Q | U | 0 1 1 1 0 | size | 1 |  Rm  | opcode | 1 |  Rn  |  Rd  |
+ * +---+---+---+-----------+------+---+------+--------+---+------+------+
+ */
+static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.17 AdvSIMD two reg misc
+ *   31  30  29 28       24 23  22 21       17 16    12 11 10 9    5 4    0
+ * +---+---+---+-----------+------+-----------+--------+-----+------+------+
+ * | 0 | Q | U | 0 1 1 1 0 | size | 1 0 0 0 0 | opcode | 1 0 |  Rn  |  Rd  |
+ * +---+---+---+-----------+------+-----------+--------+-----+------+------+
+ */
+static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.18 AdvSIMD vector x indexed element
+ *   31  30  29 28       24 23  22 21  20  19  16 15 12  11  10 9    5 4    0
+ * +---+---+---+-----------+------+---+---+------+-----+---+---+------+------+
+ * | 0 | Q | U | 0 1 1 1 1 | size | L | M |  Rm  | opc | H | 0 |  Rn  |  Rd  |
+ * +---+---+---+-----------+------+---+---+------+-----+---+---+------+------+
+ */
+static void disas_simd_indexed_vector(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.19 Crypto AES
+ *  31             24 23  22 21       17 16    12 11 10 9    5 4    0
+ * +-----------------+------+-----------+--------+-----+------+------+
+ * | 0 1 0 0 1 1 1 0 | size | 1 0 1 0 0 | opcode | 1 0 |  Rn  |  Rd  |
+ * +-----------------+------+-----------+--------+-----+------+------+
+ */
+static void disas_crypto_aes(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.20 Crypto three-reg SHA
+ *  31             24 23  22  21 20  16  15 14    12 11 10 9    5 4    0
+ * +-----------------+------+---+------+---+--------+-----+------+------+
+ * | 0 1 0 1 1 1 1 0 | size | 0 |  Rm  | 0 | opcode | 0 0 |  Rn  |  Rd  |
+ * +-----------------+------+---+------+---+--------+-----+------+------+
+ */
+static void disas_crypto_three_reg_sha(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.21 Crypto two-reg SHA
+ *  31             24 23  22 21       17 16    12 11 10 9    5 4    0
+ * +-----------------+------+-----------+--------+-----+------+------+
+ * | 0 1 0 1 1 1 1 0 | size | 1 0 1 0 0 | opcode | 1 0 |  Rn  |  Rd  |
+ * +-----------------+------+-----------+--------+-----+------+------+
+ */
+static void disas_crypto_two_reg_sha(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6 Data processing - SIMD, inc Crypto
+ *
+ * As the decode gets a little complex we are using a table based
+ * approach for this part of the decode.
+ */
+static AArch64DecodeTable data_proc_simd[] = {
+    /* pattern  ,  mask     ,  fn                        */
+    { 0x0e200400, 0x9f200400, disas_simd_three_reg_same },
+    { 0x0e200000, 0x9f200c00, disas_simd_three_reg_diff },
+    { 0x0e200800, 0x9f3e0c00, disas_simd_two_reg_misc },
+    { 0x0e300800, 0x9f3e0c00, disas_simd_across_lanes },
+    { 0x0e000400, 0x9fe08400, disas_simd_copy },
+    { 0x0f000000, 0x9f000400, disas_simd_indexed_vector },
+    /* simd_mod_imm decode is a subset of simd_shift_imm, so must precede it */
+    { 0x0f000400, 0x9ff80400, disas_simd_mod_imm },
+    { 0x0f000400, 0x9f800400, disas_simd_shift_imm },
+    { 0x0e000000, 0xbf208c00, disas_simd_tb },
+    { 0x0e000800, 0xbf208c00, disas_simd_zip_trn },
+    { 0x2e000000, 0xbf208400, disas_simd_ext },
+    { 0x5e200400, 0xdf200400, disas_simd_scalar_three_reg_same },
+    { 0x5e200000, 0xdf200c00, disas_simd_scalar_three_reg_diff },
+    { 0x5e200800, 0xdf3e0c00, disas_simd_scalar_two_reg_misc },
+    { 0x5e300800, 0xdf3e0c00, disas_simd_scalar_pairwise },
+    { 0x5e000400, 0xdfe08400, disas_simd_scalar_copy },
+    { 0x5f000000, 0xdf000400, disas_simd_scalar_indexed },
+    { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
+    { 0x4e280800, 0xff3e0c00, disas_crypto_aes },
+    { 0x5e000000, 0xff208c00, disas_crypto_three_reg_sha },
+    { 0x5e280800, 0xff3e0c00, disas_crypto_two_reg_sha },
+    { 0x00000000, 0x00000000, NULL }
+};
+
 static void disas_data_proc_simd(DisasContext *s, uint32_t insn)
 {
     /* Note that this is called with all non-FP cases from
      * table C3-6 so it must UNDEF for entries not specifically
      * allocated to instructions in that table.
      */
-    unsupported_encoding(s, insn);
+    AArch64DecodeFn *fn = lookup_disas_fn(&data_proc_simd[0], insn);
+    if (fn) {
+        (fn) (s, insn);
+    } else {
+        unallocated_encoding(s);
+    }
 }
 
 /* C3.6 Data processing - SIMD and floating point */
-- 
1.8.5
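
The table-driven decode above relies on a first-match lookup, which is why the simd_mod_imm entry has to come before simd_shift_imm. Below is a minimal standalone sketch of that idea. It assumes lookup_disas_fn() returns the first entry whose masked bits equal the pattern and stops at an all-zero mask (its body is not shown in this patch). DecodeEntry, lookup_first_match, mod_imm and shift_imm are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for AArch64DecodeFn: the handler returns a tag. */
typedef int DecodeFn(uint32_t insn);

typedef struct {
    uint32_t pattern;
    uint32_t mask;
    DecodeFn *fn;
} DecodeEntry;

static int mod_imm(uint32_t insn)   { (void)insn; return 1; }
static int shift_imm(uint32_t insn) { (void)insn; return 2; }

/* Two pattern/mask pairs copied from data_proc_simd, in the same order. */
static const DecodeEntry demo_table[] = {
    { 0x0f000400, 0x9ff80400, mod_imm },
    { 0x0f000400, 0x9f800400, shift_imm },
    { 0x00000000, 0x00000000, NULL }
};

/* Return the handler of the first matching entry, or NULL if none match.
 * Assumption: an all-zero mask terminates the table.
 */
static DecodeFn *lookup_first_match(const DecodeEntry *table, uint32_t insn)
{
    const DecodeEntry *e;

    assert(table != NULL);
    for (e = table; e->mask != 0; e++) {
        if ((insn & e->mask) == e->pattern) {
            return e->fn;
        }
    }
    return NULL;
}
```

With bits [22:19] all zero both entries match and the earlier one wins; setting any of those bits leaves only the shift_imm entry matching. Swapping the two rows would make the mod_imm handler unreachable.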

* [Qemu-devel] [PATCH 04/10] target-arm: A64: Add SIMD EXT
  2014-01-10 17:12 [Qemu-devel] [PATCH 00/10] A64 SIMD patchset one: ld/st, C3.6.1..C3.6.7 Peter Maydell
                   ` (2 preceding siblings ...)
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 03/10] target-arm: A64: Add decode skeleton for SIMD data processing insns Peter Maydell
@ 2014-01-10 17:12 ` Peter Maydell
  2014-01-10 19:13   ` Richard Henderson
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 05/10] target-arm: A64: Add SIMD TBL/TBLX Peter Maydell
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Peter Maydell @ 2014-01-10 17:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: patches, Michael Matz, Alexander Graf, Claudio Fontana,
	Dirk Mueller, Will Newton, Laurent Desnogues, Alex Bennée,
	kvmarm, Christoffer Dall, Richard Henderson

Add support for the SIMD EXT instruction (the only one in its
group, C3.6.1).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/translate-a64.c | 62 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 61 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index fe5ad52..83ae222 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -4640,6 +4640,32 @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
     }
 }
 
+static TCGv_i64 do_ext64(DisasContext *s, int leftreg, int leftelt,
+                         int rightreg, int rightelt, int pos)
+{
+    /* Extract 64 bits from the middle of two concatenated 64 bit
+     * vector register slices left:right. The extracted bits start
+     * at 'pos' bits into the right (least significant) side.
+     * For each slice, 'reg' indicates the vector register and
+     * 'elt' indicates which of the two 64 bit elements of it to use.
+     * The extracted value is returned in a TCGv_i64 temp.
+     */
+    TCGv_i64 tcg_res = tcg_temp_new_i64();
+    assert(pos >= 0 && pos < 64);
+
+    read_vec_element(s, tcg_res, rightreg, rightelt, MO_64);
+    if (pos != 0) {
+        TCGv_i64 tcg_left = tcg_temp_new_i64();
+
+        read_vec_element(s, tcg_left, leftreg, leftelt, MO_64);
+        tcg_gen_shli_i64(tcg_left, tcg_left, 64 - pos);
+        tcg_gen_shri_i64(tcg_res, tcg_res, pos);
+        tcg_gen_or_i64(tcg_res, tcg_res, tcg_left);
+        tcg_temp_free_i64(tcg_left);
+    }
+    return tcg_res;
+}
+
 /* C3.6.1 EXT
  *   31  30 29         24 23 22  21 20  16 15  14  11 10  9    5 4    0
  * +---+---+-------------+-----+---+------+---+------+---+------+------+
@@ -4648,7 +4674,41 @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
  */
 static void disas_simd_ext(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int is_q = extract32(insn, 30, 1);
+    int op2 = extract32(insn, 22, 2);
+    int imm4 = extract32(insn, 11, 4);
+    int rm = extract32(insn, 16, 5);
+    int rn = extract32(insn, 5, 5);
+    int rd = extract32(insn, 0, 5);
+    int pos = imm4 << 3;
+    TCGv_i64 tcg_resl, tcg_resh;
+
+    if (op2 != 0 || (!is_q && extract32(imm4, 3, 1))) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    /* Vd gets bits starting at pos bits into Vm:Vn. This is
+     * either extracting 128 bits from a 128:128 concatenation, or
+     * extracting 64 bits from a 64:64 concatenation.
+     */
+    if (!is_q) {
+        tcg_resl = do_ext64(s, rm, 0, rn, 0, pos);
+        tcg_resh = tcg_const_i64(0);
+    } else {
+        if (pos < 64) {
+            tcg_resl = do_ext64(s, rn, 1, rn, 0, pos);
+            tcg_resh = do_ext64(s, rm, 0, rn, 1, pos);
+        } else {
+            tcg_resl = do_ext64(s, rm, 0, rn, 1, pos - 64);
+            tcg_resh = do_ext64(s, rm, 1, rm, 0, pos - 64);
+        }
+    }
+
+    write_vec_element(s, tcg_resl, rd, 0, MO_64);
+    tcg_temp_free_i64(tcg_resl);
+    write_vec_element(s, tcg_resh, rd, 1, MO_64);
+    tcg_temp_free_i64(tcg_resh);
 }
 
 /* C3.6.2 TBL/TBX
-- 
1.8.5
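
For readers who want to check the EXT slice selection without TCG in the way, here is a plain C model of the same arithmetic. It is a sketch only: ext64 and ext128 are hypothetical names, and each vector register is modelled as a {low, high} pair of uint64_t.

```c
#include <assert.h>
#include <stdint.h>

/* Extract 64 bits starting 'pos' bits into the concatenation left:right. */
static uint64_t ext64(uint64_t left, uint64_t right, int pos)
{
    assert(pos >= 0 && pos < 64);
    if (pos == 0) {
        /* Special-cased because shifting 'left' by 64 is undefined in C */
        return right;
    }
    return (right >> pos) | (left << (64 - pos));
}

/* 128-bit EXT: Vd gets 16 bytes starting imm4 bytes into Vm:Vn.
 * Index 0 of each array is the low 64 bits, index 1 the high 64 bits.
 */
static void ext128(uint64_t vd[2], const uint64_t vn[2],
                   const uint64_t vm[2], int imm4)
{
    int pos = imm4 << 3;
    uint64_t lo, hi;

    assert(imm4 >= 0 && imm4 < 16);
    if (pos < 64) {
        lo = ext64(vn[1], vn[0], pos);
        hi = ext64(vm[0], vn[1], pos);
    } else {
        lo = ext64(vm[0], vn[1], pos - 64);
        hi = ext64(vm[1], vm[0], pos - 64);
    }
    /* Write only after both halves are computed, so vd may alias vn or vm */
    vd[0] = lo;
    vd[1] = hi;
}
```

The pos == 0 case mirrors the "if (pos != 0)" guard in do_ext64(), which avoids generating a shift by 64.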

* [Qemu-devel] [PATCH 05/10] target-arm: A64: Add SIMD TBL/TBLX
  2014-01-10 17:12 [Qemu-devel] [PATCH 00/10] A64 SIMD patchset one: ld/st, C3.6.1..C3.6.7 Peter Maydell
                   ` (3 preceding siblings ...)
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 04/10] target-arm: A64: Add SIMD EXT Peter Maydell
@ 2014-01-10 17:12 ` Peter Maydell
  2014-01-10 19:19   ` Richard Henderson
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 06/10] target-arm: A64: Add SIMD ZIP/UZP/TRN Peter Maydell
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Peter Maydell @ 2014-01-10 17:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: patches, Michael Matz, Alexander Graf, Claudio Fontana,
	Dirk Mueller, Will Newton, Laurent Desnogues, Alex Bennée,
	kvmarm, Christoffer Dall, Richard Henderson

From: Michael Matz <matz@suse.de>

Add support for the SIMD TBL/TBX instructions (group C3.6.2).

Signed-off-by: Michael Matz <matz@suse.de>
[PMM: rewritten to do more of the decode in translate-a64.c,
 and to do only one 64 bit pass at a time in the helper]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/helper-a64.c    | 31 ++++++++++++++++++++++++++
 target-arm/helper-a64.h    |  1 +
 target-arm/translate-a64.c | 54 +++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 85 insertions(+), 1 deletion(-)

diff --git a/target-arm/helper-a64.c b/target-arm/helper-a64.c
index 4ce0d01..810e7c5 100644
--- a/target-arm/helper-a64.c
+++ b/target-arm/helper-a64.c
@@ -122,3 +122,34 @@ uint64_t HELPER(vfp_cmped_a64)(float64 x, float64 y, void *fp_status)
 {
     return float_rel_to_flags(float64_compare(x, y, fp_status));
 }
+
+uint64_t HELPER(simd_tbl)(CPUARMState *env, uint64_t result, uint64_t indices,
+                          uint64_t rn, uint64_t numregs)
+{
+    /* Helper function for SIMD TBL and TBX. We have to do the table
+     * lookup part for the 64 bits worth of indices we're passed in.
+     * result is the initial results vector (either zeroes for TBL
+     * or some guest values for TBX), rn the register number where
+     * the table starts, and numregs the number of registers in the table.
+     * We return the results of the lookups.
+     */
+    int shift;
+
+    for (shift = 0; shift < 64; shift += 8) {
+        int index = extract64(indices, shift, 8);
+        if (index < 16 * numregs) {
+            /* Convert index (a byte offset into the virtual table
+             * which is a series of 128-bit vectors concatenated)
+             * into the correct vfp.regs[] element plus a bit offset
+             * into that element, bearing in mind that the table
+             * can wrap around from V31 to V0.
+             */
+            int elt = (rn * 2 + (index >> 3)) % 64;
+            int bitidx = (index & 7) * 8;
+            uint64_t val = extract64(env->vfp.regs[elt], bitidx, 8);
+
+            result = deposit64(result, shift, 8, val);
+        }
+    }
+    return result;
+}
diff --git a/target-arm/helper-a64.h b/target-arm/helper-a64.h
index bca19f3..0d265d5 100644
--- a/target-arm/helper-a64.h
+++ b/target-arm/helper-a64.h
@@ -26,3 +26,4 @@ DEF_HELPER_3(vfp_cmps_a64, i64, f32, f32, ptr)
 DEF_HELPER_3(vfp_cmpes_a64, i64, f32, f32, ptr)
 DEF_HELPER_3(vfp_cmpd_a64, i64, f64, f64, ptr)
 DEF_HELPER_3(vfp_cmped_a64, i64, f64, f64, ptr)
+DEF_HELPER_FLAGS_5(simd_tbl, TCG_CALL_NO_RWG_SE, i64, env, i64, i64, i64, i64)
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 83ae222..336e544 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -4719,7 +4719,59 @@ static void disas_simd_ext(DisasContext *s, uint32_t insn)
  */
 static void disas_simd_tb(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int op2 = extract32(insn, 22, 2);
+    int is_q = extract32(insn, 30, 1);
+    int rm = extract32(insn, 16, 5);
+    int rn = extract32(insn, 5, 5);
+    int rd = extract32(insn, 0, 5);
+    int is_tblx = extract32(insn, 12, 1);
+    int len = extract32(insn, 13, 2);
+    TCGv_i64 tcg_resl, tcg_resh, tcg_idx, tcg_regno, tcg_numregs;
+
+    if (op2 != 0) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    /* This does a table lookup: for every byte element in the input
+     * we index into a table formed from up to four vector registers,
+     * and then the output is the result of the lookups. Our helper
+     * function does the lookup operation for a single 64 bit part of
+     * the input.
+     */
+    tcg_resl = tcg_temp_new_i64();
+    tcg_resh = tcg_temp_new_i64();
+
+    if (is_tblx) {
+        read_vec_element(s, tcg_resl, rd, 0, MO_64);
+    } else {
+        tcg_gen_movi_i64(tcg_resl, 0);
+    }
+    if (is_tblx && is_q) {
+        read_vec_element(s, tcg_resh, rd, 1, MO_64);
+    } else {
+        tcg_gen_movi_i64(tcg_resh, 0);
+    }
+
+    tcg_idx = tcg_temp_new_i64();
+    tcg_regno = tcg_const_i64(rn);
+    tcg_numregs = tcg_const_i64(len + 1);
+    read_vec_element(s, tcg_idx, rm, 0, MO_64);
+    gen_helper_simd_tbl(tcg_resl, cpu_env, tcg_resl, tcg_idx,
+                        tcg_regno, tcg_numregs);
+    if (is_q) {
+        read_vec_element(s, tcg_idx, rm, 1, MO_64);
+        gen_helper_simd_tbl(tcg_resh, cpu_env, tcg_resh, tcg_idx,
+                            tcg_regno, tcg_numregs);
+    }
+    tcg_temp_free_i64(tcg_idx);
+    tcg_temp_free_i64(tcg_regno);
+    tcg_temp_free_i64(tcg_numregs);
+
+    write_vec_element(s, tcg_resl, rd, 0, MO_64);
+    tcg_temp_free_i64(tcg_resl);
+    write_vec_element(s, tcg_resh, rd, 1, MO_64);
+    tcg_temp_free_i64(tcg_resh);
 }
 
 /* C3.6.3 ZIP/UZP/TRN
-- 
1.8.5
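
The index arithmetic in the simd_tbl helper, including the wrap from V31 to V0, can be exercised in isolation with a small C model. This is a sketch under the assumption that the register file is laid out as 64 consecutive 64-bit halves, as the helper's "% 64" implies. tbl_lookup64 is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Look up the 8 byte indices packed in 'indices'. 'result' carries the
 * initial bytes: zeroes for TBL, the old Vd contents for TBX.
 */
static uint64_t tbl_lookup64(const uint64_t regs[64], uint64_t result,
                             uint64_t indices, int rn, int numregs)
{
    int shift;

    assert(rn >= 0 && rn < 32 && numregs >= 1 && numregs <= 4);
    for (shift = 0; shift < 64; shift += 8) {
        unsigned index = (unsigned)((indices >> shift) & 0xff);

        if (index < 16u * (unsigned)numregs) {
            /* Byte offset into the table -> 64-bit half plus bit offset */
            unsigned elt = ((unsigned)rn * 2 + (index >> 3)) % 64;
            unsigned bitidx = (index & 7) * 8;
            uint64_t val = (regs[elt] >> bitidx) & 0xff;

            result = (result & ~(0xffULL << shift)) | (val << shift);
        }
        /* Out-of-range index: the incoming result byte is left untouched */
    }
    return result;
}
```

Passing zero as the initial result gives TBL behaviour and passing the destination's previous value gives TBX, which matches how disas_simd_tb() seeds tcg_resl and tcg_resh.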

* [Qemu-devel] [PATCH 06/10] target-arm: A64: Add SIMD ZIP/UZP/TRN
  2014-01-10 17:12 [Qemu-devel] [PATCH 00/10] A64 SIMD patchset one: ld/st, C3.6.1..C3.6.7 Peter Maydell
                   ` (4 preceding siblings ...)
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 05/10] target-arm: A64: Add SIMD TBL/TBLX Peter Maydell
@ 2014-01-10 17:12 ` Peter Maydell
  2014-01-10 19:29   ` Richard Henderson
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 07/10] target-arm: A64: Add SIMD across-lanes instructions Peter Maydell
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Peter Maydell @ 2014-01-10 17:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: patches, Michael Matz, Alexander Graf, Claudio Fontana,
	Dirk Mueller, Will Newton, Laurent Desnogues, Alex Bennée,
	kvmarm, Christoffer Dall, Richard Henderson

From: Michael Matz <matz@suse.de>

Add support for the SIMD ZIP/UZP/TRN instruction group
(C3.6.3).

Signed-off-by: Michael Matz <matz@suse.de>
[PMM: use new do_vec_get/set etc functions and generally update to new
 codebase standards; refactor to pull per-element loop outside switch]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/translate-a64.c | 76 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 75 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 336e544..ec39dd3 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -4782,7 +4782,81 @@ static void disas_simd_tb(DisasContext *s, uint32_t insn)
  */
 static void disas_simd_zip_trn(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int rd = extract32(insn, 0, 5);
+    int rn = extract32(insn, 5, 5);
+    int rm = extract32(insn, 16, 5);
+    int size = extract32(insn, 22, 2);
+    /* opc field bits [1:0] indicate ZIP/UZP/TRN;
+     * bit 2 indicates 1 vs 2 variant of the insn.
+     */
+    int opcode = extract32(insn, 12, 2);
+    bool part = extract32(insn, 14, 1);
+    bool is_q = extract32(insn, 30, 1);
+    int esize = 8 << size;
+    int i, ofs;
+    int datasize = is_q ? 128 : 64;
+    int elements = datasize / esize;
+    TCGv_i64 tcg_res, tcg_resl, tcg_resh;
+
+    if (opcode == 0 || (size == 3 && !is_q)) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    tcg_resl = tcg_const_i64(0);
+    tcg_resh = tcg_const_i64(0);
+    tcg_res = tcg_temp_new_i64();
+
+    for (i = 0; i < elements; i++) {
+        switch (opcode) {
+        case 1: /* UZP1/2 */
+        {
+            int midpoint = elements / 2;
+            if (i < midpoint) {
+                read_vec_element(s, tcg_res, rn, 2 * i + part, size);
+            } else {
+                read_vec_element(s, tcg_res, rm,
+                                 2 * (i - midpoint) + part, size);
+            }
+            break;
+        }
+        case 2: /* TRN1/2 */
+            if (i & 1) {
+                read_vec_element(s, tcg_res, rm, (i & ~1) + part, size);
+            } else {
+                read_vec_element(s, tcg_res, rn, (i & ~1) + part, size);
+            }
+            break;
+        case 3: /* ZIP1/2 */
+        {
+            int base = part * elements / 2;
+            if (i & 1) {
+                read_vec_element(s, tcg_res, rm, base + (i >> 1), size);
+            } else {
+                read_vec_element(s, tcg_res, rn, base + (i >> 1), size);
+            }
+            break;
+        }
+        default:
+            g_assert_not_reached();
+        }
+
+        ofs = i * esize;
+        if (ofs < 64) {
+            tcg_gen_shli_i64(tcg_res, tcg_res, ofs);
+            tcg_gen_or_i64(tcg_resl, tcg_resl, tcg_res);
+        } else {
+            tcg_gen_shli_i64(tcg_res, tcg_res, ofs - 64);
+            tcg_gen_or_i64(tcg_resh, tcg_resh, tcg_res);
+        }
+    }
+
+    tcg_temp_free_i64(tcg_res);
+
+    write_vec_element(s, tcg_resl, rd, 0, MO_64);
+    tcg_temp_free_i64(tcg_resl);
+    write_vec_element(s, tcg_resh, rd, 1, MO_64);
+    tcg_temp_free_i64(tcg_resh);
 }
 
 /* C3.6.4 AdvSIMD across lanes
-- 
1.8.5
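
The per-element source selection for UZP, TRN and ZIP is easier to verify on plain arrays. The sketch below applies the same index formulas as the loop in disas_simd_zip_trn() to byte-sized lanes. permute_bytes and the OP_* constants are hypothetical names, and the destination is assumed not to alias either source (the patch achieves the same effect by accumulating into temporaries).

```c
#include <assert.h>
#include <stdint.h>

enum { OP_UZP = 1, OP_TRN = 2, OP_ZIP = 3 };

/* Fill vd[0..elements-1] from vn and vm. 'part' is 0 for the "1" variant
 * of the instruction and 1 for the "2" variant. vd must not alias vn or vm.
 */
static void permute_bytes(uint8_t *vd, const uint8_t *vn, const uint8_t *vm,
                          int elements, int opcode, int part)
{
    int i;

    assert(elements > 0 && elements % 2 == 0);
    assert(part == 0 || part == 1);

    for (i = 0; i < elements; i++) {
        const uint8_t *src = vn;
        int idx = 0;

        switch (opcode) {
        case OP_UZP: {
            /* Low half of the result comes from Vn, high half from Vm */
            int midpoint = elements / 2;
            src = (i < midpoint) ? vn : vm;
            idx = 2 * (i < midpoint ? i : i - midpoint) + part;
            break;
        }
        case OP_TRN:
            /* Odd result lanes come from Vm, even ones from Vn */
            src = (i & 1) ? vm : vn;
            idx = (i & ~1) + part;
            break;
        case OP_ZIP:
            /* Interleave the lower (part 0) or upper (part 1) halves */
            src = (i & 1) ? vm : vn;
            idx = part * elements / 2 + (i >> 1);
            break;
        default:
            assert(0);
        }
        vd[i] = src[idx];
    }
}
```

For example, with Vn = {0..7} and Vm = {10..17}, ZIP1 produces {0, 10, 1, 11, 2, 12, 3, 13}.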

* [Qemu-devel] [PATCH 07/10] target-arm: A64: Add SIMD across-lanes instructions
  2014-01-10 17:12 [Qemu-devel] [PATCH 00/10] A64 SIMD patchset one: ld/st, C3.6.1..C3.6.7 Peter Maydell
                   ` (5 preceding siblings ...)
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 06/10] target-arm: A64: Add SIMD ZIP/UZP/TRN Peter Maydell
@ 2014-01-10 17:12 ` Peter Maydell
  2014-01-10 19:38   ` Richard Henderson
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 08/10] target-arm: A64: Add SIMD copy operations Peter Maydell
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 30+ messages in thread
From: Peter Maydell @ 2014-01-10 17:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: patches, Michael Matz, Alexander Graf, Claudio Fontana,
	Dirk Mueller, Will Newton, Laurent Desnogues, Alex Bennée,
	kvmarm, Christoffer Dall, Richard Henderson

From: Michael Matz <matz@suse.de>

Add support for the SIMD "across lanes" instruction group (C3.6.4).

Signed-off-by: Michael Matz <matz@suse.de>
[PMM: Updated to current codebase, added fp min/max ops,
 added unallocated encoding checks]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/translate-a64.c | 177 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 176 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index ec39dd3..e9aeaa0 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -4859,6 +4859,29 @@ static void disas_simd_zip_trn(DisasContext *s, uint32_t insn)
     tcg_temp_free_i64(tcg_resh);
 }
 
+static void do_minmaxop(DisasContext *s, TCGv_i32 tcg_elt1, TCGv_i32 tcg_elt2,
+                        int opc, bool is_min, TCGv_ptr fpst)
+{
+    /* Helper function for disas_simd_across_lanes: do a single precision
+     * min/max operation on the specified two inputs,
+     * and return the result in tcg_elt1.
+     */
+    if (opc == 0xc) {
+        if (is_min) {
+            gen_helper_vfp_minnums(tcg_elt1, tcg_elt1, tcg_elt2, fpst);
+        } else {
+            gen_helper_vfp_maxnums(tcg_elt1, tcg_elt1, tcg_elt2, fpst);
+        }
+    } else {
+        assert(opc == 0xf);
+        if (is_min) {
+            gen_helper_vfp_mins(tcg_elt1, tcg_elt1, tcg_elt2, fpst);
+        } else {
+            gen_helper_vfp_maxs(tcg_elt1, tcg_elt1, tcg_elt2, fpst);
+        }
+    }
+}
+
 /* C3.6.4 AdvSIMD across lanes
  *   31  30  29 28       24 23  22 21       17 16    12 11 10 9    5 4    0
  * +---+---+---+-----------+------+-----------+--------+-----+------+------+
@@ -4867,7 +4890,159 @@ static void disas_simd_zip_trn(DisasContext *s, uint32_t insn)
  */
 static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int rd = extract32(insn, 0, 5);
+    int rn = extract32(insn, 5, 5);
+    int size = extract32(insn, 22, 2);
+    int opcode = extract32(insn, 12, 5);
+    bool is_q = extract32(insn, 30, 1);
+    bool is_u = extract32(insn, 29, 1);
+    bool is_fp = false;
+    bool is_min = false;
+    int esize;
+    int elements;
+    int i;
+    TCGv_i64 tcg_res, tcg_elt;
+
+    switch (opcode) {
+    case 0x1b: /* ADDV */
+        if (is_u) {
+            unallocated_encoding(s);
+            return;
+        }
+        /* fall through */
+    case 0x3: /* SADDLV, UADDLV */
+    case 0xa: /* SMAXV, UMAXV */
+    case 0x1a: /* SMINV, UMINV */
+        if (size == 3 || (size == 2 && !is_q)) {
+            unallocated_encoding(s);
+            return;
+        }
+        break;
+    case 0xc: /* FMAXNMV, FMINNMV */
+    case 0xf: /* FMAXV, FMINV */
+        if (!is_u || !is_q || extract32(size, 0, 1)) {
+            unallocated_encoding(s);
+            return;
+        }
+        /* Bit 1 of size field encodes min vs max, and actual size is always
+         * 32 bits: adjust the size variable so following code can rely on it
+         */
+        is_min = extract32(size, 1, 1);
+        is_fp = true;
+        size = 2;
+        break;
+    default:
+        unallocated_encoding(s);
+        return;
+    }
+
+    esize = 8 << size;
+    elements = (is_q ? 128 : 64) / esize;
+
+    tcg_res = tcg_temp_new_i64();
+    tcg_elt = tcg_temp_new_i64();
+
+    /* These instructions operate across all lanes of a vector
+     * to produce a single result. We can guarantee that a 64
+     * bit intermediate is sufficient:
+     *  + for [US]ADDLV the maximum element size is 32 bits, and
+     *    the result type is 64 bits
+     *  + for FMAX*V, FMIN*V, ADDV the intermediate type is the
+     *    same as the element size, which is 32 bits at most
+     * For the integer operations we can choose to work at 64
+     * or 32 bits and truncate at the end; for simplicity
+     * we use 64 bits always. The floating point
+     * ops do require 32 bit intermediates, though.
+     */
+    if (!is_fp) {
+        read_vec_element(s, tcg_res, rn, 0, size | (is_u ? 0 : MO_SIGN));
+
+        for (i = 1; i < elements; i++) {
+            read_vec_element(s, tcg_elt, rn, i, size | (is_u ? 0 : MO_SIGN));
+
+            switch (opcode) {
+            case 0x03: /* SADDLV / UADDLV */
+            case 0x1b: /* ADDV */
+                tcg_gen_add_i64(tcg_res, tcg_res, tcg_elt);
+                break;
+            case 0x0a: /* SMAXV / UMAXV */
+                tcg_gen_movcond_i64(is_u ? TCG_COND_GEU : TCG_COND_GE,
+                                    tcg_res,
+                                    tcg_res, tcg_elt, tcg_res, tcg_elt);
+                break;
+            case 0x1a: /* SMINV / UMINV */
+                tcg_gen_movcond_i64(is_u ? TCG_COND_LEU : TCG_COND_LE,
+                                    tcg_res,
+                                    tcg_res, tcg_elt, tcg_res, tcg_elt);
+                break;
+                break;
+            default:
+                g_assert_not_reached();
+            }
+
+        }
+    } else {
+        /* Floating point ops which work on 32 bit (single) intermediates.
+         * Note that correct NaN propagation requires that we do these
+         * operations in exactly the order specified by the pseudocode.
+         */
+        TCGv_i32 tcg_elt1 = tcg_temp_new_i32();
+        TCGv_i32 tcg_elt2 = tcg_temp_new_i32();
+        TCGv_i32 tcg_elt3 = tcg_temp_new_i32();
+        TCGv_ptr fpst = get_fpstatus_ptr();
+
+        assert(esize == 32);
+        assert(elements == 4);
+
+        read_vec_element(s, tcg_elt, rn, 0, MO_32);
+        tcg_gen_trunc_i64_i32(tcg_elt1, tcg_elt);
+        read_vec_element(s, tcg_elt, rn, 1, MO_32);
+        tcg_gen_trunc_i64_i32(tcg_elt2, tcg_elt);
+
+        do_minmaxop(s, tcg_elt1, tcg_elt2, opcode, is_min, fpst);
+
+        read_vec_element(s, tcg_elt, rn, 2, MO_32);
+        tcg_gen_trunc_i64_i32(tcg_elt2, tcg_elt);
+        read_vec_element(s, tcg_elt, rn, 3, MO_32);
+        tcg_gen_trunc_i64_i32(tcg_elt3, tcg_elt);
+
+        do_minmaxop(s, tcg_elt2, tcg_elt3, opcode, is_min, fpst);
+
+        do_minmaxop(s, tcg_elt1, tcg_elt2, opcode, is_min, fpst);
+
+        tcg_gen_extu_i32_i64(tcg_res, tcg_elt1);
+        tcg_temp_free_i32(tcg_elt1);
+        tcg_temp_free_i32(tcg_elt2);
+        tcg_temp_free_i32(tcg_elt3);
+        tcg_temp_free_ptr(fpst);
+    }
+
+    tcg_temp_free_i64(tcg_elt);
+
+    /* Now truncate the result to the width required for the final output */
+    if (opcode == 0x03) {
+        /* SADDLV, UADDLV: result is 2*esize */
+        size++;
+    }
+
+    switch (size) {
+    case 0:
+        tcg_gen_ext8u_i64(tcg_res, tcg_res);
+        break;
+    case 1:
+        tcg_gen_ext16u_i64(tcg_res, tcg_res);
+        break;
+    case 2:
+        tcg_gen_ext32u_i64(tcg_res, tcg_res);
+        break;
+    case 3:
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    write_fp_dreg(s, rd, tcg_res);
+    tcg_temp_free_i64(tcg_res);
 }
 
 /* C3.6.5 AdvSIMD copy
-- 
1.8.5
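
The integer path above accumulates in 64 bits and only truncates at the end. A plain C model for byte-sized lanes shows why that is safe for both ADDV and the widening [US]ADDLV forms. This is a sketch only: reduce_add_bytes is a hypothetical name, and it covers just the size == 0 case.

```c
#include <assert.h>
#include <stdint.h>

/* Sum 8 or 16 byte lanes using a 64-bit intermediate.
 * is_u:    zero-extend each lane (UADDLV) instead of sign-extending it.
 * is_long: keep a 16-bit result ([US]ADDLV) instead of 8 bits (ADDV).
 */
static uint64_t reduce_add_bytes(const uint8_t *lanes, int elements,
                                 int is_u, int is_long)
{
    uint64_t res = 0;
    int i;

    assert(elements == 8 || elements == 16);
    for (i = 0; i < elements; i++) {
        /* Widen to 64 bits first, as read_vec_element() does with MO_SIGN */
        res += is_u ? (uint64_t)lanes[i]
                    : (uint64_t)(int64_t)(int8_t)lanes[i];
    }
    /* Truncate to the output width; the rest of the D register is zero */
    return is_long ? (res & 0xffff) : (res & 0xff);
}
```

Eight lanes of 0xff sum to 0x07f8 unsigned and to -8 signed, so UADDLV yields 0x07f8, SADDLV yields 0xfff8, and ADDV yields 0xf8 whichever extension is used.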

* [Qemu-devel] [PATCH 08/10] target-arm: A64: Add SIMD copy operations
  2014-01-10 17:12 [Qemu-devel] [PATCH 00/10] A64 SIMD patchset one: ld/st, C3.6.1..C3.6.7 Peter Maydell
                   ` (6 preceding siblings ...)
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 07/10] target-arm: A64: Add SIMD across-lanes instructions Peter Maydell
@ 2014-01-10 17:12 ` Peter Maydell
  2014-01-10 19:50   ` Richard Henderson
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 09/10] target-arm: A64: Add SIMD modified immediate group Peter Maydell
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 10/10] target-arm: A64: Add SIMD scalar copy instructions Peter Maydell
  9 siblings, 1 reply; 30+ messages in thread
From: Peter Maydell @ 2014-01-10 17:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: patches, Michael Matz, Alexander Graf, Claudio Fontana,
	Dirk Mueller, Will Newton, Laurent Desnogues, Alex Bennée,
	kvmarm, Christoffer Dall, Richard Henderson

From: Alex Bennée <alex.bennee@linaro.org>

This adds support for all the AdvSIMD vector copy operations
(ARM ARM C3.6.5).

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/translate-a64.c | 210 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 209 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index e9aeaa0..396782e 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -5045,6 +5045,173 @@ static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
     tcg_temp_free_i64(tcg_res);
 }
 
+/* C6.3.31 DUP (Element, Vector)
+ *
+ *  31  30   29              21 20    16 15        10  9    5 4    0
+ * +---+---+-------------------+--------+-------------+------+------+
+ * | 0 | Q | 0 0 1 1 1 0 0 0 0 |  imm5  | 0 0 0 0 0 1 |  Rn  |  Rd  |
+ * +---+---+-------------------+--------+-------------+------+------+
+ *
+ * size: encoded in imm5 (see ARM ARM LowestSetBit())
+ */
+static void handle_simd_dupe(DisasContext *s, int is_q, int rd, int rn,
+                             int imm5)
+{
+    int size = ctz32(imm5);
+    int esize = 8 << size;
+    int elements = (is_q ? 128 : 64) / esize;
+    int index, i;
+    TCGv_i64 tmp;
+
+    if (size > 3 || (size == 3 && !is_q)) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    index = imm5 >> (size + 1);
+
+    tmp = tcg_temp_new_i64();
+    read_vec_element(s, tmp, rn, index, size);
+
+    for (i = 0; i < elements; i++) {
+        write_vec_element(s, tmp, rd, i, size);
+    }
+
+    if (!is_q) {
+        clear_vec_high(s, rd);
+    }
+
+    tcg_temp_free_i64(tmp);
+}
+
+/* C6.3.32 DUP (General)
+ *
+ *  31  30   29              21 20    16 15        10  9    5 4    0
+ * +---+---+-------------------+--------+-------------+------+------+
+ * | 0 | Q | 0 0 1 1 1 0 0 0 0 |  imm5  | 0 0 0 0 1 1 |  Rn  |  Rd  |
+ * +---+---+-------------------+--------+-------------+------+------+
+ *
+ * size: encoded in imm5 (see ARM ARM LowestSetBit())
+ */
+static void handle_simd_dupg(DisasContext *s, int is_q, int rd, int rn,
+                             int imm5)
+{
+    int size = ctz32(imm5);
+    int esize = 8 << size;
+    int elements = (is_q ? 128 : 64)/esize;
+    int i = 0;
+
+    if (size > 3 || ((size == 3) && !is_q)) {
+        unallocated_encoding(s);
+        return;
+    }
+    for (i = 0; i < elements; i++) {
+        write_vec_element(s, cpu_reg(s, rn), rd, i, size);
+    }
+    if (!is_q) {
+        clear_vec_high(s, rd);
+    }
+}
+
+/* C6.3.150 INS (Element)
+ *
+ *  31                   21 20    16 15  14    11  10 9    5 4    0
+ * +-----------------------+--------+------------+---+------+------+
+ * | 0 1 1 0 1 1 1 0 0 0 0 |  imm5  | 0 |  imm4  | 1 |  Rn  |  Rd  |
+ * +-----------------------+--------+------------+---+------+------+
+ *
+ * size: encoded in imm5 (see ARM ARM LowestSetBit())
+ * index: encoded in imm5<4:size+1>
+ */
+static void handle_simd_inse(DisasContext *s, int rd, int rn,
+                             int imm4, int imm5)
+{
+    int size = ctz32(imm5);
+    int src_index, dst_index;
+    TCGv_i64 tmp;
+
+    if (size > 3) {
+        unallocated_encoding(s);
+        return;
+    }
+    dst_index = extract32(imm5, 1+size, 5);
+    src_index = extract32(imm4, size, 4);
+
+    tmp = tcg_temp_new_i64();
+
+    read_vec_element(s, tmp, rn, src_index, size);
+    write_vec_element(s, tmp, rd, dst_index, size);
+
+    tcg_temp_free_i64(tmp);
+}
+
+
+/* C6.3.151 INS (General)
+ *
+ *  31                   21 20    16 15        10  9    5 4    0
+ * +-----------------------+--------+-------------+------+------+
+ * | 0 1 0 0 1 1 1 0 0 0 0 |  imm5  | 0 0 0 1 1 1 |  Rn  |  Rd  |
+ * +-----------------------+--------+-------------+------+------+
+ *
+ * size: encoded in imm5 (see ARM ARM LowestSetBit())
+ * index: encoded in imm5<4:size+1>
+ */
+static void handle_simd_insg(DisasContext *s, int rd, int rn, int imm5)
+{
+    int size = ctz32(imm5);
+    int idx;
+
+    if (size > 3) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    idx = extract32(imm5, 1 + size, 4 - size);
+    write_vec_element(s, cpu_reg(s, rn), rd, idx, size);
+}
+
+/*
+ * C6.3.321 UMOV (General)
+ * C6.3.237 SMOV (General)
+ *
+ *  31  30   29              21 20    16 15    12   10 9    5 4    0
+ * +---+---+-------------------+--------+-------------+------+------+
+ * | 0 | Q | 0 0 1 1 1 0 0 0 0 |  imm5  | 0 0 1 U 1 1 |  Rn  |  Rd  |
+ * +---+---+-------------------+--------+-------------+------+------+
+ *
+ * U: unsigned when set
+ * size: encoded in imm5 (see ARM ARM LowestSetBit())
+ */
+static void handle_simd_umov_smov(DisasContext *s, int is_q, int is_signed,
+                                  int rn, int rd, int imm5)
+{
+    int size = ctz32(imm5);
+    int element;
+    TCGv_i64 tcg_rd;
+
+    /* Check for UnallocatedEncodings */
+    if (is_signed) {
+        if (size > 2 || (size == 2 && !is_q)) {
+            unallocated_encoding(s);
+            return;
+        }
+    } else {
+        if (size > 3
+            || (size < 3 && is_q)
+            || (size == 3 && !is_q)) {
+            unallocated_encoding(s);
+            return;
+        }
+    }
+    element = extract32(imm5, 1+size, 4);
+
+    tcg_rd = cpu_reg(s, rd);
+    read_vec_element(s, tcg_rd, rn, element, size | (is_signed ? MO_SIGN : 0));
+    if (is_signed && !is_q) {
+        tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
+    }
+}
+
 /* C3.6.5 AdvSIMD copy
  *   31  30  29  28             21 20  16 15  14  11 10  9    5 4    0
  * +---+---+----+-----------------+------+---+------+---+------+------+
@@ -5053,7 +5220,48 @@ static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
  */
 static void disas_simd_copy(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int rd = extract32(insn, 0, 5);
+    int rn = extract32(insn, 5, 5);
+    int imm4 = extract32(insn, 11, 4);
+    int op = extract32(insn, 29, 1);
+    int is_q = extract32(insn, 30, 1);
+    int imm5 = extract32(insn, 16, 5);
+
+    if (op) {
+        if (is_q) {
+            /* INS (element) */
+            handle_simd_inse(s, rd, rn, imm4, imm5);
+        } else {
+            unallocated_encoding(s);
+        }
+    } else {
+        switch (imm4) {
+        case 0:
+            /* DUP (element - vector) */
+            handle_simd_dupe(s, is_q, rd, rn, imm5);
+            break;
+        case 1:
+            /* DUP (general) */
+            handle_simd_dupg(s, is_q, rd, rn, imm5);
+            break;
+        case 3:
+            if (is_q) {
+                /* INS (general) */
+                handle_simd_insg(s, rd, rn, imm5);
+            } else {
+                unallocated_encoding(s);
+            }
+            break;
+        case 5:
+        case 7:
+            /* UMOV/SMOV (is_q indicates 32/64; imm4 indicates signedness) */
+            handle_simd_umov_smov(s, is_q, (imm4 == 5), rn, rd, imm5);
+            break;
+        default:
+            unallocated_encoding(s);
+            break;
+        }
+    }
 }
 
 /* C3.6.6 AdvSIMD modified immediate
-- 
1.8.5


* [Qemu-devel] [PATCH 09/10] target-arm: A64: Add SIMD modified immediate group
  2014-01-10 17:12 [Qemu-devel] [PATCH 00/10] A64 SIMD patchset one: ld/st, C3.6.1..C3.6.7 Peter Maydell
                   ` (7 preceding siblings ...)
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 08/10] target-arm: A64: Add SIMD copy operations Peter Maydell
@ 2014-01-10 17:12 ` Peter Maydell
  2014-01-10 20:00   ` Richard Henderson
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 10/10] target-arm: A64: Add SIMD scalar copy instructions Peter Maydell
  9 siblings, 1 reply; 30+ messages in thread
From: Peter Maydell @ 2014-01-10 17:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: patches, Michael Matz, Alexander Graf, Claudio Fontana,
	Dirk Mueller, Will Newton, Laurent Desnogues, Alex Bennée,
	kvmarm, Christoffer Dall, Richard Henderson

From: Alex Bennée <alex.bennee@linaro.org>

This patch adds support for the AdvSIMD modified immediate group
(C3.6.6) with all its suboperations (movi, orr, fmov, mvni, bic).

Signed-off-by: Alexander Graf <agraf@suse.de>
[AJB: new decode struct, minor bug fixes, optimisation]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/translate-a64.c | 131 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 130 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 396782e..153a28a 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -5269,10 +5269,139 @@ static void disas_simd_copy(DisasContext *s, uint32_t insn)
  * +---+---+----+---------------------+-----+-------+----+---+-------+------+
  * | 0 | Q | op | 0 1 1 1 1 0 0 0 0 0 | abc | cmode | o2 | 1 | defgh |  Rd  |
  * +---+---+----+---------------------+-----+-------+----+---+-------+------+
+ *
+ * There are a number of operations that can be carried out here:
+ *   MOVI - move (shifted) imm into register
+ *   MVNI - move inverted (shifted) imm into register
+ *   ORR  - bitwise OR of (shifted) imm with register
+ *   BIC  - bitwise clear of (shifted) imm with register
  */
 static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int rd = extract32(insn, 0, 5);
+    int cmode = extract32(insn, 12, 4);
+    int cmode_3_1 = extract32(cmode, 1, 3);
+    int cmode_0 = extract32(cmode, 0, 1);
+    int o2 = extract32(insn, 11, 1);
+    uint64_t abcdefgh = extract32(insn, 5, 5) | (extract32(insn, 16, 3) << 5);
+    bool is_neg = extract32(insn, 29, 1);
+    bool is_q = extract32(insn, 30, 1);
+    uint64_t imm = 0;
+    TCGv_i64 tcg_rd, tcg_imm;
+    int i;
+
+    if (o2 != 0 || ((cmode == 0xf) && is_neg && !is_q)) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    /* See AdvSIMDExpandImm() in ARM ARM */
+    switch (cmode_3_1) {
+    case 0: /* Replicate(Zeros(24):imm8, 2) */
+    case 1: /* Replicate(Zeros(16):imm8:Zeros(8), 2) */
+    case 2: /* Replicate(Zeros(8):imm8:Zeros(16), 2) */
+    case 3: /* Replicate(imm8:Zeros(24), 2) */
+    {
+        int shift = cmode_3_1 * 8;
+        imm = (abcdefgh << shift) | (abcdefgh << (32 + shift));
+        break;
+    }
+    case 4: /* Replicate(Zeros(8):imm8, 4) */
+    case 5: /* Replicate(imm8:Zeros(8), 4) */
+    {
+        int shift = (cmode_3_1 & 0x1) * 8;
+        imm = (abcdefgh << shift) |
+              (abcdefgh << (16 + shift)) |
+              (abcdefgh << (32 + shift)) |
+              (abcdefgh << (48 + shift));
+        break;
+    }
+    case 6:
+        if (cmode_0) {
+            /* Replicate(Zeros(8):imm8:Ones(16), 2) */
+            imm = (abcdefgh << 16) | 0xffff;
+            imm |= (imm << 32);
+        } else {
+            /* Replicate(Zeros(16):imm8:Ones(8), 2) */
+            imm = (abcdefgh << 8) | 0xff;
+            imm |= (imm << 32);
+        }
+        break;
+    case 7:
+        if (!cmode_0 && !is_neg) {
+            imm = abcdefgh |
+                  (abcdefgh << 8) |
+                  (abcdefgh << 16) |
+                  (abcdefgh << 24) |
+                  (abcdefgh << 32) |
+                  (abcdefgh << 40) |
+                  (abcdefgh << 48) |
+                  (abcdefgh << 56);
+        } else if (!cmode_0 && is_neg) {
+            int i;
+            imm = 0;
+            for (i = 0; i < 8; i++) {
+                if ((abcdefgh) & (1 << i)) {
+                    imm |= 0xffULL << (i * 8);
+                }
+            }
+        } else if (cmode_0) {
+            if (is_neg) {
+                imm = (abcdefgh & 0x3f) << 48;
+                if (abcdefgh & 0x80) {
+                    imm |= 0x8000000000000000ULL;
+                }
+                if (abcdefgh & 0x40) {
+                    imm |= 0x3fc0000000000000ULL;
+                } else {
+                    imm |= 0x4000000000000000ULL;
+                }
+            } else {
+                imm = (abcdefgh & 0x3f) << 19;
+                if (abcdefgh & 0x80) {
+                    imm |= 0x80000000;
+                }
+                if (abcdefgh & 0x40) {
+                    imm |= 0x3e000000;
+                } else {
+                    imm |= 0x40000000;
+                }
+                imm |= (imm << 32);
+            }
+        }
+        break;
+    }
+
+    if (cmode_3_1 != 7 && is_neg) {
+        imm = ~imm;
+    }
+
+    tcg_imm = tcg_const_i64(imm);
+    tcg_rd = new_tmp_a64(s);
+
+    for (i = 0; i < 2; i++) {
+        int foffs = i ? fp_reg_hi_offset(rd) : fp_reg_offset(rd, MO_64);
+
+        if (i == 1 && !is_q) {
+            /* non-quad ops clear high half of vector */
+            tcg_gen_movi_i64(tcg_rd, 0);
+        } else if ((cmode & 0x9) == 0x1 || (cmode & 0xd) == 0x9) {
+            tcg_gen_ld_i64(tcg_rd, cpu_env, foffs);
+            if (is_neg) {
+                /* AND (BIC) */
+                tcg_gen_and_i64(tcg_rd, tcg_rd, tcg_imm);
+            } else {
+                /* ORR */
+                tcg_gen_or_i64(tcg_rd, tcg_rd, tcg_imm);
+            }
+        } else {
+            /* MOVI */
+            tcg_gen_mov_i64(tcg_rd, tcg_imm);
+        }
+        tcg_gen_st_i64(tcg_rd, cpu_env, foffs);
+    }
+
+    tcg_temp_free_i64(tcg_imm);
 }
 
 /* C3.6.7 AdvSIMD scalar copy
-- 
1.8.5
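
The floating-point branch of the immediate expansion above (cmode_3_1 == 7 with cmode_0 set) is the least obvious part of the patch, so a standalone sketch may help when checking it by hand. This is not QEMU code: the function names toy_fp8_to_f32_bits and toy_fp8_to_f64_bits are made up for illustration, and the sketch only mirrors the bit manipulation shown in the diff (sign bit a, inverted b, replicated b, then cdefgh followed by zeros).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: expand the 8-bit immediate "abcdefgh" to the bit
 * pattern of a single-precision value:
 *   a : NOT(b) : bbbbb : cdefgh : 19 zero bits
 */
static uint32_t toy_fp8_to_f32_bits(uint32_t imm8)
{
    uint32_t bits;

    assert(imm8 <= 0xff);
    bits = (imm8 & 0x3f) << 19;      /* cdefgh into bits 24..19 */
    if (imm8 & 0x80) {
        bits |= 0x80000000u;         /* a: sign bit */
    }
    if (imm8 & 0x40) {
        bits |= 0x3e000000u;         /* b set: bits 29..25 = 1, bit 30 = 0 */
    } else {
        bits |= 0x40000000u;         /* b clear: bit 30 = NOT(b) = 1 */
    }
    return bits;
}

/* Hypothetical helper: the same expansion for double precision:
 *   a : NOT(b) : bbbbbbbb : cdefgh : 48 zero bits
 */
static uint64_t toy_fp8_to_f64_bits(uint32_t imm8)
{
    uint64_t bits;

    assert(imm8 <= 0xff);
    bits = (uint64_t)(imm8 & 0x3f) << 48;   /* cdefgh into bits 53..48 */
    if (imm8 & 0x80) {
        bits |= 0x8000000000000000ULL;      /* a: sign bit */
    }
    if (imm8 & 0x40) {
        bits |= 0x3fc0000000000000ULL;      /* b set: bits 61..54 = 1 */
    } else {
        bits |= 0x4000000000000000ULL;      /* b clear: bit 62 = 1 */
    }
    return bits;
}
```

With these definitions an immediate of 0x70 expands to the bit patterns of 1.0 in both widths, which is a convenient sanity check against the constants used in the patch.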


* [Qemu-devel] [PATCH 10/10] target-arm: A64: Add SIMD scalar copy instructions
  2014-01-10 17:12 [Qemu-devel] [PATCH 00/10] A64 SIMD patchset one: ld/st, C3.6.1..C3.6.7 Peter Maydell
                   ` (8 preceding siblings ...)
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 09/10] target-arm: A64: Add SIMD modified immediate group Peter Maydell
@ 2014-01-10 17:12 ` Peter Maydell
  2014-01-10 20:03   ` Richard Henderson
  2014-01-15 15:10   ` Claudio Fontana
  9 siblings, 2 replies; 30+ messages in thread
From: Peter Maydell @ 2014-01-10 17:12 UTC (permalink / raw)
  To: qemu-devel
  Cc: patches, Michael Matz, Alexander Graf, Claudio Fontana,
	Dirk Mueller, Will Newton, Laurent Desnogues, Alex Bennée,
	kvmarm, Christoffer Dall, Richard Henderson

Add support for the SIMD scalar copy instruction group (C3.6.7),
which consists of the single instruction DUP (element, scalar).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/translate-a64.c | 42 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 153a28a..70a8314 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -5084,6 +5084,35 @@ static void handle_simd_dupe(DisasContext *s, int is_q, int rd, int rn,
     tcg_temp_free_i64(tmp);
 }
 
+/* C6.3.31 DUP (element, scalar)
+ *  31                   21 20    16 15        10  9    5 4    0
+ * +-----------------------+--------+-------------+------+------+
+ * | 0 1 0 0 1 1 1 0 0 0 0 |  imm5  | 0 0 0 0 0 1 |  Rn  |  Rd  |
+ * +-----------------------+--------+-------------+------+------+
+ */
+static void handle_simd_dupes(DisasContext *s, int rd, int rn,
+                              int imm5)
+{
+    int size = ctz32(imm5);
+    int index;
+    TCGv_i64 tmp;
+
+    if (size > 3) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    index = imm5 >> (size + 1);
+
+    /* This instruction just extracts the specified element and
+     * zero-extends it into the bottom of the destination register.
+     */
+    tmp = tcg_temp_new_i64();
+    read_vec_element(s, tmp, rn, index, size);
+    write_fp_dreg(s, rd, tmp);
+    tcg_temp_free_i64(tmp);
+}
+
 /* C6.3.32 DUP (General)
  *
  *  31  30   29              21 20    16 15        10  9    5 4    0
@@ -5412,7 +5441,18 @@ static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
  */
 static void disas_simd_scalar_copy(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int rd = extract32(insn, 0, 5);
+    int rn = extract32(insn, 5, 5);
+    int imm4 = extract32(insn, 11, 4);
+    int imm5 = extract32(insn, 16, 5);
+    int op = extract32(insn, 29, 1);
+
+    if (op != 0 || imm4 != 0) {
+        unallocated_encoding(s);
+    }
+
+    /* DUP (element, scalar) */
+    handle_simd_dupes(s, rd, rn, imm5);
 }
 
 /* C3.6.8 AdvSIMD scalar pairwise
-- 
1.8.5
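
The imm5 decode in handle_simd_dupes packs both the element size and the element index into one field: the position of the lowest set bit gives the size, and the bits above it give the index. A minimal standalone sketch of that decode follows. The names toy_ctz32 and toy_decode_dup_imm5 are hypothetical, and the sketch assumes, as the patch appears to, that counting trailing zeros of 0 yields 32 so that imm5 == 0 is rejected by the size check.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for ctz32(): count trailing zero bits,
 * returning 32 for an input of 0 (an assumption about the real helper).
 */
static int toy_ctz32(uint32_t v)
{
    int n = 0;

    if (v == 0) {
        return 32;
    }
    while (!(v & 1)) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Decode imm5 into log2 of the element size in bytes and an element index.
 * Returns 0 on success, or -1 for an unallocated encoding, in which
 * case *size and *index are left untouched.
 */
static int toy_decode_dup_imm5(uint32_t imm5, int *size, int *index)
{
    int sz;

    assert(imm5 < 32);
    sz = toy_ctz32(imm5);
    if (sz > 3) {
        return -1;                  /* caller must stop decoding here */
    }
    *size = sz;
    *index = (int)(imm5 >> (sz + 1));
    return 0;
}
```

For example, imm5 = 0b00110 selects 16-bit elements (size 1) and index 1, while imm5 = 0b10000 has no valid size and is unallocated.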


* Re: [Qemu-devel] [PATCH 01/10] target-arm: A64: Add SIMD ld/st multiple
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 01/10] target-arm: A64: Add SIMD ld/st multiple Peter Maydell
@ 2014-01-10 18:05   ` Richard Henderson
  2014-01-10 18:18     ` Peter Maydell
  0 siblings, 1 reply; 30+ messages in thread
From: Richard Henderson @ 2014-01-10 18:05 UTC (permalink / raw)
  To: Peter Maydell, qemu-devel
  Cc: Laurent Desnogues, patches, Michael Matz, Alexander Graf,
	Claudio Fontana, Dirk Mueller, Will Newton, Alex Bennée,
	kvmarm, Christoffer Dall

On 01/10/2014 09:12 AM, Peter Maydell wrote:
> +    TCGMemOp memop =  MO_TE + size;

Double space after =.  Multiple occurrences.

> +    if (is_postidx) {
> +        int rm = extract32(insn, 16, 5);
> +        if (rm == 31) {
> +            tcg_gen_mov_i64(cpu_reg_sp(s, rn), tcg_addr);
> +        } else {
> +            tcg_gen_add_i64(cpu_reg_sp(s, rn), cpu_reg(s, rn), cpu_reg(s, rm));
> +        }

Second cpu_reg must be cpu_reg_sp as well.  Maybe better to hoist load of
tcg_rn to before initial assignment of tcg_addr?


r~


* Re: [Qemu-devel] [PATCH 02/10] target-arm: A64: Add SIMD ld/st single
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 02/10] target-arm: A64: Add SIMD ld/st single Peter Maydell
@ 2014-01-10 18:12   ` Richard Henderson
  0 siblings, 0 replies; 30+ messages in thread
From: Richard Henderson @ 2014-01-10 18:12 UTC (permalink / raw)
  To: Peter Maydell, qemu-devel
  Cc: Laurent Desnogues, patches, Michael Matz, Alexander Graf,
	Claudio Fontana, Dirk Mueller, Will Newton, Alex Bennée,
	kvmarm, Christoffer Dall

On 01/10/2014 09:12 AM, Peter Maydell wrote:
> +            tcg_gen_add_i64(cpu_reg_sp(s, rn), cpu_reg(s, rn), cpu_reg(s, rm));

Same cpu_reg_sp bug as patch 1.


r~


* Re: [Qemu-devel] [PATCH 01/10] target-arm: A64: Add SIMD ld/st multiple
  2014-01-10 18:05   ` Richard Henderson
@ 2014-01-10 18:18     ` Peter Maydell
  2014-01-10 18:28       ` Richard Henderson
  0 siblings, 1 reply; 30+ messages in thread
From: Peter Maydell @ 2014-01-10 18:18 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Laurent Desnogues, Patch Tracking, Michael Matz, Alexander Graf,
	QEMU Developers, Claudio Fontana, Dirk Mueller, Will Newton,
	Alex Bennée, kvmarm, Christoffer Dall

On 10 January 2014 18:05, Richard Henderson <rth@twiddle.net> wrote:
> On 01/10/2014 09:12 AM, Peter Maydell wrote:
>> +    TCGMemOp memop =  MO_TE + size;
>
> Double space after =.  Multiple occurrences.

Just this one plus its copy-n-paste in do_vec_st, I think.

>> +    if (is_postidx) {
>> +        int rm = extract32(insn, 16, 5);
>> +        if (rm == 31) {
>> +            tcg_gen_mov_i64(cpu_reg_sp(s, rn), tcg_addr);
>> +        } else {
>> +            tcg_gen_add_i64(cpu_reg_sp(s, rn), cpu_reg(s, rn), cpu_reg(s, rm));
>> +        }
>
> Second cpu_reg must be cpu_reg_sp as well.

Yes. Unfortunately the testing tool we're using doesn't
support testing of SP-relative accesses, so this kind
of bug can slip through.

> Maybe better to hoist load of
> tcg_rn to before initial assignment of tcg_addr?

Not sure what you have in mind here. Pulling the
cpu_reg_sp() call out one level like:

    if (is_postidx) {
        int rm = extract32(insn, 16, 5);
        TCGv_i64 tcg_rn = cpu_reg_sp(s, rn);
        if (rm == 31) {
            tcg_gen_mov_i64(tcg_rn, tcg_addr);
        } else {
            tcg_gen_add_i64(tcg_rn, tcg_rn, cpu_reg(s, rm));
        }
    }

seems like a good idea though.

thanks
-- PMM
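
The cpu_reg versus cpu_reg_sp distinction discussed in this message can be modelled outside TCG. The sketch below is a toy model, not QEMU code: ToyRegs, toy_reg, toy_reg_sp and toy_postidx_writeback are hypothetical names, and the addr parameter is assumed to hold the address after the transfers have advanced it. It shows why the base register must be read through the SP-aware accessor: register number 31 means the zero register in one context and the stack pointer in the other.

```c
#include <assert.h>
#include <stdint.h>

/* Toy register file: x[0..30] are general registers, sp is separate.
 * Register number 31 means XZR or SP depending on the instruction field.
 */
typedef struct {
    uint64_t x[31];
    uint64_t sp;
} ToyRegs;

/* Read register n where 31 encodes the zero register (like cpu_reg). */
static uint64_t toy_reg(const ToyRegs *r, int n)
{
    assert(n >= 0 && n <= 31);
    return n == 31 ? 0 : r->x[n];
}

/* Pointer to register n where 31 encodes SP (like cpu_reg_sp). */
static uint64_t *toy_reg_sp(ToyRegs *r, int n)
{
    assert(n >= 0 && n <= 31);
    return n == 31 ? &r->sp : &r->x[n];
}

/* Post-index writeback. Rn is a base register (31 == SP). Rm == 31
 * selects the immediate form, where the advanced address is written back;
 * otherwise Rm is added to the base.
 */
static void toy_postidx_writeback(ToyRegs *r, int rn, int rm, uint64_t addr)
{
    uint64_t *base = toy_reg_sp(r, rn);   /* fetched once, used for read and write */

    if (rm == 31) {
        *base = addr;
    } else {
        *base = *base + toy_reg(r, rm);
    }
}
```

Reading the base through toy_reg instead would silently add Rm to zero whenever rn is 31, which is the shape of the bug being fixed here.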


* Re: [Qemu-devel] [PATCH 01/10] target-arm: A64: Add SIMD ld/st multiple
  2014-01-10 18:18     ` Peter Maydell
@ 2014-01-10 18:28       ` Richard Henderson
  2014-01-10 18:37         ` Peter Maydell
  0 siblings, 1 reply; 30+ messages in thread
From: Richard Henderson @ 2014-01-10 18:28 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Patch Tracking, Michael Matz, QEMU Developers, Alexander Graf,
	Claudio Fontana, Dirk Mueller, Will Newton, Laurent Desnogues,
	Alex Bennée, kvmarm, Christoffer Dall

On 01/10/2014 10:18 AM, Peter Maydell wrote:
>> > Maybe better to hoist load of
>> > tcg_rn to before initial assignment of tcg_addr?
> Not sure what you have in mind here. Pulling the
> cpu_reg_sp() call out one level like:
> 
>     if (is_postidx) {
>         int rm = extract32(insn, 16, 5);
>         TCGv_i64 tcg_rn = cpu_reg_sp(s, rn);
>         if (rm == 31) {
>             tcg_gen_mov_i64(tcg_rn, tcg_addr);
>         } else {
>             tcg_gen_add_i64(tcg_rn, tcg_rn, cpu_reg(s, rm));
>         }
>     }
> 
> seems like a good idea though.

I was thinking

  TCGv_i64 tcg_rn = cpu_reg_sp(s, rn);
  TCGv_i64 tcg_addr = tcg_temp_new_i64();
  tcg_gen_mov_i64(tcg_addr, tcg_rn);

up above.  But even as you have there is good.


r~


* Re: [Qemu-devel] [PATCH 01/10] target-arm: A64: Add SIMD ld/st multiple
  2014-01-10 18:28       ` Richard Henderson
@ 2014-01-10 18:37         ` Peter Maydell
  2014-01-10 19:00           ` Richard Henderson
  0 siblings, 1 reply; 30+ messages in thread
From: Peter Maydell @ 2014-01-10 18:37 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Patch Tracking, Michael Matz, QEMU Developers, Alexander Graf,
	Claudio Fontana, Dirk Mueller, Will Newton, Laurent Desnogues,
	Alex Bennée, kvmarm, Christoffer Dall

On 10 January 2014 18:28, Richard Henderson <rth@twiddle.net> wrote:
> On 01/10/2014 10:18 AM, Peter Maydell wrote:
>>> > Maybe better to hoist load of
>>> > tcg_rn to before initial assignment of tcg_addr?
>> Not sure what you have in mind here. Pulling the
>> cpu_reg_sp() call out one level like:
>>
>>     if (is_postidx) {
>>         int rm = extract32(insn, 16, 5);
>>         TCGv_i64 tcg_rn = cpu_reg_sp(s, rn);
>>         if (rm == 31) {
>>             tcg_gen_mov_i64(tcg_rn, tcg_addr);
>>         } else {
>>             tcg_gen_add_i64(tcg_rn, tcg_rn, cpu_reg(s, rm));
>>         }
>>     }
>>
>> seems like a good idea though.
>
> I was thinking
>
>   TCGv_i64 tcg_rn = cpu_reg_sp(s, rn);
>   TCGv_i64 tcg_addr = tcg_temp_new_i64();
>   tcg_gen_mov_i64(tcg_addr, tcg_rn);
>
> up above.  But even as you have there is good.

Oh, right. Yes, I like that -- have made the change.

thanks
-- PMM


* Re: [Qemu-devel] [PATCH 03/10] target-arm: A64: Add decode skeleton for SIMD data processing insns
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 03/10] target-arm: A64: Add decode skeleton for SIMD data processing insns Peter Maydell
@ 2014-01-10 18:55   ` Richard Henderson
  2014-01-10 19:05   ` Richard Henderson
  1 sibling, 0 replies; 30+ messages in thread
From: Richard Henderson @ 2014-01-10 18:55 UTC (permalink / raw)
  To: Peter Maydell, qemu-devel
  Cc: Laurent Desnogues, patches, Michael Matz, Alexander Graf,
	Claudio Fontana, Dirk Mueller, Will Newton, Alex Bennée,
	kvmarm, Christoffer Dall

On 01/10/2014 09:12 AM, Peter Maydell wrote:
> +static inline AArch64DecodeFn *lookup_disas_fn(AArch64DecodeTable *table,
> +                                               uint32_t insn)

Better make table const.

> +static AArch64DecodeTable data_proc_simd[] = {

So that you can make this const.

> +/* C3.6.1 EXT
> + *   31  30 29         24 23 22  21 20  16 15  14  11 10  9    5 4    0
> + * +---+---+-------------+-----+---+------+---+------+---+------+------+
> + * | 0 | Q | 0 0 1 1 1 0 | op2 | 0 |  Rm  | 0 | imm4 | 0 |  Rn  |  Rd  |
> + * +---+---+-------------+-----+---+------+---+------+---+------+------+
> + */

Error...        1
(That is, bit 29 should read 1 here, not 0.)

> +/* C3.6.16 AdvSIMD three same
> + *  31 30  29 28       24 23  22  21 20  16 15    11  10 9    5 4    0
> + * +-----+---+-----------+------+---+------+--------+---+------+------+
> + * | 0 1 | U | 1 1 1 1 0 | size | 1 |  Rm  | opcode | 1 |  Rn  |  Rd  |
> + * +-----+---+-----------+------+---+------+--------+---+------+------+
> + */

Error.  Cut and paste?

> +    /* pattern  ,  mask     ,  fn                        */
> +    { 0x0e200400, 0x9f200400, disas_simd_three_reg_same },		ok
> +    { 0x0e200000, 0x9f200c00, disas_simd_three_reg_diff },		ok
> +    { 0x0e200800, 0x9f3e0c00, disas_simd_two_reg_misc },		ok
> +    { 0x0e300800, 0x9f3e0c00, disas_simd_across_lanes },		ok
> +    { 0x0e000400, 0x9fe08400, disas_simd_copy },			ok
> +    { 0x0f000000, 0x9f000400, disas_simd_indexed_vector },		ok
> +    /* simd_mod_imm decode is a subset of simd_shift_imm, so must precede it */
> +    { 0x0f000400, 0x9ff80400, disas_simd_mod_imm },			ok
> +    { 0x0f000400, 0x9f800400, disas_simd_shift_imm },		ok
> +    { 0x0e000000, 0xbf208c00, disas_simd_tb },			ok
> +    { 0x0e000800, 0xbf208c00, disas_simd_zip_trn },			ok
> +    { 0x2e000000, 0xbf208400, disas_simd_ext },			ok
> +    { 0x5e200400, 0xdf200400, disas_simd_scalar_three_reg_same },	ok
> +    { 0x5e200000, 0xdf200c00, disas_simd_scalar_three_reg_diff },	ok
> +    { 0x5e200800, 0xdf3e0c00, disas_simd_scalar_two_reg_misc },	ok
> +    { 0x5e300800, 0xdf3e0c00, disas_simd_scalar_pairwise },		ok
> +    { 0x5e000400, 0xdfe08400, disas_simd_scalar_copy },		ok
> +    { 0x5f000000, 0xdf000400, disas_simd_scalar_indexed },		ok
> +    { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },		ok
> +    { 0x4e280800, 0xff3e0c00, disas_crypto_aes },			ok
> +    { 0x5e000000, 0xff208c00, disas_crypto_three_reg_sha },		ok
> +    { 0x5e280800, 0xff3e0c00, disas_crypto_two_reg_sha },		ok
> +    { 0x00000000, 0x00000000, NULL }

The errors in the comments above are not present in this table.  I've verified
the pattern and mask entries, but not the ordering requirements.

> +        (fn) (s, insn);

Surely coding style sez

    fn(s, insn);
or
    (*fn)(s, insn);

Otherwise,

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~
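
The pattern/mask lookup and the ordering requirement noted in the table comment can be tried out in isolation. The following is a minimal sketch, not the patch's code: the Toy* names are hypothetical, the two handlers are empty stubs, and only the mod_imm and shift_imm entries from the quoted table are reproduced. It also takes the table as const, as suggested in this review.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void ToyDecodeFn(uint32_t insn);

typedef struct {
    uint32_t pattern;
    uint32_t mask;
    ToyDecodeFn *fn;
} ToyDecodeTable;

/* Stub handlers, present only so the table has something to point at. */
static void toy_mod_imm(uint32_t insn) { (void)insn; }
static void toy_shift_imm(uint32_t insn) { (void)insn; }

/* The mod_imm entry is a subset of shift_imm, so it must come first. */
static const ToyDecodeTable toy_table[] = {
    { 0x0f000400, 0x9ff80400, toy_mod_imm },
    { 0x0f000400, 0x9f800400, toy_shift_imm },
    { 0x00000000, 0x00000000, NULL }
};

/* Return the first entry whose masked bits match, or NULL if none does.
 * A zero mask marks the end of the table.
 */
static ToyDecodeFn *toy_lookup(const ToyDecodeTable *table, uint32_t insn)
{
    const ToyDecodeTable *t;

    assert(table != NULL);
    for (t = table; t->mask != 0; t++) {
        if ((insn & t->mask) == t->pattern) {
            return t->fn;
        }
    }
    return NULL;
}
```

An instruction word of 0x0f000400 matches both entries and resolves to the first one, while 0x0f080400 has bit 19 set and so only matches the wider shift_imm mask.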


* Re: [Qemu-devel] [PATCH 01/10] target-arm: A64: Add SIMD ld/st multiple
  2014-01-10 18:37         ` Peter Maydell
@ 2014-01-10 19:00           ` Richard Henderson
  0 siblings, 0 replies; 30+ messages in thread
From: Richard Henderson @ 2014-01-10 19:00 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Laurent Desnogues, Patch Tracking, Michael Matz, Alexander Graf,
	QEMU Developers, Claudio Fontana, Dirk Mueller, Will Newton,
	Alex Bennée, kvmarm, Christoffer Dall

On 01/10/2014 10:37 AM, Peter Maydell wrote:
> On 10 January 2014 18:28, Richard Henderson <rth@twiddle.net> wrote:
>> On 01/10/2014 10:18 AM, Peter Maydell wrote:
>>>>> Maybe better to hoist load of
>>>>> tcg_rn to before initial assignment of tcg_addr?
>>> Not sure what you have in mind here. Pulling the
>>> cpu_reg_sp() call out one level like:
>>>
>>>     if (is_postidx) {
>>>         int rm = extract32(insn, 16, 5);
>>>         TCGv_i64 tcg_rn = cpu_reg_sp(s, rn);
>>>         if (rm == 31) {
>>>             tcg_gen_mov_i64(tcg_rn, tcg_addr);
>>>         } else {
>>>             tcg_gen_add_i64(tcg_rn, tcg_rn, cpu_reg(s, rm));
>>>         }
>>>     }
>>>
>>> seems like a good idea though.
>>
>> I was thinking
>>
>>   TCGv_i64 tcg_rn = cpu_reg_sp(s, rn);
>>   TCGv_i64 tcg_addr = tcg_temp_new_i64();
>>   tcg_gen_mov_i64(tcg_addr, tcg_rn);
>>
>> up above.  But even as you have there is good.
> 
> Oh, right. Yes, I like that -- have made the change.

Don't forget the free, of course.  Or use new_tmp_a64.


r~


* Re: [Qemu-devel] [PATCH 03/10] target-arm: A64: Add decode skeleton for SIMD data processing insns
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 03/10] target-arm: A64: Add decode skeleton for SIMD data processing insns Peter Maydell
  2014-01-10 18:55   ` Richard Henderson
@ 2014-01-10 19:05   ` Richard Henderson
  2014-01-11  0:01     ` Peter Maydell
  1 sibling, 1 reply; 30+ messages in thread
From: Richard Henderson @ 2014-01-10 19:05 UTC (permalink / raw)
  To: Peter Maydell, qemu-devel
  Cc: Laurent Desnogues, patches, Michael Matz, Alexander Graf,
	Claudio Fontana, Dirk Mueller, Will Newton, Alex Bennée,
	kvmarm, Christoffer Dall

On 01/10/2014 09:12 AM, Peter Maydell wrote:
>  static void disas_data_proc_simd(DisasContext *s, uint32_t insn)
>  {
>      /* Note that this is called with all non-FP cases from
>       * table C3-6 so it must UNDEF for entries not specifically
>       * allocated to instructions in that table.
>       */
> -    unsupported_encoding(s, insn);
> +    AArch64DecodeFn *fn = lookup_disas_fn(&data_proc_simd[0], insn);
> +    if (fn) {
> +        (fn) (s, insn);

Oh, do you want to CheckFPAdvSIMDEnabled64 here before calling fn?
Otherwise that's the first thing I noticed missing from patch 4.


r~


* Re: [Qemu-devel] [PATCH 04/10] target-arm: A64: Add SIMD EXT
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 04/10] target-arm: A64: Add SIMD EXT Peter Maydell
@ 2014-01-10 19:13   ` Richard Henderson
  0 siblings, 0 replies; 30+ messages in thread
From: Richard Henderson @ 2014-01-10 19:13 UTC (permalink / raw)
  To: Peter Maydell, qemu-devel
  Cc: Laurent Desnogues, patches, Michael Matz, Alexander Graf,
	Claudio Fontana, Dirk Mueller, Will Newton, Alex Bennée,
	kvmarm, Christoffer Dall

On 01/10/2014 09:12 AM, Peter Maydell wrote:
> +        if (pos < 64) {
> +            tcg_resl = do_ext64(s, rn, 1, rn, 0, pos);
> +            tcg_resh = do_ext64(s, rm, 0, rn, 1, pos);
> +        } else {
> +            tcg_resl = do_ext64(s, rm, 0, rn, 1, pos - 64);
> +            tcg_resh = do_ext64(s, rm, 1, rm, 0, pos - 64);
> +        }

Perhaps better to pre-load the values before do_ext64?

In the first case you're loading rn[1] twice, and in the second rm[0] twice.

Otherwise,

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~
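
The pre-loading suggestion above can be illustrated with plain integers. This is a sketch under stated assumptions rather than the patch's implementation: toy_ext64 and toy_ext128 are hypothetical names, pos is taken to be a bit offset below 128, and vectors are modelled as two uint64_t halves with index 0 as the low half. Each source half is read once into a local before the extraction, so no half is loaded twice.

```c
#include <assert.h>
#include <stdint.h>

/* Return 64 bits starting at bit 'pos' of the 128-bit value hi:lo. */
static uint64_t toy_ext64(uint64_t lo, uint64_t hi, int pos)
{
    assert(pos >= 0 && pos < 64);
    if (pos == 0) {
        return lo;                  /* avoid an undefined shift by 64 */
    }
    return (lo >> pos) | (hi << (64 - pos));
}

/* 128-bit EXT: res = low 128 bits of ((m:n) >> pos), with pos in bits.
 * Each source half is read exactly once, up front.
 */
static void toy_ext128(const uint64_t n[2], const uint64_t m[2], int pos,
                       uint64_t res[2])
{
    uint64_t n0 = n[0], n1 = n[1], m0 = m[0], m1 = m[1];

    assert(pos >= 0 && pos < 128);
    if (pos < 64) {
        res[0] = toy_ext64(n0, n1, pos);
        res[1] = toy_ext64(n1, m0, pos);
    } else {
        res[0] = toy_ext64(n1, m0, pos - 64);
        res[1] = toy_ext64(m0, m1, pos - 64);
    }
}
```

Because the halves are copied into locals first, res may alias either input without changing the result.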


* Re: [Qemu-devel] [PATCH 05/10] target-arm: A64: Add SIMD TBL/TBLX
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 05/10] target-arm: A64: Add SIMD TBL/TBLX Peter Maydell
@ 2014-01-10 19:19   ` Richard Henderson
  0 siblings, 0 replies; 30+ messages in thread
From: Richard Henderson @ 2014-01-10 19:19 UTC (permalink / raw)
  To: Peter Maydell, qemu-devel
  Cc: Laurent Desnogues, patches, Michael Matz, Alexander Graf,
	Claudio Fontana, Dirk Mueller, Will Newton, Alex Bennée,
	kvmarm, Christoffer Dall

On 01/10/2014 09:12 AM, Peter Maydell wrote:
> +uint64_t HELPER(simd_tbl)(CPUARMState *env, uint64_t result, uint64_t indices,
> +                          uint64_t rn, uint64_t numregs)

Better with rn and numregs uint32_t?

Otherwise,

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH 06/10] target-arm: A64: Add SIMD ZIP/UZP/TRN
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 06/10] target-arm: A64: Add SIMD ZIP/UZP/TRN Peter Maydell
@ 2014-01-10 19:29   ` Richard Henderson
  2014-01-11  8:30     ` Alex Bennée
  0 siblings, 1 reply; 30+ messages in thread
From: Richard Henderson @ 2014-01-10 19:29 UTC (permalink / raw)
  To: Peter Maydell, qemu-devel
  Cc: Laurent Desnogues, patches, Michael Matz, Alexander Graf,
	Claudio Fontana, Dirk Mueller, Will Newton, Alex Bennée,
	kvmarm, Christoffer Dall

On 01/10/2014 09:12 AM, Peter Maydell wrote:
> +    for (i = 0; i < elements; i++) {
> +        switch (opcode) {
> +        case 1: /* UZP1/2 */
> +        {
> +            int midpoint = elements / 2;
> +            if (i < midpoint) {
> +                read_vec_element(s, tcg_res, rn, 2 * i + part, size);
> +            } else {
> +                read_vec_element(s, tcg_res, rm,
> +                                 2 * (i - midpoint) + part, size);
> +            }
> +            break;
> +        }

You're generating up to 16 * 3 + 2 = 50 opcodes here.  I do wonder if it
wouldn't be better to implement these as helpers.  But,

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~
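
For comparison with the inline TCG expansion quoted above, the UZP1/UZP2 element selection can be written as an ordinary C loop, which is roughly what a helper-based implementation would run. This is only a sketch of the indexing: toy_uzp_bytes is a hypothetical name, it handles byte elements only, and it assumes res does not overlap n or m.

```c
#include <assert.h>
#include <stdint.h>

/* UZP1 (part == 0) or UZP2 (part == 1) on byte elements.
 * The low half of the result takes every second element of n, the high
 * half takes every second element of m, starting at offset 'part'.
 * res must not overlap n or m.
 */
static void toy_uzp_bytes(const uint8_t *n, const uint8_t *m, int elements,
                          int part, uint8_t *res)
{
    int midpoint = elements / 2;
    int i;

    assert(elements > 0 && elements % 2 == 0);
    assert(part == 0 || part == 1);
    for (i = 0; i < elements; i++) {
        if (i < midpoint) {
            res[i] = n[2 * i + part];
        } else {
            res[i] = m[2 * (i - midpoint) + part];
        }
    }
}
```

With n = {0..7} and m = {10..17}, UZP1 yields {0, 2, 4, 6, 10, 12, 14, 16} and UZP2 yields {1, 3, 5, 7, 11, 13, 15, 17}.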


* Re: [Qemu-devel] [PATCH 07/10] target-arm: A64: Add SIMD across-lanes instructions
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 07/10] target-arm: A64: Add SIMD across-lanes instructions Peter Maydell
@ 2014-01-10 19:38   ` Richard Henderson
  0 siblings, 0 replies; 30+ messages in thread
From: Richard Henderson @ 2014-01-10 19:38 UTC (permalink / raw)
  To: Peter Maydell, qemu-devel
  Cc: Laurent Desnogues, patches, Michael Matz, Alexander Graf,
	Claudio Fontana, Dirk Mueller, Will Newton, Alex Bennée,
	kvmarm, Christoffer Dall

On 01/10/2014 09:12 AM, Peter Maydell wrote:
> From: Michael Matz <matz@suse.de>
> 
> Add support for the SIMD "across lanes" instruction group (C3.6.4).
> 
> Signed-off-by: Michael Matz <matz@suse.de>
> [PMM: Updated to current codebase, added fp min/max ops,
>  added unallocated encoding checks]
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target-arm/translate-a64.c | 177 ++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 176 insertions(+), 1 deletion(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH 08/10] target-arm: A64: Add SIMD copy operations
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 08/10] target-arm: A64: Add SIMD copy operations Peter Maydell
@ 2014-01-10 19:50   ` Richard Henderson
  0 siblings, 0 replies; 30+ messages in thread
From: Richard Henderson @ 2014-01-10 19:50 UTC (permalink / raw)
  To: Peter Maydell, qemu-devel
  Cc: Laurent Desnogues, patches, Michael Matz, Alexander Graf,
	Claudio Fontana, Dirk Mueller, Will Newton, Alex Bennée,
	kvmarm, Christoffer Dall

On 01/10/2014 09:12 AM, Peter Maydell wrote:
> From: Alex Bennée <alex.bennee@linaro.org>
> 
> This adds support for the all the AdvSIMD vector copy operations
> (ARM ARM 3.6.5).
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target-arm/translate-a64.c | 210 ++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 209 insertions(+), 1 deletion(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH 09/10] target-arm: A64: Add SIMD modified immediate group
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 09/10] target-arm: A64: Add SIMD modified immediate group Peter Maydell
@ 2014-01-10 20:00   ` Richard Henderson
  0 siblings, 0 replies; 30+ messages in thread
From: Richard Henderson @ 2014-01-10 20:00 UTC (permalink / raw)
  To: Peter Maydell, qemu-devel
  Cc: Laurent Desnogues, patches, Michael Matz, Alexander Graf,
	Claudio Fontana, Dirk Mueller, Will Newton, Alex Bennée,
	kvmarm, Christoffer Dall

On 01/10/2014 09:12 AM, Peter Maydell wrote:
> +    case 0: /* Replicate(Zeros(24):imm8, 2) */
> +    case 1: /* Replicate(Zeros(16):imm8:Zeros(8), 2) */
> +    case 2: /* Replicate(Zeros(8):imm8:Zeros(16), 2) */
> +    case 3: /* Replicate(imm8:Zeros(24), 2) */
> +    {
> +        int shift = cmode_3_1 * 8;
> +        imm = (abcdefgh << shift) | (abcdefgh << (32 + shift));
> +        break;
> +    }

Better to use bitfield_replicate with these?

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~
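
One way to read the replication suggestion above is to build the element once and then copy it across the 64-bit value. The sketch below uses a hypothetical toy_replicate function rather than any existing QEMU helper, and toy_expand_shifted32 covers only the four shifted 32-bit cases (cmode_3_1 of 0 to 3) from the quoted hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Replicate the low 'width' bits of 'value' across all 64 bits.
 * 'width' must be a divisor of 64.
 */
static uint64_t toy_replicate(uint64_t value, unsigned width)
{
    uint64_t result = 0;
    unsigned i;

    assert(width > 0 && width <= 64 && 64 % width == 0);
    if (width < 64) {
        value &= (1ULL << width) - 1;   /* keep only the element bits */
    }
    for (i = 0; i < 64; i += width) {
        result |= value << i;
    }
    return result;
}

/* Shifted 32-bit forms: place imm8 in byte lane cmode_3_1 of a 32-bit
 * element, then replicate that element twice.
 */
static uint64_t toy_expand_shifted32(uint64_t imm8, unsigned cmode_3_1)
{
    assert(imm8 <= 0xff && cmode_3_1 <= 3);
    return toy_replicate(imm8 << (cmode_3_1 * 8), 32);
}
```

The same toy_replicate call with a width of 16 or 8 would cover the halfword and byte cases, so the per-case shift-and-OR chains collapse into one helper.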


* Re: [Qemu-devel] [PATCH 10/10] target-arm: A64: Add SIMD scalar copy instructions
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 10/10] target-arm: A64: Add SIMD scalar copy instructions Peter Maydell
@ 2014-01-10 20:03   ` Richard Henderson
  2014-01-15 15:10   ` Claudio Fontana
  1 sibling, 0 replies; 30+ messages in thread
From: Richard Henderson @ 2014-01-10 20:03 UTC (permalink / raw)
  To: Peter Maydell, qemu-devel
  Cc: Laurent Desnogues, patches, Michael Matz, Alexander Graf,
	Claudio Fontana, Dirk Mueller, Will Newton, Alex Bennée,
	kvmarm, Christoffer Dall

On 01/10/2014 09:12 AM, Peter Maydell wrote:
> +/* C6.3.31 DUP (element, scalar)
> + *  31                   21 20    16 15        10  9    5 4    0
> + * +-----------------------+--------+-------------+------+------+
> + * | 0 1 0 0 1 1 1 0 0 0 0 |  imm5  | 0 0 0 0 0 1 |  Rn  |  Rd  |
> + * +-----------------------+--------+-------------+------+------+
> + */

Error...      1
(That is, bit 28 should read 1 here, not 0.)

Otherwise,

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~


* Re: [Qemu-devel] [PATCH 03/10] target-arm: A64: Add decode skeleton for SIMD data processing insns
  2014-01-10 19:05   ` Richard Henderson
@ 2014-01-11  0:01     ` Peter Maydell
  0 siblings, 0 replies; 30+ messages in thread
From: Peter Maydell @ 2014-01-11  0:01 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Laurent Desnogues, Patch Tracking, Michael Matz, Alexander Graf,
	QEMU Developers, Claudio Fontana, Dirk Mueller, Will Newton,
	Alex Bennée, kvmarm, Christoffer Dall

On 10 January 2014 19:05, Richard Henderson <rth@twiddle.net> wrote:
> On 01/10/2014 09:12 AM, Peter Maydell wrote:
>>  static void disas_data_proc_simd(DisasContext *s, uint32_t insn)
>>  {
>>      /* Note that this is called with all non-FP cases from
>>       * table C3-6 so it must UNDEF for entries not specifically
>>       * allocated to instructions in that table.
>>       */
>> -    unsupported_encoding(s, insn);
>> +    AArch64DecodeFn *fn = lookup_disas_fn(&data_proc_simd[0], insn);
>> +    if (fn) {
>> +        (fn) (s, insn);
>
> Oh, do you want to CheckFPAdvSIMDEnabled64 here before calling fn?
> Otherwise that's the first thing I noticed missing from patch 4.

We don't currently check that for the FP insns either. Since it's a system
register check and will always pass for usermode emulation I was planning
to leave it for when I did system emulation and wired up the CPACR_EL1.

thanks
-- PMM


* Re: [Qemu-devel] [PATCH 06/10] target-arm: A64: Add SIMD ZIP/UZP/TRN
  2014-01-10 19:29   ` Richard Henderson
@ 2014-01-11  8:30     ` Alex Bennée
  0 siblings, 0 replies; 30+ messages in thread
From: Alex Bennée @ 2014-01-11  8:30 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Peter Maydell, Laurent Desnogues, patches, Michael Matz,
	Alexander Graf, qemu-devel, Claudio Fontana, Dirk Mueller,
	Will Newton, Alex Bennée, kvmarm, Christoffer Dall


rth@twiddle.net writes:

> On 01/10/2014 09:12 AM, Peter Maydell wrote:
>> +    for (i = 0; i < elements; i++) {
>> +        switch (opcode) {
>> +        case 1: /* UZP1/2 */
>> +        {
>> +            int midpoint = elements / 2;
>> +            if (i < midpoint) {
>> +                read_vec_element(s, tcg_res, rn, 2 * i + part, size);
>> +            } else {
>> +                read_vec_element(s, tcg_res, rm,
>> +                                 2 * (i - midpoint) + part, size);
>> +            }
>> +            break;
>> +        }
>
> You're generating up to 16 * 3 + 2 = 50 opcodes here.  I do wonder if it
> wouldn't be better to implement these as helpers.  But,

What's the hit in terms of opcodes for calling a helper function? I
would have thought you'd spend tens of opcodes messing around with
stack frames and the like before you get to the helper.

I was wondering whether there is demand for more SIMD-like TCG opcodes,
but I suspect all the arches are subtly different enough to make it a
pain to abstract.
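
For reference, the element selection performed by the quoted UZP1/UZP2 loop
can be modelled in plain C, which is roughly the shape a helper would take.
This is only a sketch under stated assumptions: the name uzp_bytes is
hypothetical, it handles byte-sized elements only, and it works on plain
arrays rather than on QEMU's vector register state.

```c
#include <stdint.h>

/* Hypothetical reference model of UZP1/UZP2 for byte elements; not QEMU
 * code. part is 0 for UZP1 and 1 for UZP2. elements must be even, and
 * res must not alias rn or rm.
 */
static void uzp_bytes(uint8_t *res, const uint8_t *rn, const uint8_t *rm,
                      int elements, int part)
{
    int midpoint = elements / 2;

    for (int i = 0; i < elements; i++) {
        if (i < midpoint) {
            /* Lower half of the result comes from Rn. */
            res[i] = rn[2 * i + part];
        } else {
            /* Upper half of the result comes from Rm. */
            res[i] = rm[2 * (i - midpoint) + part];
        }
    }
}
```

With rn = {0..7} and rm = {8..15}, part 0 selects the even-numbered
elements of the concatenated input and part 1 selects the odd-numbered ones.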

>
> Reviewed-by: Richard Henderson <rth@twiddle.net>
>
>
> r~



* Re: [Qemu-devel] [PATCH 10/10] target-arm: A64: Add SIMD scalar copy instructions
  2014-01-10 17:12 ` [Qemu-devel] [PATCH 10/10] target-arm: A64: Add SIMD scalar copy instructions Peter Maydell
  2014-01-10 20:03   ` Richard Henderson
@ 2014-01-15 15:10   ` Claudio Fontana
  2014-01-15 18:01     ` Peter Maydell
  1 sibling, 1 reply; 30+ messages in thread
From: Claudio Fontana @ 2014-01-15 15:10 UTC (permalink / raw)
  To: Peter Maydell, qemu-devel
  Cc: Laurent Desnogues, Michael Matz, Alexander Graf, Claudio Fontana,
	Dirk Mueller, Will Newton, Alex Bennée, kvmarm,
	Christoffer Dall, Richard Henderson

Hello Peter,

a missing return here I think:

On 10.01.2014 18:12, Peter Maydell wrote:
> Add support for the SIMD scalar copy instruction group (C3.6.7),
> which consists of the single instruction DUP (element, scalar).
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
>  target-arm/translate-a64.c | 42 +++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 41 insertions(+), 1 deletion(-)
> 
> diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
> index 153a28a..70a8314 100644
> --- a/target-arm/translate-a64.c
> +++ b/target-arm/translate-a64.c
> @@ -5084,6 +5084,35 @@ static void handle_simd_dupe(DisasContext *s, int is_q, int rd, int rn,
>      tcg_temp_free_i64(tmp);
>  }
>  
> +/* C6.3.31 DUP (element, scalar)
> + *  31                   21 20    16 15        10  9    5 4    0
> + * +-----------------------+--------+-------------+------+------+
> + * | 0 1 0 0 1 1 1 0 0 0 0 |  imm5  | 0 0 0 0 0 1 |  Rn  |  Rd  |
> + * +-----------------------+--------+-------------+------+------+
> + */
> +static void handle_simd_dupes(DisasContext *s, int rd, int rn,
> +                              int imm5)
> +{
> +    int size = ctz32(imm5);
> +    int index;
> +    TCGv_i64 tmp;
> +
> +    if (size > 3) {
> +        unallocated_encoding(s);
> +        return;
> +    }
> +
> +    index = imm5 >> (size + 1);
> +
> +    /* This instruction just extracts the specified element and
> +     * zero-extends it into the bottom of the destination register.
> +     */
> +    tmp = tcg_temp_new_i64();
> +    read_vec_element(s, tmp, rn, index, size);
> +    write_fp_dreg(s, rd, tmp);
> +    tcg_temp_free_i64(tmp);
> +}
> +
>  /* C6.3.32 DUP (General)
>   *
>   *  31  30   29              21 20    16 15        10  9    5 4    0
> @@ -5412,7 +5441,18 @@ static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
>   */
>  static void disas_simd_scalar_copy(DisasContext *s, uint32_t insn)
>  {
> -    unsupported_encoding(s, insn);
> +    int rd = extract32(insn, 0, 5);
> +    int rn = extract32(insn, 5, 5);
> +    int imm4 = extract32(insn, 11, 4);
> +    int imm5 = extract32(insn, 16, 5);
> +    int op = extract32(insn, 29, 1);
> +
> +    if (op != 0 || imm4 != 0) {
> +        unallocated_encoding(s);

add a return here.

> +    }
> +
> +    /* DUP (element, scalar) */
> +    handle_simd_dupes(s, rd, rn, imm5);
>  }
>  
>  /* C3.6.8 AdvSIMD scalar pairwise
> 
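
The effect of the missing return, and the imm5 decode in handle_simd_dupes,
can be checked with a small standalone model. This is a sketch, not the QEMU
source: the names ctz32_model, bits and decode_dup_scalar are hypothetical
stand-ins, and a false return value plays the role of unallocated_encoding()
followed by return.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a count-trailing-zeros helper.
 * Returns 32 for a zero input. */
static int ctz32_model(uint32_t v)
{
    int n = 0;

    if (v == 0) {
        return 32;
    }
    while ((v & 1) == 0) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Extract 'length' bits of insn starting at bit 'start'. */
static uint32_t bits(uint32_t insn, int start, int length)
{
    return (insn >> start) & ((1u << length) - 1);
}

/* Returns false for an unallocated encoding. Otherwise stores the
 * log2 element size in *size and the element index in *index. */
static bool decode_dup_scalar(uint32_t insn, int *size, int *index)
{
    uint32_t imm4 = bits(insn, 11, 4);
    uint32_t imm5 = bits(insn, 16, 5);
    uint32_t op = bits(insn, 29, 1);

    if (op != 0 || imm4 != 0) {
        return false; /* stop here; do not fall through to the decode */
    }

    int sz = ctz32_model(imm5);
    if (sz > 3) {
        return false; /* imm5 is 0b00000 or 0b10000 */
    }

    *size = sz;
    *index = (int)(imm5 >> (sz + 1));
    return true;
}
```

For example, imm5 = 0b10110 has one trailing zero, so the element size is
16 bits (size 1) and the index is 0b10110 >> 2 = 5.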

Ciao,

Claudio



* Re: [Qemu-devel] [PATCH 10/10] target-arm: A64: Add SIMD scalar copy instructions
  2014-01-15 15:10   ` Claudio Fontana
@ 2014-01-15 18:01     ` Peter Maydell
  0 siblings, 0 replies; 30+ messages in thread
From: Peter Maydell @ 2014-01-15 18:01 UTC (permalink / raw)
  To: Claudio Fontana
  Cc: Laurent Desnogues, Michael Matz, QEMU Developers, Alexander Graf,
	Claudio Fontana, Dirk Mueller, Will Newton, Alex Bennée,
	kvmarm, Christoffer Dall, Richard Henderson

On 15 January 2014 15:10, Claudio Fontana <claudio.fontana@huawei.com> wrote:
> Hello Peter,
>
> a missing return here I think:

Thanks. I'd already queued this set onto target-arm.next, so I've just fixed
this nit there.

-- PMM



Thread overview: 30+ messages
2014-01-10 17:12 [Qemu-devel] [PATCH 00/10] A64 SIMD patchset one: ld/st, C3.6.1..C3.6.7 Peter Maydell
2014-01-10 17:12 ` [Qemu-devel] [PATCH 01/10] target-arm: A64: Add SIMD ld/st multiple Peter Maydell
2014-01-10 18:05   ` Richard Henderson
2014-01-10 18:18     ` Peter Maydell
2014-01-10 18:28       ` Richard Henderson
2014-01-10 18:37         ` Peter Maydell
2014-01-10 19:00           ` Richard Henderson
2014-01-10 17:12 ` [Qemu-devel] [PATCH 02/10] target-arm: A64: Add SIMD ld/st single Peter Maydell
2014-01-10 18:12   ` Richard Henderson
2014-01-10 17:12 ` [Qemu-devel] [PATCH 03/10] target-arm: A64: Add decode skeleton for SIMD data processing insns Peter Maydell
2014-01-10 18:55   ` Richard Henderson
2014-01-10 19:05   ` Richard Henderson
2014-01-11  0:01     ` Peter Maydell
2014-01-10 17:12 ` [Qemu-devel] [PATCH 04/10] target-arm: A64: Add SIMD EXT Peter Maydell
2014-01-10 19:13   ` Richard Henderson
2014-01-10 17:12 ` [Qemu-devel] [PATCH 05/10] target-arm: A64: Add SIMD TBL/TBLX Peter Maydell
2014-01-10 19:19   ` Richard Henderson
2014-01-10 17:12 ` [Qemu-devel] [PATCH 06/10] target-arm: A64: Add SIMD ZIP/UZP/TRN Peter Maydell
2014-01-10 19:29   ` Richard Henderson
2014-01-11  8:30     ` Alex Bennée
2014-01-10 17:12 ` [Qemu-devel] [PATCH 07/10] target-arm: A64: Add SIMD across-lanes instructions Peter Maydell
2014-01-10 19:38   ` Richard Henderson
2014-01-10 17:12 ` [Qemu-devel] [PATCH 08/10] target-arm: A64: Add SIMD copy operations Peter Maydell
2014-01-10 19:50   ` Richard Henderson
2014-01-10 17:12 ` [Qemu-devel] [PATCH 09/10] target-arm: A64: Add SIMD modified immediate group Peter Maydell
2014-01-10 20:00   ` Richard Henderson
2014-01-10 17:12 ` [Qemu-devel] [PATCH 10/10] target-arm: A64: Add SIMD scalar copy instructions Peter Maydell
2014-01-10 20:03   ` Richard Henderson
2014-01-15 15:10   ` Claudio Fontana
2014-01-15 18:01     ` Peter Maydell
