All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PULL 00/38] target-arm queue
@ 2014-01-29 13:39 Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 01/38] target-arm: A64: Add SIMD ld/st multiple Peter Maydell
                   ` (38 more replies)
  0 siblings, 39 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

Here's the target-arm queue; nothing earthshaking here.
I anticipate the queue will fill up with another 30+ patches
within a week or so; please pull :-)

thanks
-- PMM

The following changes since commit 0169c511554cb0014a00290b0d3d26c31a49818f:

  Merge remote-tracking branch 'qemu-kvm/uq/master' into staging (2014-01-24 15:52:44 -0800)

are available in the git repository at:


  git://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20140129

for you to fetch changes up to 984bc70fb5e6331cf3cbfa836373fefadf1435f9:

  arm_gic: Fix GICD_ICPENDR and GICD_ISPENDR writes (2014-01-29 13:27:34 +0000)

----------------------------------------------------------------
target-arm queue:
 * implementation of first part of the A64 Neon instruction set
 * v8 AArch32 rounding and 16<->64 fp conversion instructions
 * add support for KVM ARM using the irqchip creation kernel API
 * fix MIDR value on Zynq boards
 * some minor bugfixes/code cleanups

----------------------------------------------------------------
Alex Bennée (5):
      target-arm: A64: Add SIMD ld/st multiple
      target-arm: A64: Add decode skeleton for SIMD data processing insns
      target-arm: A64: Add SIMD copy operations
      target-arm: A64: Add SIMD modified immediate group
      target-arm: A64: Add SIMD shift by immediate

Alistair Francis (2):
      ARM: Convert MIDR to a property
      ZYNQ: Implement board MIDR control for Zynq

Christoffer Dall (6):
      linux-headers: Update from Linus' master ba635f8
      kvm: Introduce kvm_arch_irqchip_create
      kvm: Common device control API functions
      arm: vgic device control api support
      arm_gic: Introduce define for GIC_NR_SGIS
      arm_gic: Fix GICD_ICPENDR and GICD_ISPENDR writes

Michael Matz (3):
      target-arm: A64: Add SIMD TBL/TBLX
      target-arm: A64: Add SIMD ZIP/UZP/TRN
      target-arm: A64: Add SIMD across-lanes instructions

Paolo Bonzini (1):
      display: avoid multi-statement macro

Peter Maydell (11):
      target-arm: A64: Add SIMD ld/st single
      target-arm: A64: Add SIMD EXT
      target-arm: A64: Add SIMD scalar copy instructions
      hw/arm/boot: Don't set up ATAGS for autogenerated dtb booting
      target-arm: A64: Add SIMD three-different multiply accumulate insns
      target-arm: A64: Add SIMD three-different ABDL instructions
      target-arm: A64: Add SIMD scalar 3 same add, sub and compare ops
      target-arm: A64: Add top level decode for SIMD 3-same group
      target-arm: A64: Add logic ops from SIMD 3 same group
      target-arm: A64: Add integer ops from SIMD 3-same group
      target-arm: A64: Add simple SIMD 3-same floating point ops

Will Newton (10):
      target-arm: Move arm_rmode_to_sf to a shared location.
      target-arm: Add AArch32 FP VRINTA, VRINTN, VRINTP and VRINTM
      target-arm: Add support for AArch32 FP VRINTR
      target-arm: Add support for AArch32 FP VRINTZ
      target-arm: Add support for AArch32 FP VRINTX
      target-arm: Add support for AArch32 SIMD VRINTX
      target-arm: Add set_neon_rmode helper
      target-arm: Add AArch32 SIMD VRINTA, VRINTN, VRINTP, VRINTM, VRINTZ
      target-arm: Add AArch32 FP VCVTA, VCVTN, VCVTP and VCVTM
      target-arm: Add AArch32 SIMD VCVTA, VCVTN, VCVTP and VCVTM

 hw/arm/boot.c                    |    9 +-
 hw/arm/xilinx_zynq.c             |    7 +
 hw/display/blizzard_template.h   |   40 +-
 hw/display/pl110_template.h      |   12 +-
 hw/display/pxa2xx_template.h     |   22 +-
 hw/display/tc6393xb_template.h   |   14 +-
 hw/intc/arm_gic.c                |   21 +-
 hw/intc/arm_gic_kvm.c            |   22 +-
 include/hw/intc/arm_gic_common.h |    2 +
 include/sysemu/kvm.h             |   34 +
 kvm-all.c                        |   50 +-
 linux-headers/asm-arm/kvm.h      |   28 +
 linux-headers/asm-arm64/kvm.h    |   21 +-
 linux-headers/asm-x86/hyperv.h   |   13 +
 linux-headers/linux/kvm.h        |    2 +
 stubs/Makefile.objs              |    1 +
 stubs/kvm.c                      |    7 +
 target-arm/cpu.c                 |    1 +
 target-arm/cpu.h                 |    2 +
 target-arm/helper-a64.c          |   31 +
 target-arm/helper-a64.h          |    1 +
 target-arm/helper.c              |   45 +
 target-arm/helper.h              |    1 +
 target-arm/kvm.c                 |   55 +-
 target-arm/kvm_arm.h             |   17 +-
 target-arm/translate-a64.c       | 2707 +++++++++++++++++++++++++++++++++++++-
 target-arm/translate.c           |  251 ++++
 trace-events                     |    1 +
 28 files changed, 3325 insertions(+), 92 deletions(-)
 create mode 100644 stubs/kvm.c

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 01/38] target-arm: A64: Add SIMD ld/st multiple
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 02/38] target-arm: A64: Add SIMD ld/st single Peter Maydell
                   ` (37 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Alex Bennée <alex.bennee@linaro.org>

This adds support support for the SIMD load/store
multiple category of instructions.

This also brings in a couple of helper functions for manipulating
sections of the SIMD registers:

  * do_vec_get - fetch value from a slice of a vector register
  * do_vec_set - set a slice of a vector register

which use vec_reg_offset for consistent processing of offsets in an
endian aware manner. There are also additional helpers:

  * do_vec_ld - load value into SIMD
  * do_vec_st - store value from SIMD

which load or store a slice of a vector register to memory.
These don't zero extend like the fp variants.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/translate-a64.c | 250 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 248 insertions(+), 2 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index cf80c46..e4fdf00 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -308,6 +308,28 @@ static TCGv_i64 read_cpu_reg_sp(DisasContext *s, int reg, int sf)
     return v;
 }
 
+/* Return the offset into CPUARMState of an element of specified
+ * size, 'element' places in from the least significant end of
+ * the FP/vector register Qn.
+ */
+static inline int vec_reg_offset(int regno, int element, TCGMemOp size)
+{
+    int offs = offsetof(CPUARMState, vfp.regs[regno * 2]);
+#ifdef HOST_WORDS_BIGENDIAN
+    /* This is complicated slightly because vfp.regs[2n] is
+     * still the low half and  vfp.regs[2n+1] the high half
+     * of the 128 bit vector, even on big endian systems.
+     * Calculate the offset assuming a fully bigendian 128 bits,
+     * then XOR to account for the order of the two 64 bit halves.
+     */
+    offs += (16 - ((element + 1) * (1 << size)));
+    offs ^= 8;
+#else
+    offs += element * (1 << size);
+#endif
+    return offs;
+}
+
 /* Return the offset into CPUARMState of a slice (from
  * the least significant end) of FP register Qn (ie
  * Dn, Sn, Hn or Bn).
@@ -661,6 +683,108 @@ static void do_fp_ld(DisasContext *s, int destidx, TCGv_i64 tcg_addr, int size)
 }
 
 /*
+ * Vector load/store helpers.
+ *
+ * The principal difference between this and a FP load is that we don't
+ * zero extend as we are filling a partial chunk of the vector register.
+ * These functions don't support 128 bit loads/stores, which would be
+ * normal load/store operations.
+ */
+
+/* Get value of an element within a vector register */
+static void read_vec_element(DisasContext *s, TCGv_i64 tcg_dest, int srcidx,
+                             int element, TCGMemOp memop)
+{
+    int vect_off = vec_reg_offset(srcidx, element, memop & MO_SIZE);
+    switch (memop) {
+    case MO_8:
+        tcg_gen_ld8u_i64(tcg_dest, cpu_env, vect_off);
+        break;
+    case MO_16:
+        tcg_gen_ld16u_i64(tcg_dest, cpu_env, vect_off);
+        break;
+    case MO_32:
+        tcg_gen_ld32u_i64(tcg_dest, cpu_env, vect_off);
+        break;
+    case MO_8|MO_SIGN:
+        tcg_gen_ld8s_i64(tcg_dest, cpu_env, vect_off);
+        break;
+    case MO_16|MO_SIGN:
+        tcg_gen_ld16s_i64(tcg_dest, cpu_env, vect_off);
+        break;
+    case MO_32|MO_SIGN:
+        tcg_gen_ld32s_i64(tcg_dest, cpu_env, vect_off);
+        break;
+    case MO_64:
+    case MO_64|MO_SIGN:
+        tcg_gen_ld_i64(tcg_dest, cpu_env, vect_off);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+}
+
+/* Set value of an element within a vector register */
+static void write_vec_element(DisasContext *s, TCGv_i64 tcg_src, int destidx,
+                              int element, TCGMemOp memop)
+{
+    int vect_off = vec_reg_offset(destidx, element, memop & MO_SIZE);
+    switch (memop) {
+    case MO_8:
+        tcg_gen_st8_i64(tcg_src, cpu_env, vect_off);
+        break;
+    case MO_16:
+        tcg_gen_st16_i64(tcg_src, cpu_env, vect_off);
+        break;
+    case MO_32:
+        tcg_gen_st32_i64(tcg_src, cpu_env, vect_off);
+        break;
+    case MO_64:
+        tcg_gen_st_i64(tcg_src, cpu_env, vect_off);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+}
+
+/* Clear the high 64 bits of a 128 bit vector (in general non-quad
+ * vector ops all need to do this).
+ */
+static void clear_vec_high(DisasContext *s, int rd)
+{
+    TCGv_i64 tcg_zero = tcg_const_i64(0);
+
+    write_vec_element(s, tcg_zero, rd, 1, MO_64);
+    tcg_temp_free_i64(tcg_zero);
+}
+
+/* Store from vector register to memory */
+static void do_vec_st(DisasContext *s, int srcidx, int element,
+                      TCGv_i64 tcg_addr, int size)
+{
+    TCGMemOp memop = MO_TE + size;
+    TCGv_i64 tcg_tmp = tcg_temp_new_i64();
+
+    read_vec_element(s, tcg_tmp, srcidx, element, size);
+    tcg_gen_qemu_st_i64(tcg_tmp, tcg_addr, get_mem_index(s), memop);
+
+    tcg_temp_free_i64(tcg_tmp);
+}
+
+/* Load from memory to vector register */
+static void do_vec_ld(DisasContext *s, int destidx, int element,
+                      TCGv_i64 tcg_addr, int size)
+{
+    TCGMemOp memop = MO_TE + size;
+    TCGv_i64 tcg_tmp = tcg_temp_new_i64();
+
+    tcg_gen_qemu_ld_i64(tcg_tmp, tcg_addr, get_mem_index(s), memop);
+    write_vec_element(s, tcg_tmp, destidx, element, size);
+
+    tcg_temp_free_i64(tcg_tmp);
+}
+
+/*
  * This utility function is for doing register extension with an
  * optional shift. You will likely want to pass a temporary for the
  * destination register. See DecodeRegExtend() in the ARM ARM.
@@ -1835,10 +1959,132 @@ static void disas_ldst_reg(DisasContext *s, uint32_t insn)
     }
 }
 
-/* AdvSIMD load/store multiple structures */
+/* C3.3.1 AdvSIMD load/store multiple structures
+ *
+ *  31  30  29           23 22  21         16 15    12 11  10 9    5 4    0
+ * +---+---+---------------+---+-------------+--------+------+------+------+
+ * | 0 | Q | 0 0 1 1 0 0 0 | L | 0 0 0 0 0 0 | opcode | size |  Rn  |  Rt  |
+ * +---+---+---------------+---+-------------+--------+------+------+------+
+ *
+ * C3.3.2 AdvSIMD load/store multiple structures (post-indexed)
+ *
+ *  31  30  29           23 22  21  20     16 15    12 11  10 9    5 4    0
+ * +---+---+---------------+---+---+---------+--------+------+------+------+
+ * | 0 | Q | 0 0 1 1 0 0 1 | L | 0 |   Rm    | opcode | size |  Rn  |  Rt  |
+ * +---+---+---------------+---+---+---------+--------+------+------+------+
+ *
+ * Rt: first (or only) SIMD&FP register to be transferred
+ * Rn: base address or SP
+ * Rm (post-index only): post-index register (when !31) or size dependent #imm
+ */
 static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int rt = extract32(insn, 0, 5);
+    int rn = extract32(insn, 5, 5);
+    int size = extract32(insn, 10, 2);
+    int opcode = extract32(insn, 12, 4);
+    bool is_store = !extract32(insn, 22, 1);
+    bool is_postidx = extract32(insn, 23, 1);
+    bool is_q = extract32(insn, 30, 1);
+    TCGv_i64 tcg_addr, tcg_rn;
+
+    int ebytes = 1 << size;
+    int elements = (is_q ? 128 : 64) / (8 << size);
+    int rpt;    /* num iterations */
+    int selem;  /* structure elements */
+    int r;
+
+    if (extract32(insn, 31, 1) || extract32(insn, 21, 1)) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    /* From the shared decode logic */
+    switch (opcode) {
+    case 0x0:
+        rpt = 1;
+        selem = 4;
+        break;
+    case 0x2:
+        rpt = 4;
+        selem = 1;
+        break;
+    case 0x4:
+        rpt = 1;
+        selem = 3;
+        break;
+    case 0x6:
+        rpt = 3;
+        selem = 1;
+        break;
+    case 0x7:
+        rpt = 1;
+        selem = 1;
+        break;
+    case 0x8:
+        rpt = 1;
+        selem = 2;
+        break;
+    case 0xa:
+        rpt = 2;
+        selem = 1;
+        break;
+    default:
+        unallocated_encoding(s);
+        return;
+    }
+
+    if (size == 3 && !is_q && selem != 1) {
+        /* reserved */
+        unallocated_encoding(s);
+        return;
+    }
+
+    if (rn == 31) {
+        gen_check_sp_alignment(s);
+    }
+
+    tcg_rn = cpu_reg_sp(s, rn);
+    tcg_addr = tcg_temp_new_i64();
+    tcg_gen_mov_i64(tcg_addr, tcg_rn);
+
+    for (r = 0; r < rpt; r++) {
+        int e;
+        for (e = 0; e < elements; e++) {
+            int tt = (rt + r) % 32;
+            int xs;
+            for (xs = 0; xs < selem; xs++) {
+                if (is_store) {
+                    do_vec_st(s, tt, e, tcg_addr, size);
+                } else {
+                    do_vec_ld(s, tt, e, tcg_addr, size);
+
+                    /* For non-quad operations, setting a slice of the low
+                     * 64 bits of the register clears the high 64 bits (in
+                     * the ARM ARM pseudocode this is implicit in the fact
+                     * that 'rval' is a 64 bit wide variable). We optimize
+                     * by noticing that we only need to do this the first
+                     * time we touch a register.
+                     */
+                    if (!is_q && e == 0 && (r == 0 || xs == selem - 1)) {
+                        clear_vec_high(s, tt);
+                    }
+                }
+                tcg_gen_addi_i64(tcg_addr, tcg_addr, ebytes);
+                tt = (tt + 1) % 32;
+            }
+        }
+    }
+
+    if (is_postidx) {
+        int rm = extract32(insn, 16, 5);
+        if (rm == 31) {
+            tcg_gen_mov_i64(tcg_rn, tcg_addr);
+        } else {
+            tcg_gen_add_i64(tcg_rn, tcg_rn, cpu_reg(s, rm));
+        }
+    }
+    tcg_temp_free_i64(tcg_addr);
 }
 
 /* AdvSIMD load/store single structure */
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 02/38] target-arm: A64: Add SIMD ld/st single
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 01/38] target-arm: A64: Add SIMD ld/st multiple Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 03/38] target-arm: A64: Add decode skeleton for SIMD data processing insns Peter Maydell
                   ` (36 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

Implement the SIMD ld/st single structure instructions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/translate-a64.c | 144 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 142 insertions(+), 2 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index e4fdf00..6e31740 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -2087,10 +2087,150 @@ static void disas_ldst_multiple_struct(DisasContext *s, uint32_t insn)
     tcg_temp_free_i64(tcg_addr);
 }
 
-/* AdvSIMD load/store single structure */
+/* C3.3.3 AdvSIMD load/store single structure
+ *
+ *  31  30  29           23 22 21 20       16 15 13 12  11  10 9    5 4    0
+ * +---+---+---------------+-----+-----------+-----+---+------+------+------+
+ * | 0 | Q | 0 0 1 1 0 1 0 | L R | 0 0 0 0 0 | opc | S | size |  Rn  |  Rt  |
+ * +---+---+---------------+-----+-----------+-----+---+------+------+------+
+ *
+ * C3.3.4 AdvSIMD load/store single structure (post-indexed)
+ *
+ *  31  30  29           23 22 21 20       16 15 13 12  11  10 9    5 4    0
+ * +---+---+---------------+-----+-----------+-----+---+------+------+------+
+ * | 0 | Q | 0 0 1 1 0 1 1 | L R |     Rm    | opc | S | size |  Rn  |  Rt  |
+ * +---+---+---------------+-----+-----------+-----+---+------+------+------+
+ *
+ * Rt: first (or only) SIMD&FP register to be transferred
+ * Rn: base address or SP
+ * Rm (post-index only): post-index register (when !31) or size dependent #imm
+ * index = encoded in Q:S:size dependent on size
+ *
+ * lane_size = encoded in R, opc
+ * transfer width = encoded in opc, S, size
+ */
 static void disas_ldst_single_struct(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int rt = extract32(insn, 0, 5);
+    int rn = extract32(insn, 5, 5);
+    int size = extract32(insn, 10, 2);
+    int S = extract32(insn, 12, 1);
+    int opc = extract32(insn, 13, 3);
+    int R = extract32(insn, 21, 1);
+    int is_load = extract32(insn, 22, 1);
+    int is_postidx = extract32(insn, 23, 1);
+    int is_q = extract32(insn, 30, 1);
+
+    int scale = extract32(opc, 1, 2);
+    int selem = (extract32(opc, 0, 1) << 1 | R) + 1;
+    bool replicate = false;
+    int index = is_q << 3 | S << 2 | size;
+    int ebytes, xs;
+    TCGv_i64 tcg_addr, tcg_rn;
+
+    switch (scale) {
+    case 3:
+        if (!is_load || S) {
+            unallocated_encoding(s);
+            return;
+        }
+        scale = size;
+        replicate = true;
+        break;
+    case 0:
+        break;
+    case 1:
+        if (extract32(size, 0, 1)) {
+            unallocated_encoding(s);
+            return;
+        }
+        index >>= 1;
+        break;
+    case 2:
+        if (extract32(size, 1, 1)) {
+            unallocated_encoding(s);
+            return;
+        }
+        if (!extract32(size, 0, 1)) {
+            index >>= 2;
+        } else {
+            if (S) {
+                unallocated_encoding(s);
+                return;
+            }
+            index >>= 3;
+            scale = 3;
+        }
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    ebytes = 1 << scale;
+
+    if (rn == 31) {
+        gen_check_sp_alignment(s);
+    }
+
+    tcg_rn = cpu_reg_sp(s, rn);
+    tcg_addr = tcg_temp_new_i64();
+    tcg_gen_mov_i64(tcg_addr, tcg_rn);
+
+    for (xs = 0; xs < selem; xs++) {
+        if (replicate) {
+            /* Load and replicate to all elements */
+            uint64_t mulconst;
+            TCGv_i64 tcg_tmp = tcg_temp_new_i64();
+
+            tcg_gen_qemu_ld_i64(tcg_tmp, tcg_addr,
+                                get_mem_index(s), MO_TE + scale);
+            switch (scale) {
+            case 0:
+                mulconst = 0x0101010101010101ULL;
+                break;
+            case 1:
+                mulconst = 0x0001000100010001ULL;
+                break;
+            case 2:
+                mulconst = 0x0000000100000001ULL;
+                break;
+            case 3:
+                mulconst = 0;
+                break;
+            default:
+                g_assert_not_reached();
+            }
+            if (mulconst) {
+                tcg_gen_muli_i64(tcg_tmp, tcg_tmp, mulconst);
+            }
+            write_vec_element(s, tcg_tmp, rt, 0, MO_64);
+            if (is_q) {
+                write_vec_element(s, tcg_tmp, rt, 1, MO_64);
+            } else {
+                clear_vec_high(s, rt);
+            }
+            tcg_temp_free_i64(tcg_tmp);
+        } else {
+            /* Load/store one element per register */
+            if (is_load) {
+                do_vec_ld(s, rt, index, tcg_addr, MO_TE + scale);
+            } else {
+                do_vec_st(s, rt, index, tcg_addr, MO_TE + scale);
+            }
+        }
+        tcg_gen_addi_i64(tcg_addr, tcg_addr, ebytes);
+        rt = (rt + 1) % 32;
+    }
+
+    if (is_postidx) {
+        int rm = extract32(insn, 16, 5);
+        if (rm == 31) {
+            tcg_gen_mov_i64(tcg_rn, tcg_addr);
+        } else {
+            tcg_gen_add_i64(tcg_rn, tcg_rn, cpu_reg(s, rm));
+        }
+    }
+    tcg_temp_free_i64(tcg_addr);
 }
 
 /* C3.3 Loads and stores */
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 03/38] target-arm: A64: Add decode skeleton for SIMD data processing insns
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 01/38] target-arm: A64: Add SIMD ld/st multiple Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 02/38] target-arm: A64: Add SIMD ld/st single Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 04/38] target-arm: A64: Add SIMD EXT Peter Maydell
                   ` (35 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Alex Bennée <alex.bennee@linaro.org>

Add decode skeleton and function placeholders for all the SIMD data
processing instructions. Due to the complexity of this part of the
table the normal extract and switch approach gets very messy very
quickly, so we use a simple data-driven pattern-and-mask approach.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/translate-a64.c | 306 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 305 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 6e31740..8060c9f 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -61,6 +61,17 @@ enum a64_shift_type {
     A64_SHIFT_TYPE_ROR = 3
 };
 
+/* Table based decoder typedefs - used when the relevant bits for decode
+ * are too awkwardly scattered across the instruction (eg SIMD).
+ */
+typedef void AArch64DecodeFn(DisasContext *s, uint32_t insn);
+
+typedef struct AArch64DecodeTable {
+    uint32_t pattern;
+    uint32_t mask;
+    AArch64DecodeFn *disas_fn;
+} AArch64DecodeTable;
+
 /* initialize TCG globals.  */
 void a64_translate_init(void)
 {
@@ -846,6 +857,31 @@ static inline void gen_check_sp_alignment(DisasContext *s)
 }
 
 /*
+ * This provides a simple table based table lookup decoder. It is
+ * intended to be used when the relevant bits for decode are too
+ * awkwardly placed and switch/if based logic would be confusing and
+ * deeply nested. Since it's a linear search through the table, tables
+ * should be kept small.
+ *
+ * It returns the first handler where insn & mask == pattern, or
+ * NULL if there is no match.
+ * The table is terminated by an empty mask (i.e. 0)
+ */
+static inline AArch64DecodeFn *lookup_disas_fn(const AArch64DecodeTable *table,
+                                               uint32_t insn)
+{
+    const AArch64DecodeTable *tptr = table;
+
+    while (tptr->mask) {
+        if ((insn & tptr->mask) == tptr->pattern) {
+            return tptr->disas_fn;
+        }
+        tptr++;
+    }
+    return NULL;
+}
+
+/*
  * the instruction disassembly implemented here matches
  * the instruction encoding classifications in chapter 3 (C3)
  * of the ARM Architecture Reference Manual (DDI0487A_a)
@@ -4610,13 +4646,281 @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
     }
 }
 
+/* C3.6.1 EXT
+ *   31  30 29         24 23 22  21 20  16 15  14  11 10  9    5 4    0
+ * +---+---+-------------+-----+---+------+---+------+---+------+------+
+ * | 0 | Q | 1 0 1 1 1 0 | op2 | 0 |  Rm  | 0 | imm4 | 0 |  Rn  |  Rd  |
+ * +---+---+-------------+-----+---+------+---+------+---+------+------+
+ */
+static void disas_simd_ext(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.2 TBL/TBX
+ *   31  30 29         24 23 22  21 20  16 15  14 13  12  11 10 9    5 4    0
+ * +---+---+-------------+-----+---+------+---+-----+----+-----+------+------+
+ * | 0 | Q | 0 0 1 1 1 0 | op2 | 0 |  Rm  | 0 | len | op | 0 0 |  Rn  |  Rd  |
+ * +---+---+-------------+-----+---+------+---+-----+----+-----+------+------+
+ */
+static void disas_simd_tb(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.3 ZIP/UZP/TRN
+ *   31  30 29         24 23  22  21 20   16 15 14 12 11 10 9    5 4    0
+ * +---+---+-------------+------+---+------+---+------------------+------+
+ * | 0 | Q | 0 0 1 1 1 0 | size | 0 |  Rm  | 0 | opc | 1 0 |  Rn  |  Rd  |
+ * +---+---+-------------+------+---+------+---+------------------+------+
+ */
+static void disas_simd_zip_trn(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.4 AdvSIMD across lanes
+ *   31  30  29 28       24 23  22 21       17 16    12 11 10 9    5 4    0
+ * +---+---+---+-----------+------+-----------+--------+-----+------+------+
+ * | 0 | Q | U | 0 1 1 1 0 | size | 1 1 0 0 0 | opcode | 1 0 |  Rn  |  Rd  |
+ * +---+---+---+-----------+------+-----------+--------+-----+------+------+
+ */
+static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.5 AdvSIMD copy
+ *   31  30  29  28             21 20  16 15  14  11 10  9    5 4    0
+ * +---+---+----+-----------------+------+---+------+---+------+------+
+ * | 0 | Q | op | 0 1 1 1 0 0 0 0 | imm5 | 0 | imm4 | 1 |  Rn  |  Rd  |
+ * +---+---+----+-----------------+------+---+------+---+------+------+
+ */
+static void disas_simd_copy(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.6 AdvSIMD modified immediate
+ *  31  30   29  28                 19 18 16 15   12  11  10  9     5 4    0
+ * +---+---+----+---------------------+-----+-------+----+---+-------+------+
+ * | 0 | Q | op | 0 1 1 1 1 0 0 0 0 0 | abc | cmode | o2 | 1 | defgh |  Rd  |
+ * +---+---+----+---------------------+-----+-------+----+---+-------+------+
+ */
+static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.7 AdvSIMD scalar copy
+ *  31 30  29  28             21 20  16 15  14  11 10  9    5 4    0
+ * +-----+----+-----------------+------+---+------+---+------+------+
+ * | 0 1 | op | 1 1 1 1 0 0 0 0 | imm5 | 0 | imm4 | 1 |  Rn  |  Rd  |
+ * +-----+----+-----------------+------+---+------+---+------+------+
+ */
+static void disas_simd_scalar_copy(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.8 AdvSIMD scalar pairwise
+ *  31 30  29 28       24 23  22 21       17 16    12 11 10 9    5 4    0
+ * +-----+---+-----------+------+-----------+--------+-----+------+------+
+ * | 0 1 | U | 1 1 1 1 0 | size | 1 1 0 0 0 | opcode | 1 0 |  Rn  |  Rd  |
+ * +-----+---+-----------+------+-----------+--------+-----+------+------+
+ */
+static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.9 AdvSIMD scalar shift by immediate
+ *  31 30  29 28         23 22  19 18  16 15    11  10 9    5 4    0
+ * +-----+---+-------------+------+------+--------+---+------+------+
+ * | 0 1 | U | 1 1 1 1 1 0 | immh | immb | opcode | 1 |  Rn  |  Rd  |
+ * +-----+---+-------------+------+------+--------+---+------+------+
+ */
+static void disas_simd_scalar_shift_imm(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.10 AdvSIMD scalar three different
+ *  31 30  29 28       24 23  22  21 20  16 15    12 11 10 9    5 4    0
+ * +-----+---+-----------+------+---+------+--------+-----+------+------+
+ * | 0 1 | U | 1 1 1 1 0 | size | 1 |  Rm  | opcode | 0 0 |  Rn  |  Rd  |
+ * +-----+---+-----------+------+---+------+--------+-----+------+------+
+ */
+static void disas_simd_scalar_three_reg_diff(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.11 AdvSIMD scalar three same
+ *  31 30  29 28       24 23  22  21 20  16 15    11  10 9    5 4    0
+ * +-----+---+-----------+------+---+------+--------+---+------+------+
+ * | 0 1 | U | 1 1 1 1 0 | size | 1 |  Rm  | opcode | 1 |  Rn  |  Rd  |
+ * +-----+---+-----------+------+---+------+--------+---+------+------+
+ */
+static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.12 AdvSIMD scalar two reg misc
+ *  31 30  29 28       24 23  22 21       17 16    12 11 10 9    5 4    0
+ * +-----+---+-----------+------+-----------+--------+-----+------+------+
+ * | 0 1 | U | 1 1 1 1 0 | size | 1 0 0 0 0 | opcode | 1 0 |  Rn  |  Rd  |
+ * +-----+---+-----------+------+-----------+--------+-----+------+------+
+ */
+static void disas_simd_scalar_two_reg_misc(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.13 AdvSIMD scalar x indexed element
+ *  31 30  29 28       24 23  22 21  20  19  16 15 12  11  10 9    5 4    0
+ * +-----+---+-----------+------+---+---+------+-----+---+---+------+------+
+ * | 0 1 | U | 1 1 1 1 1 | size | L | M |  Rm  | opc | H | 0 |  Rn  |  Rd  |
+ * +-----+---+-----------+------+---+---+------+-----+---+---+------+------+
+ */
+static void disas_simd_scalar_indexed(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.14 AdvSIMD shift by immediate
+ *  31  30   29 28         23 22  19 18  16 15    11  10 9    5 4    0
+ * +---+---+---+-------------+------+------+--------+---+------+------+
+ * | 0 | Q | U | 0 1 1 1 1 0 | immh | immb | opcode | 1 |  Rn  |  Rd  |
+ * +---+---+---+-------------+------+------+--------+---+------+------+
+ */
+static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.15 AdvSIMD three different
+ *   31  30  29 28       24 23  22  21 20  16 15    12 11 10 9    5 4    0
+ * +---+---+---+-----------+------+---+------+--------+-----+------+------+
+ * | 0 | Q | U | 0 1 1 1 0 | size | 1 |  Rm  | opcode | 0 0 |  Rn  |  Rd  |
+ * +---+---+---+-----------+------+---+------+--------+-----+------+------+
+ */
+static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.16 AdvSIMD three same
+ *  31  30  29  28       24 23  22  21 20  16 15    11  10 9    5 4    0
+ * +---+---+---+-----------+------+---+------+--------+---+------+------+
+ * | 0 | Q | U | 0 1 1 1 0 | size | 1 |  Rm  | opcode | 1 |  Rn  |  Rd  |
+ * +---+---+---+-----------+------+---+------+--------+---+------+------+
+ */
+static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.17 AdvSIMD two reg misc
+ *   31  30  29 28       24 23  22 21       17 16    12 11 10 9    5 4    0
+ * +---+---+---+-----------+------+-----------+--------+-----+------+------+
+ * | 0 | Q | U | 0 1 1 1 0 | size | 1 0 0 0 0 | opcode | 1 0 |  Rn  |  Rd  |
+ * +---+---+---+-----------+------+-----------+--------+-----+------+------+
+ */
+static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.18 AdvSIMD vector x indexed element
+ *   31  30  29 28       24 23  22 21  20  19  16 15 12  11  10 9    5 4    0
+ * +---+---+---+-----------+------+---+---+------+-----+---+---+------+------+
+ * | 0 | Q | U | 0 1 1 1 1 | size | L | M |  Rm  | opc | H | 0 |  Rn  |  Rd  |
+ * +---+---+---+-----------+------+---+---+------+-----+---+---+------+------+
+ */
+static void disas_simd_indexed_vector(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.19 Crypto AES
+ *  31             24 23  22 21       17 16    12 11 10 9    5 4    0
+ * +-----------------+------+-----------+--------+-----+------+------+
+ * | 0 1 0 0 1 1 1 0 | size | 1 0 1 0 0 | opcode | 1 0 |  Rn  |  Rd  |
+ * +-----------------+------+-----------+--------+-----+------+------+
+ */
+static void disas_crypto_aes(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.20 Crypto three-reg SHA
+ *  31             24 23  22  21 20  16  15 14    12 11 10 9    5 4    0
+ * +-----------------+------+---+------+---+--------+-----+------+------+
+ * | 0 1 0 1 1 1 1 0 | size | 0 |  Rm  | 0 | opcode | 0 0 |  Rn  |  Rd  |
+ * +-----------------+------+---+------+---+--------+-----+------+------+
+ */
+static void disas_crypto_three_reg_sha(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6.21 Crypto two-reg SHA
+ *  31             24 23  22 21       17 16    12 11 10 9    5 4    0
+ * +-----------------+------+-----------+--------+-----+------+------+
+ * | 0 1 0 1 1 1 1 0 | size | 1 0 1 0 0 | opcode | 1 0 |  Rn  |  Rd  |
+ * +-----------------+------+-----------+--------+-----+------+------+
+ */
+static void disas_crypto_two_reg_sha(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* C3.6 Data processing - SIMD, inc Crypto
+ *
+ * As the decode gets a little complex we are using a table based
+ * approach for this part of the decode.
+ */
+static const AArch64DecodeTable data_proc_simd[] = {
+    /* pattern  ,  mask     ,  fn                        */
+    { 0x0e200400, 0x9f200400, disas_simd_three_reg_same },
+    { 0x0e200000, 0x9f200c00, disas_simd_three_reg_diff },
+    { 0x0e200800, 0x9f3e0c00, disas_simd_two_reg_misc },
+    { 0x0e300800, 0x9f3e0c00, disas_simd_across_lanes },
+    { 0x0e000400, 0x9fe08400, disas_simd_copy },
+    { 0x0f000000, 0x9f000400, disas_simd_indexed_vector },
+    /* simd_mod_imm decode is a subset of simd_shift_imm, so must precede it */
+    { 0x0f000400, 0x9ff80400, disas_simd_mod_imm },
+    { 0x0f000400, 0x9f800400, disas_simd_shift_imm },
+    { 0x0e000000, 0xbf208c00, disas_simd_tb },
+    { 0x0e000800, 0xbf208c00, disas_simd_zip_trn },
+    { 0x2e000000, 0xbf208400, disas_simd_ext },
+    { 0x5e200400, 0xdf200400, disas_simd_scalar_three_reg_same },
+    { 0x5e200000, 0xdf200c00, disas_simd_scalar_three_reg_diff },
+    { 0x5e200800, 0xdf3e0c00, disas_simd_scalar_two_reg_misc },
+    { 0x5e300800, 0xdf3e0c00, disas_simd_scalar_pairwise },
+    { 0x5e000400, 0xdfe08400, disas_simd_scalar_copy },
+    { 0x5f000000, 0xdf000400, disas_simd_scalar_indexed },
+    { 0x5f000400, 0xdf800400, disas_simd_scalar_shift_imm },
+    { 0x4e280800, 0xff3e0c00, disas_crypto_aes },
+    { 0x5e000000, 0xff208c00, disas_crypto_three_reg_sha },
+    { 0x5e280800, 0xff3e0c00, disas_crypto_two_reg_sha },
+    { 0x00000000, 0x00000000, NULL }
+};
+
 static void disas_data_proc_simd(DisasContext *s, uint32_t insn)
 {
     /* Note that this is called with all non-FP cases from
      * table C3-6 so it must UNDEF for entries not specifically
      * allocated to instructions in that table.
      */
-    unsupported_encoding(s, insn);
+    AArch64DecodeFn *fn = lookup_disas_fn(&data_proc_simd[0], insn);
+    if (fn) {
+        fn(s, insn);
+    } else {
+        unallocated_encoding(s);
+    }
 }
 
 /* C3.6 Data processing - SIMD and floating point */
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 04/38] target-arm: A64: Add SIMD EXT
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (2 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 03/38] target-arm: A64: Add decode skeleton for SIMD data processing insns Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 05/38] target-arm: A64: Add SIMD TBL/TBLX Peter Maydell
                   ` (34 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

Add support for the SIMD EXT instruction (the only one in its
group, C3.6.1).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/translate-a64.c | 79 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 78 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 8060c9f..75b0039 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -4646,6 +4646,25 @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
     }
 }
 
+static void do_ext64(DisasContext *s, TCGv_i64 tcg_left, TCGv_i64 tcg_right,
+                     int pos)
+{
+    /* Extract 64 bits from the middle of two concatenated 64 bit
+     * vector register slices left:right. The extracted bits start
+     * at 'pos' bits into the right (least significant) side.
+     * We return the result in tcg_right, and guarantee not to
+     * trash tcg_left.
+     */
+    TCGv_i64 tcg_tmp = tcg_temp_new_i64();
+    assert(pos > 0 && pos < 64);
+
+    tcg_gen_shri_i64(tcg_right, tcg_right, pos);
+    tcg_gen_shli_i64(tcg_tmp, tcg_left, 64 - pos);
+    tcg_gen_or_i64(tcg_right, tcg_right, tcg_tmp);
+
+    tcg_temp_free_i64(tcg_tmp);
+}
+
 /* C3.6.1 EXT
  *   31  30 29         24 23 22  21 20  16 15  14  11 10  9    5 4    0
  * +---+---+-------------+-----+---+------+---+------+---+------+------+
@@ -4654,7 +4673,65 @@ static void disas_data_proc_fp(DisasContext *s, uint32_t insn)
  */
 static void disas_simd_ext(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int is_q = extract32(insn, 30, 1);
+    int op2 = extract32(insn, 22, 2);
+    int imm4 = extract32(insn, 11, 4);
+    int rm = extract32(insn, 16, 5);
+    int rn = extract32(insn, 5, 5);
+    int rd = extract32(insn, 0, 5);
+    int pos = imm4 << 3;
+    TCGv_i64 tcg_resl, tcg_resh;
+
+    if (op2 != 0 || (!is_q && extract32(imm4, 3, 1))) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    tcg_resh = tcg_temp_new_i64();
+    tcg_resl = tcg_temp_new_i64();
+
+    /* Vd gets bits starting at pos bits into Vm:Vn. This is
+     * either extracting 128 bits from a 128:128 concatenation, or
+     * extracting 64 bits from a 64:64 concatenation.
+     */
+    if (!is_q) {
+        read_vec_element(s, tcg_resl, rn, 0, MO_64);
+        if (pos != 0) {
+            read_vec_element(s, tcg_resh, rm, 0, MO_64);
+            do_ext64(s, tcg_resh, tcg_resl, pos);
+        }
+        tcg_gen_movi_i64(tcg_resh, 0);
+    } else {
+        TCGv_i64 tcg_hh;
+        typedef struct {
+            int reg;
+            int elt;
+        } EltPosns;
+        EltPosns eltposns[] = { {rn, 0}, {rn, 1}, {rm, 0}, {rm, 1} };
+        EltPosns *elt = eltposns;
+
+        if (pos >= 64) {
+            elt++;
+            pos -= 64;
+        }
+
+        read_vec_element(s, tcg_resl, elt->reg, elt->elt, MO_64);
+        elt++;
+        read_vec_element(s, tcg_resh, elt->reg, elt->elt, MO_64);
+        elt++;
+        if (pos != 0) {
+            do_ext64(s, tcg_resh, tcg_resl, pos);
+            tcg_hh = tcg_temp_new_i64();
+            read_vec_element(s, tcg_hh, elt->reg, elt->elt, MO_64);
+            do_ext64(s, tcg_hh, tcg_resh, pos);
+            tcg_temp_free_i64(tcg_hh);
+        }
+    }
+
+    write_vec_element(s, tcg_resl, rd, 0, MO_64);
+    tcg_temp_free_i64(tcg_resl);
+    write_vec_element(s, tcg_resh, rd, 1, MO_64);
+    tcg_temp_free_i64(tcg_resh);
 }
 
 /* C3.6.2 TBL/TBX
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 05/38] target-arm: A64: Add SIMD TBL/TBLX
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (3 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 04/38] target-arm: A64: Add SIMD EXT Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 06/38] target-arm: A64: Add SIMD ZIP/UZP/TRN Peter Maydell
                   ` (33 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Michael Matz <matz@suse.de>

Add support for the SIMD TBL/TBLX instructions (group C3.6.2).

Signed-off-by: Michael Matz <matz@suse.de>
[PMM: rewritten to do more of the decode in translate-a64.c,
 and to do only one 64 bit pass at a time in the helper]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/helper-a64.c    | 31 ++++++++++++++++++++++++++
 target-arm/helper-a64.h    |  1 +
 target-arm/translate-a64.c | 55 +++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 86 insertions(+), 1 deletion(-)

diff --git a/target-arm/helper-a64.c b/target-arm/helper-a64.c
index 4ce0d01..6ca958a 100644
--- a/target-arm/helper-a64.c
+++ b/target-arm/helper-a64.c
@@ -122,3 +122,34 @@ uint64_t HELPER(vfp_cmped_a64)(float64 x, float64 y, void *fp_status)
 {
     return float_rel_to_flags(float64_compare(x, y, fp_status));
 }
+
+uint64_t HELPER(simd_tbl)(CPUARMState *env, uint64_t result, uint64_t indices,
+                          uint32_t rn, uint32_t numregs)
+{
+    /* Helper function for SIMD TBL and TBX. We have to do the table
+     * lookup part for the 64 bits worth of indices we're passed in.
+     * result is the initial results vector (either zeroes for TBL
+     * or some guest values for TBX), rn the register number where
+     * the table starts, and numregs the number of registers in the table.
+     * We return the results of the lookups.
+     */
+    int shift;
+
+    for (shift = 0; shift < 64; shift += 8) {
+        int index = extract64(indices, shift, 8);
+        if (index < 16 * numregs) {
+            /* Convert index (a byte offset into the virtual table
+             * which is a series of 128-bit vectors concatenated)
+             * into the correct vfp.regs[] element plus a bit offset
+             * into that element, bearing in mind that the table
+             * can wrap around from V31 to V0.
+             */
+            int elt = (rn * 2 + (index >> 3)) % 64;
+            int bitidx = (index & 7) * 8;
+            uint64_t val = extract64(env->vfp.regs[elt], bitidx, 8);
+
+            result = deposit64(result, shift, 8, val);
+        }
+    }
+    return result;
+}
diff --git a/target-arm/helper-a64.h b/target-arm/helper-a64.h
index bca19f3..99832ee 100644
--- a/target-arm/helper-a64.h
+++ b/target-arm/helper-a64.h
@@ -26,3 +26,4 @@ DEF_HELPER_3(vfp_cmps_a64, i64, f32, f32, ptr)
 DEF_HELPER_3(vfp_cmpes_a64, i64, f32, f32, ptr)
 DEF_HELPER_3(vfp_cmpd_a64, i64, f64, f64, ptr)
 DEF_HELPER_3(vfp_cmped_a64, i64, f64, f64, ptr)
+DEF_HELPER_FLAGS_5(simd_tbl, TCG_CALL_NO_RWG_SE, i64, env, i64, i64, i32, i32)
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 75b0039..7a6b00a 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -4742,7 +4742,60 @@ static void disas_simd_ext(DisasContext *s, uint32_t insn)
  */
 static void disas_simd_tb(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int op2 = extract32(insn, 22, 2);
+    int is_q = extract32(insn, 30, 1);
+    int rm = extract32(insn, 16, 5);
+    int rn = extract32(insn, 5, 5);
+    int rd = extract32(insn, 0, 5);
+    int is_tblx = extract32(insn, 12, 1);
+    int len = extract32(insn, 13, 2);
+    TCGv_i64 tcg_resl, tcg_resh, tcg_idx;
+    TCGv_i32 tcg_regno, tcg_numregs;
+
+    if (op2 != 0) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    /* This does a table lookup: for every byte element in the input
+     * we index into a table formed from up to four vector registers,
+     * and then the output is the result of the lookups. Our helper
+     * function does the lookup operation for a single 64 bit part of
+     * the input.
+     */
+    tcg_resl = tcg_temp_new_i64();
+    tcg_resh = tcg_temp_new_i64();
+
+    if (is_tblx) {
+        read_vec_element(s, tcg_resl, rd, 0, MO_64);
+    } else {
+        tcg_gen_movi_i64(tcg_resl, 0);
+    }
+    if (is_tblx && is_q) {
+        read_vec_element(s, tcg_resh, rd, 1, MO_64);
+    } else {
+        tcg_gen_movi_i64(tcg_resh, 0);
+    }
+
+    tcg_idx = tcg_temp_new_i64();
+    tcg_regno = tcg_const_i32(rn);
+    tcg_numregs = tcg_const_i32(len + 1);
+    read_vec_element(s, tcg_idx, rm, 0, MO_64);
+    gen_helper_simd_tbl(tcg_resl, cpu_env, tcg_resl, tcg_idx,
+                        tcg_regno, tcg_numregs);
+    if (is_q) {
+        read_vec_element(s, tcg_idx, rm, 1, MO_64);
+        gen_helper_simd_tbl(tcg_resh, cpu_env, tcg_resh, tcg_idx,
+                            tcg_regno, tcg_numregs);
+    }
+    tcg_temp_free_i64(tcg_idx);
+    tcg_temp_free_i32(tcg_regno);
+    tcg_temp_free_i32(tcg_numregs);
+
+    write_vec_element(s, tcg_resl, rd, 0, MO_64);
+    tcg_temp_free_i64(tcg_resl);
+    write_vec_element(s, tcg_resh, rd, 1, MO_64);
+    tcg_temp_free_i64(tcg_resh);
 }
 
 /* C3.6.3 ZIP/UZP/TRN
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 06/38] target-arm: A64: Add SIMD ZIP/UZP/TRN
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (4 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 05/38] target-arm: A64: Add SIMD TBL/TBLX Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 07/38] target-arm: A64: Add SIMD across-lanes instructions Peter Maydell
                   ` (32 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Michael Matz <matz@suse.de>

Add support for the SIMD ZIP/UZIP/TRN instruction group
(C3.6.3).

Signed-off-by: Michael Matz <matz@suse.de>
[PMM: use new do_vec_get/set etc functions and generally update to new
 codebase standards; refactor to pull per-element loop outside switch]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/translate-a64.c | 76 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 75 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 7a6b00a..59e2a85 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -4806,7 +4806,81 @@ static void disas_simd_tb(DisasContext *s, uint32_t insn)
  */
 static void disas_simd_zip_trn(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int rd = extract32(insn, 0, 5);
+    int rn = extract32(insn, 5, 5);
+    int rm = extract32(insn, 16, 5);
+    int size = extract32(insn, 22, 2);
+    /* opc field bits [1:0] indicate ZIP/UZP/TRN;
+     * bit 2 indicates 1 vs 2 variant of the insn.
+     */
+    int opcode = extract32(insn, 12, 2);
+    bool part = extract32(insn, 14, 1);
+    bool is_q = extract32(insn, 30, 1);
+    int esize = 8 << size;
+    int i, ofs;
+    int datasize = is_q ? 128 : 64;
+    int elements = datasize / esize;
+    TCGv_i64 tcg_res, tcg_resl, tcg_resh;
+
+    if (opcode == 0 || (size == 3 && !is_q)) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    tcg_resl = tcg_const_i64(0);
+    tcg_resh = tcg_const_i64(0);
+    tcg_res = tcg_temp_new_i64();
+
+    for (i = 0; i < elements; i++) {
+        switch (opcode) {
+        case 1: /* UZP1/2 */
+        {
+            int midpoint = elements / 2;
+            if (i < midpoint) {
+                read_vec_element(s, tcg_res, rn, 2 * i + part, size);
+            } else {
+                read_vec_element(s, tcg_res, rm,
+                                 2 * (i - midpoint) + part, size);
+            }
+            break;
+        }
+        case 2: /* TRN1/2 */
+            if (i & 1) {
+                read_vec_element(s, tcg_res, rm, (i & ~1) + part, size);
+            } else {
+                read_vec_element(s, tcg_res, rn, (i & ~1) + part, size);
+            }
+            break;
+        case 3: /* ZIP1/2 */
+        {
+            int base = part * elements / 2;
+            if (i & 1) {
+                read_vec_element(s, tcg_res, rm, base + (i >> 1), size);
+            } else {
+                read_vec_element(s, tcg_res, rn, base + (i >> 1), size);
+            }
+            break;
+        }
+        default:
+            g_assert_not_reached();
+        }
+
+        ofs = i * esize;
+        if (ofs < 64) {
+            tcg_gen_shli_i64(tcg_res, tcg_res, ofs);
+            tcg_gen_or_i64(tcg_resl, tcg_resl, tcg_res);
+        } else {
+            tcg_gen_shli_i64(tcg_res, tcg_res, ofs - 64);
+            tcg_gen_or_i64(tcg_resh, tcg_resh, tcg_res);
+        }
+    }
+
+    tcg_temp_free_i64(tcg_res);
+
+    write_vec_element(s, tcg_resl, rd, 0, MO_64);
+    tcg_temp_free_i64(tcg_resl);
+    write_vec_element(s, tcg_resh, rd, 1, MO_64);
+    tcg_temp_free_i64(tcg_resh);
 }
 
 /* C3.6.4 AdvSIMD across lanes
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 07/38] target-arm: A64: Add SIMD across-lanes instructions
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (5 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 06/38] target-arm: A64: Add SIMD ZIP/UZP/TRN Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 08/38] target-arm: A64: Add SIMD copy operations Peter Maydell
                   ` (31 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Michael Matz <matz@suse.de>

Add support for the SIMD "across lanes" instruction group (C3.6.4).

Signed-off-by: Michael Matz <matz@suse.de>
[PMM: Updated to current codebase, added fp min/max ops,
 added unallocated encoding checks]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/translate-a64.c | 177 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 176 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 59e2a85..de4f518 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -4883,6 +4883,29 @@ static void disas_simd_zip_trn(DisasContext *s, uint32_t insn)
     tcg_temp_free_i64(tcg_resh);
 }
 
+static void do_minmaxop(DisasContext *s, TCGv_i32 tcg_elt1, TCGv_i32 tcg_elt2,
+                        int opc, bool is_min, TCGv_ptr fpst)
+{
+    /* Helper function for disas_simd_across_lanes: do a single precision
+     * min/max operation on the specified two inputs,
+     * and return the result in tcg_elt1.
+     */
+    if (opc == 0xc) {
+        if (is_min) {
+            gen_helper_vfp_minnums(tcg_elt1, tcg_elt1, tcg_elt2, fpst);
+        } else {
+            gen_helper_vfp_maxnums(tcg_elt1, tcg_elt1, tcg_elt2, fpst);
+        }
+    } else {
+        assert(opc == 0xf);
+        if (is_min) {
+            gen_helper_vfp_mins(tcg_elt1, tcg_elt1, tcg_elt2, fpst);
+        } else {
+            gen_helper_vfp_maxs(tcg_elt1, tcg_elt1, tcg_elt2, fpst);
+        }
+    }
+}
+
 /* C3.6.4 AdvSIMD across lanes
  *   31  30  29 28       24 23  22 21       17 16    12 11 10 9    5 4    0
  * +---+---+---+-----------+------+-----------+--------+-----+------+------+
@@ -4891,7 +4914,159 @@ static void disas_simd_zip_trn(DisasContext *s, uint32_t insn)
  */
 static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int rd = extract32(insn, 0, 5);
+    int rn = extract32(insn, 5, 5);
+    int size = extract32(insn, 22, 2);
+    int opcode = extract32(insn, 12, 5);
+    bool is_q = extract32(insn, 30, 1);
+    bool is_u = extract32(insn, 29, 1);
+    bool is_fp = false;
+    bool is_min = false;
+    int esize;
+    int elements;
+    int i;
+    TCGv_i64 tcg_res, tcg_elt;
+
+    switch (opcode) {
+    case 0x1b: /* ADDV */
+        if (is_u) {
+            unallocated_encoding(s);
+            return;
+        }
+        /* fall through */
+    case 0x3: /* SADDLV, UADDLV */
+    case 0xa: /* SMAXV, UMAXV */
+    case 0x1a: /* SMINV, UMINV */
+        if (size == 3 || (size == 2 && !is_q)) {
+            unallocated_encoding(s);
+            return;
+        }
+        break;
+    case 0xc: /* FMAXNMV, FMINNMV */
+    case 0xf: /* FMAXV, FMINV */
+        if (!is_u || !is_q || extract32(size, 0, 1)) {
+            unallocated_encoding(s);
+            return;
+        }
+        /* Bit 1 of size field encodes min vs max, and actual size is always
+         * 32 bits: adjust the size variable so following code can rely on it
+         */
+        is_min = extract32(size, 1, 1);
+        is_fp = true;
+        size = 2;
+        break;
+    default:
+        unallocated_encoding(s);
+        return;
+    }
+
+    esize = 8 << size;
+    elements = (is_q ? 128 : 64) / esize;
+
+    tcg_res = tcg_temp_new_i64();
+    tcg_elt = tcg_temp_new_i64();
+
+    /* These instructions operate across all lanes of a vector
+     * to produce a single result. We can guarantee that a 64
+     * bit intermediate is sufficient:
+     *  + for [US]ADDLV the maximum element size is 32 bits, and
+     *    the result type is 64 bits
+     *  + for FMAX*V, FMIN*V, ADDV the intermediate type is the
+     *    same as the element size, which is 32 bits at most
+     * For the integer operations we can choose to work at 64
+     * or 32 bits and truncate at the end; for simplicity
+     * we use 64 bits always. The floating point
+     * ops do require 32 bit intermediates, though.
+     */
+    if (!is_fp) {
+        read_vec_element(s, tcg_res, rn, 0, size | (is_u ? 0 : MO_SIGN));
+
+        for (i = 1; i < elements; i++) {
+            read_vec_element(s, tcg_elt, rn, i, size | (is_u ? 0 : MO_SIGN));
+
+            switch (opcode) {
+            case 0x03: /* SADDLV / UADDLV */
+            case 0x1b: /* ADDV */
+                tcg_gen_add_i64(tcg_res, tcg_res, tcg_elt);
+                break;
+            case 0x0a: /* SMAXV / UMAXV */
+                tcg_gen_movcond_i64(is_u ? TCG_COND_GEU : TCG_COND_GE,
+                                    tcg_res,
+                                    tcg_res, tcg_elt, tcg_res, tcg_elt);
+                break;
+            case 0x1a: /* SMINV / UMINV */
+                tcg_gen_movcond_i64(is_u ? TCG_COND_LEU : TCG_COND_LE,
+                                    tcg_res,
+                                    tcg_res, tcg_elt, tcg_res, tcg_elt);
+                break;
+                break;
+            default:
+                g_assert_not_reached();
+            }
+
+        }
+    } else {
+        /* Floating point ops which work on 32 bit (single) intermediates.
+         * Note that correct NaN propagation requires that we do these
+         * operations in exactly the order specified by the pseudocode.
+         */
+        TCGv_i32 tcg_elt1 = tcg_temp_new_i32();
+        TCGv_i32 tcg_elt2 = tcg_temp_new_i32();
+        TCGv_i32 tcg_elt3 = tcg_temp_new_i32();
+        TCGv_ptr fpst = get_fpstatus_ptr();
+
+        assert(esize == 32);
+        assert(elements == 4);
+
+        read_vec_element(s, tcg_elt, rn, 0, MO_32);
+        tcg_gen_trunc_i64_i32(tcg_elt1, tcg_elt);
+        read_vec_element(s, tcg_elt, rn, 1, MO_32);
+        tcg_gen_trunc_i64_i32(tcg_elt2, tcg_elt);
+
+        do_minmaxop(s, tcg_elt1, tcg_elt2, opcode, is_min, fpst);
+
+        read_vec_element(s, tcg_elt, rn, 2, MO_32);
+        tcg_gen_trunc_i64_i32(tcg_elt2, tcg_elt);
+        read_vec_element(s, tcg_elt, rn, 3, MO_32);
+        tcg_gen_trunc_i64_i32(tcg_elt3, tcg_elt);
+
+        do_minmaxop(s, tcg_elt2, tcg_elt3, opcode, is_min, fpst);
+
+        do_minmaxop(s, tcg_elt1, tcg_elt2, opcode, is_min, fpst);
+
+        tcg_gen_extu_i32_i64(tcg_res, tcg_elt1);
+        tcg_temp_free_i32(tcg_elt1);
+        tcg_temp_free_i32(tcg_elt2);
+        tcg_temp_free_i32(tcg_elt3);
+        tcg_temp_free_ptr(fpst);
+    }
+
+    tcg_temp_free_i64(tcg_elt);
+
+    /* Now truncate the result to the width required for the final output */
+    if (opcode == 0x03) {
+        /* SADDLV, UADDLV: result is 2*esize */
+        size++;
+    }
+
+    switch (size) {
+    case 0:
+        tcg_gen_ext8u_i64(tcg_res, tcg_res);
+        break;
+    case 1:
+        tcg_gen_ext16u_i64(tcg_res, tcg_res);
+        break;
+    case 2:
+        tcg_gen_ext32u_i64(tcg_res, tcg_res);
+        break;
+    case 3:
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    write_fp_dreg(s, rd, tcg_res);
+    tcg_temp_free_i64(tcg_res);
 }
 
 /* C3.6.5 AdvSIMD copy
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 08/38] target-arm: A64: Add SIMD copy operations
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (6 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 07/38] target-arm: A64: Add SIMD across-lanes instructions Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 09/38] target-arm: A64: Add SIMD modified immediate group Peter Maydell
                   ` (30 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Alex Bennée <alex.bennee@linaro.org>

This adds support for the all the AdvSIMD vector copy operations
(ARM ARM 3.6.5).

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/translate-a64.c | 210 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 209 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index de4f518..6521f73 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -5069,6 +5069,173 @@ static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
     tcg_temp_free_i64(tcg_res);
 }
 
+/* C6.3.31 DUP (Element, Vector)
+ *
+ *  31  30   29              21 20    16 15        10  9    5 4    0
+ * +---+---+-------------------+--------+-------------+------+------+
+ * | 0 | Q | 0 0 1 1 1 0 0 0 0 |  imm5  | 0 0 0 0 0 1 |  Rn  |  Rd  |
+ * +---+---+-------------------+--------+-------------+------+------+
+ *
+ * size: encoded in imm5 (see ARM ARM LowestSetBit())
+ */
+static void handle_simd_dupe(DisasContext *s, int is_q, int rd, int rn,
+                             int imm5)
+{
+    int size = ctz32(imm5);
+    int esize = 8 << size;
+    int elements = (is_q ? 128 : 64) / esize;
+    int index, i;
+    TCGv_i64 tmp;
+
+    if (size > 3 || (size == 3 && !is_q)) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    index = imm5 >> (size + 1);
+
+    tmp = tcg_temp_new_i64();
+    read_vec_element(s, tmp, rn, index, size);
+
+    for (i = 0; i < elements; i++) {
+        write_vec_element(s, tmp, rd, i, size);
+    }
+
+    if (!is_q) {
+        clear_vec_high(s, rd);
+    }
+
+    tcg_temp_free_i64(tmp);
+}
+
+/* C6.3.32 DUP (General)
+ *
+ *  31  30   29              21 20    16 15        10  9    5 4    0
+ * +---+---+-------------------+--------+-------------+------+------+
+ * | 0 | Q | 0 0 1 1 1 0 0 0 0 |  imm5  | 0 0 0 0 1 1 |  Rn  |  Rd  |
+ * +---+---+-------------------+--------+-------------+------+------+
+ *
+ * size: encoded in imm5 (see ARM ARM LowestSetBit())
+ */
+static void handle_simd_dupg(DisasContext *s, int is_q, int rd, int rn,
+                             int imm5)
+{
+    int size = ctz32(imm5);
+    int esize = 8 << size;
+    int elements = (is_q ? 128 : 64)/esize;
+    int i = 0;
+
+    if (size > 3 || ((size == 3) && !is_q)) {
+        unallocated_encoding(s);
+        return;
+    }
+    for (i = 0; i < elements; i++) {
+        write_vec_element(s, cpu_reg(s, rn), rd, i, size);
+    }
+    if (!is_q) {
+        clear_vec_high(s, rd);
+    }
+}
+
+/* C6.3.150 INS (Element)
+ *
+ *  31                   21 20    16 15  14    11  10 9    5 4    0
+ * +-----------------------+--------+------------+---+------+------+
+ * | 0 1 1 0 1 1 1 0 0 0 0 |  imm5  | 0 |  imm4  | 1 |  Rn  |  Rd  |
+ * +-----------------------+--------+------------+---+------+------+
+ *
+ * size: encoded in imm5 (see ARM ARM LowestSetBit())
+ * index: encoded in imm5<4:size+1>
+ */
+static void handle_simd_inse(DisasContext *s, int rd, int rn,
+                             int imm4, int imm5)
+{
+    int size = ctz32(imm5);
+    int src_index, dst_index;
+    TCGv_i64 tmp;
+
+    if (size > 3) {
+        unallocated_encoding(s);
+        return;
+    }
+    dst_index = extract32(imm5, 1+size, 5);
+    src_index = extract32(imm4, size, 4);
+
+    tmp = tcg_temp_new_i64();
+
+    read_vec_element(s, tmp, rn, src_index, size);
+    write_vec_element(s, tmp, rd, dst_index, size);
+
+    tcg_temp_free_i64(tmp);
+}
+
+
+/* C6.3.151 INS (General)
+ *
+ *  31                   21 20    16 15        10  9    5 4    0
+ * +-----------------------+--------+-------------+------+------+
+ * | 0 1 0 0 1 1 1 0 0 0 0 |  imm5  | 0 0 0 1 1 1 |  Rn  |  Rd  |
+ * +-----------------------+--------+-------------+------+------+
+ *
+ * size: encoded in imm5 (see ARM ARM LowestSetBit())
+ * index: encoded in imm5<4:size+1>
+ */
+static void handle_simd_insg(DisasContext *s, int rd, int rn, int imm5)
+{
+    int size = ctz32(imm5);
+    int idx;
+
+    if (size > 3) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    idx = extract32(imm5, 1 + size, 4 - size);
+    write_vec_element(s, cpu_reg(s, rn), rd, idx, size);
+}
+
+/*
+ * C6.3.321 UMOV (General)
+ * C6.3.237 SMOV (General)
+ *
+ *  31  30   29              21 20    16 15    12   10 9    5 4    0
+ * +---+---+-------------------+--------+-------------+------+------+
+ * | 0 | Q | 0 0 1 1 1 0 0 0 0 |  imm5  | 0 0 1 U 1 1 |  Rn  |  Rd  |
+ * +---+---+-------------------+--------+-------------+------+------+
+ *
+ * U: unsigned when set
+ * size: encoded in imm5 (see ARM ARM LowestSetBit())
+ */
+static void handle_simd_umov_smov(DisasContext *s, int is_q, int is_signed,
+                                  int rn, int rd, int imm5)
+{
+    int size = ctz32(imm5);
+    int element;
+    TCGv_i64 tcg_rd;
+
+    /* Check for UnallocatedEncodings */
+    if (is_signed) {
+        if (size > 2 || (size == 2 && !is_q)) {
+            unallocated_encoding(s);
+            return;
+        }
+    } else {
+        if (size > 3
+            || (size < 3 && is_q)
+            || (size == 3 && !is_q)) {
+            unallocated_encoding(s);
+            return;
+        }
+    }
+    element = extract32(imm5, 1+size, 4);
+
+    tcg_rd = cpu_reg(s, rd);
+    read_vec_element(s, tcg_rd, rn, element, size | (is_signed ? MO_SIGN : 0));
+    if (is_signed && !is_q) {
+        tcg_gen_ext32u_i64(tcg_rd, tcg_rd);
+    }
+}
+
 /* C3.6.5 AdvSIMD copy
  *   31  30  29  28             21 20  16 15  14  11 10  9    5 4    0
  * +---+---+----+-----------------+------+---+------+---+------+------+
@@ -5077,7 +5244,48 @@ static void disas_simd_across_lanes(DisasContext *s, uint32_t insn)
  */
 static void disas_simd_copy(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int rd = extract32(insn, 0, 5);
+    int rn = extract32(insn, 5, 5);
+    int imm4 = extract32(insn, 11, 4);
+    int op = extract32(insn, 29, 1);
+    int is_q = extract32(insn, 30, 1);
+    int imm5 = extract32(insn, 16, 5);
+
+    if (op) {
+        if (is_q) {
+            /* INS (element) */
+            handle_simd_inse(s, rd, rn, imm4, imm5);
+        } else {
+            unallocated_encoding(s);
+        }
+    } else {
+        switch (imm4) {
+        case 0:
+            /* DUP (element - vector) */
+            handle_simd_dupe(s, is_q, rd, rn, imm5);
+            break;
+        case 1:
+            /* DUP (general) */
+            handle_simd_dupg(s, is_q, rd, rn, imm5);
+            break;
+        case 3:
+            if (is_q) {
+                /* INS (general) */
+                handle_simd_insg(s, rd, rn, imm5);
+            } else {
+                unallocated_encoding(s);
+            }
+            break;
+        case 5:
+        case 7:
+            /* UMOV/SMOV (is_q indicates 32/64; imm4 indicates signedness) */
+            handle_simd_umov_smov(s, is_q, (imm4 == 5), rn, rd, imm5);
+            break;
+        default:
+            unallocated_encoding(s);
+            break;
+        }
+    }
 }
 
 /* C3.6.6 AdvSIMD modified immediate
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 09/38] target-arm: A64: Add SIMD modified immediate group
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (7 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 08/38] target-arm: A64: Add SIMD copy operations Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 10/38] target-arm: A64: Add SIMD scalar copy instructions Peter Maydell
                   ` (29 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Alex Bennée <alex.bennee@linaro.org>

This patch adds support for the AdvSIMD modified immediate group
(C3.6.6) with all its suboperations (movi, orr, fmov, mvni, bic).

Signed-off-by: Alexander Graf <agraf@suse.de>
[AJB: new decode struct, minor bug fixes, optimisation]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/translate-a64.c | 120 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 119 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 6521f73..69c75c0 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -5293,10 +5293,128 @@ static void disas_simd_copy(DisasContext *s, uint32_t insn)
  * +---+---+----+---------------------+-----+-------+----+---+-------+------+
  * | 0 | Q | op | 0 1 1 1 1 0 0 0 0 0 | abc | cmode | o2 | 1 | defgh |  Rd  |
  * +---+---+----+---------------------+-----+-------+----+---+-------+------+
+ *
+ * There are a number of operations that can be carried out here:
+ *   MOVI - move (shifted) imm into register
+ *   MVNI - move inverted (shifted) imm into register
+ *   ORR  - bitwise OR of (shifted) imm with register
+ *   BIC  - bitwise clear of (shifted) imm with register
  */
 static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int rd = extract32(insn, 0, 5);
+    int cmode = extract32(insn, 12, 4);
+    int cmode_3_1 = extract32(cmode, 1, 3);
+    int cmode_0 = extract32(cmode, 0, 1);
+    int o2 = extract32(insn, 11, 1);
+    uint64_t abcdefgh = extract32(insn, 5, 5) | (extract32(insn, 16, 3) << 5);
+    bool is_neg = extract32(insn, 29, 1);
+    bool is_q = extract32(insn, 30, 1);
+    uint64_t imm = 0;
+    TCGv_i64 tcg_rd, tcg_imm;
+    int i;
+
+    if (o2 != 0 || ((cmode == 0xf) && is_neg && !is_q)) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    /* See AdvSIMDExpandImm() in ARM ARM */
+    switch (cmode_3_1) {
+    case 0: /* Replicate(Zeros(24):imm8, 2) */
+    case 1: /* Replicate(Zeros(16):imm8:Zeros(8), 2) */
+    case 2: /* Replicate(Zeros(8):imm8:Zeros(16), 2) */
+    case 3: /* Replicate(imm8:Zeros(24), 2) */
+    {
+        int shift = cmode_3_1 * 8;
+        imm = bitfield_replicate(abcdefgh << shift, 32);
+        break;
+    }
+    case 4: /* Replicate(Zeros(8):imm8, 4) */
+    case 5: /* Replicate(imm8:Zeros(8), 4) */
+    {
+        int shift = (cmode_3_1 & 0x1) * 8;
+        imm = bitfield_replicate(abcdefgh << shift, 16);
+        break;
+    }
+    case 6:
+        if (cmode_0) {
+            /* Replicate(Zeros(8):imm8:Ones(16), 2) */
+            imm = (abcdefgh << 16) | 0xffff;
+        } else {
+            /* Replicate(Zeros(16):imm8:Ones(8), 2) */
+            imm = (abcdefgh << 8) | 0xff;
+        }
+        imm = bitfield_replicate(imm, 32);
+        break;
+    case 7:
+        if (!cmode_0 && !is_neg) {
+            imm = bitfield_replicate(abcdefgh, 8);
+        } else if (!cmode_0 && is_neg) {
+            int i;
+            imm = 0;
+            for (i = 0; i < 8; i++) {
+                if ((abcdefgh) & (1 << i)) {
+                    imm |= 0xffULL << (i * 8);
+                }
+            }
+        } else if (cmode_0) {
+            if (is_neg) {
+                imm = (abcdefgh & 0x3f) << 48;
+                if (abcdefgh & 0x80) {
+                    imm |= 0x8000000000000000ULL;
+                }
+                if (abcdefgh & 0x40) {
+                    imm |= 0x3fc0000000000000ULL;
+                } else {
+                    imm |= 0x4000000000000000ULL;
+                }
+            } else {
+                imm = (abcdefgh & 0x3f) << 19;
+                if (abcdefgh & 0x80) {
+                    imm |= 0x80000000;
+                }
+                if (abcdefgh & 0x40) {
+                    imm |= 0x3e000000;
+                } else {
+                    imm |= 0x40000000;
+                }
+                imm |= (imm << 32);
+            }
+        }
+        break;
+    }
+
+    if (cmode_3_1 != 7 && is_neg) {
+        imm = ~imm;
+    }
+
+    tcg_imm = tcg_const_i64(imm);
+    tcg_rd = new_tmp_a64(s);
+
+    for (i = 0; i < 2; i++) {
+        int foffs = i ? fp_reg_hi_offset(rd) : fp_reg_offset(rd, MO_64);
+
+        if (i == 1 && !is_q) {
+            /* non-quad ops clear high half of vector */
+            tcg_gen_movi_i64(tcg_rd, 0);
+        } else if ((cmode & 0x9) == 0x1 || (cmode & 0xd) == 0x9) {
+            tcg_gen_ld_i64(tcg_rd, cpu_env, foffs);
+            if (is_neg) {
+                /* AND (BIC) */
+                tcg_gen_and_i64(tcg_rd, tcg_rd, tcg_imm);
+            } else {
+                /* ORR */
+                tcg_gen_or_i64(tcg_rd, tcg_rd, tcg_imm);
+            }
+        } else {
+            /* MOVI */
+            tcg_gen_mov_i64(tcg_rd, tcg_imm);
+        }
+        tcg_gen_st_i64(tcg_rd, cpu_env, foffs);
+    }
+
+    tcg_temp_free_i64(tcg_imm);
 }
 
 /* C3.6.7 AdvSIMD scalar copy
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 10/38] target-arm: A64: Add SIMD scalar copy instructions
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (8 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 09/38] target-arm: A64: Add SIMD modified immediate group Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 11/38] hw/arm/boot: Don't set up ATAGS for autogenerated dtb booting Peter Maydell
                   ` (28 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

Add support for the SIMD scalar copy instruction group (C3.6.7),
which consists of the single instruction DUP (element, scalar).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/translate-a64.c | 43 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 69c75c0..7cfb55b 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -5108,6 +5108,35 @@ static void handle_simd_dupe(DisasContext *s, int is_q, int rd, int rn,
     tcg_temp_free_i64(tmp);
 }
 
+/* C6.3.31 DUP (element, scalar)
+ *  31                   21 20    16 15        10  9    5 4    0
+ * +-----------------------+--------+-------------+------+------+
+ * | 0 1 0 1 1 1 1 0 0 0 0 |  imm5  | 0 0 0 0 0 1 |  Rn  |  Rd  |
+ * +-----------------------+--------+-------------+------+------+
+ */
+static void handle_simd_dupes(DisasContext *s, int rd, int rn,
+                              int imm5)
+{
+    int size = ctz32(imm5);
+    int index;
+    TCGv_i64 tmp;
+
+    if (size > 3) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    index = imm5 >> (size + 1);
+
+    /* This instruction just extracts the specified element and
+     * zero-extends it into the bottom of the destination register.
+     */
+    tmp = tcg_temp_new_i64();
+    read_vec_element(s, tmp, rn, index, size);
+    write_fp_dreg(s, rd, tmp);
+    tcg_temp_free_i64(tmp);
+}
+
 /* C6.3.32 DUP (General)
  *
  *  31  30   29              21 20    16 15        10  9    5 4    0
@@ -5425,7 +5454,19 @@ static void disas_simd_mod_imm(DisasContext *s, uint32_t insn)
  */
 static void disas_simd_scalar_copy(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int rd = extract32(insn, 0, 5);
+    int rn = extract32(insn, 5, 5);
+    int imm4 = extract32(insn, 11, 4);
+    int imm5 = extract32(insn, 16, 5);
+    int op = extract32(insn, 29, 1);
+
+    if (op != 0 || imm4 != 0) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    /* DUP (element, scalar) */
+    handle_simd_dupes(s, rd, rn, imm5);
 }
 
 /* C3.6.8 AdvSIMD scalar pairwise
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 11/38] hw/arm/boot: Don't set up ATAGS for autogenerated dtb booting
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (9 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 10/38] target-arm: A64: Add SIMD scalar copy instructions Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 12/38] ARM: Convert MIDR to a property Peter Maydell
                   ` (27 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

The code which decides whether to set up the ATAGS data structure on
reset was using the wrong conditional, which meant we were creating
an ATAGS structure when doing a device-tree boot if the dtb was
autogenerated by the board. This is harmless, but unnecessary, so
bring it in to line with user-provided-dtb boots.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: 1388326833-656-1-git-send-email-peter.maydell@linaro.org
---
 hw/arm/boot.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 1c1b0e5..4036262 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -173,6 +173,11 @@ static void default_reset_secondary(ARMCPU *cpu,
     env->regs[15] = info->smp_loader_start;
 }
 
+static inline bool have_dtb(const struct arm_boot_info *info)
+{
+    return info->dtb_filename || info->get_dtb;
+}
+
 #define WRITE_WORD(p, value) do { \
     stl_phys_notdirty(p, value);  \
     p += 4;                       \
@@ -421,7 +426,7 @@ static void do_cpu_reset(void *opaque)
                     env->regs[15] = info->loader_start;
                 }
 
-                if (!info->dtb_filename) {
+                if (!have_dtb(info)) {
                     if (old_param) {
                         set_kernel_args_old(info);
                     } else {
@@ -542,7 +547,7 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info *info)
         /* for device tree boot, we pass the DTB directly in r2. Otherwise
          * we point to the kernel args.
          */
-        if (info->dtb_filename || info->get_dtb) {
+        if (have_dtb(info)) {
             /* Place the DTB after the initrd in memory. Note that some
              * kernels will trash anything in the 4K page the initrd
              * ends in, so make sure the DTB isn't caught up in that.
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 12/38] ARM: Convert MIDR to a property
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (10 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 11/38] hw/arm/boot: Don't set up ATAGS for autogenerated dtb booting Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 13/38] ZYNQ: Implement board MIDR control for Zynq Peter Maydell
                   ` (26 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Alistair Francis <alistair.francis@xilinx.com>

Convert the MIDR register to a property. This allows boards to later set
a custom MIDR value. This has been done in such a way to maintain
compatibility with all existing CPUs and boards

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: 878613f2f12d4162f12629522fd99de8df904856.1390176489.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/cpu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target-arm/cpu.c b/target-arm/cpu.c
index 52efd5d..45ad7f0 100644
--- a/target-arm/cpu.c
+++ b/target-arm/cpu.c
@@ -982,6 +982,7 @@ static const ARMCPUInfo arm_cpus[] = {
 
 static Property arm_cpu_properties[] = {
     DEFINE_PROP_BOOL("start-powered-off", ARMCPU, start_powered_off, false),
+    DEFINE_PROP_UINT32("midr", ARMCPU, midr, 0),
     DEFINE_PROP_END_OF_LIST()
 };
 
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 13/38] ZYNQ: Implement board MIDR control for Zynq
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (11 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 12/38] ARM: Convert MIDR to a property Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 14/38] display: avoid multi-statement macro Peter Maydell
                   ` (25 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Alistair Francis <alistair.francis@xilinx.com>

This patch uses the fact that the midr variable is now a property
This patch sets the midr variable to the boards custom midr

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: a3754b10d150af72e4688a993e484fa2b9b8fa21.1390176489.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/arm/xilinx_zynq.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/hw/arm/xilinx_zynq.c b/hw/arm/xilinx_zynq.c
index 98e0958..9ee21e7 100644
--- a/hw/arm/xilinx_zynq.c
+++ b/hw/arm/xilinx_zynq.c
@@ -37,6 +37,7 @@
 #define IRQ_OFFSET 32 /* pic interrupts start from index 32 */
 
 #define MPCORE_PERIPHBASE 0xF8F00000
+#define ZYNQ_BOARD_MIDR 0x413FC090
 
 static const int dma_irqs[8] = {
     46, 47, 48, 49, 72, 73, 74, 75
@@ -125,6 +126,12 @@ static void zynq_init(QEMUMachineInitArgs *args)
 
     cpu = ARM_CPU(object_new(object_class_get_name(cpu_oc)));
 
+    object_property_set_int(OBJECT(cpu), ZYNQ_BOARD_MIDR, "midr", &err);
+    if (err) {
+        error_report("%s", error_get_pretty(err));
+        exit(1);
+    }
+
     object_property_set_int(OBJECT(cpu), MPCORE_PERIPHBASE, "reset-cbar", &err);
     if (err) {
         error_report("%s", error_get_pretty(err));
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 14/38] display: avoid multi-statement macro
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (12 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 13/38] ZYNQ: Implement board MIDR control for Zynq Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 15/38] target-arm: Move arm_rmode_to_sf to a shared location Peter Maydell
                   ` (24 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Paolo Bonzini <pbonzini@redhat.com>

For blizzard, pl110 and tc6393xb this is harmless, but for pxa2xx
Coverity noticed that it is used inside an "if" statement.
Fix it because it's the file with the highest number of defects
in the whole QEMU tree!  Use "do...while (0)", or just remove the
semicolon if there's a single statement in the macro.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/display/blizzard_template.h | 40 +++++++++++++++++++++++++---------------
 hw/display/pl110_template.h    | 12 ++++++++----
 hw/display/pxa2xx_template.h   | 22 +++++++++++++++++-----
 hw/display/tc6393xb_template.h | 14 +++++++++-----
 4 files changed, 59 insertions(+), 29 deletions(-)

diff --git a/hw/display/blizzard_template.h b/hw/display/blizzard_template.h
index a8a8899..b7ef27c 100644
--- a/hw/display/blizzard_template.h
+++ b/hw/display/blizzard_template.h
@@ -18,25 +18,35 @@
  * with this program; if not, see <http://www.gnu.org/licenses/>.
  */
 
-#define SKIP_PIXEL(to)		to += deststep
+#define SKIP_PIXEL(to)         (to += deststep)
 #if DEPTH == 8
-# define PIXEL_TYPE		uint8_t
-# define COPY_PIXEL(to, from)	*to = from; SKIP_PIXEL(to)
-# define COPY_PIXEL1(to, from)	*to ++ = from
+# define PIXEL_TYPE            uint8_t
+# define COPY_PIXEL(to, from)  do { *to = from; SKIP_PIXEL(to); } while (0)
+# define COPY_PIXEL1(to, from) (*to++ = from)
 #elif DEPTH == 15 || DEPTH == 16
-# define PIXEL_TYPE		uint16_t
-# define COPY_PIXEL(to, from)	*to = from; SKIP_PIXEL(to)
-# define COPY_PIXEL1(to, from)	*to ++ = from
+# define PIXEL_TYPE            uint16_t
+# define COPY_PIXEL(to, from)  do { *to = from; SKIP_PIXEL(to); } while (0)
+# define COPY_PIXEL1(to, from) (*to++ = from)
 #elif DEPTH == 24
-# define PIXEL_TYPE		uint8_t
-# define COPY_PIXEL(to, from)	\
-    to[0] = from; to[1] = (from) >> 8; to[2] = (from) >> 16; SKIP_PIXEL(to)
-# define COPY_PIXEL1(to, from)	\
-    *to ++ = from; *to ++ = (from) >> 8; *to ++ = (from) >> 16
+# define PIXEL_TYPE            uint8_t
+# define COPY_PIXEL(to, from) \
+    do {                      \
+        to[0] = from;         \
+        to[1] = (from) >> 8;  \
+        to[2] = (from) >> 16; \
+        SKIP_PIXEL(to);       \
+    } while (0)
+
+# define COPY_PIXEL1(to, from) \
+    do {                       \
+        *to++ = from;          \
+        *to++ = (from) >> 8;   \
+        *to++ = (from) >> 16;  \
+    } while (0)
 #elif DEPTH == 32
-# define PIXEL_TYPE		uint32_t
-# define COPY_PIXEL(to, from)	*to = from; SKIP_PIXEL(to)
-# define COPY_PIXEL1(to, from)	*to ++ = from
+# define PIXEL_TYPE            uint32_t
+# define COPY_PIXEL(to, from)  do { *to = from; SKIP_PIXEL(to); } while (0)
+# define COPY_PIXEL1(to, from) (*to++ = from)
 #else
 # error unknown bit depth
 #endif
diff --git a/hw/display/pl110_template.h b/hw/display/pl110_template.h
index e738e4a..36ba791 100644
--- a/hw/display/pl110_template.h
+++ b/hw/display/pl110_template.h
@@ -14,12 +14,16 @@
 #if BITS == 8
 #define COPY_PIXEL(to, from) *(to++) = from
 #elif BITS == 15 || BITS == 16
-#define COPY_PIXEL(to, from) *(uint16_t *)to = from; to += 2;
+#define COPY_PIXEL(to, from) do { *(uint16_t *)to = from; to += 2; } while (0)
 #elif BITS == 24
-#define COPY_PIXEL(to, from) \
-  *(to++) = from; *(to++) = (from) >> 8; *(to++) = (from) >> 16
+#define COPY_PIXEL(to, from)    \
+    do {                        \
+        *(to++) = from;         \
+        *(to++) = (from) >> 8;  \
+        *(to++) = (from) >> 16; \
+    } while (0)
 #elif BITS == 32
-#define COPY_PIXEL(to, from) *(uint32_t *)to = from; to += 4;
+#define COPY_PIXEL(to, from) do { *(uint32_t *)to = from; to += 4; } while (0)
 #else
 #error unknown bit depth
 #endif
diff --git a/hw/display/pxa2xx_template.h b/hw/display/pxa2xx_template.h
index 1cbe36c..c64eebc 100644
--- a/hw/display/pxa2xx_template.h
+++ b/hw/display/pxa2xx_template.h
@@ -11,14 +11,26 @@
 
 # define SKIP_PIXEL(to)		to += deststep
 #if BITS == 8
-# define COPY_PIXEL(to, from)	*to = from; SKIP_PIXEL(to)
+# define COPY_PIXEL(to, from)  do { *to = from; SKIP_PIXEL(to); } while (0)
 #elif BITS == 15 || BITS == 16
-# define COPY_PIXEL(to, from)	*(uint16_t *) to = from; SKIP_PIXEL(to)
+# define COPY_PIXEL(to, from)    \
+    do {                         \
+        *(uint16_t *) to = from; \
+        SKIP_PIXEL(to);          \
+    } while (0)
 #elif BITS == 24
-# define COPY_PIXEL(to, from)	\
-	*(uint16_t *) to = from; *(to + 2) = (from) >> 16; SKIP_PIXEL(to)
+# define COPY_PIXEL(to, from)     \
+    do {                          \
+        *(uint16_t *) to = from;  \
+        *(to + 2) = (from) >> 16; \
+        SKIP_PIXEL(to);           \
+    } while (0)
 #elif BITS == 32
-# define COPY_PIXEL(to, from)	*(uint32_t *) to = from; SKIP_PIXEL(to)
+# define COPY_PIXEL(to, from)    \
+    do {                         \
+        *(uint32_t *) to = from; \
+        SKIP_PIXEL(to);          \
+    } while (0)
 #else
 # error unknown bit depth
 #endif
diff --git a/hw/display/tc6393xb_template.h b/hw/display/tc6393xb_template.h
index 154aafd..78629c0 100644
--- a/hw/display/tc6393xb_template.h
+++ b/hw/display/tc6393xb_template.h
@@ -22,14 +22,18 @@
  */
 
 #if BITS == 8
-# define SET_PIXEL(addr, color)	*(uint8_t*)addr = color;
+# define SET_PIXEL(addr, color)  (*(uint8_t *)addr = color)
 #elif BITS == 15 || BITS == 16
-# define SET_PIXEL(addr, color)	*(uint16_t*)addr = color;
+# define SET_PIXEL(addr, color)  (*(uint16_t *)addr = color)
 #elif BITS == 24
-# define SET_PIXEL(addr, color)	\
-    addr[0] = color; addr[1] = (color) >> 8; addr[2] = (color) >> 16;
+# define SET_PIXEL(addr, color)  \
+    do {                         \
+        addr[0] = color;         \
+        addr[1] = (color) >> 8;  \
+        addr[2] = (color) >> 16; \
+    } while (0)
 #elif BITS == 32
-# define SET_PIXEL(addr, color)	*(uint32_t*)addr = color;
+# define SET_PIXEL(addr, color)  (*(uint32_t *)addr = color)
 #else
 # error unknown bit depth
 #endif
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 15/38] target-arm: Move arm_rmode_to_sf to a shared location.
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (13 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 14/38] display: avoid multi-statement macro Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 16/38] target-arm: Add AArch32 FP VRINTA, VRINTN, VRINTP and VRINTM Peter Maydell
                   ` (23 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Will Newton <will.newton@linaro.org>

This function will be needed for AArch32 ARMv8 support, so move it to
helper.c where it can be used by both targets. Also moves the code out
of line, but as it is quite a large function I don't believe this
should be a significant performance impact.

Signed-off-by: Will Newton <will.newton@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/cpu.h           |  2 ++
 target-arm/helper.c        | 28 ++++++++++++++++++++++++++++
 target-arm/translate-a64.c | 28 ----------------------------
 3 files changed, 30 insertions(+), 28 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 198b6b8..383c582 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -496,6 +496,8 @@ enum arm_fprounding {
     FPROUNDING_ODD
 };
 
+int arm_rmode_to_sf(int rmode);
+
 enum arm_cpu_mode {
   ARM_CPU_MODE_USR = 0x10,
   ARM_CPU_MODE_FIQ = 0x11,
diff --git a/target-arm/helper.c b/target-arm/helper.c
index c708f15..b1541b9 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -4418,3 +4418,31 @@ float64 HELPER(rintd)(float64 x, void *fp_status)
 
     return ret;
 }
+
+/* Convert ARM rounding mode to softfloat */
+int arm_rmode_to_sf(int rmode)
+{
+    switch (rmode) {
+    case FPROUNDING_TIEAWAY:
+        rmode = float_round_ties_away;
+        break;
+    case FPROUNDING_ODD:
+        /* FIXME: add support for TIEAWAY and ODD */
+        qemu_log_mask(LOG_UNIMP, "arm: unimplemented rounding mode: %d\n",
+                      rmode);
+    case FPROUNDING_TIEEVEN:
+    default:
+        rmode = float_round_nearest_even;
+        break;
+    case FPROUNDING_POSINF:
+        rmode = float_round_up;
+        break;
+    case FPROUNDING_NEGINF:
+        rmode = float_round_down;
+        break;
+    case FPROUNDING_ZERO:
+        rmode = float_round_to_zero;
+        break;
+    }
+    return rmode;
+}
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 7cfb55b..c22bbb5 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -3608,34 +3608,6 @@ static void disas_data_proc_reg(DisasContext *s, uint32_t insn)
     }
 }
 
-/* Convert ARM rounding mode to softfloat */
-static inline int arm_rmode_to_sf(int rmode)
-{
-    switch (rmode) {
-    case FPROUNDING_TIEAWAY:
-        rmode = float_round_ties_away;
-        break;
-    case FPROUNDING_ODD:
-        /* FIXME: add support for TIEAWAY and ODD */
-        qemu_log_mask(LOG_UNIMP, "arm: unimplemented rounding mode: %d\n",
-                      rmode);
-    case FPROUNDING_TIEEVEN:
-    default:
-        rmode = float_round_nearest_even;
-        break;
-    case FPROUNDING_POSINF:
-        rmode = float_round_up;
-        break;
-    case FPROUNDING_NEGINF:
-        rmode = float_round_down;
-        break;
-    case FPROUNDING_ZERO:
-        rmode = float_round_to_zero;
-        break;
-    }
-    return rmode;
-}
-
 static void handle_fp_compare(DisasContext *s, bool is_double,
                               unsigned int rn, unsigned int rm,
                               bool cmp_with_zero, bool signal_all_nans)
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 16/38] target-arm: Add AArch32 FP VRINTA, VRINTN, VRINTP and VRINTM
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (14 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 15/38] target-arm: Move arm_rmode_to_sf to a shared location Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 17/38] target-arm: Add support for AArch32 FP VRINTR Peter Maydell
                   ` (22 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Will Newton <will.newton@linaro.org>

Add support for AArch32 ARMv8 FP VRINTA, VRINTN, VRINTP and VRINTM
instructions.

Signed-off-by: Will Newton <will.newton@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/translate.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 8d240e1..2db6812 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -2759,6 +2759,56 @@ static int handle_vminmaxnm(uint32_t insn, uint32_t rd, uint32_t rn,
     return 0;
 }
 
+static int handle_vrint(uint32_t insn, uint32_t rd, uint32_t rm, uint32_t dp,
+                        int rounding)
+{
+    TCGv_ptr fpst = get_fpstatus_ptr(0);
+    TCGv_i32 tcg_rmode;
+
+    tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rounding));
+    gen_helper_set_rmode(tcg_rmode, tcg_rmode, cpu_env);
+
+    if (dp) {
+        TCGv_i64 tcg_op;
+        TCGv_i64 tcg_res;
+        tcg_op = tcg_temp_new_i64();
+        tcg_res = tcg_temp_new_i64();
+        tcg_gen_ld_f64(tcg_op, cpu_env, vfp_reg_offset(dp, rm));
+        gen_helper_rintd(tcg_res, tcg_op, fpst);
+        tcg_gen_st_f64(tcg_res, cpu_env, vfp_reg_offset(dp, rd));
+        tcg_temp_free_i64(tcg_op);
+        tcg_temp_free_i64(tcg_res);
+    } else {
+        TCGv_i32 tcg_op;
+        TCGv_i32 tcg_res;
+        tcg_op = tcg_temp_new_i32();
+        tcg_res = tcg_temp_new_i32();
+        tcg_gen_ld_f32(tcg_op, cpu_env, vfp_reg_offset(dp, rm));
+        gen_helper_rints(tcg_res, tcg_op, fpst);
+        tcg_gen_st_f32(tcg_res, cpu_env, vfp_reg_offset(dp, rd));
+        tcg_temp_free_i32(tcg_op);
+        tcg_temp_free_i32(tcg_res);
+    }
+
+    gen_helper_set_rmode(tcg_rmode, tcg_rmode, cpu_env);
+    tcg_temp_free_i32(tcg_rmode);
+
+    tcg_temp_free_ptr(fpst);
+    return 0;
+}
+
+
+/* Table for converting the most common AArch32 encoding of
+ * rounding mode to arm_fprounding order (which matches the
+ * common AArch64 order); see ARM ARM pseudocode FPDecodeRM().
+ */
+static const uint8_t fp_decode_rm[] = {
+    FPROUNDING_TIEAWAY,
+    FPROUNDING_TIEEVEN,
+    FPROUNDING_POSINF,
+    FPROUNDING_NEGINF,
+};
+
 static int disas_vfp_v8_insn(CPUARMState *env, DisasContext *s, uint32_t insn)
 {
     uint32_t rd, rn, rm, dp = extract32(insn, 8, 1);
@@ -2781,6 +2831,10 @@ static int disas_vfp_v8_insn(CPUARMState *env, DisasContext *s, uint32_t insn)
         return handle_vsel(insn, rd, rn, rm, dp);
     } else if ((insn & 0x0fb00e10) == 0x0e800a00) {
         return handle_vminmaxnm(insn, rd, rn, rm, dp);
+    } else if ((insn & 0x0fbc0ed0) == 0x0eb80a40) {
+        /* VRINTA, VRINTN, VRINTP, VRINTM */
+        int rounding = fp_decode_rm[extract32(insn, 16, 2)];
+        return handle_vrint(insn, rd, rm, dp, rounding);
     }
     return 1;
 }
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 17/38] target-arm: Add support for AArch32 FP VRINTR
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (15 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 16/38] target-arm: Add AArch32 FP VRINTA, VRINTN, VRINTP and VRINTM Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 18/38] target-arm: Add support for AArch32 FP VRINTZ Peter Maydell
                   ` (21 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Will Newton <will.newton@linaro.org>

Add support for the AArch32 floating-point VRINTR instruction.

Signed-off-by: Will Newton <will.newton@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/translate.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 2db6812..2b3157c 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -3379,6 +3379,17 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
                         gen_vfp_F1_ld0(dp);
                         gen_vfp_cmpe(dp);
                         break;
+                    case 12: /* vrintr */
+                    {
+                        TCGv_ptr fpst = get_fpstatus_ptr(0);
+                        if (dp) {
+                            gen_helper_rintd(cpu_F0d, cpu_F0d, fpst);
+                        } else {
+                            gen_helper_rints(cpu_F0s, cpu_F0s, fpst);
+                        }
+                        tcg_temp_free_ptr(fpst);
+                        break;
+                    }
                     case 15: /* single<->double conversion */
                         if (dp)
                             gen_helper_vfp_fcvtsd(cpu_F0s, cpu_F0d, cpu_env);
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 18/38] target-arm: Add support for AArch32 FP VRINTZ
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (16 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 17/38] target-arm: Add support for AArch32 FP VRINTR Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 19/38] target-arm: Add support for AArch32 FP VRINTX Peter Maydell
                   ` (20 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Will Newton <will.newton@linaro.org>

Add support for the AArch32 floating-point VRINTZ instruction.

Signed-off-by: Will Newton <will.newton@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/translate.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 2b3157c..9afb19f 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -3390,6 +3390,22 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
                         tcg_temp_free_ptr(fpst);
                         break;
                     }
+                    case 13: /* vrintz */
+                    {
+                        TCGv_ptr fpst = get_fpstatus_ptr(0);
+                        TCGv_i32 tcg_rmode;
+                        tcg_rmode = tcg_const_i32(float_round_to_zero);
+                        gen_helper_set_rmode(tcg_rmode, tcg_rmode, cpu_env);
+                        if (dp) {
+                            gen_helper_rintd(cpu_F0d, cpu_F0d, fpst);
+                        } else {
+                            gen_helper_rints(cpu_F0s, cpu_F0s, fpst);
+                        }
+                        gen_helper_set_rmode(tcg_rmode, tcg_rmode, cpu_env);
+                        tcg_temp_free_i32(tcg_rmode);
+                        tcg_temp_free_ptr(fpst);
+                        break;
+                    }
                     case 15: /* single<->double conversion */
                         if (dp)
                             gen_helper_vfp_fcvtsd(cpu_F0s, cpu_F0d, cpu_env);
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 19/38] target-arm: Add support for AArch32 FP VRINTX
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (17 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 18/38] target-arm: Add support for AArch32 FP VRINTZ Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 20/38] target-arm: Add support for AArch32 SIMD VRINTX Peter Maydell
                   ` (19 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Will Newton <will.newton@linaro.org>

Add support for the AArch32 floating-point VRINTX instruction.

Signed-off-by: Will Newton <will.newton@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/translate.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 9afb19f..9eb5b92 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -3406,6 +3406,17 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
                         tcg_temp_free_ptr(fpst);
                         break;
                     }
+                    case 14: /* vrintx */
+                    {
+                        TCGv_ptr fpst = get_fpstatus_ptr(0);
+                        if (dp) {
+                            gen_helper_rintd_exact(cpu_F0d, cpu_F0d, fpst);
+                        } else {
+                            gen_helper_rints_exact(cpu_F0s, cpu_F0s, fpst);
+                        }
+                        tcg_temp_free_ptr(fpst);
+                        break;
+                    }
                     case 15: /* single<->double conversion */
                         if (dp)
                             gen_helper_vfp_fcvtsd(cpu_F0s, cpu_F0d, cpu_env);
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 20/38] target-arm: Add support for AArch32 SIMD VRINTX
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (18 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 19/38] target-arm: Add support for AArch32 FP VRINTX Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 21/38] target-arm: Add set_neon_rmode helper Peter Maydell
                   ` (18 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Will Newton <will.newton@linaro.org>

Add support for the AArch32 Advanced SIMD VRINTX instruction.

Signed-off-by: Will Newton <will.newton@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/translate.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 9eb5b92..c179817 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -4709,6 +4709,7 @@ static const uint8_t neon_3r_sizes[] = {
 #define NEON_2RM_VMOVN 36 /* Includes VQMOVN, VQMOVUN */
 #define NEON_2RM_VQMOVN 37 /* Includes VQMOVUN */
 #define NEON_2RM_VSHLL 38
+#define NEON_2RM_VRINTX 41
 #define NEON_2RM_VCVT_F16_F32 44
 #define NEON_2RM_VCVT_F32_F16 46
 #define NEON_2RM_VRECPE 56
@@ -4724,7 +4725,7 @@ static int neon_2rm_is_float_op(int op)
 {
     /* Return true if this neon 2reg-misc op is float-to-float */
     return (op == NEON_2RM_VABS_F || op == NEON_2RM_VNEG_F ||
-            op >= NEON_2RM_VRECPE_F);
+            op == NEON_2RM_VRINTX || op >= NEON_2RM_VRECPE_F);
 }
 
 /* Each entry in this array has bit n set if the insn allows
@@ -4768,6 +4769,7 @@ static const uint8_t neon_2rm_sizes[] = {
     [NEON_2RM_VMOVN] = 0x7,
     [NEON_2RM_VQMOVN] = 0x7,
     [NEON_2RM_VSHLL] = 0x7,
+    [NEON_2RM_VRINTX] = 0x4,
     [NEON_2RM_VCVT_F16_F32] = 0x2,
     [NEON_2RM_VCVT_F32_F16] = 0x2,
     [NEON_2RM_VRECPE] = 0x4,
@@ -6480,6 +6482,13 @@ static int disas_neon_data_insn(CPUARMState * env, DisasContext *s, uint32_t ins
                             }
                             neon_store_reg(rm, pass, tmp2);
                             break;
+                        case NEON_2RM_VRINTX:
+                        {
+                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
+                            gen_helper_rints_exact(cpu_F0s, cpu_F0s, fpstatus);
+                            tcg_temp_free_ptr(fpstatus);
+                            break;
+                        }
                         case NEON_2RM_VRECPE:
                             gen_helper_recpe_u32(tmp, tmp, cpu_env);
                             break;
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 21/38] target-arm: Add set_neon_rmode helper
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (19 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 20/38] target-arm: Add support for AArch32 SIMD VRINTX Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 22/38] target-arm: Add AArch32 SIMD VRINTA, VRINTN, VRINTP, VRINTM, VRINTZ Peter Maydell
                   ` (17 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Will Newton <will.newton@linaro.org>

This helper sets the rounding mode in the standard_fp_status word to
allow NEON instructions to modify the rounding mode whilst using the
standard FPSCR values for everything else.

Signed-off-by: Will Newton <will.newton@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/helper.c | 17 +++++++++++++++++
 target-arm/helper.h |  1 +
 2 files changed, 18 insertions(+)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index b1541b9..ca5b000 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -4048,6 +4048,23 @@ uint32_t HELPER(set_rmode)(uint32_t rmode, CPUARMState *env)
     return prev_rmode;
 }
 
+/* Set the current fp rounding mode in the standard fp status and return
+ * the old one. This is for NEON instructions that need to change the
+ * rounding mode but wish to use the standard FPSCR values for everything
+ * else. Always set the rounding mode back to the correct value after
+ * modifying it.
+ * The argument is a softfloat float_round_ value.
+ */
+uint32_t HELPER(set_neon_rmode)(uint32_t rmode, CPUARMState *env)
+{
+    float_status *fp_status = &env->vfp.standard_fp_status;
+
+    uint32_t prev_rmode = get_float_rounding_mode(fp_status);
+    set_float_rounding_mode(rmode, fp_status);
+
+    return prev_rmode;
+}
+
 /* Half precision conversions.  */
 static float32 do_fcvt_f16_to_f32(uint32_t a, CPUARMState *env, float_status *s)
 {
diff --git a/target-arm/helper.h b/target-arm/helper.h
index 70872df..71b8411 100644
--- a/target-arm/helper.h
+++ b/target-arm/helper.h
@@ -149,6 +149,7 @@ DEF_HELPER_3(vfp_ultod, f64, i64, i32, ptr)
 DEF_HELPER_3(vfp_uqtod, f64, i64, i32, ptr)
 
 DEF_HELPER_FLAGS_2(set_rmode, TCG_CALL_NO_RWG, i32, i32, env)
+DEF_HELPER_FLAGS_2(set_neon_rmode, TCG_CALL_NO_RWG, i32, i32, env)
 
 DEF_HELPER_2(vfp_fcvt_f16_to_f32, f32, i32, env)
 DEF_HELPER_2(vfp_fcvt_f32_to_f16, i32, f32, env)
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 22/38] target-arm: Add AArch32 SIMD VRINTA, VRINTN, VRINTP, VRINTM, VRINTZ
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (20 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 21/38] target-arm: Add set_neon_rmode helper Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 23/38] target-arm: Add AArch32 FP VCVTA, VCVTN, VCVTP and VCVTM Peter Maydell
                   ` (16 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Will Newton <will.newton@linaro.org>

Add support for the AArch32 Advanced SIMD VRINTA, VRINTN, VRINTP
VRINTM and VRINTZ instructions.

Signed-off-by: Will Newton <will.newton@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/translate.c | 40 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index c179817..5a8ca24 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -4709,9 +4709,14 @@ static const uint8_t neon_3r_sizes[] = {
 #define NEON_2RM_VMOVN 36 /* Includes VQMOVN, VQMOVUN */
 #define NEON_2RM_VQMOVN 37 /* Includes VQMOVUN */
 #define NEON_2RM_VSHLL 38
+#define NEON_2RM_VRINTN 40
 #define NEON_2RM_VRINTX 41
+#define NEON_2RM_VRINTA 42
+#define NEON_2RM_VRINTZ 43
 #define NEON_2RM_VCVT_F16_F32 44
+#define NEON_2RM_VRINTM 45
 #define NEON_2RM_VCVT_F32_F16 46
+#define NEON_2RM_VRINTP 47
 #define NEON_2RM_VRECPE 56
 #define NEON_2RM_VRSQRTE 57
 #define NEON_2RM_VRECPE_F 58
@@ -4725,7 +4730,9 @@ static int neon_2rm_is_float_op(int op)
 {
     /* Return true if this neon 2reg-misc op is float-to-float */
     return (op == NEON_2RM_VABS_F || op == NEON_2RM_VNEG_F ||
-            op == NEON_2RM_VRINTX || op >= NEON_2RM_VRECPE_F);
+            (op >= NEON_2RM_VRINTN && op <= NEON_2RM_VRINTZ) ||
+            op == NEON_2RM_VRINTM || op == NEON_2RM_VRINTP ||
+            op >= NEON_2RM_VRECPE_F);
 }
 
 /* Each entry in this array has bit n set if the insn allows
@@ -4769,9 +4776,14 @@ static const uint8_t neon_2rm_sizes[] = {
     [NEON_2RM_VMOVN] = 0x7,
     [NEON_2RM_VQMOVN] = 0x7,
     [NEON_2RM_VSHLL] = 0x7,
+    [NEON_2RM_VRINTN] = 0x4,
     [NEON_2RM_VRINTX] = 0x4,
+    [NEON_2RM_VRINTA] = 0x4,
+    [NEON_2RM_VRINTZ] = 0x4,
     [NEON_2RM_VCVT_F16_F32] = 0x2,
+    [NEON_2RM_VRINTM] = 0x4,
     [NEON_2RM_VCVT_F32_F16] = 0x2,
+    [NEON_2RM_VRINTP] = 0x4,
     [NEON_2RM_VRECPE] = 0x4,
     [NEON_2RM_VRSQRTE] = 0x4,
     [NEON_2RM_VRECPE_F] = 0x4,
@@ -6482,6 +6494,32 @@ static int disas_neon_data_insn(CPUARMState * env, DisasContext *s, uint32_t ins
                             }
                             neon_store_reg(rm, pass, tmp2);
                             break;
+                        case NEON_2RM_VRINTN:
+                        case NEON_2RM_VRINTA:
+                        case NEON_2RM_VRINTM:
+                        case NEON_2RM_VRINTP:
+                        case NEON_2RM_VRINTZ:
+                        {
+                            TCGv_i32 tcg_rmode;
+                            TCGv_ptr fpstatus = get_fpstatus_ptr(1);
+                            int rmode;
+
+                            if (op == NEON_2RM_VRINTZ) {
+                                rmode = FPROUNDING_ZERO;
+                            } else {
+                                rmode = fp_decode_rm[((op & 0x6) >> 1) ^ 1];
+                            }
+
+                            tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rmode));
+                            gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode,
+                                                      cpu_env);
+                            gen_helper_rints(cpu_F0s, cpu_F0s, fpstatus);
+                            gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode,
+                                                      cpu_env);
+                            tcg_temp_free_ptr(fpstatus);
+                            tcg_temp_free_i32(tcg_rmode);
+                            break;
+                        }
                         case NEON_2RM_VRINTX:
                         {
                             TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 23/38] target-arm: Add AArch32 FP VCVTA, VCVTN, VCVTP and VCVTM
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (21 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 22/38] target-arm: Add AArch32 SIMD VRINTA, VRINTN, VRINTP, VRINTM, VRINTZ Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 24/38] target-arm: Add AArch32 SIMD " Peter Maydell
                   ` (15 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Will Newton <will.newton@linaro.org>

Add support for the AArch32 floating-point VCVTA, VCVTN, VCVTP
and VCVTM instructions.

Signed-off-by: Will Newton <will.newton@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/translate.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 61 insertions(+)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 5a8ca24..0fcc159 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -2797,6 +2797,63 @@ static int handle_vrint(uint32_t insn, uint32_t rd, uint32_t rm, uint32_t dp,
     return 0;
 }
 
+static int handle_vcvt(uint32_t insn, uint32_t rd, uint32_t rm, uint32_t dp,
+                       int rounding)
+{
+    bool is_signed = extract32(insn, 7, 1);
+    TCGv_ptr fpst = get_fpstatus_ptr(0);
+    TCGv_i32 tcg_rmode, tcg_shift;
+
+    tcg_shift = tcg_const_i32(0);
+
+    tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rounding));
+    gen_helper_set_rmode(tcg_rmode, tcg_rmode, cpu_env);
+
+    if (dp) {
+        TCGv_i64 tcg_double, tcg_res;
+        TCGv_i32 tcg_tmp;
+        /* Rd is encoded as a single precision register even when the source
+         * is double precision.
+         */
+        rd = ((rd << 1) & 0x1e) | ((rd >> 4) & 0x1);
+        tcg_double = tcg_temp_new_i64();
+        tcg_res = tcg_temp_new_i64();
+        tcg_tmp = tcg_temp_new_i32();
+        tcg_gen_ld_f64(tcg_double, cpu_env, vfp_reg_offset(1, rm));
+        if (is_signed) {
+            gen_helper_vfp_tosld(tcg_res, tcg_double, tcg_shift, fpst);
+        } else {
+            gen_helper_vfp_tould(tcg_res, tcg_double, tcg_shift, fpst);
+        }
+        tcg_gen_trunc_i64_i32(tcg_tmp, tcg_res);
+        tcg_gen_st_f32(tcg_tmp, cpu_env, vfp_reg_offset(0, rd));
+        tcg_temp_free_i32(tcg_tmp);
+        tcg_temp_free_i64(tcg_res);
+        tcg_temp_free_i64(tcg_double);
+    } else {
+        TCGv_i32 tcg_single, tcg_res;
+        tcg_single = tcg_temp_new_i32();
+        tcg_res = tcg_temp_new_i32();
+        tcg_gen_ld_f32(tcg_single, cpu_env, vfp_reg_offset(0, rm));
+        if (is_signed) {
+            gen_helper_vfp_tosls(tcg_res, tcg_single, tcg_shift, fpst);
+        } else {
+            gen_helper_vfp_touls(tcg_res, tcg_single, tcg_shift, fpst);
+        }
+        tcg_gen_st_f32(tcg_res, cpu_env, vfp_reg_offset(0, rd));
+        tcg_temp_free_i32(tcg_res);
+        tcg_temp_free_i32(tcg_single);
+    }
+
+    gen_helper_set_rmode(tcg_rmode, tcg_rmode, cpu_env);
+    tcg_temp_free_i32(tcg_rmode);
+
+    tcg_temp_free_i32(tcg_shift);
+
+    tcg_temp_free_ptr(fpst);
+
+    return 0;
+}
 
 /* Table for converting the most common AArch32 encoding of
  * rounding mode to arm_fprounding order (which matches the
@@ -2835,6 +2892,10 @@ static int disas_vfp_v8_insn(CPUARMState *env, DisasContext *s, uint32_t insn)
         /* VRINTA, VRINTN, VRINTP, VRINTM */
         int rounding = fp_decode_rm[extract32(insn, 16, 2)];
         return handle_vrint(insn, rd, rm, dp, rounding);
+    } else if ((insn & 0x0fbc0e50) == 0x0ebc0a40) {
+        /* VCVTA, VCVTN, VCVTP, VCVTM */
+        int rounding = fp_decode_rm[extract32(insn, 16, 2)];
+        return handle_vcvt(insn, rd, rm, dp, rounding);
     }
     return 1;
 }
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 24/38] target-arm: Add AArch32 SIMD VCVTA, VCVTN, VCVTP and VCVTM
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (22 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 23/38] target-arm: Add AArch32 FP VCVTA, VCVTN, VCVTP and VCVTM Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 25/38] target-arm: A64: Add SIMD three-different multiply accumulate insns Peter Maydell
                   ` (14 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Will Newton <will.newton@linaro.org>

Add support for the AArch32 Advanced SIMD VCVTA, VCVTN, VCVTP
and VCVTM instructions.

Signed-off-by: Will Newton <will.newton@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/translate.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 52 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 0fcc159..e701c0f 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -4778,6 +4778,14 @@ static const uint8_t neon_3r_sizes[] = {
 #define NEON_2RM_VRINTM 45
 #define NEON_2RM_VCVT_F32_F16 46
 #define NEON_2RM_VRINTP 47
+#define NEON_2RM_VCVTAU 48
+#define NEON_2RM_VCVTAS 49
+#define NEON_2RM_VCVTNU 50
+#define NEON_2RM_VCVTNS 51
+#define NEON_2RM_VCVTPU 52
+#define NEON_2RM_VCVTPS 53
+#define NEON_2RM_VCVTMU 54
+#define NEON_2RM_VCVTMS 55
 #define NEON_2RM_VRECPE 56
 #define NEON_2RM_VRSQRTE 57
 #define NEON_2RM_VRECPE_F 58
@@ -4792,7 +4800,8 @@ static int neon_2rm_is_float_op(int op)
     /* Return true if this neon 2reg-misc op is float-to-float */
     return (op == NEON_2RM_VABS_F || op == NEON_2RM_VNEG_F ||
             (op >= NEON_2RM_VRINTN && op <= NEON_2RM_VRINTZ) ||
-            op == NEON_2RM_VRINTM || op == NEON_2RM_VRINTP ||
+            op == NEON_2RM_VRINTM ||
+            (op >= NEON_2RM_VRINTP && op <= NEON_2RM_VCVTMS) ||
             op >= NEON_2RM_VRECPE_F);
 }
 
@@ -4845,6 +4854,14 @@ static const uint8_t neon_2rm_sizes[] = {
     [NEON_2RM_VRINTM] = 0x4,
     [NEON_2RM_VCVT_F32_F16] = 0x2,
     [NEON_2RM_VRINTP] = 0x4,
+    [NEON_2RM_VCVTAU] = 0x4,
+    [NEON_2RM_VCVTAS] = 0x4,
+    [NEON_2RM_VCVTNU] = 0x4,
+    [NEON_2RM_VCVTNS] = 0x4,
+    [NEON_2RM_VCVTPU] = 0x4,
+    [NEON_2RM_VCVTPS] = 0x4,
+    [NEON_2RM_VCVTMU] = 0x4,
+    [NEON_2RM_VCVTMS] = 0x4,
     [NEON_2RM_VRECPE] = 0x4,
     [NEON_2RM_VRSQRTE] = 0x4,
     [NEON_2RM_VRECPE_F] = 0x4,
@@ -6588,6 +6605,40 @@ static int disas_neon_data_insn(CPUARMState * env, DisasContext *s, uint32_t ins
                             tcg_temp_free_ptr(fpstatus);
                             break;
                         }
+                        case NEON_2RM_VCVTAU:
+                        case NEON_2RM_VCVTAS:
+                        case NEON_2RM_VCVTNU:
+                        case NEON_2RM_VCVTNS:
+                        case NEON_2RM_VCVTPU:
+                        case NEON_2RM_VCVTPS:
+                        case NEON_2RM_VCVTMU:
+                        case NEON_2RM_VCVTMS:
+                        {
+                            bool is_signed = !extract32(insn, 7, 1);
+                            TCGv_ptr fpst = get_fpstatus_ptr(1);
+                            TCGv_i32 tcg_rmode, tcg_shift;
+                            int rmode = fp_decode_rm[extract32(insn, 8, 2)];
+
+                            tcg_shift = tcg_const_i32(0);
+                            tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rmode));
+                            gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode,
+                                                      cpu_env);
+
+                            if (is_signed) {
+                                gen_helper_vfp_tosls(cpu_F0s, cpu_F0s,
+                                                     tcg_shift, fpst);
+                            } else {
+                                gen_helper_vfp_touls(cpu_F0s, cpu_F0s,
+                                                     tcg_shift, fpst);
+                            }
+
+                            gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode,
+                                                      cpu_env);
+                            tcg_temp_free_i32(tcg_rmode);
+                            tcg_temp_free_i32(tcg_shift);
+                            tcg_temp_free_ptr(fpst);
+                            break;
+                        }
                         case NEON_2RM_VRECPE:
                             gen_helper_recpe_u32(tmp, tmp, cpu_env);
                             break;
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 25/38] target-arm: A64: Add SIMD three-different multiply accumulate insns
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (23 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 24/38] target-arm: Add AArch32 SIMD " Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 26/38] target-arm: A64: Add SIMD three-different ABDL instructions Peter Maydell
                   ` (13 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

Add support for the multiply-accumulate instructions from the
SIMD three-different instructions group (C3.6.15):
 * skeleton decode of unallocated encodings and split of
   the group into its three sub-parts
 * framework for handling the 64x64->128 widening subpart
 * implementation of the multiply-accumulate instructions
   SMLAL, SMLAL2, UMLAL, UMLAL2, SMLSL, SMLSL2, UMLSL, UMLSL2,
   UMULL, UMULL2, SMULL, SMULL2

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/translate-a64.c | 233 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 232 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index c22bbb5..e8479a5 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -700,6 +700,9 @@ static void do_fp_ld(DisasContext *s, int destidx, TCGv_i64 tcg_addr, int size)
  * zero extend as we are filling a partial chunk of the vector register.
  * These functions don't support 128 bit loads/stores, which would be
  * normal load/store operations.
+ *
+ * The _i32 versions are useful when operating on 32 bit quantities
+ * (eg for floating point single or using Neon helper functions).
  */
 
 /* Get value of an element within a vector register */
@@ -735,6 +738,32 @@ static void read_vec_element(DisasContext *s, TCGv_i64 tcg_dest, int srcidx,
     }
 }
 
+static void read_vec_element_i32(DisasContext *s, TCGv_i32 tcg_dest, int srcidx,
+                                 int element, TCGMemOp memop)
+{
+    int vect_off = vec_reg_offset(srcidx, element, memop & MO_SIZE);
+    switch (memop) {
+    case MO_8:
+        tcg_gen_ld8u_i32(tcg_dest, cpu_env, vect_off);
+        break;
+    case MO_16:
+        tcg_gen_ld16u_i32(tcg_dest, cpu_env, vect_off);
+        break;
+    case MO_8|MO_SIGN:
+        tcg_gen_ld8s_i32(tcg_dest, cpu_env, vect_off);
+        break;
+    case MO_16|MO_SIGN:
+        tcg_gen_ld16s_i32(tcg_dest, cpu_env, vect_off);
+        break;
+    case MO_32:
+    case MO_32|MO_SIGN:
+        tcg_gen_ld_i32(tcg_dest, cpu_env, vect_off);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+}
+
 /* Set value of an element within a vector register */
 static void write_vec_element(DisasContext *s, TCGv_i64 tcg_src, int destidx,
                               int element, TCGMemOp memop)
@@ -5518,6 +5547,150 @@ static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
     unsupported_encoding(s, insn);
 }
 
+static void handle_3rd_widening(DisasContext *s, int is_q, int is_u, int size,
+                                int opcode, int rd, int rn, int rm)
+{
+    /* 3-reg-different widening insns: 64 x 64 -> 128 */
+    TCGv_i64 tcg_res[2];
+    int pass, accop;
+
+    tcg_res[0] = tcg_temp_new_i64();
+    tcg_res[1] = tcg_temp_new_i64();
+
+    /* Does this op do an adding accumulate, a subtracting accumulate,
+     * or no accumulate at all?
+     */
+    switch (opcode) {
+    case 5:
+    case 8:
+    case 9:
+        accop = 1;
+        break;
+    case 10:
+    case 11:
+        accop = -1;
+        break;
+    default:
+        accop = 0;
+        break;
+    }
+
+    if (accop != 0) {
+        read_vec_element(s, tcg_res[0], rd, 0, MO_64);
+        read_vec_element(s, tcg_res[1], rd, 1, MO_64);
+    }
+
+    /* size == 2 means two 32x32->64 operations; this is worth special
+     * casing because we can generally handle it inline.
+     */
+    if (size == 2) {
+        for (pass = 0; pass < 2; pass++) {
+            TCGv_i64 tcg_op1 = tcg_temp_new_i64();
+            TCGv_i64 tcg_op2 = tcg_temp_new_i64();
+            TCGv_i64 tcg_passres;
+            TCGMemOp memop = MO_32 | (is_u ? 0 : MO_SIGN);
+
+            int elt = pass + is_q * 2;
+
+            read_vec_element(s, tcg_op1, rn, elt, memop);
+            read_vec_element(s, tcg_op2, rm, elt, memop);
+
+            if (accop == 0) {
+                tcg_passres = tcg_res[pass];
+            } else {
+                tcg_passres = tcg_temp_new_i64();
+            }
+
+            switch (opcode) {
+            case 8: /* SMLAL, SMLAL2, UMLAL, UMLAL2 */
+            case 10: /* SMLSL, SMLSL2, UMLSL, UMLSL2 */
+            case 12: /* UMULL, UMULL2, SMULL, SMULL2 */
+                tcg_gen_mul_i64(tcg_passres, tcg_op1, tcg_op2);
+                break;
+            default:
+                g_assert_not_reached();
+            }
+
+            if (accop > 0) {
+                tcg_gen_add_i64(tcg_res[pass], tcg_res[pass], tcg_passres);
+                tcg_temp_free_i64(tcg_passres);
+            } else if (accop < 0) {
+                tcg_gen_sub_i64(tcg_res[pass], tcg_res[pass], tcg_passres);
+                tcg_temp_free_i64(tcg_passres);
+            }
+
+            tcg_temp_free_i64(tcg_op1);
+            tcg_temp_free_i64(tcg_op2);
+        }
+    } else {
+        /* size 0 or 1, generally helper functions */
+        for (pass = 0; pass < 2; pass++) {
+            TCGv_i32 tcg_op1 = tcg_temp_new_i32();
+            TCGv_i32 tcg_op2 = tcg_temp_new_i32();
+            TCGv_i64 tcg_passres;
+            int elt = pass + is_q * 2;
+
+            read_vec_element_i32(s, tcg_op1, rn, elt, MO_32);
+            read_vec_element_i32(s, tcg_op2, rm, elt, MO_32);
+
+            if (accop == 0) {
+                tcg_passres = tcg_res[pass];
+            } else {
+                tcg_passres = tcg_temp_new_i64();
+            }
+
+            switch (opcode) {
+            case 8: /* SMLAL, SMLAL2, UMLAL, UMLAL2 */
+            case 10: /* SMLSL, SMLSL2, UMLSL, UMLSL2 */
+            case 12: /* UMULL, UMULL2, SMULL, SMULL2 */
+                if (size == 0) {
+                    if (is_u) {
+                        gen_helper_neon_mull_u8(tcg_passres, tcg_op1, tcg_op2);
+                    } else {
+                        gen_helper_neon_mull_s8(tcg_passres, tcg_op1, tcg_op2);
+                    }
+                } else {
+                    if (is_u) {
+                        gen_helper_neon_mull_u16(tcg_passres, tcg_op1, tcg_op2);
+                    } else {
+                        gen_helper_neon_mull_s16(tcg_passres, tcg_op1, tcg_op2);
+                    }
+                }
+                break;
+            default:
+                g_assert_not_reached();
+            }
+            tcg_temp_free_i32(tcg_op1);
+            tcg_temp_free_i32(tcg_op2);
+
+            if (accop > 0) {
+                if (size == 0) {
+                    gen_helper_neon_addl_u16(tcg_res[pass], tcg_res[pass],
+                                             tcg_passres);
+                } else {
+                    gen_helper_neon_addl_u32(tcg_res[pass], tcg_res[pass],
+                                             tcg_passres);
+                }
+                tcg_temp_free_i64(tcg_passres);
+            } else if (accop < 0) {
+                if (size == 0) {
+                    gen_helper_neon_subl_u16(tcg_res[pass], tcg_res[pass],
+                                             tcg_passres);
+                } else {
+                    gen_helper_neon_subl_u32(tcg_res[pass], tcg_res[pass],
+                                             tcg_passres);
+                }
+                tcg_temp_free_i64(tcg_passres);
+            }
+        }
+    }
+
+    write_vec_element(s, tcg_res[0], rd, 0, MO_64);
+    write_vec_element(s, tcg_res[1], rd, 1, MO_64);
+    tcg_temp_free_i64(tcg_res[0]);
+    tcg_temp_free_i64(tcg_res[1]);
+}
+
 /* C3.6.15 AdvSIMD three different
  *   31  30  29 28       24 23  22  21 20  16 15    12 11 10 9    5 4    0
  * +---+---+---+-----------+------+---+------+--------+-----+------+------+
@@ -5526,7 +5699,65 @@ static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
  */
 static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    /* Instructions in this group fall into three basic classes
+     * (in each case with the operation working on each element in
+     * the input vectors):
+     * (1) widening 64 x 64 -> 128 (with possibly Vd as an extra
+     *     128 bit input)
+     * (2) wide 64 x 128 -> 128
+     * (3) narrowing 128 x 128 -> 64
+     * Here we do initial decode, catch unallocated cases and
+     * dispatch to separate functions for each class.
+     */
+    int is_q = extract32(insn, 30, 1);
+    int is_u = extract32(insn, 29, 1);
+    int size = extract32(insn, 22, 2);
+    int opcode = extract32(insn, 12, 4);
+    int rm = extract32(insn, 16, 5);
+    int rn = extract32(insn, 5, 5);
+    int rd = extract32(insn, 0, 5);
+
+    switch (opcode) {
+    case 1: /* SADDW, SADDW2, UADDW, UADDW2 */
+    case 3: /* SSUBW, SSUBW2, USUBW, USUBW2 */
+        /* 64 x 128 -> 128 */
+        unsupported_encoding(s, insn);
+        break;
+    case 4: /* ADDHN, ADDHN2, RADDHN, RADDHN2 */
+    case 6: /* SUBHN, SUBHN2, RSUBHN, RSUBHN2 */
+        /* 128 x 128 -> 64 */
+        unsupported_encoding(s, insn);
+        break;
+    case 9:
+    case 11:
+    case 13:
+    case 14:
+        if (is_u) {
+            unallocated_encoding(s);
+            return;
+        }
+        /* fall through */
+    case 0:
+    case 2:
+    case 5:
+    case 7:
+        unsupported_encoding(s, insn);
+        break;
+    case 8:
+    case 10:
+    case 12:
+        /* 64 x 64 -> 128 */
+        if (size == 3) {
+            unallocated_encoding(s);
+            return;
+        }
+        handle_3rd_widening(s, is_q, is_u, size, opcode, rd, rn, rm);
+        break;
+    default:
+        /* opcode 15 not allocated */
+        unallocated_encoding(s);
+        break;
+    }
 }
 
 /* C3.6.16 AdvSIMD three same
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 26/38] target-arm: A64: Add SIMD three-different ABDL instructions
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (24 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 25/38] target-arm: A64: Add SIMD three-different multiply accumulate insns Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 27/38] target-arm: A64: Add SIMD scalar 3 same add, sub and compare ops Peter Maydell
                   ` (12 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

Implement the absolute-difference instructions in the SIMD
three-different group: SABAL, SABAL2, UABAL, UABAL2, SABDL,
SABDL2, UABDL, UABDL2.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/translate-a64.c | 35 +++++++++++++++++++++++++++++++++--
 1 file changed, 33 insertions(+), 2 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index e8479a5..e098efd 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -5602,6 +5602,21 @@ static void handle_3rd_widening(DisasContext *s, int is_q, int is_u, int size,
             }
 
             switch (opcode) {
+            case 5: /* SABAL, SABAL2, UABAL, UABAL2 */
+            case 7: /* SABDL, SABDL2, UABDL, UABDL2 */
+            {
+                TCGv_i64 tcg_tmp1 = tcg_temp_new_i64();
+                TCGv_i64 tcg_tmp2 = tcg_temp_new_i64();
+
+                tcg_gen_sub_i64(tcg_tmp1, tcg_op1, tcg_op2);
+                tcg_gen_sub_i64(tcg_tmp2, tcg_op2, tcg_op1);
+                tcg_gen_movcond_i64(is_u ? TCG_COND_GEU : TCG_COND_GE,
+                                    tcg_passres,
+                                    tcg_op1, tcg_op2, tcg_tmp1, tcg_tmp2);
+                tcg_temp_free_i64(tcg_tmp1);
+                tcg_temp_free_i64(tcg_tmp2);
+                break;
+            }
             case 8: /* SMLAL, SMLAL2, UMLAL, UMLAL2 */
             case 10: /* SMLSL, SMLSL2, UMLSL, UMLSL2 */
             case 12: /* UMULL, UMULL2, SMULL, SMULL2 */
@@ -5640,6 +5655,22 @@ static void handle_3rd_widening(DisasContext *s, int is_q, int is_u, int size,
             }
 
             switch (opcode) {
+            case 5: /* SABAL, SABAL2, UABAL, UABAL2 */
+            case 7: /* SABDL, SABDL2, UABDL, UABDL2 */
+                if (size == 0) {
+                    if (is_u) {
+                        gen_helper_neon_abdl_u16(tcg_passres, tcg_op1, tcg_op2);
+                    } else {
+                        gen_helper_neon_abdl_s16(tcg_passres, tcg_op1, tcg_op2);
+                    }
+                } else {
+                    if (is_u) {
+                        gen_helper_neon_abdl_u32(tcg_passres, tcg_op1, tcg_op2);
+                    } else {
+                        gen_helper_neon_abdl_s32(tcg_passres, tcg_op1, tcg_op2);
+                    }
+                }
+                break;
             case 8: /* SMLAL, SMLAL2, UMLAL, UMLAL2 */
             case 10: /* SMLSL, SMLSL2, UMLSL, UMLSL2 */
             case 12: /* UMULL, UMULL2, SMULL, SMULL2 */
@@ -5739,10 +5770,10 @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
         /* fall through */
     case 0:
     case 2:
-    case 5:
-    case 7:
         unsupported_encoding(s, insn);
         break;
+    case 5:
+    case 7:
     case 8:
     case 10:
     case 12:
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 27/38] target-arm: A64: Add SIMD scalar 3 same add, sub and compare ops
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (25 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 26/38] target-arm: A64: Add SIMD three-different ABDL instructions Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 28/38] target-arm: A64: Add top level decode for SIMD 3-same group Peter Maydell
                   ` (11 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

Implement the add, sub and compare ops from the SIMD "scalar three same"
group.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/translate-a64.c | 131 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 130 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index e098efd..f0ebbb5 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -5503,6 +5503,58 @@ static void disas_simd_scalar_three_reg_diff(DisasContext *s, uint32_t insn)
     unsupported_encoding(s, insn);
 }
 
+static void handle_3same_64(DisasContext *s, int opcode, bool u,
+                            TCGv_i64 tcg_rd, TCGv_i64 tcg_rn, TCGv_i64 tcg_rm)
+{
+    /* Handle 64x64->64 opcodes which are shared between the scalar
+     * and vector 3-same groups. We cover every opcode where size == 3
+     * is valid in either the three-reg-same (integer, not pairwise)
+     * or scalar-three-reg-same groups. (Some opcodes are not yet
+     * implemented.)
+     */
+    TCGCond cond;
+
+    switch (opcode) {
+    case 0x6: /* CMGT, CMHI */
+        /* 64 bit integer comparison, result = test ? (2^64 - 1) : 0.
+         * We implement this using setcond (test) and then negating.
+         */
+        cond = u ? TCG_COND_GTU : TCG_COND_GT;
+    do_cmop:
+        tcg_gen_setcond_i64(cond, tcg_rd, tcg_rn, tcg_rm);
+        tcg_gen_neg_i64(tcg_rd, tcg_rd);
+        break;
+    case 0x7: /* CMGE, CMHS */
+        cond = u ? TCG_COND_GEU : TCG_COND_GE;
+        goto do_cmop;
+    case 0x11: /* CMTST, CMEQ */
+        if (u) {
+            cond = TCG_COND_EQ;
+            goto do_cmop;
+        }
+        /* CMTST : test is "if (X & Y != 0)". */
+        tcg_gen_and_i64(tcg_rd, tcg_rn, tcg_rm);
+        tcg_gen_setcondi_i64(TCG_COND_NE, tcg_rd, tcg_rd, 0);
+        tcg_gen_neg_i64(tcg_rd, tcg_rd);
+        break;
+    case 0x10: /* ADD, SUB */
+        if (u) {
+            tcg_gen_sub_i64(tcg_rd, tcg_rn, tcg_rm);
+        } else {
+            tcg_gen_add_i64(tcg_rd, tcg_rn, tcg_rm);
+        }
+        break;
+    case 0x1: /* SQADD */
+    case 0x5: /* SQSUB */
+    case 0x8: /* SSHL, USHL */
+    case 0x9: /* SQSHL, UQSHL */
+    case 0xa: /* SRSHL, URSHL */
+    case 0xb: /* SQRSHL, UQRSHL */
+    default:
+        g_assert_not_reached();
+    }
+}
+
 /* C3.6.11 AdvSIMD scalar three same
  *  31 30  29 28       24 23  22  21 20  16 15    11  10 9    5 4    0
  * +-----+---+-----------+------+---+------+--------+---+------+------+
@@ -5511,7 +5563,84 @@ static void disas_simd_scalar_three_reg_diff(DisasContext *s, uint32_t insn)
  */
 static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int rd = extract32(insn, 0, 5);
+    int rn = extract32(insn, 5, 5);
+    int opcode = extract32(insn, 11, 5);
+    int rm = extract32(insn, 16, 5);
+    int size = extract32(insn, 22, 2);
+    bool u = extract32(insn, 29, 1);
+    TCGv_i64 tcg_rn;
+    TCGv_i64 tcg_rm;
+    TCGv_i64 tcg_rd;
+
+    if (opcode >= 0x18) {
+        /* Floating point: U, size[1] and opcode indicate operation */
+        int fpopcode = opcode | (extract32(size, 1, 1) << 5) | (u << 6);
+        switch (fpopcode) {
+        case 0x1b: /* FMULX */
+        case 0x1c: /* FCMEQ */
+        case 0x1f: /* FRECPS */
+        case 0x3f: /* FRSQRTS */
+        case 0x5c: /* FCMGE */
+        case 0x5d: /* FACGE */
+        case 0x7a: /* FABD  */
+        case 0x7c: /* FCMGT */
+        case 0x7d: /* FACGT */
+            unsupported_encoding(s, insn);
+            return;
+        default:
+            unallocated_encoding(s);
+            return;
+        }
+    }
+
+    switch (opcode) {
+    case 0x1: /* SQADD, UQADD */
+    case 0x5: /* SQSUB, UQSUB */
+    case 0x8: /* SSHL, USHL */
+    case 0xa: /* SRSHL, URSHL */
+        unsupported_encoding(s, insn);
+        return;
+    case 0x6: /* CMGT, CMHI */
+    case 0x7: /* CMGE, CMHS */
+    case 0x11: /* CMTST, CMEQ */
+    case 0x10: /* ADD, SUB (vector) */
+        if (size != 3) {
+            unallocated_encoding(s);
+            return;
+        }
+        break;
+    case 0x9: /* SQSHL, UQSHL */
+    case 0xb: /* SQRSHL, UQRSHL */
+        unsupported_encoding(s, insn);
+        return;
+    case 0x16: /* SQDMULH, SQRDMULH (vector) */
+        if (size != 1 && size != 2) {
+            unallocated_encoding(s);
+            return;
+        }
+        unsupported_encoding(s, insn);
+        return;
+    default:
+        unallocated_encoding(s);
+        return;
+    }
+
+    tcg_rn = read_fp_dreg(s, rn);       /* op1 */
+    tcg_rm = read_fp_dreg(s, rm);       /* op2 */
+    tcg_rd = tcg_temp_new_i64();
+
+    /* For the moment we only support the opcodes which are
+     * 64-bit-width only. The size != 3 cases will
+     * be handled later when the relevant ops are implemented.
+     */
+    handle_3same_64(s, opcode, u, tcg_rd, tcg_rn, tcg_rm);
+
+    write_fp_dreg(s, rd, tcg_rd);
+
+    tcg_temp_free_i64(tcg_rn);
+    tcg_temp_free_i64(tcg_rm);
+    tcg_temp_free_i64(tcg_rd);
 }
 
 /* C3.6.12 AdvSIMD scalar two reg misc
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 28/38] target-arm: A64: Add top level decode for SIMD 3-same group
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (26 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 27/38] target-arm: A64: Add SIMD scalar 3 same add, sub and compare ops Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 29/38] target-arm: A64: Add logic ops from SIMD 3 same group Peter Maydell
                   ` (10 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

Add top level decode for the A64 SIMD three regs same group
(C3.6.16), splitting it into the pairwise, logical, float and
integer subgroups.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/translate-a64.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index f0ebbb5..a215f083 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -5920,6 +5920,30 @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
     }
 }
 
+/* Logic op (opcode == 3) subgroup of C3.6.16. */
+static void disas_simd_3same_logic(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* Pairwise op subgroup of C3.6.16. */
+static void disas_simd_3same_pair(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* Floating point op subgroup of C3.6.16. */
+static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+/* Integer op subgroup of C3.6.16. */
+static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
 /* C3.6.16 AdvSIMD three same
  *  31  30  29  28       24 23  22  21 20  16 15    11  10 9    5 4    0
  * +---+---+---+-----------+------+---+------+--------+---+------+------+
@@ -5928,7 +5952,26 @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
  */
 static void disas_simd_three_reg_same(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int opcode = extract32(insn, 11, 5);
+
+    switch (opcode) {
+    case 0x3: /* logic ops */
+        disas_simd_3same_logic(s, insn);
+        break;
+    case 0x17: /* ADDP */
+    case 0x14: /* SMAXP, UMAXP */
+    case 0x15: /* SMINP, UMINP */
+        /* Pairwise operations */
+        disas_simd_3same_pair(s, insn);
+        break;
+    case 0x18 ... 0x31:
+        /* floating point ops, sz[1] and U are part of opcode */
+        disas_simd_3same_float(s, insn);
+        break;
+    default:
+        disas_simd_3same_int(s, insn);
+        break;
+    }
 }
 
 /* C3.6.17 AdvSIMD two reg misc
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 29/38] target-arm: A64: Add logic ops from SIMD 3 same group
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (27 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 28/38] target-arm: A64: Add top level decode for SIMD 3-same group Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 30/38] target-arm: A64: Add integer ops from SIMD 3-same group Peter Maydell
                   ` (9 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

Add support for the logical operations (ORR, AND, BIC, ORN, EOR, BSL,
BIT and BIF) from the SIMD 3 register same group (C3.6.16).

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/translate-a64.c | 73 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 72 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index a215f083..aa53ddc 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -5923,7 +5923,78 @@ static void disas_simd_three_reg_diff(DisasContext *s, uint32_t insn)
 /* Logic op (opcode == 3) subgroup of C3.6.16. */
 static void disas_simd_3same_logic(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int rd = extract32(insn, 0, 5);
+    int rn = extract32(insn, 5, 5);
+    int rm = extract32(insn, 16, 5);
+    int size = extract32(insn, 22, 2);
+    bool is_u = extract32(insn, 29, 1);
+    bool is_q = extract32(insn, 30, 1);
+    TCGv_i64 tcg_op1 = tcg_temp_new_i64();
+    TCGv_i64 tcg_op2 = tcg_temp_new_i64();
+    TCGv_i64 tcg_res[2];
+    int pass;
+
+    tcg_res[0] = tcg_temp_new_i64();
+    tcg_res[1] = tcg_temp_new_i64();
+
+    for (pass = 0; pass < (is_q ? 2 : 1); pass++) {
+        read_vec_element(s, tcg_op1, rn, pass, MO_64);
+        read_vec_element(s, tcg_op2, rm, pass, MO_64);
+
+        if (!is_u) {
+            switch (size) {
+            case 0: /* AND */
+                tcg_gen_and_i64(tcg_res[pass], tcg_op1, tcg_op2);
+                break;
+            case 1: /* BIC */
+                tcg_gen_andc_i64(tcg_res[pass], tcg_op1, tcg_op2);
+                break;
+            case 2: /* ORR */
+                tcg_gen_or_i64(tcg_res[pass], tcg_op1, tcg_op2);
+                break;
+            case 3: /* ORN */
+                tcg_gen_orc_i64(tcg_res[pass], tcg_op1, tcg_op2);
+                break;
+            }
+        } else {
+            if (size != 0) {
+                /* B* ops need res loaded to operate on */
+                read_vec_element(s, tcg_res[pass], rd, pass, MO_64);
+            }
+
+            switch (size) {
+            case 0: /* EOR */
+                tcg_gen_xor_i64(tcg_res[pass], tcg_op1, tcg_op2);
+                break;
+            case 1: /* BSL bitwise select */
+                tcg_gen_xor_i64(tcg_op1, tcg_op1, tcg_op2);
+                tcg_gen_and_i64(tcg_op1, tcg_op1, tcg_res[pass]);
+                tcg_gen_xor_i64(tcg_res[pass], tcg_op2, tcg_op1);
+                break;
+            case 2: /* BIT, bitwise insert if true */
+                tcg_gen_xor_i64(tcg_op1, tcg_op1, tcg_res[pass]);
+                tcg_gen_and_i64(tcg_op1, tcg_op1, tcg_op2);
+                tcg_gen_xor_i64(tcg_res[pass], tcg_res[pass], tcg_op1);
+                break;
+            case 3: /* BIF, bitwise insert if false */
+                tcg_gen_xor_i64(tcg_op1, tcg_op1, tcg_res[pass]);
+                tcg_gen_andc_i64(tcg_op1, tcg_op1, tcg_op2);
+                tcg_gen_xor_i64(tcg_res[pass], tcg_res[pass], tcg_op1);
+                break;
+            }
+        }
+    }
+
+    write_vec_element(s, tcg_res[0], rd, 0, MO_64);
+    if (!is_q) {
+        tcg_gen_movi_i64(tcg_res[1], 0);
+    }
+    write_vec_element(s, tcg_res[1], rd, 1, MO_64);
+
+    tcg_temp_free_i64(tcg_op1);
+    tcg_temp_free_i64(tcg_op2);
+    tcg_temp_free_i64(tcg_res[0]);
+    tcg_temp_free_i64(tcg_res[1]);
 }
 
 /* Pairwise op subgroup of C3.6.16. */
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 30/38] target-arm: A64: Add integer ops from SIMD 3-same group
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (28 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 29/38] target-arm: A64: Add logic ops from SIMD 3 same group Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 31/38] target-arm: A64: Add simple SIMD 3-same floating point ops Peter Maydell
                   ` (8 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

Add some of the integer operations in the SIMD 3-same group:
specifically, the comparisons, addition and subtraction.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/translate-a64.c | 165 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 164 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index aa53ddc..9e7401c 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -72,6 +72,9 @@ typedef struct AArch64DecodeTable {
     AArch64DecodeFn *disas_fn;
 } AArch64DecodeTable;
 
+/* Function prototype for gen_ functions for calling Neon helpers */
+typedef void NeonGenTwoOpFn(TCGv_i32, TCGv_i32, TCGv_i32);
+
 /* initialize TCG globals.  */
 void a64_translate_init(void)
 {
@@ -787,6 +790,25 @@ static void write_vec_element(DisasContext *s, TCGv_i64 tcg_src, int destidx,
     }
 }
 
+static void write_vec_element_i32(DisasContext *s, TCGv_i32 tcg_src,
+                                  int destidx, int element, TCGMemOp memop)
+{
+    int vect_off = vec_reg_offset(destidx, element, memop & MO_SIZE);
+    switch (memop) {
+    case MO_8:
+        tcg_gen_st8_i32(tcg_src, cpu_env, vect_off);
+        break;
+    case MO_16:
+        tcg_gen_st16_i32(tcg_src, cpu_env, vect_off);
+        break;
+    case MO_32:
+        tcg_gen_st_i32(tcg_src, cpu_env, vect_off);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+}
+
 /* Clear the high 64 bits of a 128 bit vector (in general non-quad
  * vector ops all need to do this).
  */
@@ -6012,7 +6034,148 @@ static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
 /* Integer op subgroup of C3.6.16. */
 static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int is_q = extract32(insn, 30, 1);
+    int u = extract32(insn, 29, 1);
+    int size = extract32(insn, 22, 2);
+    int opcode = extract32(insn, 11, 5);
+    int rm = extract32(insn, 16, 5);
+    int rn = extract32(insn, 5, 5);
+    int rd = extract32(insn, 0, 5);
+    int pass;
+
+    switch (opcode) {
+    case 0x13: /* MUL, PMUL */
+        if (u && size != 0) {
+            unallocated_encoding(s);
+            return;
+        }
+        /* fall through */
+    case 0x0: /* SHADD, UHADD */
+    case 0x2: /* SRHADD, URHADD */
+    case 0x4: /* SHSUB, UHSUB */
+    case 0xc: /* SMAX, UMAX */
+    case 0xd: /* SMIN, UMIN */
+    case 0xe: /* SABD, UABD */
+    case 0xf: /* SABA, UABA */
+    case 0x12: /* MLA, MLS */
+        if (size == 3) {
+            unallocated_encoding(s);
+            return;
+        }
+        unsupported_encoding(s, insn);
+        return;
+    case 0x1: /* SQADD */
+    case 0x5: /* SQSUB */
+    case 0x8: /* SSHL, USHL */
+    case 0x9: /* SQSHL, UQSHL */
+    case 0xa: /* SRSHL, URSHL */
+    case 0xb: /* SQRSHL, UQRSHL */
+        if (size == 3 && !is_q) {
+            unallocated_encoding(s);
+            return;
+        }
+        unsupported_encoding(s, insn);
+        return;
+    case 0x16: /* SQDMULH, SQRDMULH */
+        if (size == 0 || size == 3) {
+            unallocated_encoding(s);
+            return;
+        }
+        unsupported_encoding(s, insn);
+        return;
+    default:
+        if (size == 3 && !is_q) {
+            unallocated_encoding(s);
+            return;
+        }
+        break;
+    }
+
+    if (size == 3) {
+        for (pass = 0; pass < (is_q ? 2 : 1); pass++) {
+            TCGv_i64 tcg_op1 = tcg_temp_new_i64();
+            TCGv_i64 tcg_op2 = tcg_temp_new_i64();
+            TCGv_i64 tcg_res = tcg_temp_new_i64();
+
+            read_vec_element(s, tcg_op1, rn, pass, MO_64);
+            read_vec_element(s, tcg_op2, rm, pass, MO_64);
+
+            handle_3same_64(s, opcode, u, tcg_res, tcg_op1, tcg_op2);
+
+            write_vec_element(s, tcg_res, rd, pass, MO_64);
+
+            tcg_temp_free_i64(tcg_res);
+            tcg_temp_free_i64(tcg_op1);
+            tcg_temp_free_i64(tcg_op2);
+        }
+    } else {
+        for (pass = 0; pass < (is_q ? 4 : 2); pass++) {
+            TCGv_i32 tcg_op1 = tcg_temp_new_i32();
+            TCGv_i32 tcg_op2 = tcg_temp_new_i32();
+            TCGv_i32 tcg_res = tcg_temp_new_i32();
+            NeonGenTwoOpFn *genfn;
+
+            read_vec_element_i32(s, tcg_op1, rn, pass, MO_32);
+            read_vec_element_i32(s, tcg_op2, rm, pass, MO_32);
+
+            switch (opcode) {
+            case 0x6: /* CMGT, CMHI */
+            {
+                static NeonGenTwoOpFn * const fns[3][2] = {
+                    { gen_helper_neon_cgt_s8, gen_helper_neon_cgt_u8 },
+                    { gen_helper_neon_cgt_s16, gen_helper_neon_cgt_u16 },
+                    { gen_helper_neon_cgt_s32, gen_helper_neon_cgt_u32 },
+                };
+                genfn = fns[size][u];
+                break;
+            }
+            case 0x7: /* CMGE, CMHS */
+            {
+                static NeonGenTwoOpFn * const fns[3][2] = {
+                    { gen_helper_neon_cge_s8, gen_helper_neon_cge_u8 },
+                    { gen_helper_neon_cge_s16, gen_helper_neon_cge_u16 },
+                    { gen_helper_neon_cge_s32, gen_helper_neon_cge_u32 },
+                };
+                genfn = fns[size][u];
+                break;
+            }
+            case 0x10: /* ADD, SUB */
+            {
+                static NeonGenTwoOpFn * const fns[3][2] = {
+                    { gen_helper_neon_add_u8, gen_helper_neon_sub_u8 },
+                    { gen_helper_neon_add_u16, gen_helper_neon_sub_u16 },
+                    { tcg_gen_add_i32, tcg_gen_sub_i32 },
+                };
+                genfn = fns[size][u];
+                break;
+            }
+            case 0x11: /* CMTST, CMEQ */
+            {
+                static NeonGenTwoOpFn * const fns[3][2] = {
+                    { gen_helper_neon_tst_u8, gen_helper_neon_ceq_u8 },
+                    { gen_helper_neon_tst_u16, gen_helper_neon_ceq_u16 },
+                    { gen_helper_neon_tst_u32, gen_helper_neon_ceq_u32 },
+                };
+                genfn = fns[size][u];
+                break;
+            }
+            default:
+                g_assert_not_reached();
+            }
+
+            genfn(tcg_res, tcg_op1, tcg_op2);
+
+            write_vec_element_i32(s, tcg_res, rd, pass, MO_32);
+
+            tcg_temp_free_i32(tcg_res);
+            tcg_temp_free_i32(tcg_op1);
+            tcg_temp_free_i32(tcg_op2);
+        }
+    }
+
+    if (!is_q) {
+        clear_vec_high(s, rd);
+    }
 }
 
 /* C3.6.16 AdvSIMD three same
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 31/38] target-arm: A64: Add simple SIMD 3-same floating point ops
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (29 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 30/38] target-arm: A64: Add integer ops from SIMD 3-same group Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:39 ` [Qemu-devel] [PULL 32/38] target-arm: A64: Add SIMD shift by immediate Peter Maydell
                   ` (7 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

Implement a simple subset of the SIMD 3-same floating point
operations. This includes a common helper function used for both
scalar and vector ops; FABD is the only currently implemented
shared op.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/translate-a64.c | 190 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 188 insertions(+), 2 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 9e7401c..509982c 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -5577,6 +5577,131 @@ static void handle_3same_64(DisasContext *s, int opcode, bool u,
     }
 }
 
+/* Handle the 3-same-operands float operations; shared by the scalar
+ * and vector encodings. The caller must filter out any encodings
+ * not allocated for the encoding it is dealing with.
+ */
+static void handle_3same_float(DisasContext *s, int size, int elements,
+                               int fpopcode, int rd, int rn, int rm)
+{
+    int pass;
+    TCGv_ptr fpst = get_fpstatus_ptr();
+
+    for (pass = 0; pass < elements; pass++) {
+        if (size) {
+            /* Double */
+            TCGv_i64 tcg_op1 = tcg_temp_new_i64();
+            TCGv_i64 tcg_op2 = tcg_temp_new_i64();
+            TCGv_i64 tcg_res = tcg_temp_new_i64();
+
+            read_vec_element(s, tcg_op1, rn, pass, MO_64);
+            read_vec_element(s, tcg_op2, rm, pass, MO_64);
+
+            switch (fpopcode) {
+            case 0x18: /* FMAXNM */
+                gen_helper_vfp_maxnumd(tcg_res, tcg_op1, tcg_op2, fpst);
+                break;
+            case 0x1a: /* FADD */
+                gen_helper_vfp_addd(tcg_res, tcg_op1, tcg_op2, fpst);
+                break;
+            case 0x1e: /* FMAX */
+                gen_helper_vfp_maxd(tcg_res, tcg_op1, tcg_op2, fpst);
+                break;
+            case 0x38: /* FMINNM */
+                gen_helper_vfp_minnumd(tcg_res, tcg_op1, tcg_op2, fpst);
+                break;
+            case 0x3a: /* FSUB */
+                gen_helper_vfp_subd(tcg_res, tcg_op1, tcg_op2, fpst);
+                break;
+            case 0x3e: /* FMIN */
+                gen_helper_vfp_mind(tcg_res, tcg_op1, tcg_op2, fpst);
+                break;
+            case 0x5b: /* FMUL */
+                gen_helper_vfp_muld(tcg_res, tcg_op1, tcg_op2, fpst);
+                break;
+            case 0x5f: /* FDIV */
+                gen_helper_vfp_divd(tcg_res, tcg_op1, tcg_op2, fpst);
+                break;
+            case 0x7a: /* FABD */
+                gen_helper_vfp_subd(tcg_res, tcg_op1, tcg_op2, fpst);
+                gen_helper_vfp_absd(tcg_res, tcg_res);
+                break;
+            default:
+                g_assert_not_reached();
+            }
+
+            write_vec_element(s, tcg_res, rd, pass, MO_64);
+
+            tcg_temp_free_i64(tcg_res);
+            tcg_temp_free_i64(tcg_op1);
+            tcg_temp_free_i64(tcg_op2);
+        } else {
+            /* Single */
+            TCGv_i32 tcg_op1 = tcg_temp_new_i32();
+            TCGv_i32 tcg_op2 = tcg_temp_new_i32();
+            TCGv_i32 tcg_res = tcg_temp_new_i32();
+
+            read_vec_element_i32(s, tcg_op1, rn, pass, MO_32);
+            read_vec_element_i32(s, tcg_op2, rm, pass, MO_32);
+
+            switch (fpopcode) {
+            case 0x1a: /* FADD */
+                gen_helper_vfp_adds(tcg_res, tcg_op1, tcg_op2, fpst);
+                break;
+            case 0x1e: /* FMAX */
+                gen_helper_vfp_maxs(tcg_res, tcg_op1, tcg_op2, fpst);
+                break;
+            case 0x18: /* FMAXNM */
+                gen_helper_vfp_maxnums(tcg_res, tcg_op1, tcg_op2, fpst);
+                break;
+            case 0x38: /* FMINNM */
+                gen_helper_vfp_minnums(tcg_res, tcg_op1, tcg_op2, fpst);
+                break;
+            case 0x3a: /* FSUB */
+                gen_helper_vfp_subs(tcg_res, tcg_op1, tcg_op2, fpst);
+                break;
+            case 0x3e: /* FMIN */
+                gen_helper_vfp_mins(tcg_res, tcg_op1, tcg_op2, fpst);
+                break;
+            case 0x5b: /* FMUL */
+                gen_helper_vfp_muls(tcg_res, tcg_op1, tcg_op2, fpst);
+                break;
+            case 0x5f: /* FDIV */
+                gen_helper_vfp_divs(tcg_res, tcg_op1, tcg_op2, fpst);
+                break;
+            case 0x7a: /* FABD */
+                gen_helper_vfp_subs(tcg_res, tcg_op1, tcg_op2, fpst);
+                gen_helper_vfp_abss(tcg_res, tcg_res);
+                break;
+            default:
+                g_assert_not_reached();
+            }
+
+            if (elements == 1) {
+                /* scalar single so clear high part */
+                TCGv_i64 tcg_tmp = tcg_temp_new_i64();
+
+                tcg_gen_extu_i32_i64(tcg_tmp, tcg_res);
+                write_vec_element(s, tcg_tmp, rd, pass, MO_64);
+                tcg_temp_free_i64(tcg_tmp);
+            } else {
+                write_vec_element_i32(s, tcg_res, rd, pass, MO_32);
+            }
+
+            tcg_temp_free_i32(tcg_res);
+            tcg_temp_free_i32(tcg_op1);
+            tcg_temp_free_i32(tcg_op2);
+        }
+    }
+
+    tcg_temp_free_ptr(fpst);
+
+    if ((elements << size) < 4) {
+        /* scalar, or non-quad vector op */
+        clear_vec_high(s, rd);
+    }
+}
+
 /* C3.6.11 AdvSIMD scalar three same
  *  31 30  29 28       24 23  22  21 20  16 15    11  10 9    5 4    0
  * +-----+---+-----------+------+---+------+--------+---+------+------+
@@ -5605,15 +5730,19 @@ static void disas_simd_scalar_three_reg_same(DisasContext *s, uint32_t insn)
         case 0x3f: /* FRSQRTS */
         case 0x5c: /* FCMGE */
         case 0x5d: /* FACGE */
-        case 0x7a: /* FABD  */
         case 0x7c: /* FCMGT */
         case 0x7d: /* FACGT */
             unsupported_encoding(s, insn);
             return;
+        case 0x7a: /* FABD */
+            break;
         default:
             unallocated_encoding(s);
             return;
         }
+
+        handle_3same_float(s, extract32(size, 0, 1), 1, fpopcode, rd, rn, rm);
+        return;
     }
 
     switch (opcode) {
@@ -6028,7 +6157,64 @@ static void disas_simd_3same_pair(DisasContext *s, uint32_t insn)
 /* Floating point op subgroup of C3.6.16. */
 static void disas_simd_3same_float(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    /* For floating point ops, the U, size[1] and opcode bits
+     * together indicate the operation. size[0] indicates single
+     * or double.
+     */
+    int fpopcode = extract32(insn, 11, 5)
+        | (extract32(insn, 23, 1) << 5)
+        | (extract32(insn, 29, 1) << 6);
+    int is_q = extract32(insn, 30, 1);
+    int size = extract32(insn, 22, 1);
+    int rm = extract32(insn, 16, 5);
+    int rn = extract32(insn, 5, 5);
+    int rd = extract32(insn, 0, 5);
+
+    int datasize = is_q ? 128 : 64;
+    int esize = 32 << size;
+    int elements = datasize / esize;
+
+    if (size == 1 && !is_q) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    switch (fpopcode) {
+    case 0x58: /* FMAXNMP */
+    case 0x5a: /* FADDP */
+    case 0x5e: /* FMAXP */
+    case 0x78: /* FMINNMP */
+    case 0x7e: /* FMINP */
+        /* pairwise ops */
+        unsupported_encoding(s, insn);
+        return;
+    case 0x1b: /* FMULX */
+    case 0x1c: /* FCMEQ */
+    case 0x1f: /* FRECPS */
+    case 0x3f: /* FRSQRTS */
+    case 0x5c: /* FCMGE */
+    case 0x5d: /* FACGE */
+    case 0x7c: /* FCMGT */
+    case 0x7d: /* FACGT */
+    case 0x19: /* FMLA */
+    case 0x39: /* FMLS */
+        unsupported_encoding(s, insn);
+        return;
+    case 0x18: /* FMAXNM */
+    case 0x1a: /* FADD */
+    case 0x1e: /* FMAX */
+    case 0x38: /* FMINNM */
+    case 0x3a: /* FSUB */
+    case 0x3e: /* FMIN */
+    case 0x5b: /* FMUL */
+    case 0x5f: /* FDIV */
+    case 0x7a: /* FABD */
+        handle_3same_float(s, size, elements, fpopcode, rd, rn, rm);
+        return;
+    default:
+        unallocated_encoding(s);
+        return;
+    }
 }
 
 /* Integer op subgroup of C3.6.16. */
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 32/38] target-arm: A64: Add SIMD shift by immediate
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (30 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 31/38] target-arm: A64: Add simple SIMD 3-same floating point ops Peter Maydell
@ 2014-01-29 13:39 ` Peter Maydell
  2014-01-29 13:40 ` [Qemu-devel] [PULL 33/38] linux-headers: Update from Linus' master ba635f8 Peter Maydell
                   ` (6 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Alex Bennée <alex.bennee@linaro.org>

This implements a subset of the AdvSIMD shift operations (namely all the
none saturating or narrowing ones). The actual shift generation code
itself is common for both the scalar and vector cases but wrapped with
either vector element iteration or the fp reg access.

The rounding operations need to take special care to correctly reflect
the result of adding rounding bits on high bits as the intermediates do
not truncate.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/translate-a64.c | 375 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 373 insertions(+), 2 deletions(-)

diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 509982c..6c1ec1e 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -5503,15 +5503,216 @@ static void disas_simd_scalar_pairwise(DisasContext *s, uint32_t insn)
     unsupported_encoding(s, insn);
 }
 
+/*
+ * Common SSHR[RA]/USHR[RA] - Shift right (optional rounding/accumulate)
+ *
+ * This code is handles the common shifting code and is used by both
+ * the vector and scalar code.
+ */
+static void handle_shri_with_rndacc(TCGv_i64 tcg_res, TCGv_i64 tcg_src,
+                                    TCGv_i64 tcg_rnd, bool accumulate,
+                                    bool is_u, int size, int shift)
+{
+    bool extended_result = false;
+    bool round = !TCGV_IS_UNUSED_I64(tcg_rnd);
+    int ext_lshift = 0;
+    TCGv_i64 tcg_src_hi;
+
+    if (round && size == 3) {
+        extended_result = true;
+        ext_lshift = 64 - shift;
+        tcg_src_hi = tcg_temp_new_i64();
+    } else if (shift == 64) {
+        if (!accumulate && is_u) {
+            /* result is zero */
+            tcg_gen_movi_i64(tcg_res, 0);
+            return;
+        }
+    }
+
+    /* Deal with the rounding step */
+    if (round) {
+        if (extended_result) {
+            TCGv_i64 tcg_zero = tcg_const_i64(0);
+            if (!is_u) {
+                /* take care of sign extending tcg_res */
+                tcg_gen_sari_i64(tcg_src_hi, tcg_src, 63);
+                tcg_gen_add2_i64(tcg_src, tcg_src_hi,
+                                 tcg_src, tcg_src_hi,
+                                 tcg_rnd, tcg_zero);
+            } else {
+                tcg_gen_add2_i64(tcg_src, tcg_src_hi,
+                                 tcg_src, tcg_zero,
+                                 tcg_rnd, tcg_zero);
+            }
+            tcg_temp_free_i64(tcg_zero);
+        } else {
+            tcg_gen_add_i64(tcg_src, tcg_src, tcg_rnd);
+        }
+    }
+
+    /* Now do the shift right */
+    if (round && extended_result) {
+        /* extended case, >64 bit precision required */
+        if (ext_lshift == 0) {
+            /* special case, only high bits matter */
+            tcg_gen_mov_i64(tcg_src, tcg_src_hi);
+        } else {
+            tcg_gen_shri_i64(tcg_src, tcg_src, shift);
+            tcg_gen_shli_i64(tcg_src_hi, tcg_src_hi, ext_lshift);
+            tcg_gen_or_i64(tcg_src, tcg_src, tcg_src_hi);
+        }
+    } else {
+        if (is_u) {
+            if (shift == 64) {
+                /* essentially shifting in 64 zeros */
+                tcg_gen_movi_i64(tcg_src, 0);
+            } else {
+                tcg_gen_shri_i64(tcg_src, tcg_src, shift);
+            }
+        } else {
+            if (shift == 64) {
+                /* effectively extending the sign-bit */
+                tcg_gen_sari_i64(tcg_src, tcg_src, 63);
+            } else {
+                tcg_gen_sari_i64(tcg_src, tcg_src, shift);
+            }
+        }
+    }
+
+    if (accumulate) {
+        tcg_gen_add_i64(tcg_res, tcg_res, tcg_src);
+    } else {
+        tcg_gen_mov_i64(tcg_res, tcg_src);
+    }
+
+    if (extended_result) {
+        tcg_temp_free_i64(tcg_src_hi);
+    }
+}
+
+/* Common SHL/SLI - Shift left with an optional insert */
+static void handle_shli_with_ins(TCGv_i64 tcg_res, TCGv_i64 tcg_src,
+                                 bool insert, int shift)
+{
+    if (insert) { /* SLI */
+        tcg_gen_deposit_i64(tcg_res, tcg_res, tcg_src, shift, 64 - shift);
+    } else { /* SHL */
+        tcg_gen_shli_i64(tcg_res, tcg_src, shift);
+    }
+}
+
+/* SSHR[RA]/USHR[RA] - Scalar shift right (optional rounding/accumulate) */
+static void handle_scalar_simd_shri(DisasContext *s,
+                                    bool is_u, int immh, int immb,
+                                    int opcode, int rn, int rd)
+{
+    const int size = 3;
+    int immhb = immh << 3 | immb;
+    int shift = 2 * (8 << size) - immhb;
+    bool accumulate = false;
+    bool round = false;
+    TCGv_i64 tcg_rn;
+    TCGv_i64 tcg_rd;
+    TCGv_i64 tcg_round;
+
+    if (!extract32(immh, 3, 1)) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    switch (opcode) {
+    case 0x02: /* SSRA / USRA (accumulate) */
+        accumulate = true;
+        break;
+    case 0x04: /* SRSHR / URSHR (rounding) */
+        round = true;
+        break;
+    case 0x06: /* SRSRA / URSRA (accum + rounding) */
+        accumulate = round = true;
+        break;
+    }
+
+    if (round) {
+        uint64_t round_const = 1ULL << (shift - 1);
+        tcg_round = tcg_const_i64(round_const);
+    } else {
+        TCGV_UNUSED_I64(tcg_round);
+    }
+
+    tcg_rn = read_fp_dreg(s, rn);
+    tcg_rd = accumulate ? read_fp_dreg(s, rd) : tcg_temp_new_i64();
+
+    handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round,
+                               accumulate, is_u, size, shift);
+
+    write_fp_dreg(s, rd, tcg_rd);
+
+    tcg_temp_free_i64(tcg_rn);
+    tcg_temp_free_i64(tcg_rd);
+    if (round) {
+        tcg_temp_free_i64(tcg_round);
+    }
+}
+
+/* SHL/SLI - Scalar shift left */
+static void handle_scalar_simd_shli(DisasContext *s, bool insert,
+                                    int immh, int immb, int opcode,
+                                    int rn, int rd)
+{
+    int size = 32 - clz32(immh) - 1;
+    int immhb = immh << 3 | immb;
+    int shift = immhb - (8 << size);
+    TCGv_i64 tcg_rn = new_tmp_a64(s);
+    TCGv_i64 tcg_rd = new_tmp_a64(s);
+
+    if (!extract32(immh, 3, 1)) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    tcg_rn = read_fp_dreg(s, rn);
+    tcg_rd = insert ? read_fp_dreg(s, rd) : tcg_temp_new_i64();
+
+    handle_shli_with_ins(tcg_rd, tcg_rn, insert, shift);
+
+    write_fp_dreg(s, rd, tcg_rd);
+
+    tcg_temp_free_i64(tcg_rn);
+    tcg_temp_free_i64(tcg_rd);
+}
+
 /* C3.6.9 AdvSIMD scalar shift by immediate
  *  31 30  29 28         23 22  19 18  16 15    11  10 9    5 4    0
  * +-----+---+-------------+------+------+--------+---+------+------+
  * | 0 1 | U | 1 1 1 1 1 0 | immh | immb | opcode | 1 |  Rn  |  Rd  |
  * +-----+---+-------------+------+------+--------+---+------+------+
+ *
+ * This is the scalar version so it works on a fixed sized registers
  */
 static void disas_simd_scalar_shift_imm(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int rd = extract32(insn, 0, 5);
+    int rn = extract32(insn, 5, 5);
+    int opcode = extract32(insn, 11, 5);
+    int immb = extract32(insn, 16, 3);
+    int immh = extract32(insn, 19, 4);
+    bool is_u = extract32(insn, 29, 1);
+
+    switch (opcode) {
+    case 0x00: /* SSHR / USHR */
+    case 0x02: /* SSRA / USRA */
+    case 0x04: /* SRSHR / URSHR */
+    case 0x06: /* SRSRA / URSRA */
+        handle_scalar_simd_shri(s, is_u, immh, immb, opcode, rn, rd);
+        break;
+    case 0x0a: /* SHL / SLI */
+        handle_scalar_simd_shli(s, is_u, immh, immb, opcode, rn, rd);
+        break;
+    default:
+        unsupported_encoding(s, insn);
+        break;
+    }
 }
 
 /* C3.6.10 AdvSIMD scalar three different
@@ -5816,6 +6017,148 @@ static void disas_simd_scalar_indexed(DisasContext *s, uint32_t insn)
     unsupported_encoding(s, insn);
 }
 
+/* SSHR[RA]/USHR[RA] - Vector shift right (optional rounding/accumulate) */
+static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
+                                 int immh, int immb, int opcode, int rn, int rd)
+{
+    int size = 32 - clz32(immh) - 1;
+    int immhb = immh << 3 | immb;
+    int shift = 2 * (8 << size) - immhb;
+    bool accumulate = false;
+    bool round = false;
+    int dsize = is_q ? 128 : 64;
+    int esize = 8 << size;
+    int elements = dsize/esize;
+    TCGMemOp memop = size | (is_u ? 0 : MO_SIGN);
+    TCGv_i64 tcg_rn = new_tmp_a64(s);
+    TCGv_i64 tcg_rd = new_tmp_a64(s);
+    TCGv_i64 tcg_round;
+    int i;
+
+    if (extract32(immh, 3, 1) && !is_q) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    if (size > 3 && !is_q) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    switch (opcode) {
+    case 0x02: /* SSRA / USRA (accumulate) */
+        accumulate = true;
+        break;
+    case 0x04: /* SRSHR / URSHR (rounding) */
+        round = true;
+        break;
+    case 0x06: /* SRSRA / URSRA (accum + rounding) */
+        accumulate = round = true;
+        break;
+    }
+
+    if (round) {
+        uint64_t round_const = 1ULL << (shift - 1);
+        tcg_round = tcg_const_i64(round_const);
+    } else {
+        TCGV_UNUSED_I64(tcg_round);
+    }
+
+    for (i = 0; i < elements; i++) {
+        read_vec_element(s, tcg_rn, rn, i, memop);
+        if (accumulate) {
+            read_vec_element(s, tcg_rd, rd, i, memop);
+        }
+
+        handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round,
+                                accumulate, is_u, size, shift);
+
+        write_vec_element(s, tcg_rd, rd, i, size);
+    }
+
+    if (!is_q) {
+        clear_vec_high(s, rd);
+    }
+
+    if (round) {
+        tcg_temp_free_i64(tcg_round);
+    }
+}
+
+/* SHL/SLI - Vector shift left */
+static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert,
+                                int immh, int immb, int opcode, int rn, int rd)
+{
+    int size = 32 - clz32(immh) - 1;
+    int immhb = immh << 3 | immb;
+    int shift = immhb - (8 << size);
+    int dsize = is_q ? 128 : 64;
+    int esize = 8 << size;
+    int elements = dsize/esize;
+    TCGv_i64 tcg_rn = new_tmp_a64(s);
+    TCGv_i64 tcg_rd = new_tmp_a64(s);
+    int i;
+
+    if (extract32(immh, 3, 1) && !is_q) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    if (size > 3 && !is_q) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    for (i = 0; i < elements; i++) {
+        read_vec_element(s, tcg_rn, rn, i, size);
+        if (insert) {
+            read_vec_element(s, tcg_rd, rd, i, size);
+        }
+
+        handle_shli_with_ins(tcg_rd, tcg_rn, insert, shift);
+
+        write_vec_element(s, tcg_rd, rd, i, size);
+    }
+
+    if (!is_q) {
+        clear_vec_high(s, rd);
+    }
+}
+
+/* USHLL/SHLL - Vector shift left with widening */
+static void handle_vec_simd_wshli(DisasContext *s, bool is_q, bool is_u,
+                                 int immh, int immb, int opcode, int rn, int rd)
+{
+    int size = 32 - clz32(immh) - 1;
+    int immhb = immh << 3 | immb;
+    int shift = immhb - (8 << size);
+    int dsize = 64;
+    int esize = 8 << size;
+    int elements = dsize/esize;
+    TCGv_i64 tcg_rn = new_tmp_a64(s);
+    TCGv_i64 tcg_rd = new_tmp_a64(s);
+    int i;
+
+    if (size >= 3) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    /* For the LL variants the store is larger than the load,
+     * so if rd == rn we would overwrite parts of our input.
+     * So load everything right now and use shifts in the main loop.
+     */
+    read_vec_element(s, tcg_rn, rn, is_q ? 1 : 0, MO_64);
+
+    for (i = 0; i < elements; i++) {
+        tcg_gen_shri_i64(tcg_rd, tcg_rn, i * esize);
+        ext_and_shift_reg(tcg_rd, tcg_rd, size | (!is_u << 2), 0);
+        tcg_gen_shli_i64(tcg_rd, tcg_rd, shift);
+        write_vec_element(s, tcg_rd, rd, i, size + 1);
+    }
+}
+
+
 /* C3.6.14 AdvSIMD shift by immediate
  *  31  30   29 28         23 22  19 18  16 15    11  10 9    5 4    0
  * +---+---+---+-------------+------+------+--------+---+------+------+
@@ -5824,7 +6167,35 @@ static void disas_simd_scalar_indexed(DisasContext *s, uint32_t insn)
  */
 static void disas_simd_shift_imm(DisasContext *s, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    int rd = extract32(insn, 0, 5);
+    int rn = extract32(insn, 5, 5);
+    int opcode = extract32(insn, 11, 5);
+    int immb = extract32(insn, 16, 3);
+    int immh = extract32(insn, 19, 4);
+    bool is_u = extract32(insn, 29, 1);
+    bool is_q = extract32(insn, 30, 1);
+
+    switch (opcode) {
+    case 0x00: /* SSHR / USHR */
+    case 0x02: /* SSRA / USRA (accumulate) */
+    case 0x04: /* SRSHR / URSHR (rounding) */
+    case 0x06: /* SRSRA / URSRA (accum + rounding) */
+        handle_vec_simd_shri(s, is_q, is_u, immh, immb, opcode, rn, rd);
+        break;
+    case 0x0a: /* SHL / SLI */
+        handle_vec_simd_shli(s, is_q, is_u, immh, immb, opcode, rn, rd);
+        break;
+    case 0x14: /* SSHLL / USHLL */
+        handle_vec_simd_wshli(s, is_q, is_u, immh, immb, opcode, rn, rd);
+        break;
+    default:
+        /* We don't currently implement any of the Narrow or saturating shifts;
+         * nor do we implement the fixed-point conversions in this
+         * encoding group (SCVTF, FCVTZS, UCVTF, FCVTZU).
+         */
+        unsupported_encoding(s, insn);
+        return;
+    }
 }
 
 static void handle_3rd_widening(DisasContext *s, int is_q, int is_u, int size,
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 33/38] linux-headers: Update from Linus' master ba635f8
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (31 preceding siblings ...)
  2014-01-29 13:39 ` [Qemu-devel] [PULL 32/38] target-arm: A64: Add SIMD shift by immediate Peter Maydell
@ 2014-01-29 13:40 ` Peter Maydell
  2014-01-29 13:40 ` [Qemu-devel] [PULL 34/38] kvm: Introduce kvm_arch_irqchip_create Peter Maydell
                   ` (5 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:40 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Christoffer Dall <christoffer.dall@linaro.org>

Update to upstream commit
ba635f8cd20ebc7bddf1eb8e1f4eae28a034e916

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 linux-headers/asm-arm/kvm.h    | 28 ++++++++++++++++++++++++++++
 linux-headers/asm-arm64/kvm.h  | 21 ++++++++++++++++++++-
 linux-headers/asm-x86/hyperv.h | 13 +++++++++++++
 linux-headers/linux/kvm.h      |  2 ++
 4 files changed, 63 insertions(+), 1 deletion(-)

diff --git a/linux-headers/asm-arm/kvm.h b/linux-headers/asm-arm/kvm.h
index c498b60..ef0c878 100644
--- a/linux-headers/asm-arm/kvm.h
+++ b/linux-headers/asm-arm/kvm.h
@@ -119,6 +119,26 @@ struct kvm_arch_memory_slot {
 #define KVM_REG_ARM_32_CRN_MASK		0x0000000000007800
 #define KVM_REG_ARM_32_CRN_SHIFT	11
 
+#define ARM_CP15_REG_SHIFT_MASK(x,n) \
+	(((x) << KVM_REG_ARM_ ## n ## _SHIFT) & KVM_REG_ARM_ ## n ## _MASK)
+
+#define __ARM_CP15_REG(op1,crn,crm,op2) \
+	(KVM_REG_ARM | (15 << KVM_REG_ARM_COPROC_SHIFT) | \
+	ARM_CP15_REG_SHIFT_MASK(op1, OPC1) | \
+	ARM_CP15_REG_SHIFT_MASK(crn, 32_CRN) | \
+	ARM_CP15_REG_SHIFT_MASK(crm, CRM) | \
+	ARM_CP15_REG_SHIFT_MASK(op2, 32_OPC2))
+
+#define ARM_CP15_REG32(...) (__ARM_CP15_REG(__VA_ARGS__) | KVM_REG_SIZE_U32)
+
+#define __ARM_CP15_REG64(op1,crm) \
+	(__ARM_CP15_REG(op1, 0, crm, 0) | KVM_REG_SIZE_U64)
+#define ARM_CP15_REG64(...) __ARM_CP15_REG64(__VA_ARGS__)
+
+#define KVM_REG_ARM_TIMER_CTL		ARM_CP15_REG32(0, 14, 3, 1)
+#define KVM_REG_ARM_TIMER_CNT		ARM_CP15_REG64(1, 14) 
+#define KVM_REG_ARM_TIMER_CVAL		ARM_CP15_REG64(3, 14) 
+
 /* Normal registers are mapped as coprocessor 16. */
 #define KVM_REG_ARM_CORE		(0x0010 << KVM_REG_ARM_COPROC_SHIFT)
 #define KVM_REG_ARM_CORE_REG(name)	(offsetof(struct kvm_regs, name) / 4)
@@ -143,6 +163,14 @@ struct kvm_arch_memory_slot {
 #define KVM_REG_ARM_VFP_FPINST		0x1009
 #define KVM_REG_ARM_VFP_FPINST2		0x100A
 
+/* Device Control API: ARM VGIC */
+#define KVM_DEV_ARM_VGIC_GRP_ADDR	0
+#define KVM_DEV_ARM_VGIC_GRP_DIST_REGS	1
+#define KVM_DEV_ARM_VGIC_GRP_CPU_REGS	2
+#define   KVM_DEV_ARM_VGIC_CPUID_SHIFT	32
+#define   KVM_DEV_ARM_VGIC_CPUID_MASK	(0xffULL << KVM_DEV_ARM_VGIC_CPUID_SHIFT)
+#define   KVM_DEV_ARM_VGIC_OFFSET_SHIFT	0
+#define   KVM_DEV_ARM_VGIC_OFFSET_MASK	(0xffffffffULL << KVM_DEV_ARM_VGIC_OFFSET_SHIFT)
 
 /* KVM_IRQ_LINE irq field index values */
 #define KVM_ARM_IRQ_TYPE_SHIFT		24
diff --git a/linux-headers/asm-arm64/kvm.h b/linux-headers/asm-arm64/kvm.h
index 5031f42..495ab6f 100644
--- a/linux-headers/asm-arm64/kvm.h
+++ b/linux-headers/asm-arm64/kvm.h
@@ -55,8 +55,9 @@ struct kvm_regs {
 #define KVM_ARM_TARGET_AEM_V8		0
 #define KVM_ARM_TARGET_FOUNDATION_V8	1
 #define KVM_ARM_TARGET_CORTEX_A57	2
+#define KVM_ARM_TARGET_XGENE_POTENZA	3
 
-#define KVM_ARM_NUM_TARGETS		3
+#define KVM_ARM_NUM_TARGETS		4
 
 /* KVM_ARM_SET_DEVICE_ADDR ioctl id encoding */
 #define KVM_ARM_DEVICE_TYPE_SHIFT	0
@@ -129,6 +130,24 @@ struct kvm_arch_memory_slot {
 #define KVM_REG_ARM64_SYSREG_OP2_MASK	0x0000000000000007
 #define KVM_REG_ARM64_SYSREG_OP2_SHIFT	0
 
+#define ARM64_SYS_REG_SHIFT_MASK(x,n) \
+	(((x) << KVM_REG_ARM64_SYSREG_ ## n ## _SHIFT) & \
+	KVM_REG_ARM64_SYSREG_ ## n ## _MASK)
+
+#define __ARM64_SYS_REG(op0,op1,crn,crm,op2) \
+	(KVM_REG_ARM64 | KVM_REG_ARM64_SYSREG | \
+	ARM64_SYS_REG_SHIFT_MASK(op0, OP0) | \
+	ARM64_SYS_REG_SHIFT_MASK(op1, OP1) | \
+	ARM64_SYS_REG_SHIFT_MASK(crn, CRN) | \
+	ARM64_SYS_REG_SHIFT_MASK(crm, CRM) | \
+	ARM64_SYS_REG_SHIFT_MASK(op2, OP2))
+
+#define ARM64_SYS_REG(...) (__ARM64_SYS_REG(__VA_ARGS__) | KVM_REG_SIZE_U64)
+
+#define KVM_REG_ARM_TIMER_CTL		ARM64_SYS_REG(3, 3, 14, 3, 1)
+#define KVM_REG_ARM_TIMER_CNT		ARM64_SYS_REG(3, 3, 14, 3, 2)
+#define KVM_REG_ARM_TIMER_CVAL		ARM64_SYS_REG(3, 3, 14, 0, 2)
+
 /* KVM_IRQ_LINE irq field index values */
 #define KVM_ARM_IRQ_TYPE_SHIFT		24
 #define KVM_ARM_IRQ_TYPE_MASK		0xff
diff --git a/linux-headers/asm-x86/hyperv.h b/linux-headers/asm-x86/hyperv.h
index b8f1c01..462efe7 100644
--- a/linux-headers/asm-x86/hyperv.h
+++ b/linux-headers/asm-x86/hyperv.h
@@ -28,6 +28,9 @@
 /* Partition Reference Counter (HV_X64_MSR_TIME_REF_COUNT) available*/
 #define HV_X64_MSR_TIME_REF_COUNT_AVAILABLE	(1 << 1)
 
+/* A partition's reference time stamp counter (TSC) page */
+#define HV_X64_MSR_REFERENCE_TSC		0x40000021
+
 /*
  * There is a single feature flag that signifies the presence of the MSR
  * that can be used to retrieve both the local APIC Timer frequency as
@@ -198,6 +201,9 @@
 #define HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_MASK	\
 		(~((1ull << HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT) - 1))
 
+#define HV_X64_MSR_TSC_REFERENCE_ENABLE		0x00000001
+#define HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT	12
+
 #define HV_PROCESSOR_POWER_STATE_C0		0
 #define HV_PROCESSOR_POWER_STATE_C1		1
 #define HV_PROCESSOR_POWER_STATE_C2		2
@@ -210,4 +216,11 @@
 #define HV_STATUS_INVALID_ALIGNMENT		4
 #define HV_STATUS_INSUFFICIENT_BUFFERS		19
 
+typedef struct _HV_REFERENCE_TSC_PAGE {
+	__u32 tsc_sequence;
+	__u32 res1;
+	__u64 tsc_scale;
+	__s64 tsc_offset;
+} HV_REFERENCE_TSC_PAGE, *PHV_REFERENCE_TSC_PAGE;
+
 #endif
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 5a49671..77ad35c 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -674,6 +674,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_ARM_EL1_32BIT 93
 #define KVM_CAP_SPAPR_MULTITCE 94
 #define KVM_CAP_EXT_EMUL_CPUID 95
+#define KVM_CAP_HYPERV_TIME 96
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -853,6 +854,7 @@ struct kvm_device_attr {
 #define  KVM_DEV_VFIO_GROUP			1
 #define   KVM_DEV_VFIO_GROUP_ADD			1
 #define   KVM_DEV_VFIO_GROUP_DEL			2
+#define KVM_DEV_TYPE_ARM_VGIC_V2	5
 
 /*
  * ioctls for VM fds
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 34/38] kvm: Introduce kvm_arch_irqchip_create
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (32 preceding siblings ...)
  2014-01-29 13:40 ` [Qemu-devel] [PULL 33/38] linux-headers: Update from Linus' master ba635f8 Peter Maydell
@ 2014-01-29 13:40 ` Peter Maydell
  2014-01-29 13:40 ` [Qemu-devel] [PULL 35/38] kvm: Common device control API functions Peter Maydell
                   ` (4 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:40 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Christoffer Dall <christoffer.dall@linaro.org>

Introduce kvm_arch_irqchip_create an arch-specific hook in preparation
for architecture-specific use of the device control API to create IRQ
chips.

Following patches will implement the ARM irqchip create method to prefer
the device control API over the older KVM_CREATE_IRQCHIP API.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/sysemu/kvm.h | 12 ++++++++++++
 kvm-all.c            | 11 +++++++++--
 stubs/Makefile.objs  |  1 +
 stubs/kvm.c          |  7 +++++++
 4 files changed, 29 insertions(+), 2 deletions(-)
 create mode 100644 stubs/kvm.c

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 3b25f27..e4e43b8 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -319,4 +319,16 @@ int kvm_irqchip_remove_irqfd_notifier(KVMState *s, EventNotifier *n, int virq);
 void kvm_pc_gsi_handler(void *opaque, int n, int level);
 void kvm_pc_setup_irq_routing(bool pci_enabled);
 void kvm_init_irq_routing(KVMState *s);
+
+/**
+ * kvm_arch_irqchip_create:
+ * @KVMState: The KVMState pointer
+ *
+ * Allow architectures to create an in-kernel irq chip themselves.
+ *
+ * Returns: < 0: error
+ *            0: irq chip was not created
+ *          > 0: irq chip was created
+ */
+int kvm_arch_irqchip_create(KVMState *s);
 #endif
diff --git a/kvm-all.c b/kvm-all.c
index a3fb8de..a890152 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1298,10 +1298,17 @@ static int kvm_irqchip_create(KVMState *s)
         return 0;
     }
 
-    ret = kvm_vm_ioctl(s, KVM_CREATE_IRQCHIP);
+    /* First probe and see if there's a arch-specific hook to create the
+     * in-kernel irqchip for us */
+    ret = kvm_arch_irqchip_create(s);
     if (ret < 0) {
-        fprintf(stderr, "Create kernel irqchip failed\n");
         return ret;
+    } else if (ret == 0) {
+        ret = kvm_vm_ioctl(s, KVM_CREATE_IRQCHIP);
+        if (ret < 0) {
+            fprintf(stderr, "Create kernel irqchip failed\n");
+            return ret;
+        }
     }
 
     kvm_kernel_irqchip = true;
diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
index df92fe5..df3aa7a 100644
--- a/stubs/Makefile.objs
+++ b/stubs/Makefile.objs
@@ -27,3 +27,4 @@ stub-obj-y += vm-stop.o
 stub-obj-y += vmstate.o
 stub-obj-$(CONFIG_WIN32) += fd-register.o
 stub-obj-y += cpus.o
+stub-obj-y += kvm.o
diff --git a/stubs/kvm.c b/stubs/kvm.c
new file mode 100644
index 0000000..e7c60b6
--- /dev/null
+++ b/stubs/kvm.c
@@ -0,0 +1,7 @@
+#include "qemu-common.h"
+#include "sysemu/kvm.h"
+
+int kvm_arch_irqchip_create(KVMState *s)
+{
+    return 0;
+}
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 35/38] kvm: Common device control API functions
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (33 preceding siblings ...)
  2014-01-29 13:40 ` [Qemu-devel] [PULL 34/38] kvm: Introduce kvm_arch_irqchip_create Peter Maydell
@ 2014-01-29 13:40 ` Peter Maydell
  2014-01-29 13:40 ` [Qemu-devel] [PULL 36/38] arm: vgic device control api support Peter Maydell
                   ` (3 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:40 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Christoffer Dall <christoffer.dall@linaro.org>

Introduces two simple functions:
    int kvm_device_ioctl(int fd, int type, ...);
    int kvm_create_device(KVMState *s, uint64_t type, bool test);

These functions wrap the basic ioctl-based interactions with KVM in a
way similar to other KVM ioctl wrappers.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 include/sysemu/kvm.h | 22 ++++++++++++++++++++++
 kvm-all.c            | 39 +++++++++++++++++++++++++++++++++++++++
 trace-events         |  1 +
 3 files changed, 62 insertions(+)

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index e4e43b8..a02d67c 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -194,6 +194,28 @@ int kvm_vm_ioctl(KVMState *s, int type, ...);
 
 int kvm_vcpu_ioctl(CPUState *cpu, int type, ...);
 
+/**
+ * kvm_device_ioctl - call an ioctl on a kvm device
+ * @fd: The KVM device file descriptor as returned from KVM_CREATE_DEVICE
+ * @type: The device-ctrl ioctl number
+ *
+ * Returns: -errno on error, nonnegative on success
+ */
+int kvm_device_ioctl(int fd, int type, ...);
+
+/**
+ * kvm_create_device - create a KVM device for the device control API
+ * @KVMState: The KVMState pointer
+ * @type: The KVM device type (see Documentation/virtual/kvm/devices in the
+ *        kernel source)
+ * @test: If true, only test if device can be created, but don't actually
+ *        create the device.
+ *
+ * Returns: -errno on error, nonnegative on success: @test ? 0 : device fd;
+ */
+int kvm_create_device(KVMState *s, uint64_t type, bool test);
+
+
 /* Arch specific hooks */
 
 extern const KVMCapabilityInfo kvm_arch_required_capabilities[];
diff --git a/kvm-all.c b/kvm-all.c
index a890152..8310c1f 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -1784,6 +1784,24 @@ int kvm_vcpu_ioctl(CPUState *cpu, int type, ...)
     return ret;
 }
 
+int kvm_device_ioctl(int fd, int type, ...)
+{
+    int ret;
+    void *arg;
+    va_list ap;
+
+    va_start(ap, type);
+    arg = va_arg(ap, void *);
+    va_end(ap);
+
+    trace_kvm_device_ioctl(fd, type, arg);
+    ret = ioctl(fd, type, arg);
+    if (ret == -1) {
+        ret = -errno;
+    }
+    return ret;
+}
+
 int kvm_has_sync_mmu(void)
 {
     return kvm_check_extension(kvm_state, KVM_CAP_SYNC_MMU);
@@ -2065,3 +2083,24 @@ int kvm_on_sigbus(int code, void *addr)
 {
     return kvm_arch_on_sigbus(code, addr);
 }
+
+int kvm_create_device(KVMState *s, uint64_t type, bool test)
+{
+    int ret;
+    struct kvm_create_device create_dev;
+
+    create_dev.type = type;
+    create_dev.fd = -1;
+    create_dev.flags = test ? KVM_CREATE_DEVICE_TEST : 0;
+
+    if (!kvm_check_extension(s, KVM_CAP_DEVICE_CTRL)) {
+        return -ENOTSUP;
+    }
+
+    ret = kvm_vm_ioctl(s, KVM_CREATE_DEVICE, &create_dev);
+    if (ret) {
+        return ret;
+    }
+
+    return test ? 0 : create_dev.fd;
+}
diff --git a/trace-events b/trace-events
index 1b668d1..cb827ad 100644
--- a/trace-events
+++ b/trace-events
@@ -1170,6 +1170,7 @@ kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"
 kvm_vm_ioctl(int type, void *arg) "type 0x%x, arg %p"
 kvm_vcpu_ioctl(int cpu_index, int type, void *arg) "cpu_index %d, type 0x%x, arg %p"
 kvm_run_exit(int cpu_index, uint32_t reason) "cpu_index %d, reason %d"
+kvm_device_ioctl(int fd, int type, void *arg) "dev fd %d, type 0x%x, arg %p"
 
 # memory.c
 memory_region_ops_read(void *mr, uint64_t addr, uint64_t value, unsigned size) "mr %p addr %#"PRIx64" value %#"PRIx64" size %u"
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 36/38] arm: vgic device control api support
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (34 preceding siblings ...)
  2014-01-29 13:40 ` [Qemu-devel] [PULL 35/38] kvm: Common device control API functions Peter Maydell
@ 2014-01-29 13:40 ` Peter Maydell
  2014-01-29 13:40 ` [Qemu-devel] [PULL 37/38] arm_gic: Introduce define for GIC_NR_SGIS Peter Maydell
                   ` (2 subsequent siblings)
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:40 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Christoffer Dall <christoffer.dall@linaro.org>

Support creating the ARM vgic device through the device control API and
setting the base address for the distributor and cpu interfaces in KVM
VMs using this API.

Because the older KVM_CREATE_IRQCHIP interface needs the irq chip to be
created prior to creating the VCPUs, we first test if we can use the
device control API in kvm_arch_irqchip_create (using the test flag from
the device control API).  If we cannot, it means we have to fall back to
KVM_CREATE_IRQCHIP and use the older ioctl at this point in time.  If
however, we can use the device control API, we don't do anything and
wait until the arm_gic_kvm driver initializes and let that use the
device control API.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/intc/arm_gic_kvm.c            | 22 ++++++++++++++--
 include/hw/intc/arm_gic_common.h |  1 +
 target-arm/kvm.c                 | 55 +++++++++++++++++++++++++++++++++++-----
 target-arm/kvm_arm.h             | 17 ++++++++-----
 4 files changed, 80 insertions(+), 15 deletions(-)

diff --git a/hw/intc/arm_gic_kvm.c b/hw/intc/arm_gic_kvm.c
index 59a3da5..964d4d7 100644
--- a/hw/intc/arm_gic_kvm.c
+++ b/hw/intc/arm_gic_kvm.c
@@ -97,6 +97,7 @@ static void kvm_arm_gic_realize(DeviceState *dev, Error **errp)
     GICState *s = KVM_ARM_GIC(dev);
     SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
     KVMARMGICClass *kgc = KVM_ARM_GIC_GET_CLASS(s);
+    int ret;
 
     kgc->parent_realize(dev, errp);
     if (error_is_set(errp)) {
@@ -119,13 +120,27 @@ static void kvm_arm_gic_realize(DeviceState *dev, Error **errp)
     for (i = 0; i < s->num_cpu; i++) {
         sysbus_init_irq(sbd, &s->parent_irq[i]);
     }
+
+    /* Try to create the device via the device control API */
+    s->dev_fd = -1;
+    ret = kvm_create_device(kvm_state, KVM_DEV_TYPE_ARM_VGIC_V2, false);
+    if (ret >= 0) {
+        s->dev_fd = ret;
+    } else if (ret != -ENODEV && ret != -ENOTSUP) {
+        error_setg_errno(errp, -ret, "error creating in-kernel VGIC");
+        return;
+    }
+
     /* Distributor */
     memory_region_init_reservation(&s->iomem, OBJECT(s),
                                    "kvm-gic_dist", 0x1000);
     sysbus_init_mmio(sbd, &s->iomem);
     kvm_arm_register_device(&s->iomem,
                             (KVM_ARM_DEVICE_VGIC_V2 << KVM_ARM_DEVICE_ID_SHIFT)
-                            | KVM_VGIC_V2_ADDR_TYPE_DIST);
+                            | KVM_VGIC_V2_ADDR_TYPE_DIST,
+                            KVM_DEV_ARM_VGIC_GRP_ADDR,
+                            KVM_VGIC_V2_ADDR_TYPE_DIST,
+                            s->dev_fd);
     /* CPU interface for current core. Unlike arm_gic, we don't
      * provide the "interface for core #N" memory regions, because
      * cores with a VGIC don't have those.
@@ -135,7 +150,10 @@ static void kvm_arm_gic_realize(DeviceState *dev, Error **errp)
     sysbus_init_mmio(sbd, &s->cpuiomem[0]);
     kvm_arm_register_device(&s->cpuiomem[0],
                             (KVM_ARM_DEVICE_VGIC_V2 << KVM_ARM_DEVICE_ID_SHIFT)
-                            | KVM_VGIC_V2_ADDR_TYPE_CPU);
+                            | KVM_VGIC_V2_ADDR_TYPE_CPU,
+                            KVM_DEV_ARM_VGIC_GRP_ADDR,
+                            KVM_VGIC_V2_ADDR_TYPE_CPU,
+                            s->dev_fd);
 }
 
 static void kvm_arm_gic_class_init(ObjectClass *klass, void *data)
diff --git a/include/hw/intc/arm_gic_common.h b/include/hw/intc/arm_gic_common.h
index 0d232df..40cd3d6 100644
--- a/include/hw/intc/arm_gic_common.h
+++ b/include/hw/intc/arm_gic_common.h
@@ -70,6 +70,7 @@ typedef struct GICState {
     MemoryRegion cpuiomem[GIC_NCPU + 1]; /* CPU interfaces */
     uint32_t num_irq;
     uint32_t revision;
+    int dev_fd; /* kvm device fd if backed by kvm vgic support */
 } GICState;
 
 #define TYPE_ARM_GIC_COMMON "arm_gic_common"
diff --git a/target-arm/kvm.c b/target-arm/kvm.c
index 1d2688d..39202d7 100644
--- a/target-arm/kvm.c
+++ b/target-arm/kvm.c
@@ -165,8 +165,10 @@ unsigned long kvm_arch_vcpu_id(CPUState *cpu)
  */
 typedef struct KVMDevice {
     struct kvm_arm_device_addr kda;
+    struct kvm_device_attr kdattr;
     MemoryRegion *mr;
     QSLIST_ENTRY(KVMDevice) entries;
+    int dev_fd;
 } KVMDevice;
 
 static QSLIST_HEAD(kvm_devices_head, KVMDevice) kvm_devices_head;
@@ -200,6 +202,29 @@ static MemoryListener devlistener = {
     .region_del = kvm_arm_devlistener_del,
 };
 
+static void kvm_arm_set_device_addr(KVMDevice *kd)
+{
+    struct kvm_device_attr *attr = &kd->kdattr;
+    int ret;
+
+    /* If the device control API is available and we have a device fd on the
+     * KVMDevice struct, let's use the newer API
+     */
+    if (kd->dev_fd >= 0) {
+        uint64_t addr = kd->kda.addr;
+        attr->addr = (uintptr_t)&addr;
+        ret = kvm_device_ioctl(kd->dev_fd, KVM_SET_DEVICE_ATTR, attr);
+    } else {
+        ret = kvm_vm_ioctl(kvm_state, KVM_ARM_SET_DEVICE_ADDR, &kd->kda);
+    }
+
+    if (ret < 0) {
+        fprintf(stderr, "Failed to set device address: %s\n",
+                strerror(-ret));
+        abort();
+    }
+}
+
 static void kvm_arm_machine_init_done(Notifier *notifier, void *data)
 {
     KVMDevice *kd, *tkd;
@@ -207,12 +232,7 @@ static void kvm_arm_machine_init_done(Notifier *notifier, void *data)
     memory_listener_unregister(&devlistener);
     QSLIST_FOREACH_SAFE(kd, &kvm_devices_head, entries, tkd) {
         if (kd->kda.addr != -1) {
-            if (kvm_vm_ioctl(kvm_state, KVM_ARM_SET_DEVICE_ADDR,
-                             &kd->kda) < 0) {
-                fprintf(stderr, "KVM_ARM_SET_DEVICE_ADDRESS failed: %s\n",
-                        strerror(errno));
-                abort();
-            }
+            kvm_arm_set_device_addr(kd);
         }
         memory_region_unref(kd->mr);
         g_free(kd);
@@ -223,7 +243,8 @@ static Notifier notify = {
     .notify = kvm_arm_machine_init_done,
 };
 
-void kvm_arm_register_device(MemoryRegion *mr, uint64_t devid)
+void kvm_arm_register_device(MemoryRegion *mr, uint64_t devid, uint64_t group,
+                             uint64_t attr, int dev_fd)
 {
     KVMDevice *kd;
 
@@ -239,6 +260,10 @@ void kvm_arm_register_device(MemoryRegion *mr, uint64_t devid)
     kd->mr = mr;
     kd->kda.id = devid;
     kd->kda.addr = -1;
+    kd->kdattr.flags = 0;
+    kd->kdattr.group = group;
+    kd->kdattr.attr = attr;
+    kd->dev_fd = dev_fd;
     QSLIST_INSERT_HEAD(&kvm_devices_head, kd, entries);
     memory_region_ref(kd->mr);
 }
@@ -389,3 +414,19 @@ void kvm_arch_remove_all_hw_breakpoints(void)
 void kvm_arch_init_irq_routing(KVMState *s)
 {
 }
+
+int kvm_arch_irqchip_create(KVMState *s)
+{
+    int ret;
+
+    /* If we can create the VGIC using the newer device control API, we
+     * let the device do this when it initializes itself, otherwise we
+     * fall back to the old API */
+
+    ret = kvm_create_device(s, KVM_DEV_TYPE_ARM_VGIC_V2, true);
+    if (ret == 0) {
+        return 1;
+    }
+
+    return 0;
+}
diff --git a/target-arm/kvm_arm.h b/target-arm/kvm_arm.h
index cd3d13c..137c567 100644
--- a/target-arm/kvm_arm.h
+++ b/target-arm/kvm_arm.h
@@ -18,16 +18,21 @@
  * kvm_arm_register_device:
  * @mr: memory region for this device
  * @devid: the KVM device ID
+ * @group: device control API group for setting addresses
+ * @attr: device control API address type
+ * @dev_fd: device control device file descriptor (or -1 if not supported)
  *
  * Remember the memory region @mr, and when it is mapped by the
  * machine model, tell the kernel that base address using the
- * KVM_SET_DEVICE_ADDRESS ioctl. @devid should be the ID of
- * the device as defined by KVM_SET_DEVICE_ADDRESS.
- * The machine model may map and unmap the device multiple times;
- * the kernel will only be told the final address at the point
- * where machine init is complete.
+ * KVM_ARM_SET_DEVICE_ADDRESS ioctl or the newer device control API.  @devid
+ * should be the ID of the device as defined by KVM_ARM_SET_DEVICE_ADDRESS or
+ * the arm-vgic device in the device control API.
+ * The machine model may map
+ * and unmap the device multiple times; the kernel will only be told the final
+ * address at the point where machine init is complete.
  */
-void kvm_arm_register_device(MemoryRegion *mr, uint64_t devid);
+void kvm_arm_register_device(MemoryRegion *mr, uint64_t devid, uint64_t group,
+                             uint64_t attr, int dev_fd);
 
 /**
  * write_list_to_kvmstate:
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 37/38] arm_gic: Introduce define for GIC_NR_SGIS
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (35 preceding siblings ...)
  2014-01-29 13:40 ` [Qemu-devel] [PULL 36/38] arm: vgic device control api support Peter Maydell
@ 2014-01-29 13:40 ` Peter Maydell
  2014-01-29 13:40 ` [Qemu-devel] [PULL 38/38] arm_gic: Fix GICD_ICPENDR and GICD_ISPENDR writes Peter Maydell
  2014-01-31 14:36 ` [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:40 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Christoffer Dall <christoffer.dall@linaro.org>

Instead of hardcoding 16 various places in the code, use a define to
make it more clear what is going on.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/intc/arm_gic.c                | 17 +++++++++++------
 include/hw/intc/arm_gic_common.h |  1 +
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/hw/intc/arm_gic.c b/hw/intc/arm_gic.c
index 9409684..98c6ff5 100644
--- a/hw/intc/arm_gic.c
+++ b/hw/intc/arm_gic.c
@@ -380,8 +380,10 @@ static void gic_dist_writeb(void *opaque, hwaddr offset,
         irq = (offset - 0x100) * 8 + GIC_BASE_IRQ;
         if (irq >= s->num_irq)
             goto bad_reg;
-        if (irq < 16)
-          value = 0xff;
+        if (irq < GIC_NR_SGIS) {
+            value = 0xff;
+        }
+
         for (i = 0; i < 8; i++) {
             if (value & (1 << i)) {
                 int mask =
@@ -406,8 +408,10 @@ static void gic_dist_writeb(void *opaque, hwaddr offset,
         irq = (offset - 0x180) * 8 + GIC_BASE_IRQ;
         if (irq >= s->num_irq)
             goto bad_reg;
-        if (irq < 16)
-          value = 0;
+        if (irq < GIC_NR_SGIS) {
+            value = 0;
+        }
+
         for (i = 0; i < 8; i++) {
             if (value & (1 << i)) {
                 int cm = (irq < GIC_INTERNAL) ? (1 << cpu) : ALL_CPU_MASK;
@@ -423,8 +427,9 @@ static void gic_dist_writeb(void *opaque, hwaddr offset,
         irq = (offset - 0x200) * 8 + GIC_BASE_IRQ;
         if (irq >= s->num_irq)
             goto bad_reg;
-        if (irq < 16)
-          irq = 0;
+        if (irq < GIC_NR_SGIS) {
+            irq = 0;
+        }
 
         for (i = 0; i < 8; i++) {
             if (value & (1 << i)) {
diff --git a/include/hw/intc/arm_gic_common.h b/include/hw/intc/arm_gic_common.h
index 40cd3d6..dbf8787 100644
--- a/include/hw/intc/arm_gic_common.h
+++ b/include/hw/intc/arm_gic_common.h
@@ -27,6 +27,7 @@
 #define GIC_MAXIRQ 1020
 /* First 32 are private to each CPU (SGIs and PPIs). */
 #define GIC_INTERNAL 32
+#define GIC_NR_SGIS 16
 /* Maximum number of possible CPU interfaces, determined by GIC architecture */
 #define GIC_NCPU 8
 
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [Qemu-devel] [PULL 38/38] arm_gic: Fix GICD_ICPENDR and GICD_ISPENDR writes
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (36 preceding siblings ...)
  2014-01-29 13:40 ` [Qemu-devel] [PULL 37/38] arm_gic: Introduce define for GIC_NR_SGIS Peter Maydell
@ 2014-01-29 13:40 ` Peter Maydell
  2014-01-31 14:36 ` [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-29 13:40 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel, Aurelien Jarno

From: Christoffer Dall <christoffer.dall@linaro.org>

Fix two bugs that would allow changing the state of SGIs through the
ICPENDR and ISPENDRs.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 hw/intc/arm_gic.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/hw/intc/arm_gic.c b/hw/intc/arm_gic.c
index 98c6ff5..1c4a114 100644
--- a/hw/intc/arm_gic.c
+++ b/hw/intc/arm_gic.c
@@ -428,7 +428,7 @@ static void gic_dist_writeb(void *opaque, hwaddr offset,
         if (irq >= s->num_irq)
             goto bad_reg;
         if (irq < GIC_NR_SGIS) {
-            irq = 0;
+            value = 0;
         }
 
         for (i = 0; i < 8; i++) {
@@ -441,6 +441,10 @@ static void gic_dist_writeb(void *opaque, hwaddr offset,
         irq = (offset - 0x280) * 8 + GIC_BASE_IRQ;
         if (irq >= s->num_irq)
             goto bad_reg;
+        if (irq < GIC_NR_SGIS) {
+            value = 0;
+        }
+
         for (i = 0; i < 8; i++) {
             /* ??? This currently clears the pending bit for all CPUs, even
                for per-CPU interrupts.  It's unclear whether this is the
-- 
1.8.5

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [Qemu-devel] [PULL 00/38] target-arm queue
  2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
                   ` (37 preceding siblings ...)
  2014-01-29 13:40 ` [Qemu-devel] [PULL 38/38] arm_gic: Fix GICD_ICPENDR and GICD_ISPENDR writes Peter Maydell
@ 2014-01-31 14:36 ` Peter Maydell
  38 siblings, 0 replies; 40+ messages in thread
From: Peter Maydell @ 2014-01-31 14:36 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, QEMU Developers, Aurelien Jarno

On 29 January 2014 13:39, Peter Maydell <peter.maydell@linaro.org> wrote:
> Here's the target-arm queue; nothing earthshaking here.
> I anticipate the queue will fill up with another 30+ patches
> within a week or so; please pull :-)


> Christoffer Dall (6):
>       linux-headers: Update from Linus' master ba635f8
>       kvm: Introduce kvm_arch_irqchip_create
>       kvm: Common device control API functions
>       arm: vgic device control api support

This set of patches break the build on aarch64 hosts, so
I'm removing it from target-arm.next and well send out
a revised pull request mail shortly. (I won't bother
retransmitting the 34 unchanged patches...)

thanks
-- PMM

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2014-01-31 14:36 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-29 13:39 [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 01/38] target-arm: A64: Add SIMD ld/st multiple Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 02/38] target-arm: A64: Add SIMD ld/st single Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 03/38] target-arm: A64: Add decode skeleton for SIMD data processing insns Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 04/38] target-arm: A64: Add SIMD EXT Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 05/38] target-arm: A64: Add SIMD TBL/TBLX Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 06/38] target-arm: A64: Add SIMD ZIP/UZP/TRN Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 07/38] target-arm: A64: Add SIMD across-lanes instructions Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 08/38] target-arm: A64: Add SIMD copy operations Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 09/38] target-arm: A64: Add SIMD modified immediate group Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 10/38] target-arm: A64: Add SIMD scalar copy instructions Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 11/38] hw/arm/boot: Don't set up ATAGS for autogenerated dtb booting Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 12/38] ARM: Convert MIDR to a property Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 13/38] ZYNQ: Implement board MIDR control for Zynq Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 14/38] display: avoid multi-statement macro Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 15/38] target-arm: Move arm_rmode_to_sf to a shared location Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 16/38] target-arm: Add AArch32 FP VRINTA, VRINTN, VRINTP and VRINTM Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 17/38] target-arm: Add support for AArch32 FP VRINTR Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 18/38] target-arm: Add support for AArch32 FP VRINTZ Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 19/38] target-arm: Add support for AArch32 FP VRINTX Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 20/38] target-arm: Add support for AArch32 SIMD VRINTX Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 21/38] target-arm: Add set_neon_rmode helper Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 22/38] target-arm: Add AArch32 SIMD VRINTA, VRINTN, VRINTP, VRINTM, VRINTZ Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 23/38] target-arm: Add AArch32 FP VCVTA, VCVTN, VCVTP and VCVTM Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 24/38] target-arm: Add AArch32 SIMD " Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 25/38] target-arm: A64: Add SIMD three-different multiply accumulate insns Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 26/38] target-arm: A64: Add SIMD three-different ABDL instructions Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 27/38] target-arm: A64: Add SIMD scalar 3 same add, sub and compare ops Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 28/38] target-arm: A64: Add top level decode for SIMD 3-same group Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 29/38] target-arm: A64: Add logic ops from SIMD 3 same group Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 30/38] target-arm: A64: Add integer ops from SIMD 3-same group Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 31/38] target-arm: A64: Add simple SIMD 3-same floating point ops Peter Maydell
2014-01-29 13:39 ` [Qemu-devel] [PULL 32/38] target-arm: A64: Add SIMD shift by immediate Peter Maydell
2014-01-29 13:40 ` [Qemu-devel] [PULL 33/38] linux-headers: Update from Linus' master ba635f8 Peter Maydell
2014-01-29 13:40 ` [Qemu-devel] [PULL 34/38] kvm: Introduce kvm_arch_irqchip_create Peter Maydell
2014-01-29 13:40 ` [Qemu-devel] [PULL 35/38] kvm: Common device control API functions Peter Maydell
2014-01-29 13:40 ` [Qemu-devel] [PULL 36/38] arm: vgic device control api support Peter Maydell
2014-01-29 13:40 ` [Qemu-devel] [PULL 37/38] arm_gic: Introduce define for GIC_NR_SGIS Peter Maydell
2014-01-29 13:40 ` [Qemu-devel] [PULL 38/38] arm_gic: Fix GICD_ICPENDR and GICD_ISPENDR writes Peter Maydell
2014-01-31 14:36 ` [Qemu-devel] [PULL 00/38] target-arm queue Peter Maydell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.