From: LIU Zhiwei <zhiwei_liu@c-sky.com>
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org, LIU Zhiwei <zhiwei_liu@c-sky.com>, qemu-riscv@nongnu.org, palmer@dabbelt.com, alistair23@gmail.com
Subject: [PATCH 04/38] target/riscv: 16-bit Addition & Subtraction Instructions
Date: Fri, 12 Feb 2021 23:02:22 +0800
Message-ID: <20210212150256.885-5-zhiwei_liu@c-sky.com>
In-Reply-To: <20210212150256.885-1-zhiwei_liu@c-sky.com>

Include 5 groups: Wrap-around (dropping overflow), Signed Halving,
Unsigned Halving, Signed Saturation, and Unsigned Saturation.

Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
 target/riscv/helper.h                   |  30 ++
 target/riscv/insn32.decode              |  32 +++
 target/riscv/insn_trans/trans_rvp.c.inc | 161 +++++++++++
 target/riscv/meson.build                |   1 +
 target/riscv/packed_helper.c            | 354 ++++++++++++++++++++++++
 target/riscv/translate.c                |   1 +
 6 files changed, 579 insertions(+)
 create mode 100644 target/riscv/insn_trans/trans_rvp.c.inc
 create mode 100644 target/riscv/packed_helper.c

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index e3f3f41e89..6d622c732a 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1145,3 +1145,33 @@ DEF_HELPER_6(vcompress_vm_b, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vcompress_vm_h, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vcompress_vm_w, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vcompress_vm_d, void, ptr, ptr, ptr, ptr, env, i32)
+
+/* P extension function */
+DEF_HELPER_3(radd16, tl, env, tl, tl)
+DEF_HELPER_3(uradd16, tl, env, tl, tl)
+DEF_HELPER_3(kadd16, tl, env, tl, tl)
+DEF_HELPER_3(ukadd16, tl, env, tl, tl)
+DEF_HELPER_3(rsub16, tl, env, tl, tl)
+DEF_HELPER_3(ursub16, tl, env, tl, tl)
+DEF_HELPER_3(ksub16, tl, env, tl, tl)
+DEF_HELPER_3(uksub16, tl, env, tl, tl)
+DEF_HELPER_3(cras16, tl, env, tl, tl)
+DEF_HELPER_3(rcras16, tl, env, tl, tl)
+DEF_HELPER_3(urcras16, tl, env, tl, tl)
+DEF_HELPER_3(kcras16, tl, env, tl, tl)
+DEF_HELPER_3(ukcras16, tl, env, tl, tl)
+DEF_HELPER_3(crsa16, tl, env, tl, tl)
+DEF_HELPER_3(rcrsa16, tl, env, tl, tl)
+DEF_HELPER_3(urcrsa16, tl, env, tl, tl)
+DEF_HELPER_3(kcrsa16, tl, env, tl, tl)
+DEF_HELPER_3(ukcrsa16, tl, env, tl, tl)
+DEF_HELPER_3(stas16, tl, env, tl, tl)
+DEF_HELPER_3(rstas16, tl, env, tl, tl)
+DEF_HELPER_3(urstas16, tl, env, tl, tl)
+DEF_HELPER_3(kstas16, tl, env, tl, tl)
+DEF_HELPER_3(ukstas16, tl, env, tl, tl)
+DEF_HELPER_3(stsa16, tl, env, tl, tl)
+DEF_HELPER_3(rstsa16, tl, env, tl, tl)
+DEF_HELPER_3(urstsa16, tl, env, tl, tl)
+DEF_HELPER_3(kstsa16, tl, env, tl, tl)
+DEF_HELPER_3(ukstsa16, tl, env, tl, tl)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 84080dd18c..8815e90476 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -592,3 +592,35 @@ vcompress_vm 010111 - ..... ..... 010 ..... 1010111 @r
 
 vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm
 vsetvl 1000000 ..... ..... 111 ..... 1010111 @r
+
+# *** RV32P Extension ***
+add16      0100000 ..... ..... 000 ..... 1111111 @r
+radd16     0000000 ..... ..... 000 ..... 1111111 @r
+uradd16    0010000 ..... ..... 000 ..... 1111111 @r
+kadd16     0001000 ..... ..... 000 ..... 1111111 @r
+ukadd16    0011000 ..... ..... 000 ..... 1111111 @r
+sub16      0100001 ..... ..... 000 ..... 1111111 @r
+rsub16     0000001 ..... ..... 000 ..... 1111111 @r
+ursub16    0010001 ..... ..... 000 ..... 1111111 @r
+ksub16     0001001 ..... ..... 000 ..... 1111111 @r
+uksub16    0011001 ..... ..... 000 ..... 1111111 @r
+cras16     0100010 ..... ..... 000 ..... 1111111 @r
+rcras16    0000010 ..... ..... 000 ..... 1111111 @r
+urcras16   0010010 ..... ..... 000 ..... 1111111 @r
+kcras16    0001010 ..... ..... 000 ..... 1111111 @r
+ukcras16   0011010 ..... ..... 000 ..... 1111111 @r
+crsa16     0100011 ..... ..... 000 ..... 1111111 @r
+rcrsa16    0000011 ..... ..... 000 ..... 1111111 @r
+urcrsa16   0010011 ..... ..... 000 ..... 1111111 @r
+kcrsa16    0001011 ..... ..... 000 ..... 1111111 @r
+ukcrsa16   0011011 ..... ..... 000 ..... 1111111 @r
+stas16     1111010 ..... ..... 010 ..... 1111111 @r
+rstas16    1011010 ..... ..... 010 ..... 1111111 @r
+urstas16   1101010 ..... ..... 010 ..... 1111111 @r
+kstas16    1100010 ..... ..... 010 ..... 1111111 @r
+ukstas16   1110010 ..... ..... 010 ..... 1111111 @r
+stsa16     1111011 ..... ..... 010 ..... 1111111 @r
+rstsa16    1011011 ..... ..... 010 ..... 1111111 @r
+urstsa16   1101011 ..... ..... 010 ..... 1111111 @r
+kstsa16    1100011 ..... ..... 010 ..... 1111111 @r
+ukstsa16   1110011 ..... ..... 010 ..... 1111111 @r
diff --git a/target/riscv/insn_trans/trans_rvp.c.inc b/target/riscv/insn_trans/trans_rvp.c.inc
new file mode 100644
index 0000000000..0885a4fd45
--- /dev/null
+++ b/target/riscv/insn_trans/trans_rvp.c.inc
@@ -0,0 +1,161 @@
+/*
+ * RISC-V translation routines for the RVP Standard Extension.
+ *
+ * Copyright (c) 2021 T-Head Semiconductor Co., Ltd. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "tcg/tcg-op-gvec.h"
+#include "tcg/tcg-gvec-desc.h"
+#include "tcg/tcg.h"
+
+/*
+ *** SIMD Data Processing Instructions
+ */
+
+/* 16-bit Addition & Subtraction Instructions */
+
+/*
+ * For some instructions, such as add16, an observation can be utilized:
+ * 1) If any reg is zero, it can be reduced to an inline op on the whole reg.
+ * 2) Otherwise, it can be accelerated by a gvec op or an inline op.
+ */
+
+typedef void GenZeroFn(DisasContext *, arg_r *);
+typedef void GenNoZero32Fn(TCGv, TCGv, TCGv);
+typedef void GenNoZero64Fn(unsigned, uint32_t, uint32_t,
+                           uint32_t, uint32_t, uint32_t);
+
+static inline bool
+r_inline(DisasContext *ctx, arg_r *a, uint8_t vece,
+         GenNoZero64Fn *f64, GenNoZero32Fn *f32,
+         GenZeroFn *fn)
+{
+    if (!has_ext(ctx, RVP)) {
+        return false;
+    }
+    if (a->rd && a->rs1 && a->rs2) {
+#ifdef TARGET_RISCV64
+        f64(vece, offsetof(CPURISCVState, gpr[a->rd]),
+            offsetof(CPURISCVState, gpr[a->rs1]),
+            offsetof(CPURISCVState, gpr[a->rs2]),
+            8, 8);
+#else
+        f32(cpu_gpr[a->rd], cpu_gpr[a->rs1], cpu_gpr[a->rs2]);
+#endif
+    } else {
+        fn(ctx, a);
+    }
+    return true;
+}
+
+/* Complete inline implementation */
+#define GEN_RVP_R_INLINE(NAME, GSUF, VECE, FN)                 \
+static bool trans_##NAME(DisasContext *s, arg_r *a)            \
+{                                                              \
+    return r_inline(s, a, VECE, tcg_gen_gvec_##GSUF,           \
+                    tcg_gen_simd_##NAME, (GenZeroFn *)FN);     \
+}                                                              \
+
+static void tcg_gen_simd_add16(TCGv d, TCGv a, TCGv b)
+{
+    TCGv t1 = tcg_temp_new();
+    TCGv t2 = tcg_temp_new();
+
+    tcg_gen_andi_tl(t1, a, ~0xffff);
+    tcg_gen_add_tl(t2, a, b);
+    tcg_gen_add_tl(t1, t1, b);
+    tcg_gen_deposit_tl(d, t1, t2, 0, 16);
+
+    tcg_temp_free(t1);
+    tcg_temp_free(t2);
+}
+
+GEN_RVP_R_INLINE(add16, add, 1, trans_add);
+
+static void tcg_gen_simd_sub16(TCGv d, TCGv a, TCGv b)
+{
+    TCGv t1 = tcg_temp_new();
+    TCGv t2 = tcg_temp_new();
+
+    tcg_gen_andi_tl(t1, b, ~0xffff);
+    tcg_gen_sub_tl(t2, a, b);
+    tcg_gen_sub_tl(t1, a, t1);
+    tcg_gen_deposit_tl(d, t1, t2, 0, 16);
+
+    tcg_temp_free(t1);
+    tcg_temp_free(t2);
+}
+
+GEN_RVP_R_INLINE(sub16, sub, 1, trans_sub);
+
+/* Out of line helpers for R format packed instructions */
+typedef void gen_helper_rvp_r(TCGv, TCGv_ptr, TCGv, TCGv);
+
+static inline bool r_ool(DisasContext *ctx, arg_r *a, gen_helper_rvp_r *fn)
+{
+    TCGv src1, src2, dst;
+    if (!has_ext(ctx, RVP)) {
+        return false;
+    }
+
+    src1 = tcg_temp_new();
+    src2 = tcg_temp_new();
+    dst = tcg_temp_new();
+
+    gen_get_gpr(src1, a->rs1);
+    gen_get_gpr(src2, a->rs2);
+    fn(dst, cpu_env, src1, src2);
+    gen_set_gpr(a->rd, dst);
+
+    tcg_temp_free(src1);
+    tcg_temp_free(src2);
+    tcg_temp_free(dst);
+    return true;
+}
+
+#define GEN_RVP_R_OOL(NAME)                            \
+static bool trans_##NAME(DisasContext *s, arg_r *a)    \
+{                                                      \
+    return r_ool(s, a, gen_helper_##NAME);             \
+}
+
+GEN_RVP_R_OOL(radd16);
+GEN_RVP_R_OOL(uradd16);
+GEN_RVP_R_OOL(kadd16);
+GEN_RVP_R_OOL(ukadd16);
+GEN_RVP_R_OOL(rsub16);
+GEN_RVP_R_OOL(ursub16);
+GEN_RVP_R_OOL(ksub16);
+GEN_RVP_R_OOL(uksub16);
+GEN_RVP_R_OOL(cras16);
+GEN_RVP_R_OOL(rcras16);
+GEN_RVP_R_OOL(urcras16);
+GEN_RVP_R_OOL(kcras16);
+GEN_RVP_R_OOL(ukcras16);
+GEN_RVP_R_OOL(crsa16);
+GEN_RVP_R_OOL(rcrsa16);
+GEN_RVP_R_OOL(urcrsa16);
+GEN_RVP_R_OOL(kcrsa16);
+GEN_RVP_R_OOL(ukcrsa16);
+GEN_RVP_R_OOL(stas16);
+GEN_RVP_R_OOL(rstas16);
+GEN_RVP_R_OOL(urstas16);
+GEN_RVP_R_OOL(kstas16);
+GEN_RVP_R_OOL(ukstas16);
+GEN_RVP_R_OOL(stsa16);
+GEN_RVP_R_OOL(rstsa16);
+GEN_RVP_R_OOL(urstsa16);
+GEN_RVP_R_OOL(kstsa16);
+GEN_RVP_R_OOL(ukstsa16);
diff --git a/target/riscv/meson.build b/target/riscv/meson.build
index 14a5c62dac..d26a437ee8 100644
--- a/target/riscv/meson.build
+++ b/target/riscv/meson.build
@@ -21,6 +21,7 @@ riscv_ss.add(files(
   'gdbstub.c',
   'op_helper.c',
   'vector_helper.c',
+  'packed_helper.c',
   'translate.c',
 ))
 
diff --git a/target/riscv/packed_helper.c b/target/riscv/packed_helper.c
new file mode 100644
index 0000000000..b84abaaf25
--- /dev/null
+++ b/target/riscv/packed_helper.c
@@ -0,0 +1,354 @@
+/*
+ * RISC-V P Extension Helpers for QEMU.
+ *
+ * Copyright (c) 2021 T-Head Semiconductor Co., Ltd. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+#include "qemu/osdep.h"
+#include "cpu.h"
+#include "exec/exec-all.h"
+#include "exec/helper-proto.h"
+#include "exec/cpu_ldst.h"
+#include "fpu/softfloat.h"
+#include <math.h>
+#include "internals.h"
+
+/*
+ *** SIMD Data Processing Instructions
+ */
+
+/* 16-bit Addition & Subtraction Instructions */
+typedef void PackedFn3i(CPURISCVState *, void *, void *, void *, uint8_t);
+
+/* Define a common function to loop elements in packed register */
+static inline target_ulong
+rvpr(CPURISCVState *env, target_ulong a, target_ulong b,
+     uint8_t step, uint8_t size, PackedFn3i *fn)
+{
+    int i, passes = sizeof(target_ulong) / size;
+    target_ulong result = 0;
+
+    for (i = 0; i < passes; i += step) {
+        fn(env, &result, &a, &b, i);
+    }
+    return result;
+}
+
+#define RVPR(NAME, STEP, SIZE)                                  \
+target_ulong HELPER(NAME)(CPURISCVState *env, target_ulong a,   \
+                          target_ulong b)                       \
+{                                                               \
+    return rvpr(env, a, b, STEP, SIZE, (PackedFn3i *)do_##NAME);\
+}
+
+static inline int32_t hadd32(int32_t a, int32_t b)
+{
+    return ((int64_t)a + b) >> 1;
+}
+
+static inline void do_radd16(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    int16_t *d = vd, *a = va, *b = vb;
+    d[i] = hadd32(a[i], b[i]);
+}
+
+RVPR(radd16, 1, 2);
+
+static inline uint32_t haddu32(uint32_t a, uint32_t b)
+{
+    return ((uint64_t)a + b) >> 1;
+}
+
+static inline void do_uradd16(CPURISCVState *env, void *vd, void *va,
+                              void *vb, uint8_t i)
+{
+    uint16_t *d = vd, *a = va, *b = vb;
+    d[i] = haddu32(a[i], b[i]);
+}
+
+RVPR(uradd16, 1, 2);
+
+static inline void do_kadd16(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    int16_t *d = vd, *a = va, *b = vb;
+    d[i] = sadd16(env, 0, a[i], b[i]);
+}
+
+RVPR(kadd16, 1, 2);
+
+static inline void do_ukadd16(CPURISCVState *env, void *vd, void *va,
+                              void *vb, uint8_t i)
+{
+    uint16_t *d = vd, *a = va, *b = vb;
+    d[i] = saddu16(env, 0, a[i], b[i]);
+}
+
+RVPR(ukadd16, 1, 2);
+
+static inline int32_t hsub32(int32_t a, int32_t b)
+{
+    return ((int64_t)a - b) >> 1;
+}
+
+static inline int64_t hsub64(int64_t a, int64_t b)
+{
+    int64_t res = a - b;
+    int64_t over = (res ^ a) & (a ^ b) & INT64_MIN;
+
+    /* With signed overflow, bit 64 is inverse of bit 63. */
+    return (res >> 1) ^ over;
+}
+
+static inline void do_rsub16(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    int16_t *d = vd, *a = va, *b = vb;
+    d[i] = hsub32(a[i], b[i]);
+}
+
+RVPR(rsub16, 1, 2);
+
+static inline uint64_t hsubu64(uint64_t a, uint64_t b)
+{
+    return (a - b) >> 1;
+}
+
+static inline void do_ursub16(CPURISCVState *env, void *vd, void *va,
+                              void *vb, uint8_t i)
+{
+    uint16_t *d = vd, *a = va, *b = vb;
+    d[i] = hsubu64(a[i], b[i]);
+}
+
+RVPR(ursub16, 1, 2);
+
+static inline void do_ksub16(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    int16_t *d = vd, *a = va, *b = vb;
+    d[i] = ssub16(env, 0, a[i], b[i]);
+}
+
+RVPR(ksub16, 1, 2);
+
+static inline void do_uksub16(CPURISCVState *env, void *vd, void *va,
+                              void *vb, uint8_t i)
+{
+    uint16_t *d = vd, *a = va, *b = vb;
+    d[i] = ssubu16(env, 0, a[i], b[i]);
+}
+
+RVPR(uksub16, 1, 2);
+
+static inline void do_cras16(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    uint16_t *d = vd, *a = va, *b = vb;
+    d[H2(i)] = a[H2(i)] - b[H2(i + 1)];
+    d[H2(i + 1)] = a[H2(i + 1)] + b[H2(i)];
+}
+
+RVPR(cras16, 2, 2);
+
+static inline void do_rcras16(CPURISCVState *env, void *vd, void *va,
+                              void *vb, uint8_t i)
+{
+    int16_t *d = vd, *a = va, *b = vb;
+    d[H2(i)] = hsub32(a[H2(i)], b[H2(i + 1)]);
+    d[H2(i + 1)] = hadd32(a[H2(i + 1)], b[H2(i)]);
+}
+
+RVPR(rcras16, 2, 2);
+
+static inline void do_urcras16(CPURISCVState *env, void *vd, void *va,
+                               void *vb, uint8_t i)
+{
+    uint16_t *d = vd, *a = va, *b = vb;
+    d[H2(i)] = hsubu64(a[H2(i)], b[H2(i + 1)]);
+    d[H2(i + 1)] = haddu32(a[H2(i + 1)], b[H2(i)]);
+}
+
+RVPR(urcras16, 2, 2);
+
+static inline void do_kcras16(CPURISCVState *env, void *vd, void *va,
+                              void *vb, uint8_t i)
+{
+    int16_t *d = vd, *a = va, *b = vb;
+    d[H2(i)] = ssub16(env, 0, a[H2(i)], b[H2(i + 1)]);
+    d[H2(i + 1)] = sadd16(env, 0, a[H2(i + 1)], b[H2(i)]);
+}
+
+RVPR(kcras16, 2, 2);
+
+static inline void do_ukcras16(CPURISCVState *env, void *vd, void *va,
+                               void *vb, uint8_t i)
+{
+    uint16_t *d = vd, *a = va, *b = vb;
+    d[H2(i)] = ssubu16(env, 0, a[H2(i)], b[H2(i + 1)]);
+    d[H2(i + 1)] = saddu16(env, 0, a[H2(i + 1)], b[H2(i)]);
+}
+
+RVPR(ukcras16, 2, 2);
+
+static inline void do_crsa16(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    uint16_t *d = vd, *a = va, *b = vb;
+    d[H2(i)] = a[H2(i)] + b[H2(i + 1)];
+    d[H2(i + 1)] = a[H2(i + 1)] - b[H2(i)];
+}
+
+RVPR(crsa16, 2, 2);
+
+static inline void do_rcrsa16(CPURISCVState *env, void *vd, void *va,
+                              void *vb, uint8_t i)
+{
+    int16_t *d = vd, *a = va, *b = vb;
+    d[H2(i)] = hadd32(a[H2(i)], b[H2(i + 1)]);
+    d[H2(i + 1)] = hsub32(a[H2(i + 1)], b[H2(i)]);
+}
+
+RVPR(rcrsa16, 2, 2);
+
+static inline void do_urcrsa16(CPURISCVState *env, void *vd, void *va,
+                               void *vb, uint8_t i)
+{
+    uint16_t *d = vd, *a = va, *b = vb;
+    d[H2(i)] = haddu32(a[H2(i)], b[H2(i + 1)]);
+    d[H2(i + 1)] = hsubu64(a[H2(i + 1)], b[H2(i)]);
+}
+
+RVPR(urcrsa16, 2, 2);
+
+static inline void do_kcrsa16(CPURISCVState *env, void *vd, void *va,
+                              void *vb, uint8_t i)
+{
+    int16_t *d = vd, *a = va, *b = vb;
+    d[H2(i)] = sadd16(env, 0, a[H2(i)], b[H2(i + 1)]);
+    d[H2(i + 1)] = ssub16(env, 0, a[H2(i + 1)], b[H2(i)]);
+}
+
+RVPR(kcrsa16, 2, 2);
+
+static inline void do_ukcrsa16(CPURISCVState *env, void *vd, void *va,
+                               void *vb, uint8_t i)
+{
+    uint16_t *d = vd, *a = va, *b = vb;
+    d[H2(i)] = saddu16(env, 0, a[H2(i)], b[H2(i + 1)]);
+    d[H2(i + 1)] = ssubu16(env, 0, a[H2(i + 1)], b[H2(i)]);
+}
+
+RVPR(ukcrsa16, 2, 2);
+
+static inline void do_stas16(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    int16_t *d = vd, *a = va, *b = vb;
+    d[H2(i)] = a[H2(i)] - b[H2(i)];
+    d[H2(i + 1)] = a[H2(i + 1)] + b[H2(i + 1)];
+}
+
+RVPR(stas16, 2, 2);
+
+static inline void do_rstas16(CPURISCVState *env, void *vd, void *va,
+                              void *vb, uint8_t i)
+{
+    int16_t *d = vd, *a = va, *b = vb;
+    d[H2(i)] = hsub32(a[H2(i)], b[H2(i)]);
+    d[H2(i + 1)] = hadd32(a[H2(i + 1)], b[H2(i + 1)]);
+}
+
+RVPR(rstas16, 2, 2);
+
+static inline void do_urstas16(CPURISCVState *env, void *vd, void *va,
+                               void *vb, uint8_t i)
+{
+    uint16_t *d = vd, *a = va, *b = vb;
+    d[H2(i)] = hsubu64(a[H2(i)], b[H2(i)]);
+    d[H2(i + 1)] = haddu32(a[H2(i + 1)], b[H2(i + 1)]);
+}
+
+RVPR(urstas16, 2, 2);
+
+static inline void do_kstas16(CPURISCVState *env, void *vd, void *va,
+                              void *vb, uint8_t i)
+{
+    int16_t *d = vd, *a = va, *b = vb;
+    d[H2(i)] = ssub16(env, 0, a[H2(i)], b[H2(i)]);
+    d[H2(i + 1)] = sadd16(env, 0, a[H2(i + 1)], b[H2(i + 1)]);
+}
+
+RVPR(kstas16, 2, 2);
+
+static inline void do_ukstas16(CPURISCVState *env, void *vd, void *va,
+                               void *vb, uint8_t i)
+{
+    uint16_t *d = vd, *a = va, *b = vb;
+    d[H2(i)] = ssubu16(env, 0, a[H2(i)], b[H2(i)]);
+    d[H2(i + 1)] = saddu16(env, 0, a[H2(i + 1)], b[H2(i + 1)]);
+}
+
+RVPR(ukstas16, 2, 2);
+
+static inline void do_stsa16(CPURISCVState *env, void *vd, void *va,
+                             void *vb, uint8_t i)
+{
+    uint16_t *d = vd, *a = va, *b = vb;
+    d[H2(i)] = a[H2(i)] + b[H2(i)];
+    d[H2(i + 1)] = a[H2(i + 1)] - b[H2(i + 1)];
+}
+
+RVPR(stsa16, 2, 2);
+
+static inline void do_rstsa16(CPURISCVState *env, void *vd, void *va,
+                              void *vb, uint8_t i)
+{
+    int16_t *d = vd, *a = va, *b = vb;
+    d[H2(i)] = hadd32(a[H2(i)], b[H2(i)]);
+    d[H2(i + 1)] = hsub32(a[H2(i + 1)], b[H2(i + 1)]);
+}
+
+RVPR(rstsa16, 2, 2);
+
+static inline void do_urstsa16(CPURISCVState *env, void *vd, void *va,
+                               void *vb, uint8_t i)
+{
+    uint16_t *d = vd, *a = va, *b = vb;
+    d[H2(i)] = haddu32(a[H2(i)], b[H2(i)]);
+    d[H2(i + 1)] = hsubu64(a[H2(i + 1)], b[H2(i + 1)]);
+}
+
+RVPR(urstsa16, 2, 2);
+
+static inline void do_kstsa16(CPURISCVState *env, void *vd, void *va,
+                              void *vb, uint8_t i)
+{
+    int16_t *d = vd, *a = va, *b = vb;
+    d[H2(i)] = sadd16(env, 0, a[H2(i)], b[H2(i)]);
+    d[H2(i + 1)] = ssub16(env, 0, a[H2(i + 1)], b[H2(i + 1)]);
+}
+
+RVPR(kstsa16, 2, 2);
+
+static inline void do_ukstsa16(CPURISCVState *env, void *vd, void *va,
+                               void *vb, uint8_t i)
+{
+    uint16_t *d = vd, *a = va, *b = vb;
+    d[H2(i)] = saddu16(env, 0, a[H2(i)], b[H2(i)]);
+    d[H2(i + 1)] = ssubu16(env, 0, a[H2(i + 1)], b[H2(i + 1)]);
+}
+
+RVPR(ukstsa16, 2, 2);
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index eb810efec6..f0a753f9c7 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -766,6 +766,7 @@ static uint32_t opcode_at(DisasContextBase *dcbase, target_ulong pc)
 #include "insn_trans/trans_rvd.c.inc"
 #include "insn_trans/trans_rvh.c.inc"
 #include "insn_trans/trans_rvv.c.inc"
+#include "insn_trans/trans_rvp.c.inc"
 #include "insn_trans/trans_privileged.c.inc"
 
 /* Include the auto-generated decoder for 16 bit insn */
-- 
2.17.1
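
For readers following the thread, the five groups named in the commit message can be illustrated outside of QEMU. The following is only a standalone sketch, not code from the patch: every name in it (lane_add_wrap, lane_add_shalve, lane_add_uhalve, lane_add_ssat, lane_add_usat, packed16_map) is hypothetical, it models a 32-bit register holding two 16-bit lanes, and it extracts lanes by shifting rather than by pointer casts as rvpr() does. It assumes two's-complement conversions and an arithmetic right shift for negative values, which the patch's hadd32()/hsub32() also rely on. The saturating variants here only clamp; the patch's helpers take env and may record saturation there, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; none of these are QEMU functions. */
typedef uint16_t lane_fn(uint16_t, uint16_t);

/* Wrap-around: keep the low 16 bits and drop the carry. */
static uint16_t lane_add_wrap(uint16_t a, uint16_t b)
{
    return (uint16_t)(a + b);
}

/* Signed halving: widen, add, then shift right by one (assumed arithmetic). */
static uint16_t lane_add_shalve(uint16_t a, uint16_t b)
{
    int32_t sum = (int32_t)(int16_t)a + (int16_t)b;
    return (uint16_t)(sum >> 1);
}

/* Unsigned halving: widen, add, then logical shift right by one. */
static uint16_t lane_add_uhalve(uint16_t a, uint16_t b)
{
    return (uint16_t)(((uint32_t)a + b) >> 1);
}

/* Signed saturation: clamp the widened sum to [INT16_MIN, INT16_MAX]. */
static uint16_t lane_add_ssat(uint16_t a, uint16_t b)
{
    int32_t sum = (int32_t)(int16_t)a + (int16_t)b;
    if (sum > INT16_MAX) {
        sum = INT16_MAX;
    }
    if (sum < INT16_MIN) {
        sum = INT16_MIN;
    }
    return (uint16_t)sum;
}

/* Unsigned saturation: clamp the widened sum to UINT16_MAX. */
static uint16_t lane_add_usat(uint16_t a, uint16_t b)
{
    uint32_t sum = (uint32_t)a + b;
    return sum > UINT16_MAX ? UINT16_MAX : (uint16_t)sum;
}

/* Apply fn to each 16-bit lane of a and b, lane 0 being bits [15:0]. */
static uint32_t packed16_map(uint32_t a, uint32_t b, lane_fn *fn)
{
    uint32_t result = 0;

    assert(fn != NULL);
    for (int i = 0; i < 2; i++) {
        uint16_t x = (uint16_t)(a >> (16 * i));
        uint16_t y = (uint16_t)(b >> (16 * i));
        result |= (uint32_t)fn(x, y) << (16 * i);
    }
    return result;
}
```

With a = 0x7fff8000 and b = 0x00018000, both lanes overflow a 16-bit signed add, so the variants give different results: wrap-around yields 0x80000000, signed saturation 0x7fff8000, and unsigned saturation 0x8000ffff.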
do_urstsa16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va, *b = vb; + d[H2(i)] = haddu32(a[H2(i)], b[H2(i)]); + d[H2(i + 1)] = hsubu64(a[H2(i + 1)], b[H2(i + 1)]); +} + +RVPR(urstsa16, 2, 2); + +static inline void do_kstsa16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + int16_t *d = vd, *a = va, *b = vb; + d[H2(i)] = sadd16(env, 0, a[H2(i)], b[H2(i)]); + d[H2(i + 1)] = ssub16(env, 0, a[H2(i + 1)], b[H2(i + 1)]); +} + +RVPR(kstsa16, 2, 2); + +static inline void do_ukstsa16(CPURISCVState *env, void *vd, void *va, + void *vb, uint8_t i) +{ + uint16_t *d = vd, *a = va, *b = vb; + d[H2(i)] = saddu16(env, 0, a[H2(i)], b[H2(i)]); + d[H2(i + 1)] = ssubu16(env, 0, a[H2(i + 1)], b[H2(i + 1)]); +} + +RVPR(ukstsa16, 2, 2); diff --git a/target/riscv/translate.c b/target/riscv/translate.c index eb810efec6..f0a753f9c7 100644 --- a/target/riscv/translate.c +++ b/target/riscv/translate.c @@ -766,6 +766,7 @@ static uint32_t opcode_at(DisasContextBase *dcbase, target_ulong pc) #include "insn_trans/trans_rvd.c.inc" #include "insn_trans/trans_rvh.c.inc" #include "insn_trans/trans_rvv.c.inc" +#include "insn_trans/trans_rvp.c.inc" #include "insn_trans/trans_privileged.c.inc" /* Include the auto-generated decoder for 16 bit insn */ -- 2.17.1
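For readers who want to check the lane arithmetic outside QEMU, here is a minimal standalone sketch of three of the groups above: signed halving add (radd16), signed saturating add (kadd16) and wrap-around cross add/sub (cras16). It is not the patch's code. It models a 32-bit register as two 16-bit lanes with explicit shifts instead of pointer casts and H2(), and every name in it (lane, pack, hadd16, sat_add16, the *_model functions, the ov flag) is hypothetical. The ov argument only stands in for whatever state the real saturating helpers update through env.

```c
#include <stdint.h>

/*
 * Standalone illustration only: none of these names are QEMU APIs.
 * A 32-bit "register" holds two 16-bit lanes; lane 0 is bits [15:0].
 */
static uint16_t lane(uint32_t r, int n)
{
    return (uint16_t)(r >> (16 * n));
}

static uint32_t pack(uint16_t hi, uint16_t lo)
{
    return ((uint32_t)hi << 16) | lo;
}

/*
 * Signed halving add: widen first so the sum cannot overflow, then shift.
 * Assumes >> on a negative int is an arithmetic shift, which holds on the
 * compilers QEMU supports but is implementation-defined in ISO C.
 */
static int16_t hadd16(int16_t a, int16_t b)
{
    return (int16_t)(((int32_t)a + b) >> 1);
}

/*
 * Signed saturating add: clamp to the int16_t range and record the clamp
 * in *ov (hypothetical stand-in for a sticky overflow flag).
 */
static int16_t sat_add16(int16_t a, int16_t b, int *ov)
{
    int32_t s = (int32_t)a + b;

    if (s > INT16_MAX) {
        *ov = 1;
        return INT16_MAX;
    }
    if (s < INT16_MIN) {
        *ov = 1;
        return INT16_MIN;
    }
    return (int16_t)s;
}

/* Model of radd16: each lane is (a + b) >> 1, computed without overflow. */
static uint32_t radd16_model(uint32_t a, uint32_t b)
{
    uint16_t lo = (uint16_t)hadd16((int16_t)lane(a, 0), (int16_t)lane(b, 0));
    uint16_t hi = (uint16_t)hadd16((int16_t)lane(a, 1), (int16_t)lane(b, 1));

    return pack(hi, lo);
}

/* Model of kadd16: each lane is a saturating signed add. */
static uint32_t kadd16_model(uint32_t a, uint32_t b, int *ov)
{
    uint16_t lo = (uint16_t)sat_add16((int16_t)lane(a, 0),
                                      (int16_t)lane(b, 0), ov);
    uint16_t hi = (uint16_t)sat_add16((int16_t)lane(a, 1),
                                      (int16_t)lane(b, 1), ov);

    return pack(hi, lo);
}

/*
 * Model of cras16, following do_cras16 above: the low lane is
 * a.lo - b.hi and the high lane is a.hi + b.lo, both wrapping mod 2^16.
 */
static uint32_t cras16_model(uint32_t a, uint32_t b)
{
    uint16_t lo = (uint16_t)(lane(a, 0) - lane(b, 1));
    uint16_t hi = (uint16_t)(lane(a, 1) + lane(b, 0));

    return pack(hi, lo);
}
```

In this model, radd16_model(0x7FFF7FFF, 0x7FFF0001) gives 0x7FFF4000, since the halved sums 0x7FFF and 0x4000 both fit without wrapping. kadd16_model(0x7FFF8000, 0x0001FFFF, &ov) clamps both lanes, returns 0x7FFF8000 and sets ov.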