* [PATCH v2 1/9] target/arm: Convert Neon VSHL and VSLI 2-reg-shift insn to decodetree
2020-05-22 14:55 [PATCH v2 0/9] target/arm: Convert 2-reg-shift and 1-reg-imm Neon insns to decodetree Peter Maydell
@ 2020-05-22 14:55 ` Peter Maydell
2020-06-01 23:08 ` Richard Henderson
2020-05-22 14:55 ` [PATCH v2 2/9] target/arm: Convert Neon VSHR 2-reg-shift insns " Peter Maydell
` (8 subsequent siblings)
9 siblings, 1 reply; 17+ messages in thread
From: Peter Maydell @ 2020-05-22 14:55 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the VSHL and VSLI insns from the Neon 2-registers-and-a-shift
group to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/neon-dp.decode | 25 ++++++++++++++++++++++
target/arm/translate-neon.inc.c | 38 +++++++++++++++++++++++++++++++++
target/arm/translate.c | 18 +++++++---------
3 files changed, 71 insertions(+), 10 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index 8beb1db768b..4bd305e7ea0 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -199,3 +199,28 @@ VRECPS_fp_3s 1111 001 0 0 . 0 . .... .... 1111 ... 1 .... @3same_fp
VRSQRTS_fp_3s 1111 001 0 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
VMAXNM_fp_3s 1111 001 1 0 . 0 . .... .... 1111 ... 1 .... @3same_fp
VMINNM_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
+
+######################################################################
+# 2-reg-and-shift grouping:
+# 1111 001 U 1 D immH:3 immL:3 Vd:4 opc:4 L Q M 1 Vm:4
+######################################################################
+&2reg_shift vm vd q shift size
+
+@2reg_shl_d .... ... . . . shift:6 .... .... 1 q:1 . . .... \
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=3
+@2reg_shl_s .... ... . . . 1 shift:5 .... .... 0 q:1 . . .... \
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=2
+@2reg_shl_h .... ... . . . 01 shift:4 .... .... 0 q:1 . . .... \
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=1
+@2reg_shl_b .... ... . . . 001 shift:3 .... .... 0 q:1 . . .... \
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=0
+
+VSHL_2sh 1111 001 0 1 . ...... .... 0101 . . . 1 .... @2reg_shl_d
+VSHL_2sh 1111 001 0 1 . ...... .... 0101 . . . 1 .... @2reg_shl_s
+VSHL_2sh 1111 001 0 1 . ...... .... 0101 . . . 1 .... @2reg_shl_h
+VSHL_2sh 1111 001 0 1 . ...... .... 0101 . . . 1 .... @2reg_shl_b
+
+VSLI_2sh 1111 001 1 1 . ...... .... 0101 . . . 1 .... @2reg_shl_d
+VSLI_2sh 1111 001 1 1 . ...... .... 0101 . . . 1 .... @2reg_shl_s
+VSLI_2sh 1111 001 1 1 . ...... .... 0101 . . . 1 .... @2reg_shl_h
+VSLI_2sh 1111 001 1 1 . ...... .... 0101 . . . 1 .... @2reg_shl_b
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 3fe65a0b080..305213fe6d9 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -1310,3 +1310,41 @@ static bool do_3same_fp_pair(DisasContext *s, arg_3same *a, VFPGen3OpSPFn *fn)
DO_3S_FP_PAIR(VPADD, gen_helper_vfp_adds)
DO_3S_FP_PAIR(VPMAX, gen_helper_vfp_maxs)
DO_3S_FP_PAIR(VPMIN, gen_helper_vfp_mins)
+
+static bool do_vector_2sh(DisasContext *s, arg_2reg_shift *a, GVecGen2iFn *fn)
+{
+ /* Handle a 2-reg-shift insn which can be vectorized. */
+ int vec_size = a->q ? 16 : 8;
+ int rd_ofs = neon_reg_offset(a->vd, 0);
+ int rm_ofs = neon_reg_offset(a->vm, 0);
+
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+ return false;
+ }
+
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
+ ((a->vd | a->vm) & 0x10)) {
+ return false;
+ }
+
+ if ((a->vm | a->vd) & a->q) {
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ fn(a->size, rd_ofs, rm_ofs, a->shift, vec_size, vec_size);
+ return true;
+}
+
+#define DO_2SH(INSN, FUNC) \
+ static bool trans_##INSN##_2sh(DisasContext *s, arg_2reg_shift *a) \
+ { \
+ return do_vector_2sh(s, a, FUNC); \
+ } \
+
+DO_2SH(VSHL, tcg_gen_gvec_shli)
+DO_2SH(VSLI, gen_gvec_sli)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index c8296116d4b..d0a4a08f6d9 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -5294,6 +5294,14 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
if ((insn & 0x00380080) != 0) {
/* Two registers and shift. */
op = (insn >> 8) & 0xf;
+
+ switch (op) {
+ case 5: /* VSHL, VSLI */
+ return 1; /* handled by decodetree */
+ default:
+ break;
+ }
+
if (insn & (1 << 7)) {
/* 64-bit shift. */
if (op > 7) {
@@ -5387,16 +5395,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
gen_gvec_sri(size, rd_ofs, rm_ofs, shift,
vec_size, vec_size);
return 0;
-
- case 5: /* VSHL, VSLI */
- if (u) { /* VSLI */
- gen_gvec_sli(size, rd_ofs, rm_ofs, shift,
- vec_size, vec_size);
- } else { /* VSHL */
- tcg_gen_gvec_shli(size, rd_ofs, rm_ofs, shift,
- vec_size, vec_size);
- }
- return 0;
}
if (size == 3) {
--
2.20.1
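A note for readers following the decode patterns in this patch: the four @2reg_shl_* formats pick the element size from the position of the most significant set bit in L:imm6, and the bits below that marker are the left-shift count. The following is a minimal sketch of that selection in plain C, assuming a raw 32-bit A32 encoding as input. The function name decode_2reg_shl is hypothetical and is not part of QEMU; the generated decoder does this work from the .decode file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not QEMU code): mirror the @2reg_shl_{d,s,h,b}
 * formats. Returns false when L:imm6 has no size marker bit set,
 * which is not a 2-reg-and-shift encoding.
 */
static bool decode_2reg_shl(uint32_t insn, int *size, int *shift)
{
    uint32_t imm6 = (insn >> 16) & 0x3f;   /* immH:immL, bits [21:16] */
    bool l = (insn >> 7) & 1;              /* L, bit 7 */

    if (l) {
        *size = 3;                         /* 64-bit elements */
        *shift = imm6;                     /* all six bits are the count */
    } else if (imm6 & 0x20) {
        *size = 2;                         /* 32-bit elements */
        *shift = imm6 & 0x1f;
    } else if (imm6 & 0x10) {
        *size = 1;                         /* 16-bit elements */
        *shift = imm6 & 0x0f;
    } else if (imm6 & 0x08) {
        *size = 0;                         /* 8-bit elements */
        *shift = imm6 & 0x07;
    } else {
        return false;
    }

    /* A left-shift count is always smaller than the element width. */
    assert(*shift < (8 << *size));
    return true;
}
```

For example, an encoding with L == 0 and imm6 == 0b100101 decodes as size 2 (32-bit elements) with a shift of 5.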
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v2 1/9] target/arm: Convert Neon VSHL and VSLI 2-reg-shift insn to decodetree
2020-05-22 14:55 ` [PATCH v2 1/9] target/arm: Convert Neon VSHL and VSLI 2-reg-shift insn " Peter Maydell
@ 2020-06-01 23:08 ` Richard Henderson
0 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2020-06-01 23:08 UTC (permalink / raw)
To: Peter Maydell, qemu-arm, qemu-devel
On 5/22/20 7:55 AM, Peter Maydell wrote:
> Convert the VSHL and VSLI insns from the Neon 2-registers-and-a-shift
> group to decodetree.
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> target/arm/neon-dp.decode | 25 ++++++++++++++++++++++
> target/arm/translate-neon.inc.c | 38 +++++++++++++++++++++++++++++++++
> target/arm/translate.c | 18 +++++++---------
> 3 files changed, 71 insertions(+), 10 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
* [PATCH v2 2/9] target/arm: Convert Neon VSHR 2-reg-shift insns to decodetree
2020-05-22 14:55 [PATCH v2 0/9] target/arm: Convert 2-reg-shift and 1-reg-imm Neon insns to decodetree Peter Maydell
2020-05-22 14:55 ` [PATCH v2 1/9] target/arm: Convert Neon VSHL and VSLI 2-reg-shift insn " Peter Maydell
@ 2020-05-22 14:55 ` Peter Maydell
2020-05-22 14:55 ` [PATCH v2 3/9] target/arm: Convert Neon VSRA, VSRI, VRSHR, VRSRA " Peter Maydell
` (7 subsequent siblings)
9 siblings, 0 replies; 17+ messages in thread
From: Peter Maydell @ 2020-05-22 14:55 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the VSHR 2-reg-shift insns to decodetree.
Note that unlike the legacy decoder, we present the right shift
amount to the trans_ function as a positive integer.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/neon-dp.decode | 25 ++++++++++++++++++++
target/arm/translate-neon.inc.c | 41 +++++++++++++++++++++++++++++++++
target/arm/translate.c | 21 +----------------
3 files changed, 67 insertions(+), 20 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index 4bd305e7ea0..cd3a8f936d7 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -206,6 +206,21 @@ VMINNM_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
######################################################################
&2reg_shift vm vd q shift size
+# Right shifts are encoded as N - shift, where N is the element size in bits.
+%neon_rshift_i6 16:6 !function=rsub_64
+%neon_rshift_i5 16:5 !function=rsub_32
+%neon_rshift_i4 16:4 !function=rsub_16
+%neon_rshift_i3 16:3 !function=rsub_8
+
+@2reg_shr_d .... ... . . . ...... .... .... 1 q:1 . . .... \
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=3 shift=%neon_rshift_i6
+@2reg_shr_s .... ... . . . 1 ..... .... .... 0 q:1 . . .... \
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=2 shift=%neon_rshift_i5
+@2reg_shr_h .... ... . . . 01 .... .... .... 0 q:1 . . .... \
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=1 shift=%neon_rshift_i4
+@2reg_shr_b .... ... . . . 001 ... .... .... 0 q:1 . . .... \
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=0 shift=%neon_rshift_i3
+
@2reg_shl_d .... ... . . . shift:6 .... .... 1 q:1 . . .... \
&2reg_shift vm=%vm_dp vd=%vd_dp size=3
@2reg_shl_s .... ... . . . 1 shift:5 .... .... 0 q:1 . . .... \
@@ -215,6 +230,16 @@ VMINNM_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
@2reg_shl_b .... ... . . . 001 shift:3 .... .... 0 q:1 . . .... \
&2reg_shift vm=%vm_dp vd=%vd_dp size=0
+VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_d
+VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_s
+VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_h
+VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_b
+
+VSHR_U_2sh 1111 001 1 1 . ...... .... 0000 . . . 1 .... @2reg_shr_d
+VSHR_U_2sh 1111 001 1 1 . ...... .... 0000 . . . 1 .... @2reg_shr_s
+VSHR_U_2sh 1111 001 1 1 . ...... .... 0000 . . . 1 .... @2reg_shr_h
+VSHR_U_2sh 1111 001 1 1 . ...... .... 0000 . . . 1 .... @2reg_shr_b
+
VSHL_2sh 1111 001 0 1 . ...... .... 0101 . . . 1 .... @2reg_shl_d
VSHL_2sh 1111 001 0 1 . ...... .... 0101 . . . 1 .... @2reg_shl_s
VSHL_2sh 1111 001 0 1 . ...... .... 0101 . . . 1 .... @2reg_shl_h
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 305213fe6d9..0475696835f 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -31,6 +31,24 @@ static inline int plus1(DisasContext *s, int x)
return x + 1;
}
+static inline int rsub_64(DisasContext *s, int x)
+{
+ return 64 - x;
+}
+
+static inline int rsub_32(DisasContext *s, int x)
+{
+ return 32 - x;
+}
+static inline int rsub_16(DisasContext *s, int x)
+{
+ return 16 - x;
+}
+static inline int rsub_8(DisasContext *s, int x)
+{
+ return 8 - x;
+}
+
/* Include the generated Neon decoder */
#include "decode-neon-dp.inc.c"
#include "decode-neon-ls.inc.c"
@@ -1348,3 +1366,26 @@ static bool do_vector_2sh(DisasContext *s, arg_2reg_shift *a, GVecGen2iFn *fn)
DO_2SH(VSHL, tcg_gen_gvec_shli)
DO_2SH(VSLI, gen_gvec_sli)
+
+static bool trans_VSHR_S_2sh(DisasContext *s, arg_2reg_shift *a)
+{
+ /* Signed shift out of range results in all-sign-bits */
+ a->shift = MIN(a->shift, (8 << a->size) - 1);
+ return do_vector_2sh(s, a, tcg_gen_gvec_sari);
+}
+
+static void gen_zero_rd_2sh(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+ int64_t shift, uint32_t oprsz, uint32_t maxsz)
+{
+ tcg_gen_gvec_dup_imm(vece, rd_ofs, oprsz, maxsz, 0);
+}
+
+static bool trans_VSHR_U_2sh(DisasContext *s, arg_2reg_shift *a)
+{
+ /* Shift out of range is architecturally valid and results in zero. */
+ if (a->shift >= (8 << a->size)) {
+ return do_vector_2sh(s, a, gen_zero_rd_2sh);
+ } else {
+ return do_vector_2sh(s, a, tcg_gen_gvec_shri);
+ }
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index d0a4a08f6d9..f2ccab1b21c 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -5296,6 +5296,7 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
op = (insn >> 8) & 0xf;
switch (op) {
+ case 0: /* VSHR */
case 5: /* VSHL, VSLI */
return 1; /* handled by decodetree */
default:
@@ -5330,26 +5331,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
}
switch (op) {
- case 0: /* VSHR */
- /* Right shift comes here negative. */
- shift = -shift;
- /* Shifts larger than the element size are architecturally
- * valid. Unsigned results in all zeros; signed results
- * in all sign bits.
- */
- if (!u) {
- tcg_gen_gvec_sari(size, rd_ofs, rm_ofs,
- MIN(shift, (8 << size) - 1),
- vec_size, vec_size);
- } else if (shift >= 8 << size) {
- tcg_gen_gvec_dup_imm(MO_8, rd_ofs, vec_size,
- vec_size, 0);
- } else {
- tcg_gen_gvec_shri(size, rd_ofs, rm_ofs, shift,
- vec_size, vec_size);
- }
- return 0;
-
case 1: /* VSRA */
/* Right shift comes here negative. */
shift = -shift;
--
2.20.1
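A note on the right-shift handling in this patch: the immediate field holds N - shift, so the rsub_N functions recover a positive count in the range 1..N, and a count equal to N is valid. The following is a minimal sketch of that arithmetic on a single 32-bit lane, in plain C rather than TCG. The names rshift_amount, vshr_u32 and vshr_s32 are hypothetical, and the sketch only models the result, not how the gvec expanders produce it.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for rsub_8/16/32/64: 'size' is log2 of the
 * element size in bytes, 'field' is the raw low bits of imm6.
 */
static int rshift_amount(int size, int field)
{
    int esize = 8 << size;

    assert(field >= 0 && field < esize);
    return esize - field;                  /* always in 1..esize */
}

/* Unsigned case: a count equal to the element width gives zero. */
static uint32_t vshr_u32(uint32_t x, int shift)
{
    assert(shift >= 1 && shift <= 32);
    return shift >= 32 ? 0 : x >> shift;
}

/* Signed case: clamp to esize - 1 so the result is all sign bits. */
static int32_t vshr_s32(int32_t x, int shift)
{
    assert(shift >= 1 && shift <= 32);
    if (shift > 31) {
        shift = 31;
    }
    /* Arithmetic shift written without relying on signed >> behaviour. */
    return x < 0 ? ~(~x >> shift) : x >> shift;
}
```

The clamp in vshr_s32 corresponds to the MIN() in trans_VSHR_S_2sh, and the zero result in vshr_u32 corresponds to the gen_zero_rd_2sh path.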
* [PATCH v2 3/9] target/arm: Convert Neon VSRA, VSRI, VRSHR, VRSRA 2-reg-shift insns to decodetree
2020-05-22 14:55 [PATCH v2 0/9] target/arm: Convert 2-reg-shift and 1-reg-imm Neon insns to decodetree Peter Maydell
2020-05-22 14:55 ` [PATCH v2 1/9] target/arm: Convert Neon VSHL and VSLI 2-reg-shift insn " Peter Maydell
2020-05-22 14:55 ` [PATCH v2 2/9] target/arm: Convert Neon VSHR 2-reg-shift insns " Peter Maydell
@ 2020-05-22 14:55 ` Peter Maydell
2020-05-22 14:55 ` [PATCH v2 4/9] target/arm: Convert VQSHLU, VQSHL " Peter Maydell
` (6 subsequent siblings)
9 siblings, 0 replies; 17+ messages in thread
From: Peter Maydell @ 2020-05-22 14:55 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the VSRA, VSRI, VRSHR, VRSRA 2-reg-shift insns to decodetree.
(These are the last instructions in the group that are vectorized;
the rest all require looping over each element.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/neon-dp.decode | 35 ++++++++++++++++++++++
target/arm/translate-neon.inc.c | 7 +++++
target/arm/translate.c | 52 +++------------------------------
3 files changed, 46 insertions(+), 48 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index cd3a8f936d7..d99a07b16d4 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -240,6 +240,41 @@ VSHR_U_2sh 1111 001 1 1 . ...... .... 0000 . . . 1 .... @2reg_shr_s
VSHR_U_2sh 1111 001 1 1 . ...... .... 0000 . . . 1 .... @2reg_shr_h
VSHR_U_2sh 1111 001 1 1 . ...... .... 0000 . . . 1 .... @2reg_shr_b
+VSRA_S_2sh 1111 001 0 1 . ...... .... 0001 . . . 1 .... @2reg_shr_d
+VSRA_S_2sh 1111 001 0 1 . ...... .... 0001 . . . 1 .... @2reg_shr_s
+VSRA_S_2sh 1111 001 0 1 . ...... .... 0001 . . . 1 .... @2reg_shr_h
+VSRA_S_2sh 1111 001 0 1 . ...... .... 0001 . . . 1 .... @2reg_shr_b
+
+VSRA_U_2sh 1111 001 1 1 . ...... .... 0001 . . . 1 .... @2reg_shr_d
+VSRA_U_2sh 1111 001 1 1 . ...... .... 0001 . . . 1 .... @2reg_shr_s
+VSRA_U_2sh 1111 001 1 1 . ...... .... 0001 . . . 1 .... @2reg_shr_h
+VSRA_U_2sh 1111 001 1 1 . ...... .... 0001 . . . 1 .... @2reg_shr_b
+
+VRSHR_S_2sh 1111 001 0 1 . ...... .... 0010 . . . 1 .... @2reg_shr_d
+VRSHR_S_2sh 1111 001 0 1 . ...... .... 0010 . . . 1 .... @2reg_shr_s
+VRSHR_S_2sh 1111 001 0 1 . ...... .... 0010 . . . 1 .... @2reg_shr_h
+VRSHR_S_2sh 1111 001 0 1 . ...... .... 0010 . . . 1 .... @2reg_shr_b
+
+VRSHR_U_2sh 1111 001 1 1 . ...... .... 0010 . . . 1 .... @2reg_shr_d
+VRSHR_U_2sh 1111 001 1 1 . ...... .... 0010 . . . 1 .... @2reg_shr_s
+VRSHR_U_2sh 1111 001 1 1 . ...... .... 0010 . . . 1 .... @2reg_shr_h
+VRSHR_U_2sh 1111 001 1 1 . ...... .... 0010 . . . 1 .... @2reg_shr_b
+
+VRSRA_S_2sh 1111 001 0 1 . ...... .... 0011 . . . 1 .... @2reg_shr_d
+VRSRA_S_2sh 1111 001 0 1 . ...... .... 0011 . . . 1 .... @2reg_shr_s
+VRSRA_S_2sh 1111 001 0 1 . ...... .... 0011 . . . 1 .... @2reg_shr_h
+VRSRA_S_2sh 1111 001 0 1 . ...... .... 0011 . . . 1 .... @2reg_shr_b
+
+VRSRA_U_2sh 1111 001 1 1 . ...... .... 0011 . . . 1 .... @2reg_shr_d
+VRSRA_U_2sh 1111 001 1 1 . ...... .... 0011 . . . 1 .... @2reg_shr_s
+VRSRA_U_2sh 1111 001 1 1 . ...... .... 0011 . . . 1 .... @2reg_shr_h
+VRSRA_U_2sh 1111 001 1 1 . ...... .... 0011 . . . 1 .... @2reg_shr_b
+
+VSRI_2sh 1111 001 1 1 . ...... .... 0100 . . . 1 .... @2reg_shr_d
+VSRI_2sh 1111 001 1 1 . ...... .... 0100 . . . 1 .... @2reg_shr_s
+VSRI_2sh 1111 001 1 1 . ...... .... 0100 . . . 1 .... @2reg_shr_h
+VSRI_2sh 1111 001 1 1 . ...... .... 0100 . . . 1 .... @2reg_shr_b
+
VSHL_2sh 1111 001 0 1 . ...... .... 0101 . . . 1 .... @2reg_shl_d
VSHL_2sh 1111 001 0 1 . ...... .... 0101 . . . 1 .... @2reg_shl_s
VSHL_2sh 1111 001 0 1 . ...... .... 0101 . . . 1 .... @2reg_shl_h
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 0475696835f..f4d42683aea 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -1366,6 +1366,13 @@ static bool do_vector_2sh(DisasContext *s, arg_2reg_shift *a, GVecGen2iFn *fn)
DO_2SH(VSHL, tcg_gen_gvec_shli)
DO_2SH(VSLI, gen_gvec_sli)
+DO_2SH(VSRI, gen_gvec_sri)
+DO_2SH(VSRA_S, gen_gvec_ssra)
+DO_2SH(VSRA_U, gen_gvec_usra)
+DO_2SH(VRSHR_S, gen_gvec_srshr)
+DO_2SH(VRSHR_U, gen_gvec_urshr)
+DO_2SH(VRSRA_S, gen_gvec_srsra)
+DO_2SH(VRSRA_U, gen_gvec_ursra)
static bool trans_VSHR_S_2sh(DisasContext *s, arg_2reg_shift *a)
{
diff --git a/target/arm/translate.c b/target/arm/translate.c
index f2ccab1b21c..4a55986aad9 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -5297,6 +5297,10 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
switch (op) {
case 0: /* VSHR */
+ case 1: /* VSRA */
+ case 2: /* VRSHR */
+ case 3: /* VRSRA */
+ case 4: /* VSRI */
case 5: /* VSHL, VSLI */
return 1; /* handled by decodetree */
default:
@@ -5330,54 +5334,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
shift = shift - (1 << (size + 3));
}
- switch (op) {
- case 1: /* VSRA */
- /* Right shift comes here negative. */
- shift = -shift;
- if (u) {
- gen_gvec_usra(size, rd_ofs, rm_ofs, shift,
- vec_size, vec_size);
- } else {
- gen_gvec_ssra(size, rd_ofs, rm_ofs, shift,
- vec_size, vec_size);
- }
- return 0;
-
- case 2: /* VRSHR */
- /* Right shift comes here negative. */
- shift = -shift;
- if (u) {
- gen_gvec_urshr(size, rd_ofs, rm_ofs, shift,
- vec_size, vec_size);
- } else {
- gen_gvec_srshr(size, rd_ofs, rm_ofs, shift,
- vec_size, vec_size);
- }
- return 0;
-
- case 3: /* VRSRA */
- /* Right shift comes here negative. */
- shift = -shift;
- if (u) {
- gen_gvec_ursra(size, rd_ofs, rm_ofs, shift,
- vec_size, vec_size);
- } else {
- gen_gvec_srsra(size, rd_ofs, rm_ofs, shift,
- vec_size, vec_size);
- }
- return 0;
-
- case 4: /* VSRI */
- if (!u) {
- return 1;
- }
- /* Right shift comes here negative. */
- shift = -shift;
- gen_gvec_sri(size, rd_ofs, rm_ofs, shift,
- vec_size, vec_size);
- return 0;
- }
-
if (size == 3) {
count = q + 1;
} else {
--
2.20.1
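For readers less familiar with the insns converted in this patch, the following is a rough per-lane model of the unsigned accumulate and rounding variants (VSRA, VRSHR, VRSRA) on one 32-bit element. It reflects the architectural behaviour as I understand it, computed in a wider type so that a shift count of 32 is handled. The function names are hypothetical and the sketch does not reproduce the gen_gvec_* expanders that the patch actually calls.

```c
#include <assert.h>
#include <stdint.h>

/* Rounding shift right: add half of the discarded range, then shift. */
static uint32_t urshr32(uint32_t x, int shift)
{
    uint64_t round = UINT64_C(1) << (shift - 1);

    assert(shift >= 1 && shift <= 32);
    return (uint32_t)(((uint64_t)x + round) >> shift);
}

/* Shift right and accumulate into the destination lane. */
static uint32_t usra32(uint32_t d, uint32_t x, int shift)
{
    uint32_t shifted = shift >= 32 ? 0 : x >> shift;

    assert(shift >= 1 && shift <= 32);
    return d + shifted;
}

/* Rounding shift right and accumulate. */
static uint32_t ursra32(uint32_t d, uint32_t x, int shift)
{
    return d + urshr32(x, shift);
}
```

Because the shift count arrives already positive from the decodetree formats, none of these models needs the "shift = -shift" step that the legacy decoder carried in every case.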
* [PATCH v2 4/9] target/arm: Convert VQSHLU, VQSHL 2-reg-shift insns to decodetree
2020-05-22 14:55 [PATCH v2 0/9] target/arm: Convert 2-reg-shift and 1-reg-imm Neon insns to decodetree Peter Maydell
` (2 preceding siblings ...)
2020-05-22 14:55 ` [PATCH v2 3/9] target/arm: Convert Neon VSRA, VSRI, VRSHR, VRSRA " Peter Maydell
@ 2020-05-22 14:55 ` Peter Maydell
2020-06-01 23:12 ` Richard Henderson
2020-05-22 14:55 ` [PATCH v2 5/9] target/arm: Convert Neon narrowing shifts with op==8 " Peter Maydell
` (5 subsequent siblings)
9 siblings, 1 reply; 17+ messages in thread
From: Peter Maydell @ 2020-05-22 14:55 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the VQSHLU and VQSHL 2-reg-shift insns to decodetree.
These are the last of the simple shift-by-immediate insns.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/neon-dp.decode | 15 +++++
target/arm/translate-neon.inc.c | 108 +++++++++++++++++++++++++++++++
target/arm/translate.c | 110 +-------------------------------
3 files changed, 126 insertions(+), 107 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index d99a07b16d4..f9183060a51 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -284,3 +284,18 @@ VSLI_2sh 1111 001 1 1 . ...... .... 0101 . . . 1 .... @2reg_shl_d
VSLI_2sh 1111 001 1 1 . ...... .... 0101 . . . 1 .... @2reg_shl_s
VSLI_2sh 1111 001 1 1 . ...... .... 0101 . . . 1 .... @2reg_shl_h
VSLI_2sh 1111 001 1 1 . ...... .... 0101 . . . 1 .... @2reg_shl_b
+
+VQSHLU_64_2sh 1111 001 1 1 . ...... .... 0110 . . . 1 .... @2reg_shl_d
+VQSHLU_2sh 1111 001 1 1 . ...... .... 0110 . . . 1 .... @2reg_shl_s
+VQSHLU_2sh 1111 001 1 1 . ...... .... 0110 . . . 1 .... @2reg_shl_h
+VQSHLU_2sh 1111 001 1 1 . ...... .... 0110 . . . 1 .... @2reg_shl_b
+
+VQSHL_S_64_2sh 1111 001 0 1 . ...... .... 0111 . . . 1 .... @2reg_shl_d
+VQSHL_S_2sh 1111 001 0 1 . ...... .... 0111 . . . 1 .... @2reg_shl_s
+VQSHL_S_2sh 1111 001 0 1 . ...... .... 0111 . . . 1 .... @2reg_shl_h
+VQSHL_S_2sh 1111 001 0 1 . ...... .... 0111 . . . 1 .... @2reg_shl_b
+
+VQSHL_U_64_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_d
+VQSHL_U_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_s
+VQSHL_U_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_h
+VQSHL_U_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_b
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index f4d42683aea..396db55565f 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -1396,3 +1396,111 @@ static bool trans_VSHR_U_2sh(DisasContext *s, arg_2reg_shift *a)
return do_vector_2sh(s, a, tcg_gen_gvec_shri);
}
}
+
+static bool do_2shift_env_64(DisasContext *s, arg_2reg_shift *a,
+ NeonGenTwo64OpEnvFn *fn)
+{
+ /*
+ * 2-reg-and-shift operations, size == 3 case, where the
+ * function needs to be passed cpu_env.
+ */
+ TCGv_i64 constimm;
+ int pass;
+
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+ return false;
+ }
+
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
+ ((a->vd | a->vm) & 0x10)) {
+ return false;
+ }
+
+ if ((a->vm | a->vd) & a->q) {
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ /*
+ * To avoid excessive duplication of ops we implement shift
+ * by immediate using the variable shift operations.
+ */
+ constimm = tcg_const_i64(dup_const(a->size, a->shift));
+
+ for (pass = 0; pass < a->q + 1; pass++) {
+ TCGv_i64 tmp = tcg_temp_new_i64();
+
+ neon_load_reg64(tmp, a->vm + pass);
+ fn(tmp, cpu_env, tmp, constimm);
+ neon_store_reg64(tmp, a->vd + pass);
+ }
+ tcg_temp_free_i64(constimm);
+ return true;
+}
+
+static bool do_2shift_env_32(DisasContext *s, arg_2reg_shift *a,
+ NeonGenTwoOpEnvFn *fn)
+{
+ /*
+ * 2-reg-and-shift operations, size < 3 case, where the
+ * helper needs to be passed cpu_env.
+ */
+ TCGv_i32 constimm;
+ int pass;
+
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+ return false;
+ }
+
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
+ ((a->vd | a->vm) & 0x10)) {
+ return false;
+ }
+
+ if ((a->vm | a->vd) & a->q) {
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ /*
+ * To avoid excessive duplication of ops we implement shift
+ * by immediate using the variable shift operations.
+ */
+ constimm = tcg_const_i32(dup_const(a->size, a->shift));
+
+ for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
+ TCGv_i32 tmp = neon_load_reg(a->vm, pass);
+ fn(tmp, cpu_env, tmp, constimm);
+ neon_store_reg(a->vd, pass, tmp);
+ }
+ tcg_temp_free_i32(constimm);
+ return true;
+}
+
+#define DO_2SHIFT_ENV(INSN, FUNC) \
+ static bool trans_##INSN##_64_2sh(DisasContext *s, arg_2reg_shift *a) \
+ { \
+ return do_2shift_env_64(s, a, gen_helper_neon_##FUNC##64); \
+ } \
+ static bool trans_##INSN##_2sh(DisasContext *s, arg_2reg_shift *a) \
+ { \
+ static NeonGenTwoOpEnvFn * const fns[] = { \
+ gen_helper_neon_##FUNC##8, \
+ gen_helper_neon_##FUNC##16, \
+ gen_helper_neon_##FUNC##32, \
+ }; \
+ assert(a->size < ARRAY_SIZE(fns)); \
+ return do_2shift_env_32(s, a, fns[a->size]); \
+ }
+
+DO_2SHIFT_ENV(VQSHLU, qshlu_s)
+DO_2SHIFT_ENV(VQSHL_U, qshl_u)
+DO_2SHIFT_ENV(VQSHL_S, qshl_s)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 4a55986aad9..d711d39eb9d 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -3011,29 +3011,6 @@ static inline void gen_neon_rsb(int size, TCGv_i32 t0, TCGv_i32 t1)
}
}
-#define GEN_NEON_INTEGER_OP_ENV(name) do { \
- switch ((size << 1) | u) { \
- case 0: \
- gen_helper_neon_##name##_s8(tmp, cpu_env, tmp, tmp2); \
- break; \
- case 1: \
- gen_helper_neon_##name##_u8(tmp, cpu_env, tmp, tmp2); \
- break; \
- case 2: \
- gen_helper_neon_##name##_s16(tmp, cpu_env, tmp, tmp2); \
- break; \
- case 3: \
- gen_helper_neon_##name##_u16(tmp, cpu_env, tmp, tmp2); \
- break; \
- case 4: \
- gen_helper_neon_##name##_s32(tmp, cpu_env, tmp, tmp2); \
- break; \
- case 5: \
- gen_helper_neon_##name##_u32(tmp, cpu_env, tmp, tmp2); \
- break; \
- default: return 1; \
- }} while (0)
-
static TCGv_i32 neon_load_scratch(int scratch)
{
TCGv_i32 tmp = tcg_temp_new_i32();
@@ -5252,7 +5229,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
int size;
int shift;
int pass;
- int count;
int u;
int vec_size;
uint32_t imm;
@@ -5302,6 +5278,8 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
case 3: /* VRSRA */
case 4: /* VSRI */
case 5: /* VSHL, VSLI */
+ case 6: /* VQSHLU */
+ case 7: /* VQSHL */
return 1; /* handled by decodetree */
default:
break;
@@ -5319,89 +5297,7 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
size--;
}
shift = (insn >> 16) & ((1 << (3 + size)) - 1);
- if (op < 8) {
- /* Shift by immediate:
- VSHR, VSRA, VRSHR, VRSRA, VSRI, VSHL, VQSHL, VQSHLU. */
- if (q && ((rd | rm) & 1)) {
- return 1;
- }
- if (!u && (op == 4 || op == 6)) {
- return 1;
- }
- /* Right shifts are encoded as N - shift, where N is the
- element size in bits. */
- if (op <= 4) {
- shift = shift - (1 << (size + 3));
- }
-
- if (size == 3) {
- count = q + 1;
- } else {
- count = q ? 4: 2;
- }
-
- /* To avoid excessive duplication of ops we implement shift
- * by immediate using the variable shift operations.
- */
- imm = dup_const(size, shift);
-
- for (pass = 0; pass < count; pass++) {
- if (size == 3) {
- neon_load_reg64(cpu_V0, rm + pass);
- tcg_gen_movi_i64(cpu_V1, imm);
- switch (op) {
- case 6: /* VQSHLU */
- gen_helper_neon_qshlu_s64(cpu_V0, cpu_env,
- cpu_V0, cpu_V1);
- break;
- case 7: /* VQSHL */
- if (u) {
- gen_helper_neon_qshl_u64(cpu_V0, cpu_env,
- cpu_V0, cpu_V1);
- } else {
- gen_helper_neon_qshl_s64(cpu_V0, cpu_env,
- cpu_V0, cpu_V1);
- }
- break;
- default:
- g_assert_not_reached();
- }
- neon_store_reg64(cpu_V0, rd + pass);
- } else { /* size < 3 */
- /* Operands in T0 and T1. */
- tmp = neon_load_reg(rm, pass);
- tmp2 = tcg_temp_new_i32();
- tcg_gen_movi_i32(tmp2, imm);
- switch (op) {
- case 6: /* VQSHLU */
- switch (size) {
- case 0:
- gen_helper_neon_qshlu_s8(tmp, cpu_env,
- tmp, tmp2);
- break;
- case 1:
- gen_helper_neon_qshlu_s16(tmp, cpu_env,
- tmp, tmp2);
- break;
- case 2:
- gen_helper_neon_qshlu_s32(tmp, cpu_env,
- tmp, tmp2);
- break;
- default:
- abort();
- }
- break;
- case 7: /* VQSHL */
- GEN_NEON_INTEGER_OP_ENV(qshl);
- break;
- default:
- g_assert_not_reached();
- }
- tcg_temp_free_i32(tmp2);
- neon_store_reg(rd, pass, tmp);
- }
- } /* for pass */
- } else if (op < 10) {
+ if (op < 10) {
/* Shift by immediate and narrow:
VSHRN, VRSHRN, VQSHRN, VQRSHRN. */
int input_unsigned = (op == 8) ? !u : u;
--
2.20.1
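A note on the "shift by immediate using the variable shift operations" comment in this patch: the variable-shift helpers take a per-lane shift count, so the immediate is replicated into every lane of the constant before the helper is called. The following is a minimal sketch of that replication in plain C. dup_const_sketch is a hypothetical stand-in written for illustration and should not be read as QEMU's dup_const() implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in: replicate the low (8 << size) bits of 'c'
 * across a 64-bit value. 'size' is log2 of the element size in bytes.
 */
static uint64_t dup_const_sketch(unsigned size, uint64_t c)
{
    switch (size) {
    case 0:
        return UINT64_C(0x0101010101010101) * (uint8_t)c;
    case 1:
        return UINT64_C(0x0001000100010001) * (uint16_t)c;
    case 2:
        return UINT64_C(0x0000000100000001) * (uint32_t)c;
    case 3:
        return c;
    default:
        assert(0 && "size must be 0..3");
        return 0;
    }
}
```

In the size < 3 path the constant is a 32-bit value, so only the low 32 bits of the replicated pattern are used: a byte shift of 3 becomes 0x03030303, one count per 8-bit lane.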
* Re: [PATCH v2 4/9] target/arm: Convert VQSHLU, VQSHL 2-reg-shift insns to decodetree
2020-05-22 14:55 ` [PATCH v2 4/9] target/arm: Convert VQSHLU, VQSHL " Peter Maydell
@ 2020-06-01 23:12 ` Richard Henderson
0 siblings, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2020-06-01 23:12 UTC (permalink / raw)
To: Peter Maydell, qemu-arm, qemu-devel
On 5/22/20 7:55 AM, Peter Maydell wrote:
> Convert the VQSHLU and VQSHL 2-reg-shift insns to decodetree.
> These are the last of the simple shift-by-immediate insns.
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> target/arm/neon-dp.decode | 15 +++++
> target/arm/translate-neon.inc.c | 108 +++++++++++++++++++++++++++++++
> target/arm/translate.c | 110 +-------------------------------
> 3 files changed, 126 insertions(+), 107 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
* [PATCH v2 5/9] target/arm: Convert Neon narrowing shifts with op==8 to decodetree
2020-05-22 14:55 [PATCH v2 0/9] target/arm: Convert 2-reg-shift and 1-reg-imm Neon insns to decodetree Peter Maydell
` (3 preceding siblings ...)
2020-05-22 14:55 ` [PATCH v2 4/9] target/arm: Convert VQSHLU, VQSHL " Peter Maydell
@ 2020-05-22 14:55 ` Peter Maydell
2020-05-22 22:16 ` Peter Maydell
2020-06-01 23:13 ` Richard Henderson
2020-05-22 14:55 ` [PATCH v2 6/9] target/arm: Convert Neon narrowing shifts with op==9 " Peter Maydell
` (4 subsequent siblings)
9 siblings, 2 replies; 17+ messages in thread
From: Peter Maydell @ 2020-05-22 14:55 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the Neon narrowing shifts where op==8 to decodetree:
* VSHRN
* VRSHRN
* VQSHRUN
* VQRSHRUN
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/neon-dp.decode | 27 +++++
target/arm/translate-neon.inc.c | 168 ++++++++++++++++++++++++++++++++
target/arm/translate.c | 1 +
3 files changed, 196 insertions(+)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index f9183060a51..01887240b4a 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -230,6 +230,17 @@ VMINNM_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
@2reg_shl_b .... ... . . . 001 shift:3 .... .... 0 q:1 . . .... \
&2reg_shift vm=%vm_dp vd=%vd_dp size=0
+# Narrowing right shifts: here the Q bit is part of the opcode decode
+@2reg_shrn_d .... ... . . . 1 ..... .... .... 0 . . . .... \
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=3 q=0 \
+ shift=%neon_rshift_i5
+@2reg_shrn_s .... ... . . . 01 .... .... .... 0 . . . .... \
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=2 q=0 \
+ shift=%neon_rshift_i4
+@2reg_shrn_h .... ... . . . 001 ... .... .... 0 . . . .... \
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=1 q=0 \
+ shift=%neon_rshift_i3
+
VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_d
VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_s
VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_h
@@ -299,3 +310,19 @@ VQSHL_U_64_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_d
VQSHL_U_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_s
VQSHL_U_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_h
VQSHL_U_2sh 1111 001 1 1 . ...... .... 0111 . . . 1 .... @2reg_shl_b
+
+VSHRN_64_2sh 1111 001 0 1 . ...... .... 1000 . 0 . 1 .... @2reg_shrn_d
+VSHRN_32_2sh 1111 001 0 1 . ...... .... 1000 . 0 . 1 .... @2reg_shrn_s
+VSHRN_16_2sh 1111 001 0 1 . ...... .... 1000 . 0 . 1 .... @2reg_shrn_h
+
+VRSHRN_64_2sh 1111 001 0 1 . ...... .... 1000 . 1 . 1 .... @2reg_shrn_d
+VRSHRN_32_2sh 1111 001 0 1 . ...... .... 1000 . 1 . 1 .... @2reg_shrn_s
+VRSHRN_16_2sh 1111 001 0 1 . ...... .... 1000 . 1 . 1 .... @2reg_shrn_h
+
+VQSHRUN_64_2sh 1111 001 1 1 . ...... .... 1000 . 0 . 1 .... @2reg_shrn_d
+VQSHRUN_32_2sh 1111 001 1 1 . ...... .... 1000 . 0 . 1 .... @2reg_shrn_s
+VQSHRUN_16_2sh 1111 001 1 1 . ...... .... 1000 . 0 . 1 .... @2reg_shrn_h
+
+VQRSHRUN_64_2sh 1111 001 1 1 . ...... .... 1000 . 1 . 1 .... @2reg_shrn_d
+VQRSHRUN_32_2sh 1111 001 1 1 . ...... .... 1000 . 1 . 1 .... @2reg_shrn_s
+VQRSHRUN_16_2sh 1111 001 1 1 . ...... .... 1000 . 1 . 1 .... @2reg_shrn_h
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 396db55565f..18ea7255e38 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -1504,3 +1504,171 @@ static bool do_2shift_env_32(DisasContext *s, arg_2reg_shift *a,
DO_2SHIFT_ENV(VQSHLU, qshlu_s)
DO_2SHIFT_ENV(VQSHL_U, qshl_u)
DO_2SHIFT_ENV(VQSHL_S, qshl_s)
+
+static bool do_2shift_narrow_64(DisasContext *s, arg_2reg_shift *a,
+ NeonGenTwo64OpFn *shiftfn,
+ NeonGenNarrowEnvFn *narrowfn)
+{
+ /* 2-reg-and-shift narrowing-shift operations, size == 3 case */
+ TCGv_i64 constimm, rm1, rm2;
+ TCGv_i32 rd;
+
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+ return false;
+ }
+
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
+ ((a->vd | a->vm) & 0x10)) {
+ return false;
+ }
+
+ if (a->vm & 1) {
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ /*
+ * This is always a right shift, and the shiftfn is always a
+ * left-shift helper, which thus needs the negated shift count.
+ */
+ constimm = tcg_const_i64(-a->shift);
+ rm1 = tcg_temp_new_i64();
+ rm2 = tcg_temp_new_i64();
+
+ /* Load both inputs first to avoid potential overwrite if rm == rd */
+ neon_load_reg64(rm1, a->vm);
+ neon_load_reg64(rm2, a->vm + 1);
+
+ shiftfn(rm1, rm1, constimm);
+ rd = tcg_temp_new_i32();
+ narrowfn(rd, cpu_env, rm1);
+ neon_store_reg(a->vd, 0, rd);
+
+ shiftfn(rm2, rm2, constimm);
+ rd = tcg_temp_new_i32();
+ narrowfn(rd, cpu_env, rm2);
+ neon_store_reg(a->vd, 1, rd);
+
+ tcg_temp_free_i64(rm1);
+ tcg_temp_free_i64(rm2);
+ tcg_temp_free_i64(constimm);
+
+ return true;
+}
+
+static bool do_2shift_narrow_32(DisasContext *s, arg_2reg_shift *a,
+ NeonGenTwoOpFn *shiftfn,
+ NeonGenNarrowEnvFn *narrowfn)
+{
+ /* 2-reg-and-shift narrowing-shift operations, size < 3 case */
+ TCGv_i32 constimm, rm1, rm2, rm3, rm4;
+ TCGv_i64 rtmp;
+ uint32_t imm;
+
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+ return false;
+ }
+
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
+ ((a->vd | a->vm) & 0x10)) {
+ return false;
+ }
+
+ if (a->vm & 1) {
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ /*
+ * This is always a right shift, and the shiftfn is always a
+ * left-shift helper, which thus needs the negated shift count
+ * duplicated into each lane of the immediate value.
+ */
+ if (a->size == 1) {
+ imm = (uint16_t)(-a->shift);
+ imm |= imm << 16;
+ } else {
+ /* size == 2 */
+ imm = -a->shift;
+ }
+ constimm = tcg_const_i32(imm);
+
+ /* Load all inputs first to avoid potential overwrite */
+ rm1 = neon_load_reg(a->vm, 0);
+ rm2 = neon_load_reg(a->vm, 1);
+ rm3 = neon_load_reg(a->vm + 1, 0);
+ rm4 = neon_load_reg(a->vm + 1, 1);
+ rtmp = tcg_temp_new_i64();
+
+ // todo expand out the shift-narrow and the narrow-op
+ shiftfn(rm1, rm1, constimm);
+ shiftfn(rm2, rm2, constimm);
+
+ tcg_gen_concat_i32_i64(rtmp, rm1, rm2);
+ tcg_temp_free_i32(rm2);
+
+ narrowfn(rm1, cpu_env, rtmp);
+ neon_store_reg(a->vd, 0, rm1);
+
+ shiftfn(rm3, rm3, constimm);
+ shiftfn(rm4, rm4, constimm);
+ tcg_temp_free_i32(constimm);
+
+ tcg_gen_concat_i32_i64(rtmp, rm3, rm4);
+ tcg_temp_free_i32(rm4);
+
+ narrowfn(rm3, cpu_env, rtmp);
+ tcg_temp_free_i64(rtmp);
+ neon_store_reg(a->vd, 1, rm3);
+ return true;
+}
+
+#define DO_2SN_64(INSN, FUNC, NARROWFUNC) \
+ static bool trans_##INSN##_2sh(DisasContext *s, arg_2reg_shift *a) \
+ { \
+ return do_2shift_narrow_64(s, a, FUNC, NARROWFUNC); \
+ }
+#define DO_2SN_32(INSN, FUNC, NARROWFUNC) \
+ static bool trans_##INSN##_2sh(DisasContext *s, arg_2reg_shift *a) \
+ { \
+ return do_2shift_narrow_32(s, a, FUNC, NARROWFUNC); \
+ }
+
+static void gen_neon_narrow_u32(TCGv_i32 dest, TCGv_ptr env, TCGv_i64 src)
+{
+ tcg_gen_extrl_i64_i32(dest, src);
+}
+
+static void gen_neon_narrow_u16(TCGv_i32 dest, TCGv_ptr env, TCGv_i64 src)
+{
+ gen_helper_neon_narrow_u16(dest, src);
+}
+
+static void gen_neon_narrow_u8(TCGv_i32 dest, TCGv_ptr env, TCGv_i64 src)
+{
+ gen_helper_neon_narrow_u8(dest, src);
+}
+
+DO_2SN_64(VSHRN_64, gen_ushl_i64, gen_neon_narrow_u32)
+DO_2SN_32(VSHRN_32, gen_ushl_i32, gen_neon_narrow_u16)
+DO_2SN_32(VSHRN_16, gen_helper_neon_shl_u16, gen_neon_narrow_u8)
+
+DO_2SN_64(VRSHRN_64, gen_helper_neon_rshl_u64, gen_neon_narrow_u32)
+DO_2SN_32(VRSHRN_32, gen_helper_neon_rshl_u32, gen_neon_narrow_u16)
+DO_2SN_32(VRSHRN_16, gen_helper_neon_rshl_u16, gen_neon_narrow_u8)
+
+DO_2SN_64(VQSHRUN_64, gen_sshl_i64, gen_helper_neon_unarrow_sat32)
+DO_2SN_32(VQSHRUN_32, gen_sshl_i32, gen_helper_neon_unarrow_sat16)
+DO_2SN_32(VQSHRUN_16, gen_helper_neon_shl_s16, gen_helper_neon_unarrow_sat8)
+
+DO_2SN_64(VQRSHRUN_64, gen_helper_neon_rshl_s64, gen_helper_neon_unarrow_sat32)
+DO_2SN_32(VQRSHRUN_32, gen_helper_neon_rshl_s32, gen_helper_neon_unarrow_sat16)
+DO_2SN_32(VQRSHRUN_16, gen_helper_neon_rshl_s16, gen_helper_neon_unarrow_sat8)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index d711d39eb9d..f884db535b4 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -5280,6 +5280,7 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
case 5: /* VSHL, VSLI */
case 6: /* VQSHLU */
case 7: /* VQSHL */
+ case 8: /* VSHRN, VRSHRN, VQSHRUN, VQRSHRUN */
return 1; /* handled by decodetree */
default:
break;
--
2.20.1
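The comment in do_2shift_narrow_32() above says the left-shift helper needs the negated shift count duplicated into each lane. A minimal standalone sketch of that packing for the 16-bit case follows. The names narrow_shift_imm16() and lane_shl_u16() are hypothetical and are not QEMU functions. The lane model assumes the helper reads its count as a signed value from the low byte of each lane, which the patch does not state.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: pack the negated right-shift count into both
 * 16-bit lanes of a 32-bit immediate, as the size == 1 branch does.
 */
static uint32_t narrow_shift_imm16(int shift)
{
    assert(shift > 0 && shift < 128);   /* must fit in a signed byte */
    uint32_t imm = (uint16_t)(-shift);  /* e.g. 3 -> 0xfffd */
    imm |= imm << 16;                   /* same count in the upper lane */
    return imm;
}

/*
 * Hypothetical model of one lane of an unsigned left-shift helper.
 * Assumption: the count is the signed low byte of the lane operand,
 * and a negative count means a logical right shift.
 */
static uint16_t lane_shl_u16(uint16_t val, uint16_t lane_count)
{
    int count = lane_count & 0xff;
    if (count >= 128) {
        count -= 256;                   /* sign-extend the low byte */
    }
    if (count >= 16 || count <= -16) {
        return 0;                       /* everything shifted out */
    }
    if (count < 0) {
        return (uint16_t)(val >> -count);
    }
    return (uint16_t)(val << count);
}
```

With shift == 3 the packed immediate is 0xfffdfffd, and feeding either half to the lane model shifts that lane right by 3.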
* Re: [PATCH v2 5/9] target/arm: Convert Neon narrowing shifts with op==8 to decodetree
2020-05-22 14:55 ` [PATCH v2 5/9] target/arm: Convert Neon narrowing shifts with op==8 " Peter Maydell
@ 2020-05-22 22:16 ` Peter Maydell
2020-06-01 23:13 ` Richard Henderson
1 sibling, 0 replies; 17+ messages in thread
From: Peter Maydell @ 2020-05-22 22:16 UTC (permalink / raw)
To: qemu-arm, QEMU Developers; +Cc: Richard Henderson
On Fri, 22 May 2020 at 15:55, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> Convert the Neon narrowing shifts where op==8 to decodetree:
> * VSHRN
> * VRSHRN
> * VQSHRUN
> * VQRSHRUN
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> + // todo expand out the shift-narrow and the narrow-op
Oops. I fixed this todo item but forgot to delete the comment.
The code should be correct, though.
-- PMM
* Re: [PATCH v2 5/9] target/arm: Convert Neon narrowing shifts with op==8 to decodetree
2020-05-22 14:55 ` [PATCH v2 5/9] target/arm: Convert Neon narrowing shifts with op==8 " Peter Maydell
2020-05-22 22:16 ` Peter Maydell
@ 2020-06-01 23:13 ` Richard Henderson
1 sibling, 0 replies; 17+ messages in thread
From: Richard Henderson @ 2020-06-01 23:13 UTC (permalink / raw)
To: Peter Maydell, qemu-arm, qemu-devel
On 5/22/20 7:55 AM, Peter Maydell wrote:
> Convert the Neon narrowing shifts where op==8 to decodetree:
> * VSHRN
> * VRSHRN
> * VQSHRUN
> * VQRSHRUN
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> target/arm/neon-dp.decode | 27 +++++
> target/arm/translate-neon.inc.c | 168 ++++++++++++++++++++++++++++++++
> target/arm/translate.c | 1 +
> 3 files changed, 196 insertions(+)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
* [PATCH v2 6/9] target/arm: Convert Neon narrowing shifts with op==9 to decodetree
2020-05-22 14:55 [PATCH v2 0/9] target/arm: Convert 2-reg-shift and 1-reg-imm Neon insns to decodetree Peter Maydell
` (4 preceding siblings ...)
2020-05-22 14:55 ` [PATCH v2 5/9] target/arm: Convert Neon narrowing shifts with op==8 " Peter Maydell
@ 2020-05-22 14:55 ` Peter Maydell
2020-05-22 14:55 ` [PATCH v2 7/9] target/arm: Convert Neon VSHLL, VMOVL " Peter Maydell
` (3 subsequent siblings)
9 siblings, 0 replies; 17+ messages in thread
From: Peter Maydell @ 2020-05-22 14:55 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the remaining Neon narrowing shifts to decodetree:
* VQSHRN
* VQRSHRN
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/neon-dp.decode | 20 ++++++
target/arm/translate-neon.inc.c | 15 +++++
target/arm/translate.c | 110 +-------------------------------
3 files changed, 37 insertions(+), 108 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index 01887240b4a..43db393cf76 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -326,3 +326,23 @@ VQSHRUN_16_2sh 1111 001 1 1 . ...... .... 1000 . 0 . 1 .... @2reg_shrn_h
VQRSHRUN_64_2sh 1111 001 1 1 . ...... .... 1000 . 1 . 1 .... @2reg_shrn_d
VQRSHRUN_32_2sh 1111 001 1 1 . ...... .... 1000 . 1 . 1 .... @2reg_shrn_s
VQRSHRUN_16_2sh 1111 001 1 1 . ...... .... 1000 . 1 . 1 .... @2reg_shrn_h
+
+# VQSHRN with signed input
+VQSHRN_S64_2sh 1111 001 0 1 . ...... .... 1001 . 0 . 1 .... @2reg_shrn_d
+VQSHRN_S32_2sh 1111 001 0 1 . ...... .... 1001 . 0 . 1 .... @2reg_shrn_s
+VQSHRN_S16_2sh 1111 001 0 1 . ...... .... 1001 . 0 . 1 .... @2reg_shrn_h
+
+# VQRSHRN with signed input
+VQRSHRN_S64_2sh 1111 001 0 1 . ...... .... 1001 . 1 . 1 .... @2reg_shrn_d
+VQRSHRN_S32_2sh 1111 001 0 1 . ...... .... 1001 . 1 . 1 .... @2reg_shrn_s
+VQRSHRN_S16_2sh 1111 001 0 1 . ...... .... 1001 . 1 . 1 .... @2reg_shrn_h
+
+# VQSHRN with unsigned input
+VQSHRN_U64_2sh 1111 001 1 1 . ...... .... 1001 . 0 . 1 .... @2reg_shrn_d
+VQSHRN_U32_2sh 1111 001 1 1 . ...... .... 1001 . 0 . 1 .... @2reg_shrn_s
+VQSHRN_U16_2sh 1111 001 1 1 . ...... .... 1001 . 0 . 1 .... @2reg_shrn_h
+
+# VQRSHRN with unsigned input
+VQRSHRN_U64_2sh 1111 001 1 1 . ...... .... 1001 . 1 . 1 .... @2reg_shrn_d
+VQRSHRN_U32_2sh 1111 001 1 1 . ...... .... 1001 . 1 . 1 .... @2reg_shrn_s
+VQRSHRN_U16_2sh 1111 001 1 1 . ...... .... 1001 . 1 . 1 .... @2reg_shrn_h
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 18ea7255e38..9a75a69a4f5 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -1672,3 +1672,18 @@ DO_2SN_32(VQSHRUN_16, gen_helper_neon_shl_s16, gen_helper_neon_unarrow_sat8)
DO_2SN_64(VQRSHRUN_64, gen_helper_neon_rshl_s64, gen_helper_neon_unarrow_sat32)
DO_2SN_32(VQRSHRUN_32, gen_helper_neon_rshl_s32, gen_helper_neon_unarrow_sat16)
DO_2SN_32(VQRSHRUN_16, gen_helper_neon_rshl_s16, gen_helper_neon_unarrow_sat8)
+DO_2SN_64(VQSHRN_S64, gen_sshl_i64, gen_helper_neon_narrow_sat_s32)
+DO_2SN_32(VQSHRN_S32, gen_sshl_i32, gen_helper_neon_narrow_sat_s16)
+DO_2SN_32(VQSHRN_S16, gen_helper_neon_shl_s16, gen_helper_neon_narrow_sat_s8)
+
+DO_2SN_64(VQRSHRN_S64, gen_helper_neon_rshl_s64, gen_helper_neon_narrow_sat_s32)
+DO_2SN_32(VQRSHRN_S32, gen_helper_neon_rshl_s32, gen_helper_neon_narrow_sat_s16)
+DO_2SN_32(VQRSHRN_S16, gen_helper_neon_rshl_s16, gen_helper_neon_narrow_sat_s8)
+
+DO_2SN_64(VQSHRN_U64, gen_ushl_i64, gen_helper_neon_narrow_sat_u32)
+DO_2SN_32(VQSHRN_U32, gen_ushl_i32, gen_helper_neon_narrow_sat_u16)
+DO_2SN_32(VQSHRN_U16, gen_helper_neon_shl_u16, gen_helper_neon_narrow_sat_u8)
+
+DO_2SN_64(VQRSHRN_U64, gen_helper_neon_rshl_u64, gen_helper_neon_narrow_sat_u32)
+DO_2SN_32(VQRSHRN_U32, gen_helper_neon_rshl_u32, gen_helper_neon_narrow_sat_u16)
+DO_2SN_32(VQRSHRN_U16, gen_helper_neon_rshl_u16, gen_helper_neon_narrow_sat_u8)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index f884db535b4..f728231b198 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -3201,40 +3201,6 @@ static inline void gen_neon_unarrow_sats(int size, TCGv_i32 dest, TCGv_i64 src)
}
}
-static inline void gen_neon_shift_narrow(int size, TCGv_i32 var, TCGv_i32 shift,
- int q, int u)
-{
- if (q) {
- if (u) {
- switch (size) {
- case 1: gen_helper_neon_rshl_u16(var, var, shift); break;
- case 2: gen_helper_neon_rshl_u32(var, var, shift); break;
- default: abort();
- }
- } else {
- switch (size) {
- case 1: gen_helper_neon_rshl_s16(var, var, shift); break;
- case 2: gen_helper_neon_rshl_s32(var, var, shift); break;
- default: abort();
- }
- }
- } else {
- if (u) {
- switch (size) {
- case 1: gen_helper_neon_shl_u16(var, var, shift); break;
- case 2: gen_ushl_i32(var, var, shift); break;
- default: abort();
- }
- } else {
- switch (size) {
- case 1: gen_helper_neon_shl_s16(var, var, shift); break;
- case 2: gen_sshl_i32(var, var, shift); break;
- default: abort();
- }
- }
- }
-}
-
static inline void gen_neon_widen(TCGv_i64 dest, TCGv_i32 src, int size, int u)
{
if (u) {
@@ -5281,6 +5247,7 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
case 6: /* VQSHLU */
case 7: /* VQSHL */
case 8: /* VSHRN, VRSHRN, VQSHRUN, VQRSHRUN */
+ case 9: /* VQSHRN, VQRSHRN */
return 1; /* handled by decodetree */
default:
break;
@@ -5298,80 +5265,7 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
size--;
}
shift = (insn >> 16) & ((1 << (3 + size)) - 1);
- if (op < 10) {
- /* Shift by immediate and narrow:
- VSHRN, VRSHRN, VQSHRN, VQRSHRN. */
- int input_unsigned = (op == 8) ? !u : u;
- if (rm & 1) {
- return 1;
- }
- shift = shift - (1 << (size + 3));
- size++;
- if (size == 3) {
- tmp64 = tcg_const_i64(shift);
- neon_load_reg64(cpu_V0, rm);
- neon_load_reg64(cpu_V1, rm + 1);
- for (pass = 0; pass < 2; pass++) {
- TCGv_i64 in;
- if (pass == 0) {
- in = cpu_V0;
- } else {
- in = cpu_V1;
- }
- if (q) {
- if (input_unsigned) {
- gen_helper_neon_rshl_u64(cpu_V0, in, tmp64);
- } else {
- gen_helper_neon_rshl_s64(cpu_V0, in, tmp64);
- }
- } else {
- if (input_unsigned) {
- gen_ushl_i64(cpu_V0, in, tmp64);
- } else {
- gen_sshl_i64(cpu_V0, in, tmp64);
- }
- }
- tmp = tcg_temp_new_i32();
- gen_neon_narrow_op(op == 8, u, size - 1, tmp, cpu_V0);
- neon_store_reg(rd, pass, tmp);
- } /* for pass */
- tcg_temp_free_i64(tmp64);
- } else {
- if (size == 1) {
- imm = (uint16_t)shift;
- imm |= imm << 16;
- } else {
- /* size == 2 */
- imm = (uint32_t)shift;
- }
- tmp2 = tcg_const_i32(imm);
- tmp4 = neon_load_reg(rm + 1, 0);
- tmp5 = neon_load_reg(rm + 1, 1);
- for (pass = 0; pass < 2; pass++) {
- if (pass == 0) {
- tmp = neon_load_reg(rm, 0);
- } else {
- tmp = tmp4;
- }
- gen_neon_shift_narrow(size, tmp, tmp2, q,
- input_unsigned);
- if (pass == 0) {
- tmp3 = neon_load_reg(rm, 1);
- } else {
- tmp3 = tmp5;
- }
- gen_neon_shift_narrow(size, tmp3, tmp2, q,
- input_unsigned);
- tcg_gen_concat_i32_i64(cpu_V0, tmp, tmp3);
- tcg_temp_free_i32(tmp);
- tcg_temp_free_i32(tmp3);
- tmp = tcg_temp_new_i32();
- gen_neon_narrow_op(op == 8, u, size - 1, tmp, cpu_V0);
- neon_store_reg(rd, pass, tmp);
- } /* for pass */
- tcg_temp_free_i32(tmp2);
- }
- } else if (op == 10) {
+ if (op == 10) {
/* VSHLL, VMOVL */
if (q || (rd & 1)) {
return 1;
--
2.20.1
* [PATCH v2 7/9] target/arm: Convert Neon VSHLL, VMOVL to decodetree
2020-05-22 14:55 [PATCH v2 0/9] target/arm: Convert 2-reg-shift and 1-reg-imm Neon insns to decodetree Peter Maydell
` (5 preceding siblings ...)
2020-05-22 14:55 ` [PATCH v2 6/9] target/arm: Convert Neon narrowing shifts with op==9 " Peter Maydell
@ 2020-05-22 14:55 ` Peter Maydell
2020-05-22 14:55 ` [PATCH v2 8/9] target/arm: Convert VCVT fixed-point ops " Peter Maydell
` (2 subsequent siblings)
9 siblings, 0 replies; 17+ messages in thread
From: Peter Maydell @ 2020-05-22 14:55 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the VSHLL and VMOVL insns from the 2-reg-shift group
to decodetree. Since the loop always has two passes, we unroll
it to avoid the awkward reassignment of one TCGv to another.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/neon-dp.decode | 16 +++++++
target/arm/translate-neon.inc.c | 81 +++++++++++++++++++++++++++++++++
target/arm/translate.c | 46 +------------------
3 files changed, 99 insertions(+), 44 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index 43db393cf76..9dd13d13254 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -241,6 +241,14 @@ VMINNM_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
&2reg_shift vm=%vm_dp vd=%vd_dp size=1 q=0 \
shift=%neon_rshift_i3
+# Long left shifts: again Q is part of opcode decode
+@2reg_shll_s .... ... . . . 1 shift:5 .... .... 0 . . . .... \
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=2 q=0
+@2reg_shll_h .... ... . . . 01 shift:4 .... .... 0 . . . .... \
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=1 q=0
+@2reg_shll_b .... ... . . . 001 shift:3 .... .... 0 . . . .... \
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=0 q=0
+
VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_d
VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_s
VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_h
@@ -346,3 +354,11 @@ VQSHRN_U16_2sh 1111 001 1 1 . ...... .... 1001 . 0 . 1 .... @2reg_shrn_h
VQRSHRN_U64_2sh 1111 001 1 1 . ...... .... 1001 . 1 . 1 .... @2reg_shrn_d
VQRSHRN_U32_2sh 1111 001 1 1 . ...... .... 1001 . 1 . 1 .... @2reg_shrn_s
VQRSHRN_U16_2sh 1111 001 1 1 . ...... .... 1001 . 1 . 1 .... @2reg_shrn_h
+
+VSHLL_S_2sh 1111 001 0 1 . ...... .... 1010 . 0 . 1 .... @2reg_shll_s
+VSHLL_S_2sh 1111 001 0 1 . ...... .... 1010 . 0 . 1 .... @2reg_shll_h
+VSHLL_S_2sh 1111 001 0 1 . ...... .... 1010 . 0 . 1 .... @2reg_shll_b
+
+VSHLL_U_2sh 1111 001 1 1 . ...... .... 1010 . 0 . 1 .... @2reg_shll_s
+VSHLL_U_2sh 1111 001 1 1 . ...... .... 1010 . 0 . 1 .... @2reg_shll_h
+VSHLL_U_2sh 1111 001 1 1 . ...... .... 1010 . 0 . 1 .... @2reg_shll_b
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 9a75a69a4f5..5678bfd0d4d 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -1687,3 +1687,84 @@ DO_2SN_32(VQSHRN_U16, gen_helper_neon_shl_u16, gen_helper_neon_narrow_sat_u8)
DO_2SN_64(VQRSHRN_U64, gen_helper_neon_rshl_u64, gen_helper_neon_narrow_sat_u32)
DO_2SN_32(VQRSHRN_U32, gen_helper_neon_rshl_u32, gen_helper_neon_narrow_sat_u16)
DO_2SN_32(VQRSHRN_U16, gen_helper_neon_rshl_u16, gen_helper_neon_narrow_sat_u8)
+
+static bool do_vshll_2sh(DisasContext *s, arg_2reg_shift *a,
+ NeonGenWidenFn *widenfn, bool u)
+{
+ TCGv_i64 tmp;
+ TCGv_i32 rm0, rm1;
+ uint64_t widen_mask = 0;
+
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+ return false;
+ }
+
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
+ ((a->vd | a->vm) & 0x10)) {
+ return false;
+ }
+
+ if (a->vd & 1) {
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ /*
+ * This is a widen-and-shift operation. The shift is always less
+ * than the width of the source type, so after widening the input
+ * vector we can simply shift the whole 64-bit widened register,
+ * and then clear the potential overflow bits resulting from left
+ * bits of the narrow input appearing as right bits of the left
+ * neighbour narrow input. Calculate a mask of bits to clear.
+ */
+ if ((a->shift != 0) && (a->size < 2 || u)) {
+ int esize = 8 << a->size;
+ widen_mask = MAKE_64BIT_MASK(0, esize);
+ widen_mask >>= esize - a->shift;
+ widen_mask = dup_const(a->size + 1, widen_mask);
+ }
+
+ rm0 = neon_load_reg(a->vm, 0);
+ rm1 = neon_load_reg(a->vm, 1);
+ tmp = tcg_temp_new_i64();
+
+ widenfn(tmp, rm0);
+ if (a->shift != 0) {
+ tcg_gen_shli_i64(tmp, tmp, a->shift);
+ tcg_gen_andi_i64(tmp, tmp, ~widen_mask);
+ }
+ neon_store_reg64(tmp, a->vd);
+
+ widenfn(tmp, rm1);
+ if (a->shift != 0) {
+ tcg_gen_shli_i64(tmp, tmp, a->shift);
+ tcg_gen_andi_i64(tmp, tmp, ~widen_mask);
+ }
+ neon_store_reg64(tmp, a->vd + 1);
+ tcg_temp_free_i64(tmp);
+ return true;
+}
+
+static bool trans_VSHLL_S_2sh(DisasContext *s, arg_2reg_shift *a)
+{
+ NeonGenWidenFn *widenfn[] = {
+ gen_helper_neon_widen_s8,
+ gen_helper_neon_widen_s16,
+ tcg_gen_ext_i32_i64,
+ };
+ return do_vshll_2sh(s, a, widenfn[a->size], false);
+}
+
+static bool trans_VSHLL_U_2sh(DisasContext *s, arg_2reg_shift *a)
+{
+ NeonGenWidenFn *widenfn[] = {
+ gen_helper_neon_widen_u8,
+ gen_helper_neon_widen_u16,
+ tcg_gen_extu_i32_i64,
+ };
+ return do_vshll_2sh(s, a, widenfn[a->size], true);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index f728231b198..ef39c89f10a 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -5248,6 +5248,7 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
case 7: /* VQSHL */
case 8: /* VSHRN, VRSHRN, VQSHRUN, VQRSHRUN */
case 9: /* VQSHRN, VQRSHRN */
+ case 10: /* VSHLL, including VMOVL */
return 1; /* handled by decodetree */
default:
break;
@@ -5265,50 +5266,7 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
size--;
}
shift = (insn >> 16) & ((1 << (3 + size)) - 1);
- if (op == 10) {
- /* VSHLL, VMOVL */
- if (q || (rd & 1)) {
- return 1;
- }
- tmp = neon_load_reg(rm, 0);
- tmp2 = neon_load_reg(rm, 1);
- for (pass = 0; pass < 2; pass++) {
- if (pass == 1)
- tmp = tmp2;
-
- gen_neon_widen(cpu_V0, tmp, size, u);
-
- if (shift != 0) {
- /* The shift is less than the width of the source
- type, so we can just shift the whole register. */
- tcg_gen_shli_i64(cpu_V0, cpu_V0, shift);
- /* Widen the result of shift: we need to clear
- * the potential overflow bits resulting from
- * left bits of the narrow input appearing as
- * right bits of left the neighbour narrow
- * input. */
- if (size < 2 || !u) {
- uint64_t imm64;
- if (size == 0) {
- imm = (0xffu >> (8 - shift));
- imm |= imm << 16;
- } else if (size == 1) {
- imm = 0xffff >> (16 - shift);
- } else {
- /* size == 2 */
- imm = 0xffffffff >> (32 - shift);
- }
- if (size < 2) {
- imm64 = imm | (((uint64_t)imm) << 32);
- } else {
- imm64 = imm;
- }
- tcg_gen_andi_i64(cpu_V0, cpu_V0, ~imm64);
- }
- }
- neon_store_reg64(cpu_V0, rd + pass);
- }
- } else if (op >= 14) {
+ if (op >= 14) {
/* VCVT fixed-point. */
TCGv_ptr fpst;
TCGv_i32 shiftv;
--
2.20.1
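The comment in do_vshll_2sh() above describes shifting the whole widened 64-bit register and then clearing the bits that spilled in from the neighbouring lane. A minimal standalone sketch of that mask calculation follows, assuming 8-bit or 16-bit source elements. replicate() is a hypothetical stand-in for QEMU's dup_const(), and the other function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for dup_const(): repeat the low `bits` bits across 64 bits. */
static uint64_t replicate(unsigned bits, uint64_t val)
{
    uint64_t out = 0;
    if (bits < 64) {
        val &= (1ULL << bits) - 1;
    }
    for (unsigned i = 0; i < 64; i += bits) {
        out |= val << i;
    }
    return out;
}

/*
 * Bits to clear after the whole-register shift. esize is the source
 * element width in bits; widened lanes are 2 * esize wide.
 */
static uint64_t vshll_clear_mask(unsigned esize, unsigned shift)
{
    assert((esize == 8 || esize == 16) && shift > 0 && shift < esize);
    uint64_t low = (1ULL << esize) - 1;  /* esize low bits set */
    low >>= esize - shift;               /* keep only `shift` low bits */
    return replicate(2 * esize, low);    /* one copy per widened lane */
}

/* Shift every widened lane left by `shift` using one 64-bit shift. */
static uint64_t vshll_lanes(uint64_t widened, unsigned esize, unsigned shift)
{
    return (widened << shift) & ~vshll_clear_mask(esize, shift);
}
```

For example, sign-widened bytes 0x01 and 0xff occupy the low two 16-bit lanes as 0x00000000ffff0001. Shifting by 3 leaks 0x0007 into the third lane, and the mask removes it, leaving 0x00000000fff80008.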
* [PATCH v2 8/9] target/arm: Convert VCVT fixed-point ops to decodetree
2020-05-22 14:55 [PATCH v2 0/9] target/arm: Convert 2-reg-shift and 1-reg-imm Neon insns to decodetree Peter Maydell
` (6 preceding siblings ...)
2020-05-22 14:55 ` [PATCH v2 7/9] target/arm: Convert Neon VSHLL, VMOVL " Peter Maydell
@ 2020-05-22 14:55 ` Peter Maydell
2020-05-22 14:55 ` [PATCH v2 9/9] target/arm: Convert Neon one-register-and-immediate insns " Peter Maydell
2020-05-22 19:19 ` [PATCH v2 0/9] target/arm: Convert 2-reg-shift and 1-reg-imm Neon " no-reply
9 siblings, 0 replies; 17+ messages in thread
From: Peter Maydell @ 2020-05-22 14:55 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the VCVT fixed-point conversion operations in the
Neon 2-regs-and-shift group to decodetree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/neon-dp.decode | 11 +++++
target/arm/translate-neon.inc.c | 49 +++++++++++++++++++++
target/arm/translate.c | 75 +--------------------------------
3 files changed, 62 insertions(+), 73 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index 9dd13d13254..e217d51670d 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -249,6 +249,10 @@ VMINNM_fp_3s 1111 001 1 0 . 1 . .... .... 1111 ... 1 .... @3same_fp
@2reg_shll_b .... ... . . . 001 shift:3 .... .... 0 . . . .... \
&2reg_shift vm=%vm_dp vd=%vd_dp size=0 q=0
+# We use size=0 for fp32 and size=1 for fp16 to match the 3-same encodings.
+@2reg_vcvt .... ... . . . 1 ..... .... .... . q:1 . . .... \
+ &2reg_shift vm=%vm_dp vd=%vd_dp size=0 shift=%neon_rshift_i5
+
VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_d
VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_s
VSHR_S_2sh 1111 001 0 1 . ...... .... 0000 . . . 1 .... @2reg_shr_h
@@ -362,3 +366,10 @@ VSHLL_S_2sh 1111 001 0 1 . ...... .... 1010 . 0 . 1 .... @2reg_shll_b
VSHLL_U_2sh 1111 001 1 1 . ...... .... 1010 . 0 . 1 .... @2reg_shll_s
VSHLL_U_2sh 1111 001 1 1 . ...... .... 1010 . 0 . 1 .... @2reg_shll_h
VSHLL_U_2sh 1111 001 1 1 . ...... .... 1010 . 0 . 1 .... @2reg_shll_b
+
+# VCVT fixed<->float conversions
+# TODO: FP16 fixed<->float conversions are opc==0b1100 and 0b1101
+VCVT_SF_2sh 1111 001 0 1 . ...... .... 1110 0 . . 1 .... @2reg_vcvt
+VCVT_UF_2sh 1111 001 1 1 . ...... .... 1110 0 . . 1 .... @2reg_vcvt
+VCVT_FS_2sh 1111 001 0 1 . ...... .... 1111 0 . . 1 .... @2reg_vcvt
+VCVT_FU_2sh 1111 001 1 1 . ...... .... 1111 0 . . 1 .... @2reg_vcvt
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 5678bfd0d4d..8d1c58eddc2 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -1768,3 +1768,52 @@ static bool trans_VSHLL_U_2sh(DisasContext *s, arg_2reg_shift *a)
};
return do_vshll_2sh(s, a, widenfn[a->size], true);
}
+
+static bool do_fp_2sh(DisasContext *s, arg_2reg_shift *a,
+ NeonGenTwoSingleOPFn *fn)
+{
+ /* FP operations in 2-reg-and-shift group */
+ TCGv_i32 tmp, shiftv;
+ TCGv_ptr fpstatus;
+ int pass;
+
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+ return false;
+ }
+
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (!dc_isar_feature(aa32_simd_r32, s) &&
+ ((a->vd | a->vm) & 0x10)) {
+ return false;
+ }
+
+ if ((a->vm | a->vd) & a->q) {
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ fpstatus = get_fpstatus_ptr(1);
+ shiftv = tcg_const_i32(a->shift);
+ for (pass = 0; pass < (a->q ? 4 : 2); pass++) {
+ tmp = neon_load_reg(a->vm, pass);
+ fn(tmp, tmp, shiftv, fpstatus);
+ neon_store_reg(a->vd, pass, tmp);
+ }
+ tcg_temp_free_ptr(fpstatus);
+ tcg_temp_free_i32(shiftv);
+ return true;
+}
+
+#define DO_FP_2SH(INSN, FUNC) \
+ static bool trans_##INSN##_2sh(DisasContext *s, arg_2reg_shift *a) \
+ { \
+ return do_fp_2sh(s, a, FUNC); \
+ }
+
+DO_FP_2SH(VCVT_SF, gen_helper_vfp_sltos)
+DO_FP_2SH(VCVT_UF, gen_helper_vfp_ultos)
+DO_FP_2SH(VCVT_FS, gen_helper_vfp_tosls_round_to_zero)
+DO_FP_2SH(VCVT_FU, gen_helper_vfp_touls_round_to_zero)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index ef39c89f10a..9cc44e6258e 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -5193,7 +5193,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
int q;
int rd, rn, rm, rd_ofs, rn_ofs, rm_ofs;
int size;
- int shift;
int pass;
int u;
int vec_size;
@@ -5234,78 +5233,8 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
return 1;
} else if (insn & (1 << 4)) {
if ((insn & 0x00380080) != 0) {
- /* Two registers and shift. */
- op = (insn >> 8) & 0xf;
-
- switch (op) {
- case 0: /* VSHR */
- case 1: /* VSRA */
- case 2: /* VRSHR */
- case 3: /* VRSRA */
- case 4: /* VSRI */
- case 5: /* VSHL, VSLI */
- case 6: /* VQSHLU */
- case 7: /* VQSHL */
- case 8: /* VSHRN, VRSHRN, VQSHRUN, VQRSHRUN */
- case 9: /* VQSHRN, VQRSHRN */
- case 10: /* VSHLL, including VMOVL */
- return 1; /* handled by decodetree */
- default:
- break;
- }
-
- if (insn & (1 << 7)) {
- /* 64-bit shift. */
- if (op > 7) {
- return 1;
- }
- size = 3;
- } else {
- size = 2;
- while ((insn & (1 << (size + 19))) == 0)
- size--;
- }
- shift = (insn >> 16) & ((1 << (3 + size)) - 1);
- if (op >= 14) {
- /* VCVT fixed-point. */
- TCGv_ptr fpst;
- TCGv_i32 shiftv;
- VFPGenFixPointFn *fn;
-
- if (!(insn & (1 << 21)) || (q && ((rd | rm) & 1))) {
- return 1;
- }
-
- if (!(op & 1)) {
- if (u) {
- fn = gen_helper_vfp_ultos;
- } else {
- fn = gen_helper_vfp_sltos;
- }
- } else {
- if (u) {
- fn = gen_helper_vfp_touls_round_to_zero;
- } else {
- fn = gen_helper_vfp_tosls_round_to_zero;
- }
- }
-
- /* We have already masked out the must-be-1 top bit of imm6,
- * hence this 32-shift where the ARM ARM has 64-imm6.
- */
- shift = 32 - shift;
- fpst = get_fpstatus_ptr(1);
- shiftv = tcg_const_i32(shift);
- for (pass = 0; pass < (q ? 4 : 2); pass++) {
- TCGv_i32 tmpf = neon_load_reg(rm, pass);
- fn(tmpf, tmpf, shiftv, fpst);
- neon_store_reg(rd, pass, tmpf);
- }
- tcg_temp_free_ptr(fpst);
- tcg_temp_free_i32(shiftv);
- } else {
- return 1;
- }
+ /* Two registers and shift: handled by decodetree */
+ return 1;
} else { /* (insn & 0x00380080) == 0 */
int invert, reg_ofs, vec_size;
--
2.20.1
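The removed translate.c code above notes that it computes 32 - shift where the ARM ARM has 64 - imm6, because the must-be-1 top bit of imm6 has already been masked out. A small sketch of why the two forms agree follows. Both function names are hypothetical, and the precondition encodes the bit-21 check from the removed code.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural form: fractional bits = 64 - imm6, with imm6<5> == 1. */
static int vcvt_fbits_from_imm6(uint32_t imm6)
{
    assert(imm6 >= 32 && imm6 < 64);
    return 64 - (int)imm6;
}

/* Form used after masking imm6 down to its low five bits. */
static int vcvt_fbits_masked(uint32_t imm6)
{
    assert(imm6 >= 32 && imm6 < 64);
    return 32 - (int)(imm6 & 0x1f);
}
```

Since imm6 == 32 + (imm6 & 0x1f) whenever the top bit is set, both functions return the same value for every valid encoding, from 32 (imm6 == 32) down to 1 (imm6 == 63).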
* [PATCH v2 9/9] target/arm: Convert Neon one-register-and-immediate insns to decodetree
2020-05-22 14:55 [PATCH v2 0/9] target/arm: Convert 2-reg-shift and 1-reg-imm Neon insns to decodetree Peter Maydell
` (7 preceding siblings ...)
2020-05-22 14:55 ` [PATCH v2 8/9] target/arm: Convert VCVT fixed-point ops " Peter Maydell
@ 2020-05-22 14:55 ` Peter Maydell
2020-06-01 23:32 ` Richard Henderson
2020-05-22 19:19 ` [PATCH v2 0/9] target/arm: Convert 2-reg-shift and 1-reg-imm Neon " no-reply
9 siblings, 1 reply; 17+ messages in thread
From: Peter Maydell @ 2020-05-22 14:55 UTC (permalink / raw)
To: qemu-arm, qemu-devel; +Cc: Richard Henderson
Convert the insns in the one-register-and-immediate group to decodetree.
In the new decode, our asimd_imm_const() function returns a 64-bit value
rather than a 32-bit one, which means we don't need to treat cmode=14 op=1
as a special case in the decoder (it is the only encoding where the two
halves of the 64-bit value are different).
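To illustrate the cmode=14 case, here is a minimal standalone sketch of the expansion for both values of op, following the logic of asimd_imm_const() in this patch. The name cmode14_const() is hypothetical. With op=0 the byte is replicated, so both 32-bit halves match. With op=1 each bit of the 8-bit immediate selects one byte of the result, so the halves can differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical standalone model of the cmode == 14 expansion. */
static uint64_t cmode14_const(uint8_t imm8, int op)
{
    assert(op == 0 || op == 1);

    if (op) {
        /* Bit n of imm8 set means byte n of the result is 0xff. */
        uint64_t imm64 = 0;
        for (int n = 0; n < 8; n++) {
            if (imm8 & (1u << n)) {
                imm64 |= 0xffULL << (n * 8);
            }
        }
        return imm64;
    }

    /* op == 0: replicate the byte into a word, then into both halves. */
    uint32_t word = imm8 * 0x01010101u;
    return ((uint64_t)word << 32) | word;
}
```

For imm8 = 0xa5 this gives 0xa5a5a5a5a5a5a5a5 with op=0 and 0xff00ff0000ff00ff with op=1. The second value cannot be represented as a duplicated 32-bit constant.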
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/neon-dp.decode | 22 ++++++
target/arm/translate-neon.inc.c | 118 ++++++++++++++++++++++++++++++++
target/arm/translate.c | 101 +--------------------------
3 files changed, 142 insertions(+), 99 deletions(-)
diff --git a/target/arm/neon-dp.decode b/target/arm/neon-dp.decode
index e217d51670d..1643d84e9c2 100644
--- a/target/arm/neon-dp.decode
+++ b/target/arm/neon-dp.decode
@@ -373,3 +373,25 @@ VCVT_SF_2sh 1111 001 0 1 . ...... .... 1110 0 . . 1 .... @2reg_vcvt
VCVT_UF_2sh 1111 001 1 1 . ...... .... 1110 0 . . 1 .... @2reg_vcvt
VCVT_FS_2sh 1111 001 0 1 . ...... .... 1111 0 . . 1 .... @2reg_vcvt
VCVT_FU_2sh 1111 001 1 1 . ...... .... 1111 0 . . 1 .... @2reg_vcvt
+
+######################################################################
+# 1-reg-and-modified-immediate grouping:
+# 1111 001 i 1 D 000 imm:3 Vd:4 cmode:4 0 Q op 1 Vm:4
+######################################################################
+
+&1reg_imm vd q imm cmode op
+
+%asimd_imm_value 24:1 16:3 0:4
+
+@1reg_imm .... ... . . . ... ... .... .... . q:1 . . .... \
+ &1reg_imm imm=%asimd_imm_value vd=%vd_dp
+
+# The cmode/op bits here decode VORR/VBIC/VMOV/VMVN, but
+# not in a way we can conveniently represent in decodetree without
+# a lot of repetition:
+# VORR: op=0, (cmode & 1) && cmode < 12
+# VBIC: op=1, (cmode & 1) && cmode < 12
+# VMOV: everything else
+# So we have a single decode line and check the cmode/op in the
+# trans function.
+Vimm_1r 1111 001 . 1 . 000 ... .... cmode:4 0 . op:1 1 .... @1reg_imm
diff --git a/target/arm/translate-neon.inc.c b/target/arm/translate-neon.inc.c
index 8d1c58eddc2..39c7e70373a 100644
--- a/target/arm/translate-neon.inc.c
+++ b/target/arm/translate-neon.inc.c
@@ -1817,3 +1817,121 @@ DO_FP_2SH(VCVT_SF, gen_helper_vfp_sltos)
DO_FP_2SH(VCVT_UF, gen_helper_vfp_ultos)
DO_FP_2SH(VCVT_FS, gen_helper_vfp_tosls_round_to_zero)
DO_FP_2SH(VCVT_FU, gen_helper_vfp_touls_round_to_zero)
+
+static uint64_t asimd_imm_const(uint32_t imm, int cmode, int op)
+{
+ /*
+ * Expand the encoded constant.
+ * Note that cmode = 2,3,4,5,6,7,10,11,12,13 imm=0 is UNPREDICTABLE.
+ * We choose to not special-case this and will behave as if a
+ * valid constant encoding of 0 had been given.
+ * cmode = 15 op = 1 must UNDEF; we assume decode has handled that.
+ */
+ switch (cmode) {
+ case 0: case 1:
+ /* no-op */
+ break;
+ case 2: case 3:
+ imm <<= 8;
+ break;
+ case 4: case 5:
+ imm <<= 16;
+ break;
+ case 6: case 7:
+ imm <<= 24;
+ break;
+ case 8: case 9:
+ imm |= imm << 16;
+ break;
+ case 10: case 11:
+ imm = (imm << 8) | (imm << 24);
+ break;
+ case 12:
+ imm = (imm << 8) | 0xff;
+ break;
+ case 13:
+ imm = (imm << 16) | 0xffff;
+ break;
+ case 14:
+ if (op) {
+ /*
+ * This is the only case where the top and bottom 32 bits
+ * of the encoded constant differ.
+ */
+ uint64_t imm64 = 0;
+ int n;
+
+ for (n = 0; n < 8; n++) {
+ if (imm & (1 << n)) {
+ imm64 |= (0xffULL << (n * 8));
+ }
+ }
+ return imm64;
+ }
+ imm |= (imm << 8) | (imm << 16) | (imm << 24);
+ break;
+ case 15:
+ imm = ((imm & 0x80) << 24) | ((imm & 0x3f) << 19)
+ | ((imm & 0x40) ? (0x1f << 25) : (1 << 30));
+ break;
+ }
+ if (op) {
+ imm = ~imm;
+ }
+ return dup_const(MO_32, imm);
+}
+
+static bool do_1reg_imm(DisasContext *s, arg_1reg_imm *a,
+ GVecGen2iFn *fn)
+{
+ uint64_t imm;
+ int reg_ofs, vec_size;
+
+ if (!arm_dc_feature(s, ARM_FEATURE_NEON)) {
+ return false;
+ }
+
+ /* UNDEF accesses to D16-D31 if they don't exist. */
+ if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
+ return false;
+ }
+
+ if (a->vd & a->q) {
+ return false;
+ }
+
+ if (!vfp_access_check(s)) {
+ return true;
+ }
+
+ reg_ofs = neon_reg_offset(a->vd, 0);
+ vec_size = a->q ? 16 : 8;
+ imm = asimd_imm_const(a->imm, a->cmode, a->op);
+
+ fn(MO_64, reg_ofs, reg_ofs, imm, vec_size, vec_size);
+ return true;
+}
+
+static void gen_VMOV_1r(unsigned vece, uint32_t dofs, uint32_t aofs,
+ int64_t c, uint32_t oprsz, uint32_t maxsz)
+{
+ tcg_gen_gvec_dup_imm(MO_64, dofs, oprsz, maxsz, c);
+}
+
+static bool trans_Vimm_1r(DisasContext *s, arg_1reg_imm *a)
+{
+ /* Handle decode of cmode/op here between VORR/VBIC/VMOV */
+ GVecGen2iFn *fn;
+
+ if ((a->cmode & 1) && a->cmode < 12) {
+ /* for op=1, the imm will be inverted, so BIC becomes AND. */
+ fn = a->op ? tcg_gen_gvec_andi : tcg_gen_gvec_ori;
+ } else {
+ /* There is one unallocated cmode/op combination in this space */
+ if (a->cmode == 15 && a->op == 1) {
+ return false;
+ }
+ fn = gen_VMOV_1r;
+ }
+ return do_1reg_imm(s, a, fn);
+}
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 9cc44e6258e..20d07e99053 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -5232,105 +5232,8 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
/* Three register same length: handled by decodetree */
return 1;
} else if (insn & (1 << 4)) {
- if ((insn & 0x00380080) != 0) {
- /* Two registers and shift: handled by decodetree */
- return 1;
- } else { /* (insn & 0x00380080) == 0 */
- int invert, reg_ofs, vec_size;
-
- if (q && (rd & 1)) {
- return 1;
- }
-
- op = (insn >> 8) & 0xf;
- /* One register and immediate. */
- imm = (u << 7) | ((insn >> 12) & 0x70) | (insn & 0xf);
- invert = (insn & (1 << 5)) != 0;
- /* Note that op = 2,3,4,5,6,7,10,11,12,13 imm=0 is UNPREDICTABLE.
- * We choose to not special-case this and will behave as if a
- * valid constant encoding of 0 had been given.
- */
- switch (op) {
- case 0: case 1:
- /* no-op */
- break;
- case 2: case 3:
- imm <<= 8;
- break;
- case 4: case 5:
- imm <<= 16;
- break;
- case 6: case 7:
- imm <<= 24;
- break;
- case 8: case 9:
- imm |= imm << 16;
- break;
- case 10: case 11:
- imm = (imm << 8) | (imm << 24);
- break;
- case 12:
- imm = (imm << 8) | 0xff;
- break;
- case 13:
- imm = (imm << 16) | 0xffff;
- break;
- case 14:
- imm |= (imm << 8) | (imm << 16) | (imm << 24);
- if (invert) {
- imm = ~imm;
- }
- break;
- case 15:
- if (invert) {
- return 1;
- }
- imm = ((imm & 0x80) << 24) | ((imm & 0x3f) << 19)
- | ((imm & 0x40) ? (0x1f << 25) : (1 << 30));
- break;
- }
- if (invert) {
- imm = ~imm;
- }
-
- reg_ofs = neon_reg_offset(rd, 0);
- vec_size = q ? 16 : 8;
-
- if (op & 1 && op < 12) {
- if (invert) {
- /* The immediate value has already been inverted,
- * so BIC becomes AND.
- */
- tcg_gen_gvec_andi(MO_32, reg_ofs, reg_ofs, imm,
- vec_size, vec_size);
- } else {
- tcg_gen_gvec_ori(MO_32, reg_ofs, reg_ofs, imm,
- vec_size, vec_size);
- }
- } else {
- /* VMOV, VMVN. */
- if (op == 14 && invert) {
- TCGv_i64 t64 = tcg_temp_new_i64();
-
- for (pass = 0; pass <= q; ++pass) {
- uint64_t val = 0;
- int n;
-
- for (n = 0; n < 8; n++) {
- if (imm & (1 << (n + pass * 8))) {
- val |= 0xffull << (n * 8);
- }
- }
- tcg_gen_movi_i64(t64, val);
- neon_store_reg64(t64, rd + pass);
- }
- tcg_temp_free_i64(t64);
- } else {
- tcg_gen_gvec_dup_imm(MO_32, reg_ofs, vec_size,
- vec_size, imm);
- }
- }
- }
+ /* Two registers and shift or reg and imm: handled by decodetree */
+ return 1;
} else { /* (insn & 0x00800010 == 0x00800000) */
if (size != 3) {
op = (insn >> 8) & 0xf;
--
2.20.1
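
For readers following the cmode/op handling in trans_Vimm_1r() above, the
selection between VORR, VBIC and VMOV/VMVN can be sketched as a stand-alone
classifier. This is only an illustration: the enum and the function name
classify_1reg_imm() are hypothetical and do not appear in the patch, which
selects a GVecGen2iFn pointer instead of returning a tag.

```c
#include <assert.h>

/* Hypothetical tags for illustration; the patch picks a GVecGen2iFn instead. */
enum imm1r_kind {
    IMM1R_ORR,   /* VORR: OR the expanded constant into Vd */
    IMM1R_AND,   /* VBIC: constant is inverted by op=1, so BIC becomes AND */
    IMM1R_MOV,   /* VMOV/VMVN: write the expanded constant to Vd */
    IMM1R_UNDEF  /* the one unallocated cmode/op combination */
};

static enum imm1r_kind classify_1reg_imm(int cmode, int op)
{
    assert(cmode >= 0 && cmode <= 15 && (op == 0 || op == 1));

    /* Odd cmode below 12 selects the read-modify-write forms. */
    if ((cmode & 1) && cmode < 12) {
        return op ? IMM1R_AND : IMM1R_ORR;
    }
    /* cmode=15 op=1 must UNDEF. */
    if (cmode == 15 && op == 1) {
        return IMM1R_UNDEF;
    }
    /* Everything else is a plain move of the expanded constant. */
    return IMM1R_MOV;
}
```

For example, classify_1reg_imm(14, 1) yields IMM1R_MOV: because the expanded
constant is already 64 bits wide, the byte-mask encoding needs no separate
code path at this level.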
* Re: [PATCH v2 9/9] target/arm: Convert Neon one-register-and-immediate insns to decodetree
2020-05-22 14:55 ` [PATCH v2 9/9] target/arm: Convert Neon one-register-and-immediate insns " Peter Maydell
@ 2020-06-01 23:32 ` Richard Henderson
2020-06-02 8:58 ` Peter Maydell
0 siblings, 1 reply; 17+ messages in thread
From: Richard Henderson @ 2020-06-01 23:32 UTC (permalink / raw)
To: Peter Maydell, qemu-arm, qemu-devel
On 5/22/20 7:55 AM, Peter Maydell wrote:
> Convert the insns in the one-register-and-immediate group to decodetree.
>
> In the new decode, our asimd_imm_const() function returns a 64-bit value
> rather than a 32-bit one, which means we don't need to treat cmode=14 op=1
> as a special case in the decoder (it is the only encoding where the two
> halves of the 64-bit value are different).
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> target/arm/neon-dp.decode | 22 ++++++
> target/arm/translate-neon.inc.c | 118 ++++++++++++++++++++++++++++++++
> target/arm/translate.c | 101 +--------------------------
> 3 files changed, 142 insertions(+), 99 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
because this is a faithful transliteration of the existing code, but...
> + switch (cmode) {
> + case 0: case 1:
> + /* no-op */
> + break;
> + case 2: case 3:
> + imm <<= 8;
> + break;
> + case 4: case 5:
> + imm <<= 16;
> + break;
> + case 6: case 7:
> + imm <<= 24;
> + break;
> + case 8: case 9:
> + imm |= imm << 16;
> + break;
> + case 10: case 11:
> + imm = (imm << 8) | (imm << 24);
> + break;
It might be clearer to use dup_const for each case, which would more closely
match the pseudocode. E.g. here,
return dup_const(MO_16, imm << 8);
> + imm |= (imm << 8) | (imm << 16) | (imm << 24);
return dup_const(MO_8, imm);
Something to remember for a follow-up.
r~
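
A rough sketch of what the dup_const-per-case form suggested above might look
like, written as a stand-alone C file so it can be compiled outside QEMU. The
helper dup_sketch() is a local stand-in that is assumed to behave like QEMU's
dup_const() for 8/16/32-bit elements, and asimd_imm_const_sketch() is a
hypothetical name; neither is part of the posted patch. It assumes imm is the
8-bit encoded immediate and that decode has already rejected cmode=15 op=1.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for dup_const(): replicate the low 'bits' bits of c across 64 bits. */
static uint64_t dup_sketch(unsigned bits, uint64_t c)
{
    switch (bits) {
    case 8:  return 0x0101010101010101ULL * (uint8_t)c;
    case 16: return 0x0001000100010001ULL * (uint16_t)c;
    case 32: return 0x0000000100000001ULL * (uint32_t)c;
    default: return c;
    }
}

static uint64_t asimd_imm_const_sketch(uint32_t imm, int cmode, int op)
{
    uint64_t v = 0;
    int n;

    assert(imm <= 0xff && cmode >= 0 && cmode <= 15);

    switch (cmode) {
    case 0: case 1:   v = dup_sketch(32, imm); break;
    case 2: case 3:   v = dup_sketch(32, imm << 8); break;
    case 4: case 5:   v = dup_sketch(32, imm << 16); break;
    case 6: case 7:   v = dup_sketch(32, imm << 24); break;
    case 8: case 9:   v = dup_sketch(16, imm); break;
    case 10: case 11: v = dup_sketch(16, imm << 8); break;
    case 12:          v = dup_sketch(32, (imm << 8) | 0xff); break;
    case 13:          v = dup_sketch(32, (imm << 16) | 0xffff); break;
    case 14:
        if (op) {
            /* Byte mask: each set bit of imm selects one 0xff byte. */
            for (n = 0; n < 8; n++) {
                if (imm & (1u << n)) {
                    v |= 0xffULL << (n * 8);
                }
            }
            return v;
        }
        v = dup_sketch(8, imm);
        break;
    case 15:
        /* 32-bit floating-point immediate expansion, as in the patch. */
        v = dup_sketch(32, ((imm & 0x80) << 24) | ((imm & 0x3f) << 19)
                           | ((imm & 0x40) ? (0x1fu << 25) : (1u << 30)));
        break;
    }
    /* Inverting the replicated value equals replicating the inverted one. */
    return op ? ~v : v;
}
```

With this shape each case names its element size directly, which is the
property that makes it read closer to the pseudocode; the results are intended
to match the shift-and-OR version in the patch for every 8-bit imm.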
* Re: [PATCH v2 9/9] target/arm: Convert Neon one-register-and-immediate insns to decodetree
2020-06-01 23:32 ` Richard Henderson
@ 2020-06-02 8:58 ` Peter Maydell
0 siblings, 0 replies; 17+ messages in thread
From: Peter Maydell @ 2020-06-02 8:58 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-arm, QEMU Developers
On Tue, 2 Jun 2020 at 00:32, Richard Henderson
<richard.henderson@linaro.org> wrote:
> It might be clearer to use dup_const for each case, which would more closely
> match the pseudocode. E.g. here,
>
> return dup_const(MO_16, imm << 8);
>
> > + imm |= (imm << 8) | (imm << 16) | (imm << 24);
>
> return dup_const(MO_8, imm);
Yeah, I did think about this, but figured that keeping the
existing code structure was clearer for purposes of reviewing
this refactoring series.
thanks
-- PMM
* Re: [PATCH v2 0/9] target/arm: Convert 2-reg-shift and 1-reg-imm Neon insns to decodetree
2020-05-22 14:55 [PATCH v2 0/9] target/arm: Convert 2-reg-shift and 1-reg-imm Neon insns to decodetree Peter Maydell
` (8 preceding siblings ...)
2020-05-22 14:55 ` [PATCH v2 9/9] target/arm: Convert Neon one-register-and-immediate insns " Peter Maydell
@ 2020-05-22 19:19 ` no-reply
9 siblings, 0 replies; 17+ messages in thread
From: no-reply @ 2020-05-22 19:19 UTC (permalink / raw)
To: peter.maydell; +Cc: qemu-arm, richard.henderson, qemu-devel
Patchew URL: https://patchew.org/QEMU/20200522145520.6778-1-peter.maydell@linaro.org/
Hi,
This series seems to have some coding style problems. See output below for
more information:
Message-id: 20200522145520.6778-1-peter.maydell@linaro.org
Subject: [PATCH v2 0/9] target/arm: Convert 2-reg-shift and 1-reg-imm Neon insns to decodetree
Type: series
=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===
Switched to a new branch 'test'
bbfdb6d target/arm: Convert Neon one-register-and-immediate insns to decodetree
e12ab4f target/arm: Convert VCVT fixed-point ops to decodetree
e32ffd0 target/arm: Convert Neon VSHLL, VMOVL to decodetree
09f9294 target/arm: Convert Neon narrowing shifts with op==9 to decodetree
7235981 target/arm: Convert Neon narrowing shifts with op==8 to decodetree
1df57d5 target/arm: Convert VQSHLU, VQSHL 2-reg-shift insns to decodetree
c2b6277 target/arm: Convert Neon VSRA, VSRI, VRSHR, VRSRA 2-reg-shift insns to decodetree
f48b59c target/arm: Convert Neon VSHR 2-reg-shift insns to decodetree
2a2d74c target/arm: Convert Neon VSHL and VSLI 2-reg-shift insn to decodetree
=== OUTPUT BEGIN ===
1/9 Checking commit 2a2d74c89bb5 (target/arm: Convert Neon VSHL and VSLI 2-reg-shift insn to decodetree)
ERROR: spaces required around that '*' (ctx:WxV)
#55: FILE: target/arm/translate-neon.inc.c:1314:
+static bool do_vector_2sh(DisasContext *s, arg_2reg_shift *a, GVecGen2iFn *fn)
^
ERROR: spaces required around that '*' (ctx:WxV)
#85: FILE: target/arm/translate-neon.inc.c:1344:
+ static bool trans_##INSN##_2sh(DisasContext *s, arg_2reg_shift *a) \
^
total: 2 errors, 0 warnings, 99 lines checked
Patch 1/9 has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
2/9 Checking commit f48b59c62b8a (target/arm: Convert Neon VSHR 2-reg-shift insns to decodetree)
ERROR: spaces required around that '*' (ctx:WxV)
#93: FILE: target/arm/translate-neon.inc.c:1370:
+static bool trans_VSHR_S_2sh(DisasContext *s, arg_2reg_shift *a)
^
total: 1 errors, 0 warnings, 120 lines checked
Patch 2/9 has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
3/9 Checking commit c2b6277a4e9c (target/arm: Convert Neon VSRA, VSRI, VRSHR, VRSRA 2-reg-shift insns to decodetree)
4/9 Checking commit 1df57d58e551 (target/arm: Convert VQSHLU, VQSHL 2-reg-shift insns to decodetree)
5/9 Checking commit 7235981fbe43 (target/arm: Convert Neon narrowing shifts with op==8 to decodetree)
ERROR: do not use C99 // comments
#170: FILE: target/arm/translate-neon.inc.c:1611:
+ // todo expand out the shift-narrow and the narrow-op
total: 1 errors, 0 warnings, 214 lines checked
Patch 5/9 has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
6/9 Checking commit 09f9294fe86a (target/arm: Convert Neon narrowing shifts with op==9 to decodetree)
7/9 Checking commit e32ffd0ffb35 (target/arm: Convert Neon VSHLL, VMOVL to decodetree)
8/9 Checking commit e12ab4f9bff6 (target/arm: Convert VCVT fixed-point ops to decodetree)
9/9 Checking commit bbfdb6dcc6d8 (target/arm: Convert Neon one-register-and-immediate insns to decodetree)
=== OUTPUT END ===
Test command exited with code: 1
The full log is available at
http://patchew.org/logs/20200522145520.6778-1-peter.maydell@linaro.org/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com