* [PATCH 0/3] target/ppc: Implement Vector Expand/Extract Mask and Vector Mask
@ 2021-11-10 18:56 matheus.ferst
From: matheus.ferst @ 2021-11-10 18:56 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
This is a small patch series to allow Ubuntu 21.10 to boot with
-cpu POWER10. Glibc 2.34 uses vextractbm, so init is killed by
SIGILL without the second patch of this series. The other two groups
of instructions are included because they are closely related to
Vector Extract Mask (at least in pseudocode).
Matheus Ferst (3):
target/ppc: Implement Vector Expand Mask
target/ppc: Implement Vector Extract Mask
target/ppc: Implement Vector Mask Move insns
target/ppc/insn32.decode | 28 ++++
target/ppc/translate/vmx-impl.c.inc | 233 ++++++++++++++++++++++++++++
2 files changed, 261 insertions(+)
--
2.25.1
* [PATCH 1/3] target/ppc: Implement Vector Expand Mask
From: matheus.ferst @ 2021-11-10 18:56 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Implement the following PowerISA v3.1 instructions:
vexpandbm: Vector Expand Byte Mask
vexpandhm: Vector Expand Halfword Mask
vexpandwm: Vector Expand Word Mask
vexpanddm: Vector Expand Doubleword Mask
vexpandqm: Vector Expand Quadword Mask
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 11 ++++++++++
target/ppc/translate/vmx-impl.c.inc | 34 +++++++++++++++++++++++++++++
2 files changed, 45 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index e135b8aba4..9a28f1d266 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -56,6 +56,9 @@
&VX_uim4 vrt uim vrb
@VX_uim4 ...... vrt:5 . uim:4 vrb:5 ........... &VX_uim4
+&VX_tb vrt vrb
+@VX_tb ...... vrt:5 ..... vrb:5 ........... &VX_tb
+
&X rt ra rb
@X ...... rt:5 ra:5 rb:5 .......... . &X
@@ -408,6 +411,14 @@ VINSWVRX 000100 ..... ..... ..... 00110001111 @VX
VSLDBI 000100 ..... ..... ..... 00 ... 010110 @VN
VSRDBI 000100 ..... ..... ..... 01 ... 010110 @VN
+## Vector Mask Manipulation Instructions
+
+VEXPANDBM 000100 ..... 00000 ..... 11001000010 @VX_tb
+VEXPANDHM 000100 ..... 00001 ..... 11001000010 @VX_tb
+VEXPANDWM 000100 ..... 00010 ..... 11001000010 @VX_tb
+VEXPANDDM 000100 ..... 00011 ..... 11001000010 @VX_tb
+VEXPANDQM 000100 ..... 00100 ..... 11001000010 @VX_tb
+
# VSX Load/Store Instructions
LXV 111101 ..... ..... ............ . 001 @DQ_TSX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index b361f73a67..58aca58f0f 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1505,6 +1505,40 @@ static bool trans_VSRDBI(DisasContext *ctx, arg_VN *a)
return true;
}
+static bool do_vexpand(DisasContext *ctx, arg_VX_tb *a, unsigned vece)
+{
+    REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+    REQUIRE_VECTOR(ctx);
+
+    tcg_gen_gvec_sari(vece, avr_full_offset(a->vrt), avr_full_offset(a->vrb),
+                      (8 << vece) - 1, 16, 16);
+
+    return true;
+}
+
+TRANS(VEXPANDBM, do_vexpand, MO_8)
+TRANS(VEXPANDHM, do_vexpand, MO_16)
+TRANS(VEXPANDWM, do_vexpand, MO_32)
+TRANS(VEXPANDDM, do_vexpand, MO_64)
+
+static bool trans_VEXPANDQM(DisasContext *ctx, arg_VX_tb *a)
+{
+    TCGv_i64 tmp;
+
+    REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+    REQUIRE_VECTOR(ctx);
+
+    tmp = tcg_temp_new_i64();
+
+    get_avr64(tmp, a->vrb, true);
+    tcg_gen_sari_i64(tmp, tmp, 63);
+    set_avr64(a->vrt, tmp, false);
+    set_avr64(a->vrt, tmp, true);
+
+    tcg_temp_free_i64(tmp);
+    return true;
+}
+
#define GEN_VAFORM_PAIRED(name0, name1, opc2) \
static void glue(gen_, name0##_##name1)(DisasContext *ctx) \
{ \
--
2.25.1
* [PATCH 2/3] target/ppc: Implement Vector Extract Mask
From: matheus.ferst @ 2021-11-10 18:56 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Implement the following PowerISA v3.1 instructions:
vextractbm: Vector Extract Byte Mask
vextracthm: Vector Extract Halfword Mask
vextractwm: Vector Extract Word Mask
vextractdm: Vector Extract Doubleword Mask
vextractqm: Vector Extract Quadword Mask
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 6 ++
target/ppc/translate/vmx-impl.c.inc | 85 +++++++++++++++++++++++++++++
2 files changed, 91 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 9a28f1d266..639ac22bf0 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -419,6 +419,12 @@ VEXPANDWM 000100 ..... 00010 ..... 11001000010 @VX_tb
VEXPANDDM 000100 ..... 00011 ..... 11001000010 @VX_tb
VEXPANDQM 000100 ..... 00100 ..... 11001000010 @VX_tb
+VEXTRACTBM 000100 ..... 01000 ..... 11001000010 @VX_tb
+VEXTRACTHM 000100 ..... 01001 ..... 11001000010 @VX_tb
+VEXTRACTWM 000100 ..... 01010 ..... 11001000010 @VX_tb
+VEXTRACTDM 000100 ..... 01011 ..... 11001000010 @VX_tb
+VEXTRACTQM 000100 ..... 01100 ..... 11001000010 @VX_tb
+
# VSX Load/Store Instructions
LXV 111101 ..... ..... ............ . 001 @DQ_TSX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 58aca58f0f..c6a30614fb 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1539,6 +1539,91 @@ static bool trans_VEXPANDQM(DisasContext *ctx, arg_VX_tb *a)
return true;
}
+static bool do_vextractm(DisasContext *ctx, arg_VX_tb *a, unsigned vece)
+{
+    const uint64_t elem_length = 8 << vece, elem_num = 15 >> vece;
+    int i = elem_num;
+    uint64_t bit;
+    TCGv_i64 t, b, tmp, zero;
+
+    REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+    REQUIRE_VECTOR(ctx);
+
+    t = tcg_const_i64(0);
+    b = tcg_temp_new_i64();
+    tmp = tcg_temp_new_i64();
+    zero = tcg_constant_i64(0);
+
+    get_avr64(b, a->vrb, true);
+    for (bit = 1ULL << 63; i > elem_num / 2; i--, bit >>= elem_length) {
+        tcg_gen_shli_i64(t, t, 1);
+        tcg_gen_andi_i64(tmp, b, bit);
+        tcg_gen_setcond_i64(TCG_COND_NE, tmp, tmp, zero);
+        tcg_gen_or_i64(t, t, tmp);
+    }
+
+    get_avr64(b, a->vrb, false);
+    for (bit = 1ULL << 63; i >= 0; i--, bit >>= elem_length) {
+        tcg_gen_shli_i64(t, t, 1);
+        tcg_gen_andi_i64(tmp, b, bit);
+        tcg_gen_setcond_i64(TCG_COND_NE, tmp, tmp, zero);
+        tcg_gen_or_i64(t, t, tmp);
+    }
+
+    tcg_gen_trunc_i64_tl(cpu_gpr[a->vrt], t);
+
+    tcg_temp_free_i64(t);
+    tcg_temp_free_i64(b);
+    tcg_temp_free_i64(tmp);
+
+    return true;
+}
+
+TRANS(VEXTRACTBM, do_vextractm, MO_8)
+TRANS(VEXTRACTHM, do_vextractm, MO_16)
+TRANS(VEXTRACTWM, do_vextractm, MO_32)
+
+static bool trans_VEXTRACTDM(DisasContext *ctx, arg_VX_tb *a)
+{
+    TCGv_i64 t, b;
+
+    REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+    REQUIRE_VECTOR(ctx);
+
+    t = tcg_temp_new_i64();
+    b = tcg_temp_new_i64();
+
+    get_avr64(b, a->vrb, true);
+    tcg_gen_shri_i64(t, b, 63);
+    tcg_gen_shli_i64(t, t, 1);
+
+    get_avr64(b, a->vrb, false);
+    tcg_gen_shri_i64(b, b, 63);
+    tcg_gen_or_i64(t, t, b);
+
+    tcg_gen_trunc_i64_tl(cpu_gpr[a->vrt], t);
+
+    tcg_temp_free_i64(t);
+    tcg_temp_free_i64(b);
+
+    return true;
+}
+
+static bool trans_VEXTRACTQM(DisasContext *ctx, arg_VX_tb *a)
+{
+    TCGv_i64 tmp;
+
+    REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+    REQUIRE_VECTOR(ctx);
+
+    tmp = tcg_temp_new_i64();
+
+    get_avr64(tmp, a->vrb, true);
+    tcg_gen_shri_i64(tmp, tmp, 63);
+    tcg_gen_trunc_i64_tl(cpu_gpr[a->vrt], tmp);
+
+    tcg_temp_free_i64(tmp);
+
+    return true;
+}
+
#define GEN_VAFORM_PAIRED(name0, name1, opc2) \
static void glue(gen_, name0##_##name1)(DisasContext *ctx) \
{ \
--
2.25.1
* [PATCH 3/3] target/ppc: Implement Vector Mask Move insns
From: matheus.ferst @ 2021-11-10 18:56 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Implement the following PowerISA v3.1 instructions:
mtvsrbm: Move to VSR Byte Mask
mtvsrhm: Move to VSR Halfword Mask
mtvsrwm: Move to VSR Word Mask
mtvsrdm: Move to VSR Doubleword Mask
mtvsrqm: Move to VSR Quadword Mask
mtvsrbmi: Move to VSR Byte Mask Immediate
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 11 +++
target/ppc/translate/vmx-impl.c.inc | 114 ++++++++++++++++++++++++++++
2 files changed, 125 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 639ac22bf0..f68931f4f3 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -40,6 +40,10 @@
%ds_rtp 22:4 !function=times_2
@DS_rtp ...... ....0 ra:5 .............. .. &D rt=%ds_rtp si=%ds_si
+&DX_b vrt b
+%dx_b 6:10 16:5 0:1
+@DX_b ...... vrt:5 ..... .......... ..... . &DX_b b=%dx_b
+
&DX rt d
%dx_d 6:s10 16:5 0:1
@DX ...... rt:5 ..... .......... ..... . &DX d=%dx_d
@@ -413,6 +417,13 @@ VSRDBI 000100 ..... ..... ..... 01 ... 010110 @VN
## Vector Mask Manipulation Instructions
+MTVSRBM 000100 ..... 10000 ..... 11001000010 @VX_tb
+MTVSRHM 000100 ..... 10001 ..... 11001000010 @VX_tb
+MTVSRWM 000100 ..... 10010 ..... 11001000010 @VX_tb
+MTVSRDM 000100 ..... 10011 ..... 11001000010 @VX_tb
+MTVSRQM 000100 ..... 10100 ..... 11001000010 @VX_tb
+MTVSRBMI 000100 ..... ..... .......... 01010 . @DX_b
+
VEXPANDBM 000100 ..... 00000 ..... 11001000010 @VX_tb
VEXPANDHM 000100 ..... 00001 ..... 11001000010 @VX_tb
VEXPANDWM 000100 ..... 00010 ..... 11001000010 @VX_tb
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index c6a30614fb..9f86133d1d 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1624,6 +1624,120 @@ static bool trans_VEXTRACTQM(DisasContext *ctx, arg_VX_tb *a)
return true;
}
+static bool do_mtvsrm(DisasContext *ctx, arg_VX_tb *a, unsigned vece)
+{
+    const uint64_t elem_length = 8 << vece, highest_bit = 15 >> vece;
+    int i;
+    TCGv_i64 t0, t1, zero, ones;
+
+    REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+    REQUIRE_VECTOR(ctx);
+
+    t0 = tcg_const_i64(0);
+    t1 = tcg_temp_new_i64();
+    zero = tcg_constant_i64(0);
+    ones = tcg_constant_i64(MAKE_64BIT_MASK(0, elem_length));
+
+    for (i = 1 << highest_bit; i > 1 << (highest_bit / 2); i >>= 1) {
+        tcg_gen_shli_i64(t0, t0, elem_length);
+        tcg_gen_ext_tl_i64(t1, cpu_gpr[a->vrb]);
+        tcg_gen_andi_i64(t1, t1, i);
+        tcg_gen_movcond_i64(TCG_COND_NE, t1, t1, zero, ones, zero);
+        tcg_gen_or_i64(t0, t0, t1);
+    }
+
+    set_avr64(a->vrt, t0, true);
+
+    for (; i > 0; i >>= 1) {
+        tcg_gen_shli_i64(t0, t0, elem_length);
+        tcg_gen_ext_tl_i64(t1, cpu_gpr[a->vrb]);
+        tcg_gen_andi_i64(t1, t1, i);
+        tcg_gen_movcond_i64(TCG_COND_NE, t1, t1, zero, ones, zero);
+        tcg_gen_or_i64(t0, t0, t1);
+    }
+
+    set_avr64(a->vrt, t0, false);
+
+    tcg_temp_free_i64(t0);
+    tcg_temp_free_i64(t1);
+
+    return true;
+}
+
+TRANS(MTVSRBM, do_mtvsrm, MO_8)
+TRANS(MTVSRHM, do_mtvsrm, MO_16)
+TRANS(MTVSRWM, do_mtvsrm, MO_32)
+
+static bool trans_MTVSRDM(DisasContext *ctx, arg_VX_tb *a)
+{
+    TCGv_i64 t0, t1;
+
+    REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+    REQUIRE_VECTOR(ctx);
+
+    t0 = tcg_temp_new_i64();
+    t1 = tcg_temp_new_i64();
+
+    tcg_gen_ext_tl_i64(t0, cpu_gpr[a->vrb]);
+    tcg_gen_sextract_i64(t1, t0, 1, 1);
+    set_avr64(a->vrt, t1, true);
+    tcg_gen_sextract_i64(t0, t0, 0, 1);
+    set_avr64(a->vrt, t0, false);
+
+    tcg_temp_free_i64(t0);
+    tcg_temp_free_i64(t1);
+
+    return true;
+}
+
+static bool trans_MTVSRQM(DisasContext *ctx, arg_VX_tb *a)
+{
+    TCGv_i64 tmp;
+
+    REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+    REQUIRE_VECTOR(ctx);
+
+    tmp = tcg_temp_new_i64();
+
+    tcg_gen_ext_tl_i64(tmp, cpu_gpr[a->vrb]);
+    tcg_gen_sextract_i64(tmp, tmp, 0, 1);
+    set_avr64(a->vrt, tmp, false);
+    set_avr64(a->vrt, tmp, true);
+
+    tcg_temp_free_i64(tmp);
+
+    return true;
+}
+
+static bool trans_MTVSRBMI(DisasContext *ctx, arg_DX_b *a)
+{
+    int i;
+    uint64_t hi = 0, lo = 0;
+
+    REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+    REQUIRE_VECTOR(ctx);
+
+    for (i = 1 << 15; i >= 1 << 8; i >>= 1) {
+        hi <<= 8;
+        if (a->b & i) {
+            hi |= 0xFF;
+        }
+    }
+
+    set_avr64(a->vrt, tcg_constant_i64(hi), true);
+
+    for (; i > 0; i >>= 1) {
+        lo <<= 8;
+        if (a->b & i) {
+            lo |= 0xFF;
+        }
+    }
+
+    set_avr64(a->vrt, tcg_constant_i64(lo), false);
+
+    return true;
+}
+
#define GEN_VAFORM_PAIRED(name0, name1, opc2) \
static void glue(gen_, name0##_##name1)(DisasContext *ctx) \
{ \
--
2.25.1
* Re: [PATCH 1/3] target/ppc: Implement Vector Expand Mask
From: Richard Henderson @ 2021-11-11 9:28 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 11/10/21 7:56 PM, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst<matheus.ferst@eldorado.org.br>
>
> Implement the following PowerISA v3.1 instructions:
> vexpandbm: Vector Expand Byte Mask
> vexpandhm: Vector Expand Halfword Mask
> vexpandwm: Vector Expand Word Mask
> vexpanddm: Vector Expand Doubleword Mask
> vexpandqm: Vector Expand Quadword Mask
>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> target/ppc/insn32.decode | 11 ++++++++++
> target/ppc/translate/vmx-impl.c.inc | 34 +++++++++++++++++++++++++++++
> 2 files changed, 45 insertions(+)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
* Re: [PATCH 2/3] target/ppc: Implement Vector Extract Mask
From: Richard Henderson @ 2021-11-11 9:54 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 11/10/21 7:56 PM, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst <matheus.ferst@eldorado.org.br>
>
> Implement the following PowerISA v3.1 instructions:
> vextractbm: Vector Extract Byte Mask
> vextracthm: Vector Extract Halfword Mask
> vextractwm: Vector Extract Word Mask
> vextractdm: Vector Extract Doubleword Mask
> vextractqm: Vector Extract Quadword Mask
>
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> target/ppc/insn32.decode | 6 ++
> target/ppc/translate/vmx-impl.c.inc | 85 +++++++++++++++++++++++++++++
> 2 files changed, 91 insertions(+)
>
> diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
> index 9a28f1d266..639ac22bf0 100644
> --- a/target/ppc/insn32.decode
> +++ b/target/ppc/insn32.decode
> @@ -419,6 +419,12 @@ VEXPANDWM 000100 ..... 00010 ..... 11001000010 @VX_tb
> VEXPANDDM 000100 ..... 00011 ..... 11001000010 @VX_tb
> VEXPANDQM 000100 ..... 00100 ..... 11001000010 @VX_tb
>
> +VEXTRACTBM 000100 ..... 01000 ..... 11001000010 @VX_tb
> +VEXTRACTHM 000100 ..... 01001 ..... 11001000010 @VX_tb
> +VEXTRACTWM 000100 ..... 01010 ..... 11001000010 @VX_tb
> +VEXTRACTDM 000100 ..... 01011 ..... 11001000010 @VX_tb
> +VEXTRACTQM 000100 ..... 01100 ..... 11001000010 @VX_tb
> +
> # VSX Load/Store Instructions
>
> LXV 111101 ..... ..... ............ . 001 @DQ_TSX
> diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
> index 58aca58f0f..c6a30614fb 100644
> --- a/target/ppc/translate/vmx-impl.c.inc
> +++ b/target/ppc/translate/vmx-impl.c.inc
> @@ -1539,6 +1539,91 @@ static bool trans_VEXPANDQM(DisasContext *ctx, arg_VX_tb *a)
> return true;
> }
>
> +static bool do_vextractm(DisasContext *ctx, arg_VX_tb *a, unsigned vece)
> +{
> +    const uint64_t elem_length = 8 << vece, elem_num = 15 >> vece;
> +    int i = elem_num;
> +    uint64_t bit;
> +    TCGv_i64 t, b, tmp, zero;
> +
> +    REQUIRE_INSNS_FLAGS2(ctx, ISA310);
> +    REQUIRE_VECTOR(ctx);
> +
> +    t = tcg_const_i64(0);
> +    b = tcg_temp_new_i64();
> +    tmp = tcg_temp_new_i64();
> +    zero = tcg_constant_i64(0);
> +
> +    get_avr64(b, a->vrb, true);
> +    for (bit = 1ULL << 63; i > elem_num / 2; i--, bit >>= elem_length) {
> +        tcg_gen_shli_i64(t, t, 1);
> +        tcg_gen_andi_i64(tmp, b, bit);
> +        tcg_gen_setcond_i64(TCG_COND_NE, tmp, tmp, zero);
> +        tcg_gen_or_i64(t, t, tmp);
> +    }
This is over-complicated. Shift b into the correct position, isolate the correct bit, or
it into the result.
    int ele_width = 8 << vece;
    int ele_count_half = 8 >> vece;

    tcg_gen_movi_i64(r, 0);
    for (int w = 0; w < 2; w++) {
        get_avr64(v, a->vrb, w);
        for (int i = 0; i < ele_count_half; ++i) {
            int b_in = (i + 1) * ele_width - 1;  /* sign bit of element i */
            int b_out = w * ele_count_half + i;  /* its bit in the mask */
            tcg_gen_shri_i64(t, v, b_in - b_out);
            tcg_gen_andi_i64(t, t, 1 << b_out);
            tcg_gen_or_i64(r, r, t);
        }
    }
    tcg_gen_trunc_i64_tl(gpr, r);
> +TRANS(VEXTRACTBM, do_vextractm, MO_8)
> +TRANS(VEXTRACTHM, do_vextractm, MO_16)
> +TRANS(VEXTRACTWM, do_vextractm, MO_32)
> +
> +static bool trans_VEXTRACTDM(DisasContext *ctx, arg_VX_tb *a)
Should be able to use the common routine above as well.
r~
* Re: [PATCH 3/3] target/ppc: Implement Vector Mask Move insns
From: Richard Henderson @ 2021-11-11 10:43 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 11/10/21 7:56 PM, matheus.ferst@eldorado.org.br wrote:
> +static bool do_mtvsrm(DisasContext *ctx, arg_VX_tb *a, unsigned vece)
> +{
> + const uint64_t elem_length = 8 << vece, highest_bit = 15 >> vece;
> + int i;
> + TCGv_i64 t0, t1, zero, ones;
> +
> + REQUIRE_INSNS_FLAGS2(ctx, ISA310);
> + REQUIRE_VECTOR(ctx);
> +
> + t0 = tcg_const_i64(0);
> + t1 = tcg_temp_new_i64();
> + zero = tcg_constant_i64(0);
> + ones = tcg_constant_i64(MAKE_64BIT_MASK(0, elem_length));
> +
> + for (i = 1 << highest_bit; i > 1 << (highest_bit / 2); i >>= 1) {
> + tcg_gen_shli_i64(t0, t0, elem_length);
> + tcg_gen_ext_tl_i64(t1, cpu_gpr[a->vrb]);
> + tcg_gen_andi_i64(t1, t1, i);
> + tcg_gen_movcond_i64(TCG_COND_NE, t1, t1, zero, ones, zero);
> + tcg_gen_or_i64(t0, t0, t1);
> + }
We can do better than that.
    tcg_gen_extu_tl_i64(t0, gpr);
    tcg_gen_extract_i64(t1, t0, elem_count_half, elem_count_half);
    tcg_gen_extract_i64(t0, t0, 0, elem_count_half);

    /*
     * Spread the bits into their respective elements.
     * E.g. for bytes:
     * 00000000000000000000000000000000000000000000000000000000abcdefgh
     *     << 32-4
     * 0000000000000000000000000000abcdefgh0000000000000000000000000000
     *     |
     * 0000000000000000000000000000abcdefgh00000000000000000000abcdefgh
     *     << 16-2
     * 00000000000000abcdefgh00000000000000000000abcdefgh00000000000000
     *     |
     * 00000000000000abcdefgh000000abcdefgh000000abcdefgh000000abcdefgh
     *     << 8-1
     * 0000000abcdefgh000000abcdefgh000000abcdefgh000000abcdefgh0000000
     *     |
     * 0000000abcdefgXbcdefgXbcdefgXbcdefgXbcdefgXbcdefgXbcdefgXbcdefgh
     *     & dup(1)
     * 0000000a0000000b0000000c0000000d0000000e0000000f0000000g0000000h
     *     * 0xff
     * aaaaaaaabbbbbbbbccccccccddddddddeeeeeeeeffffffffgggggggghhhhhhhh
     */
    for (i = elem_count_half / 2, j = 32; i > 0; i >>= 1, j >>= 1) {
        tcg_gen_shli_i64(s0, t0, j - i);
        tcg_gen_shli_i64(s1, t1, j - i);
        tcg_gen_or_i64(t0, t0, s0);
        tcg_gen_or_i64(t1, t1, s1);
    }

    c = dup_const(vece, 1);
    tcg_gen_andi_i64(t0, t0, c);
    tcg_gen_andi_i64(t1, t1, c);

    c = MAKE_64BIT_MASK(0, elem_length);
    tcg_gen_muli_i64(t0, t0, c);
    tcg_gen_muli_i64(t1, t1, c);

    set_avr64(a->vrt, t0, false);
    set_avr64(a->vrt, t1, true);
r~