* [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch
@ 2022-02-10 12:34 matheus.ferst
2022-02-10 12:34 ` [PATCH v3 01/37] target/ppc: Introduce TRANS*FLAGS macros matheus.ferst
` (36 more replies)
0 siblings, 37 replies; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
This patch series implements 5 missing instructions from PowerISA v3.0
and 40 new instructions from PowerISA v3.1, moving 62 other instructions
to decodetree along the way.
v3:
- Dropped patch 33, which caused a regression in xxperm[r]
v2:
- New patch (30) to remove xscmpnedp
Lucas Coutinho (2):
target/ppc: Move vexts[bhw]2[wd] to decodetree
target/ppc: Implement vextsd2q
Lucas Mateus Castro (alqotel) (3):
target/ppc: moved vector even and odd multiplication to decodetree
target/ppc: Moved vector multiply high and low to decodetree
target/ppc: vmulh* instructions use gvec
Luis Pires (1):
target/ppc: Introduce TRANS*FLAGS macros
Matheus Ferst (20):
target/ppc: Move Vector Compare Equal/Not Equal/Greater Than to
decodetree
target/ppc: Move Vector Compare Not Equal or Zero to decodetree
target/ppc: Implement Vector Compare Equal Quadword
target/ppc: Implement Vector Compare Greater Than Quadword
target/ppc: Implement Vector Compare Quadword
target/ppc: implement vstri[bh][lr]
target/ppc: implement vclrlb
target/ppc: implement vclrrb
target/ppc: implement vcntmb[bhwd]
target/ppc: implement vgnb
target/ppc: Move vsel and vperm/vpermr to decodetree
target/ppc: Move xxsel to decodetree
target/ppc: move xxperm/xxpermr to decodetree
target/ppc: Move xxpermdi to decodetree
target/ppc: Implement xxpermx instruction
tcg/tcg-op-gvec.c: Introduce tcg_gen_gvec_4i
target/ppc: Implement xxeval
target/ppc: Implement xxgenpcv[bhwd]m instruction
target/ppc: move xs[n]madd[am][ds]p/xs[n]msub[am][ds]p to decodetree
target/ppc: implement xs[n]maddqp[o]/xs[n]msubqp[o]
Víctor Colombo (11):
target/ppc: Implement vmsumcud instruction
target/ppc: Implement vmsumudm instruction
target/ppc: Implement xvtlsbb instruction
target/ppc: Remove xscmpnedp instruction
target/ppc: Refactor VSX_SCALAR_CMP_DP
target/ppc: Implement xscmp{eq,ge,gt}qp
target/ppc: Move xscmp{eq,ge,gt}dp to decodetree
target/ppc: Move xs{max,min}[cj]dp to use do_helper_XX3
target/ppc: Refactor VSX_MAX_MINC helper
target/ppc: Implement xs{max,min}cqp
target/ppc: Implement xvcvbf16spn and xvcvspbf16 instructions
include/tcg/tcg-op-gvec.h | 22 +
target/ppc/fpu_helper.c | 171 ++++--
target/ppc/helper.h | 143 ++---
target/ppc/insn32.decode | 188 +++++-
target/ppc/insn64.decode | 40 +-
target/ppc/int_helper.c | 354 ++++++-----
target/ppc/translate.c | 19 +
target/ppc/translate/vmx-impl.c.inc | 894 +++++++++++++++++++++++++---
target/ppc/translate/vmx-ops.c.inc | 41 +-
target/ppc/translate/vsx-impl.c.inc | 543 ++++++++++++++---
target/ppc/translate/vsx-ops.c.inc | 67 ---
tcg/ppc/tcg-target.c.inc | 6 +
tcg/tcg-op-gvec.c | 146 +++++
13 files changed, 2066 insertions(+), 568 deletions(-)
--
2.31.1
^ permalink raw reply [flat|nested] 55+ messages in thread
* [PATCH v3 01/37] target/ppc: Introduce TRANS*FLAGS macros
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-10 12:34 ` [PATCH v3 02/37] target/ppc: moved vector even and odd multiplication to decodetree matheus.ferst
` (35 subsequent siblings)
36 siblings, 0 replies; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Luis Pires, clg,
Matheus Ferst, david
From: Luis Pires <luis.pires@eldorado.org.br>
New macros that add FLAGS and FLAGS2 checking were added for
both TRANS and TRANS64.
Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
[ferst: - TRANS_FLAGS2 instead of TRANS_FLAGS_E
- Use the new macros in load/store vector insns ]
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
target/ppc/translate.c | 19 +++++++++++++++
target/ppc/translate/vsx-impl.c.inc | 37 ++++++++++-------------------
2 files changed, 31 insertions(+), 25 deletions(-)
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 40232201bb..4731a2e45a 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -7438,10 +7438,29 @@ static int times_16(DisasContext *ctx, int x)
#define TRANS(NAME, FUNC, ...) \
static bool trans_##NAME(DisasContext *ctx, arg_##NAME *a) \
{ return FUNC(ctx, a, __VA_ARGS__); }
+#define TRANS_FLAGS(FLAGS, NAME, FUNC, ...) \
+ static bool trans_##NAME(DisasContext *ctx, arg_##NAME *a) \
+ { \
+ REQUIRE_INSNS_FLAGS(ctx, FLAGS); \
+ return FUNC(ctx, a, __VA_ARGS__); \
+ }
+#define TRANS_FLAGS2(FLAGS2, NAME, FUNC, ...) \
+ static bool trans_##NAME(DisasContext *ctx, arg_##NAME *a) \
+ { \
+ REQUIRE_INSNS_FLAGS2(ctx, FLAGS2); \
+ return FUNC(ctx, a, __VA_ARGS__); \
+ }
#define TRANS64(NAME, FUNC, ...) \
static bool trans_##NAME(DisasContext *ctx, arg_##NAME *a) \
{ REQUIRE_64BIT(ctx); return FUNC(ctx, a, __VA_ARGS__); }
+#define TRANS64_FLAGS2(FLAGS2, NAME, FUNC, ...) \
+ static bool trans_##NAME(DisasContext *ctx, arg_##NAME *a) \
+ { \
+ REQUIRE_64BIT(ctx); \
+ REQUIRE_INSNS_FLAGS2(ctx, FLAGS2); \
+ return FUNC(ctx, a, __VA_ARGS__); \
+ }
/* TODO: More TRANS* helpers for extra insn_flags checks. */
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index c636e38164..b89be57272 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -2070,12 +2070,6 @@ static bool do_lstxv(DisasContext *ctx, int ra, TCGv displ,
static bool do_lstxv_D(DisasContext *ctx, arg_D *a, bool store, bool paired)
{
- if (paired) {
- REQUIRE_INSNS_FLAGS2(ctx, ISA310);
- } else {
- REQUIRE_INSNS_FLAGS2(ctx, ISA300);
- }
-
if (paired || a->rt >= 32) {
REQUIRE_VSX(ctx);
} else {
@@ -2089,7 +2083,6 @@ static bool do_lstxv_PLS_D(DisasContext *ctx, arg_PLS_D *a,
bool store, bool paired)
{
arg_D d;
- REQUIRE_INSNS_FLAGS2(ctx, ISA310);
REQUIRE_VSX(ctx);
if (!resolve_PLS_D(ctx, &d, a)) {
@@ -2101,12 +2094,6 @@ static bool do_lstxv_PLS_D(DisasContext *ctx, arg_PLS_D *a,
static bool do_lstxv_X(DisasContext *ctx, arg_X *a, bool store, bool paired)
{
- if (paired) {
- REQUIRE_INSNS_FLAGS2(ctx, ISA310);
- } else {
- REQUIRE_INSNS_FLAGS2(ctx, ISA300);
- }
-
if (paired || a->rt >= 32) {
REQUIRE_VSX(ctx);
} else {
@@ -2116,18 +2103,18 @@ static bool do_lstxv_X(DisasContext *ctx, arg_X *a, bool store, bool paired)
return do_lstxv(ctx, a->ra, cpu_gpr[a->rb], a->rt, store, paired);
}
-TRANS(STXV, do_lstxv_D, true, false)
-TRANS(LXV, do_lstxv_D, false, false)
-TRANS(STXVP, do_lstxv_D, true, true)
-TRANS(LXVP, do_lstxv_D, false, true)
-TRANS(STXVX, do_lstxv_X, true, false)
-TRANS(LXVX, do_lstxv_X, false, false)
-TRANS(STXVPX, do_lstxv_X, true, true)
-TRANS(LXVPX, do_lstxv_X, false, true)
-TRANS64(PSTXV, do_lstxv_PLS_D, true, false)
-TRANS64(PLXV, do_lstxv_PLS_D, false, false)
-TRANS64(PSTXVP, do_lstxv_PLS_D, true, true)
-TRANS64(PLXVP, do_lstxv_PLS_D, false, true)
+TRANS_FLAGS2(ISA300, STXV, do_lstxv_D, true, false)
+TRANS_FLAGS2(ISA300, LXV, do_lstxv_D, false, false)
+TRANS_FLAGS2(ISA310, STXVP, do_lstxv_D, true, true)
+TRANS_FLAGS2(ISA310, LXVP, do_lstxv_D, false, true)
+TRANS_FLAGS2(ISA300, STXVX, do_lstxv_X, true, false)
+TRANS_FLAGS2(ISA300, LXVX, do_lstxv_X, false, false)
+TRANS_FLAGS2(ISA310, STXVPX, do_lstxv_X, true, true)
+TRANS_FLAGS2(ISA310, LXVPX, do_lstxv_X, false, true)
+TRANS64_FLAGS2(ISA310, PSTXV, do_lstxv_PLS_D, true, false)
+TRANS64_FLAGS2(ISA310, PLXV, do_lstxv_PLS_D, false, false)
+TRANS64_FLAGS2(ISA310, PSTXVP, do_lstxv_PLS_D, true, true)
+TRANS64_FLAGS2(ISA310, PLXVP, do_lstxv_PLS_D, false, true)
static void gen_xxblendv_vec(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b,
TCGv_vec c)
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 02/37] target/ppc: moved vector even and odd multiplication to decodetree
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
2022-02-10 12:34 ` [PATCH v3 01/37] target/ppc: Introduce TRANS*FLAGS macros matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-11 3:39 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 03/37] target/ppc: Moved vector multiply high and low " matheus.ferst
` (34 subsequent siblings)
36 siblings, 1 reply; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: Lucas Mateus Castro (alqotel),
danielhb413, richard.henderson, groug, Lucas Mateus Castro, clg,
Matheus Ferst, david
From: "Lucas Mateus Castro (alqotel)" <lucas.castro@eldorado.org.br>
Moved the instructions vmulesb, vmulosb, vmuleub, vmuloub,
vmulesh, vmulosh, vmuleuh, vmulouh, vmulesw, vmulosw,
muleuw and vmulouw from legacy to decodetree. Implemented
the instructions vmulesd, vmulosd, vmuleud, vmuloud.
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/helper.h | 28 +++++++++-------
target/ppc/insn32.decode | 22 ++++++++++++
target/ppc/int_helper.c | 36 ++++++++++++++------
target/ppc/translate/vmx-impl.c.inc | 52 +++++++++++++++++++----------
target/ppc/translate/vmx-ops.c.inc | 15 ++-------
tcg/ppc/tcg-target.c.inc | 6 ++++
6 files changed, 107 insertions(+), 52 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index f9c72dcd50..debb05c12c 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -191,18 +191,22 @@ DEF_HELPER_3(vmrglw, void, avr, avr, avr)
DEF_HELPER_3(vmrghb, void, avr, avr, avr)
DEF_HELPER_3(vmrghh, void, avr, avr, avr)
DEF_HELPER_3(vmrghw, void, avr, avr, avr)
-DEF_HELPER_3(vmulesb, void, avr, avr, avr)
-DEF_HELPER_3(vmulesh, void, avr, avr, avr)
-DEF_HELPER_3(vmulesw, void, avr, avr, avr)
-DEF_HELPER_3(vmuleub, void, avr, avr, avr)
-DEF_HELPER_3(vmuleuh, void, avr, avr, avr)
-DEF_HELPER_3(vmuleuw, void, avr, avr, avr)
-DEF_HELPER_3(vmulosb, void, avr, avr, avr)
-DEF_HELPER_3(vmulosh, void, avr, avr, avr)
-DEF_HELPER_3(vmulosw, void, avr, avr, avr)
-DEF_HELPER_3(vmuloub, void, avr, avr, avr)
-DEF_HELPER_3(vmulouh, void, avr, avr, avr)
-DEF_HELPER_3(vmulouw, void, avr, avr, avr)
+DEF_HELPER_3(VMULESB, void, avr, avr, avr)
+DEF_HELPER_3(VMULESH, void, avr, avr, avr)
+DEF_HELPER_3(VMULESW, void, avr, avr, avr)
+DEF_HELPER_3(VMULESD, void, avr, avr, avr)
+DEF_HELPER_3(VMULEUB, void, avr, avr, avr)
+DEF_HELPER_3(VMULEUH, void, avr, avr, avr)
+DEF_HELPER_3(VMULEUW, void, avr, avr, avr)
+DEF_HELPER_3(VMULEUD, void, avr, avr, avr)
+DEF_HELPER_3(VMULOSB, void, avr, avr, avr)
+DEF_HELPER_3(VMULOSH, void, avr, avr, avr)
+DEF_HELPER_3(VMULOSW, void, avr, avr, avr)
+DEF_HELPER_3(VMULOSD, void, avr, avr, avr)
+DEF_HELPER_3(VMULOUB, void, avr, avr, avr)
+DEF_HELPER_3(VMULOUH, void, avr, avr, avr)
+DEF_HELPER_3(VMULOUW, void, avr, avr, avr)
+DEF_HELPER_3(VMULOUD, void, avr, avr, avr)
DEF_HELPER_3(vmulhsw, void, avr, avr, avr)
DEF_HELPER_3(vmulhuw, void, avr, avr, avr)
DEF_HELPER_3(vmulhsd, void, avr, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 2a9c91a423..cab372fa44 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -475,3 +475,25 @@ XSCVQPDP 111111 ..... 10100 ..... 1101000100 . @X_tb_rc
&XL_s s:uint8_t
@XL_s ......-------------- s:1 .......... - &XL_s
RFEBB 010011-------------- . 0010010010 - @XL_s
+
+## Vector Multiply Instruction
+
+VMULESB 000100 ..... ..... ..... 01100001000 @VX
+VMULOSB 000100 ..... ..... ..... 00100001000 @VX
+VMULEUB 000100 ..... ..... ..... 01000001000 @VX
+VMULOUB 000100 ..... ..... ..... 00000001000 @VX
+
+VMULESH 000100 ..... ..... ..... 01101001000 @VX
+VMULOSH 000100 ..... ..... ..... 00101001000 @VX
+VMULEUH 000100 ..... ..... ..... 01001001000 @VX
+VMULOUH 000100 ..... ..... ..... 00001001000 @VX
+
+VMULESW 000100 ..... ..... ..... 01110001000 @VX
+VMULOSW 000100 ..... ..... ..... 00110001000 @VX
+VMULEUW 000100 ..... ..... ..... 01010001000 @VX
+VMULOUW 000100 ..... ..... ..... 00010001000 @VX
+
+VMULESD 000100 ..... ..... ..... 01111001000 @VX
+VMULOSD 000100 ..... ..... ..... 00111001000 @VX
+VMULEUD 000100 ..... ..... ..... 01011001000 @VX
+VMULOUD 000100 ..... ..... ..... 00011001000 @VX
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 9bc327bcba..56b9e9369b 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1150,7 +1150,7 @@ void helper_vmsumuhs(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a,
}
#define VMUL_DO_EVN(name, mul_element, mul_access, prod_access, cast) \
- void helper_v##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b) \
+ void helper_V##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b) \
{ \
int i; \
\
@@ -1161,7 +1161,7 @@ void helper_vmsumuhs(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a,
}
#define VMUL_DO_ODD(name, mul_element, mul_access, prod_access, cast) \
- void helper_v##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b) \
+ void helper_V##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b) \
{ \
int i; \
\
@@ -1172,17 +1172,33 @@ void helper_vmsumuhs(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a,
}
#define VMUL(suffix, mul_element, mul_access, prod_access, cast) \
- VMUL_DO_EVN(mule##suffix, mul_element, mul_access, prod_access, cast) \
- VMUL_DO_ODD(mulo##suffix, mul_element, mul_access, prod_access, cast)
-VMUL(sb, s8, VsrSB, VsrSH, int16_t)
-VMUL(sh, s16, VsrSH, VsrSW, int32_t)
-VMUL(sw, s32, VsrSW, VsrSD, int64_t)
-VMUL(ub, u8, VsrB, VsrH, uint16_t)
-VMUL(uh, u16, VsrH, VsrW, uint32_t)
-VMUL(uw, u32, VsrW, VsrD, uint64_t)
+ VMUL_DO_EVN(MULE##suffix, mul_element, mul_access, prod_access, cast) \
+ VMUL_DO_ODD(MULO##suffix, mul_element, mul_access, prod_access, cast)
+VMUL(SB, s8, VsrSB, VsrSH, int16_t)
+VMUL(SH, s16, VsrSH, VsrSW, int32_t)
+VMUL(SW, s32, VsrSW, VsrSD, int64_t)
+VMUL(UB, u8, VsrB, VsrH, uint16_t)
+VMUL(UH, u16, VsrH, VsrW, uint32_t)
+VMUL(UW, u32, VsrW, VsrD, uint64_t)
#undef VMUL_DO_EVN
#undef VMUL_DO_ODD
#undef VMUL
+void helper_VMULESD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+{
+ muls64(&r->VsrD(1), &r->VsrD(0), a->VsrSD(0), b->VsrSD(0));
+}
+void helper_VMULOSD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+{
+ muls64(&r->VsrD(1), &r->VsrD(0), a->VsrSD(1), b->VsrSD(1));
+}
+void helper_VMULEUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+{
+ mulu64(&r->VsrD(1), &r->VsrD(0), a->VsrD(0), b->VsrD(0));
+}
+void helper_VMULOUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+{
+ mulu64(&r->VsrD(1), &r->VsrD(0), a->VsrD(1), b->VsrD(1));
+}
void helper_vmulhsw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
{
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index d5e02fd7f2..430579addd 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -798,29 +798,11 @@ static void trans_vclzd(DisasContext *ctx)
tcg_temp_free_i64(avr);
}
-GEN_VXFORM(vmuloub, 4, 0);
-GEN_VXFORM(vmulouh, 4, 1);
-GEN_VXFORM(vmulouw, 4, 2);
GEN_VXFORM_V(vmuluwm, MO_32, tcg_gen_gvec_mul, 4, 2);
-GEN_VXFORM_DUAL(vmulouw, PPC_ALTIVEC, PPC_NONE,
- vmuluwm, PPC_NONE, PPC2_ALTIVEC_207)
-GEN_VXFORM(vmulosb, 4, 4);
-GEN_VXFORM(vmulosh, 4, 5);
-GEN_VXFORM(vmulosw, 4, 6);
GEN_VXFORM_V(vmulld, MO_64, tcg_gen_gvec_mul, 4, 7);
-GEN_VXFORM(vmuleub, 4, 8);
-GEN_VXFORM(vmuleuh, 4, 9);
-GEN_VXFORM(vmuleuw, 4, 10);
GEN_VXFORM(vmulhuw, 4, 10);
GEN_VXFORM(vmulhud, 4, 11);
-GEN_VXFORM_DUAL(vmuleuw, PPC_ALTIVEC, PPC_NONE,
- vmulhuw, PPC_NONE, PPC2_ISA310);
-GEN_VXFORM(vmulesb, 4, 12);
-GEN_VXFORM(vmulesh, 4, 13);
-GEN_VXFORM(vmulesw, 4, 14);
GEN_VXFORM(vmulhsw, 4, 14);
-GEN_VXFORM_DUAL(vmulesw, PPC_ALTIVEC, PPC_NONE,
- vmulhsw, PPC_NONE, PPC2_ISA310);
GEN_VXFORM(vmulhsd, 4, 15);
GEN_VXFORM_V(vslb, MO_8, tcg_gen_gvec_shlv, 2, 4);
GEN_VXFORM_V(vslh, MO_16, tcg_gen_gvec_shlv, 2, 5);
@@ -2104,6 +2086,40 @@ static bool trans_VPEXTD(DisasContext *ctx, arg_VX *a)
return true;
}
+static bool do_vx_helper(DisasContext *ctx, arg_VX *a,
+ void (*gen_helper) (TCGv_ptr, TCGv_ptr, TCGv_ptr))
+{
+ TCGv_ptr ra, rb, rd;
+ REQUIRE_VECTOR(ctx);
+
+ ra = gen_avr_ptr(a->vra);
+ rb = gen_avr_ptr(a->vrb);
+ rd = gen_avr_ptr(a->vrt);
+ gen_helper(rd, ra, rb);
+ tcg_temp_free_ptr(ra);
+ tcg_temp_free_ptr(rb);
+ tcg_temp_free_ptr(rd);
+
+ return true;
+}
+
+TRANS_FLAGS2(ALTIVEC_207, VMULESB, do_vx_helper, gen_helper_VMULESB)
+TRANS_FLAGS2(ALTIVEC_207, VMULOSB, do_vx_helper, gen_helper_VMULOSB)
+TRANS_FLAGS2(ALTIVEC_207, VMULEUB, do_vx_helper, gen_helper_VMULEUB)
+TRANS_FLAGS2(ALTIVEC_207, VMULOUB, do_vx_helper, gen_helper_VMULOUB)
+TRANS_FLAGS2(ALTIVEC_207, VMULESH, do_vx_helper, gen_helper_VMULESH)
+TRANS_FLAGS2(ALTIVEC_207, VMULOSH, do_vx_helper, gen_helper_VMULOSH)
+TRANS_FLAGS2(ALTIVEC_207, VMULEUH, do_vx_helper, gen_helper_VMULEUH)
+TRANS_FLAGS2(ALTIVEC_207, VMULOUH, do_vx_helper, gen_helper_VMULOUH)
+TRANS_FLAGS2(ALTIVEC_207, VMULESW, do_vx_helper, gen_helper_VMULESW)
+TRANS_FLAGS2(ALTIVEC_207, VMULOSW, do_vx_helper, gen_helper_VMULOSW)
+TRANS_FLAGS2(ALTIVEC_207, VMULEUW, do_vx_helper, gen_helper_VMULEUW)
+TRANS_FLAGS2(ALTIVEC_207, VMULOUW, do_vx_helper, gen_helper_VMULOUW)
+TRANS_FLAGS2(ISA310, VMULESD, do_vx_helper, gen_helper_VMULESD)
+TRANS_FLAGS2(ISA310, VMULOSD, do_vx_helper, gen_helper_VMULOSD)
+TRANS_FLAGS2(ISA310, VMULEUD, do_vx_helper, gen_helper_VMULEUD)
+TRANS_FLAGS2(ISA310, VMULOUD, do_vx_helper, gen_helper_VMULOUD)
+
#undef GEN_VR_LDX
#undef GEN_VR_STX
#undef GEN_VR_LVE
diff --git a/target/ppc/translate/vmx-ops.c.inc b/target/ppc/translate/vmx-ops.c.inc
index 25ee715b43..f310b2fbde 100644
--- a/target/ppc/translate/vmx-ops.c.inc
+++ b/target/ppc/translate/vmx-ops.c.inc
@@ -101,20 +101,11 @@ GEN_VXFORM_DUAL(vmrgow, vextuwlx, 6, 26, PPC_NONE, PPC2_ALTIVEC_207),
GEN_VXFORM_300(vextubrx, 6, 28),
GEN_VXFORM_300(vextuhrx, 6, 29),
GEN_VXFORM_DUAL(vmrgew, vextuwrx, 6, 30, PPC_NONE, PPC2_ALTIVEC_207),
-GEN_VXFORM(vmuloub, 4, 0),
-GEN_VXFORM(vmulouh, 4, 1),
-GEN_VXFORM_DUAL(vmulouw, vmuluwm, 4, 2, PPC_ALTIVEC, PPC_NONE),
-GEN_VXFORM(vmulosb, 4, 4),
-GEN_VXFORM(vmulosh, 4, 5),
-GEN_VXFORM_207(vmulosw, 4, 6),
+GEN_VXFORM_207(vmuluwm, 4, 2),
GEN_VXFORM_310(vmulld, 4, 7),
-GEN_VXFORM(vmuleub, 4, 8),
-GEN_VXFORM(vmuleuh, 4, 9),
-GEN_VXFORM_DUAL(vmuleuw, vmulhuw, 4, 10, PPC_ALTIVEC, PPC_NONE),
+GEN_VXFORM_310(vmulhuw, 4, 10),
GEN_VXFORM_310(vmulhud, 4, 11),
-GEN_VXFORM(vmulesb, 4, 12),
-GEN_VXFORM(vmulesh, 4, 13),
-GEN_VXFORM_DUAL(vmulesw, vmulhsw, 4, 14, PPC_ALTIVEC, PPC_NONE),
+GEN_VXFORM_310(vmulhsw, 4, 14),
GEN_VXFORM_310(vmulhsd, 4, 15),
GEN_VXFORM(vslb, 2, 4),
GEN_VXFORM(vslh, 2, 5),
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 9e79a7edee..b15f0fef0e 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -3905,3 +3905,9 @@ void tcg_register_jit(const void *buf, size_t buf_size)
tcg_register_jit_int(buf, buf_size, &debug_frame, sizeof(debug_frame));
}
#endif /* __ELF__ */
+#undef VMULEUB
+#undef VMULEUH
+#undef VMULEUW
+#undef VMULOUB
+#undef VMULOUH
+#undef VMULOUW
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 03/37] target/ppc: Moved vector multiply high and low to decodetree
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
2022-02-10 12:34 ` [PATCH v3 01/37] target/ppc: Introduce TRANS*FLAGS macros matheus.ferst
2022-02-10 12:34 ` [PATCH v3 02/37] target/ppc: moved vector even and odd multiplication to decodetree matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-11 3:41 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 04/37] target/ppc: vmulh* instructions use gvec matheus.ferst
` (33 subsequent siblings)
36 siblings, 1 reply; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: Lucas Mateus Castro (alqotel),
danielhb413, richard.henderson, groug, Lucas Mateus Castro, clg,
Matheus Ferst, david
From: "Lucas Mateus Castro (alqotel)" <lucas.castro@eldorado.org.br>
Moved instructions vmulld, vmulhuw, vmulhsw, vmulhud and vmulhsd to
decodetree
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/helper.h | 8 ++++----
target/ppc/insn32.decode | 6 ++++++
target/ppc/int_helper.c | 8 ++++----
target/ppc/translate/vmx-impl.c.inc | 21 ++++++++++++++++-----
target/ppc/translate/vmx-ops.c.inc | 5 -----
5 files changed, 30 insertions(+), 18 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index debb05c12c..1125a1704d 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -207,10 +207,10 @@ DEF_HELPER_3(VMULOUB, void, avr, avr, avr)
DEF_HELPER_3(VMULOUH, void, avr, avr, avr)
DEF_HELPER_3(VMULOUW, void, avr, avr, avr)
DEF_HELPER_3(VMULOUD, void, avr, avr, avr)
-DEF_HELPER_3(vmulhsw, void, avr, avr, avr)
-DEF_HELPER_3(vmulhuw, void, avr, avr, avr)
-DEF_HELPER_3(vmulhsd, void, avr, avr, avr)
-DEF_HELPER_3(vmulhud, void, avr, avr, avr)
+DEF_HELPER_3(VMULHSW, void, avr, avr, avr)
+DEF_HELPER_3(VMULHUW, void, avr, avr, avr)
+DEF_HELPER_3(VMULHSD, void, avr, avr, avr)
+DEF_HELPER_3(VMULHUD, void, avr, avr, avr)
DEF_HELPER_3(vslo, void, avr, avr, avr)
DEF_HELPER_3(vsro, void, avr, avr, avr)
DEF_HELPER_3(vsrv, void, avr, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index cab372fa44..4774548b3d 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -497,3 +497,9 @@ VMULESD 000100 ..... ..... ..... 01111001000 @VX
VMULOSD 000100 ..... ..... ..... 00111001000 @VX
VMULEUD 000100 ..... ..... ..... 01011001000 @VX
VMULOUD 000100 ..... ..... ..... 00011001000 @VX
+
+VMULHSW 000100 ..... ..... ..... 01110001001 @VX
+VMULHUW 000100 ..... ..... ..... 01010001001 @VX
+VMULHSD 000100 ..... ..... ..... 01111001001 @VX
+VMULHUD 000100 ..... ..... ..... 01011001001 @VX
+VMULLD 000100 ..... ..... ..... 00111001001 @VX
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 56b9e9369b..e134162fdd 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1200,7 +1200,7 @@ void helper_VMULOUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
mulu64(&r->VsrD(1), &r->VsrD(0), a->VsrD(1), b->VsrD(1));
}
-void helper_vmulhsw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+void helper_VMULHSW(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
{
int i;
@@ -1209,7 +1209,7 @@ void helper_vmulhsw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
}
}
-void helper_vmulhuw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+void helper_VMULHUW(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
{
int i;
@@ -1219,7 +1219,7 @@ void helper_vmulhuw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
}
}
-void helper_vmulhsd(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+void helper_VMULHSD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
{
uint64_t discard;
@@ -1227,7 +1227,7 @@ void helper_vmulhsd(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
muls64(&discard, &r->u64[1], a->s64[1], b->s64[1]);
}
-void helper_vmulhud(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+void helper_VMULHUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
{
uint64_t discard;
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 430579addd..62d0642226 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -799,11 +799,6 @@ static void trans_vclzd(DisasContext *ctx)
}
GEN_VXFORM_V(vmuluwm, MO_32, tcg_gen_gvec_mul, 4, 2);
-GEN_VXFORM_V(vmulld, MO_64, tcg_gen_gvec_mul, 4, 7);
-GEN_VXFORM(vmulhuw, 4, 10);
-GEN_VXFORM(vmulhud, 4, 11);
-GEN_VXFORM(vmulhsw, 4, 14);
-GEN_VXFORM(vmulhsd, 4, 15);
GEN_VXFORM_V(vslb, MO_8, tcg_gen_gvec_shlv, 2, 4);
GEN_VXFORM_V(vslh, MO_16, tcg_gen_gvec_shlv, 2, 5);
GEN_VXFORM_V(vslw, MO_32, tcg_gen_gvec_shlv, 2, 6);
@@ -2103,6 +2098,17 @@ static bool do_vx_helper(DisasContext *ctx, arg_VX *a,
return true;
}
+static bool trans_VMULLD(DisasContext *ctx, arg_VX *a)
+{
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ tcg_gen_gvec_mul(MO_64, avr_full_offset(a->vrt), avr_full_offset(a->vra),
+ avr_full_offset(a->vrb), 16, 16);
+
+ return true;
+}
+
TRANS_FLAGS2(ALTIVEC_207, VMULESB, do_vx_helper, gen_helper_VMULESB)
TRANS_FLAGS2(ALTIVEC_207, VMULOSB, do_vx_helper, gen_helper_VMULOSB)
TRANS_FLAGS2(ALTIVEC_207, VMULEUB, do_vx_helper, gen_helper_VMULEUB)
@@ -2120,6 +2126,11 @@ TRANS_FLAGS2(ISA310, VMULOSD, do_vx_helper, gen_helper_VMULOSD)
TRANS_FLAGS2(ISA310, VMULEUD, do_vx_helper, gen_helper_VMULEUD)
TRANS_FLAGS2(ISA310, VMULOUD, do_vx_helper, gen_helper_VMULOUD)
+TRANS_FLAGS2(ISA310, VMULHSW, do_vx_helper, gen_helper_VMULHSW)
+TRANS_FLAGS2(ISA310, VMULHSD, do_vx_helper, gen_helper_VMULHSD)
+TRANS_FLAGS2(ISA310, VMULHUW, do_vx_helper, gen_helper_VMULHUW)
+TRANS_FLAGS2(ISA310, VMULHUD, do_vx_helper, gen_helper_VMULHUD)
+
#undef GEN_VR_LDX
#undef GEN_VR_STX
#undef GEN_VR_LVE
diff --git a/target/ppc/translate/vmx-ops.c.inc b/target/ppc/translate/vmx-ops.c.inc
index f310b2fbde..914e68e5b0 100644
--- a/target/ppc/translate/vmx-ops.c.inc
+++ b/target/ppc/translate/vmx-ops.c.inc
@@ -102,11 +102,6 @@ GEN_VXFORM_300(vextubrx, 6, 28),
GEN_VXFORM_300(vextuhrx, 6, 29),
GEN_VXFORM_DUAL(vmrgew, vextuwrx, 6, 30, PPC_NONE, PPC2_ALTIVEC_207),
GEN_VXFORM_207(vmuluwm, 4, 2),
-GEN_VXFORM_310(vmulld, 4, 7),
-GEN_VXFORM_310(vmulhuw, 4, 10),
-GEN_VXFORM_310(vmulhud, 4, 11),
-GEN_VXFORM_310(vmulhsw, 4, 14),
-GEN_VXFORM_310(vmulhsd, 4, 15),
GEN_VXFORM(vslb, 2, 4),
GEN_VXFORM(vslh, 2, 5),
GEN_VXFORM_DUAL(vslw, vrlwnm, 2, 6, PPC_ALTIVEC, PPC_NONE),
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 04/37] target/ppc: vmulh* instructions use gvec
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (2 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 03/37] target/ppc: Moved vector multiply high and low " matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-11 3:51 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 05/37] target/ppc: Implement vmsumcud instruction matheus.ferst
` (32 subsequent siblings)
36 siblings, 1 reply; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: Lucas Mateus Castro (alqotel),
danielhb413, richard.henderson, groug, Lucas Mateus Castro, clg,
Matheus Ferst, david
From: "Lucas Mateus Castro (alqotel)" <lucas.castro@eldorado.org.br>
Changed vmulhuw, vmulhud, vmulhsw, vmulhsd to use
gvec instructions
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/helper.h | 8 +-
target/ppc/int_helper.c | 8 +-
target/ppc/translate/vmx-impl.c.inc | 154 +++++++++++++++++++++++++++-
3 files changed, 158 insertions(+), 12 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 1125a1704d..92595a42df 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -207,10 +207,10 @@ DEF_HELPER_3(VMULOUB, void, avr, avr, avr)
DEF_HELPER_3(VMULOUH, void, avr, avr, avr)
DEF_HELPER_3(VMULOUW, void, avr, avr, avr)
DEF_HELPER_3(VMULOUD, void, avr, avr, avr)
-DEF_HELPER_3(VMULHSW, void, avr, avr, avr)
-DEF_HELPER_3(VMULHUW, void, avr, avr, avr)
-DEF_HELPER_3(VMULHSD, void, avr, avr, avr)
-DEF_HELPER_3(VMULHUD, void, avr, avr, avr)
+DEF_HELPER_4(VMULHSW, void, avr, avr, avr, i32)
+DEF_HELPER_4(VMULHUW, void, avr, avr, avr, i32)
+DEF_HELPER_4(VMULHSD, void, avr, avr, avr, i32)
+DEF_HELPER_4(VMULHUD, void, avr, avr, avr, i32)
DEF_HELPER_3(vslo, void, avr, avr, avr)
DEF_HELPER_3(vsro, void, avr, avr, avr)
DEF_HELPER_3(vsrv, void, avr, avr, avr)
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index e134162fdd..79cde68f19 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1200,7 +1200,7 @@ void helper_VMULOUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
mulu64(&r->VsrD(1), &r->VsrD(0), a->VsrD(1), b->VsrD(1));
}
-void helper_VMULHSW(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+void helper_VMULHSW(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t desc)
{
int i;
@@ -1209,7 +1209,7 @@ void helper_VMULHSW(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
}
}
-void helper_VMULHUW(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+void helper_VMULHUW(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t desc)
{
int i;
@@ -1219,7 +1219,7 @@ void helper_VMULHUW(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
}
}
-void helper_VMULHSD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+void helper_VMULHSD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t desc)
{
uint64_t discard;
@@ -1227,7 +1227,7 @@ void helper_VMULHSD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
muls64(&discard, &r->u64[1], a->s64[1], b->s64[1]);
}
-void helper_VMULHUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+void helper_VMULHUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t desc)
{
uint64_t discard;
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 62d0642226..bed8df81c4 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -2126,10 +2126,156 @@ TRANS_FLAGS2(ISA310, VMULOSD, do_vx_helper, gen_helper_VMULOSD)
TRANS_FLAGS2(ISA310, VMULEUD, do_vx_helper, gen_helper_VMULEUD)
TRANS_FLAGS2(ISA310, VMULOUD, do_vx_helper, gen_helper_VMULOUD)
-TRANS_FLAGS2(ISA310, VMULHSW, do_vx_helper, gen_helper_VMULHSW)
-TRANS_FLAGS2(ISA310, VMULHSD, do_vx_helper, gen_helper_VMULHSD)
-TRANS_FLAGS2(ISA310, VMULHUW, do_vx_helper, gen_helper_VMULHUW)
-TRANS_FLAGS2(ISA310, VMULHUD, do_vx_helper, gen_helper_VMULHUD)
+static void do_vx_vmulhu_vec(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
+{
+ TCGv_vec a1, b1, mask, w, k;
+ unsigned bits;
+ bits = (vece == MO_32) ? 16 : 32;
+
+ a1 = tcg_temp_new_vec_matching(t);
+ b1 = tcg_temp_new_vec_matching(t);
+ w = tcg_temp_new_vec_matching(t);
+ k = tcg_temp_new_vec_matching(t);
+ mask = tcg_temp_new_vec_matching(t);
+
+ tcg_gen_dupi_vec(vece, mask, (vece == MO_32) ? 0xFFFF : 0xFFFFFFFF);
+ tcg_gen_and_vec(vece, a1, a, mask);
+ tcg_gen_and_vec(vece, b1, b, mask);
+ tcg_gen_mul_vec(vece, t, a1, b1);
+ tcg_gen_shri_vec(vece, k, t, bits);
+
+ tcg_gen_shri_vec(vece, a1, a, bits);
+ tcg_gen_mul_vec(vece, t, a1, b1);
+ tcg_gen_add_vec(vece, t, t, k);
+ tcg_gen_and_vec(vece, k, t, mask);
+ tcg_gen_shri_vec(vece, w, t, bits);
+
+ tcg_gen_and_vec(vece, a1, a, mask);
+ tcg_gen_shri_vec(vece, b1, b, bits);
+ tcg_gen_mul_vec(vece, t, a1, b1);
+ tcg_gen_add_vec(vece, t, t, k);
+ tcg_gen_shri_vec(vece, k, t, bits);
+
+ tcg_gen_shri_vec(vece, a1, a, bits);
+ tcg_gen_mul_vec(vece, t, a1, b1);
+ tcg_gen_add_vec(vece, t, t, w);
+ tcg_gen_add_vec(vece, t, t, k);
+
+ tcg_temp_free_vec(a1);
+ tcg_temp_free_vec(b1);
+ tcg_temp_free_vec(w);
+ tcg_temp_free_vec(k);
+ tcg_temp_free_vec(mask);
+}
+
+static bool do_vx_mulhu(DisasContext *ctx, arg_VX *a, unsigned vece)
+{
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_mul_vec, INDEX_op_add_vec, INDEX_op_shri_vec, 0
+ };
+
+ static const GVecGen3 op[2] = {
+ {
+ .fniv = do_vx_vmulhu_vec,
+ .fno = gen_helper_VMULHUW,
+ .opt_opc = vecop_list,
+ .vece = MO_32
+ },
+ {
+ .fniv = do_vx_vmulhu_vec,
+ .fno = gen_helper_VMULHUD,
+ .opt_opc = vecop_list,
+ .vece = MO_64
+ },
+ };
+
+ tcg_gen_gvec_3(avr_full_offset(a->vrt), avr_full_offset(a->vra),
+ avr_full_offset(a->vrb), 16, 16, &op[vece - MO_32]);
+
+ return true;
+
+}
+
+static void do_vx_vmulhs_vec(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
+{
+ TCGv_vec a1, b1, mask, w, k;
+ unsigned bits;
+ bits = (vece == MO_32) ? 16 : 32;
+
+ a1 = tcg_temp_new_vec_matching(t);
+ b1 = tcg_temp_new_vec_matching(t);
+ w = tcg_temp_new_vec_matching(t);
+ k = tcg_temp_new_vec_matching(t);
+ mask = tcg_temp_new_vec_matching(t);
+
+ tcg_gen_dupi_vec(vece, mask, (vece == MO_32) ? 0xFFFF : 0xFFFFFFFF);
+ tcg_gen_and_vec(vece, a1, a, mask);
+ tcg_gen_and_vec(vece, b1, b, mask);
+ tcg_gen_mul_vec(vece, t, a1, b1);
+ tcg_gen_shri_vec(vece, k, t, bits);
+
+ tcg_gen_sari_vec(vece, a1, a, bits);
+ tcg_gen_mul_vec(vece, t, a1, b1);
+ tcg_gen_add_vec(vece, t, t, k);
+ tcg_gen_and_vec(vece, k, t, mask);
+ tcg_gen_sari_vec(vece, w, t, bits);
+
+ tcg_gen_and_vec(vece, a1, a, mask);
+ tcg_gen_sari_vec(vece, b1, b, bits);
+ tcg_gen_mul_vec(vece, t, a1, b1);
+ tcg_gen_add_vec(vece, t, t, k);
+ tcg_gen_sari_vec(vece, k, t, bits);
+
+ tcg_gen_sari_vec(vece, a1, a, bits);
+ tcg_gen_mul_vec(vece, t, a1, b1);
+ tcg_gen_add_vec(vece, t, t, w);
+ tcg_gen_add_vec(vece, t, t, k);
+
+ tcg_temp_free_vec(a1);
+ tcg_temp_free_vec(b1);
+ tcg_temp_free_vec(w);
+ tcg_temp_free_vec(k);
+ tcg_temp_free_vec(mask);
+}
+
+static bool do_vx_mulhs(DisasContext *ctx, arg_VX *a, unsigned vece)
+{
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_mul_vec, INDEX_op_add_vec, INDEX_op_shri_vec,
+ INDEX_op_sari_vec, 0
+ };
+
+ static const GVecGen3 op[2] = {
+ {
+ .fniv = do_vx_vmulhs_vec,
+ .fno = gen_helper_VMULHSW,
+ .opt_opc = vecop_list,
+ .vece = MO_32
+ },
+ {
+ .fniv = do_vx_vmulhs_vec,
+ .fno = gen_helper_VMULHSD,
+ .opt_opc = vecop_list,
+ .vece = MO_64
+ },
+ };
+
+ tcg_gen_gvec_3(avr_full_offset(a->vrt), avr_full_offset(a->vra),
+ avr_full_offset(a->vrb), 16, 16, &op[vece - MO_32]);
+
+ return true;
+}
+
+TRANS(VMULHSW, do_vx_mulhs, MO_32)
+TRANS(VMULHSD, do_vx_mulhs, MO_64)
+TRANS(VMULHUW, do_vx_mulhu, MO_32)
+TRANS(VMULHUD, do_vx_mulhu, MO_64)
#undef GEN_VR_LDX
#undef GEN_VR_STX
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 05/37] target/ppc: Implement vmsumcud instruction
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (3 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 04/37] target/ppc: vmulh* instructions use gvec matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-11 4:05 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 06/37] target/ppc: Implement vmsumudm instruction matheus.ferst
` (31 subsequent siblings)
36 siblings, 1 reply; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Víctor Colombo, clg,
Matheus Ferst, david
From: Víctor Colombo <victor.colombo@eldorado.org.br>
Based on [1] by Lijun Pan <ljp@linux.ibm.com>, which was never merged
into master.
[1]: https://lists.gnu.org/archive/html/qemu-ppc/2020-07/msg00419.html
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 4 +++
target/ppc/translate/vmx-impl.c.inc | 53 +++++++++++++++++++++++++++++
2 files changed, 57 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 4774548b3d..0ec64cb4f4 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -440,6 +440,10 @@ VEXTRACTWM 000100 ..... 01010 ..... 11001000010 @VX_tb
VEXTRACTDM 000100 ..... 01011 ..... 11001000010 @VX_tb
VEXTRACTQM 000100 ..... 01100 ..... 11001000010 @VX_tb
+## Vector Multiply-Sum Instructions
+
+VMSUMCUD 000100 ..... ..... ..... ..... 010111 @VA
+
# VSX Load/Store Instructions
LXV 111101 ..... ..... ............ . 001 @DQ_TSX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index bed8df81c4..694da75448 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -2081,6 +2081,59 @@ static bool trans_VPEXTD(DisasContext *ctx, arg_VX *a)
return true;
}
+static bool trans_VMSUMCUD(DisasContext *ctx, arg_VA *a)
+{
+ TCGv_i64 tmp0, tmp1, prod1h, prod1l, prod0h, prod0l, zero;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ tmp0 = tcg_temp_new_i64();
+ tmp1 = tcg_temp_new_i64();
+ prod1h = tcg_temp_new_i64();
+ prod1l = tcg_temp_new_i64();
+ prod0h = tcg_temp_new_i64();
+ prod0l = tcg_temp_new_i64();
+ zero = tcg_constant_i64(0);
+
+ /* prod1 = vsr[vra+32].dw[1] * vsr[vrb+32].dw[1] */
+ get_avr64(tmp0, a->vra, false);
+ get_avr64(tmp1, a->vrb, false);
+ tcg_gen_mulu2_i64(prod1l, prod1h, tmp0, tmp1);
+
+ /* prod0 = vsr[vra+32].dw[0] * vsr[vrb+32].dw[0] */
+ get_avr64(tmp0, a->vra, true);
+ get_avr64(tmp1, a->vrb, true);
+ tcg_gen_mulu2_i64(prod0l, prod0h, tmp0, tmp1);
+
+ /* Sum lower 64-bits elements */
+ get_avr64(tmp1, a->rc, false);
+ tcg_gen_add2_i64(tmp1, tmp0, tmp1, zero, prod1l, zero);
+ tcg_gen_add2_i64(tmp1, tmp0, tmp1, tmp0, prod0l, zero);
+
+ /*
+ * Discard lower 64-bits, leaving the carry into bit 64.
+ * Then sum the higher 64-bit elements.
+ */
+ tcg_gen_mov_i64(tmp1, tmp0);
+ get_avr64(tmp0, a->rc, true);
+ tcg_gen_add2_i64(tmp1, tmp0, tmp0, zero, prod1h, zero);
+ tcg_gen_add2_i64(tmp1, tmp0, tmp1, tmp0, prod0h, zero);
+
+ /* Discard 64 more bits to complete the CHOP128(temp >> 128) */
+ set_avr64(a->vrt, tmp0, false);
+ set_avr64(a->vrt, zero, true);
+
+ tcg_temp_free_i64(tmp0);
+ tcg_temp_free_i64(tmp1);
+ tcg_temp_free_i64(prod1h);
+ tcg_temp_free_i64(prod1l);
+ tcg_temp_free_i64(prod0h);
+ tcg_temp_free_i64(prod0l);
+
+ return true;
+}
+
static bool do_vx_helper(DisasContext *ctx, arg_VX *a,
void (*gen_helper) (TCGv_ptr, TCGv_ptr, TCGv_ptr))
{
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 06/37] target/ppc: Implement vmsumudm instruction
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (4 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 05/37] target/ppc: Implement vmsumcud instruction matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-11 4:07 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 07/37] target/ppc: Move vexts[bhw]2[wd] to decodetree matheus.ferst
` (30 subsequent siblings)
36 siblings, 1 reply; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Víctor Colombo, clg,
Matheus Ferst, david
From: Víctor Colombo <victor.colombo@eldorado.org.br>
Based on [1] by Lijun Pan <ljp@linux.ibm.com>, which was never merged
into master.
[1]: https://lists.gnu.org/archive/html/qemu-ppc/2020-07/msg00419.html
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 1 +
target/ppc/translate/vmx-impl.c.inc | 34 +++++++++++++++++++++++++++++
2 files changed, 35 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 0ec64cb4f4..c4796260b6 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -443,6 +443,7 @@ VEXTRACTQM 000100 ..... 01100 ..... 11001000010 @VX_tb
## Vector Multiply-Sum Instructions
VMSUMCUD 000100 ..... ..... ..... ..... 010111 @VA
+VMSUMUDM 000100 ..... ..... ..... ..... 100011 @VA
# VSX Load/Store Instructions
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 694da75448..b7559cf94c 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -2081,6 +2081,40 @@ static bool trans_VPEXTD(DisasContext *ctx, arg_VX *a)
return true;
}
+static bool trans_VMSUMUDM(DisasContext *ctx, arg_VA *a)
+{
+ TCGv_i64 rl, rh, src1, src2;
+ int dw;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+ REQUIRE_VECTOR(ctx);
+
+ rh = tcg_temp_new_i64();
+ rl = tcg_temp_new_i64();
+ src1 = tcg_temp_new_i64();
+ src2 = tcg_temp_new_i64();
+
+ get_avr64(rl, a->rc, false);
+ get_avr64(rh, a->rc, true);
+
+ for (dw = 0; dw < 2; dw++) {
+ get_avr64(src1, a->vra, dw);
+ get_avr64(src2, a->vrb, dw);
+ tcg_gen_mulu2_i64(src1, src2, src1, src2);
+ tcg_gen_add2_i64(rl, rh, rl, rh, src1, src2);
+ }
+
+ set_avr64(a->vrt, rl, false);
+ set_avr64(a->vrt, rh, true);
+
+ tcg_temp_free_i64(rl);
+ tcg_temp_free_i64(rh);
+ tcg_temp_free_i64(src1);
+ tcg_temp_free_i64(src2);
+
+ return true;
+}
+
static bool trans_VMSUMCUD(DisasContext *ctx, arg_VA *a)
{
TCGv_i64 tmp0, tmp1, prod1h, prod1l, prod0h, prod0l, zero;
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 07/37] target/ppc: Move vexts[bhw]2[wd] to decodetree
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (5 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 06/37] target/ppc: Implement vmsumudm instruction matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-11 4:14 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 08/37] target/ppc: Implement vextsd2q matheus.ferst
` (29 subsequent siblings)
36 siblings, 1 reply; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst,
Lucas Coutinho, david
From: Lucas Coutinho <lucas.coutinho@eldorado.org.br>
Move the following instructions to decodetree:
vextsb2w: Vector Extend Sign Byte To Word
vextsh2w: Vector Extend Sign Halfword To Word
vextsb2d: Vector Extend Sign Byte To Doubleword
vextsh2d: Vector Extend Sign Halfword To Doubleword
vextsw2d: Vector Extend Sign Word To Doubleword
Signed-off-by: Lucas Coutinho <lucas.coutinho@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/helper.h | 5 -----
target/ppc/insn32.decode | 8 ++++++++
target/ppc/int_helper.c | 15 ---------------
target/ppc/translate/vmx-impl.c.inc | 25 ++++++++++++++++++++-----
target/ppc/translate/vmx-ops.c.inc | 5 -----
5 files changed, 28 insertions(+), 30 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 92595a42df..0084080fad 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -249,11 +249,6 @@ DEF_HELPER_4(VINSBLX, void, env, avr, i64, tl)
DEF_HELPER_4(VINSHLX, void, env, avr, i64, tl)
DEF_HELPER_4(VINSWLX, void, env, avr, i64, tl)
DEF_HELPER_4(VINSDLX, void, env, avr, i64, tl)
-DEF_HELPER_2(vextsb2w, void, avr, avr)
-DEF_HELPER_2(vextsh2w, void, avr, avr)
-DEF_HELPER_2(vextsb2d, void, avr, avr)
-DEF_HELPER_2(vextsh2d, void, avr, avr)
-DEF_HELPER_2(vextsw2d, void, avr, avr)
DEF_HELPER_2(vnegw, void, avr, avr)
DEF_HELPER_2(vnegd, void, avr, avr)
DEF_HELPER_2(vupkhpx, void, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index c4796260b6..757791f0ac 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -419,6 +419,14 @@ VINSWVRX 000100 ..... ..... ..... 00110001111 @VX
VSLDBI 000100 ..... ..... ..... 00 ... 010110 @VN
VSRDBI 000100 ..... ..... ..... 01 ... 010110 @VN
+## Vector Integer Arithmetic Instructions
+
+VEXTSB2W 000100 ..... 10000 ..... 11000000010 @VX_tb
+VEXTSH2W 000100 ..... 10001 ..... 11000000010 @VX_tb
+VEXTSB2D 000100 ..... 11000 ..... 11000000010 @VX_tb
+VEXTSH2D 000100 ..... 11001 ..... 11000000010 @VX_tb
+VEXTSW2D 000100 ..... 11010 ..... 11000000010 @VX_tb
+
## Vector Mask Manipulation Instructions
MTVSRBM 000100 ..... 10000 ..... 11001000010 @VX_tb
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 79cde68f19..630fbc579a 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1768,21 +1768,6 @@ XXBLEND(W, 32)
XXBLEND(D, 64)
#undef XXBLEND
-#define VEXT_SIGNED(name, element, cast) \
-void helper_##name(ppc_avr_t *r, ppc_avr_t *b) \
-{ \
- int i; \
- for (i = 0; i < ARRAY_SIZE(r->element); i++) { \
- r->element[i] = (cast)b->element[i]; \
- } \
-}
-VEXT_SIGNED(vextsb2w, s32, int8_t)
-VEXT_SIGNED(vextsb2d, s64, int8_t)
-VEXT_SIGNED(vextsh2w, s32, int16_t)
-VEXT_SIGNED(vextsh2d, s64, int16_t)
-VEXT_SIGNED(vextsw2d, s64, int32_t)
-#undef VEXT_SIGNED
-
#define VNEG(name, element) \
void helper_##name(ppc_avr_t *r, ppc_avr_t *b) \
{ \
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index b7559cf94c..ec782c47ff 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1772,11 +1772,26 @@ GEN_VXFORM_TRANS(vclzw, 1, 30)
GEN_VXFORM_TRANS(vclzd, 1, 31)
GEN_VXFORM_NOA_2(vnegw, 1, 24, 6)
GEN_VXFORM_NOA_2(vnegd, 1, 24, 7)
-GEN_VXFORM_NOA_2(vextsb2w, 1, 24, 16)
-GEN_VXFORM_NOA_2(vextsh2w, 1, 24, 17)
-GEN_VXFORM_NOA_2(vextsb2d, 1, 24, 24)
-GEN_VXFORM_NOA_2(vextsh2d, 1, 24, 25)
-GEN_VXFORM_NOA_2(vextsw2d, 1, 24, 26)
+
+static bool do_vexts(DisasContext *ctx, arg_VX_tb *a, int vece, int s)
+{
+ REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+ REQUIRE_VECTOR(ctx);
+
+ tcg_gen_gvec_shli(vece, avr_full_offset(a->vrt), avr_full_offset(a->vrb),
+ s, 16, 16);
+ tcg_gen_gvec_sari(vece, avr_full_offset(a->vrt), avr_full_offset(a->vrt),
+ s, 16, 16);
+
+ return true;
+}
+
+TRANS(VEXTSB2W, do_vexts, MO_32, 24);
+TRANS(VEXTSH2W, do_vexts, MO_32, 16);
+TRANS(VEXTSB2D, do_vexts, MO_64, 56);
+TRANS(VEXTSH2D, do_vexts, MO_64, 48);
+TRANS(VEXTSW2D, do_vexts, MO_64, 32);
+
GEN_VXFORM_NOA_2(vctzb, 1, 24, 28)
GEN_VXFORM_NOA_2(vctzh, 1, 24, 29)
GEN_VXFORM_NOA_2(vctzw, 1, 24, 30)
diff --git a/target/ppc/translate/vmx-ops.c.inc b/target/ppc/translate/vmx-ops.c.inc
index 914e68e5b0..6787327f56 100644
--- a/target/ppc/translate/vmx-ops.c.inc
+++ b/target/ppc/translate/vmx-ops.c.inc
@@ -216,11 +216,6 @@ GEN_VXFORM(vspltish, 6, 13),
GEN_VXFORM(vspltisw, 6, 14),
GEN_VXFORM_300_EO(vnegw, 0x01, 0x18, 0x06),
GEN_VXFORM_300_EO(vnegd, 0x01, 0x18, 0x07),
-GEN_VXFORM_300_EO(vextsb2w, 0x01, 0x18, 0x10),
-GEN_VXFORM_300_EO(vextsh2w, 0x01, 0x18, 0x11),
-GEN_VXFORM_300_EO(vextsb2d, 0x01, 0x18, 0x18),
-GEN_VXFORM_300_EO(vextsh2d, 0x01, 0x18, 0x19),
-GEN_VXFORM_300_EO(vextsw2d, 0x01, 0x18, 0x1A),
GEN_VXFORM_300_EO(vctzb, 0x01, 0x18, 0x1C),
GEN_VXFORM_300_EO(vctzh, 0x01, 0x18, 0x1D),
GEN_VXFORM_300_EO(vctzw, 0x01, 0x18, 0x1E),
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 08/37] target/ppc: Implement vextsd2q
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (6 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 07/37] target/ppc: Move vexts[bhw]2[wd] to decodetree matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-11 4:15 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 09/37] target/ppc: Move Vector Compare Equal/Not Equal/Greater Than to decodetree matheus.ferst
` (28 subsequent siblings)
36 siblings, 1 reply; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst,
Lucas Coutinho, david
From: Lucas Coutinho <lucas.coutinho@eldorado.org.br>
Signed-off-by: Lucas Coutinho <lucas.coutinho@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 1 +
target/ppc/translate/vmx-impl.c.inc | 18 ++++++++++++++++++
2 files changed, 19 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 757791f0ac..1c935eeac8 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -426,6 +426,7 @@ VEXTSH2W 000100 ..... 10001 ..... 11000000010 @VX_tb
VEXTSB2D 000100 ..... 11000 ..... 11000000010 @VX_tb
VEXTSH2D 000100 ..... 11001 ..... 11000000010 @VX_tb
VEXTSW2D 000100 ..... 11010 ..... 11000000010 @VX_tb
+VEXTSD2Q 000100 ..... 11011 ..... 11000000010 @VX_tb
## Vector Mask Manipulation Instructions
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index ec782c47ff..99606a8718 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1792,6 +1792,24 @@ TRANS(VEXTSB2D, do_vexts, MO_64, 56);
TRANS(VEXTSH2D, do_vexts, MO_64, 48);
TRANS(VEXTSW2D, do_vexts, MO_64, 32);
+static bool trans_VEXTSD2Q(DisasContext *ctx, arg_VX_tb *a)
+{
+ TCGv_i64 tmp;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ tmp = tcg_temp_new_i64();
+
+ get_avr64(tmp, a->vrb, false);
+ set_avr64(a->vrt, tmp, false);
+ tcg_gen_sari_i64(tmp, tmp, 63);
+ set_avr64(a->vrt, tmp, true);
+
+ tcg_temp_free_i64(tmp);
+ return true;
+}
+
GEN_VXFORM_NOA_2(vctzb, 1, 24, 28)
GEN_VXFORM_NOA_2(vctzh, 1, 24, 29)
GEN_VXFORM_NOA_2(vctzw, 1, 24, 30)
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 09/37] target/ppc: Move Vector Compare Equal/Not Equal/Greater Than to decodetree
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (7 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 08/37] target/ppc: Implement vextsd2q matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-11 4:27 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 10/37] target/ppc: Move Vector Compare Not Equal or Zero " matheus.ferst
` (27 subsequent siblings)
36 siblings, 1 reply; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/helper.h | 30 ----------
target/ppc/insn32.decode | 24 ++++++++
target/ppc/int_helper.c | 54 -----------------
target/ppc/translate/vmx-impl.c.inc | 91 ++++++++++++++++++++---------
target/ppc/translate/vmx-ops.c.inc | 15 +----
5 files changed, 90 insertions(+), 124 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 0084080fad..4f0f3e3a08 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -141,46 +141,16 @@ DEF_HELPER_3(vabsduw, void, avr, avr, avr)
DEF_HELPER_3(vavgsb, void, avr, avr, avr)
DEF_HELPER_3(vavgsh, void, avr, avr, avr)
DEF_HELPER_3(vavgsw, void, avr, avr, avr)
-DEF_HELPER_4(vcmpequb, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpequh, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpequw, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpequd, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpneb, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpneh, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpnew, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpnezb, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpnezh, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpnezw, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtub, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtuh, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtuw, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtud, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsb, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsh, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsw, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsd, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpeqfp, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpgefp, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpgtfp, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpbfp, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpequb_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpequh_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpequw_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpequd_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpneb_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpneh_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpnew_dot, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpnezb_dot, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpnezh_dot, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpnezw_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtub_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtuh_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtuw_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtud_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsb_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsh_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsw_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsd_dot, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpeqfp_dot, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpgefp_dot, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpgtfp_dot, void, env, avr, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 1c935eeac8..bfb36e6969 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -51,6 +51,9 @@
&VA vrt vra vrb rc
@VA ...... vrt:5 vra:5 vrb:5 rc:5 ...... &VA
+&VC vrt vra vrb rc:bool
+@VC ...... vrt:5 vra:5 vrb:5 rc:1 .......... &VC
+
&VN vrt vra vrb sh
@VN ...... vrt:5 vra:5 vrb:5 .. sh:3 ...... &VN
@@ -373,6 +376,27 @@ DSCLIQ 111111 ..... ..... ...... 001000010 . @Z22_tap_sh_rc
DSCRI 111011 ..... ..... ...... 001100010 . @Z22_ta_sh_rc
DSCRIQ 111111 ..... ..... ...... 001100010 . @Z22_tap_sh_rc
+## Vector Integer Instructions
+
+VCMPEQUB 000100 ..... ..... ..... . 0000000110 @VC
+VCMPEQUH 000100 ..... ..... ..... . 0001000110 @VC
+VCMPEQUW 000100 ..... ..... ..... . 0010000110 @VC
+VCMPEQUD 000100 ..... ..... ..... . 0011000111 @VC
+
+VCMPGTSB 000100 ..... ..... ..... . 1100000110 @VC
+VCMPGTSH 000100 ..... ..... ..... . 1101000110 @VC
+VCMPGTSW 000100 ..... ..... ..... . 1110000110 @VC
+VCMPGTSD 000100 ..... ..... ..... . 1111000111 @VC
+
+VCMPGTUB 000100 ..... ..... ..... . 1000000110 @VC
+VCMPGTUH 000100 ..... ..... ..... . 1001000110 @VC
+VCMPGTUW 000100 ..... ..... ..... . 1010000110 @VC
+VCMPGTUD 000100 ..... ..... ..... . 1011000111 @VC
+
+VCMPNEB 000100 ..... ..... ..... . 0000000111 @VC
+VCMPNEH 000100 ..... ..... ..... . 0001000111 @VC
+VCMPNEW 000100 ..... ..... ..... . 0010000111 @VC
+
## Vector Bit Manipulation Instruction
VCFUGED 000100 ..... ..... ..... 10101001101 @VX
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 630fbc579a..83e718ab0e 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -749,57 +749,6 @@ VCF(ux, uint32_to_float32, u32)
VCF(sx, int32_to_float32, s32)
#undef VCF
-#define VCMP_DO(suffix, compare, element, record) \
- void helper_vcmp##suffix(CPUPPCState *env, ppc_avr_t *r, \
- ppc_avr_t *a, ppc_avr_t *b) \
- { \
- uint64_t ones = (uint64_t)-1; \
- uint64_t all = ones; \
- uint64_t none = 0; \
- int i; \
- \
- for (i = 0; i < ARRAY_SIZE(r->element); i++) { \
- uint64_t result = (a->element[i] compare b->element[i] ? \
- ones : 0x0); \
- switch (sizeof(a->element[0])) { \
- case 8: \
- r->u64[i] = result; \
- break; \
- case 4: \
- r->u32[i] = result; \
- break; \
- case 2: \
- r->u16[i] = result; \
- break; \
- case 1: \
- r->u8[i] = result; \
- break; \
- } \
- all &= result; \
- none |= result; \
- } \
- if (record) { \
- env->crf[6] = ((all != 0) << 3) | ((none == 0) << 1); \
- } \
- }
-#define VCMP(suffix, compare, element) \
- VCMP_DO(suffix, compare, element, 0) \
- VCMP_DO(suffix##_dot, compare, element, 1)
-VCMP(equb, ==, u8)
-VCMP(equh, ==, u16)
-VCMP(equw, ==, u32)
-VCMP(equd, ==, u64)
-VCMP(gtub, >, u8)
-VCMP(gtuh, >, u16)
-VCMP(gtuw, >, u32)
-VCMP(gtud, >, u64)
-VCMP(gtsb, >, s8)
-VCMP(gtsh, >, s16)
-VCMP(gtsw, >, s32)
-VCMP(gtsd, >, s64)
-#undef VCMP_DO
-#undef VCMP
-
#define VCMPNE_DO(suffix, element, etype, cmpzero, record) \
void helper_vcmpne##suffix(CPUPPCState *env, ppc_avr_t *r, \
ppc_avr_t *a, ppc_avr_t *b) \
@@ -838,9 +787,6 @@ void helper_vcmpne##suffix(CPUPPCState *env, ppc_avr_t *r, \
VCMPNE(zb, u8, uint8_t, 1)
VCMPNE(zh, u16, uint16_t, 1)
VCMPNE(zw, u32, uint32_t, 1)
-VCMPNE(b, u8, uint8_t, 0)
-VCMPNE(h, u16, uint16_t, 0)
-VCMPNE(w, u32, uint32_t, 0)
#undef VCMPNE_DO
#undef VCMPNE
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 99606a8718..a32ad92195 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -985,41 +985,76 @@ static void glue(gen_, name0##_##name1)(DisasContext *ctx) \
} \
}
-GEN_VXRFORM(vcmpequb, 3, 0)
-GEN_VXRFORM(vcmpequh, 3, 1)
-GEN_VXRFORM(vcmpequw, 3, 2)
-GEN_VXRFORM(vcmpequd, 3, 3)
GEN_VXRFORM(vcmpnezb, 3, 4)
GEN_VXRFORM(vcmpnezh, 3, 5)
GEN_VXRFORM(vcmpnezw, 3, 6)
-GEN_VXRFORM(vcmpgtsb, 3, 12)
-GEN_VXRFORM(vcmpgtsh, 3, 13)
-GEN_VXRFORM(vcmpgtsw, 3, 14)
-GEN_VXRFORM(vcmpgtsd, 3, 15)
-GEN_VXRFORM(vcmpgtub, 3, 8)
-GEN_VXRFORM(vcmpgtuh, 3, 9)
-GEN_VXRFORM(vcmpgtuw, 3, 10)
-GEN_VXRFORM(vcmpgtud, 3, 11)
+
+static void do_vcmp_rc(int vrt)
+{
+ TCGv_i64 t0, t1;
+
+ t0 = tcg_temp_new_i64();
+ t1 = tcg_temp_new_i64();
+
+ get_avr64(t0, vrt, true);
+ tcg_gen_ctpop_i64(t1, t0);
+ get_avr64(t0, vrt, false);
+ tcg_gen_ctpop_i64(t0, t0);
+ tcg_gen_add_i64(t1, t0, t1);
+
+ tcg_gen_setcondi_i64(TCG_COND_EQ, t0, t1, 0);
+ tcg_gen_shli_i64(t0, t0, 1);
+
+ tcg_gen_setcondi_i64(TCG_COND_EQ, t1, t1, 128);
+ tcg_gen_shli_i64(t1, t1, 3);
+
+ tcg_gen_or_i64(t0, t0, t1);
+ tcg_gen_extrl_i64_i32(cpu_crf[6], t0);
+
+ tcg_temp_free_i64(t0);
+ tcg_temp_free_i64(t1);
+}
+
+static bool do_vcmp(DisasContext *ctx, arg_VC *a, TCGCond cond, int vece)
+{
+ REQUIRE_VECTOR(ctx);
+
+ tcg_gen_gvec_cmp(cond, vece, avr_full_offset(a->vrt),
+ avr_full_offset(a->vra), avr_full_offset(a->vrb), 16, 16);
+ tcg_gen_gvec_shli(vece, avr_full_offset(a->vrt), avr_full_offset(a->vrt),
+ (8 << vece) - 1, 16, 16);
+ tcg_gen_gvec_sari(vece, avr_full_offset(a->vrt), avr_full_offset(a->vrt),
+ (8 << vece) - 1, 16, 16);
+
+ if (a->rc) {
+ do_vcmp_rc(a->vrt);
+ }
+
+ return true;
+}
+
+TRANS_FLAGS(ALTIVEC, VCMPEQUB, do_vcmp, TCG_COND_EQ, MO_8)
+TRANS_FLAGS(ALTIVEC, VCMPEQUH, do_vcmp, TCG_COND_EQ, MO_16)
+TRANS_FLAGS(ALTIVEC, VCMPEQUW, do_vcmp, TCG_COND_EQ, MO_32)
+TRANS_FLAGS2(ALTIVEC_207, VCMPEQUD, do_vcmp, TCG_COND_EQ, MO_64)
+
+TRANS_FLAGS(ALTIVEC, VCMPGTSB, do_vcmp, TCG_COND_GT, MO_8)
+TRANS_FLAGS(ALTIVEC, VCMPGTSH, do_vcmp, TCG_COND_GT, MO_16)
+TRANS_FLAGS(ALTIVEC, VCMPGTSW, do_vcmp, TCG_COND_GT, MO_32)
+TRANS_FLAGS2(ALTIVEC_207, VCMPGTSD, do_vcmp, TCG_COND_GT, MO_64)
+TRANS_FLAGS(ALTIVEC, VCMPGTUB, do_vcmp, TCG_COND_GTU, MO_8)
+TRANS_FLAGS(ALTIVEC, VCMPGTUH, do_vcmp, TCG_COND_GTU, MO_16)
+TRANS_FLAGS(ALTIVEC, VCMPGTUW, do_vcmp, TCG_COND_GTU, MO_32)
+TRANS_FLAGS2(ALTIVEC_207, VCMPGTUD, do_vcmp, TCG_COND_GTU, MO_64)
+
+TRANS_FLAGS2(ISA300, VCMPNEB, do_vcmp, TCG_COND_NE, MO_8)
+TRANS_FLAGS2(ISA300, VCMPNEH, do_vcmp, TCG_COND_NE, MO_16)
+TRANS_FLAGS2(ISA300, VCMPNEW, do_vcmp, TCG_COND_NE, MO_32)
+
GEN_VXRFORM(vcmpeqfp, 3, 3)
GEN_VXRFORM(vcmpgefp, 3, 7)
GEN_VXRFORM(vcmpgtfp, 3, 11)
GEN_VXRFORM(vcmpbfp, 3, 15)
-GEN_VXRFORM(vcmpneb, 3, 0)
-GEN_VXRFORM(vcmpneh, 3, 1)
-GEN_VXRFORM(vcmpnew, 3, 2)
-
-GEN_VXRFORM_DUAL(vcmpequb, PPC_ALTIVEC, PPC_NONE, \
- vcmpneb, PPC_NONE, PPC2_ISA300)
-GEN_VXRFORM_DUAL(vcmpequh, PPC_ALTIVEC, PPC_NONE, \
- vcmpneh, PPC_NONE, PPC2_ISA300)
-GEN_VXRFORM_DUAL(vcmpequw, PPC_ALTIVEC, PPC_NONE, \
- vcmpnew, PPC_NONE, PPC2_ISA300)
-GEN_VXRFORM_DUAL(vcmpeqfp, PPC_ALTIVEC, PPC_NONE, \
- vcmpequd, PPC_NONE, PPC2_ALTIVEC_207)
-GEN_VXRFORM_DUAL(vcmpbfp, PPC_ALTIVEC, PPC_NONE, \
- vcmpgtsd, PPC_NONE, PPC2_ALTIVEC_207)
-GEN_VXRFORM_DUAL(vcmpgtfp, PPC_ALTIVEC, PPC_NONE, \
- vcmpgtud, PPC_NONE, PPC2_ALTIVEC_207)
static void gen_vsplti(DisasContext *ctx, int vece)
{
diff --git a/target/ppc/translate/vmx-ops.c.inc b/target/ppc/translate/vmx-ops.c.inc
index 6787327f56..80d460c34e 100644
--- a/target/ppc/translate/vmx-ops.c.inc
+++ b/target/ppc/translate/vmx-ops.c.inc
@@ -187,19 +187,10 @@ GEN_HANDLER2_E(name, str, 0x4, opc2, opc3, 0x00000000, PPC_NONE, PPC2_ISA300),
GEN_VXRFORM_300(vcmpnezb, 3, 4)
GEN_VXRFORM_300(vcmpnezh, 3, 5)
GEN_VXRFORM_300(vcmpnezw, 3, 6)
-GEN_VXRFORM(vcmpgtsb, 3, 12)
-GEN_VXRFORM(vcmpgtsh, 3, 13)
-GEN_VXRFORM(vcmpgtsw, 3, 14)
-GEN_VXRFORM(vcmpgtub, 3, 8)
-GEN_VXRFORM(vcmpgtuh, 3, 9)
-GEN_VXRFORM(vcmpgtuw, 3, 10)
-GEN_VXRFORM_DUAL(vcmpeqfp, vcmpequd, 3, 3, PPC_ALTIVEC, PPC_NONE)
+GEN_VXRFORM(vcmpeqfp, 3, 3)
GEN_VXRFORM(vcmpgefp, 3, 7)
-GEN_VXRFORM_DUAL(vcmpgtfp, vcmpgtud, 3, 11, PPC_ALTIVEC, PPC_NONE)
-GEN_VXRFORM_DUAL(vcmpbfp, vcmpgtsd, 3, 15, PPC_ALTIVEC, PPC_NONE)
-GEN_VXRFORM_DUAL(vcmpequb, vcmpneb, 3, 0, PPC_ALTIVEC, PPC_NONE)
-GEN_VXRFORM_DUAL(vcmpequh, vcmpneh, 3, 1, PPC_ALTIVEC, PPC_NONE)
-GEN_VXRFORM_DUAL(vcmpequw, vcmpnew, 3, 2, PPC_ALTIVEC, PPC_NONE)
+GEN_VXRFORM(vcmpgtfp, 3, 11)
+GEN_VXRFORM(vcmpbfp, 3, 15)
#define GEN_VXFORM_DUAL_INV(name0, name1, opc2, opc3, inval0, inval1, type) \
GEN_OPCODE_DUAL(name0##_##name1, 0x04, opc2, opc3, inval0, inval1, type, \
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 10/37] target/ppc: Move Vector Compare Not Equal or Zero to decodetree
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (8 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 09/37] target/ppc: Move Vector Compare Equal/Not Equal/Greater Than to decodetree matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-11 4:41 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 11/37] target/ppc: Implement Vector Compare Equal Quadword matheus.ferst
` (26 subsequent siblings)
36 siblings, 1 reply; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/helper.h | 9 ++--
target/ppc/insn32.decode | 4 ++
target/ppc/int_helper.c | 50 +++++----------------
target/ppc/translate/vmx-impl.c.inc | 69 +++++++++++++++++++++++++++--
target/ppc/translate/vmx-ops.c.inc | 3 --
5 files changed, 83 insertions(+), 52 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 4f0f3e3a08..729d7eb608 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -141,16 +141,13 @@ DEF_HELPER_3(vabsduw, void, avr, avr, avr)
DEF_HELPER_3(vavgsb, void, avr, avr, avr)
DEF_HELPER_3(vavgsh, void, avr, avr, avr)
DEF_HELPER_3(vavgsw, void, avr, avr, avr)
-DEF_HELPER_4(vcmpnezb, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpnezh, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpnezw, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpeqfp, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpgefp, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpgtfp, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpbfp, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpnezb_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpnezh_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpnezw_dot, void, env, avr, avr, avr)
+DEF_HELPER_4(VCMPNEZB, void, avr, avr, avr, i32)
+DEF_HELPER_4(VCMPNEZH, void, avr, avr, avr, i32)
+DEF_HELPER_4(VCMPNEZW, void, avr, avr, avr, i32)
DEF_HELPER_4(vcmpeqfp_dot, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpgefp_dot, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpgtfp_dot, void, env, avr, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index bfb36e6969..a0adf18671 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -397,6 +397,10 @@ VCMPNEB 000100 ..... ..... ..... . 0000000111 @VC
VCMPNEH 000100 ..... ..... ..... . 0001000111 @VC
VCMPNEW 000100 ..... ..... ..... . 0010000111 @VC
+VCMPNEZB 000100 ..... ..... ..... . 0100000111 @VC
+VCMPNEZH 000100 ..... ..... ..... . 0101000111 @VC
+VCMPNEZW 000100 ..... ..... ..... . 0110000111 @VC
+
## Vector Bit Manipulation Instruction
VCFUGED 000100 ..... ..... ..... 10101001101 @VX
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 83e718ab0e..1c5e6acda1 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -749,46 +749,18 @@ VCF(ux, uint32_to_float32, u32)
VCF(sx, int32_to_float32, s32)
#undef VCF
-#define VCMPNE_DO(suffix, element, etype, cmpzero, record) \
-void helper_vcmpne##suffix(CPUPPCState *env, ppc_avr_t *r, \
- ppc_avr_t *a, ppc_avr_t *b) \
-{ \
- etype ones = (etype)-1; \
- etype all = ones; \
- etype result, none = 0; \
- int i; \
- \
- for (i = 0; i < ARRAY_SIZE(r->element); i++) { \
- if (cmpzero) { \
- result = ((a->element[i] == 0) \
- || (b->element[i] == 0) \
- || (a->element[i] != b->element[i]) ? \
- ones : 0x0); \
- } else { \
- result = (a->element[i] != b->element[i]) ? ones : 0x0; \
- } \
- r->element[i] = result; \
- all &= result; \
- none |= result; \
- } \
- if (record) { \
- env->crf[6] = ((all != 0) << 3) | ((none == 0) << 1); \
- } \
+#define VCMPNEZ(NAME, ELEM) \
+void helper_##NAME(ppc_vsr_t *t, ppc_vsr_t *a, ppc_vsr_t *b, uint32_t desc) \
+{ \
+ for (int i = 0; i < ARRAY_SIZE(t->ELEM); i++) { \
+ t->ELEM[i] = ((a->ELEM[i] == 0) || (b->ELEM[i] == 0) || \
+ (a->ELEM[i] != b->ELEM[i])) ? -1 : 0; \
+ } \
}
-
-/*
- * VCMPNEZ - Vector compare not equal to zero
- * suffix - instruction mnemonic suffix (b: byte, h: halfword, w: word)
- * element - element type to access from vector
- */
-#define VCMPNE(suffix, element, etype, cmpzero) \
- VCMPNE_DO(suffix, element, etype, cmpzero, 0) \
- VCMPNE_DO(suffix##_dot, element, etype, cmpzero, 1)
-VCMPNE(zb, u8, uint8_t, 1)
-VCMPNE(zh, u16, uint16_t, 1)
-VCMPNE(zw, u32, uint32_t, 1)
-#undef VCMPNE_DO
-#undef VCMPNE
+VCMPNEZ(VCMPNEZB, u8)
+VCMPNEZ(VCMPNEZH, u16)
+VCMPNEZ(VCMPNEZW, u32)
+#undef VCMPNEZ
#define VCMPFP_DO(suffix, compare, order, record) \
void helper_vcmp##suffix(CPUPPCState *env, ppc_avr_t *r, \
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index a32ad92195..67059ed9b2 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -985,10 +985,6 @@ static void glue(gen_, name0##_##name1)(DisasContext *ctx) \
} \
}
-GEN_VXRFORM(vcmpnezb, 3, 4)
-GEN_VXRFORM(vcmpnezh, 3, 5)
-GEN_VXRFORM(vcmpnezw, 3, 6)
-
static void do_vcmp_rc(int vrt)
{
TCGv_i64 t0, t1;
@@ -1051,6 +1047,71 @@ TRANS_FLAGS2(ISA300, VCMPNEB, do_vcmp, TCG_COND_NE, MO_8)
TRANS_FLAGS2(ISA300, VCMPNEH, do_vcmp, TCG_COND_NE, MO_16)
TRANS_FLAGS2(ISA300, VCMPNEW, do_vcmp, TCG_COND_NE, MO_32)
+static void gen_vcmpnez_vec(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
+{
+ TCGv_vec t0, t1, zero;
+
+ t0 = tcg_temp_new_vec_matching(t);
+ t1 = tcg_temp_new_vec_matching(t);
+ zero = tcg_constant_vec_matching(t, vece, 0);
+
+ tcg_gen_cmp_vec(TCG_COND_EQ, vece, t0, a, zero);
+ tcg_gen_cmp_vec(TCG_COND_EQ, vece, t1, b, zero);
+ tcg_gen_cmp_vec(TCG_COND_NE, vece, t, a, b);
+
+ tcg_gen_or_vec(vece, t, t, t0);
+ tcg_gen_or_vec(vece, t, t, t1);
+
+ tcg_gen_shli_vec(vece, t, t, (8 << vece) - 1);
+ tcg_gen_sari_vec(vece, t, t, (8 << vece) - 1);
+
+ tcg_temp_free_vec(t0);
+ tcg_temp_free_vec(t1);
+}
+
+static bool do_vcmpnez(DisasContext *ctx, arg_VC *a, int vece)
+{
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_cmp_vec, INDEX_op_shli_vec, INDEX_op_sari_vec, 0
+ };
+ static const GVecGen3 ops[3] = {
+ {
+ .fniv = gen_vcmpnez_vec,
+ .fno = gen_helper_VCMPNEZB,
+ .opt_opc = vecop_list,
+ .vece = MO_8
+ },
+ {
+ .fniv = gen_vcmpnez_vec,
+ .fno = gen_helper_VCMPNEZH,
+ .opt_opc = vecop_list,
+ .vece = MO_16
+ },
+ {
+ .fniv = gen_vcmpnez_vec,
+ .fno = gen_helper_VCMPNEZW,
+ .opt_opc = vecop_list,
+ .vece = MO_32
+ }
+ };
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+ REQUIRE_VECTOR(ctx);
+
+ tcg_gen_gvec_3(avr_full_offset(a->vrt), avr_full_offset(a->vra),
+ avr_full_offset(a->vrb), 16, 16, &ops[vece]);
+
+ if (a->rc) {
+ do_vcmp_rc(a->vrt);
+ }
+
+ return true;
+}
+
+TRANS(VCMPNEZB, do_vcmpnez, MO_8)
+TRANS(VCMPNEZH, do_vcmpnez, MO_16)
+TRANS(VCMPNEZW, do_vcmpnez, MO_32)
+
GEN_VXRFORM(vcmpeqfp, 3, 3)
GEN_VXRFORM(vcmpgefp, 3, 7)
GEN_VXRFORM(vcmpgtfp, 3, 11)
diff --git a/target/ppc/translate/vmx-ops.c.inc b/target/ppc/translate/vmx-ops.c.inc
index 80d460c34e..cb4c5bb953 100644
--- a/target/ppc/translate/vmx-ops.c.inc
+++ b/target/ppc/translate/vmx-ops.c.inc
@@ -184,9 +184,6 @@ GEN_HANDLER2_E(name, str, 0x4, opc2, opc3, 0x00000000, PPC_NONE, PPC2_ISA300),
GEN_VXRFORM1_300(name, name, #name, opc2, opc3) \
GEN_VXRFORM1_300(name##_dot, name##_, #name ".", opc2, (opc3 | (0x1 << 4)))
-GEN_VXRFORM_300(vcmpnezb, 3, 4)
-GEN_VXRFORM_300(vcmpnezh, 3, 5)
-GEN_VXRFORM_300(vcmpnezw, 3, 6)
GEN_VXRFORM(vcmpeqfp, 3, 3)
GEN_VXRFORM(vcmpgefp, 3, 7)
GEN_VXRFORM(vcmpgtfp, 3, 11)
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 11/37] target/ppc: Implement Vector Compare Equal Quadword
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (9 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 10/37] target/ppc: Move Vector Compare Not Equal or Zero " matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-11 4:51 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 12/37] target/ppc: Implement Vector Compare Greater Than Quadword matheus.ferst
` (25 subsequent siblings)
36 siblings, 1 reply; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Implement the following PowerISA v3.1 instructions:
vcmpequq Vector Compare Equal Quadword
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 1 +
target/ppc/translate/vmx-impl.c.inc | 43 +++++++++++++++++++++++++++++
2 files changed, 44 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index a0adf18671..39730df32d 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -382,6 +382,7 @@ VCMPEQUB 000100 ..... ..... ..... . 0000000110 @VC
VCMPEQUH 000100 ..... ..... ..... . 0001000110 @VC
VCMPEQUW 000100 ..... ..... ..... . 0010000110 @VC
VCMPEQUD 000100 ..... ..... ..... . 0011000111 @VC
+VCMPEQUQ 000100 ..... ..... ..... . 0111000111 @VC
VCMPGTSB 000100 ..... ..... ..... . 1100000110 @VC
VCMPGTSH 000100 ..... ..... ..... . 1101000110 @VC
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 67059ed9b2..bdb0b4370b 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1112,6 +1112,49 @@ TRANS(VCMPNEZB, do_vcmpnez, MO_8)
TRANS(VCMPNEZH, do_vcmpnez, MO_16)
TRANS(VCMPNEZW, do_vcmpnez, MO_32)
+static bool trans_VCMPEQUQ(DisasContext *ctx, arg_VC *a)
+{
+ TCGv_i64 t0, t1;
+ TCGLabel *l1, *l2;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ t0 = tcg_temp_new_i64();
+ t1 = tcg_temp_new_i64();
+ l1 = gen_new_label();
+ l2 = gen_new_label();
+
+ get_avr64(t0, a->vra, true);
+ get_avr64(t1, a->vrb, true);
+ tcg_gen_brcond_i64(TCG_COND_NE, t0, t1, l1);
+
+ get_avr64(t0, a->vra, false);
+ get_avr64(t1, a->vrb, false);
+ tcg_gen_brcond_i64(TCG_COND_NE, t0, t1, l1);
+
+ set_avr64(a->vrt, tcg_constant_i64(-1), true);
+ set_avr64(a->vrt, tcg_constant_i64(-1), false);
+ if (a->rc) {
+ tcg_gen_movi_i32(cpu_crf[6], 1 << 3);
+ }
+ tcg_gen_br(l2);
+
+ gen_set_label(l1);
+ set_avr64(a->vrt, tcg_constant_i64(0), true);
+ set_avr64(a->vrt, tcg_constant_i64(0), false);
+ if (a->rc) {
+ tcg_gen_movi_i32(cpu_crf[6], 1 << 1);
+ }
+
+ gen_set_label(l2);
+
+ tcg_temp_free_i64(t0);
+ tcg_temp_free_i64(t1);
+
+ return true;
+}
+
GEN_VXRFORM(vcmpeqfp, 3, 3)
GEN_VXRFORM(vcmpgefp, 3, 7)
GEN_VXRFORM(vcmpgtfp, 3, 11)
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 12/37] target/ppc: Implement Vector Compare Greater Than Quadword
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (10 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 11/37] target/ppc: Implement Vector Compare Equal Quadword matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-11 4:53 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 13/37] target/ppc: Implement Vector Compare Quadword matheus.ferst
` (24 subsequent siblings)
36 siblings, 1 reply; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Implement the following PowerISA v3.1 instructions:
vcmpgtsq: Vector Compare Greater Than Signed Quadword
vcmpgtuq: Vector Compare Greater Than Unsigned Quadword
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 2 ++
target/ppc/translate/vmx-impl.c.inc | 49 +++++++++++++++++++++++++++++
2 files changed, 51 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 39730df32d..45649f7d1d 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -388,11 +388,13 @@ VCMPGTSB 000100 ..... ..... ..... . 1100000110 @VC
VCMPGTSH 000100 ..... ..... ..... . 1101000110 @VC
VCMPGTSW 000100 ..... ..... ..... . 1110000110 @VC
VCMPGTSD 000100 ..... ..... ..... . 1111000111 @VC
+VCMPGTSQ 000100 ..... ..... ..... . 1110000111 @VC
VCMPGTUB 000100 ..... ..... ..... . 1000000110 @VC
VCMPGTUH 000100 ..... ..... ..... . 1001000110 @VC
VCMPGTUW 000100 ..... ..... ..... . 1010000110 @VC
VCMPGTUD 000100 ..... ..... ..... . 1011000111 @VC
+VCMPGTUQ 000100 ..... ..... ..... . 1010000111 @VC
VCMPNEB 000100 ..... ..... ..... . 0000000111 @VC
VCMPNEH 000100 ..... ..... ..... . 0001000111 @VC
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index bdb0b4370b..302ef4370a 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1155,6 +1155,55 @@ static bool trans_VCMPEQUQ(DisasContext *ctx, arg_VC *a)
return true;
}
+static bool do_vcmpgtq(DisasContext *ctx, arg_VC *a, bool sign)
+{
+ TCGv_i64 t0, t1;
+ TCGLabel *l1, *l2, *l3;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ t0 = tcg_temp_local_new_i64();
+ t1 = tcg_temp_local_new_i64();
+ l1 = gen_new_label();
+ l2 = gen_new_label();
+ l3 = gen_new_label();
+
+ get_avr64(t0, a->vra, true);
+ get_avr64(t1, a->vrb, true);
+ tcg_gen_brcond_i64(sign ? TCG_COND_GT : TCG_COND_GTU, t0, t1, l1);
+ tcg_gen_brcond_i64(sign ? TCG_COND_LT : TCG_COND_LTU, t0, t1, l2);
+
+ get_avr64(t0, a->vra, false);
+ get_avr64(t1, a->vrb, false);
+ tcg_gen_brcond_i64(TCG_COND_GTU, t0, t1, l1);
+ tcg_gen_br(l2);
+
+ gen_set_label(l1);
+ set_avr64(a->vrt, tcg_constant_i64(-1), true);
+ set_avr64(a->vrt, tcg_constant_i64(-1), false);
+ if (a->rc) {
+ tcg_gen_movi_i32(cpu_crf[6], 1 << 3);
+ }
+ tcg_gen_br(l3);
+
+ gen_set_label(l2);
+ set_avr64(a->vrt, tcg_constant_i64(0), true);
+ set_avr64(a->vrt, tcg_constant_i64(0), false);
+ if (a->rc) {
+ tcg_gen_movi_i32(cpu_crf[6], 1 << 1);
+ }
+
+ gen_set_label(l3);
+ tcg_temp_free_i64(t0);
+ tcg_temp_free_i64(t1);
+
+ return true;
+}
+
+TRANS(VCMPGTSQ, do_vcmpgtq, true)
+TRANS(VCMPGTUQ, do_vcmpgtq, false)
+
GEN_VXRFORM(vcmpeqfp, 3, 3)
GEN_VXRFORM(vcmpgefp, 3, 7)
GEN_VXRFORM(vcmpgtfp, 3, 11)
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 13/37] target/ppc: Implement Vector Compare Quadword
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (11 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 12/37] target/ppc: Implement Vector Compare Greater Than Quadword matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-11 4:55 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 14/37] target/ppc: implement vstri[bh][lr] matheus.ferst
` (23 subsequent siblings)
36 siblings, 1 reply; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Implement the following PowerISA v3.1 instructions:
vcmpsq: Vector Compare Signed Quadword
vcmpuq: Vector Compare Unsigned Quadword
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 6 ++++
target/ppc/translate/vmx-impl.c.inc | 45 +++++++++++++++++++++++++++++
2 files changed, 51 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 45649f7d1d..605e66872e 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -60,6 +60,9 @@
&VX vrt vra vrb
@VX ...... vrt:5 vra:5 vrb:5 .......... . &VX
+&VX_bf bf vra vrb
+@VX_bf ...... bf:3 .. vra:5 vrb:5 ........... &VX_bf
+
&VX_uim4 vrt uim vrb
@VX_uim4 ...... vrt:5 . uim:4 vrb:5 ........... &VX_uim4
@@ -404,6 +407,9 @@ VCMPNEZB 000100 ..... ..... ..... . 0100000111 @VC
VCMPNEZH 000100 ..... ..... ..... . 0101000111 @VC
VCMPNEZW 000100 ..... ..... ..... . 0110000111 @VC
+VCMPSQ 000100 ... -- ..... ..... 00101000001 @VX_bf
+VCMPUQ 000100 ... -- ..... ..... 00100000001 @VX_bf
+
## Vector Bit Manipulation Instruction
VCFUGED 000100 ..... ..... ..... 10101001101 @VX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 302ef4370a..2c13fbeb39 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1204,6 +1204,51 @@ static bool do_vcmpgtq(DisasContext *ctx, arg_VC *a, bool sign)
TRANS(VCMPGTSQ, do_vcmpgtq, true)
TRANS(VCMPGTUQ, do_vcmpgtq, false)
+static bool do_vcmpq(DisasContext *ctx, arg_VX_bf *a, bool sign)
+{
+ TCGv_i64 vra, vrb;
+ TCGLabel *gt, *lt, *done;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ vra = tcg_temp_local_new_i64();
+ vrb = tcg_temp_local_new_i64();
+ gt = gen_new_label();
+ lt = gen_new_label();
+ done = gen_new_label();
+
+ get_avr64(vra, a->vra, true);
+ get_avr64(vrb, a->vrb, true);
+ tcg_gen_brcond_i64((sign ? TCG_COND_GT : TCG_COND_GTU), vra, vrb, gt);
+ tcg_gen_brcond_i64((sign ? TCG_COND_LT : TCG_COND_LTU), vra, vrb, lt);
+
+ get_avr64(vra, a->vra, false);
+ get_avr64(vrb, a->vrb, false);
+ tcg_gen_brcond_i64(TCG_COND_GTU, vra, vrb, gt);
+ tcg_gen_brcond_i64(TCG_COND_LTU, vra, vrb, lt);
+
+ tcg_gen_movi_i32(cpu_crf[a->bf], CRF_EQ);
+ tcg_gen_br(done);
+
+ gen_set_label(gt);
+ tcg_gen_movi_i32(cpu_crf[a->bf], CRF_GT);
+ tcg_gen_br(done);
+
+ gen_set_label(lt);
+ tcg_gen_movi_i32(cpu_crf[a->bf], CRF_LT);
+ tcg_gen_br(done);
+
+ gen_set_label(done);
+ tcg_temp_free_i64(vra);
+ tcg_temp_free_i64(vrb);
+
+ return true;
+}
+
+TRANS(VCMPSQ, do_vcmpq, true)
+TRANS(VCMPUQ, do_vcmpq, false)
+
GEN_VXRFORM(vcmpeqfp, 3, 3)
GEN_VXRFORM(vcmpgefp, 3, 7)
GEN_VXRFORM(vcmpgtfp, 3, 11)
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 14/37] target/ppc: implement vstri[bh][lr]
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (12 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 13/37] target/ppc: Implement Vector Compare Quadword matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-11 5:00 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 15/37] target/ppc: implement vclrlb matheus.ferst
` (22 subsequent siblings)
36 siblings, 1 reply; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/helper.h | 4 ++++
target/ppc/insn32.decode | 10 +++++++++
target/ppc/int_helper.c | 32 +++++++++++++++++++++++++++++
target/ppc/translate/vmx-impl.c.inc | 24 ++++++++++++++++++++++
4 files changed, 70 insertions(+)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 729d7eb608..935e99b5e9 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -216,6 +216,10 @@ DEF_HELPER_4(VINSBLX, void, env, avr, i64, tl)
DEF_HELPER_4(VINSHLX, void, env, avr, i64, tl)
DEF_HELPER_4(VINSWLX, void, env, avr, i64, tl)
DEF_HELPER_4(VINSDLX, void, env, avr, i64, tl)
+DEF_HELPER_4(VSTRIBL, void, env, avr, avr, tl)
+DEF_HELPER_4(VSTRIBR, void, env, avr, avr, tl)
+DEF_HELPER_4(VSTRIHL, void, env, avr, avr, tl)
+DEF_HELPER_4(VSTRIHR, void, env, avr, avr, tl)
DEF_HELPER_2(vnegw, void, avr, avr)
DEF_HELPER_2(vnegd, void, avr, avr)
DEF_HELPER_2(vupkhpx, void, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 605e66872e..ea497ecd80 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -63,6 +63,9 @@
&VX_bf bf vra vrb
@VX_bf ...... bf:3 .. vra:5 vrb:5 ........... &VX_bf
+&VX_tb_rc vrt vrb rc:bool
+@VX_tb_rc ...... vrt:5 ..... vrb:5 rc:1 .......... &VX_tb_rc
+
&VX_uim4 vrt uim vrb
@VX_uim4 ...... vrt:5 . uim:4 vrb:5 ........... &VX_uim4
@@ -491,6 +494,13 @@ VEXTRACTQM 000100 ..... 01100 ..... 11001000010 @VX_tb
VMSUMCUD 000100 ..... ..... ..... ..... 010111 @VA
VMSUMUDM 000100 ..... ..... ..... ..... 100011 @VA
+## Vector String Instructions
+
+VSTRIBL 000100 ..... 00000 ..... . 0000001101 @VX_tb_rc
+VSTRIBR 000100 ..... 00001 ..... . 0000001101 @VX_tb_rc
+VSTRIHL 000100 ..... 00010 ..... . 0000001101 @VX_tb_rc
+VSTRIHR 000100 ..... 00011 ..... . 0000001101 @VX_tb_rc
+
# VSX Load/Store Instructions
LXV 111101 ..... ..... ............ . 001 @DQ_TSX
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 1c5e6acda1..e98dedcdf5 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1640,6 +1640,38 @@ VEXTRACT(uw, u32)
VEXTRACT(d, u64)
#undef VEXTRACT
+#define VSTRI(NAME, ELEM, NUM_ELEMS, LEFT) \
+void helper_##NAME(CPUPPCState *env, ppc_avr_t *t, ppc_avr_t *b, \
+ target_ulong rc) \
+{ \
+ bool null_found = false; \
+ int i, idx; \
+ \
+ for (i = 0; i < NUM_ELEMS; i++) { \
+ idx = LEFT ? i : NUM_ELEMS - i - 1; \
+ if (b->Vsr##ELEM(idx)) { \
+ t->Vsr##ELEM(idx) = b->Vsr##ELEM(idx); \
+ } else { \
+ null_found = true; \
+ break; \
+ } \
+ } \
+ \
+ for (; i < NUM_ELEMS; i++) { \
+ idx = LEFT ? i : NUM_ELEMS - i - 1; \
+ t->Vsr##ELEM(idx) = 0; \
+ } \
+ \
+ if (rc) { \
+ env->crf[6] = null_found ? 0b0010 : 0; \
+ } \
+}
+VSTRI(VSTRIBL, B, 16, true)
+VSTRI(VSTRIBR, B, 16, false)
+VSTRI(VSTRIHL, H, 8, true)
+VSTRI(VSTRIHR, H, 8, false)
+#undef VSTRI
+
void helper_xxextractuw(CPUPPCState *env, ppc_vsr_t *xt,
ppc_vsr_t *xb, uint32_t index)
{
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 2c13fbeb39..8bcf637ff8 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1932,6 +1932,30 @@ static bool trans_MTVSRBMI(DisasContext *ctx, arg_DX_b *a)
return true;
}
+static bool do_vstri(DisasContext *ctx, arg_VX_tb_rc *a,
+ void (*gen_helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv))
+{
+ TCGv_ptr vrt, vrb;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ vrt = gen_avr_ptr(a->vrt);
+ vrb = gen_avr_ptr(a->vrb);
+
+ gen_helper(cpu_env, vrt, vrb, tcg_constant_tl(a->rc));
+
+ tcg_temp_free_ptr(vrt);
+ tcg_temp_free_ptr(vrb);
+
+ return true;
+}
+
+TRANS(VSTRIBL, do_vstri, gen_helper_VSTRIBL)
+TRANS(VSTRIBR, do_vstri, gen_helper_VSTRIBR)
+TRANS(VSTRIHL, do_vstri, gen_helper_VSTRIHL)
+TRANS(VSTRIHR, do_vstri, gen_helper_VSTRIHR)
+
#define GEN_VAFORM_PAIRED(name0, name1, opc2) \
static void glue(gen_, name0##_##name1)(DisasContext *ctx) \
{ \
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 15/37] target/ppc: implement vclrlb
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (13 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 14/37] target/ppc: implement vstri[bh][lr] matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-11 5:20 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 16/37] target/ppc: implement vclrrb matheus.ferst
` (21 subsequent siblings)
36 siblings, 1 reply; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 2 ++
target/ppc/translate/vmx-impl.c.inc | 56 +++++++++++++++++++++++++++++
2 files changed, 58 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index ea497ecd80..483651cf9c 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -501,6 +501,8 @@ VSTRIBR 000100 ..... 00001 ..... . 0000001101 @VX_tb_rc
VSTRIHL 000100 ..... 00010 ..... . 0000001101 @VX_tb_rc
VSTRIHR 000100 ..... 00011 ..... . 0000001101 @VX_tb_rc
+VCLRLB 000100 ..... ..... ..... 00110001101 @VX
+
# VSX Load/Store Instructions
LXV 111101 ..... ..... ............ . 001 @DQ_TSX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 8bcf637ff8..3fb4935bff 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1956,6 +1956,62 @@ TRANS(VSTRIBR, do_vstri, gen_helper_VSTRIBR)
TRANS(VSTRIHL, do_vstri, gen_helper_VSTRIHL)
TRANS(VSTRIHR, do_vstri, gen_helper_VSTRIHR)
+static bool trans_VCLRLB(DisasContext *ctx, arg_VX *a)
+{
+ TCGv_i64 hi, lo, rb;
+ TCGLabel *l, *end;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ l = gen_new_label();
+ end = gen_new_label();
+
+ hi = tcg_const_local_i64(0);
+ lo = tcg_const_local_i64(0);
+ rb = tcg_temp_local_new_i64();
+
+ tcg_gen_extu_tl_i64(rb, cpu_gpr[a->vrb]);
+
+ /* RB == 0: all zeros */
+ tcg_gen_brcondi_i64(TCG_COND_EQ, rb, 0, end);
+
+ get_avr64(lo, a->vra, false);
+
+ /* RB <= 8 */
+ tcg_gen_brcondi_i64(TCG_COND_LEU, rb, 8, l);
+
+ get_avr64(hi, a->vra, true);
+
+ /* RB >= 16: just copy VRA to VRB */
+ tcg_gen_brcondi_i64(TCG_COND_GEU, rb, 16, end);
+
+ /* 8 < RB < 16: copy lo and partially clear hi */
+ tcg_gen_subfi_i64(rb, 16, rb);
+ tcg_gen_shli_i64(rb, rb, 3);
+ tcg_gen_shl_i64(hi, hi, rb);
+ tcg_gen_shr_i64(hi, hi, rb);
+ tcg_gen_br(end);
+
+ /* 0 < RB <= 8: zeroes hi and partially clears lo */
+ gen_set_label(l);
+ tcg_gen_subfi_i64(rb, 8, rb);
+ tcg_gen_shli_i64(rb, rb, 3);
+ tcg_gen_shl_i64(lo, lo, rb);
+ tcg_gen_shr_i64(lo, lo, rb);
+
+ /* Update VRT */
+ gen_set_label(end);
+ set_avr64(a->vrt, hi, true);
+ set_avr64(a->vrt, lo, false);
+
+ tcg_temp_free_i64(hi);
+ tcg_temp_free_i64(lo);
+ tcg_temp_free_i64(rb);
+
+ return true;
+}
+
#define GEN_VAFORM_PAIRED(name0, name1, opc2) \
static void glue(gen_, name0##_##name1)(DisasContext *ctx) \
{ \
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 16/37] target/ppc: implement vclrrb
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (14 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 15/37] target/ppc: implement vclrlb matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-10 12:34 ` [PATCH v3 17/37] target/ppc: implement vcntmb[bhwd] matheus.ferst
` (20 subsequent siblings)
36 siblings, 0 replies; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 1 +
target/ppc/translate/vmx-impl.c.inc | 43 +++++++++++++++++++++++------
2 files changed, 35 insertions(+), 9 deletions(-)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 483651cf9c..bf2f3b1e0b 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -502,6 +502,7 @@ VSTRIHL 000100 ..... 00010 ..... . 0000001101 @VX_tb_rc
VSTRIHR 000100 ..... 00011 ..... . 0000001101 @VX_tb_rc
VCLRLB 000100 ..... ..... ..... 00110001101 @VX
+VCLRRB 000100 ..... ..... ..... 00111001101 @VX
# VSX Load/Store Instructions
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 3fb4935bff..9ae0ab94ac 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1956,7 +1956,7 @@ TRANS(VSTRIBR, do_vstri, gen_helper_VSTRIBR)
TRANS(VSTRIHL, do_vstri, gen_helper_VSTRIHL)
TRANS(VSTRIHR, do_vstri, gen_helper_VSTRIHR)
-static bool trans_VCLRLB(DisasContext *ctx, arg_VX *a)
+static bool do_vclrb(DisasContext *ctx, arg_VX *a, bool right)
{
TCGv_i64 hi, lo, rb;
TCGLabel *l, *end;
@@ -1976,29 +1976,51 @@ static bool trans_VCLRLB(DisasContext *ctx, arg_VX *a)
/* RB == 0: all zeros */
tcg_gen_brcondi_i64(TCG_COND_EQ, rb, 0, end);
- get_avr64(lo, a->vra, false);
+ if (right) {
+ get_avr64(hi, a->vra, true);
+ } else {
+ get_avr64(lo, a->vra, false);
+ }
/* RB <= 8 */
tcg_gen_brcondi_i64(TCG_COND_LEU, rb, 8, l);
- get_avr64(hi, a->vra, true);
+ if (right) {
+ get_avr64(lo, a->vra, false);
+ } else {
+ get_avr64(hi, a->vra, true);
+ }
/* RB >= 16: just copy VRA to VRB */
tcg_gen_brcondi_i64(TCG_COND_GEU, rb, 16, end);
- /* 8 < RB < 16: copy lo and partially clear hi */
+ /* 8 < RB < 16: */
tcg_gen_subfi_i64(rb, 16, rb);
tcg_gen_shli_i64(rb, rb, 3);
- tcg_gen_shl_i64(hi, hi, rb);
- tcg_gen_shr_i64(hi, hi, rb);
+ if (right) {
+ /* copy hi and partially clear lo */
+ tcg_gen_shr_i64(lo, lo, rb);
+ tcg_gen_shl_i64(lo, lo, rb);
+ } else {
+ /* copy lo and partially clear hi */
+ tcg_gen_shl_i64(hi, hi, rb);
+ tcg_gen_shr_i64(hi, hi, rb);
+ }
tcg_gen_br(end);
- /* 0 < RB <= 8: zeroes hi and partially clears lo */
+ /* 0 < RB <= 8: */
gen_set_label(l);
tcg_gen_subfi_i64(rb, 8, rb);
tcg_gen_shli_i64(rb, rb, 3);
- tcg_gen_shl_i64(lo, lo, rb);
- tcg_gen_shr_i64(lo, lo, rb);
+ if (right) {
+ /* zeroes lo and partially clears hi */
+ tcg_gen_shr_i64(hi, hi, rb);
+ tcg_gen_shl_i64(hi, hi, rb);
+ } else {
+ /* zeroes hi and partially clears lo */
+ tcg_gen_shl_i64(lo, lo, rb);
+ tcg_gen_shr_i64(lo, lo, rb);
+ }
/* Update VRT */
gen_set_label(end);
@@ -2012,6 +2034,9 @@ static bool trans_VCLRLB(DisasContext *ctx, arg_VX *a)
return true;
}
+TRANS(VCLRLB, do_vclrb, false)
+TRANS(VCLRRB, do_vclrb, true)
+
#define GEN_VAFORM_PAIRED(name0, name1, opc2) \
static void glue(gen_, name0##_##name1)(DisasContext *ctx) \
{ \
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 17/37] target/ppc: implement vcntmb[bhwd]
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (15 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 16/37] target/ppc: implement vclrrb matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-11 5:28 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 18/37] target/ppc: implement vgnb matheus.ferst
` (19 subsequent siblings)
36 siblings, 1 reply; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 8 ++++++++
target/ppc/translate/vmx-impl.c.inc | 32 +++++++++++++++++++++++++++++
2 files changed, 40 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index bf2f3b1e0b..0a3e39f3e9 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -63,6 +63,9 @@
&VX_bf bf vra vrb
@VX_bf ...... bf:3 .. vra:5 vrb:5 ........... &VX_bf
+&VX_mp rt mp:bool vrb
+@VX_mp ...... rt:5 .... mp:1 vrb:5 ........... &VX_mp
+
&VX_tb_rc vrt vrb rc:bool
@VX_tb_rc ...... vrt:5 ..... vrb:5 rc:1 .......... &VX_tb_rc
@@ -489,6 +492,11 @@ VEXTRACTWM 000100 ..... 01010 ..... 11001000010 @VX_tb
VEXTRACTDM 000100 ..... 01011 ..... 11001000010 @VX_tb
VEXTRACTQM 000100 ..... 01100 ..... 11001000010 @VX_tb
+VCNTMBB 000100 ..... 1100 . ..... 11001000010 @VX_mp
+VCNTMBH 000100 ..... 1101 . ..... 11001000010 @VX_mp
+VCNTMBW 000100 ..... 1110 . ..... 11001000010 @VX_mp
+VCNTMBD 000100 ..... 1111 . ..... 11001000010 @VX_mp
+
## Vector Multiply-Sum Instructions
VMSUMCUD 000100 ..... ..... ..... ..... 010111 @VA
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 9ae0ab94ac..78b277466a 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1932,6 +1932,38 @@ static bool trans_MTVSRBMI(DisasContext *ctx, arg_DX_b *a)
return true;
}
+static bool do_vcntmb(DisasContext *ctx, arg_VX_mp *a, int vece)
+{
+ TCGv_i64 rt, vrb, mask;
+ rt = tcg_const_i64(0);
+ vrb = tcg_temp_new_i64();
+ mask = tcg_constant_i64(dup_const(vece, 1ULL << ((8 << vece) - 1)));
+
+ for (int i = 0; i < 2; i++) {
+ get_avr64(vrb, a->vrb, i);
+ if (a->mp) {
+ tcg_gen_and_i64(vrb, mask, vrb);
+ } else {
+ tcg_gen_andc_i64(vrb, mask, vrb);
+ }
+ tcg_gen_ctpop_i64(vrb, vrb);
+ tcg_gen_add_i64(rt, rt, vrb);
+ }
+
+ tcg_gen_shli_i64(rt, rt, TARGET_LONG_BITS - 8 + vece);
+ tcg_gen_trunc_i64_tl(cpu_gpr[a->rt], rt);
+
+ tcg_temp_free_i64(vrb);
+ tcg_temp_free_i64(rt);
+
+ return true;
+}
+
+TRANS(VCNTMBB, do_vcntmb, MO_8)
+TRANS(VCNTMBH, do_vcntmb, MO_16)
+TRANS(VCNTMBW, do_vcntmb, MO_32)
+TRANS(VCNTMBD, do_vcntmb, MO_64)
+
static bool do_vstri(DisasContext *ctx, arg_VX_tb_rc *a,
void (*gen_helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv))
{
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 18/37] target/ppc: implement vgnb
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (16 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 17/37] target/ppc: implement vcntmb[bhwd] matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-11 6:15 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 19/37] target/ppc: Move vsel and vperm/vpermr to decodetree matheus.ferst
` (18 subsequent siblings)
36 siblings, 1 reply; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 5 ++++
target/ppc/translate/vmx-impl.c.inc | 44 +++++++++++++++++++++++++++++
2 files changed, 49 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 0a3e39f3e9..7b629e81af 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -66,6 +66,9 @@
&VX_mp rt mp:bool vrb
@VX_mp ...... rt:5 .... mp:1 vrb:5 ........... &VX_mp
+&VX_n rt vrb n
+@VX_n ...... rt:5 .. n:3 vrb:5 ........... &VX_n
+
&VX_tb_rc vrt vrb rc:bool
@VX_tb_rc ...... vrt:5 ..... vrb:5 rc:1 .......... &VX_tb_rc
@@ -418,6 +421,8 @@ VCMPUQ 000100 ... -- ..... ..... 00100000001 @VX_bf
## Vector Bit Manipulation Instruction
+VGNB 000100 ..... -- ... ..... 10011001100 @VX_n
+
VCFUGED 000100 ..... ..... ..... 10101001101 @VX
VCLZDM 000100 ..... ..... ..... 11110000100 @VX
VCTZDM 000100 ..... ..... ..... 11111000100 @VX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 78b277466a..43eb7ab70c 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1438,6 +1438,50 @@ GEN_VXFORM_DUAL(vsplth, PPC_ALTIVEC, PPC_NONE,
GEN_VXFORM_DUAL(vspltw, PPC_ALTIVEC, PPC_NONE,
vextractuw, PPC_NONE, PPC2_ISA300);
+static bool trans_VGNB(DisasContext *ctx, arg_VX_n *a)
+{
+ TCGv_i64 vrb, tmp, rt;
+ int in = 63, out = 63;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ if (a->n < 2) {
+ /*
+ * "N can be any value between 2 and 7, inclusive." Otherwise, the
+ * result is undefined, so we don't need to change RT. Also, N > 7 is
+ * impossible since the immediate field is 3 bits only.
+ */
+ return true;
+ }
+
+ vrb = tcg_temp_new_i64();
+ tmp = tcg_temp_new_i64();
+ rt = tcg_const_i64(0);
+
+ for (int dw = 1; dw >= 0; dw--) {
+ get_avr64(vrb, a->vrb, dw);
+ for (; in >= 0; in -= a->n, out--) {
+ if (in > out) {
+ tcg_gen_shri_i64(tmp, vrb, in - out);
+ } else {
+ tcg_gen_shli_i64(tmp, vrb, out - in);
+ }
+ tcg_gen_andi_i64(tmp, tmp, 1ULL << out);
+ tcg_gen_or_i64(rt, rt, tmp);
+ }
+ in += 64;
+ }
+
+ tcg_gen_trunc_i64_tl(cpu_gpr[a->rt], rt);
+
+ tcg_temp_free_i64(vrb);
+ tcg_temp_free_i64(tmp);
+ tcg_temp_free_i64(rt);
+
+ return true;
+}
+
static bool do_vextdx(DisasContext *ctx, arg_VA *a, int size, bool right,
void (*gen_helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv))
{
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 19/37] target/ppc: Move vsel and vperm/vpermr to decodetree
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (17 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 18/37] target/ppc: implement vgnb matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-10 12:34 ` [PATCH v3 20/37] target/ppc: Move xxsel " matheus.ferst
` (17 subsequent siblings)
36 siblings, 0 replies; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/helper.h | 5 +--
target/ppc/insn32.decode | 5 +++
target/ppc/int_helper.c | 13 +-----
target/ppc/translate/vmx-impl.c.inc | 69 ++++++++++++++++++++++-------
target/ppc/translate/vmx-ops.c.inc | 2 -
5 files changed, 62 insertions(+), 32 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 935e99b5e9..f439f15400 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -232,9 +232,8 @@ DEF_HELPER_2(vupklsh, void, avr, avr)
DEF_HELPER_2(vupklsw, void, avr, avr)
DEF_HELPER_5(vmsumubm, void, env, avr, avr, avr, avr)
DEF_HELPER_5(vmsummbm, void, env, avr, avr, avr, avr)
-DEF_HELPER_5(vsel, void, env, avr, avr, avr, avr)
-DEF_HELPER_5(vperm, void, env, avr, avr, avr, avr)
-DEF_HELPER_5(vpermr, void, env, avr, avr, avr, avr)
+DEF_HELPER_4(VPERM, void, avr, avr, avr, avr)
+DEF_HELPER_4(VPERMR, void, avr, avr, avr, avr)
DEF_HELPER_4(vpkshss, void, env, avr, avr, avr)
DEF_HELPER_4(vpkshus, void, env, avr, avr, avr)
DEF_HELPER_4(vpkswss, void, env, avr, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 7b629e81af..2ebb441550 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -467,6 +467,11 @@ VINSWVRX 000100 ..... ..... ..... 00110001111 @VX
VSLDBI 000100 ..... ..... ..... 00 ... 010110 @VN
VSRDBI 000100 ..... ..... ..... 01 ... 010110 @VN
+VPERM 000100 ..... ..... ..... ..... 101011 @VA
+VPERMR 000100 ..... ..... ..... ..... 111011 @VA
+
+VSEL 000100 ..... ..... ..... ..... 101010 @VA
+
## Vector Integer Arithmetic Instructions
VEXTSB2W 000100 ..... 10000 ..... 11000000010 @VX_tb
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index e98dedcdf5..3c604a5c09 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1153,8 +1153,7 @@ void helper_VMULHUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t desc)
mulu64(&discard, &r->u64[1], a->u64[1], b->u64[1]);
}
-void helper_vperm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
- ppc_avr_t *c)
+void helper_VPERM(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
{
ppc_avr_t result;
int i;
@@ -1172,8 +1171,7 @@ void helper_vperm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
*r = result;
}
-void helper_vpermr(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
- ppc_avr_t *c)
+void helper_VPERMR(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
{
ppc_avr_t result;
int i;
@@ -1441,13 +1439,6 @@ VRLMI(vrlwmi, 32, u32, 1);
VRLMI(vrldnm, 64, u64, 0);
VRLMI(vrlwnm, 32, u32, 0);
-void helper_vsel(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
- ppc_avr_t *c)
-{
- r->u64[0] = (a->u64[0] & ~c->u64[0]) | (b->u64[0] & c->u64[0]);
- r->u64[1] = (a->u64[1] & ~c->u64[1]) | (b->u64[1] & c->u64[1]);
-}
-
void helper_vexptefp(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *b)
{
int i;
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 43eb7ab70c..fbb85f7564 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -2156,28 +2156,65 @@ static void gen_vmladduhm(DisasContext *ctx)
tcg_temp_free_ptr(rd);
}
-static void gen_vpermr(DisasContext *ctx)
+static bool trans_VPERM(DisasContext *ctx, arg_VA *a)
{
- TCGv_ptr ra, rb, rc, rd;
- if (unlikely(!ctx->altivec_enabled)) {
- gen_exception(ctx, POWERPC_EXCP_VPU);
- return;
- }
- ra = gen_avr_ptr(rA(ctx->opcode));
- rb = gen_avr_ptr(rB(ctx->opcode));
- rc = gen_avr_ptr(rC(ctx->opcode));
- rd = gen_avr_ptr(rD(ctx->opcode));
- gen_helper_vpermr(cpu_env, rd, ra, rb, rc);
- tcg_temp_free_ptr(ra);
- tcg_temp_free_ptr(rb);
- tcg_temp_free_ptr(rc);
- tcg_temp_free_ptr(rd);
+ TCGv_ptr vrt, vra, vrb, vrc;
+
+ REQUIRE_INSNS_FLAGS(ctx, ALTIVEC);
+ REQUIRE_VECTOR(ctx);
+
+ vrt = gen_avr_ptr(a->vrt);
+ vra = gen_avr_ptr(a->vra);
+ vrb = gen_avr_ptr(a->vrb);
+ vrc = gen_avr_ptr(a->rc);
+
+ gen_helper_VPERM(vrt, vra, vrb, vrc);
+
+ tcg_temp_free_ptr(vrt);
+ tcg_temp_free_ptr(vra);
+ tcg_temp_free_ptr(vrb);
+ tcg_temp_free_ptr(vrc);
+
+ return true;
+}
+
+static bool trans_VPERMR(DisasContext *ctx, arg_VA *a)
+{
+ TCGv_ptr vrt, vra, vrb, vrc;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+ REQUIRE_VECTOR(ctx);
+
+ vrt = gen_avr_ptr(a->vrt);
+ vra = gen_avr_ptr(a->vra);
+ vrb = gen_avr_ptr(a->vrb);
+ vrc = gen_avr_ptr(a->rc);
+
+ gen_helper_VPERMR(vrt, vra, vrb, vrc);
+
+ tcg_temp_free_ptr(vrt);
+ tcg_temp_free_ptr(vra);
+ tcg_temp_free_ptr(vrb);
+ tcg_temp_free_ptr(vrc);
+
+ return true;
+}
+
+static bool trans_VSEL(DisasContext *ctx, arg_VA *a)
+{
+ REQUIRE_INSNS_FLAGS(ctx, ALTIVEC);
+ REQUIRE_VECTOR(ctx);
+
+ tcg_gen_gvec_bitsel(MO_64, avr_full_offset(a->vrt), avr_full_offset(a->rc),
+ avr_full_offset(a->vrb), avr_full_offset(a->vra),
+ 16, 16);
+
+ return true;
}
GEN_VAFORM_PAIRED(vmsumubm, vmsummbm, 18)
GEN_VAFORM_PAIRED(vmsumuhm, vmsumuhs, 19)
GEN_VAFORM_PAIRED(vmsumshm, vmsumshs, 20)
-GEN_VAFORM_PAIRED(vsel, vperm, 21)
GEN_VAFORM_PAIRED(vmaddfp, vnmsubfp, 23)
GEN_VXFORM_NOA(vclzb, 1, 28)
diff --git a/target/ppc/translate/vmx-ops.c.inc b/target/ppc/translate/vmx-ops.c.inc
index cb4c5bb953..2895e8a114 100644
--- a/target/ppc/translate/vmx-ops.c.inc
+++ b/target/ppc/translate/vmx-ops.c.inc
@@ -210,7 +210,6 @@ GEN_VXFORM_300_EO(vctzw, 0x01, 0x18, 0x1E),
GEN_VXFORM_300_EO(vctzd, 0x01, 0x18, 0x1F),
GEN_VXFORM_300_EO(vclzlsbb, 0x01, 0x18, 0x0),
GEN_VXFORM_300_EO(vctzlsbb, 0x01, 0x18, 0x1),
-GEN_VXFORM_300(vpermr, 0x1D, 0xFF),
#define GEN_VXFORM_NOA(name, opc2, opc3) \
GEN_HANDLER(name, 0x04, opc2, opc3, 0x001f0000, PPC_ALTIVEC)
@@ -245,7 +244,6 @@ GEN_VAFORM_PAIRED(vmhaddshs, vmhraddshs, 16),
GEN_VAFORM_PAIRED(vmsumubm, vmsummbm, 18),
GEN_VAFORM_PAIRED(vmsumuhm, vmsumuhs, 19),
GEN_VAFORM_PAIRED(vmsumshm, vmsumshs, 20),
-GEN_VAFORM_PAIRED(vsel, vperm, 21),
GEN_VAFORM_PAIRED(vmaddfp, vnmsubfp, 23),
GEN_VXFORM_DUAL(vclzb, vpopcntb, 1, 28, PPC_NONE, PPC2_ALTIVEC_207),
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 20/37] target/ppc: Move xxsel to decodetree
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (18 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 19/37] target/ppc: Move vsel and vperm/vpermr to decodetree matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-10 12:34 ` [PATCH v3 21/37] target/ppc: move xxperm/xxpermr " matheus.ferst
` (16 subsequent siblings)
36 siblings, 0 replies; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 6 ++++
target/ppc/insn64.decode | 24 ++++++++--------
target/ppc/translate/vsx-impl.c.inc | 20 ++++++--------
target/ppc/translate/vsx-ops.c.inc | 43 -----------------------------
4 files changed, 26 insertions(+), 67 deletions(-)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 2ebb441550..e56aec7636 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -148,12 +148,16 @@
%xx_xt 0:1 21:5
%xx_xb 1:1 11:5
%xx_xa 2:1 16:5
+%xx_xc 3:1 6:5
&XX2 xt xb uim:uint8_t
@XX2 ...... ..... ... uim:2 ..... ......... .. &XX2 xt=%xx_xt xb=%xx_xb
&XX3 xt xa xb
@XX3 ...... ..... ..... ..... ........ ... &XX3 xt=%xx_xt xa=%xx_xa xb=%xx_xb
+&XX4 xt xa xb xc
+@XX4 ...... ..... ..... ..... ..... .. .... &XX4 xt=%xx_xt xa=%xx_xa xb=%xx_xb xc=%xx_xc
+
&Z22_bf_fra bf fra dm
@Z22_bf_fra ...... bf:3 .. fra:5 dm:6 ......... . &Z22_bf_fra
@@ -538,6 +542,8 @@ STXVPX 011111 ..... ..... ..... 0111001101 - @X_TSXP
XXSPLTIB 111100 ..... 00 ........ 0101101000 . @X_imm8
XXSPLTW 111100 ..... ---.. ..... 010100100 . . @XX2
+XXSEL 111100 ..... ..... ..... ..... 11 .... @XX4
+
## VSX Vector Load Special Value Instruction
LXVKQ 111100 ..... 11111 ..... 0101101000 . @X_uim5
diff --git a/target/ppc/insn64.decode b/target/ppc/insn64.decode
index 39e610913d..9e4f531fb9 100644
--- a/target/ppc/insn64.decode
+++ b/target/ppc/insn64.decode
@@ -44,15 +44,15 @@
...... ..... .... . ................ \
&8RR_D si=%8rr_si xt=%8rr_xt
-# Format XX4
-&XX4 xt xa xb xc
-%xx4_xt 0:1 21:5
-%xx4_xa 2:1 16:5
-%xx4_xb 1:1 11:5
-%xx4_xc 3:1 6:5
-@XX4 ........ ........ ........ ........ \
+# Format 8RR:XX4
+%8rr_xx_xt 0:1 21:5
+%8rr_xx_xa 2:1 16:5
+%8rr_xx_xb 1:1 11:5
+%8rr_xx_xc 3:1 6:5
+&8RR_XX4 xt xa xb xc
+@8RR_XX4 ........ ........ ........ ........ \
...... ..... ..... ..... ..... .. .... \
- &XX4 xt=%xx4_xt xa=%xx4_xa xb=%xx4_xb xc=%xx4_xc
+ &8RR_XX4 xt=%8rr_xx_xt xa=%8rr_xx_xa xb=%8rr_xx_xb xc=%8rr_xx_xc
### Fixed-Point Load Instructions
@@ -187,10 +187,10 @@ XXSPLTI32DX 000001 01 0000 -- -- ................ \
100000 ..... 000 .. ................ @8RR_D_IX
XXBLENDVD 000001 01 0000 -- ------------------ \
- 100001 ..... ..... ..... ..... 11 .... @XX4
+ 100001 ..... ..... ..... ..... 11 .... @8RR_XX4
XXBLENDVW 000001 01 0000 -- ------------------ \
- 100001 ..... ..... ..... ..... 10 .... @XX4
+ 100001 ..... ..... ..... ..... 10 .... @8RR_XX4
XXBLENDVH 000001 01 0000 -- ------------------ \
- 100001 ..... ..... ..... ..... 01 .... @XX4
+ 100001 ..... ..... ..... ..... 01 .... @8RR_XX4
XXBLENDVB 000001 01 0000 -- ------------------ \
- 100001 ..... ..... ..... ..... 00 .... @XX4
+ 100001 ..... ..... ..... ..... 00 .... @8RR_XX4
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index b89be57272..83e3285e19 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1420,19 +1420,15 @@ static void glue(gen_, name)(DisasContext *ctx) \
VSX_XXMRG(xxmrghw, 1)
VSX_XXMRG(xxmrglw, 0)
-static void gen_xxsel(DisasContext *ctx)
+static bool trans_XXSEL(DisasContext *ctx, arg_XX4 *a)
{
- int rt = xT(ctx->opcode);
- int ra = xA(ctx->opcode);
- int rb = xB(ctx->opcode);
- int rc = xC(ctx->opcode);
+ REQUIRE_INSNS_FLAGS2(ctx, VSX);
+ REQUIRE_VSX(ctx);
- if (unlikely(!ctx->vsx_enabled)) {
- gen_exception(ctx, POWERPC_EXCP_VSXU);
- return;
- }
- tcg_gen_gvec_bitsel(MO_64, vsr_full_offset(rt), vsr_full_offset(rc),
- vsr_full_offset(rb), vsr_full_offset(ra), 16, 16);
+ tcg_gen_gvec_bitsel(MO_64, vsr_full_offset(a->xt), vsr_full_offset(a->xc),
+ vsr_full_offset(a->xb), vsr_full_offset(a->xa), 16, 16);
+
+ return true;
}
static bool trans_XXSPLTW(DisasContext *ctx, arg_XX2 *a)
@@ -2125,7 +2121,7 @@ static void gen_xxblendv_vec(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b,
tcg_temp_free_vec(tmp);
}
-static bool do_xxblendv(DisasContext *ctx, arg_XX4 *a, unsigned vece)
+static bool do_xxblendv(DisasContext *ctx, arg_8RR_XX4 *a, unsigned vece)
{
static const TCGOpcode vecop_list[] = {
INDEX_op_sari_vec, 0
diff --git a/target/ppc/translate/vsx-ops.c.inc b/target/ppc/translate/vsx-ops.c.inc
index c974324c4c..b0dbb38c80 100644
--- a/target/ppc/translate/vsx-ops.c.inc
+++ b/target/ppc/translate/vsx-ops.c.inc
@@ -347,47 +347,4 @@ GEN_XX3FORM_DM(xxsldwi, 0x08, 0x00),
GEN_XX2FORM_EXT(xxextractuw, 0x0A, 0x0A, PPC2_ISA300),
GEN_XX2FORM_EXT(xxinsertw, 0x0A, 0x0B, PPC2_ISA300),
-#define GEN_XXSEL_ROW(opc3) \
-GEN_HANDLER2_E(xxsel, "xxsel", 0x3C, 0x18, opc3, 0, PPC_NONE, PPC2_VSX), \
-GEN_HANDLER2_E(xxsel, "xxsel", 0x3C, 0x19, opc3, 0, PPC_NONE, PPC2_VSX), \
-GEN_HANDLER2_E(xxsel, "xxsel", 0x3C, 0x1A, opc3, 0, PPC_NONE, PPC2_VSX), \
-GEN_HANDLER2_E(xxsel, "xxsel", 0x3C, 0x1B, opc3, 0, PPC_NONE, PPC2_VSX), \
-GEN_HANDLER2_E(xxsel, "xxsel", 0x3C, 0x1C, opc3, 0, PPC_NONE, PPC2_VSX), \
-GEN_HANDLER2_E(xxsel, "xxsel", 0x3C, 0x1D, opc3, 0, PPC_NONE, PPC2_VSX), \
-GEN_HANDLER2_E(xxsel, "xxsel", 0x3C, 0x1E, opc3, 0, PPC_NONE, PPC2_VSX), \
-GEN_HANDLER2_E(xxsel, "xxsel", 0x3C, 0x1F, opc3, 0, PPC_NONE, PPC2_VSX), \
-
-GEN_XXSEL_ROW(0x00)
-GEN_XXSEL_ROW(0x01)
-GEN_XXSEL_ROW(0x02)
-GEN_XXSEL_ROW(0x03)
-GEN_XXSEL_ROW(0x04)
-GEN_XXSEL_ROW(0x05)
-GEN_XXSEL_ROW(0x06)
-GEN_XXSEL_ROW(0x07)
-GEN_XXSEL_ROW(0x08)
-GEN_XXSEL_ROW(0x09)
-GEN_XXSEL_ROW(0x0A)
-GEN_XXSEL_ROW(0x0B)
-GEN_XXSEL_ROW(0x0C)
-GEN_XXSEL_ROW(0x0D)
-GEN_XXSEL_ROW(0x0E)
-GEN_XXSEL_ROW(0x0F)
-GEN_XXSEL_ROW(0x10)
-GEN_XXSEL_ROW(0x11)
-GEN_XXSEL_ROW(0x12)
-GEN_XXSEL_ROW(0x13)
-GEN_XXSEL_ROW(0x14)
-GEN_XXSEL_ROW(0x15)
-GEN_XXSEL_ROW(0x16)
-GEN_XXSEL_ROW(0x17)
-GEN_XXSEL_ROW(0x18)
-GEN_XXSEL_ROW(0x19)
-GEN_XXSEL_ROW(0x1A)
-GEN_XXSEL_ROW(0x1B)
-GEN_XXSEL_ROW(0x1C)
-GEN_XXSEL_ROW(0x1D)
-GEN_XXSEL_ROW(0x1E)
-GEN_XXSEL_ROW(0x1F)
-
GEN_XX3FORM_DM(xxpermdi, 0x08, 0x01),
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 21/37] target/ppc: move xxperm/xxpermr to decodetree
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (19 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 20/37] target/ppc: Move xxsel " matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-10 12:34 ` [PATCH v3 22/37] target/ppc: Move xxpermdi " matheus.ferst
` (15 subsequent siblings)
36 siblings, 0 replies; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/fpu_helper.c | 21 ---------------
target/ppc/helper.h | 2 --
target/ppc/insn32.decode | 5 ++++
target/ppc/translate/vsx-impl.c.inc | 42 +++++++++++++++++++++++++++--
target/ppc/translate/vsx-ops.c.inc | 2 --
5 files changed, 45 insertions(+), 27 deletions(-)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index e5c29b53b8..f9c7a4dc44 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -3055,27 +3055,6 @@ uint64_t helper_xsrsp(CPUPPCState *env, uint64_t xb)
return xt;
}
-#define VSX_XXPERM(op, indexed) \
-void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
- ppc_vsr_t *xa, ppc_vsr_t *pcv) \
-{ \
- ppc_vsr_t t = *xt; \
- int i, idx; \
- \
- for (i = 0; i < 16; i++) { \
- idx = pcv->VsrB(i) & 0x1F; \
- if (indexed) { \
- idx = 31 - idx; \
- } \
- t.VsrB(i) = (idx <= 15) ? xa->VsrB(idx) \
- : xt->VsrB(idx - 16); \
- } \
- *xt = t; \
-}
-
-VSX_XXPERM(xxperm, 0)
-VSX_XXPERM(xxpermr, 1)
-
void helper_xvxsigsp(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb)
{
ppc_vsr_t t = { };
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index f439f15400..bedd2896d3 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -501,8 +501,6 @@ DEF_HELPER_3(xvrspic, void, env, vsr, vsr)
DEF_HELPER_3(xvrspim, void, env, vsr, vsr)
DEF_HELPER_3(xvrspip, void, env, vsr, vsr)
DEF_HELPER_3(xvrspiz, void, env, vsr, vsr)
-DEF_HELPER_4(xxperm, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xxpermr, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xxextractuw, void, env, vsr, vsr, i32)
DEF_HELPER_4(xxinsertw, void, env, vsr, vsr, i32)
DEF_HELPER_3(xvxsigsp, void, env, vsr, vsr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index e56aec7636..39fdcb943c 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -542,6 +542,11 @@ STXVPX 011111 ..... ..... ..... 0111001101 - @X_TSXP
XXSPLTIB 111100 ..... 00 ........ 0101101000 . @X_imm8
XXSPLTW 111100 ..... ---.. ..... 010100100 . . @XX2
+## VSX Permute Instructions
+
+XXPERM 111100 ..... ..... ..... 00011010 ... @XX3
+XXPERMR 111100 ..... ..... ..... 00111010 ... @XX3
+
XXSEL 111100 ..... ..... ..... ..... 11 .... @XX4
## VSX Vector Load Special Value Instruction
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 83e3285e19..43e52ee3f8 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1198,8 +1198,46 @@ GEN_VSX_HELPER_X2(xvrspip, 0x12, 0x0A, 0, PPC2_VSX)
GEN_VSX_HELPER_X2(xvrspiz, 0x12, 0x09, 0, PPC2_VSX)
GEN_VSX_HELPER_2(xvtstdcsp, 0x14, 0x1A, 0, PPC2_VSX)
GEN_VSX_HELPER_2(xvtstdcdp, 0x14, 0x1E, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xxperm, 0x08, 0x03, 0, PPC2_ISA300)
-GEN_VSX_HELPER_X3(xxpermr, 0x08, 0x07, 0, PPC2_ISA300)
+
+static bool trans_XXPERM(DisasContext *ctx, arg_XX3 *a)
+{
+ TCGv_ptr xt, xa, xb;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+ REQUIRE_VSX(ctx);
+
+ xt = gen_vsr_ptr(a->xt);
+ xa = gen_vsr_ptr(a->xa);
+ xb = gen_vsr_ptr(a->xb);
+
+ gen_helper_VPERM(xt, xa, xt, xb);
+
+ tcg_temp_free_ptr(xt);
+ tcg_temp_free_ptr(xa);
+ tcg_temp_free_ptr(xb);
+
+ return true;
+}
+
+static bool trans_XXPERMR(DisasContext *ctx, arg_XX3 *a)
+{
+ TCGv_ptr xt, xa, xb;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+ REQUIRE_VSX(ctx);
+
+ xt = gen_vsr_ptr(a->xt);
+ xa = gen_vsr_ptr(a->xa);
+ xb = gen_vsr_ptr(a->xb);
+
+ gen_helper_VPERMR(xt, xa, xt, xb);
+
+ tcg_temp_free_ptr(xt);
+ tcg_temp_free_ptr(xa);
+ tcg_temp_free_ptr(xb);
+
+ return true;
+}
#define GEN_VSX_HELPER_VSX_MADD(name, op1, aop, mop, inval, type) \
static void gen_##name(DisasContext *ctx) \
diff --git a/target/ppc/translate/vsx-ops.c.inc b/target/ppc/translate/vsx-ops.c.inc
index b0dbb38c80..86ed1a996a 100644
--- a/target/ppc/translate/vsx-ops.c.inc
+++ b/target/ppc/translate/vsx-ops.c.inc
@@ -341,8 +341,6 @@ VSX_LOGICAL(xxlnand, 0x8, 0x16, PPC2_VSX207),
VSX_LOGICAL(xxlorc, 0x8, 0x15, PPC2_VSX207),
GEN_XX3FORM(xxmrghw, 0x08, 0x02, PPC2_VSX),
GEN_XX3FORM(xxmrglw, 0x08, 0x06, PPC2_VSX),
-GEN_XX3FORM(xxperm, 0x08, 0x03, PPC2_ISA300),
-GEN_XX3FORM(xxpermr, 0x08, 0x07, PPC2_ISA300),
GEN_XX3FORM_DM(xxsldwi, 0x08, 0x00),
GEN_XX2FORM_EXT(xxextractuw, 0x0A, 0x0A, PPC2_ISA300),
GEN_XX2FORM_EXT(xxinsertw, 0x0A, 0x0B, PPC2_ISA300),
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 22/37] target/ppc: Move xxpermdi to decodetree
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (20 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 21/37] target/ppc: move xxperm/xxpermr " matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-10 12:34 ` [PATCH v3 23/37] target/ppc: Implement xxpermx instruction matheus.ferst
` (14 subsequent siblings)
36 siblings, 0 replies; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 4 ++
target/ppc/translate/vsx-impl.c.inc | 71 +++++++++++++----------------
target/ppc/translate/vsx-ops.c.inc | 2 -
3 files changed, 36 insertions(+), 41 deletions(-)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 39fdcb943c..8f0627c529 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -155,6 +155,9 @@
&XX3 xt xa xb
@XX3 ...... ..... ..... ..... ........ ... &XX3 xt=%xx_xt xa=%xx_xa xb=%xx_xb
+&XX3_dm xt xa xb dm
+@XX3_dm ...... ..... ..... ..... . dm:2 ..... ... &XX3_dm xt=%xx_xt xa=%xx_xa xb=%xx_xb
+
&XX4 xt xa xb xc
@XX4 ...... ..... ..... ..... ..... .. .... &XX4 xt=%xx_xt xa=%xx_xa xb=%xx_xb xc=%xx_xc
@@ -546,6 +549,7 @@ XXSPLTW 111100 ..... ---.. ..... 010100100 . . @XX2
XXPERM 111100 ..... ..... ..... 00011010 ... @XX3
XXPERMR 111100 ..... ..... ..... 00111010 ... @XX3
+XXPERMDI 111100 ..... ..... ..... 0 .. 01010 ... @XX3_dm
XXSEL 111100 ..... ..... ..... ..... 11 .... @XX4
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 43e52ee3f8..fcee930024 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -665,45 +665,6 @@ static void gen_mtvsrws(DisasContext *ctx)
#endif
-static void gen_xxpermdi(DisasContext *ctx)
-{
- TCGv_i64 xh, xl;
-
- if (unlikely(!ctx->vsx_enabled)) {
- gen_exception(ctx, POWERPC_EXCP_VSXU);
- return;
- }
-
- xh = tcg_temp_new_i64();
- xl = tcg_temp_new_i64();
-
- if (unlikely((xT(ctx->opcode) == xA(ctx->opcode)) ||
- (xT(ctx->opcode) == xB(ctx->opcode)))) {
- get_cpu_vsr(xh, xA(ctx->opcode), (DM(ctx->opcode) & 2) == 0);
- get_cpu_vsr(xl, xB(ctx->opcode), (DM(ctx->opcode) & 1) == 0);
-
- set_cpu_vsr(xT(ctx->opcode), xh, true);
- set_cpu_vsr(xT(ctx->opcode), xl, false);
- } else {
- if ((DM(ctx->opcode) & 2) == 0) {
- get_cpu_vsr(xh, xA(ctx->opcode), true);
- set_cpu_vsr(xT(ctx->opcode), xh, true);
- } else {
- get_cpu_vsr(xh, xA(ctx->opcode), false);
- set_cpu_vsr(xT(ctx->opcode), xh, true);
- }
- if ((DM(ctx->opcode) & 1) == 0) {
- get_cpu_vsr(xl, xB(ctx->opcode), true);
- set_cpu_vsr(xT(ctx->opcode), xl, false);
- } else {
- get_cpu_vsr(xl, xB(ctx->opcode), false);
- set_cpu_vsr(xT(ctx->opcode), xl, false);
- }
- }
- tcg_temp_free_i64(xh);
- tcg_temp_free_i64(xl);
-}
-
#define OP_ABS 1
#define OP_NABS 2
#define OP_NEG 3
@@ -1239,6 +1200,38 @@ static bool trans_XXPERMR(DisasContext *ctx, arg_XX3 *a)
return true;
}
+static bool trans_XXPERMDI(DisasContext *ctx, arg_XX3_dm *a)
+{
+ TCGv_i64 t0, t1;
+
+ REQUIRE_INSNS_FLAGS2(ctx, VSX);
+ REQUIRE_VSX(ctx);
+
+ t0 = tcg_temp_new_i64();
+
+ if (unlikely(a->xt == a->xa || a->xt == a->xb)) {
+ t1 = tcg_temp_new_i64();
+
+ get_cpu_vsr(t0, a->xa, (a->dm & 2) == 0);
+ get_cpu_vsr(t1, a->xb, (a->dm & 1) == 0);
+
+ set_cpu_vsr(a->xt, t0, true);
+ set_cpu_vsr(a->xt, t1, false);
+
+ tcg_temp_free_i64(t1);
+ } else {
+ get_cpu_vsr(t0, a->xa, (a->dm & 2) == 0);
+ set_cpu_vsr(a->xt, t0, true);
+
+ get_cpu_vsr(t0, a->xb, (a->dm & 1) == 0);
+ set_cpu_vsr(a->xt, t0, false);
+ }
+
+ tcg_temp_free_i64(t0);
+
+ return true;
+}
+
#define GEN_VSX_HELPER_VSX_MADD(name, op1, aop, mop, inval, type) \
static void gen_##name(DisasContext *ctx) \
{ \
diff --git a/target/ppc/translate/vsx-ops.c.inc b/target/ppc/translate/vsx-ops.c.inc
index 86ed1a996a..0a6b2b31ac 100644
--- a/target/ppc/translate/vsx-ops.c.inc
+++ b/target/ppc/translate/vsx-ops.c.inc
@@ -344,5 +344,3 @@ GEN_XX3FORM(xxmrglw, 0x08, 0x06, PPC2_VSX),
GEN_XX3FORM_DM(xxsldwi, 0x08, 0x00),
GEN_XX2FORM_EXT(xxextractuw, 0x0A, 0x0A, PPC2_ISA300),
GEN_XX2FORM_EXT(xxinsertw, 0x0A, 0x0B, PPC2_ISA300),
-
-GEN_XX3FORM_DM(xxpermdi, 0x08, 0x01),
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 23/37] target/ppc: Implement xxpermx instruction
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (21 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 22/37] target/ppc: Move xxpermdi " matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-10 12:34 ` [PATCH v3 24/37] tcg/tcg-op-gvec.c: Introduce tcg_gen_gvec_4i matheus.ferst
` (13 subsequent siblings)
36 siblings, 0 replies; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/helper.h | 1 +
target/ppc/insn64.decode | 8 ++++++++
target/ppc/int_helper.c | 20 ++++++++++++++++++++
target/ppc/translate/vsx-impl.c.inc | 22 ++++++++++++++++++++++
4 files changed, 51 insertions(+)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index bedd2896d3..2a4cd89643 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -502,6 +502,7 @@ DEF_HELPER_3(xvrspim, void, env, vsr, vsr)
DEF_HELPER_3(xvrspip, void, env, vsr, vsr)
DEF_HELPER_3(xvrspiz, void, env, vsr, vsr)
DEF_HELPER_4(xxextractuw, void, env, vsr, vsr, i32)
+DEF_HELPER_5(XXPERMX, void, vsr, vsr, vsr, vsr, tl)
DEF_HELPER_4(xxinsertw, void, env, vsr, vsr, i32)
DEF_HELPER_3(xvxsigsp, void, env, vsr, vsr)
DEF_HELPER_5(XXBLENDVB, void, vsr, vsr, vsr, vsr, i32)
diff --git a/target/ppc/insn64.decode b/target/ppc/insn64.decode
index 9e4f531fb9..0963e064b1 100644
--- a/target/ppc/insn64.decode
+++ b/target/ppc/insn64.decode
@@ -54,6 +54,11 @@
...... ..... ..... ..... ..... .. .... \
&8RR_XX4 xt=%8rr_xx_xt xa=%8rr_xx_xa xb=%8rr_xx_xb xc=%8rr_xx_xc
+&8RR_XX4_uim3 xt xa xb xc uim3
+@8RR_XX4_uim3 ...... .. .... .. ............... uim3:3 \
+ ...... ..... ..... ..... ..... .. .... \
+ &8RR_XX4_uim3 xt=%8rr_xx_xt xa=%8rr_xx_xa xb=%8rr_xx_xb xc=%8rr_xx_xc
+
### Fixed-Point Load Instructions
PLBZ 000001 10 0--.-- .................. \
@@ -194,3 +199,6 @@ XXBLENDVH 000001 01 0000 -- ------------------ \
100001 ..... ..... ..... ..... 01 .... @8RR_XX4
XXBLENDVB 000001 01 0000 -- ------------------ \
100001 ..... ..... ..... ..... 00 .... @8RR_XX4
+
+XXPERMX 000001 01 0000 -- --------------- ... \
+ 100010 ..... ..... ..... ..... 00 .... @8RR_XX4_uim3
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 3c604a5c09..27739400e4 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1153,6 +1153,26 @@ void helper_VMULHUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t desc)
mulu64(&discard, &r->u64[1], a->u64[1], b->u64[1]);
}
+void helper_XXPERMX(ppc_vsr_t *t, ppc_vsr_t *s0, ppc_vsr_t *s1, ppc_vsr_t *pcv,
+ target_ulong uim)
+{
+ int i, idx;
+ ppc_vsr_t tmp = { .u64 = {0, 0} };
+
+ for (i = 0; i < ARRAY_SIZE(t->u8); i++) {
+ if ((pcv->VsrB(i) >> 5) == uim) {
+ idx = pcv->VsrB(i) & 0x1f;
+ if (idx < ARRAY_SIZE(t->u8)) {
+ tmp.VsrB(i) = s0->VsrB(idx);
+ } else {
+ tmp.VsrB(i) = s1->VsrB(idx - ARRAY_SIZE(t->u8));
+ }
+ }
+ }
+
+ *t = tmp;
+}
+
void helper_VPERM(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
{
ppc_avr_t result;
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index fcee930024..2ad913ae9b 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1232,6 +1232,28 @@ static bool trans_XXPERMDI(DisasContext *ctx, arg_XX3_dm *a)
return true;
}
+static bool trans_XXPERMX(DisasContext *ctx, arg_8RR_XX4_uim3 *a)
+{
+ TCGv_ptr xt, xa, xb, xc;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VSX(ctx);
+
+ xt = gen_vsr_ptr(a->xt);
+ xa = gen_vsr_ptr(a->xa);
+ xb = gen_vsr_ptr(a->xb);
+ xc = gen_vsr_ptr(a->xc);
+
+ gen_helper_XXPERMX(xt, xa, xb, xc, tcg_constant_tl(a->uim3));
+
+ tcg_temp_free_ptr(xt);
+ tcg_temp_free_ptr(xa);
+ tcg_temp_free_ptr(xb);
+ tcg_temp_free_ptr(xc);
+
+ return true;
+}
+
#define GEN_VSX_HELPER_VSX_MADD(name, op1, aop, mop, inval, type) \
static void gen_##name(DisasContext *ctx) \
{ \
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 24/37] tcg/tcg-op-gvec.c: Introduce tcg_gen_gvec_4i
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (22 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 23/37] target/ppc: Implement xxpermx instruction matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-10 12:34 ` [PATCH v3 25/37] target/ppc: Implement xxeval matheus.ferst
` (12 subsequent siblings)
36 siblings, 0 replies; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Following the implementation of tcg_gen_gvec_3i, add a four-vector and
immediate operand expansion method.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
include/tcg/tcg-op-gvec.h | 22 ++++++
tcg/tcg-op-gvec.c | 146 ++++++++++++++++++++++++++++++++++++++
2 files changed, 168 insertions(+)
diff --git a/include/tcg/tcg-op-gvec.h b/include/tcg/tcg-op-gvec.h
index da55fed870..28cafbcc5c 100644
--- a/include/tcg/tcg-op-gvec.h
+++ b/include/tcg/tcg-op-gvec.h
@@ -218,6 +218,25 @@ typedef struct {
bool write_aofs;
} GVecGen4;
+typedef struct {
+ /*
+ * Expand inline as a 64-bit or 32-bit integer. Only one of these will be
+ * non-NULL.
+ */
+ void (*fni8)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64, int64_t);
+ void (*fni4)(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32, int32_t);
+ /* Expand inline with a host vector type. */
+ void (*fniv)(unsigned, TCGv_vec, TCGv_vec, TCGv_vec, TCGv_vec, int64_t);
+ /* Expand out-of-line helper w/descriptor, data in descriptor. */
+ gen_helper_gvec_4 *fno;
+ /* The optional opcodes, if any, utilized by .fniv. */
+ const TCGOpcode *opt_opc;
+ /* The vector element size, if applicable. */
+ uint8_t vece;
+ /* Prefer i64 to v64. */
+ bool prefer_i64;
+} GVecGen4i;
+
void tcg_gen_gvec_2(uint32_t dofs, uint32_t aofs,
uint32_t oprsz, uint32_t maxsz, const GVecGen2 *);
void tcg_gen_gvec_2i(uint32_t dofs, uint32_t aofs, uint32_t oprsz,
@@ -231,6 +250,9 @@ void tcg_gen_gvec_3i(uint32_t dofs, uint32_t aofs, uint32_t bofs,
const GVecGen3i *);
void tcg_gen_gvec_4(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t cofs,
uint32_t oprsz, uint32_t maxsz, const GVecGen4 *);
+void tcg_gen_gvec_4i(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t cofs,
+ uint32_t oprsz, uint32_t maxsz, int64_t c,
+ const GVecGen4i *);
/* Expand a specific vector operation. */
diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index ffe55e908f..079a761b04 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -836,6 +836,30 @@ static void expand_4_i32(uint32_t dofs, uint32_t aofs, uint32_t bofs,
tcg_temp_free_i32(t0);
}
+static void expand_4i_i32(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+ uint32_t cofs, uint32_t oprsz, int32_t c,
+ void (*fni)(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32,
+ int32_t))
+{
+ TCGv_i32 t0 = tcg_temp_new_i32();
+ TCGv_i32 t1 = tcg_temp_new_i32();
+ TCGv_i32 t2 = tcg_temp_new_i32();
+ TCGv_i32 t3 = tcg_temp_new_i32();
+ uint32_t i;
+
+ for (i = 0; i < oprsz; i += 4) {
+ tcg_gen_ld_i32(t1, cpu_env, aofs + i);
+ tcg_gen_ld_i32(t2, cpu_env, bofs + i);
+ tcg_gen_ld_i32(t3, cpu_env, cofs + i);
+ fni(t0, t1, t2, t3, c);
+ tcg_gen_st_i32(t0, cpu_env, dofs + i);
+ }
+ tcg_temp_free_i32(t3);
+ tcg_temp_free_i32(t2);
+ tcg_temp_free_i32(t1);
+ tcg_temp_free_i32(t0);
+}
+
/* Expand OPSZ bytes worth of two-operand operations using i64 elements. */
static void expand_2_i64(uint32_t dofs, uint32_t aofs, uint32_t oprsz,
bool load_dest, void (*fni)(TCGv_i64, TCGv_i64))
@@ -971,6 +995,30 @@ static void expand_4_i64(uint32_t dofs, uint32_t aofs, uint32_t bofs,
tcg_temp_free_i64(t0);
}
+static void expand_4i_i64(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+ uint32_t cofs, uint32_t oprsz, int64_t c,
+ void (*fni)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64,
+ int64_t))
+{
+ TCGv_i64 t0 = tcg_temp_new_i64();
+ TCGv_i64 t1 = tcg_temp_new_i64();
+ TCGv_i64 t2 = tcg_temp_new_i64();
+ TCGv_i64 t3 = tcg_temp_new_i64();
+ uint32_t i;
+
+ for (i = 0; i < oprsz; i += 8) {
+ tcg_gen_ld_i64(t1, cpu_env, aofs + i);
+ tcg_gen_ld_i64(t2, cpu_env, bofs + i);
+ tcg_gen_ld_i64(t3, cpu_env, cofs + i);
+ fni(t0, t1, t2, t3, c);
+ tcg_gen_st_i64(t0, cpu_env, dofs + i);
+ }
+ tcg_temp_free_i64(t3);
+ tcg_temp_free_i64(t2);
+ tcg_temp_free_i64(t1);
+ tcg_temp_free_i64(t0);
+}
+
/* Expand OPSZ bytes worth of two-operand operations using host vectors. */
static void expand_2_vec(unsigned vece, uint32_t dofs, uint32_t aofs,
uint32_t oprsz, uint32_t tysz, TCGType type,
@@ -1121,6 +1169,35 @@ static void expand_4_vec(unsigned vece, uint32_t dofs, uint32_t aofs,
tcg_temp_free_vec(t0);
}
+/*
+ * Expand OPSZ bytes worth of four-vector operands and an immediate operand
+ * using host vectors.
+ */
+static void expand_4i_vec(unsigned vece, uint32_t dofs, uint32_t aofs,
+ uint32_t bofs, uint32_t cofs, uint32_t oprsz,
+ uint32_t tysz, TCGType type, int64_t c,
+ void (*fni)(unsigned, TCGv_vec, TCGv_vec,
+ TCGv_vec, TCGv_vec, int64_t))
+{
+ TCGv_vec t0 = tcg_temp_new_vec(type);
+ TCGv_vec t1 = tcg_temp_new_vec(type);
+ TCGv_vec t2 = tcg_temp_new_vec(type);
+ TCGv_vec t3 = tcg_temp_new_vec(type);
+ uint32_t i;
+
+ for (i = 0; i < oprsz; i += tysz) {
+ tcg_gen_ld_vec(t1, cpu_env, aofs + i);
+ tcg_gen_ld_vec(t2, cpu_env, bofs + i);
+ tcg_gen_ld_vec(t3, cpu_env, cofs + i);
+ fni(vece, t0, t1, t2, t3, c);
+ tcg_gen_st_vec(t0, cpu_env, dofs + i);
+ }
+ tcg_temp_free_vec(t3);
+ tcg_temp_free_vec(t2);
+ tcg_temp_free_vec(t1);
+ tcg_temp_free_vec(t0);
+}
+
/* Expand a vector two-operand operation. */
void tcg_gen_gvec_2(uint32_t dofs, uint32_t aofs,
uint32_t oprsz, uint32_t maxsz, const GVecGen2 *g)
@@ -1533,6 +1610,75 @@ void tcg_gen_gvec_4(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t cofs,
}
}
+/* Expand a vector four-operand operation. */
+void tcg_gen_gvec_4i(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t cofs,
+ uint32_t oprsz, uint32_t maxsz, int64_t c,
+ const GVecGen4i *g)
+{
+ const TCGOpcode *this_list = g->opt_opc ? : vecop_list_empty;
+ const TCGOpcode *hold_list = tcg_swap_vecop_list(this_list);
+ TCGType type;
+ uint32_t some;
+
+ check_size_align(oprsz, maxsz, dofs | aofs | bofs | cofs);
+ check_overlap_4(dofs, aofs, bofs, cofs, maxsz);
+
+ type = 0;
+ if (g->fniv) {
+ type = choose_vector_type(g->opt_opc, g->vece, oprsz, g->prefer_i64);
+ }
+ switch (type) {
+ case TCG_TYPE_V256:
+ /*
+ * Recall that ARM SVE allows vector sizes that are not a
+ * power of 2, but always a multiple of 16. The intent is
+ * that e.g. size == 80 would be expanded with 2x32 + 1x16.
+ */
+ some = QEMU_ALIGN_DOWN(oprsz, 32);
+ expand_4i_vec(g->vece, dofs, aofs, bofs, cofs, some,
+ 32, TCG_TYPE_V256, c, g->fniv);
+ if (some == oprsz) {
+ break;
+ }
+ dofs += some;
+ aofs += some;
+ bofs += some;
+ cofs += some;
+ oprsz -= some;
+ maxsz -= some;
+ /* fallthru */
+ case TCG_TYPE_V128:
+ expand_4i_vec(g->vece, dofs, aofs, bofs, cofs, oprsz,
+ 16, TCG_TYPE_V128, c, g->fniv);
+ break;
+ case TCG_TYPE_V64:
+ expand_4i_vec(g->vece, dofs, aofs, bofs, cofs, oprsz,
+ 8, TCG_TYPE_V64, c, g->fniv);
+ break;
+
+ case 0:
+ if (g->fni8 && check_size_impl(oprsz, 8)) {
+ expand_4i_i64(dofs, aofs, bofs, cofs, oprsz, c, g->fni8);
+ } else if (g->fni4 && check_size_impl(oprsz, 4)) {
+ expand_4i_i32(dofs, aofs, bofs, cofs, oprsz, c, g->fni4);
+ } else {
+ assert(g->fno != NULL);
+ tcg_gen_gvec_4_ool(dofs, aofs, bofs, cofs,
+ oprsz, maxsz, c, g->fno);
+ oprsz = maxsz;
+ }
+ break;
+
+ default:
+ g_assert_not_reached();
+ }
+ tcg_swap_vecop_list(hold_list);
+
+ if (oprsz < maxsz) {
+ expand_clr(dofs + oprsz, maxsz - oprsz);
+ }
+}
+
/*
* Expand specific vector operations.
*/
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 25/37] target/ppc: Implement xxeval
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (23 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 24/37] tcg/tcg-op-gvec.c: Introduce tcg_gen_gvec_4i matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-10 12:34 ` [PATCH v3 26/37] target/ppc: Implement xxgenpcv[bhwd]m instruction matheus.ferst
` (11 subsequent siblings)
36 siblings, 0 replies; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/helper.h | 1 +
target/ppc/insn64.decode | 8 ++
target/ppc/int_helper.c | 42 ++++++++++
target/ppc/translate/vsx-impl.c.inc | 121 ++++++++++++++++++++++++++++
4 files changed, 172 insertions(+)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 2a4cd89643..4cf8e0747d 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -505,6 +505,7 @@ DEF_HELPER_4(xxextractuw, void, env, vsr, vsr, i32)
DEF_HELPER_5(XXPERMX, void, vsr, vsr, vsr, vsr, tl)
DEF_HELPER_4(xxinsertw, void, env, vsr, vsr, i32)
DEF_HELPER_3(xvxsigsp, void, env, vsr, vsr)
+DEF_HELPER_5(XXEVAL, void, vsr, vsr, vsr, vsr, i32)
DEF_HELPER_5(XXBLENDVB, void, vsr, vsr, vsr, vsr, i32)
DEF_HELPER_5(XXBLENDVH, void, vsr, vsr, vsr, vsr, i32)
DEF_HELPER_5(XXBLENDVW, void, vsr, vsr, vsr, vsr, i32)
diff --git a/target/ppc/insn64.decode b/target/ppc/insn64.decode
index 0963e064b1..fdb859f62d 100644
--- a/target/ppc/insn64.decode
+++ b/target/ppc/insn64.decode
@@ -54,6 +54,11 @@
...... ..... ..... ..... ..... .. .... \
&8RR_XX4 xt=%8rr_xx_xt xa=%8rr_xx_xa xb=%8rr_xx_xb xc=%8rr_xx_xc
+&8RR_XX4_imm xt xa xb xc imm
+@8RR_XX4_imm ........ ........ ........ imm:8 \
+ ...... ..... ..... ..... ..... .. .... \
+ &8RR_XX4_imm xt=%8rr_xx_xt xa=%8rr_xx_xa xb=%8rr_xx_xb xc=%8rr_xx_xc
+
&8RR_XX4_uim3 xt xa xb xc uim3
@8RR_XX4_uim3 ...... .. .... .. ............... uim3:3 \
...... ..... ..... ..... ..... .. .... \
@@ -184,6 +189,9 @@ PLXVP 000001 00 0--.-- .................. \
PSTXVP 000001 00 0--.-- .................. \
111110 ..... ..... ................ @8LS_D_TSXP
+XXEVAL 000001 01 0000 -- ---------- ........ \
+ 100010 ..... ..... ..... ..... 01 .... @8RR_XX4_imm
+
XXSPLTIDP 000001 01 0000 -- -- ................ \
100000 ..... 0010 . ................ @8RR_D
XXSPLTIW 000001 01 0000 -- -- ................ \
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 27739400e4..7e26d68f53 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -28,6 +28,7 @@
#include "fpu/softfloat.h"
#include "qapi/error.h"
#include "qemu/guest-random.h"
+#include "tcg/tcg-gvec-desc.h"
#include "helper_regs.h"
/*****************************************************************************/
@@ -1714,6 +1715,47 @@ void helper_xxinsertw(CPUPPCState *env, ppc_vsr_t *xt,
*xt = t;
}
+void helper_XXEVAL(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c,
+ uint32_t desc)
+{
+ /*
+ * Instead of processing imm bit-by-bit, we'll skip the computation of
+ * conjunctions whose corresponding bit is unset.
+ */
+ int bit, imm = simd_data(desc);
+ Int128 conj, disj = int128_zero();
+
+ /* Iterate over set bits from the least to the most significant bit */
+ while (imm) {
+ /*
+ * Get the next bit to be processed with ctz64. Invert the result of
+ * ctz64 to match the indexing used by PowerISA.
+ */
+ bit = 7 - ctzl(imm);
+ if (bit & 0x4) {
+ conj = a->s128;
+ } else {
+ conj = int128_not(a->s128);
+ }
+ if (bit & 0x2) {
+ conj = int128_and(conj, b->s128);
+ } else {
+ conj = int128_and(conj, int128_not(b->s128));
+ }
+ if (bit & 0x1) {
+ conj = int128_and(conj, c->s128);
+ } else {
+ conj = int128_and(conj, int128_not(c->s128));
+ }
+ disj = int128_or(disj, conj);
+
+ /* Unset the least significant bit that is set */
+ imm &= imm - 1;
+ }
+
+ t->s128 = disj;
+}
+
#define XXBLEND(name, sz) \
void glue(helper_XXBLENDV, name)(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b, \
ppc_avr_t *c, uint32_t desc) \
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 2ad913ae9b..662b08b465 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -2165,6 +2165,127 @@ TRANS64_FLAGS2(ISA310, PLXV, do_lstxv_PLS_D, false, false)
TRANS64_FLAGS2(ISA310, PSTXVP, do_lstxv_PLS_D, true, true)
TRANS64_FLAGS2(ISA310, PLXVP, do_lstxv_PLS_D, false, true)
+static void gen_xxeval_i64(TCGv_i64 t, TCGv_i64 a, TCGv_i64 b, TCGv_i64 c,
+ int64_t imm)
+{
+ /*
+ * Instead of processing imm bit-by-bit, we'll skip the computation of
+ * conjunctions whose corresponding bit is unset.
+ */
+ int bit;
+ TCGv_i64 conj, disj;
+
+ conj = tcg_temp_new_i64();
+ disj = tcg_temp_new_i64();
+
+ tcg_gen_movi_i64(disj, 0);
+
+ /* Iterate over set bits from the least to the most significant bit */
+ while (imm) {
+ /*
+ * Get the next bit to be processed with ctz64. Invert the result of
+ * ctz64 to match the indexing used by PowerISA.
+ */
+ bit = 7 - ctz64(imm);
+ if (bit & 0x4) {
+ tcg_gen_mov_i64(conj, a);
+ } else {
+ tcg_gen_not_i64(conj, a);
+ }
+ if (bit & 0x2) {
+ tcg_gen_and_i64(conj, conj, b);
+ } else {
+ tcg_gen_andc_i64(conj, conj, b);
+ }
+ if (bit & 0x1) {
+ tcg_gen_and_i64(conj, conj, c);
+ } else {
+ tcg_gen_andc_i64(conj, conj, c);
+ }
+ tcg_gen_or_i64(disj, disj, conj);
+
+ /* Unset the least significant bit that is set */
+ imm &= imm - 1;
+ }
+
+ tcg_gen_mov_i64(t, disj);
+
+ tcg_temp_free_i64(conj);
+ tcg_temp_free_i64(disj);
+}
+
+static void gen_xxeval_vec(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b,
+ TCGv_vec c, int64_t imm)
+{
+ /*
+ * Instead of processing imm bit-by-bit, we'll skip the computation of
+ * conjunctions whose corresponding bit is unset.
+ */
+ int bit;
+ TCGv_vec disj, conj;
+
+ disj = tcg_temp_new_vec_matching(t);
+ conj = tcg_temp_new_vec_matching(t);
+
+ tcg_gen_dupi_vec(vece, disj, 0);
+
+ /* Iterate over set bits from the least to the most significant bit */
+ while (imm) {
+ /*
+ * Get the next bit to be processed with ctz64. Invert the result of
+ * ctz64 to match the indexing used by PowerISA.
+ */
+ bit = 7 - ctz64(imm);
+ if (bit & 0x4) {
+ tcg_gen_mov_vec(conj, a);
+ } else {
+ tcg_gen_not_vec(vece, conj, a);
+ }
+ if (bit & 0x2) {
+ tcg_gen_and_vec(vece, conj, conj, b);
+ } else {
+ tcg_gen_andc_vec(vece, conj, conj, b);
+ }
+ if (bit & 0x1) {
+ tcg_gen_and_vec(vece, conj, conj, c);
+ } else {
+ tcg_gen_andc_vec(vece, conj, conj, c);
+ }
+ tcg_gen_or_vec(vece, disj, disj, conj);
+
+ /* Unset the least significant bit that is set */
+ imm &= imm - 1;
+ }
+
+ tcg_gen_mov_vec(t, disj);
+
+ tcg_temp_free_vec(disj);
+ tcg_temp_free_vec(conj);
+}
+
+static bool trans_XXEVAL(DisasContext *ctx, arg_8RR_XX4_imm *a)
+{
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VSX(ctx);
+
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_andc_vec, 0
+ };
+ static const GVecGen4i op = {
+ .fniv = gen_xxeval_vec,
+ .fno = gen_helper_XXEVAL,
+ .fni8 = gen_xxeval_i64,
+ .opt_opc = vecop_list,
+ .vece = MO_64
+ };
+
+ tcg_gen_gvec_4i(vsr_full_offset(a->xt), vsr_full_offset(a->xa),
+ vsr_full_offset(a->xb), vsr_full_offset(a->xc),
+ 16, 16, a->imm, &op);
+
+ return true;
+}
+
static void gen_xxblendv_vec(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b,
TCGv_vec c)
{
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 26/37] target/ppc: Implement xxgenpcv[bhwd]m instruction
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (24 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 25/37] target/ppc: Implement xxeval matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-10 12:34 ` [PATCH v3 27/37] target/ppc: move xs[n]madd[am][ds]p/xs[n]msub[am][ds]p to decodetree matheus.ferst
` (10 subsequent siblings)
36 siblings, 0 replies; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/helper.h | 4 ++
target/ppc/insn32.decode | 10 ++++
target/ppc/int_helper.c | 84 +++++++++++++++++++++++++++++
target/ppc/translate/vsx-impl.c.inc | 29 ++++++++++
4 files changed, 127 insertions(+)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 4cf8e0747d..78025d597e 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -501,6 +501,10 @@ DEF_HELPER_3(xvrspic, void, env, vsr, vsr)
DEF_HELPER_3(xvrspim, void, env, vsr, vsr)
DEF_HELPER_3(xvrspip, void, env, vsr, vsr)
DEF_HELPER_3(xvrspiz, void, env, vsr, vsr)
+DEF_HELPER_3(XXGENPCVBM, void, vsr, avr, tl)
+DEF_HELPER_3(XXGENPCVHM, void, vsr, avr, tl)
+DEF_HELPER_3(XXGENPCVWM, void, vsr, avr, tl)
+DEF_HELPER_3(XXGENPCVDM, void, vsr, avr, tl)
DEF_HELPER_4(xxextractuw, void, env, vsr, vsr, i32)
DEF_HELPER_5(XXPERMX, void, vsr, vsr, vsr, vsr, tl)
DEF_HELPER_4(xxinsertw, void, env, vsr, vsr, i32)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 8f0627c529..ced6b273ca 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -119,6 +119,9 @@
@X_bfl ...... bf:3 - l:1 ra:5 rb:5 ..........- &X_bfl
%x_xt 0:1 21:5
+&X_imm5 xt imm:uint8_t vrb
+@X_imm5 ...... ..... imm:5 vrb:5 .......... . &X_imm5 xt=%x_xt
+
&X_imm8 xt imm:uint8_t
@X_imm8 ...... ..... .. imm:8 .......... . &X_imm8 xt=%x_xt
@@ -553,6 +556,13 @@ XXPERMDI 111100 ..... ..... ..... 0 .. 01010 ... @XX3_dm
XXSEL 111100 ..... ..... ..... ..... 11 .... @XX4
+## VSX Vector Generate PCV
+
+XXGENPCVBM 111100 ..... ..... ..... 1110010100 . @X_imm5
+XXGENPCVHM 111100 ..... ..... ..... 1110010101 . @X_imm5
+XXGENPCVWM 111100 ..... ..... ..... 1110110100 . @X_imm5
+XXGENPCVDM 111100 ..... ..... ..... 1110110101 . @X_imm5
+
## VSX Vector Load Special Value Instruction
LXVKQ 111100 ..... 11111 ..... 0101101000 . @X_uim5
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 7e26d68f53..2fa37a91c9 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1210,6 +1210,90 @@ void helper_VPERMR(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
*r = result;
}
+#define XXGENPCV(NAME, SZ) \
+void helper_##NAME(ppc_vsr_t *t, ppc_vsr_t *b, target_ulong imm) \
+{ \
+ ppc_vsr_t tmp = { .u64 = { 0, 0 } }; \
+ \
+ switch (imm) { \
+ case 0b00000: /* Big-Endian expansion */ \
+ /* Initialize tmp with the result of an all-zeros mask */ \
+ tmp.VsrD(0) = 0x1011121314151617; \
+ tmp.VsrD(1) = 0x18191A1B1C1D1E1F; \
+ \
+ /* Iterate over the most significant byte of each element */ \
+ for (int i = 0, j = 0; i < ARRAY_SIZE(b->u8); i += SZ) { \
+ if (b->VsrB(i) & 0x80) { \
+ /* Update each byte of the element */ \
+ for (int k = 0; k < SZ; k++) { \
+ tmp.VsrB(i + k) = j + k; \
+ } \
+ j += SZ; \
+ } \
+ } \
+ \
+ break; \
+ case 0b00001: /* Big-Endian compression */ \
+ /* Iterate over the most significant byte of each element */ \
+ for (int i = 0, j = 0; i < ARRAY_SIZE(b->u8); i += SZ) { \
+ if (b->VsrB(i) & 0x80) { \
+ /* Update each byte of the element */ \
+ for (int k = 0; k < SZ; k++) { \
+ tmp.VsrB(j + k) = i + k; \
+ } \
+ j += SZ; \
+ } \
+ } \
+ \
+ break; \
+ case 0b00010: /* Little-Endian expansion */ \
+ /* Initialize tmp with the result of an all-zeros mask */ \
+ tmp.VsrD(0) = 0x1F1E1D1C1B1A1918; \
+ tmp.VsrD(1) = 0x1716151413121110; \
+ \
+ /* Iterate over the most significant byte of each element */ \
+ for (int i = 0, j = 0; i < ARRAY_SIZE(b->u8); i += SZ) { \
+ /* Reverse indexing of "i" */ \
+ const int idx = ARRAY_SIZE(b->u8) - i - SZ; \
+ if (b->VsrB(idx) & 0x80) { \
+ /* Update each byte of the element */ \
+ for (int k = 0, rk = SZ - 1; k < SZ; k++, rk--) { \
+ tmp.VsrB(idx + rk) = j + k; \
+ } \
+ j += SZ; \
+ } \
+ } \
+ \
+ break; \
+ case 0b00011: /* Little-Endian compression */ \
+ /* Iterate over the most significant byte of each element */ \
+ for (int i = 0, j = 0; i < ARRAY_SIZE(b->u8); i += SZ) { \
+ if (b->VsrB(ARRAY_SIZE(b->u8) - i - SZ) & 0x80) { \
+ /* Update each byte of the element */ \
+ for (int k = 0, rk = SZ - 1; k < SZ; k++, rk--) { \
+ /* Reverse indexing of "j" */ \
+ const int idx = ARRAY_SIZE(b->u8) - j - SZ; \
+ tmp.VsrB(idx + rk) = i + k; \
+ } \
+ j += SZ; \
+ } \
+ } \
+ \
+ break; \
+ default: \
+ /* Translation code validates IMM before calling this helper */ \
+ g_assert_not_reached(); \
+ break; \
+ } \
+ \
+ *t = tmp; \
+}
+XXGENPCV(XXGENPCVBM, 1)
+XXGENPCV(XXGENPCVHM, 2)
+XXGENPCV(XXGENPCVWM, 4)
+XXGENPCV(XXGENPCVDM, 8)
+#undef XXGENPCV
+
#if defined(HOST_WORDS_BIGENDIAN)
#define VBPERMQ_INDEX(avr, i) ((avr)->u8[(i)])
#define VBPERMD_INDEX(i) (i)
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 662b08b465..22f48938cb 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1254,6 +1254,35 @@ static bool trans_XXPERMX(DisasContext *ctx, arg_8RR_XX4_uim3 *a)
return true;
}
+static bool do_xxgenpcv(DisasContext *ctx, arg_X_imm5 *a,
+ void (*gen_helper)(TCGv_ptr, TCGv_ptr, TCGv))
+{
+ TCGv_ptr xt, vrb;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VSX(ctx);
+
+ if (a->imm & ~0x3) {
+ gen_invalid(ctx);
+ return true;
+ }
+
+ xt = gen_vsr_ptr(a->xt);
+ vrb = gen_avr_ptr(a->vrb);
+
+ gen_helper(xt, vrb, tcg_constant_tl(a->imm));
+
+ tcg_temp_free_ptr(xt);
+ tcg_temp_free_ptr(vrb);
+
+ return true;
+}
+
+TRANS(XXGENPCVBM, do_xxgenpcv, gen_helper_XXGENPCVBM)
+TRANS(XXGENPCVHM, do_xxgenpcv, gen_helper_XXGENPCVHM)
+TRANS(XXGENPCVWM, do_xxgenpcv, gen_helper_XXGENPCVWM)
+TRANS(XXGENPCVDM, do_xxgenpcv, gen_helper_XXGENPCVDM)
+
#define GEN_VSX_HELPER_VSX_MADD(name, op1, aop, mop, inval, type) \
static void gen_##name(DisasContext *ctx) \
{ \
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 27/37] target/ppc: move xs[n]madd[am][ds]p/xs[n]msub[am][ds]p to decodetree
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (25 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 26/37] target/ppc: Implement xxgenpcv[bhwd]m instruction matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-10 12:34 ` [PATCH v3 28/37] target/ppc: implement xs[n]maddqp[o]/xs[n]msubqp[o] matheus.ferst
` (9 subsequent siblings)
36 siblings, 0 replies; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/fpu_helper.c | 23 ++++++------
target/ppc/helper.h | 16 ++++-----
target/ppc/insn32.decode | 22 ++++++++++++
target/ppc/translate/vsx-impl.c.inc | 56 ++++++++++++++++++++++++-----
target/ppc/translate/vsx-ops.c.inc | 16 ---------
5 files changed, 90 insertions(+), 43 deletions(-)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index f9c7a4dc44..2b7766ddb6 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2156,10 +2156,11 @@ VSX_TSQRT(xvtsqrtsp, 4, float32, VsrW(i), -126, 23)
* maddflgs - flags for the float*muladd routine that control the
* various forms (madd, msub, nmadd, nmsub)
* sfprf - set FPRF
+ * r2sp - round intermediate double precision result to single precision
*/
#define VSX_MADD(op, nels, tp, fld, maddflgs, sfprf, r2sp) \
void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
- ppc_vsr_t *xa, ppc_vsr_t *b, ppc_vsr_t *c) \
+ ppc_vsr_t *s1, ppc_vsr_t *s2, ppc_vsr_t *s3) \
{ \
ppc_vsr_t t = *xt; \
int i; \
@@ -2175,12 +2176,12 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
* result to odd. \
*/ \
set_float_rounding_mode(float_round_to_zero, &tstat); \
- t.fld = tp##_muladd(xa->fld, b->fld, c->fld, \
+ t.fld = tp##_muladd(s1->fld, s3->fld, s2->fld, \
maddflgs, &tstat); \
t.fld |= (get_float_exception_flags(&tstat) & \
float_flag_inexact) != 0; \
} else { \
- t.fld = tp##_muladd(xa->fld, b->fld, c->fld, \
+ t.fld = tp##_muladd(s1->fld, s3->fld, s2->fld, \
maddflgs, &tstat); \
} \
env->fp_status.float_exception_flags |= tstat.float_exception_flags; \
@@ -2202,14 +2203,14 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
do_float_check_status(env, GETPC()); \
}
-VSX_MADD(xsmadddp, 1, float64, VsrD(0), MADD_FLGS, 1, 0)
-VSX_MADD(xsmsubdp, 1, float64, VsrD(0), MSUB_FLGS, 1, 0)
-VSX_MADD(xsnmadddp, 1, float64, VsrD(0), NMADD_FLGS, 1, 0)
-VSX_MADD(xsnmsubdp, 1, float64, VsrD(0), NMSUB_FLGS, 1, 0)
-VSX_MADD(xsmaddsp, 1, float64, VsrD(0), MADD_FLGS, 1, 1)
-VSX_MADD(xsmsubsp, 1, float64, VsrD(0), MSUB_FLGS, 1, 1)
-VSX_MADD(xsnmaddsp, 1, float64, VsrD(0), NMADD_FLGS, 1, 1)
-VSX_MADD(xsnmsubsp, 1, float64, VsrD(0), NMSUB_FLGS, 1, 1)
+VSX_MADD(XSMADDDP, 1, float64, VsrD(0), MADD_FLGS, 1, 0)
+VSX_MADD(XSMSUBDP, 1, float64, VsrD(0), MSUB_FLGS, 1, 0)
+VSX_MADD(XSNMADDDP, 1, float64, VsrD(0), NMADD_FLGS, 1, 0)
+VSX_MADD(XSNMSUBDP, 1, float64, VsrD(0), NMSUB_FLGS, 1, 0)
+VSX_MADD(XSMADDSP, 1, float64, VsrD(0), MADD_FLGS, 1, 1)
+VSX_MADD(XSMSUBSP, 1, float64, VsrD(0), MSUB_FLGS, 1, 1)
+VSX_MADD(XSNMADDSP, 1, float64, VsrD(0), NMADD_FLGS, 1, 1)
+VSX_MADD(XSNMSUBSP, 1, float64, VsrD(0), NMSUB_FLGS, 1, 1)
VSX_MADD(xvmadddp, 2, float64, VsrD(i), MADD_FLGS, 0, 0)
VSX_MADD(xvmsubdp, 2, float64, VsrD(i), MSUB_FLGS, 0, 0)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 78025d597e..502af2fc29 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -362,10 +362,10 @@ DEF_HELPER_3(xssqrtdp, void, env, vsr, vsr)
DEF_HELPER_3(xsrsqrtedp, void, env, vsr, vsr)
DEF_HELPER_4(xstdivdp, void, env, i32, vsr, vsr)
DEF_HELPER_3(xstsqrtdp, void, env, i32, vsr)
-DEF_HELPER_5(xsmadddp, void, env, vsr, vsr, vsr, vsr)
-DEF_HELPER_5(xsmsubdp, void, env, vsr, vsr, vsr, vsr)
-DEF_HELPER_5(xsnmadddp, void, env, vsr, vsr, vsr, vsr)
-DEF_HELPER_5(xsnmsubdp, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSMADDDP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSMSUBDP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSNMADDDP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSNMSUBDP, void, env, vsr, vsr, vsr, vsr)
DEF_HELPER_4(xscmpeqdp, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xscmpgtdp, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xscmpgedp, void, env, vsr, vsr, vsr)
@@ -425,10 +425,10 @@ DEF_HELPER_3(xsresp, void, env, vsr, vsr)
DEF_HELPER_2(xsrsp, i64, env, i64)
DEF_HELPER_3(xssqrtsp, void, env, vsr, vsr)
DEF_HELPER_3(xsrsqrtesp, void, env, vsr, vsr)
-DEF_HELPER_5(xsmaddsp, void, env, vsr, vsr, vsr, vsr)
-DEF_HELPER_5(xsmsubsp, void, env, vsr, vsr, vsr, vsr)
-DEF_HELPER_5(xsnmaddsp, void, env, vsr, vsr, vsr, vsr)
-DEF_HELPER_5(xsnmsubsp, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSMADDSP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSMSUBSP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSNMADDSP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSNMSUBSP, void, env, vsr, vsr, vsr, vsr)
DEF_HELPER_4(xvadddp, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xvsubdp, void, env, vsr, vsr, vsr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index ced6b273ca..84bc8b1168 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -543,6 +543,28 @@ STXVX 011111 ..... ..... ..... 0110001100 . @X_TSX
LXVPX 011111 ..... ..... ..... 0101001101 - @X_TSXP
STXVPX 011111 ..... ..... ..... 0111001101 - @X_TSXP
+## VSX Scalar Multiply-Add Instructions
+
+XSMADDADP 111100 ..... ..... ..... 00100001 . . . @XX3
+XSMADDMDP 111100 ..... ..... ..... 00101001 . . . @XX3
+XSMADDASP 111100 ..... ..... ..... 00000001 . . . @XX3
+XSMADDMSP 111100 ..... ..... ..... 00001001 . . . @XX3
+
+XSMSUBADP 111100 ..... ..... ..... 00110001 . . . @XX3
+XSMSUBMDP 111100 ..... ..... ..... 00111001 . . . @XX3
+XSMSUBASP 111100 ..... ..... ..... 00010001 . . . @XX3
+XSMSUBMSP 111100 ..... ..... ..... 00011001 . . . @XX3
+
+XSNMADDASP 111100 ..... ..... ..... 10000001 . . . @XX3
+XSNMADDMSP 111100 ..... ..... ..... 10001001 . . . @XX3
+XSNMADDADP 111100 ..... ..... ..... 10100001 . . . @XX3
+XSNMADDMDP 111100 ..... ..... ..... 10101001 . . . @XX3
+
+XSNMSUBASP 111100 ..... ..... ..... 10010001 . . . @XX3
+XSNMSUBMSP 111100 ..... ..... ..... 10011001 . . . @XX3
+XSNMSUBADP 111100 ..... ..... ..... 10110001 . . . @XX3
+XSNMSUBMDP 111100 ..... ..... ..... 10111001 . . . @XX3
+
## VSX splat instruction
XXSPLTIB 111100 ..... 00 ........ 0101101000 . @X_imm8
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 22f48938cb..be91b8d053 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1283,6 +1283,54 @@ TRANS(XXGENPCVHM, do_xxgenpcv, gen_helper_XXGENPCVHM)
TRANS(XXGENPCVWM, do_xxgenpcv, gen_helper_XXGENPCVWM)
TRANS(XXGENPCVDM, do_xxgenpcv, gen_helper_XXGENPCVDM)
+static bool do_xsmadd(DisasContext *ctx, int tgt, int src1, int src2, int src3,
+ void (gen_helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
+{
+ TCGv_ptr t, s1, s2, s3;
+
+ t = gen_vsr_ptr(tgt);
+ s1 = gen_vsr_ptr(src1);
+ s2 = gen_vsr_ptr(src2);
+ s3 = gen_vsr_ptr(src3);
+
+ gen_helper(cpu_env, t, s1, s2, s3);
+
+ tcg_temp_free_ptr(t);
+ tcg_temp_free_ptr(s1);
+ tcg_temp_free_ptr(s2);
+ tcg_temp_free_ptr(s3);
+
+ return true;
+}
+
+static bool do_xsmadd_XX3(DisasContext *ctx, arg_XX3 *a, bool type_a,
+ void (gen_helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
+{
+ REQUIRE_VSX(ctx);
+
+ if (type_a) {
+ return do_xsmadd(ctx, a->xt, a->xa, a->xt, a->xb, gen_helper);
+ }
+ return do_xsmadd(ctx, a->xt, a->xa, a->xb, a->xt, gen_helper);
+}
+
+TRANS_FLAGS2(VSX, XSMADDADP, do_xsmadd_XX3, true, gen_helper_XSMADDDP)
+TRANS_FLAGS2(VSX, XSMADDMDP, do_xsmadd_XX3, false, gen_helper_XSMADDDP)
+TRANS_FLAGS2(VSX, XSMSUBADP, do_xsmadd_XX3, true, gen_helper_XSMSUBDP)
+TRANS_FLAGS2(VSX, XSMSUBMDP, do_xsmadd_XX3, false, gen_helper_XSMSUBDP)
+TRANS_FLAGS2(VSX, XSNMADDADP, do_xsmadd_XX3, true, gen_helper_XSNMADDDP)
+TRANS_FLAGS2(VSX, XSNMADDMDP, do_xsmadd_XX3, false, gen_helper_XSNMADDDP)
+TRANS_FLAGS2(VSX, XSNMSUBADP, do_xsmadd_XX3, true, gen_helper_XSNMSUBDP)
+TRANS_FLAGS2(VSX, XSNMSUBMDP, do_xsmadd_XX3, false, gen_helper_XSNMSUBDP)
+TRANS_FLAGS2(VSX207, XSMADDASP, do_xsmadd_XX3, true, gen_helper_XSMADDSP)
+TRANS_FLAGS2(VSX207, XSMADDMSP, do_xsmadd_XX3, false, gen_helper_XSMADDSP)
+TRANS_FLAGS2(VSX207, XSMSUBASP, do_xsmadd_XX3, true, gen_helper_XSMSUBSP)
+TRANS_FLAGS2(VSX207, XSMSUBMSP, do_xsmadd_XX3, false, gen_helper_XSMSUBSP)
+TRANS_FLAGS2(VSX207, XSNMADDASP, do_xsmadd_XX3, true, gen_helper_XSNMADDSP)
+TRANS_FLAGS2(VSX207, XSNMADDMSP, do_xsmadd_XX3, false, gen_helper_XSNMADDSP)
+TRANS_FLAGS2(VSX207, XSNMSUBASP, do_xsmadd_XX3, true, gen_helper_XSNMSUBSP)
+TRANS_FLAGS2(VSX207, XSNMSUBMSP, do_xsmadd_XX3, false, gen_helper_XSNMSUBSP)
+
#define GEN_VSX_HELPER_VSX_MADD(name, op1, aop, mop, inval, type) \
static void gen_##name(DisasContext *ctx) \
{ \
@@ -1313,14 +1361,6 @@ static void gen_##name(DisasContext *ctx) \
tcg_temp_free_ptr(c); \
}
-GEN_VSX_HELPER_VSX_MADD(xsmadddp, 0x04, 0x04, 0x05, 0, PPC2_VSX)
-GEN_VSX_HELPER_VSX_MADD(xsmsubdp, 0x04, 0x06, 0x07, 0, PPC2_VSX)
-GEN_VSX_HELPER_VSX_MADD(xsnmadddp, 0x04, 0x14, 0x15, 0, PPC2_VSX)
-GEN_VSX_HELPER_VSX_MADD(xsnmsubdp, 0x04, 0x16, 0x17, 0, PPC2_VSX)
-GEN_VSX_HELPER_VSX_MADD(xsmaddsp, 0x04, 0x00, 0x01, 0, PPC2_VSX207)
-GEN_VSX_HELPER_VSX_MADD(xsmsubsp, 0x04, 0x02, 0x03, 0, PPC2_VSX207)
-GEN_VSX_HELPER_VSX_MADD(xsnmaddsp, 0x04, 0x10, 0x11, 0, PPC2_VSX207)
-GEN_VSX_HELPER_VSX_MADD(xsnmsubsp, 0x04, 0x12, 0x13, 0, PPC2_VSX207)
GEN_VSX_HELPER_VSX_MADD(xvmadddp, 0x04, 0x0C, 0x0D, 0, PPC2_VSX)
GEN_VSX_HELPER_VSX_MADD(xvmsubdp, 0x04, 0x0E, 0x0F, 0, PPC2_VSX)
GEN_VSX_HELPER_VSX_MADD(xvnmadddp, 0x04, 0x1C, 0x1D, 0, PPC2_VSX)
diff --git a/target/ppc/translate/vsx-ops.c.inc b/target/ppc/translate/vsx-ops.c.inc
index 0a6b2b31ac..9cfec53df0 100644
--- a/target/ppc/translate/vsx-ops.c.inc
+++ b/target/ppc/translate/vsx-ops.c.inc
@@ -186,14 +186,6 @@ GEN_XX2FORM(xssqrtdp, 0x16, 0x04, PPC2_VSX),
GEN_XX2FORM(xsrsqrtedp, 0x14, 0x04, PPC2_VSX),
GEN_XX3FORM(xstdivdp, 0x14, 0x07, PPC2_VSX),
GEN_XX2FORM(xstsqrtdp, 0x14, 0x06, PPC2_VSX),
-GEN_XX3FORM_NAME(xsmadddp, "xsmaddadp", 0x04, 0x04, PPC2_VSX),
-GEN_XX3FORM_NAME(xsmadddp, "xsmaddmdp", 0x04, 0x05, PPC2_VSX),
-GEN_XX3FORM_NAME(xsmsubdp, "xsmsubadp", 0x04, 0x06, PPC2_VSX),
-GEN_XX3FORM_NAME(xsmsubdp, "xsmsubmdp", 0x04, 0x07, PPC2_VSX),
-GEN_XX3FORM_NAME(xsnmadddp, "xsnmaddadp", 0x04, 0x14, PPC2_VSX),
-GEN_XX3FORM_NAME(xsnmadddp, "xsnmaddmdp", 0x04, 0x15, PPC2_VSX),
-GEN_XX3FORM_NAME(xsnmsubdp, "xsnmsubadp", 0x04, 0x16, PPC2_VSX),
-GEN_XX3FORM_NAME(xsnmsubdp, "xsnmsubmdp", 0x04, 0x17, PPC2_VSX),
GEN_XX3FORM(xscmpeqdp, 0x0C, 0x00, PPC2_ISA300),
GEN_XX3FORM(xscmpgtdp, 0x0C, 0x01, PPC2_ISA300),
GEN_XX3FORM(xscmpgedp, 0x0C, 0x02, PPC2_ISA300),
@@ -235,14 +227,6 @@ GEN_XX2FORM(xsresp, 0x14, 0x01, PPC2_VSX207),
GEN_XX2FORM(xsrsp, 0x12, 0x11, PPC2_VSX207),
GEN_XX2FORM(xssqrtsp, 0x16, 0x00, PPC2_VSX207),
GEN_XX2FORM(xsrsqrtesp, 0x14, 0x00, PPC2_VSX207),
-GEN_XX3FORM_NAME(xsmaddsp, "xsmaddasp", 0x04, 0x00, PPC2_VSX207),
-GEN_XX3FORM_NAME(xsmaddsp, "xsmaddmsp", 0x04, 0x01, PPC2_VSX207),
-GEN_XX3FORM_NAME(xsmsubsp, "xsmsubasp", 0x04, 0x02, PPC2_VSX207),
-GEN_XX3FORM_NAME(xsmsubsp, "xsmsubmsp", 0x04, 0x03, PPC2_VSX207),
-GEN_XX3FORM_NAME(xsnmaddsp, "xsnmaddasp", 0x04, 0x10, PPC2_VSX207),
-GEN_XX3FORM_NAME(xsnmaddsp, "xsnmaddmsp", 0x04, 0x11, PPC2_VSX207),
-GEN_XX3FORM_NAME(xsnmsubsp, "xsnmsubasp", 0x04, 0x12, PPC2_VSX207),
-GEN_XX3FORM_NAME(xsnmsubsp, "xsnmsubmsp", 0x04, 0x13, PPC2_VSX207),
GEN_XX2FORM(xscvsxdsp, 0x10, 0x13, PPC2_VSX207),
GEN_XX2FORM(xscvuxdsp, 0x10, 0x12, PPC2_VSX207),
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 28/37] target/ppc: implement xs[n]maddqp[o]/xs[n]msubqp[o]
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (26 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 27/37] target/ppc: move xs[n]madd[am][ds]p/xs[n]msub[am][ds]p to decodetree matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-10 12:34 ` [PATCH v3 29/37] target/ppc: Implement xvtlsbb instruction matheus.ferst
` (8 subsequent siblings)
36 siblings, 0 replies; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Implement the following PowerISA v3.0 instuctions:
xsmaddqp[o]: VSX Scalar Multiply-Add Quad-Precision [using round to Odd]
xsmsubqp[o]: VSX Scalar Multiply-Subtract Quad-Precision [using round
to Odd]
xsnmaddqp[o]: VSX Scalar Negative Multiply-Add Quad-Precision [using
round to Odd]
xsnmsubqp[o]: VSX Scalar Negative Multiply-Subtract Quad-Precision
[using round to Odd]
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/fpu_helper.c | 42 +++++++++++++++++++++++++++++
target/ppc/helper.h | 9 +++++++
target/ppc/insn32.decode | 4 +++
target/ppc/translate/vsx-impl.c.inc | 25 +++++++++++++++++
4 files changed, 80 insertions(+)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 2b7766ddb6..cd4e07ed5b 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2222,6 +2222,48 @@ VSX_MADD(xvmsubsp, 4, float32, VsrW(i), MSUB_FLGS, 0, 0)
VSX_MADD(xvnmaddsp, 4, float32, VsrW(i), NMADD_FLGS, 0, 0)
VSX_MADD(xvnmsubsp, 4, float32, VsrW(i), NMSUB_FLGS, 0, 0)
+/*
+ * VSX_MADDQ - VSX floating point quad-precision muliply/add
+ * op - instruction mnemonic
+ * maddflgs - flags for the float*muladd routine that control the
+ * various forms (madd, msub, nmadd, nmsub)
+ * ro - round to odd
+ */
+#define VSX_MADDQ(op, maddflgs, ro) \
+void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *s1, ppc_vsr_t *s2,\
+ ppc_vsr_t *s3) \
+{ \
+ ppc_vsr_t t = *xt; \
+ \
+ helper_reset_fpstatus(env); \
+ \
+ float_status tstat = env->fp_status; \
+ set_float_exception_flags(0, &tstat); \
+ if (ro) { \
+ tstat.float_rounding_mode = float_round_to_odd; \
+ } \
+ t.f128 = float128_muladd(s1->f128, s3->f128, s2->f128, maddflgs, &tstat); \
+ env->fp_status.float_exception_flags |= tstat.float_exception_flags; \
+ \
+ if (unlikely(tstat.float_exception_flags & float_flag_invalid)) { \
+ float_invalid_op_madd(env, tstat.float_exception_flags, \
+ false, GETPC()); \
+ } \
+ \
+ helper_compute_fprf_float128(env, t.f128); \
+ *xt = t; \
+ do_float_check_status(env, GETPC()); \
+}
+
+VSX_MADDQ(XSMADDQP, MADD_FLGS, 0)
+VSX_MADDQ(XSMADDQPO, MADD_FLGS, 1)
+VSX_MADDQ(XSMSUBQP, MSUB_FLGS, 0)
+VSX_MADDQ(XSMSUBQPO, MSUB_FLGS, 1)
+VSX_MADDQ(XSNMADDQP, NMADD_FLGS, 0)
+VSX_MADDQ(XSNMADDQPO, NMADD_FLGS, 1)
+VSX_MADDQ(XSNMSUBQP, NMSUB_FLGS, 0)
+VSX_MADDQ(XSNMSUBQPO, NMSUB_FLGS, 0)
+
/*
* VSX_SCALAR_CMP_DP - VSX scalar floating point compare double precision
* op - instruction mnemonic
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 502af2fc29..b4eef14511 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -430,6 +430,15 @@ DEF_HELPER_5(XSMSUBSP, void, env, vsr, vsr, vsr, vsr)
DEF_HELPER_5(XSNMADDSP, void, env, vsr, vsr, vsr, vsr)
DEF_HELPER_5(XSNMSUBSP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSMADDQP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSMADDQPO, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSMSUBQP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSMSUBQPO, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSNMADDQP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSNMADDQPO, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSNMSUBQP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSNMSUBQPO, void, env, vsr, vsr, vsr, vsr)
+
DEF_HELPER_4(xvadddp, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xvsubdp, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xvmuldp, void, env, vsr, vsr, vsr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 84bc8b1168..c16b990a00 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -549,21 +549,25 @@ XSMADDADP 111100 ..... ..... ..... 00100001 . . . @XX3
XSMADDMDP 111100 ..... ..... ..... 00101001 . . . @XX3
XSMADDASP 111100 ..... ..... ..... 00000001 . . . @XX3
XSMADDMSP 111100 ..... ..... ..... 00001001 . . . @XX3
+XSMADDQP 111111 ..... ..... ..... 0110000100 . @X_rc
XSMSUBADP 111100 ..... ..... ..... 00110001 . . . @XX3
XSMSUBMDP 111100 ..... ..... ..... 00111001 . . . @XX3
XSMSUBASP 111100 ..... ..... ..... 00010001 . . . @XX3
XSMSUBMSP 111100 ..... ..... ..... 00011001 . . . @XX3
+XSMSUBQP 111111 ..... ..... ..... 0110100100 . @X_rc
XSNMADDASP 111100 ..... ..... ..... 10000001 . . . @XX3
XSNMADDMSP 111100 ..... ..... ..... 10001001 . . . @XX3
XSNMADDADP 111100 ..... ..... ..... 10100001 . . . @XX3
XSNMADDMDP 111100 ..... ..... ..... 10101001 . . . @XX3
+XSNMADDQP 111111 ..... ..... ..... 0111000100 . @X_rc
XSNMSUBASP 111100 ..... ..... ..... 10010001 . . . @XX3
XSNMSUBMSP 111100 ..... ..... ..... 10011001 . . . @XX3
XSNMSUBADP 111100 ..... ..... ..... 10110001 . . . @XX3
XSNMSUBMDP 111100 ..... ..... ..... 10111001 . . . @XX3
+XSNMSUBQP 111111 ..... ..... ..... 0111100100 . @X_rc
## VSX splat instruction
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index be91b8d053..7764b1e5c2 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1331,6 +1331,31 @@ TRANS_FLAGS2(VSX207, XSNMADDMSP, do_xsmadd_XX3, false, gen_helper_XSNMADDSP)
TRANS_FLAGS2(VSX207, XSNMSUBASP, do_xsmadd_XX3, true, gen_helper_XSNMSUBSP)
TRANS_FLAGS2(VSX207, XSNMSUBMSP, do_xsmadd_XX3, false, gen_helper_XSNMSUBSP)
+static bool do_xsmadd_X(DisasContext *ctx, arg_X_rc *a,
+ void (gen_helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr),
+ void (gen_helper_ro)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
+{
+ int vrt, vra, vrb;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+ REQUIRE_VSX(ctx);
+
+ vrt = a->rt + 32;
+ vra = a->ra + 32;
+ vrb = a->rb + 32;
+
+ if (a->rc) {
+ return do_xsmadd(ctx, vrt, vra, vrt, vrb, gen_helper_ro);
+ }
+
+ return do_xsmadd(ctx, vrt, vra, vrt, vrb, gen_helper);
+}
+
+TRANS(XSMADDQP, do_xsmadd_X, gen_helper_XSMADDQP, gen_helper_XSMADDQPO)
+TRANS(XSMSUBQP, do_xsmadd_X, gen_helper_XSMSUBQP, gen_helper_XSMSUBQPO)
+TRANS(XSNMADDQP, do_xsmadd_X, gen_helper_XSNMADDQP, gen_helper_XSNMADDQPO)
+TRANS(XSNMSUBQP, do_xsmadd_X, gen_helper_XSNMSUBQP, gen_helper_XSNMSUBQPO)
+
#define GEN_VSX_HELPER_VSX_MADD(name, op1, aop, mop, inval, type) \
static void gen_##name(DisasContext *ctx) \
{ \
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 29/37] target/ppc: Implement xvtlsbb instruction
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (27 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 28/37] target/ppc: implement xs[n]maddqp[o]/xs[n]msubqp[o] matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-10 12:34 ` [PATCH v3 30/37] target/ppc: Remove xscmpnedp instruction matheus.ferst
` (7 subsequent siblings)
36 siblings, 0 replies; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Víctor Colombo, clg,
Matheus Ferst, david
From: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 7 ++++++
target/ppc/translate/vsx-impl.c.inc | 37 +++++++++++++++++++++++++++++
2 files changed, 44 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index c16b990a00..c9cf865fdf 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -155,6 +155,9 @@
&XX2 xt xb uim:uint8_t
@XX2 ...... ..... ... uim:2 ..... ......... .. &XX2 xt=%xx_xt xb=%xx_xb
+&XX2_bf_xb bf xb
+@XX2_bf_xb ...... bf:3 .. ..... ..... ......... . . &XX2_bf_xb xb=%xx_xb
+
&XX3 xt xa xb
@XX3 ...... ..... ..... ..... ........ ... &XX3 xt=%xx_xt xa=%xx_xa xb=%xx_xb
@@ -604,6 +607,10 @@ XSMINJDP 111100 ..... ..... ..... 10011000 ... @XX3
XSCVQPDP 111111 ..... 10100 ..... 1101000100 . @X_tb_rc
+## VSX Vector Test Least-Significant Bit by Byte Instruction
+
+XVTLSBB 111100 ... -- 00010 ..... 111011011 . - @XX2_bf_xb
+
### rfebb
&XL_s s:uint8_t
@XL_s ......-------------- s:1 .......... - &XL_s
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 7764b1e5c2..5a0ec8e828 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1688,6 +1688,43 @@ static bool trans_LXVKQ(DisasContext *ctx, arg_X_uim5 *a)
return true;
}
+static bool trans_XVTLSBB(DisasContext *ctx, arg_XX2_bf_xb *a)
+{
+ TCGv_i64 xb, tmp, all_true, all_false, mask, zero;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VSX(ctx);
+
+ xb = tcg_temp_new_i64();
+ tmp = tcg_temp_new_i64();
+ all_true = tcg_const_i64(0b1000);
+ all_false = tcg_const_i64(0b0010);
+ mask = tcg_constant_i64(dup_const(MO_8, 1));
+ zero = tcg_constant_i64(0);
+
+ for (int dw = 0; dw < 2; dw++) {
+ get_cpu_vsr(xb, a->xb, dw);
+
+ tcg_gen_and_i64(tmp, mask, xb);
+ tcg_gen_movcond_i64(TCG_COND_EQ, all_true, tmp,
+ mask, all_true, zero);
+
+ tcg_gen_andc_i64(tmp, mask, xb);
+ tcg_gen_movcond_i64(TCG_COND_EQ, all_false, tmp,
+ mask, all_false, zero);
+ }
+
+ tcg_gen_or_i64(tmp, all_false, all_true);
+ tcg_gen_extrl_i64_i32(cpu_crf[a->bf], tmp);
+
+ tcg_temp_free_i64(xb);
+ tcg_temp_free_i64(tmp);
+ tcg_temp_free_i64(all_true);
+ tcg_temp_free_i64(all_false);
+
+ return true;
+}
+
static void gen_xxsldwi(DisasContext *ctx)
{
TCGv_i64 xth, xtl;
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 30/37] target/ppc: Remove xscmpnedp instruction
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (28 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 29/37] target/ppc: Implement xvtlsbb instruction matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-10 12:34 ` [PATCH v3 31/37] target/ppc: Refactor VSX_SCALAR_CMP_DP matheus.ferst
` (6 subsequent siblings)
36 siblings, 0 replies; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Víctor Colombo, clg,
Matheus Ferst, david
From: Víctor Colombo <victor.colombo@eldorado.org.br>
xscmpnedp was added in ISA v3.0 but removed in v3.0B. This patch
removes this instruction as it was not in the final version of v3.0.
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Acked-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/fpu_helper.c | 1 -
target/ppc/helper.h | 1 -
target/ppc/translate/vsx-impl.c.inc | 1 -
target/ppc/translate/vsx-ops.c.inc | 1 -
4 files changed, 4 deletions(-)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index cd4e07ed5b..6b0296525b 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2313,7 +2313,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
VSX_SCALAR_CMP_DP(xscmpeqdp, eq, 1, 0)
VSX_SCALAR_CMP_DP(xscmpgedp, le, 1, 1)
VSX_SCALAR_CMP_DP(xscmpgtdp, lt, 1, 1)
-VSX_SCALAR_CMP_DP(xscmpnedp, eq, 0, 0)
void helper_xscmpexpdp(CPUPPCState *env, uint32_t opcode,
ppc_vsr_t *xa, ppc_vsr_t *xb)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index b4eef14511..9e6b2af74b 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -369,7 +369,6 @@ DEF_HELPER_5(XSNMSUBDP, void, env, vsr, vsr, vsr, vsr)
DEF_HELPER_4(xscmpeqdp, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xscmpgtdp, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xscmpgedp, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xscmpnedp, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xscmpexpdp, void, env, i32, vsr, vsr)
DEF_HELPER_4(xscmpexpqp, void, env, i32, vsr, vsr)
DEF_HELPER_4(xscmpodp, void, env, i32, vsr, vsr)
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 5a0ec8e828..9be7bf0ffd 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1053,7 +1053,6 @@ GEN_VSX_HELPER_X1(xstsqrtdp, 0x14, 0x06, 0, PPC2_VSX)
GEN_VSX_HELPER_X3(xscmpeqdp, 0x0C, 0x00, 0, PPC2_ISA300)
GEN_VSX_HELPER_X3(xscmpgtdp, 0x0C, 0x01, 0, PPC2_ISA300)
GEN_VSX_HELPER_X3(xscmpgedp, 0x0C, 0x02, 0, PPC2_ISA300)
-GEN_VSX_HELPER_X3(xscmpnedp, 0x0C, 0x03, 0, PPC2_ISA300)
GEN_VSX_HELPER_X2_AB(xscmpexpdp, 0x0C, 0x07, 0, PPC2_ISA300)
GEN_VSX_HELPER_R2_AB(xscmpexpqp, 0x04, 0x05, 0, PPC2_ISA300)
GEN_VSX_HELPER_X2_AB(xscmpodp, 0x0C, 0x05, 0, PPC2_VSX)
diff --git a/target/ppc/translate/vsx-ops.c.inc b/target/ppc/translate/vsx-ops.c.inc
index 9cfec53df0..34310c1fb5 100644
--- a/target/ppc/translate/vsx-ops.c.inc
+++ b/target/ppc/translate/vsx-ops.c.inc
@@ -189,7 +189,6 @@ GEN_XX2FORM(xstsqrtdp, 0x14, 0x06, PPC2_VSX),
GEN_XX3FORM(xscmpeqdp, 0x0C, 0x00, PPC2_ISA300),
GEN_XX3FORM(xscmpgtdp, 0x0C, 0x01, PPC2_ISA300),
GEN_XX3FORM(xscmpgedp, 0x0C, 0x02, PPC2_ISA300),
-GEN_XX3FORM(xscmpnedp, 0x0C, 0x03, PPC2_ISA300),
GEN_XX3FORM(xscmpexpdp, 0x0C, 0x07, PPC2_ISA300),
GEN_VSX_XFORM_300(xscmpexpqp, 0x04, 0x05, 0x00600001),
GEN_XX2IFORM(xscmpodp, 0x0C, 0x05, PPC2_VSX),
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 31/37] target/ppc: Refactor VSX_SCALAR_CMP_DP
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (29 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 30/37] target/ppc: Remove xscmpnedp instruction matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-10 12:34 ` [PATCH v3 32/37] target/ppc: Implement xscmp{eq,ge,gt}qp matheus.ferst
` (5 subsequent siblings)
36 siblings, 0 replies; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Víctor Colombo, clg,
Matheus Ferst, david
From: Víctor Colombo <victor.colombo@eldorado.org.br>
Refactor VSX_SCALAR_CMP_DP, changing its name to VSX_SCALAR_CMP and
prepare the helper to be used for quadword comparisons.
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/fpu_helper.c | 31 ++++++++++++++-----------------
1 file changed, 14 insertions(+), 17 deletions(-)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 6b0296525b..4a73186155 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2265,28 +2265,30 @@ VSX_MADDQ(XSNMSUBQP, NMSUB_FLGS, 0)
VSX_MADDQ(XSNMSUBQPO, NMSUB_FLGS, 0)
/*
- * VSX_SCALAR_CMP_DP - VSX scalar floating point compare double precision
+ * VSX_SCALAR_CMP - VSX scalar floating point compare
* op - instruction mnemonic
+ * tp - type
* cmp - comparison operation
* exp - expected result of comparison
+ * fld - vsr_t field
* svxvc - set VXVC bit
*/
-#define VSX_SCALAR_CMP_DP(op, cmp, exp, svxvc) \
+#define VSX_SCALAR_CMP(op, tp, cmp, fld, exp, svxvc) \
void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
ppc_vsr_t *xa, ppc_vsr_t *xb) \
{ \
- ppc_vsr_t t = *xt; \
+ ppc_vsr_t t = { }; \
bool vxsnan_flag = false, vxvc_flag = false, vex_flag = false; \
\
- if (float64_is_signaling_nan(xa->VsrD(0), &env->fp_status) || \
- float64_is_signaling_nan(xb->VsrD(0), &env->fp_status)) { \
+ if (tp##_is_signaling_nan(xa->fld, &env->fp_status) || \
+ tp##_is_signaling_nan(xb->fld, &env->fp_status)) { \
vxsnan_flag = true; \
if (fpscr_ve == 0 && svxvc) { \
vxvc_flag = true; \
} \
} else if (svxvc) { \
- vxvc_flag = float64_is_quiet_nan(xa->VsrD(0), &env->fp_status) || \
- float64_is_quiet_nan(xb->VsrD(0), &env->fp_status); \
+ vxvc_flag = tp##_is_quiet_nan(xa->fld, &env->fp_status) || \
+ tp##_is_quiet_nan(xb->fld, &env->fp_status); \
} \
if (vxsnan_flag) { \
float_invalid_op_vxsnan(env, GETPC()); \
@@ -2297,22 +2299,17 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
vex_flag = fpscr_ve && (vxvc_flag || vxsnan_flag); \
\
if (!vex_flag) { \
- if (float64_##cmp(xb->VsrD(0), xa->VsrD(0), \
- &env->fp_status) == exp) { \
- t.VsrD(0) = -1; \
- t.VsrD(1) = 0; \
- } else { \
- t.VsrD(0) = 0; \
- t.VsrD(1) = 0; \
+ if (tp##_##cmp(xb->fld, xa->fld, &env->fp_status) == exp) { \
+ memset(&t.fld, 0xFF, sizeof(t.fld)); \
} \
} \
*xt = t; \
do_float_check_status(env, GETPC()); \
}
-VSX_SCALAR_CMP_DP(xscmpeqdp, eq, 1, 0)
-VSX_SCALAR_CMP_DP(xscmpgedp, le, 1, 1)
-VSX_SCALAR_CMP_DP(xscmpgtdp, lt, 1, 1)
+VSX_SCALAR_CMP(xscmpeqdp, float64, eq, VsrD(0), 1, 0)
+VSX_SCALAR_CMP(xscmpgedp, float64, le, VsrD(0), 1, 1)
+VSX_SCALAR_CMP(xscmpgtdp, float64, lt, VsrD(0), 1, 1)
void helper_xscmpexpdp(CPUPPCState *env, uint32_t opcode,
ppc_vsr_t *xa, ppc_vsr_t *xb)
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 32/37] target/ppc: Implement xscmp{eq,ge,gt}qp
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (30 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 31/37] target/ppc: Refactor VSX_SCALAR_CMP_DP matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-10 12:34 ` [PATCH v3 33/37] target/ppc: Move xscmp{eq,ge,gt}dp to decodetree matheus.ferst
` (4 subsequent siblings)
36 siblings, 0 replies; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Víctor Colombo, clg,
Matheus Ferst, david
From: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/fpu_helper.c | 4 ++++
target/ppc/helper.h | 3 +++
target/ppc/insn32.decode | 3 +++
target/ppc/translate/vsx-impl.c.inc | 31 +++++++++++++++++++++++++++++
4 files changed, 41 insertions(+)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 4a73186155..6d6c6cfa08 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2311,6 +2311,10 @@ VSX_SCALAR_CMP(xscmpeqdp, float64, eq, VsrD(0), 1, 0)
VSX_SCALAR_CMP(xscmpgedp, float64, le, VsrD(0), 1, 1)
VSX_SCALAR_CMP(xscmpgtdp, float64, lt, VsrD(0), 1, 1)
+VSX_SCALAR_CMP(XSCMPEQQP, float128, eq, f128, 1, 0)
+VSX_SCALAR_CMP(XSCMPGEQP, float128, le, f128, 1, 1)
+VSX_SCALAR_CMP(XSCMPGTQP, float128, lt, f128, 1, 1)
+
void helper_xscmpexpdp(CPUPPCState *env, uint32_t opcode,
ppc_vsr_t *xa, ppc_vsr_t *xb)
{
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 9e6b2af74b..6d78dffd24 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -369,6 +369,9 @@ DEF_HELPER_5(XSNMSUBDP, void, env, vsr, vsr, vsr, vsr)
DEF_HELPER_4(xscmpeqdp, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xscmpgtdp, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xscmpgedp, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSCMPEQQP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSCMPGTQP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSCMPGEQP, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xscmpexpdp, void, env, i32, vsr, vsr)
DEF_HELPER_4(xscmpexpqp, void, env, i32, vsr, vsr)
DEF_HELPER_4(xscmpodp, void, env, i32, vsr, vsr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index c9cf865fdf..6d986dba68 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -602,6 +602,9 @@ XSMAXCDP 111100 ..... ..... ..... 10000000 ... @XX3
XSMINCDP 111100 ..... ..... ..... 10001000 ... @XX3
XSMAXJDP 111100 ..... ..... ..... 10010000 ... @XX3
XSMINJDP 111100 ..... ..... ..... 10011000 ... @XX3
+XSCMPEQQP 111111 ..... ..... ..... 0001000100 - @X
+XSCMPGEQP 111111 ..... ..... ..... 0011000100 - @X
+XSCMPGTQP 111111 ..... ..... ..... 0011100100 - @X
## VSX Binary Floating-Point Convert Instructions
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 9be7bf0ffd..e1fea01a3a 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -2497,6 +2497,37 @@ TRANS(XSMINCDP, do_xsmaxmincjdp, gen_helper_xsmincdp)
TRANS(XSMAXJDP, do_xsmaxmincjdp, gen_helper_xsmaxjdp)
TRANS(XSMINJDP, do_xsmaxmincjdp, gen_helper_xsminjdp)
+static bool do_helper_X(arg_X *a,
+ void (*helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
+{
+ TCGv_ptr rt, ra, rb;
+
+ rt = gen_avr_ptr(a->rt);
+ ra = gen_avr_ptr(a->ra);
+ rb = gen_avr_ptr(a->rb);
+
+ helper(cpu_env, rt, ra, rb);
+
+ tcg_temp_free_ptr(rt);
+ tcg_temp_free_ptr(ra);
+ tcg_temp_free_ptr(rb);
+
+ return true;
+}
+
+static bool do_xscmpqp(DisasContext *ctx, arg_X *a,
+ void (*helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
+{
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VSX(ctx);
+
+ return do_helper_X(a, helper);
+}
+
+TRANS(XSCMPEQQP, do_xscmpqp, gen_helper_XSCMPEQQP)
+TRANS(XSCMPGEQP, do_xscmpqp, gen_helper_XSCMPGEQP)
+TRANS(XSCMPGTQP, do_xscmpqp, gen_helper_XSCMPGTQP)
+
#undef GEN_XX2FORM
#undef GEN_XX3FORM
#undef GEN_XX2IFORM
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 33/37] target/ppc: Move xscmp{eq,ge,gt}dp to decodetree
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (31 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 32/37] target/ppc: Implement xscmp{eq,ge,gt}qp matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-10 12:34 ` [PATCH v3 34/37] target/ppc: Move xs{max, min}[cj]dp to use do_helper_XX3 matheus.ferst
` (3 subsequent siblings)
36 siblings, 0 replies; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Víctor Colombo, clg,
Matheus Ferst, david
From: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/fpu_helper.c | 7 +++----
target/ppc/helper.h | 6 +++---
target/ppc/insn32.decode | 3 +++
target/ppc/translate/vsx-impl.c.inc | 28 +++++++++++++++++++++++++---
target/ppc/translate/vsx-ops.c.inc | 3 ---
5 files changed, 34 insertions(+), 13 deletions(-)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 6d6c6cfa08..8181e19e28 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2307,10 +2307,9 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
do_float_check_status(env, GETPC()); \
}
-VSX_SCALAR_CMP(xscmpeqdp, float64, eq, VsrD(0), 1, 0)
-VSX_SCALAR_CMP(xscmpgedp, float64, le, VsrD(0), 1, 1)
-VSX_SCALAR_CMP(xscmpgtdp, float64, lt, VsrD(0), 1, 1)
-
+VSX_SCALAR_CMP(XSCMPEQDP, float64, eq, VsrD(0), 1, 0)
+VSX_SCALAR_CMP(XSCMPGEDP, float64, le, VsrD(0), 1, 1)
+VSX_SCALAR_CMP(XSCMPGTDP, float64, lt, VsrD(0), 1, 1)
VSX_SCALAR_CMP(XSCMPEQQP, float128, eq, f128, 1, 0)
VSX_SCALAR_CMP(XSCMPGEQP, float128, le, f128, 1, 1)
VSX_SCALAR_CMP(XSCMPGTQP, float128, lt, f128, 1, 1)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 6d78dffd24..775dc74ef8 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -366,9 +366,9 @@ DEF_HELPER_5(XSMADDDP, void, env, vsr, vsr, vsr, vsr)
DEF_HELPER_5(XSMSUBDP, void, env, vsr, vsr, vsr, vsr)
DEF_HELPER_5(XSNMADDDP, void, env, vsr, vsr, vsr, vsr)
DEF_HELPER_5(XSNMSUBDP, void, env, vsr, vsr, vsr, vsr)
-DEF_HELPER_4(xscmpeqdp, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xscmpgtdp, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xscmpgedp, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSCMPEQDP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSCMPGTDP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSCMPGEDP, void, env, vsr, vsr, vsr)
DEF_HELPER_4(XSCMPEQQP, void, env, vsr, vsr, vsr)
DEF_HELPER_4(XSCMPGTQP, void, env, vsr, vsr, vsr)
DEF_HELPER_4(XSCMPGEQP, void, env, vsr, vsr, vsr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 6d986dba68..69aa80ba84 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -602,6 +602,9 @@ XSMAXCDP 111100 ..... ..... ..... 10000000 ... @XX3
XSMINCDP 111100 ..... ..... ..... 10001000 ... @XX3
XSMAXJDP 111100 ..... ..... ..... 10010000 ... @XX3
XSMINJDP 111100 ..... ..... ..... 10011000 ... @XX3
+XSCMPEQDP 111100 ..... ..... ..... 00000011 ... @XX3
+XSCMPGEDP 111100 ..... ..... ..... 00010011 ... @XX3
+XSCMPGTDP 111100 ..... ..... ..... 00001011 ... @XX3
XSCMPEQQP 111111 ..... ..... ..... 0001000100 - @X
XSCMPGEQP 111111 ..... ..... ..... 0011000100 - @X
XSCMPGTQP 111111 ..... ..... ..... 0011100100 - @X
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index e1fea01a3a..3ac41a6958 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1050,9 +1050,6 @@ GEN_VSX_HELPER_X2(xssqrtdp, 0x16, 0x04, 0, PPC2_VSX)
GEN_VSX_HELPER_X2(xsrsqrtedp, 0x14, 0x04, 0, PPC2_VSX)
GEN_VSX_HELPER_X2_AB(xstdivdp, 0x14, 0x07, 0, PPC2_VSX)
GEN_VSX_HELPER_X1(xstsqrtdp, 0x14, 0x06, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xscmpeqdp, 0x0C, 0x00, 0, PPC2_ISA300)
-GEN_VSX_HELPER_X3(xscmpgtdp, 0x0C, 0x01, 0, PPC2_ISA300)
-GEN_VSX_HELPER_X3(xscmpgedp, 0x0C, 0x02, 0, PPC2_ISA300)
GEN_VSX_HELPER_X2_AB(xscmpexpdp, 0x0C, 0x07, 0, PPC2_ISA300)
GEN_VSX_HELPER_R2_AB(xscmpexpqp, 0x04, 0x05, 0, PPC2_ISA300)
GEN_VSX_HELPER_X2_AB(xscmpodp, 0x0C, 0x05, 0, PPC2_VSX)
@@ -2471,6 +2468,31 @@ TRANS(XXBLENDVH, do_xxblendv, MO_16)
TRANS(XXBLENDVW, do_xxblendv, MO_32)
TRANS(XXBLENDVD, do_xxblendv, MO_64)
+static bool do_helper_XX3(DisasContext *ctx, arg_XX3 *a,
+ void (*helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
+{
+ TCGv_ptr xt, xa, xb;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+ REQUIRE_VSX(ctx);
+
+ xt = gen_vsr_ptr(a->xt);
+ xa = gen_vsr_ptr(a->xa);
+ xb = gen_vsr_ptr(a->xb);
+
+ helper(cpu_env, xt, xa, xb);
+
+ tcg_temp_free_ptr(xt);
+ tcg_temp_free_ptr(xa);
+ tcg_temp_free_ptr(xb);
+
+ return true;
+}
+
+TRANS(XSCMPEQDP, do_helper_XX3, gen_helper_XSCMPEQDP)
+TRANS(XSCMPGEDP, do_helper_XX3, gen_helper_XSCMPGEDP)
+TRANS(XSCMPGTDP, do_helper_XX3, gen_helper_XSCMPGTDP)
+
static bool do_xsmaxmincjdp(DisasContext *ctx, arg_XX3 *a,
void (*helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
{
diff --git a/target/ppc/translate/vsx-ops.c.inc b/target/ppc/translate/vsx-ops.c.inc
index 34310c1fb5..b8fd116728 100644
--- a/target/ppc/translate/vsx-ops.c.inc
+++ b/target/ppc/translate/vsx-ops.c.inc
@@ -186,9 +186,6 @@ GEN_XX2FORM(xssqrtdp, 0x16, 0x04, PPC2_VSX),
GEN_XX2FORM(xsrsqrtedp, 0x14, 0x04, PPC2_VSX),
GEN_XX3FORM(xstdivdp, 0x14, 0x07, PPC2_VSX),
GEN_XX2FORM(xstsqrtdp, 0x14, 0x06, PPC2_VSX),
-GEN_XX3FORM(xscmpeqdp, 0x0C, 0x00, PPC2_ISA300),
-GEN_XX3FORM(xscmpgtdp, 0x0C, 0x01, PPC2_ISA300),
-GEN_XX3FORM(xscmpgedp, 0x0C, 0x02, PPC2_ISA300),
GEN_XX3FORM(xscmpexpdp, 0x0C, 0x07, PPC2_ISA300),
GEN_VSX_XFORM_300(xscmpexpqp, 0x04, 0x05, 0x00600001),
GEN_XX2IFORM(xscmpodp, 0x0C, 0x05, PPC2_VSX),
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 34/37] target/ppc: Move xs{max, min}[cj]dp to use do_helper_XX3
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (32 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 33/37] target/ppc: Move xscmp{eq,ge,gt}dp to decodetree matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-10 12:34 ` [PATCH v3 35/37] target/ppc: Refactor VSX_MAX_MINC helper matheus.ferst
` (2 subsequent siblings)
36 siblings, 0 replies; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Víctor Colombo, clg,
Matheus Ferst, david
From: Víctor Colombo <victor.colombo@eldorado.org.br>
Also, fixes these instructions not being capitalized.
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/fpu_helper.c | 8 ++++----
target/ppc/helper.h | 8 ++++----
target/ppc/translate/vsx-impl.c.inc | 30 ++++-------------------------
3 files changed, 12 insertions(+), 34 deletions(-)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 8181e19e28..41f5748074 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2568,8 +2568,8 @@ void helper_##name(CPUPPCState *env, \
} \
} \
-VSX_MAX_MINC(xsmaxcdp, 1);
-VSX_MAX_MINC(xsmincdp, 0);
+VSX_MAX_MINC(XSMAXCDP, 1);
+VSX_MAX_MINC(XSMINCDP, 0);
#define VSX_MAX_MINJ(name, max) \
void helper_##name(CPUPPCState *env, \
@@ -2623,8 +2623,8 @@ void helper_##name(CPUPPCState *env, \
} \
} \
-VSX_MAX_MINJ(xsmaxjdp, 1);
-VSX_MAX_MINJ(xsminjdp, 0);
+VSX_MAX_MINJ(XSMAXJDP, 1);
+VSX_MAX_MINJ(XSMINJDP, 0);
/*
* VSX_CMP - VSX floating point compare
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 775dc74ef8..d67f14cd16 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -380,10 +380,10 @@ DEF_HELPER_4(xscmpoqp, void, env, i32, vsr, vsr)
DEF_HELPER_4(xscmpuqp, void, env, i32, vsr, vsr)
DEF_HELPER_4(xsmaxdp, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xsmindp, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xsmaxcdp, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xsmincdp, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xsmaxjdp, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xsminjdp, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSMAXCDP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSMINCDP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSMAXJDP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSMINJDP, void, env, vsr, vsr, vsr)
DEF_HELPER_3(xscvdphp, void, env, vsr, vsr)
DEF_HELPER_4(xscvdpqp, void, env, i32, vsr, vsr)
DEF_HELPER_3(xscvdpsp, void, env, vsr, vsr)
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 3ac41a6958..2e1267e8a3 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -2492,32 +2492,10 @@ static bool do_helper_XX3(DisasContext *ctx, arg_XX3 *a,
TRANS(XSCMPEQDP, do_helper_XX3, gen_helper_XSCMPEQDP)
TRANS(XSCMPGEDP, do_helper_XX3, gen_helper_XSCMPGEDP)
TRANS(XSCMPGTDP, do_helper_XX3, gen_helper_XSCMPGTDP)
-
-static bool do_xsmaxmincjdp(DisasContext *ctx, arg_XX3 *a,
- void (*helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
-{
- TCGv_ptr xt, xa, xb;
-
- REQUIRE_INSNS_FLAGS2(ctx, ISA300);
- REQUIRE_VSX(ctx);
-
- xt = gen_vsr_ptr(a->xt);
- xa = gen_vsr_ptr(a->xa);
- xb = gen_vsr_ptr(a->xb);
-
- helper(cpu_env, xt, xa, xb);
-
- tcg_temp_free_ptr(xt);
- tcg_temp_free_ptr(xa);
- tcg_temp_free_ptr(xb);
-
- return true;
-}
-
-TRANS(XSMAXCDP, do_xsmaxmincjdp, gen_helper_xsmaxcdp)
-TRANS(XSMINCDP, do_xsmaxmincjdp, gen_helper_xsmincdp)
-TRANS(XSMAXJDP, do_xsmaxmincjdp, gen_helper_xsmaxjdp)
-TRANS(XSMINJDP, do_xsmaxmincjdp, gen_helper_xsminjdp)
+TRANS(XSMAXCDP, do_helper_XX3, gen_helper_XSMAXCDP)
+TRANS(XSMINCDP, do_helper_XX3, gen_helper_XSMINCDP)
+TRANS(XSMAXJDP, do_helper_XX3, gen_helper_XSMAXJDP)
+TRANS(XSMINJDP, do_helper_XX3, gen_helper_XSMINJDP)
static bool do_helper_X(arg_X *a,
void (*helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 35/37] target/ppc: Refactor VSX_MAX_MINC helper
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (33 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 34/37] target/ppc: Move xs{max, min}[cj]dp to use do_helper_XX3 matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-10 12:34 ` [PATCH v3 36/37] target/ppc: Implement xs{max,min}cqp matheus.ferst
2022-02-10 12:34 ` [PATCH v3 37/37] target/ppc: Implement xvcvbf16spn and xvcvspbf16 instructions matheus.ferst
36 siblings, 0 replies; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Víctor Colombo, clg,
Matheus Ferst, david
From: Víctor Colombo <victor.colombo@eldorado.org.br>
Refactor xs{max,min}cdp VSX_MAX_MINC helper to prepare for
xs{max,min}cqp implementation.
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/fpu_helper.c | 23 +++++++++--------------
1 file changed, 9 insertions(+), 14 deletions(-)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 41f5748074..6ff5273269 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2536,27 +2536,22 @@ VSX_MAX_MIN(xsmindp, minnum, 1, float64, VsrD(0))
VSX_MAX_MIN(xvmindp, minnum, 2, float64, VsrD(i))
VSX_MAX_MIN(xvminsp, minnum, 4, float32, VsrW(i))
-#define VSX_MAX_MINC(name, max) \
+#define VSX_MAX_MINC(name, op, tp, fld) \
void helper_##name(CPUPPCState *env, \
ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb) \
{ \
ppc_vsr_t t = *xt; \
bool vxsnan_flag = false, vex_flag = false; \
\
- if (unlikely(float64_is_any_nan(xa->VsrD(0)) || \
- float64_is_any_nan(xb->VsrD(0)))) { \
- if (float64_is_signaling_nan(xa->VsrD(0), &env->fp_status) || \
- float64_is_signaling_nan(xb->VsrD(0), &env->fp_status)) { \
+ if (unlikely(tp##_is_any_nan(xa->fld) || \
+ tp##_is_any_nan(xb->fld))) { \
+ if (tp##_is_signaling_nan(xa->fld, &env->fp_status) || \
+ tp##_is_signaling_nan(xb->fld, &env->fp_status)) { \
vxsnan_flag = true; \
} \
- t.VsrD(0) = xb->VsrD(0); \
- } else if ((max && \
- !float64_lt(xa->VsrD(0), xb->VsrD(0), &env->fp_status)) || \
- (!max && \
- float64_lt(xa->VsrD(0), xb->VsrD(0), &env->fp_status))) { \
- t.VsrD(0) = xa->VsrD(0); \
+ t.fld = xb->fld; \
} else { \
- t.VsrD(0) = xb->VsrD(0); \
+ t.fld = tp##_##op(xa->fld, xb->fld, &env->fp_status); \
} \
\
vex_flag = fpscr_ve & vxsnan_flag; \
@@ -2568,8 +2563,8 @@ void helper_##name(CPUPPCState *env, \
} \
} \
-VSX_MAX_MINC(XSMAXCDP, 1);
-VSX_MAX_MINC(XSMINCDP, 0);
+VSX_MAX_MINC(XSMAXCDP, maxnum, float64, VsrD(0));
+VSX_MAX_MINC(XSMINCDP, minnum, float64, VsrD(0));
#define VSX_MAX_MINJ(name, max) \
void helper_##name(CPUPPCState *env, \
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 36/37] target/ppc: Implement xs{max,min}cqp
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (34 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 35/37] target/ppc: Refactor VSX_MAX_MINC helper matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
2022-02-10 12:34 ` [PATCH v3 37/37] target/ppc: Implement xvcvbf16spn and xvcvspbf16 instructions matheus.ferst
36 siblings, 0 replies; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Víctor Colombo, clg,
Matheus Ferst, david
From: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/fpu_helper.c | 2 ++
target/ppc/helper.h | 2 ++
target/ppc/insn32.decode | 3 +++
target/ppc/translate/vsx-impl.c.inc | 2 ++
4 files changed, 9 insertions(+)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 6ff5273269..c724fa8a8d 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2565,6 +2565,8 @@ void helper_##name(CPUPPCState *env, \
VSX_MAX_MINC(XSMAXCDP, maxnum, float64, VsrD(0));
VSX_MAX_MINC(XSMINCDP, minnum, float64, VsrD(0));
+VSX_MAX_MINC(XSMAXCQP, maxnum, float128, f128);
+VSX_MAX_MINC(XSMINCQP, minnum, float128, f128);
#define VSX_MAX_MINJ(name, max) \
void helper_##name(CPUPPCState *env, \
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index d67f14cd16..9403f43134 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -384,6 +384,8 @@ DEF_HELPER_4(XSMAXCDP, void, env, vsr, vsr, vsr)
DEF_HELPER_4(XSMINCDP, void, env, vsr, vsr, vsr)
DEF_HELPER_4(XSMAXJDP, void, env, vsr, vsr, vsr)
DEF_HELPER_4(XSMINJDP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSMAXCQP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSMINCQP, void, env, vsr, vsr, vsr)
DEF_HELPER_3(xscvdphp, void, env, vsr, vsr)
DEF_HELPER_4(xscvdpqp, void, env, i32, vsr, vsr)
DEF_HELPER_3(xscvdpsp, void, env, vsr, vsr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 69aa80ba84..0dd54d60c3 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -602,6 +602,9 @@ XSMAXCDP 111100 ..... ..... ..... 10000000 ... @XX3
XSMINCDP 111100 ..... ..... ..... 10001000 ... @XX3
XSMAXJDP 111100 ..... ..... ..... 10010000 ... @XX3
XSMINJDP 111100 ..... ..... ..... 10011000 ... @XX3
+XSMAXCQP 111111 ..... ..... ..... 1010100100 - @X
+XSMINCQP 111111 ..... ..... ..... 1011100100 - @X
+
XSCMPEQDP 111100 ..... ..... ..... 00000011 ... @XX3
XSCMPGEDP 111100 ..... ..... ..... 00010011 ... @XX3
XSCMPGTDP 111100 ..... ..... ..... 00001011 ... @XX3
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 2e1267e8a3..a19c828414 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -2527,6 +2527,8 @@ static bool do_xscmpqp(DisasContext *ctx, arg_X *a,
TRANS(XSCMPEQQP, do_xscmpqp, gen_helper_XSCMPEQQP)
TRANS(XSCMPGEQP, do_xscmpqp, gen_helper_XSCMPGEQP)
TRANS(XSCMPGTQP, do_xscmpqp, gen_helper_XSCMPGTQP)
+TRANS(XSMAXCQP, do_xscmpqp, gen_helper_XSMAXCQP)
+TRANS(XSMINCQP, do_xscmpqp, gen_helper_XSMINCQP)
#undef GEN_XX2FORM
#undef GEN_XX3FORM
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* [PATCH v3 37/37] target/ppc: Implement xvcvbf16spn and xvcvspbf16 instructions
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (35 preceding siblings ...)
2022-02-10 12:34 ` [PATCH v3 36/37] target/ppc: Implement xs{max,min}cqp matheus.ferst
@ 2022-02-10 12:34 ` matheus.ferst
36 siblings, 0 replies; 55+ messages in thread
From: matheus.ferst @ 2022-02-10 12:34 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Víctor Colombo, clg,
Matheus Ferst, david
From: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/fpu_helper.c | 21 +++++++++++++++++++
target/ppc/helper.h | 1 +
target/ppc/insn32.decode | 11 +++++++---
target/ppc/translate/vsx-impl.c.inc | 31 ++++++++++++++++++++++++++++-
4 files changed, 60 insertions(+), 4 deletions(-)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index c724fa8a8d..f83a80e685 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2790,6 +2790,27 @@ VSX_CVT_FP_TO_FP_HP(xscvhpdp, 1, float16, float64, VsrH(3), VsrD(0), 1)
VSX_CVT_FP_TO_FP_HP(xvcvsphp, 4, float32, float16, VsrW(i), VsrH(2 * i + 1), 0)
VSX_CVT_FP_TO_FP_HP(xvcvhpsp, 4, float16, float32, VsrH(2 * i + 1), VsrW(i), 0)
+void helper_XVCVSPBF16(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb)
+{
+ ppc_vsr_t t = { };
+ int i;
+
+ helper_reset_fpstatus(env);
+ for (i = 0; i < 4; i++) {
+ if (unlikely(float32_is_signaling_nan(xb->VsrW(i), &env->fp_status))) {
+ float_invalid_op_vxsnan(env, GETPC());
+ t.VsrH(2 * i + 1) = float32_to_bfloat16(
+ float32_snan_to_qnan(xb->VsrW(i)), &env->fp_status);
+ } else {
+ t.VsrH(2 * i + 1) =
+ float32_to_bfloat16(xb->VsrW(i), &env->fp_status);
+ }
+ }
+
+ *xt = t;
+ do_float_check_status(env, GETPC());
+}
+
void helper_XSCVQPDP(CPUPPCState *env, uint32_t ro, ppc_vsr_t *xt,
ppc_vsr_t *xb)
{
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 9403f43134..82df92b69d 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -499,6 +499,7 @@ DEF_HELPER_FLAGS_4(xvcmpnesp, TCG_CALL_NO_RWG, i32, env, vsr, vsr, vsr)
DEF_HELPER_3(xvcvspdp, void, env, vsr, vsr)
DEF_HELPER_3(xvcvsphp, void, env, vsr, vsr)
DEF_HELPER_3(xvcvhpsp, void, env, vsr, vsr)
+DEF_HELPER_3(XVCVSPBF16, void, env, vsr, vsr)
DEF_HELPER_3(xvcvspsxds, void, env, vsr, vsr)
DEF_HELPER_3(xvcvspsxws, void, env, vsr, vsr)
DEF_HELPER_3(xvcvspuxds, void, env, vsr, vsr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 0dd54d60c3..ebb5c22ee1 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -152,8 +152,11 @@
%xx_xb 1:1 11:5
%xx_xa 2:1 16:5
%xx_xc 3:1 6:5
-&XX2 xt xb uim:uint8_t
-@XX2 ...... ..... ... uim:2 ..... ......... .. &XX2 xt=%xx_xt xb=%xx_xb
+&XX2 xt xb
+@XX2 ...... ..... ..... ..... ......... .. &XX2 xt=%xx_xt xb=%xx_xb
+
+&XX2_uim2 xt xb uim:uint8_t
+@XX2_uim2 ...... ..... ... uim:2 ..... ......... .. &XX2_uim2 xt=%xx_xt xb=%xx_xb
&XX2_bf_xb bf xb
@XX2_bf_xb ...... bf:3 .. ..... ..... ......... . . &XX2_bf_xb xb=%xx_xb
@@ -575,7 +578,7 @@ XSNMSUBQP 111111 ..... ..... ..... 0111100100 . @X_rc
## VSX splat instruction
XXSPLTIB 111100 ..... 00 ........ 0101101000 . @X_imm8
-XXSPLTW 111100 ..... ---.. ..... 010100100 . . @XX2
+XXSPLTW 111100 ..... ---.. ..... 010100100 . . @XX2_uim2
## VSX Permute Instructions
@@ -615,6 +618,8 @@ XSCMPGTQP 111111 ..... ..... ..... 0011100100 - @X
## VSX Binary Floating-Point Convert Instructions
XSCVQPDP 111111 ..... 10100 ..... 1101000100 . @X_tb_rc
+XVCVBF16SPN 111100 ..... 10000 ..... 111011011 .. @XX2
+XVCVSPBF16 111100 ..... 10001 ..... 111011011 .. @XX2
## VSX Vector Test Least-Significant Bit by Byte Instruction
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index a19c828414..7a4f992170 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1574,7 +1574,7 @@ static bool trans_XXSEL(DisasContext *ctx, arg_XX4 *a)
return true;
}
-static bool trans_XXSPLTW(DisasContext *ctx, arg_XX2 *a)
+static bool trans_XXSPLTW(DisasContext *ctx, arg_XX2_uim2 *a)
{
int tofs, bofs;
@@ -2530,6 +2530,35 @@ TRANS(XSCMPGTQP, do_xscmpqp, gen_helper_XSCMPGTQP)
TRANS(XSMAXCQP, do_xscmpqp, gen_helper_XSMAXCQP)
TRANS(XSMINCQP, do_xscmpqp, gen_helper_XSMINCQP)
+static bool trans_XVCVSPBF16(DisasContext *ctx, arg_XX2 *a)
+{
+ TCGv_ptr xt, xb;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VSX(ctx);
+
+ xt = gen_vsr_ptr(a->xt);
+ xb = gen_vsr_ptr(a->xb);
+
+ gen_helper_XVCVSPBF16(cpu_env, xt, xb);
+
+ tcg_temp_free_ptr(xt);
+ tcg_temp_free_ptr(xb);
+
+ return true;
+}
+
+static bool trans_XVCVBF16SPN(DisasContext *ctx, arg_XX2 *a)
+{
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VSX(ctx);
+
+ tcg_gen_gvec_shli(MO_32, vsr_full_offset(a->xt), vsr_full_offset(a->xb),
+ 16, 16, 16);
+
+ return true;
+}
+
#undef GEN_XX2FORM
#undef GEN_XX3FORM
#undef GEN_XX2IFORM
--
2.31.1
^ permalink raw reply related [flat|nested] 55+ messages in thread
* Re: [PATCH v3 02/37] target/ppc: moved vector even and odd multiplication to decodetree
2022-02-10 12:34 ` [PATCH v3 02/37] target/ppc: moved vector even and odd multiplication to decodetree matheus.ferst
@ 2022-02-11 3:39 ` Richard Henderson
0 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2022-02-11 3:39 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: Lucas Mateus Castro (alqotel),
danielhb413, groug, Lucas Mateus Castro, clg, david
On 2/10/22 23:34, matheus.ferst@eldorado.org.br wrote:
> +void helper_VMULESD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
> +{
> + muls64(&r->VsrD(1), &r->VsrD(0), a->VsrSD(0), b->VsrSD(0));
> +}
> +void helper_VMULOSD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
> +{
> + muls64(&r->VsrD(1), &r->VsrD(0), a->VsrSD(1), b->VsrSD(1));
> +}
> +void helper_VMULEUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
> +{
> + mulu64(&r->VsrD(1), &r->VsrD(0), a->VsrD(0), b->VsrD(0));
> +}
> +void helper_VMULOUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
> +{
> + mulu64(&r->VsrD(1), &r->VsrD(0), a->VsrD(1), b->VsrD(1));
> +}
These are single tcg calls; there's no particular need to have them out-of-line, except
perhaps just to make it easier for your pattern expansion. But perhaps not important
enough to worry about.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v3 03/37] target/ppc: Moved vector multiply high and low to decodetree
2022-02-10 12:34 ` [PATCH v3 03/37] target/ppc: Moved vector multiply high and low " matheus.ferst
@ 2022-02-11 3:41 ` Richard Henderson
0 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2022-02-11 3:41 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: Lucas Mateus Castro (alqotel),
danielhb413, groug, Lucas Mateus Castro, clg, david
On 2/10/22 23:34, matheus.ferst@eldorado.org.br wrote:
> From: "Lucas Mateus Castro (alqotel)"<lucas.castro@eldorado.org.br>
>
> Moved instructions vmulld, vmulhuw, vmulhsw, vmulhud and vmulhsd to
> decodetree
>
> Signed-off-by: Lucas Mateus Castro (alqotel)<lucas.araujo@eldorado.org.br>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> target/ppc/helper.h | 8 ++++----
> target/ppc/insn32.decode | 6 ++++++
> target/ppc/int_helper.c | 8 ++++----
> target/ppc/translate/vmx-impl.c.inc | 21 ++++++++++++++++-----
> target/ppc/translate/vmx-ops.c.inc | 5 -----
> 5 files changed, 30 insertions(+), 18 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v3 04/37] target/ppc: vmulh* instructions use gvec
2022-02-10 12:34 ` [PATCH v3 04/37] target/ppc: vmulh* instructions use gvec matheus.ferst
@ 2022-02-11 3:51 ` Richard Henderson
0 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2022-02-11 3:51 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: Lucas Mateus Castro (alqotel),
danielhb413, groug, Lucas Mateus Castro, clg, david
On 2/10/22 23:34, matheus.ferst@eldorado.org.br wrote:
> +static void do_vx_vmulhu_vec(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
> +{
> + TCGv_vec a1, b1, mask, w, k;
> + unsigned bits;
> + bits = (vece == MO_32) ? 16 : 32;
> +
> + a1 = tcg_temp_new_vec_matching(t);
> + b1 = tcg_temp_new_vec_matching(t);
> + w = tcg_temp_new_vec_matching(t);
> + k = tcg_temp_new_vec_matching(t);
> + mask = tcg_temp_new_vec_matching(t);
> +
> + tcg_gen_dupi_vec(vece, mask, (vece == MO_32) ? 0xFFFF : 0xFFFFFFFF);
> + tcg_gen_and_vec(vece, a1, a, mask);
> + tcg_gen_and_vec(vece, b1, b, mask);
> + tcg_gen_mul_vec(vece, t, a1, b1);
> + tcg_gen_shri_vec(vece, k, t, bits);
> +
> + tcg_gen_shri_vec(vece, a1, a, bits);
> + tcg_gen_mul_vec(vece, t, a1, b1);
> + tcg_gen_add_vec(vece, t, t, k);
> + tcg_gen_and_vec(vece, k, t, mask);
> + tcg_gen_shri_vec(vece, w, t, bits);
> +
> + tcg_gen_and_vec(vece, a1, a, mask);
> + tcg_gen_shri_vec(vece, b1, b, bits);
> + tcg_gen_mul_vec(vece, t, a1, b1);
> + tcg_gen_add_vec(vece, t, t, k);
> + tcg_gen_shri_vec(vece, k, t, bits);
> +
> + tcg_gen_shri_vec(vece, a1, a, bits);
> + tcg_gen_mul_vec(vece, t, a1, b1);
> + tcg_gen_add_vec(vece, t, t, w);
> + tcg_gen_add_vec(vece, t, t, k);
I don't think that you should decompose 4 high-part 32-bit multiplies into 4 32-bit
multiplies plus lots of arithmetic. This is not a win. You're actually better off with
pure integer arithmetic here.
You could instead widen these into 2 64-bit multiplies, plus some arithmetic. That's
certainly closer to the break-even point.
> + {
> + .fniv = do_vx_vmulhu_vec,
> + .fno = gen_helper_VMULHUD,
> + .opt_opc = vecop_list,
> + .vece = MO_64
> + },
> + };
As for the two high-part 64-bit multiplies, I think that should definitely remain an
integer operation.
You probably want to expand these with inline integer operations using .fni[48].
> +static void do_vx_vmulhs_vec(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
Very much likewise.
r~
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v3 05/37] target/ppc: Implement vmsumcud instruction
2022-02-10 12:34 ` [PATCH v3 05/37] target/ppc: Implement vmsumcud instruction matheus.ferst
@ 2022-02-11 4:05 ` Richard Henderson
0 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2022-02-11 4:05 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: Víctor Colombo, groug, danielhb413, clg, david
On 2/10/22 23:34, matheus.ferst@eldorado.org.br wrote:
> + /*
> + * Discard lower 64-bits, leaving the carry into bit 64.
> + * Then sum the higher 64-bit elements.
> + */
> + tcg_gen_mov_i64(tmp1, tmp0);
> + get_avr64(tmp0, a->rc, true);
> + tcg_gen_add2_i64(tmp1, tmp0, tmp0, zero, prod1h, zero);
The move into tmp1 is dead here.
I think you wanted a third add2 here, adding the old tmp0 + new rc word.
r~
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v3 06/37] target/ppc: Implement vmsumudm instruction
2022-02-10 12:34 ` [PATCH v3 06/37] target/ppc: Implement vmsumudm instruction matheus.ferst
@ 2022-02-11 4:07 ` Richard Henderson
0 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2022-02-11 4:07 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: Víctor Colombo, groug, danielhb413, clg, david
On 2/10/22 23:34, matheus.ferst@eldorado.org.br wrote:
> From: Víctor Colombo<victor.colombo@eldorado.org.br>
>
> Based on [1] by Lijun Pan<ljp@linux.ibm.com>, which was never merged
> into master.
>
> [1]:https://lists.gnu.org/archive/html/qemu-ppc/2020-07/msg00419.html
>
> Signed-off-by: Víctor Colombo<victor.colombo@eldorado.org.br>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> target/ppc/insn32.decode | 1 +
> target/ppc/translate/vmx-impl.c.inc | 34 +++++++++++++++++++++++++++++
> 2 files changed, 35 insertions(+)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v3 07/37] target/ppc: Move vexts[bhw]2[wd] to decodetree
2022-02-10 12:34 ` [PATCH v3 07/37] target/ppc: Move vexts[bhw]2[wd] to decodetree matheus.ferst
@ 2022-02-11 4:14 ` Richard Henderson
0 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2022-02-11 4:14 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: Lucas Coutinho, groug, danielhb413, clg, david
On 2/10/22 23:34, matheus.ferst@eldorado.org.br wrote:
> From: Lucas Coutinho <lucas.coutinho@eldorado.org.br>
>
> Move the following instructions to decodetree:
> vextsb2w: Vector Extend Sign Byte To Word
> vextsh2w: Vector Extend Sign Halfword To Word
> vextsb2d: Vector Extend Sign Byte To Doubleword
> vextsh2d: Vector Extend Sign Halfword To Doubleword
> vextsw2d: Vector Extend Sign Word To Doubleword
>
> Signed-off-by: Lucas Coutinho <lucas.coutinho@eldorado.org.br>
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> target/ppc/helper.h | 5 -----
> target/ppc/insn32.decode | 8 ++++++++
> target/ppc/int_helper.c | 15 ---------------
> target/ppc/translate/vmx-impl.c.inc | 25 ++++++++++++++++++++-----
> target/ppc/translate/vmx-ops.c.inc | 5 -----
> 5 files changed, 28 insertions(+), 30 deletions(-)
>
> diff --git a/target/ppc/helper.h b/target/ppc/helper.h
> index 92595a42df..0084080fad 100644
> --- a/target/ppc/helper.h
> +++ b/target/ppc/helper.h
> @@ -249,11 +249,6 @@ DEF_HELPER_4(VINSBLX, void, env, avr, i64, tl)
> DEF_HELPER_4(VINSHLX, void, env, avr, i64, tl)
> DEF_HELPER_4(VINSWLX, void, env, avr, i64, tl)
> DEF_HELPER_4(VINSDLX, void, env, avr, i64, tl)
> -DEF_HELPER_2(vextsb2w, void, avr, avr)
> -DEF_HELPER_2(vextsh2w, void, avr, avr)
> -DEF_HELPER_2(vextsb2d, void, avr, avr)
> -DEF_HELPER_2(vextsh2d, void, avr, avr)
> -DEF_HELPER_2(vextsw2d, void, avr, avr)
> DEF_HELPER_2(vnegw, void, avr, avr)
> DEF_HELPER_2(vnegd, void, avr, avr)
> DEF_HELPER_2(vupkhpx, void, avr, avr)
> diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
> index c4796260b6..757791f0ac 100644
> --- a/target/ppc/insn32.decode
> +++ b/target/ppc/insn32.decode
> @@ -419,6 +419,14 @@ VINSWVRX 000100 ..... ..... ..... 00110001111 @VX
> VSLDBI 000100 ..... ..... ..... 00 ... 010110 @VN
> VSRDBI 000100 ..... ..... ..... 01 ... 010110 @VN
>
> +## Vector Integer Arithmetic Instructions
> +
> +VEXTSB2W 000100 ..... 10000 ..... 11000000010 @VX_tb
> +VEXTSH2W 000100 ..... 10001 ..... 11000000010 @VX_tb
> +VEXTSB2D 000100 ..... 11000 ..... 11000000010 @VX_tb
> +VEXTSH2D 000100 ..... 11001 ..... 11000000010 @VX_tb
> +VEXTSW2D 000100 ..... 11010 ..... 11000000010 @VX_tb
> +
> ## Vector Mask Manipulation Instructions
>
> MTVSRBM 000100 ..... 10000 ..... 11001000010 @VX_tb
> diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
> index 79cde68f19..630fbc579a 100644
> --- a/target/ppc/int_helper.c
> +++ b/target/ppc/int_helper.c
> @@ -1768,21 +1768,6 @@ XXBLEND(W, 32)
> XXBLEND(D, 64)
> #undef XXBLEND
>
> -#define VEXT_SIGNED(name, element, cast) \
> -void helper_##name(ppc_avr_t *r, ppc_avr_t *b) \
> -{ \
> - int i; \
> - for (i = 0; i < ARRAY_SIZE(r->element); i++) { \
> - r->element[i] = (cast)b->element[i]; \
> - } \
> -}
> -VEXT_SIGNED(vextsb2w, s32, int8_t)
> -VEXT_SIGNED(vextsb2d, s64, int8_t)
> -VEXT_SIGNED(vextsh2w, s32, int16_t)
> -VEXT_SIGNED(vextsh2d, s64, int16_t)
> -VEXT_SIGNED(vextsw2d, s64, int32_t)
> -#undef VEXT_SIGNED
> -
> #define VNEG(name, element) \
> void helper_##name(ppc_avr_t *r, ppc_avr_t *b) \
> { \
> diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
> index b7559cf94c..ec782c47ff 100644
> --- a/target/ppc/translate/vmx-impl.c.inc
> +++ b/target/ppc/translate/vmx-impl.c.inc
> @@ -1772,11 +1772,26 @@ GEN_VXFORM_TRANS(vclzw, 1, 30)
> GEN_VXFORM_TRANS(vclzd, 1, 31)
> GEN_VXFORM_NOA_2(vnegw, 1, 24, 6)
> GEN_VXFORM_NOA_2(vnegd, 1, 24, 7)
> -GEN_VXFORM_NOA_2(vextsb2w, 1, 24, 16)
> -GEN_VXFORM_NOA_2(vextsh2w, 1, 24, 17)
> -GEN_VXFORM_NOA_2(vextsb2d, 1, 24, 24)
> -GEN_VXFORM_NOA_2(vextsh2d, 1, 24, 25)
> -GEN_VXFORM_NOA_2(vextsw2d, 1, 24, 26)
> +
> +static bool do_vexts(DisasContext *ctx, arg_VX_tb *a, int vece, int s)
> +{
> + REQUIRE_INSNS_FLAGS2(ctx, ISA300);
> + REQUIRE_VECTOR(ctx);
> +
> + tcg_gen_gvec_shli(vece, avr_full_offset(a->vrt), avr_full_offset(a->vrb),
> + s, 16, 16);
> + tcg_gen_gvec_sari(vece, avr_full_offset(a->vrt), avr_full_offset(a->vrt),
> + s, 16, 16);
It would be better to collect this into a single composite gvec operation (x86 is
especially bad with unsupported vector operation sizes).
Use GVecGen3, provide the relevant .fni4/.fni8/.fniv functions and the vecop_list. We
have elsewhere relied on 4 integer operations being expanded, when the vector op itself
isn't supported, so you should be able to drop the .fnio out-of-line helper.
r~
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v3 08/37] target/ppc: Implement vextsd2q
2022-02-10 12:34 ` [PATCH v3 08/37] target/ppc: Implement vextsd2q matheus.ferst
@ 2022-02-11 4:15 ` Richard Henderson
0 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2022-02-11 4:15 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: Lucas Coutinho, groug, danielhb413, clg, david
On 2/10/22 23:34, matheus.ferst@eldorado.org.br wrote:
> From: Lucas Coutinho<lucas.coutinho@eldorado.org.br>
>
> Signed-off-by: Lucas Coutinho<lucas.coutinho@eldorado.org.br>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> target/ppc/insn32.decode | 1 +
> target/ppc/translate/vmx-impl.c.inc | 18 ++++++++++++++++++
> 2 files changed, 19 insertions(+)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v3 09/37] target/ppc: Move Vector Compare Equal/Not Equal/Greater Than to decodetree
2022-02-10 12:34 ` [PATCH v3 09/37] target/ppc: Move Vector Compare Equal/Not Equal/Greater Than to decodetree matheus.ferst
@ 2022-02-11 4:27 ` Richard Henderson
0 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2022-02-11 4:27 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/10/22 23:34, matheus.ferst@eldorado.org.br wrote:
> +static void do_vcmp_rc(int vrt)
> +{
> + TCGv_i64 t0, t1;
> +
> + t0 = tcg_temp_new_i64();
> + t1 = tcg_temp_new_i64();
> +
> + get_avr64(t0, vrt, true);
> + tcg_gen_ctpop_i64(t1, t0);
> + get_avr64(t0, vrt, false);
> + tcg_gen_ctpop_i64(t0, t0);
> + tcg_gen_add_i64(t1, t0, t1);
I don't understand the ctpop here. I would have expected:
tcg_gen_and_i64(set, t0, t1);
tcg_gen_or_i64(clr, t0, t1);
tcg_gen_setcondi_i64(TCG_COND_EQ, set, set, -1); /* all bits set */
tcg_gen_setcondi_i64(TCG_COND_EQ, clr, clr, 0); /* all bits clear */
> +static bool do_vcmp(DisasContext *ctx, arg_VC *a, TCGCond cond, int vece)
> +{
> + REQUIRE_VECTOR(ctx);
> +
> + tcg_gen_gvec_cmp(cond, vece, avr_full_offset(a->vrt),
> + avr_full_offset(a->vra), avr_full_offset(a->vrb), 16, 16);
> + tcg_gen_gvec_shli(vece, avr_full_offset(a->vrt), avr_full_offset(a->vrt),
> + (8 << vece) - 1, 16, 16);
> + tcg_gen_gvec_sari(vece, avr_full_offset(a->vrt), avr_full_offset(a->vrt),
> + (8 << vece) - 1, 16, 16);
Vector compare already produces -1; no need for anything beyond the cmp.
r~
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v3 10/37] target/ppc: Move Vector Compare Not Equal or Zero to decodetree
2022-02-10 12:34 ` [PATCH v3 10/37] target/ppc: Move Vector Compare Not Equal or Zero " matheus.ferst
@ 2022-02-11 4:41 ` Richard Henderson
2022-02-17 12:45 ` Matheus K. Ferst
0 siblings, 1 reply; 55+ messages in thread
From: Richard Henderson @ 2022-02-11 4:41 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/10/22 23:34, matheus.ferst@eldorado.org.br wrote:
> +static void gen_vcmpnez_vec(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
> +{
> + TCGv_vec t0, t1, zero;
> +
> + t0 = tcg_temp_new_vec_matching(t);
> + t1 = tcg_temp_new_vec_matching(t);
> + zero = tcg_constant_vec_matching(t, vece, 0);
> +
> + tcg_gen_cmp_vec(TCG_COND_EQ, vece, t0, a, zero);
> + tcg_gen_cmp_vec(TCG_COND_EQ, vece, t1, b, zero);
> + tcg_gen_cmp_vec(TCG_COND_NE, vece, t, a, b);
> +
> + tcg_gen_or_vec(vece, t, t, t0);
> + tcg_gen_or_vec(vece, t, t, t1);
> +
> + tcg_gen_shli_vec(vece, t, t, (8 << vece) - 1);
> + tcg_gen_sari_vec(vece, t, t, (8 << vece) - 1);
No shifting required, only the cmp.
> +static bool do_vcmpnez(DisasContext *ctx, arg_VC *a, int vece)
> +{
> + static const TCGOpcode vecop_list[] = {
> + INDEX_op_cmp_vec, INDEX_op_shli_vec, INDEX_op_sari_vec, 0
> + };
Therefore no vecop_list required (cmp itself is mandatory).
r~
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v3 11/37] target/ppc: Implement Vector Compare Equal Quadword
2022-02-10 12:34 ` [PATCH v3 11/37] target/ppc: Implement Vector Compare Equal Quadword matheus.ferst
@ 2022-02-11 4:51 ` Richard Henderson
0 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2022-02-11 4:51 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/10/22 23:34, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst <matheus.ferst@eldorado.org.br>
>
> Implement the following PowerISA v3.1 instructions:
> vcmpequq Vector Compare Equal Quadword
>
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> target/ppc/insn32.decode | 1 +
> target/ppc/translate/vmx-impl.c.inc | 43 +++++++++++++++++++++++++++++
> 2 files changed, 44 insertions(+)
>
> diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
> index a0adf18671..39730df32d 100644
> --- a/target/ppc/insn32.decode
> +++ b/target/ppc/insn32.decode
> @@ -382,6 +382,7 @@ VCMPEQUB 000100 ..... ..... ..... . 0000000110 @VC
> VCMPEQUH 000100 ..... ..... ..... . 0001000110 @VC
> VCMPEQUW 000100 ..... ..... ..... . 0010000110 @VC
> VCMPEQUD 000100 ..... ..... ..... . 0011000111 @VC
> +VCMPEQUQ 000100 ..... ..... ..... . 0111000111 @VC
>
> VCMPGTSB 000100 ..... ..... ..... . 1100000110 @VC
> VCMPGTSH 000100 ..... ..... ..... . 1101000110 @VC
> diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
> index 67059ed9b2..bdb0b4370b 100644
> --- a/target/ppc/translate/vmx-impl.c.inc
> +++ b/target/ppc/translate/vmx-impl.c.inc
> @@ -1112,6 +1112,49 @@ TRANS(VCMPNEZB, do_vcmpnez, MO_8)
> TRANS(VCMPNEZH, do_vcmpnez, MO_16)
> TRANS(VCMPNEZW, do_vcmpnez, MO_32)
>
> +static bool trans_VCMPEQUQ(DisasContext *ctx, arg_VC *a)
> +{
> + TCGv_i64 t0, t1;
> + TCGLabel *l1, *l2;
> +
> + REQUIRE_INSNS_FLAGS2(ctx, ISA310);
> + REQUIRE_VECTOR(ctx);
> +
> + t0 = tcg_temp_new_i64();
> + t1 = tcg_temp_new_i64();
> + l1 = gen_new_label();
> + l2 = gen_new_label();
> +
> + get_avr64(t0, a->vra, true);
> + get_avr64(t1, a->vrb, true);
> + tcg_gen_brcond_i64(TCG_COND_NE, t0, t1, l1);
> +
> + get_avr64(t0, a->vra, false);
> + get_avr64(t1, a->vrb, false);
> + tcg_gen_brcond_i64(TCG_COND_NE, t0, t1, l1);
It would be much better to not use a branch.
E.g.
get_avr64(t0, a->vra, true);
get_avr64(t1, a->vrb, true);
tcg_gen_xor_i64(c0, t0, t1);
get_avr64(t0, a->vra, false);
get_avr64(t1, a->vrb, false);
tcg_gen_xor_i64(c1, t0, t1);
tcg_gen_or_i64(c0, c0, c1);
tcg_gen_setcondi_i64(TCG_COND_EQ, c0, c0, 0);
tcg_gen_neg_i64(c0, c0);
set_avr64(a->vrt, c0, true);
set_avr64(a->vrt, c0, false);
tcg_gen_extrl_i64_i32(crf, c0);
tcg_gen_andi_i32(crf, crf, 0xa);
tcg_gen_xori_i32(crf, crf, 0x2);
r~
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v3 12/37] target/ppc: Implement Vector Compare Greater Than Quadword
2022-02-10 12:34 ` [PATCH v3 12/37] target/ppc: Implement Vector Compare Greater Than Quadword matheus.ferst
@ 2022-02-11 4:53 ` Richard Henderson
0 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2022-02-11 4:53 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/10/22 23:34, matheus.ferst@eldorado.org.br wrote:
> + get_avr64(t0, a->vra, true);
> + get_avr64(t1, a->vrb, true);
> + tcg_gen_brcond_i64(sign ? TCG_COND_GT : TCG_COND_GTU, t0, t1, l1);
> + tcg_gen_brcond_i64(sign ? TCG_COND_LT : TCG_COND_LTU, t0, t1, l2);
> +
> + get_avr64(t0, a->vra, false);
> + get_avr64(t1, a->vrb, false);
> + tcg_gen_brcond_i64(TCG_COND_GTU, t0, t1, l1);
> + tcg_gen_br(l2);
Similarly wrt branches, and computation of the result.
r~
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v3 13/37] target/ppc: Implement Vector Compare Quadword
2022-02-10 12:34 ` [PATCH v3 13/37] target/ppc: Implement Vector Compare Quadword matheus.ferst
@ 2022-02-11 4:55 ` Richard Henderson
0 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2022-02-11 4:55 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/10/22 23:34, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst<matheus.ferst@eldorado.org.br>
>
> Implement the following PowerISA v3.1 instructions:
> vcmpsq: Vector Compare Signed Quadword
> vcmpuq: Vector Compare Unsigned Quadword
>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> target/ppc/insn32.decode | 6 ++++
> target/ppc/translate/vmx-impl.c.inc | 45 +++++++++++++++++++++++++++++
> 2 files changed, 51 insertions(+)
This one is complex enough it does warrant branches.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v3 14/37] target/ppc: implement vstri[bh][lr]
2022-02-10 12:34 ` [PATCH v3 14/37] target/ppc: implement vstri[bh][lr] matheus.ferst
@ 2022-02-11 5:00 ` Richard Henderson
0 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2022-02-11 5:00 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/10/22 23:34, matheus.ferst@eldorado.org.br wrote:
> +#define VSTRI(NAME, ELEM, NUM_ELEMS, LEFT) \
> +void helper_##NAME(CPUPPCState *env, ppc_avr_t *t, ppc_avr_t *b, \
> + target_ulong rc) \
> +{ \
> + bool null_found = false; \
> + int i, idx; \
> + \
> + for (i = 0; i < NUM_ELEMS; i++) { \
> + idx = LEFT ? i : NUM_ELEMS - i - 1; \
> + if (b->Vsr##ELEM(idx)) { \
> + t->Vsr##ELEM(idx) = b->Vsr##ELEM(idx); \
> + } else { \
> + null_found = true; \
> + break; \
> + } \
> + } \
> + \
> + for (; i < NUM_ELEMS; i++) { \
> + idx = LEFT ? i : NUM_ELEMS - i - 1; \
> + t->Vsr##ELEM(idx) = 0; \
> + } \
> + \
> + if (rc) { \
> + env->crf[6] = null_found ? 0b0010 : 0; \
> + } \
> +}
The only reason you're passing in env is for crf[6], which requires you to pass in a
second argument. And you're not using the return value.
It would be better to always return the rc value, and only conditionally assign it.
E.g.
if (a->rc) {
gen_helper(cpu_crf[6], vrt, vrb);
} else {
TCGv_i32 discard = tcg_temp_new_i32();
gen_helper(discard, vrt, vrb);
tcg_temp_free_i32(discard);
}
r~
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v3 15/37] target/ppc: implement vclrlb
2022-02-10 12:34 ` [PATCH v3 15/37] target/ppc: implement vclrlb matheus.ferst
@ 2022-02-11 5:20 ` Richard Henderson
0 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2022-02-11 5:20 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/10/22 23:34, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst <matheus.ferst@eldorado.org.br>
>
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> target/ppc/insn32.decode | 2 ++
> target/ppc/translate/vmx-impl.c.inc | 56 +++++++++++++++++++++++++++++
> 2 files changed, 58 insertions(+)
>
> diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
> index ea497ecd80..483651cf9c 100644
> --- a/target/ppc/insn32.decode
> +++ b/target/ppc/insn32.decode
> @@ -501,6 +501,8 @@ VSTRIBR 000100 ..... 00001 ..... . 0000001101 @VX_tb_rc
> VSTRIHL 000100 ..... 00010 ..... . 0000001101 @VX_tb_rc
> VSTRIHR 000100 ..... 00011 ..... . 0000001101 @VX_tb_rc
>
> +VCLRLB 000100 ..... ..... ..... 00110001101 @VX
> +
> # VSX Load/Store Instructions
>
> LXV 111101 ..... ..... ............ . 001 @DQ_TSX
> diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
> index 8bcf637ff8..3fb4935bff 100644
> --- a/target/ppc/translate/vmx-impl.c.inc
> +++ b/target/ppc/translate/vmx-impl.c.inc
> @@ -1956,6 +1956,62 @@ TRANS(VSTRIBR, do_vstri, gen_helper_VSTRIBR)
> TRANS(VSTRIHL, do_vstri, gen_helper_VSTRIHL)
> TRANS(VSTRIHR, do_vstri, gen_helper_VSTRIHR)
>
> +static bool trans_VCLRLB(DisasContext *ctx, arg_VX *a)
> +{
> + TCGv_i64 hi, lo, rb;
> + TCGLabel *l, *end;
> +
> + REQUIRE_INSNS_FLAGS2(ctx, ISA310);
> + REQUIRE_VECTOR(ctx);
> +
> + l = gen_new_label();
> + end = gen_new_label();
> +
> + hi = tcg_const_local_i64(0);
> + lo = tcg_const_local_i64(0);
> + rb = tcg_temp_local_new_i64();
> +
> + tcg_gen_extu_tl_i64(rb, cpu_gpr[a->vrb]);
> +
> + /* RB == 0: all zeros */
> + tcg_gen_brcondi_i64(TCG_COND_EQ, rb, 0, end);
> +
> + get_avr64(lo, a->vra, false);
> +
> + /* RB <= 8 */
> + tcg_gen_brcondi_i64(TCG_COND_LEU, rb, 8, l);
> +
> + get_avr64(hi, a->vra, true);
> +
> + /* RB >= 16: just copy VRA to VRB */
> + tcg_gen_brcondi_i64(TCG_COND_GEU, rb, 16, end);
> +
> + /* 8 < RB < 16: copy lo and partially clear hi */
> + tcg_gen_subfi_i64(rb, 16, rb);
> + tcg_gen_shli_i64(rb, rb, 3);
> + tcg_gen_shl_i64(hi, hi, rb);
> + tcg_gen_shr_i64(hi, hi, rb);
> + tcg_gen_br(end);
> +
> + /* 0 < RB <= 8: zeroes hi and partially clears lo */
> + gen_set_label(l);
> + tcg_gen_subfi_i64(rb, 8, rb);
> + tcg_gen_shli_i64(rb, rb, 3);
> + tcg_gen_shl_i64(lo, lo, rb);
> + tcg_gen_shr_i64(lo, lo, rb);
There's a bit of redundancy here, and if we exploit that we can remove the branches.
Compute the mask modulo 8. That result applies to either the first or second word, or
neither. Use 3 movcond to select among the cases:
sh = (rb & 7) << 3;
mask = ~(-1 << sh);
ml = rb < 8 ? mask : 0;
mh = rb < 8 ? 0 : mask;
mh = rb < 16 ? mh : -1;
lo &= ml;
hi &= mh;
r~
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v3 17/37] target/ppc: implement vcntmb[bhwd]
2022-02-10 12:34 ` [PATCH v3 17/37] target/ppc: implement vcntmb[bhwd] matheus.ferst
@ 2022-02-11 5:28 ` Richard Henderson
0 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2022-02-11 5:28 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/10/22 23:34, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst<matheus.ferst@eldorado.org.br>
>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> target/ppc/insn32.decode | 8 ++++++++
> target/ppc/translate/vmx-impl.c.inc | 32 +++++++++++++++++++++++++++++
> 2 files changed, 40 insertions(+)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v3 18/37] target/ppc: implement vgnb
2022-02-10 12:34 ` [PATCH v3 18/37] target/ppc: implement vgnb matheus.ferst
@ 2022-02-11 6:15 ` Richard Henderson
0 siblings, 0 replies; 55+ messages in thread
From: Richard Henderson @ 2022-02-11 6:15 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/10/22 23:34, matheus.ferst@eldorado.org.br wrote:
> + for (int dw = 1; dw >= 0; dw--) {
> + get_avr64(vrb, a->vrb, dw);
> + for (; in >= 0; in -= a->n, out--) {
> + if (in > out) {
> + tcg_gen_shri_i64(tmp, vrb, in - out);
> + } else {
> + tcg_gen_shli_i64(tmp, vrb, out - in);
> + }
> + tcg_gen_andi_i64(tmp, tmp, 1ULL << out);
> + tcg_gen_or_i64(rt, rt, tmp);
> + }
> + in += 64;
> + }
This is going to produce up to 3*64 operations (n=2).
You can produce more than one output pairing per shift,
and produce the same result in 3*lg2(64) operations.
I've given an example like this on the list before, recently.
I think it was in the context of some riscv bit manipulation.
> N = 2
>
> AxBxCxDxExFxGxHxIxJxKxLxMxNxOxPxQxRxSxTxUxVxWxXxYxZx0x1x2x3x4x5x
> & rep(0b10)
> A.B.C.D.E.F.G.H.I.J.K.L.M.N.O.P.Q.R.S.T.U.V.W.X.Y.Z.0.1.2.3.4.5.
> << 1
> .B.C.D.E.F.G.H.I.J.K.L.M.N.O.P.Q.R.S.T.U.V.W.X.Y.Z.0.1.2.3.4.5..
> |
> ABBCCDDEEFFGGHHIIJJKKLLMMNNOOPPQQRRSSTTUUVVWWXXYYZZ001122334455.
> & rep(0b1100)
> AB..CD..EF..GH..IJ..KL..MN..OP..QR..ST..UV..WX..YZ..01..23..45..
> << 2
> ..CD..EF..GH..IJ..KL..MN..OP..QR..ST..UV..WX..YZ..01..23..45....
> |
> ABCDCDEFEFGHGHIJIJKLKLMNMNOPOPWQQRSTSTUVUVWXWXYZYZ010123234545..
> & rep(0xf0)
> ABCD....EFGH....IJKL....MNOP....QRST....UVWX....YZ01....2345....
> << 4
> ....EFGH....IJKL....MNOP....QRST....UVWX....YZ01....2345........
> |
> ABCDEFGHEFGHIJKLIJKLMNOPMNOPQRSTQRSTUVWXUVWXYZ01YZ0123452345....
> & rep(0xff00)
> ABCDEFGH........IJKLMNOP........QRSTUVWX........YZ012345........
> << 8
> ........IJKLMNOP........QRSTUVWX........YZ012345................
> |
> ABCDEFGHIJKLMNOPIJKLMNOPQRSTUVWXQRSTUVWXYZ012345YZ012345........
> & rep(0xffff0000)
> ABCDEFGHIJKLMNOP................QRSTUVWXYZ012345................
> deposit(t, 32, 16)
> ABCDEFGHIJKLMNOPQRSTUVWXYZ012346................................
and similarly for larger N. For N >= 4, I believe that half of the masking may be elided,
because there are already zeros in which to place bits.
> N = 5
>
> AxxxxBxxxxCxxxxDxxxxExxxxFxxxxGxxxxHxxxxIxxxxJxxxxKxxxxLxxxxMxxx
> & rep(0b10000)
> A....B....C....D....E....F....G....H....I....J....K....L....M...
> << (5 - 1)
> .B....C....D....E....F....G....H....I....J....K....L....M.......
> |
> AB...BC...CD...DE...EF...FG...GH...HI...IJ...JK...KL...LM...M...
> << (10 - 2)
> ..CD...DE...EF...FG...GH...HI...IJ...JK...KL...LM...M...
> |
> ABCD.BCDE.CDEF.DEFG.EFGH.FGHI.GHIJ.HIJK.IJKL.JKLM.KLM..LM...M...
> & rep(0xf0000)
> ABCD................EFGH................IJKL................M...
> << (20 - 4)
> ....EFGH................IJKL................M...................
> |
> ABCDEFGH............EFGHIJKL............IJKLM...............M...
> << (40 - 8)
> ........IJKLM...............M...................................
> |
> ABCDEFGHIJKLM.......EFGHIJKLM...........IJKLM...............M...
> & 0xfff8_0000_0000_0000
> ABCDEFGHIJKLM...................................................
It's probably worth working through the various N to make sure you know which masking is
required.
r~
^ permalink raw reply [flat|nested] 55+ messages in thread
* Re: [PATCH v3 10/37] target/ppc: Move Vector Compare Not Equal or Zero to decodetree
2022-02-11 4:41 ` Richard Henderson
@ 2022-02-17 12:45 ` Matheus K. Ferst
0 siblings, 0 replies; 55+ messages in thread
From: Matheus K. Ferst @ 2022-02-17 12:45 UTC (permalink / raw)
To: Richard Henderson, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 11/02/2022 01:41, Richard Henderson wrote:
> On 2/10/22 23:34, matheus.ferst@eldorado.org.br wrote:
>> +static void gen_vcmpnez_vec(unsigned vece, TCGv_vec t, TCGv_vec a,
>> TCGv_vec b)
>> +{
>> + TCGv_vec t0, t1, zero;
>> +
>> + t0 = tcg_temp_new_vec_matching(t);
>> + t1 = tcg_temp_new_vec_matching(t);
>> + zero = tcg_constant_vec_matching(t, vece, 0);
>> +
>> + tcg_gen_cmp_vec(TCG_COND_EQ, vece, t0, a, zero);
>> + tcg_gen_cmp_vec(TCG_COND_EQ, vece, t1, b, zero);
>> + tcg_gen_cmp_vec(TCG_COND_NE, vece, t, a, b);
>> +
>> + tcg_gen_or_vec(vece, t, t, t0);
>> + tcg_gen_or_vec(vece, t, t, t1);
>> +
>> + tcg_gen_shli_vec(vece, t, t, (8 << vece) - 1);
>> + tcg_gen_sari_vec(vece, t, t, (8 << vece) - 1);
>
> No shifting required, only the cmp.
>
>> +static bool do_vcmpnez(DisasContext *ctx, arg_VC *a, int vece)
>> +{
>> + static const TCGOpcode vecop_list[] = {
>> + INDEX_op_cmp_vec, INDEX_op_shli_vec, INDEX_op_sari_vec, 0
>> + };
>
> Therefore no vecop_list required (cmp itself is mandatory).
>
Without vecop_list, we hit the assert in tcg_assert_listed_vecop, which
is called from tcg_gen_cmp_vec. Am I missing something?
Thanks,
Matheus K. Ferst
Instituto de Pesquisas ELDORADO <http://www.eldorado.org.br/>
Analista de Software
Aviso Legal - Disclaimer <https://www.eldorado.org.br/disclaimer.html>
^ permalink raw reply [flat|nested] 55+ messages in thread
end of thread, other threads:[~2022-02-17 13:09 UTC | newest]
Thread overview: 55+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-10 12:34 [PATCH v3 00/37] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
2022-02-10 12:34 ` [PATCH v3 01/37] target/ppc: Introduce TRANS*FLAGS macros matheus.ferst
2022-02-10 12:34 ` [PATCH v3 02/37] target/ppc: moved vector even and odd multiplication to decodetree matheus.ferst
2022-02-11 3:39 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 03/37] target/ppc: Moved vector multiply high and low " matheus.ferst
2022-02-11 3:41 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 04/37] target/ppc: vmulh* instructions use gvec matheus.ferst
2022-02-11 3:51 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 05/37] target/ppc: Implement vmsumcud instruction matheus.ferst
2022-02-11 4:05 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 06/37] target/ppc: Implement vmsumudm instruction matheus.ferst
2022-02-11 4:07 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 07/37] target/ppc: Move vexts[bhw]2[wd] to decodetree matheus.ferst
2022-02-11 4:14 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 08/37] target/ppc: Implement vextsd2q matheus.ferst
2022-02-11 4:15 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 09/37] target/ppc: Move Vector Compare Equal/Not Equal/Greater Than to decodetree matheus.ferst
2022-02-11 4:27 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 10/37] target/ppc: Move Vector Compare Not Equal or Zero " matheus.ferst
2022-02-11 4:41 ` Richard Henderson
2022-02-17 12:45 ` Matheus K. Ferst
2022-02-10 12:34 ` [PATCH v3 11/37] target/ppc: Implement Vector Compare Equal Quadword matheus.ferst
2022-02-11 4:51 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 12/37] target/ppc: Implement Vector Compare Greater Than Quadword matheus.ferst
2022-02-11 4:53 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 13/37] target/ppc: Implement Vector Compare Quadword matheus.ferst
2022-02-11 4:55 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 14/37] target/ppc: implement vstri[bh][lr] matheus.ferst
2022-02-11 5:00 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 15/37] target/ppc: implement vclrlb matheus.ferst
2022-02-11 5:20 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 16/37] target/ppc: implement vclrrb matheus.ferst
2022-02-10 12:34 ` [PATCH v3 17/37] target/ppc: implement vcntmb[bhwd] matheus.ferst
2022-02-11 5:28 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 18/37] target/ppc: implement vgnb matheus.ferst
2022-02-11 6:15 ` Richard Henderson
2022-02-10 12:34 ` [PATCH v3 19/37] target/ppc: Move vsel and vperm/vpermr to decodetree matheus.ferst
2022-02-10 12:34 ` [PATCH v3 20/37] target/ppc: Move xxsel " matheus.ferst
2022-02-10 12:34 ` [PATCH v3 21/37] target/ppc: move xxperm/xxpermr " matheus.ferst
2022-02-10 12:34 ` [PATCH v3 22/37] target/ppc: Move xxpermdi " matheus.ferst
2022-02-10 12:34 ` [PATCH v3 23/37] target/ppc: Implement xxpermx instruction matheus.ferst
2022-02-10 12:34 ` [PATCH v3 24/37] tcg/tcg-op-gvec.c: Introduce tcg_gen_gvec_4i matheus.ferst
2022-02-10 12:34 ` [PATCH v3 25/37] target/ppc: Implement xxeval matheus.ferst
2022-02-10 12:34 ` [PATCH v3 26/37] target/ppc: Implement xxgenpcv[bhwd]m instruction matheus.ferst
2022-02-10 12:34 ` [PATCH v3 27/37] target/ppc: move xs[n]madd[am][ds]p/xs[n]msub[am][ds]p to decodetree matheus.ferst
2022-02-10 12:34 ` [PATCH v3 28/37] target/ppc: implement xs[n]maddqp[o]/xs[n]msubqp[o] matheus.ferst
2022-02-10 12:34 ` [PATCH v3 29/37] target/ppc: Implement xvtlsbb instruction matheus.ferst
2022-02-10 12:34 ` [PATCH v3 30/37] target/ppc: Remove xscmpnedp instruction matheus.ferst
2022-02-10 12:34 ` [PATCH v3 31/37] target/ppc: Refactor VSX_SCALAR_CMP_DP matheus.ferst
2022-02-10 12:34 ` [PATCH v3 32/37] target/ppc: Implement xscmp{eq,ge,gt}qp matheus.ferst
2022-02-10 12:34 ` [PATCH v3 33/37] target/ppc: Move xscmp{eq,ge,gt}dp to decodetree matheus.ferst
2022-02-10 12:34 ` [PATCH v3 34/37] target/ppc: Move xs{max, min}[cj]dp to use do_helper_XX3 matheus.ferst
2022-02-10 12:34 ` [PATCH v3 35/37] target/ppc: Refactor VSX_MAX_MINC helper matheus.ferst
2022-02-10 12:34 ` [PATCH v3 36/37] target/ppc: Implement xs{max,min}cqp matheus.ferst
2022-02-10 12:34 ` [PATCH v3 37/37] target/ppc: Implement xvcvbf16spn and xvcvspbf16 instructions matheus.ferst
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.