* [PATCH v4 01/47] target/ppc: Introduce TRANS*FLAGS macros
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
@ 2022-02-22 14:35 ` matheus.ferst
2022-02-22 14:36 ` [PATCH v4 02/47] target/ppc: moved vector even and odd multiplication to decodetree matheus.ferst
` (45 subsequent siblings)
46 siblings, 0 replies; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:35 UTC
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Luis Pires, clg,
Matheus Ferst, david
From: Luis Pires <luis.pires@eldorado.org.br>
Add TRANS_FLAGS and TRANS_FLAGS2, which run REQUIRE_INSNS_FLAGS or
REQUIRE_INSNS_FLAGS2 before calling the translation function, and
TRANS64_FLAGS2, which also runs REQUIRE_64BIT.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
[ferst: - TRANS_FLAGS2 instead of TRANS_FLAGS_E
- Use the new macros in load/store vector insns ]
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/translate.c | 19 +++++++++++++++
target/ppc/translate/vsx-impl.c.inc | 37 ++++++++++-------------------
2 files changed, 31 insertions(+), 25 deletions(-)
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 2eaffd432a..b647430012 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -6604,10 +6604,29 @@ static int times_16(DisasContext *ctx, int x)
#define TRANS(NAME, FUNC, ...) \
static bool trans_##NAME(DisasContext *ctx, arg_##NAME *a) \
{ return FUNC(ctx, a, __VA_ARGS__); }
+#define TRANS_FLAGS(FLAGS, NAME, FUNC, ...) \
+ static bool trans_##NAME(DisasContext *ctx, arg_##NAME *a) \
+ { \
+ REQUIRE_INSNS_FLAGS(ctx, FLAGS); \
+ return FUNC(ctx, a, __VA_ARGS__); \
+ }
+#define TRANS_FLAGS2(FLAGS2, NAME, FUNC, ...) \
+ static bool trans_##NAME(DisasContext *ctx, arg_##NAME *a) \
+ { \
+ REQUIRE_INSNS_FLAGS2(ctx, FLAGS2); \
+ return FUNC(ctx, a, __VA_ARGS__); \
+ }
#define TRANS64(NAME, FUNC, ...) \
static bool trans_##NAME(DisasContext *ctx, arg_##NAME *a) \
{ REQUIRE_64BIT(ctx); return FUNC(ctx, a, __VA_ARGS__); }
+#define TRANS64_FLAGS2(FLAGS2, NAME, FUNC, ...) \
+ static bool trans_##NAME(DisasContext *ctx, arg_##NAME *a) \
+ { \
+ REQUIRE_64BIT(ctx); \
+ REQUIRE_INSNS_FLAGS2(ctx, FLAGS2); \
+ return FUNC(ctx, a, __VA_ARGS__); \
+ }
/* TODO: More TRANS* helpers for extra insn_flags checks. */
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 128968b5e7..e8a4ba0cfa 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -2072,12 +2072,6 @@ static bool do_lstxv(DisasContext *ctx, int ra, TCGv displ,
static bool do_lstxv_D(DisasContext *ctx, arg_D *a, bool store, bool paired)
{
- if (paired) {
- REQUIRE_INSNS_FLAGS2(ctx, ISA310);
- } else {
- REQUIRE_INSNS_FLAGS2(ctx, ISA300);
- }
-
if (paired || a->rt >= 32) {
REQUIRE_VSX(ctx);
} else {
@@ -2091,7 +2085,6 @@ static bool do_lstxv_PLS_D(DisasContext *ctx, arg_PLS_D *a,
bool store, bool paired)
{
arg_D d;
- REQUIRE_INSNS_FLAGS2(ctx, ISA310);
REQUIRE_VSX(ctx);
if (!resolve_PLS_D(ctx, &d, a)) {
@@ -2103,12 +2096,6 @@ static bool do_lstxv_PLS_D(DisasContext *ctx, arg_PLS_D *a,
static bool do_lstxv_X(DisasContext *ctx, arg_X *a, bool store, bool paired)
{
- if (paired) {
- REQUIRE_INSNS_FLAGS2(ctx, ISA310);
- } else {
- REQUIRE_INSNS_FLAGS2(ctx, ISA300);
- }
-
if (paired || a->rt >= 32) {
REQUIRE_VSX(ctx);
} else {
@@ -2118,18 +2105,18 @@ static bool do_lstxv_X(DisasContext *ctx, arg_X *a, bool store, bool paired)
return do_lstxv(ctx, a->ra, cpu_gpr[a->rb], a->rt, store, paired);
}
-TRANS(STXV, do_lstxv_D, true, false)
-TRANS(LXV, do_lstxv_D, false, false)
-TRANS(STXVP, do_lstxv_D, true, true)
-TRANS(LXVP, do_lstxv_D, false, true)
-TRANS(STXVX, do_lstxv_X, true, false)
-TRANS(LXVX, do_lstxv_X, false, false)
-TRANS(STXVPX, do_lstxv_X, true, true)
-TRANS(LXVPX, do_lstxv_X, false, true)
-TRANS64(PSTXV, do_lstxv_PLS_D, true, false)
-TRANS64(PLXV, do_lstxv_PLS_D, false, false)
-TRANS64(PSTXVP, do_lstxv_PLS_D, true, true)
-TRANS64(PLXVP, do_lstxv_PLS_D, false, true)
+TRANS_FLAGS2(ISA300, STXV, do_lstxv_D, true, false)
+TRANS_FLAGS2(ISA300, LXV, do_lstxv_D, false, false)
+TRANS_FLAGS2(ISA310, STXVP, do_lstxv_D, true, true)
+TRANS_FLAGS2(ISA310, LXVP, do_lstxv_D, false, true)
+TRANS_FLAGS2(ISA300, STXVX, do_lstxv_X, true, false)
+TRANS_FLAGS2(ISA300, LXVX, do_lstxv_X, false, false)
+TRANS_FLAGS2(ISA310, STXVPX, do_lstxv_X, true, true)
+TRANS_FLAGS2(ISA310, LXVPX, do_lstxv_X, false, true)
+TRANS64_FLAGS2(ISA310, PSTXV, do_lstxv_PLS_D, true, false)
+TRANS64_FLAGS2(ISA310, PLXV, do_lstxv_PLS_D, false, false)
+TRANS64_FLAGS2(ISA310, PSTXVP, do_lstxv_PLS_D, true, true)
+TRANS64_FLAGS2(ISA310, PLXVP, do_lstxv_PLS_D, false, true)
static void gen_xxblendv_vec(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b,
TCGv_vec c)
--
2.25.1
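As an illustration of what the new macros expand to, here is a minimal stand-alone sketch of the TRANS_FLAGS2 pattern. The macro body follows the patch; everything else (the one-field DisasContext, arg_DEMO, do_demo, the PPC2_* bit values, and this simplified REQUIRE_INSNS_FLAGS2) is a hypothetical stand-in and not QEMU's actual definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mock stand-ins; QEMU's real types carry far more state. */
typedef struct { uint64_t insns_flags2; } DisasContext;
typedef struct { int rt, ra; } arg_DEMO;

#define PPC2_ISA300 (1ULL << 0)   /* hypothetical bit values */
#define PPC2_ISA310 (1ULL << 1)

/* Simplified mock: leave the enclosing trans_* function if the flag is missing. */
#define REQUIRE_INSNS_FLAGS2(CTX, NAME)                     \
    do {                                                    \
        if (((CTX)->insns_flags2 & PPC2_##NAME) == 0) {     \
            return false;                                   \
        }                                                   \
    } while (0)

/* Same shape as the macro introduced by the patch. */
#define TRANS_FLAGS2(FLAGS2, NAME, FUNC, ...)                       \
    static bool trans_##NAME(DisasContext *ctx, arg_##NAME *a)      \
    {                                                               \
        REQUIRE_INSNS_FLAGS2(ctx, FLAGS2);                          \
        return FUNC(ctx, a, __VA_ARGS__);                           \
    }

/* Hypothetical worker; a real one would emit TCG ops. */
static bool do_demo(DisasContext *ctx, arg_DEMO *a, bool store, bool paired)
{
    assert(ctx != NULL && a != NULL);
    (void)store;
    (void)paired;
    return true;    /* pretend the instruction was translated */
}

/* Defines trans_DEMO(), which checks ISA310 and then calls do_demo(). */
TRANS_FLAGS2(ISA310, DEMO, do_demo, false, true)
```

With these mocks, trans_DEMO returns false for a context that lacks the ISA310 bit and otherwise forwards to do_demo, so the per-instruction ISA requirement sits next to the instruction name instead of inside the shared worker.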
* [PATCH v4 02/47] target/ppc: moved vector even and odd multiplication to decodetree
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
2022-02-22 14:35 ` [PATCH v4 01/47] target/ppc: Introduce TRANS*FLAGS macros matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 18:19 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 03/47] target/ppc: Moved vector multiply high and low " matheus.ferst
` (44 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC
To: qemu-devel, qemu-ppc
Cc: Lucas Mateus Castro (alqotel),
danielhb413, richard.henderson, groug, Lucas Mateus Castro, clg,
Matheus Ferst, david
From: "Lucas Mateus Castro (alqotel)" <lucas.castro@eldorado.org.br>
Moved the instructions vmulesb, vmulosb, vmuleub, vmuloub,
vmulesh, vmulosh, vmuleuh, vmulouh, vmulesw, vmulosw,
vmuleuw and vmulouw from legacy to decodetree. Implemented
the instructions vmulesd, vmulosd, vmuleud, vmuloud.
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/helper.h | 28 +++++++++-------
target/ppc/insn32.decode | 22 ++++++++++++
target/ppc/int_helper.c | 36 ++++++++++++++------
target/ppc/translate/vmx-impl.c.inc | 52 +++++++++++++++++++----------
target/ppc/translate/vmx-ops.c.inc | 15 ++-------
tcg/ppc/tcg-target.c.inc | 6 ++++
6 files changed, 107 insertions(+), 52 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index ab008c9d4e..04689522f8 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -190,18 +190,22 @@ DEF_HELPER_3(vmrglw, void, avr, avr, avr)
DEF_HELPER_3(vmrghb, void, avr, avr, avr)
DEF_HELPER_3(vmrghh, void, avr, avr, avr)
DEF_HELPER_3(vmrghw, void, avr, avr, avr)
-DEF_HELPER_3(vmulesb, void, avr, avr, avr)
-DEF_HELPER_3(vmulesh, void, avr, avr, avr)
-DEF_HELPER_3(vmulesw, void, avr, avr, avr)
-DEF_HELPER_3(vmuleub, void, avr, avr, avr)
-DEF_HELPER_3(vmuleuh, void, avr, avr, avr)
-DEF_HELPER_3(vmuleuw, void, avr, avr, avr)
-DEF_HELPER_3(vmulosb, void, avr, avr, avr)
-DEF_HELPER_3(vmulosh, void, avr, avr, avr)
-DEF_HELPER_3(vmulosw, void, avr, avr, avr)
-DEF_HELPER_3(vmuloub, void, avr, avr, avr)
-DEF_HELPER_3(vmulouh, void, avr, avr, avr)
-DEF_HELPER_3(vmulouw, void, avr, avr, avr)
+DEF_HELPER_3(VMULESB, void, avr, avr, avr)
+DEF_HELPER_3(VMULESH, void, avr, avr, avr)
+DEF_HELPER_3(VMULESW, void, avr, avr, avr)
+DEF_HELPER_3(VMULESD, void, avr, avr, avr)
+DEF_HELPER_3(VMULEUB, void, avr, avr, avr)
+DEF_HELPER_3(VMULEUH, void, avr, avr, avr)
+DEF_HELPER_3(VMULEUW, void, avr, avr, avr)
+DEF_HELPER_3(VMULEUD, void, avr, avr, avr)
+DEF_HELPER_3(VMULOSB, void, avr, avr, avr)
+DEF_HELPER_3(VMULOSH, void, avr, avr, avr)
+DEF_HELPER_3(VMULOSW, void, avr, avr, avr)
+DEF_HELPER_3(VMULOSD, void, avr, avr, avr)
+DEF_HELPER_3(VMULOUB, void, avr, avr, avr)
+DEF_HELPER_3(VMULOUH, void, avr, avr, avr)
+DEF_HELPER_3(VMULOUW, void, avr, avr, avr)
+DEF_HELPER_3(VMULOUD, void, avr, avr, avr)
DEF_HELPER_3(vmulhsw, void, avr, avr, avr)
DEF_HELPER_3(vmulhuw, void, avr, avr, avr)
DEF_HELPER_3(vmulhsd, void, avr, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 2a9c91a423..092ea79618 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -440,6 +440,28 @@ VEXTRACTWM 000100 ..... 01010 ..... 11001000010 @VX_tb
VEXTRACTDM 000100 ..... 01011 ..... 11001000010 @VX_tb
VEXTRACTQM 000100 ..... 01100 ..... 11001000010 @VX_tb
+## Vector Multiply Instruction
+
+VMULESB 000100 ..... ..... ..... 01100001000 @VX
+VMULOSB 000100 ..... ..... ..... 00100001000 @VX
+VMULEUB 000100 ..... ..... ..... 01000001000 @VX
+VMULOUB 000100 ..... ..... ..... 00000001000 @VX
+
+VMULESH 000100 ..... ..... ..... 01101001000 @VX
+VMULOSH 000100 ..... ..... ..... 00101001000 @VX
+VMULEUH 000100 ..... ..... ..... 01001001000 @VX
+VMULOUH 000100 ..... ..... ..... 00001001000 @VX
+
+VMULESW 000100 ..... ..... ..... 01110001000 @VX
+VMULOSW 000100 ..... ..... ..... 00110001000 @VX
+VMULEUW 000100 ..... ..... ..... 01010001000 @VX
+VMULOUW 000100 ..... ..... ..... 00010001000 @VX
+
+VMULESD 000100 ..... ..... ..... 01111001000 @VX
+VMULOSD 000100 ..... ..... ..... 00111001000 @VX
+VMULEUD 000100 ..... ..... ..... 01011001000 @VX
+VMULOUD 000100 ..... ..... ..... 00011001000 @VX
+
# VSX Load/Store Instructions
LXV 111101 ..... ..... ............ . 001 @DQ_TSX
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index d1b12788b2..7d925418d4 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1063,7 +1063,7 @@ void helper_vmsumuhs(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a,
}
#define VMUL_DO_EVN(name, mul_element, mul_access, prod_access, cast) \
- void helper_v##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b) \
+ void helper_V##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b) \
{ \
int i; \
\
@@ -1074,7 +1074,7 @@ void helper_vmsumuhs(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a,
}
#define VMUL_DO_ODD(name, mul_element, mul_access, prod_access, cast) \
- void helper_v##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b) \
+ void helper_V##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b) \
{ \
int i; \
\
@@ -1085,17 +1085,33 @@ void helper_vmsumuhs(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a,
}
#define VMUL(suffix, mul_element, mul_access, prod_access, cast) \
- VMUL_DO_EVN(mule##suffix, mul_element, mul_access, prod_access, cast) \
- VMUL_DO_ODD(mulo##suffix, mul_element, mul_access, prod_access, cast)
-VMUL(sb, s8, VsrSB, VsrSH, int16_t)
-VMUL(sh, s16, VsrSH, VsrSW, int32_t)
-VMUL(sw, s32, VsrSW, VsrSD, int64_t)
-VMUL(ub, u8, VsrB, VsrH, uint16_t)
-VMUL(uh, u16, VsrH, VsrW, uint32_t)
-VMUL(uw, u32, VsrW, VsrD, uint64_t)
+ VMUL_DO_EVN(MULE##suffix, mul_element, mul_access, prod_access, cast) \
+ VMUL_DO_ODD(MULO##suffix, mul_element, mul_access, prod_access, cast)
+VMUL(SB, s8, VsrSB, VsrSH, int16_t)
+VMUL(SH, s16, VsrSH, VsrSW, int32_t)
+VMUL(SW, s32, VsrSW, VsrSD, int64_t)
+VMUL(UB, u8, VsrB, VsrH, uint16_t)
+VMUL(UH, u16, VsrH, VsrW, uint32_t)
+VMUL(UW, u32, VsrW, VsrD, uint64_t)
#undef VMUL_DO_EVN
#undef VMUL_DO_ODD
#undef VMUL
+void helper_VMULESD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+{
+ muls64(&r->VsrD(1), &r->VsrD(0), a->VsrSD(0), b->VsrSD(0));
+}
+void helper_VMULOSD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+{
+ muls64(&r->VsrD(1), &r->VsrD(0), a->VsrSD(1), b->VsrSD(1));
+}
+void helper_VMULEUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+{
+ mulu64(&r->VsrD(1), &r->VsrD(0), a->VsrD(0), b->VsrD(0));
+}
+void helper_VMULOUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+{
+ mulu64(&r->VsrD(1), &r->VsrD(0), a->VsrD(1), b->VsrD(1));
+}
void helper_vmulhsw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
{
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index d5e02fd7f2..430579addd 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -798,29 +798,11 @@ static void trans_vclzd(DisasContext *ctx)
tcg_temp_free_i64(avr);
}
-GEN_VXFORM(vmuloub, 4, 0);
-GEN_VXFORM(vmulouh, 4, 1);
-GEN_VXFORM(vmulouw, 4, 2);
GEN_VXFORM_V(vmuluwm, MO_32, tcg_gen_gvec_mul, 4, 2);
-GEN_VXFORM_DUAL(vmulouw, PPC_ALTIVEC, PPC_NONE,
- vmuluwm, PPC_NONE, PPC2_ALTIVEC_207)
-GEN_VXFORM(vmulosb, 4, 4);
-GEN_VXFORM(vmulosh, 4, 5);
-GEN_VXFORM(vmulosw, 4, 6);
GEN_VXFORM_V(vmulld, MO_64, tcg_gen_gvec_mul, 4, 7);
-GEN_VXFORM(vmuleub, 4, 8);
-GEN_VXFORM(vmuleuh, 4, 9);
-GEN_VXFORM(vmuleuw, 4, 10);
GEN_VXFORM(vmulhuw, 4, 10);
GEN_VXFORM(vmulhud, 4, 11);
-GEN_VXFORM_DUAL(vmuleuw, PPC_ALTIVEC, PPC_NONE,
- vmulhuw, PPC_NONE, PPC2_ISA310);
-GEN_VXFORM(vmulesb, 4, 12);
-GEN_VXFORM(vmulesh, 4, 13);
-GEN_VXFORM(vmulesw, 4, 14);
GEN_VXFORM(vmulhsw, 4, 14);
-GEN_VXFORM_DUAL(vmulesw, PPC_ALTIVEC, PPC_NONE,
- vmulhsw, PPC_NONE, PPC2_ISA310);
GEN_VXFORM(vmulhsd, 4, 15);
GEN_VXFORM_V(vslb, MO_8, tcg_gen_gvec_shlv, 2, 4);
GEN_VXFORM_V(vslh, MO_16, tcg_gen_gvec_shlv, 2, 5);
@@ -2104,6 +2086,40 @@ static bool trans_VPEXTD(DisasContext *ctx, arg_VX *a)
return true;
}
+static bool do_vx_helper(DisasContext *ctx, arg_VX *a,
+ void (*gen_helper) (TCGv_ptr, TCGv_ptr, TCGv_ptr))
+{
+ TCGv_ptr ra, rb, rd;
+ REQUIRE_VECTOR(ctx);
+
+ ra = gen_avr_ptr(a->vra);
+ rb = gen_avr_ptr(a->vrb);
+ rd = gen_avr_ptr(a->vrt);
+ gen_helper(rd, ra, rb);
+ tcg_temp_free_ptr(ra);
+ tcg_temp_free_ptr(rb);
+ tcg_temp_free_ptr(rd);
+
+ return true;
+}
+
+TRANS_FLAGS2(ALTIVEC_207, VMULESB, do_vx_helper, gen_helper_VMULESB)
+TRANS_FLAGS2(ALTIVEC_207, VMULOSB, do_vx_helper, gen_helper_VMULOSB)
+TRANS_FLAGS2(ALTIVEC_207, VMULEUB, do_vx_helper, gen_helper_VMULEUB)
+TRANS_FLAGS2(ALTIVEC_207, VMULOUB, do_vx_helper, gen_helper_VMULOUB)
+TRANS_FLAGS2(ALTIVEC_207, VMULESH, do_vx_helper, gen_helper_VMULESH)
+TRANS_FLAGS2(ALTIVEC_207, VMULOSH, do_vx_helper, gen_helper_VMULOSH)
+TRANS_FLAGS2(ALTIVEC_207, VMULEUH, do_vx_helper, gen_helper_VMULEUH)
+TRANS_FLAGS2(ALTIVEC_207, VMULOUH, do_vx_helper, gen_helper_VMULOUH)
+TRANS_FLAGS2(ALTIVEC_207, VMULESW, do_vx_helper, gen_helper_VMULESW)
+TRANS_FLAGS2(ALTIVEC_207, VMULOSW, do_vx_helper, gen_helper_VMULOSW)
+TRANS_FLAGS2(ALTIVEC_207, VMULEUW, do_vx_helper, gen_helper_VMULEUW)
+TRANS_FLAGS2(ALTIVEC_207, VMULOUW, do_vx_helper, gen_helper_VMULOUW)
+TRANS_FLAGS2(ISA310, VMULESD, do_vx_helper, gen_helper_VMULESD)
+TRANS_FLAGS2(ISA310, VMULOSD, do_vx_helper, gen_helper_VMULOSD)
+TRANS_FLAGS2(ISA310, VMULEUD, do_vx_helper, gen_helper_VMULEUD)
+TRANS_FLAGS2(ISA310, VMULOUD, do_vx_helper, gen_helper_VMULOUD)
+
#undef GEN_VR_LDX
#undef GEN_VR_STX
#undef GEN_VR_LVE
diff --git a/target/ppc/translate/vmx-ops.c.inc b/target/ppc/translate/vmx-ops.c.inc
index 25ee715b43..f310b2fbde 100644
--- a/target/ppc/translate/vmx-ops.c.inc
+++ b/target/ppc/translate/vmx-ops.c.inc
@@ -101,20 +101,11 @@ GEN_VXFORM_DUAL(vmrgow, vextuwlx, 6, 26, PPC_NONE, PPC2_ALTIVEC_207),
GEN_VXFORM_300(vextubrx, 6, 28),
GEN_VXFORM_300(vextuhrx, 6, 29),
GEN_VXFORM_DUAL(vmrgew, vextuwrx, 6, 30, PPC_NONE, PPC2_ALTIVEC_207),
-GEN_VXFORM(vmuloub, 4, 0),
-GEN_VXFORM(vmulouh, 4, 1),
-GEN_VXFORM_DUAL(vmulouw, vmuluwm, 4, 2, PPC_ALTIVEC, PPC_NONE),
-GEN_VXFORM(vmulosb, 4, 4),
-GEN_VXFORM(vmulosh, 4, 5),
-GEN_VXFORM_207(vmulosw, 4, 6),
+GEN_VXFORM_207(vmuluwm, 4, 2),
GEN_VXFORM_310(vmulld, 4, 7),
-GEN_VXFORM(vmuleub, 4, 8),
-GEN_VXFORM(vmuleuh, 4, 9),
-GEN_VXFORM_DUAL(vmuleuw, vmulhuw, 4, 10, PPC_ALTIVEC, PPC_NONE),
+GEN_VXFORM_310(vmulhuw, 4, 10),
GEN_VXFORM_310(vmulhud, 4, 11),
-GEN_VXFORM(vmulesb, 4, 12),
-GEN_VXFORM(vmulesh, 4, 13),
-GEN_VXFORM_DUAL(vmulesw, vmulhsw, 4, 14, PPC_ALTIVEC, PPC_NONE),
+GEN_VXFORM_310(vmulhsw, 4, 14),
GEN_VXFORM_310(vmulhsd, 4, 15),
GEN_VXFORM(vslb, 2, 4),
GEN_VXFORM(vslh, 2, 5),
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index dea24f23c4..69d22e08cb 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -3987,3 +3987,9 @@ void tcg_register_jit(const void *buf, size_t buf_size)
tcg_register_jit_int(buf, buf_size, &debug_frame, sizeof(debug_frame));
}
#endif /* __ELF__ */
+#undef VMULEUB
+#undef VMULEUH
+#undef VMULEUW
+#undef VMULOUB
+#undef VMULOUH
+#undef VMULOUW
--
2.25.1
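For reference, the even/odd word multiplies moved by this patch can be modelled in plain C roughly as follows. This is a scalar sketch only: the vec_words and vec_dwords types and mul_even_odd_uw are made-up names, and elements are assumed to be numbered in PowerISA (big-endian) order, with element 0 leftmost.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical containers; index 0 is the leftmost element. */
typedef struct { uint32_t w[4]; } vec_words;
typedef struct { uint64_t d[2]; } vec_dwords;

/* odd == 0 models vmuleuw, odd == 1 models vmulouw. */
static vec_dwords mul_even_odd_uw(const vec_words *a, const vec_words *b,
                                  int odd)
{
    vec_dwords r;

    assert(a != NULL && b != NULL);
    assert(odd == 0 || odd == 1);

    for (size_t i = 0; i < 2; i++) {
        /* Widen before multiplying so the full 64-bit product is kept. */
        r.d[i] = (uint64_t)a->w[2 * i + odd] * (uint64_t)b->w[2 * i + odd];
    }
    return r;
}
```

The doubleword variants added by the patch follow the same idea with a single 128-bit product per instruction, which is why the helpers can simply call muls64/mulu64 on element 0 (even) or element 1 (odd).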
* Re: [PATCH v4 02/47] target/ppc: moved vector even and odd multiplication to decodetree
2022-02-22 14:36 ` [PATCH v4 02/47] target/ppc: moved vector even and odd multiplication to decodetree matheus.ferst
@ 2022-02-22 18:19 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 18:19 UTC
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: Lucas Mateus Castro (alqotel),
danielhb413, groug, Lucas Mateus Castro, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: "Lucas Mateus Castro (alqotel)"<lucas.castro@eldorado.org.br>
>
> Moved the instructions vmulesb, vmulosb, vmuleub, vmuloub,
> vmulesh, vmulosh, vmuleuh, vmulouh, vmulesw, vmulosw,
> vmuleuw and vmulouw from legacy to decodetree. Implemented
> the instructions vmulesd, vmulosd, vmuleud, vmuloud.
>
> Signed-off-by: Lucas Mateus Castro (alqotel)<lucas.araujo@eldorado.org.br>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> target/ppc/helper.h | 28 +++++++++-------
> target/ppc/insn32.decode | 22 ++++++++++++
> target/ppc/int_helper.c | 36 ++++++++++++++------
> target/ppc/translate/vmx-impl.c.inc | 52 +++++++++++++++++++----------
> target/ppc/translate/vmx-ops.c.inc | 15 ++-------
> tcg/ppc/tcg-target.c.inc | 6 ++++
> 6 files changed, 107 insertions(+), 52 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> +void helper_VMULESD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
> +{
> + muls64(&r->VsrD(1), &r->VsrD(0), a->VsrSD(0), b->VsrSD(0));
> +}
> +void helper_VMULOSD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
> +{
> + muls64(&r->VsrD(1), &r->VsrD(0), a->VsrSD(1), b->VsrSD(1));
> +}
> +void helper_VMULEUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
> +{
> + mulu64(&r->VsrD(1), &r->VsrD(0), a->VsrD(0), b->VsrD(0));
> +}
> +void helper_VMULOUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
> +{
> + mulu64(&r->VsrD(1), &r->VsrD(0), a->VsrD(1), b->VsrD(1));
> +}
Did I mention before that these are trivially implemented inline?
r~
* [PATCH v4 03/47] target/ppc: Moved vector multiply high and low to decodetree
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
2022-02-22 14:35 ` [PATCH v4 01/47] target/ppc: Introduce TRANS*FLAGS macros matheus.ferst
2022-02-22 14:36 ` [PATCH v4 02/47] target/ppc: moved vector even and odd multiplication to decodetree matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 18:19 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 04/47] target/ppc: vmulh* instructions without helpers matheus.ferst
` (43 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC
To: qemu-devel, qemu-ppc
Cc: Lucas Mateus Castro (alqotel),
danielhb413, richard.henderson, groug, Lucas Mateus Castro, clg,
Matheus Ferst, david
From: "Lucas Mateus Castro (alqotel)" <lucas.castro@eldorado.org.br>
Moved instructions vmulld, vmulhuw, vmulhsw, vmulhud and vmulhsd to
decodetree.
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/helper.h | 8 ++++----
target/ppc/insn32.decode | 6 ++++++
target/ppc/int_helper.c | 8 ++++----
target/ppc/translate/vmx-impl.c.inc | 21 ++++++++++++++++-----
target/ppc/translate/vmx-ops.c.inc | 5 -----
5 files changed, 30 insertions(+), 18 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 04689522f8..5d11158f1f 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -206,10 +206,10 @@ DEF_HELPER_3(VMULOUB, void, avr, avr, avr)
DEF_HELPER_3(VMULOUH, void, avr, avr, avr)
DEF_HELPER_3(VMULOUW, void, avr, avr, avr)
DEF_HELPER_3(VMULOUD, void, avr, avr, avr)
-DEF_HELPER_3(vmulhsw, void, avr, avr, avr)
-DEF_HELPER_3(vmulhuw, void, avr, avr, avr)
-DEF_HELPER_3(vmulhsd, void, avr, avr, avr)
-DEF_HELPER_3(vmulhud, void, avr, avr, avr)
+DEF_HELPER_3(VMULHSW, void, avr, avr, avr)
+DEF_HELPER_3(VMULHUW, void, avr, avr, avr)
+DEF_HELPER_3(VMULHSD, void, avr, avr, avr)
+DEF_HELPER_3(VMULHUD, void, avr, avr, avr)
DEF_HELPER_3(vslo, void, avr, avr, avr)
DEF_HELPER_3(vsro, void, avr, avr, avr)
DEF_HELPER_3(vsrv, void, avr, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 092ea79618..d817e44c71 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -462,6 +462,12 @@ VMULOSD 000100 ..... ..... ..... 00111001000 @VX
VMULEUD 000100 ..... ..... ..... 01011001000 @VX
VMULOUD 000100 ..... ..... ..... 00011001000 @VX
+VMULHSW 000100 ..... ..... ..... 01110001001 @VX
+VMULHUW 000100 ..... ..... ..... 01010001001 @VX
+VMULHSD 000100 ..... ..... ..... 01111001001 @VX
+VMULHUD 000100 ..... ..... ..... 01011001001 @VX
+VMULLD 000100 ..... ..... ..... 00111001001 @VX
+
# VSX Load/Store Instructions
LXV 111101 ..... ..... ............ . 001 @DQ_TSX
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 7d925418d4..8ddeccef12 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1113,7 +1113,7 @@ void helper_VMULOUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
mulu64(&r->VsrD(1), &r->VsrD(0), a->VsrD(1), b->VsrD(1));
}
-void helper_vmulhsw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+void helper_VMULHSW(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
{
int i;
@@ -1122,7 +1122,7 @@ void helper_vmulhsw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
}
}
-void helper_vmulhuw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+void helper_VMULHUW(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
{
int i;
@@ -1132,7 +1132,7 @@ void helper_vmulhuw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
}
}
-void helper_vmulhsd(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+void helper_VMULHSD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
{
uint64_t discard;
@@ -1140,7 +1140,7 @@ void helper_vmulhsd(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
muls64(&discard, &r->u64[1], a->s64[1], b->s64[1]);
}
-void helper_vmulhud(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+void helper_VMULHUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
{
uint64_t discard;
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 430579addd..62d0642226 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -799,11 +799,6 @@ static void trans_vclzd(DisasContext *ctx)
}
GEN_VXFORM_V(vmuluwm, MO_32, tcg_gen_gvec_mul, 4, 2);
-GEN_VXFORM_V(vmulld, MO_64, tcg_gen_gvec_mul, 4, 7);
-GEN_VXFORM(vmulhuw, 4, 10);
-GEN_VXFORM(vmulhud, 4, 11);
-GEN_VXFORM(vmulhsw, 4, 14);
-GEN_VXFORM(vmulhsd, 4, 15);
GEN_VXFORM_V(vslb, MO_8, tcg_gen_gvec_shlv, 2, 4);
GEN_VXFORM_V(vslh, MO_16, tcg_gen_gvec_shlv, 2, 5);
GEN_VXFORM_V(vslw, MO_32, tcg_gen_gvec_shlv, 2, 6);
@@ -2103,6 +2098,17 @@ static bool do_vx_helper(DisasContext *ctx, arg_VX *a,
return true;
}
+static bool trans_VMULLD(DisasContext *ctx, arg_VX *a)
+{
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ tcg_gen_gvec_mul(MO_64, avr_full_offset(a->vrt), avr_full_offset(a->vra),
+ avr_full_offset(a->vrb), 16, 16);
+
+ return true;
+}
+
TRANS_FLAGS2(ALTIVEC_207, VMULESB, do_vx_helper, gen_helper_VMULESB)
TRANS_FLAGS2(ALTIVEC_207, VMULOSB, do_vx_helper, gen_helper_VMULOSB)
TRANS_FLAGS2(ALTIVEC_207, VMULEUB, do_vx_helper, gen_helper_VMULEUB)
@@ -2120,6 +2126,11 @@ TRANS_FLAGS2(ISA310, VMULOSD, do_vx_helper, gen_helper_VMULOSD)
TRANS_FLAGS2(ISA310, VMULEUD, do_vx_helper, gen_helper_VMULEUD)
TRANS_FLAGS2(ISA310, VMULOUD, do_vx_helper, gen_helper_VMULOUD)
+TRANS_FLAGS2(ISA310, VMULHSW, do_vx_helper, gen_helper_VMULHSW)
+TRANS_FLAGS2(ISA310, VMULHSD, do_vx_helper, gen_helper_VMULHSD)
+TRANS_FLAGS2(ISA310, VMULHUW, do_vx_helper, gen_helper_VMULHUW)
+TRANS_FLAGS2(ISA310, VMULHUD, do_vx_helper, gen_helper_VMULHUD)
+
#undef GEN_VR_LDX
#undef GEN_VR_STX
#undef GEN_VR_LVE
diff --git a/target/ppc/translate/vmx-ops.c.inc b/target/ppc/translate/vmx-ops.c.inc
index f310b2fbde..914e68e5b0 100644
--- a/target/ppc/translate/vmx-ops.c.inc
+++ b/target/ppc/translate/vmx-ops.c.inc
@@ -102,11 +102,6 @@ GEN_VXFORM_300(vextubrx, 6, 28),
GEN_VXFORM_300(vextuhrx, 6, 29),
GEN_VXFORM_DUAL(vmrgew, vextuwrx, 6, 30, PPC_NONE, PPC2_ALTIVEC_207),
GEN_VXFORM_207(vmuluwm, 4, 2),
-GEN_VXFORM_310(vmulld, 4, 7),
-GEN_VXFORM_310(vmulhuw, 4, 10),
-GEN_VXFORM_310(vmulhud, 4, 11),
-GEN_VXFORM_310(vmulhsw, 4, 14),
-GEN_VXFORM_310(vmulhsd, 4, 15),
GEN_VXFORM(vslb, 2, 4),
GEN_VXFORM(vslh, 2, 5),
GEN_VXFORM_DUAL(vslw, vrlwnm, 2, 6, PPC_ALTIVEC, PPC_NONE),
--
2.25.1
* Re: [PATCH v4 03/47] target/ppc: Moved vector multiply high and low to decodetree
2022-02-22 14:36 ` [PATCH v4 03/47] target/ppc: Moved vector multiply high and low " matheus.ferst
@ 2022-02-22 18:19 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 18:19 UTC
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: Lucas Mateus Castro (alqotel),
danielhb413, groug, Lucas Mateus Castro, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: "Lucas Mateus Castro (alqotel)"<lucas.castro@eldorado.org.br>
>
> Moved instructions vmulld, vmulhuw, vmulhsw, vmulhud and vmulhsd to
> decodetree
>
> Signed-off-by: Lucas Mateus Castro (alqotel)<lucas.araujo@eldorado.org.br>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> target/ppc/helper.h | 8 ++++----
> target/ppc/insn32.decode | 6 ++++++
> target/ppc/int_helper.c | 8 ++++----
> target/ppc/translate/vmx-impl.c.inc | 21 ++++++++++++++++-----
> target/ppc/translate/vmx-ops.c.inc | 5 -----
> 5 files changed, 30 insertions(+), 18 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
* [PATCH v4 04/47] target/ppc: vmulh* instructions without helpers
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (2 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 03/47] target/ppc: Moved vector multiply high and low " matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 18:23 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 05/47] target/ppc: Implement vmsumcud instruction matheus.ferst
` (42 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC
To: qemu-devel, qemu-ppc
Cc: Lucas Mateus Castro (alqotel),
danielhb413, richard.henderson, groug, Lucas Mateus Castro, clg,
Matheus Ferst, david
From: "Lucas Mateus Castro (alqotel)" <lucas.castro@eldorado.org.br>
Changed vmulhuw, vmulhud, vmulhsw, vmulhsd to not
use helpers.
Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
Changes in v4:
Changed from gvec to i64. This resulted in better performance on
a Power host for all 4 instructions and better performance for
vmulhsw and vmulhuw on an x86 host, but worse performance for
vmulhsd and vmulhud on an x86 host.
---
target/ppc/helper.h | 4 -
target/ppc/int_helper.c | 35 --------
target/ppc/translate/vmx-impl.c.inc | 123 +++++++++++++++++++++++++++-
3 files changed, 119 insertions(+), 43 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 5d11158f1f..d0c5a3fef1 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -206,10 +206,6 @@ DEF_HELPER_3(VMULOUB, void, avr, avr, avr)
DEF_HELPER_3(VMULOUH, void, avr, avr, avr)
DEF_HELPER_3(VMULOUW, void, avr, avr, avr)
DEF_HELPER_3(VMULOUD, void, avr, avr, avr)
-DEF_HELPER_3(VMULHSW, void, avr, avr, avr)
-DEF_HELPER_3(VMULHUW, void, avr, avr, avr)
-DEF_HELPER_3(VMULHSD, void, avr, avr, avr)
-DEF_HELPER_3(VMULHUD, void, avr, avr, avr)
DEF_HELPER_3(vslo, void, avr, avr, avr)
DEF_HELPER_3(vsro, void, avr, avr, avr)
DEF_HELPER_3(vsrv, void, avr, avr, avr)
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 8ddeccef12..64c87d9418 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1113,41 +1113,6 @@ void helper_VMULOUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
mulu64(&r->VsrD(1), &r->VsrD(0), a->VsrD(1), b->VsrD(1));
}
-void helper_VMULHSW(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
-{
- int i;
-
- for (i = 0; i < 4; i++) {
- r->s32[i] = (int32_t)(((int64_t)a->s32[i] * (int64_t)b->s32[i]) >> 32);
- }
-}
-
-void helper_VMULHUW(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
-{
- int i;
-
- for (i = 0; i < 4; i++) {
- r->u32[i] = (uint32_t)(((uint64_t)a->u32[i] *
- (uint64_t)b->u32[i]) >> 32);
- }
-}
-
-void helper_VMULHSD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
-{
- uint64_t discard;
-
- muls64(&discard, &r->u64[0], a->s64[0], b->s64[0]);
- muls64(&discard, &r->u64[1], a->s64[1], b->s64[1]);
-}
-
-void helper_VMULHUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
-{
- uint64_t discard;
-
- mulu64(&discard, &r->u64[0], a->u64[0], b->u64[0]);
- mulu64(&discard, &r->u64[1], a->u64[1], b->u64[1]);
-}
-
void helper_vperm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
ppc_avr_t *c)
{
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 62d0642226..3951ae124a 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -2126,10 +2126,125 @@ TRANS_FLAGS2(ISA310, VMULOSD, do_vx_helper, gen_helper_VMULOSD)
TRANS_FLAGS2(ISA310, VMULEUD, do_vx_helper, gen_helper_VMULEUD)
TRANS_FLAGS2(ISA310, VMULOUD, do_vx_helper, gen_helper_VMULOUD)
-TRANS_FLAGS2(ISA310, VMULHSW, do_vx_helper, gen_helper_VMULHSW)
-TRANS_FLAGS2(ISA310, VMULHSD, do_vx_helper, gen_helper_VMULHSD)
-TRANS_FLAGS2(ISA310, VMULHUW, do_vx_helper, gen_helper_VMULHUW)
-TRANS_FLAGS2(ISA310, VMULHUD, do_vx_helper, gen_helper_VMULHUD)
+static void do_vx_vmulhw_i64(TCGv_i64 t, TCGv_i64 a, TCGv_i64 b, bool sign)
+{
+ TCGv_i64 hh, lh, temp;
+
+ uint64_t c;
+ hh = tcg_temp_new_i64();
+ lh = tcg_temp_new_i64();
+ temp = tcg_temp_new_i64();
+
+ c = 0xFFFFFFFF;
+
+ if (sign) {
+ tcg_gen_ext32s_i64(lh, a);
+ tcg_gen_ext32s_i64(temp, b);
+ } else {
+ tcg_gen_andi_i64(lh, a, c);
+ tcg_gen_andi_i64(temp, b, c);
+ }
+ tcg_gen_mul_i64(lh, lh, temp);
+
+ if (sign) {
+ tcg_gen_sari_i64(hh, a, 32);
+ tcg_gen_sari_i64(temp, b, 32);
+ } else {
+ tcg_gen_shri_i64(hh, a, 32);
+ tcg_gen_shri_i64(temp, b, 32);
+ }
+ tcg_gen_mul_i64(hh, hh, temp);
+
+ tcg_gen_shri_i64(lh, lh, 32);
+ tcg_gen_andi_i64(hh, hh, c << 32);
+ tcg_gen_or_i64(t, hh, lh);
+
+ tcg_temp_free_i64(hh);
+ tcg_temp_free_i64(lh);
+ tcg_temp_free_i64(temp);
+}
+
+static void do_vx_vmulhd_i64(TCGv_i64 t, TCGv_i64 a, TCGv_i64 b, bool sign)
+{
+ TCGv_i64 a1, b1, mask, w, k;
+ void (*tcg_gen_shift_imm)(TCGv_i64, TCGv_i64, int64_t);
+
+ a1 = tcg_temp_new_i64();
+ b1 = tcg_temp_new_i64();
+ w = tcg_temp_new_i64();
+ k = tcg_temp_new_i64();
+ mask = tcg_temp_new_i64();
+ if (sign) {
+ tcg_gen_shift_imm = tcg_gen_sari_i64;
+ } else {
+ tcg_gen_shift_imm = tcg_gen_shri_i64;
+ }
+
+ tcg_gen_movi_i64(mask, 0xFFFFFFFF);
+ tcg_gen_and_i64(a1, a, mask);
+ tcg_gen_and_i64(b1, b, mask);
+ tcg_gen_mul_i64(t, a1, b1);
+ tcg_gen_shri_i64(k, t, 32);
+
+ tcg_gen_shift_imm(a1, a, 32);
+ tcg_gen_mul_i64(t, a1, b1);
+ tcg_gen_add_i64(t, t, k);
+ tcg_gen_and_i64(k, t, mask);
+ tcg_gen_shift_imm(w, t, 32);
+
+ tcg_gen_and_i64(a1, a, mask);
+ tcg_gen_shift_imm(b1, b, 32);
+ tcg_gen_mul_i64(t, a1, b1);
+ tcg_gen_add_i64(t, t, k);
+ tcg_gen_shift_imm(k, t, 32);
+
+ tcg_gen_shift_imm(a1, a, 32);
+ tcg_gen_mul_i64(t, a1, b1);
+ tcg_gen_add_i64(t, t, w);
+ tcg_gen_add_i64(t, t, k);
+
+ tcg_temp_free_i64(a1);
+ tcg_temp_free_i64(b1);
+ tcg_temp_free_i64(w);
+ tcg_temp_free_i64(k);
+ tcg_temp_free_i64(mask);
+}
+
+static bool do_vx_mulh(DisasContext *ctx, arg_VX *a, bool sign,
+ void (*func)(TCGv_i64, TCGv_i64, TCGv_i64, bool))
+{
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ TCGv_i64 vra, vrb, vrt;
+ int i;
+
+ vra = tcg_temp_new_i64();
+ vrb = tcg_temp_new_i64();
+ vrt = tcg_temp_new_i64();
+
+ for (i = 0; i < 2; i++) {
+ get_avr64(vra, a->vra, i);
+ get_avr64(vrb, a->vrb, i);
+ get_avr64(vrt, a->vrt, i);
+
+ func(vrt, vra, vrb, sign);
+
+ set_avr64(a->vrt, vrt, i);
+ }
+
+ tcg_temp_free_i64(vra);
+ tcg_temp_free_i64(vrb);
+ tcg_temp_free_i64(vrt);
+
+ return true;
+
+}
+
+TRANS(VMULHSW, do_vx_mulh, true , do_vx_vmulhw_i64)
+TRANS(VMULHSD, do_vx_mulh, true , do_vx_vmulhd_i64)
+TRANS(VMULHUW, do_vx_mulh, false, do_vx_vmulhw_i64)
+TRANS(VMULHUD, do_vx_mulh, false, do_vx_vmulhd_i64)
#undef GEN_VR_LDX
#undef GEN_VR_STX
--
2.25.1
* Re: [PATCH v4 04/47] target/ppc: vmulh* instructions without helpers
2022-02-22 14:36 ` [PATCH v4 04/47] target/ppc: vmulh* instructions without helpers matheus.ferst
@ 2022-02-22 18:23 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 18:23 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: Lucas Mateus Castro (alqotel),
danielhb413, groug, Lucas Mateus Castro, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: "Lucas Mateus Castro (alqotel)" <lucas.castro@eldorado.org.br>
>
> Changed vmulhuw, vmulhud, vmulhsw, vmulhsd to not
> use helpers.
>
> Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> Changes in v4:
> Changed from gvec to i64. This gave better performance on a Power
> host for all 4 instructions and better performance for vmulhsw and
> vmulhuw on an x86 host, but worse performance for vmulhsd and
> vmulhud on an x86 host.
Unsurprising.
> +static void do_vx_vmulhd_i64(TCGv_i64 t, TCGv_i64 a, TCGv_i64 b, bool sign)
> +{
> + TCGv_i64 a1, b1, mask, w, k;
> + void (*tcg_gen_shift_imm)(TCGv_i64, TCGv_i64, int64_t);
> +
> + a1 = tcg_temp_new_i64();
> + b1 = tcg_temp_new_i64();
> + w = tcg_temp_new_i64();
> + k = tcg_temp_new_i64();
> + mask = tcg_temp_new_i64();
> + if (sign) {
> + tcg_gen_shift_imm = tcg_gen_sari_i64;
> + } else {
> + tcg_gen_shift_imm = tcg_gen_shri_i64;
> + }
> +
> + tcg_gen_movi_i64(mask, 0xFFFFFFFF);
> + tcg_gen_and_i64(a1, a, mask);
> + tcg_gen_and_i64(b1, b, mask);
> + tcg_gen_mul_i64(t, a1, b1);
> + tcg_gen_shri_i64(k, t, 32);
> +
> + tcg_gen_shift_imm(a1, a, 32);
> + tcg_gen_mul_i64(t, a1, b1);
> + tcg_gen_add_i64(t, t, k);
> + tcg_gen_and_i64(k, t, mask);
> + tcg_gen_shift_imm(w, t, 32);
> +
> + tcg_gen_and_i64(a1, a, mask);
> + tcg_gen_shift_imm(b1, b, 32);
> + tcg_gen_mul_i64(t, a1, b1);
> + tcg_gen_add_i64(t, t, k);
> + tcg_gen_shift_imm(k, t, 32);
> +
> + tcg_gen_shift_imm(a1, a, 32);
> + tcg_gen_mul_i64(t, a1, b1);
> + tcg_gen_add_i64(t, t, w);
> + tcg_gen_add_i64(t, t, k);
You should be using tcg_gen_mul{s,u}2_i64 instead of open-coding the high-part multiplication.
r~
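For reference, a plain-C model of the value a high-part multiply produces may help when comparing the open-coded sequence quoted above with tcg_gen_mul{s,u}2_i64. This is only a sketch: mulhu64 and mulhs64 are hypothetical names used for illustration, not QEMU functions, and the code models the arithmetic on the host rather than emitting TCG ops.

```c
#include <stdint.h>

/* High 64 bits of the unsigned 128-bit product a * b (hypothetical helper). */
static uint64_t mulhu64(uint64_t a, uint64_t b)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    /* Four 32x32 -> 64 partial products. */
    uint64_t lo_lo = a_lo * b_lo;
    uint64_t hi_lo = a_hi * b_lo;
    uint64_t lo_hi = a_lo * b_hi;
    uint64_t hi_hi = a_hi * b_hi;

    /* Middle column: cannot overflow 64 bits for any inputs. */
    uint64_t cross = (lo_lo >> 32) + (uint32_t)hi_lo + lo_hi;

    return hi_hi + (hi_lo >> 32) + (cross >> 32);
}

/* High 64 bits of the signed product, derived from the unsigned one. */
static int64_t mulhs64(int64_t a, int64_t b)
{
    uint64_t hi = mulhu64((uint64_t)a, (uint64_t)b);

    /* Correct for the two's-complement interpretation of each operand. */
    if (a < 0) {
        hi -= (uint64_t)b;
    }
    if (b < 0) {
        hi -= (uint64_t)a;
    }
    return (int64_t)hi;
}
```

In the translator, the same high half is the second output of tcg_gen_mulu2_i64(lo, hi, a, b), or of tcg_gen_muls2_i64 for the signed case, so none of the partial products need to be emitted by hand.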
* [PATCH v4 05/47] target/ppc: Implement vmsumcud instruction
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (3 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 04/47] target/ppc: vmulh* instructions without helpers matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 18:28 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 06/47] target/ppc: Implement vmsumudm instruction matheus.ferst
` (41 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Víctor Colombo, clg,
Matheus Ferst, david
From: Víctor Colombo <victor.colombo@eldorado.org.br>
Based on [1] by Lijun Pan <ljp@linux.ibm.com>, which was never merged
into master.
[1]: https://lists.gnu.org/archive/html/qemu-ppc/2020-07/msg00419.html
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
Changes in v4:
Fixed dead move into tmp1
---
target/ppc/insn32.decode | 4 +++
target/ppc/translate/vmx-impl.c.inc | 53 +++++++++++++++++++++++++++++
2 files changed, 57 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index d817e44c71..e85a75db2f 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -468,6 +468,10 @@ VMULHSD 000100 ..... ..... ..... 01111001001 @VX
VMULHUD 000100 ..... ..... ..... 01011001001 @VX
VMULLD 000100 ..... ..... ..... 00111001001 @VX
+## Vector Multiply-Sum Instructions
+
+VMSUMCUD 000100 ..... ..... ..... ..... 010111 @VA
+
# VSX Load/Store Instructions
LXV 111101 ..... ..... ............ . 001 @DQ_TSX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 3951ae124a..e029873ae0 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -2081,6 +2081,59 @@ static bool trans_VPEXTD(DisasContext *ctx, arg_VX *a)
return true;
}
+static bool trans_VMSUMCUD(DisasContext *ctx, arg_VA *a)
+{
+ TCGv_i64 tmp0, tmp1, prod1h, prod1l, prod0h, prod0l, zero;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ tmp0 = tcg_temp_new_i64();
+ tmp1 = tcg_temp_new_i64();
+ prod1h = tcg_temp_new_i64();
+ prod1l = tcg_temp_new_i64();
+ prod0h = tcg_temp_new_i64();
+ prod0l = tcg_temp_new_i64();
+ zero = tcg_constant_i64(0);
+
+ /* prod1 = vsr[vra+32].dw[1] * vsr[vrb+32].dw[1] */
+ get_avr64(tmp0, a->vra, false);
+ get_avr64(tmp1, a->vrb, false);
+ tcg_gen_mulu2_i64(prod1l, prod1h, tmp0, tmp1);
+
+ /* prod0 = vsr[vra+32].dw[0] * vsr[vrb+32].dw[0] */
+ get_avr64(tmp0, a->vra, true);
+ get_avr64(tmp1, a->vrb, true);
+ tcg_gen_mulu2_i64(prod0l, prod0h, tmp0, tmp1);
+
+ /* Sum the lower 64-bit elements */
+ get_avr64(tmp1, a->rc, false);
+ tcg_gen_add2_i64(tmp1, tmp0, tmp1, zero, prod1l, zero);
+ tcg_gen_add2_i64(tmp1, tmp0, tmp1, tmp0, prod0l, zero);
+
+ /*
+ * Discard the lower 64 bits, leaving the carry into bit 64.
+ * Then sum the higher 64-bit elements.
+ */
+ get_avr64(tmp1, a->rc, true);
+ tcg_gen_add2_i64(tmp1, tmp0, tmp0, zero, tmp1, zero);
+ tcg_gen_add2_i64(tmp1, tmp0, tmp1, tmp0, prod1h, zero);
+ tcg_gen_add2_i64(tmp1, tmp0, tmp1, tmp0, prod0h, zero);
+
+ /* Discard 64 more bits to complete the CHOP128(temp >> 128) */
+ set_avr64(a->vrt, tmp0, false);
+ set_avr64(a->vrt, zero, true);
+
+ tcg_temp_free_i64(tmp0);
+ tcg_temp_free_i64(tmp1);
+ tcg_temp_free_i64(prod1h);
+ tcg_temp_free_i64(prod1l);
+ tcg_temp_free_i64(prod0h);
+ tcg_temp_free_i64(prod0l);
+
+ return true;
+}
+
static bool do_vx_helper(DisasContext *ctx, arg_VX *a,
void (*gen_helper) (TCGv_ptr, TCGv_ptr, TCGv_ptr))
{
--
2.25.1
* Re: [PATCH v4 05/47] target/ppc: Implement vmsumcud instruction
2022-02-22 14:36 ` [PATCH v4 05/47] target/ppc: Implement vmsumcud instruction matheus.ferst
@ 2022-02-22 18:28 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 18:28 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: Víctor Colombo, groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Víctor Colombo <victor.colombo@eldorado.org.br>
>
> Based on [1] by Lijun Pan <ljp@linux.ibm.com>, which was never merged
> into master.
>
> [1]: https://lists.gnu.org/archive/html/qemu-ppc/2020-07/msg00419.html
>
> Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> Changes in v4:
>
> Fixed dead move into tmp1
> ---
> target/ppc/insn32.decode | 4 +++
> target/ppc/translate/vmx-impl.c.inc | 53 +++++++++++++++++++++++++++++
> 2 files changed, 57 insertions(+)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
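As an aid to following the add2 chain in trans_VMSUMCUD, here is a small host-side C model of the carry computation. It is a sketch under assumptions: u128, add2 and vmsumcud_carry are hypothetical names, add2 only mimics the wrapping 128-bit add that tcg_gen_add2_i64 performs, and the two products are passed in already split into halves.

```c
#include <stdint.h>

/* Hypothetical 128-bit value split into two 64-bit halves. */
typedef struct {
    uint64_t hi, lo;
} u128;

/* Models a wrapping 128-bit add of (ah:al) + (bh:bl). */
static u128 add2(uint64_t al, uint64_t ah, uint64_t bl, uint64_t bh)
{
    u128 r;

    r.lo = al + bl;
    r.hi = ah + bh + (r.lo < al);   /* carry out of the low half */
    return r;
}

/* Bits 128 and up of prod0 + prod1 + c, i.e. the value written to VRT. */
static uint64_t vmsumcud_carry(u128 prod0, u128 prod1, u128 c)
{
    u128 t;

    /* Sum the three low doublewords; t.hi collects the carry into bit 64. */
    t = add2(c.lo, 0, prod1.lo, 0);
    t = add2(t.lo, t.hi, prod0.lo, 0);

    /* Drop the low 64 bits, then sum the three high doublewords. */
    t = add2(t.hi, 0, c.hi, 0);
    t = add2(t.lo, t.hi, prod1.hi, 0);
    t = add2(t.lo, t.hi, prod0.hi, 0);

    /* Dropping 64 more bits leaves only the carry out of bit 127. */
    return t.hi;
}
```

Because three 128-bit values are summed, the result can only be 0, 1 or 2, which is why the upper doubleword of VRT is always written as zero.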
* [PATCH v4 06/47] target/ppc: Implement vmsumudm instruction
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (4 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 05/47] target/ppc: Implement vmsumcud instruction matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 14:36 ` [PATCH v4 07/47] target/ppc: Move vexts[bhw]2[wd] to decodetree matheus.ferst
` (40 subsequent siblings)
46 siblings, 0 replies; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Víctor Colombo, clg,
Matheus Ferst, david
From: Víctor Colombo <victor.colombo@eldorado.org.br>
Based on [1] by Lijun Pan <ljp@linux.ibm.com>, which was never merged
into master.
[1]: https://lists.gnu.org/archive/html/qemu-ppc/2020-07/msg00419.html
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 1 +
target/ppc/translate/vmx-impl.c.inc | 34 +++++++++++++++++++++++++++++
2 files changed, 35 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index e85a75db2f..732a2bb79e 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -471,6 +471,7 @@ VMULLD 000100 ..... ..... ..... 00111001001 @VX
## Vector Multiply-Sum Instructions
VMSUMCUD 000100 ..... ..... ..... ..... 010111 @VA
+VMSUMUDM 000100 ..... ..... ..... ..... 100011 @VA
# VSX Load/Store Instructions
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index e029873ae0..afe895ab7f 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -2081,6 +2081,40 @@ static bool trans_VPEXTD(DisasContext *ctx, arg_VX *a)
return true;
}
+static bool trans_VMSUMUDM(DisasContext *ctx, arg_VA *a)
+{
+ TCGv_i64 rl, rh, src1, src2;
+ int dw;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+ REQUIRE_VECTOR(ctx);
+
+ rh = tcg_temp_new_i64();
+ rl = tcg_temp_new_i64();
+ src1 = tcg_temp_new_i64();
+ src2 = tcg_temp_new_i64();
+
+ get_avr64(rl, a->rc, false);
+ get_avr64(rh, a->rc, true);
+
+ for (dw = 0; dw < 2; dw++) {
+ get_avr64(src1, a->vra, dw);
+ get_avr64(src2, a->vrb, dw);
+ tcg_gen_mulu2_i64(src1, src2, src1, src2);
+ tcg_gen_add2_i64(rl, rh, rl, rh, src1, src2);
+ }
+
+ set_avr64(a->vrt, rl, false);
+ set_avr64(a->vrt, rh, true);
+
+ tcg_temp_free_i64(rl);
+ tcg_temp_free_i64(rh);
+ tcg_temp_free_i64(src1);
+ tcg_temp_free_i64(src2);
+
+ return true;
+}
+
static bool trans_VMSUMCUD(DisasContext *ctx, arg_VA *a)
{
TCGv_i64 tmp0, tmp1, prod1h, prod1l, prod0h, prod0l, zero;
--
2.25.1
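A compact reference model of what the loop in trans_VMSUMUDM accumulates, written as a sketch rather than as QEMU code. It assumes a compiler that provides unsigned __int128 (a GCC/Clang extension); vr128 and vmsumudm_model are hypothetical names introduced only for this illustration.

```c
#include <stdint.h>

/* Hypothetical view of a vector register as two doublewords. */
typedef struct {
    uint64_t hi, lo;
} vr128;

/* VRT = (VRA.hi * VRB.hi + VRA.lo * VRB.lo + VRC) modulo 2^128. */
static vr128 vmsumudm_model(vr128 a, vr128 b, vr128 c)
{
    /* Start from the 128-bit addend, as the patch does with rl/rh. */
    unsigned __int128 acc = ((unsigned __int128)c.hi << 64) | c.lo;
    vr128 r;

    /* Each step mirrors one mulu2 + add2 pair; overflow wraps silently. */
    acc += (unsigned __int128)a.hi * b.hi;
    acc += (unsigned __int128)a.lo * b.lo;

    r.hi = (uint64_t)(acc >> 64);
    r.lo = (uint64_t)acc;
    return r;
}
```

The order in which the two products are added does not matter, since addition modulo 2^128 is commutative; the model just keeps the two steps separate to match the loop.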
* [PATCH v4 07/47] target/ppc: Move vexts[bhw]2[wd] to decodetree
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (5 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 06/47] target/ppc: Implement vmsumudm instruction matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 18:34 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 08/47] target/ppc: Implement vextsd2q matheus.ferst
` (39 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst,
Lucas Coutinho, david
From: Lucas Coutinho <lucas.coutinho@eldorado.org.br>
Move the following instructions to decodetree:
vextsb2w: Vector Extend Sign Byte To Word
vextsh2w: Vector Extend Sign Halfword To Word
vextsb2d: Vector Extend Sign Byte To Doubleword
vextsh2d: Vector Extend Sign Halfword To Doubleword
vextsw2d: Vector Extend Sign Word To Doubleword
Signed-off-by: Lucas Coutinho <lucas.coutinho@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/helper.h | 5 ---
target/ppc/insn32.decode | 8 ++++
target/ppc/int_helper.c | 15 --------
target/ppc/translate/vmx-impl.c.inc | 60 ++++++++++++++++++++++++++---
target/ppc/translate/vmx-ops.c.inc | 5 ---
5 files changed, 63 insertions(+), 30 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index d0c5a3fef1..6ac72868bb 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -244,11 +244,6 @@ DEF_HELPER_4(VINSBLX, void, env, avr, i64, tl)
DEF_HELPER_4(VINSHLX, void, env, avr, i64, tl)
DEF_HELPER_4(VINSWLX, void, env, avr, i64, tl)
DEF_HELPER_4(VINSDLX, void, env, avr, i64, tl)
-DEF_HELPER_2(vextsb2w, void, avr, avr)
-DEF_HELPER_2(vextsh2w, void, avr, avr)
-DEF_HELPER_2(vextsb2d, void, avr, avr)
-DEF_HELPER_2(vextsh2d, void, avr, avr)
-DEF_HELPER_2(vextsw2d, void, avr, avr)
DEF_HELPER_2(vnegw, void, avr, avr)
DEF_HELPER_2(vnegd, void, avr, avr)
DEF_HELPER_2(vupkhpx, void, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 732a2bb79e..1dcf9c61e9 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -419,6 +419,14 @@ VINSWVRX 000100 ..... ..... ..... 00110001111 @VX
VSLDBI 000100 ..... ..... ..... 00 ... 010110 @VN
VSRDBI 000100 ..... ..... ..... 01 ... 010110 @VN
+## Vector Integer Arithmetic Instructions
+
+VEXTSB2W 000100 ..... 10000 ..... 11000000010 @VX_tb
+VEXTSH2W 000100 ..... 10001 ..... 11000000010 @VX_tb
+VEXTSB2D 000100 ..... 11000 ..... 11000000010 @VX_tb
+VEXTSH2D 000100 ..... 11001 ..... 11000000010 @VX_tb
+VEXTSW2D 000100 ..... 11010 ..... 11000000010 @VX_tb
+
## Vector Mask Manipulation Instructions
MTVSRBM 000100 ..... 10000 ..... 11001000010 @VX_tb
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 64c87d9418..ade2b28795 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1646,21 +1646,6 @@ XXBLEND(W, 32)
XXBLEND(D, 64)
#undef XXBLEND
-#define VEXT_SIGNED(name, element, cast) \
-void helper_##name(ppc_avr_t *r, ppc_avr_t *b) \
-{ \
- int i; \
- for (i = 0; i < ARRAY_SIZE(r->element); i++) { \
- r->element[i] = (cast)b->element[i]; \
- } \
-}
-VEXT_SIGNED(vextsb2w, s32, int8_t)
-VEXT_SIGNED(vextsb2d, s64, int8_t)
-VEXT_SIGNED(vextsh2w, s32, int16_t)
-VEXT_SIGNED(vextsh2d, s64, int16_t)
-VEXT_SIGNED(vextsw2d, s64, int32_t)
-#undef VEXT_SIGNED
-
#define VNEG(name, element) \
void helper_##name(ppc_avr_t *r, ppc_avr_t *b) \
{ \
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index afe895ab7f..522f8ac142 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1772,11 +1772,61 @@ GEN_VXFORM_TRANS(vclzw, 1, 30)
GEN_VXFORM_TRANS(vclzd, 1, 31)
GEN_VXFORM_NOA_2(vnegw, 1, 24, 6)
GEN_VXFORM_NOA_2(vnegd, 1, 24, 7)
-GEN_VXFORM_NOA_2(vextsb2w, 1, 24, 16)
-GEN_VXFORM_NOA_2(vextsh2w, 1, 24, 17)
-GEN_VXFORM_NOA_2(vextsb2d, 1, 24, 24)
-GEN_VXFORM_NOA_2(vextsh2d, 1, 24, 25)
-GEN_VXFORM_NOA_2(vextsw2d, 1, 24, 26)
+
+static void gen_vexts_i64(TCGv_i64 t, TCGv_i64 b, int64_t s)
+{
+ tcg_gen_shli_i64(t, b, s);
+ tcg_gen_sari_i64(t, t, s);
+}
+
+static void gen_vexts_i32(TCGv_i32 t, TCGv_i32 b, int32_t s)
+{
+ tcg_gen_shli_i32(t, b, s);
+ tcg_gen_sari_i32(t, t, s);
+}
+
+static void gen_vexts_vec(unsigned vece, TCGv_vec t, TCGv_vec b, int64_t s)
+{
+ tcg_gen_shli_vec(vece, t, b, s);
+ tcg_gen_sari_vec(vece, t, t, s);
+}
+
+static bool do_vexts(DisasContext *ctx, arg_VX_tb *a, unsigned vece, int64_t s)
+{
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_shli_vec, INDEX_op_sari_vec, 0
+ };
+
+ static const GVecGen2i op[2] = {
+ {
+ .fni4 = gen_vexts_i32,
+ .fniv = gen_vexts_vec,
+ .opt_opc = vecop_list,
+ .vece = MO_32
+ },
+ {
+ .fni8 = gen_vexts_i64,
+ .fniv = gen_vexts_vec,
+ .opt_opc = vecop_list,
+ .vece = MO_64
+ },
+ };
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+ REQUIRE_VECTOR(ctx);
+
+ tcg_gen_gvec_2i(avr_full_offset(a->vrt), avr_full_offset(a->vrb),
+ 16, 16, s, &op[vece - MO_32]);
+
+ return true;
+}
+
+TRANS(VEXTSB2W, do_vexts, MO_32, 24);
+TRANS(VEXTSH2W, do_vexts, MO_32, 16);
+TRANS(VEXTSB2D, do_vexts, MO_64, 56);
+TRANS(VEXTSH2D, do_vexts, MO_64, 48);
+TRANS(VEXTSW2D, do_vexts, MO_64, 32);
+
GEN_VXFORM_NOA_2(vctzb, 1, 24, 28)
GEN_VXFORM_NOA_2(vctzh, 1, 24, 29)
GEN_VXFORM_NOA_2(vctzw, 1, 24, 30)
diff --git a/target/ppc/translate/vmx-ops.c.inc b/target/ppc/translate/vmx-ops.c.inc
index 914e68e5b0..6787327f56 100644
--- a/target/ppc/translate/vmx-ops.c.inc
+++ b/target/ppc/translate/vmx-ops.c.inc
@@ -216,11 +216,6 @@ GEN_VXFORM(vspltish, 6, 13),
GEN_VXFORM(vspltisw, 6, 14),
GEN_VXFORM_300_EO(vnegw, 0x01, 0x18, 0x06),
GEN_VXFORM_300_EO(vnegd, 0x01, 0x18, 0x07),
-GEN_VXFORM_300_EO(vextsb2w, 0x01, 0x18, 0x10),
-GEN_VXFORM_300_EO(vextsh2w, 0x01, 0x18, 0x11),
-GEN_VXFORM_300_EO(vextsb2d, 0x01, 0x18, 0x18),
-GEN_VXFORM_300_EO(vextsh2d, 0x01, 0x18, 0x19),
-GEN_VXFORM_300_EO(vextsw2d, 0x01, 0x18, 0x1A),
GEN_VXFORM_300_EO(vctzb, 0x01, 0x18, 0x1C),
GEN_VXFORM_300_EO(vctzh, 0x01, 0x18, 0x1D),
GEN_VXFORM_300_EO(vctzw, 0x01, 0x18, 0x1E),
--
2.25.1
* Re: [PATCH v4 07/47] target/ppc: Move vexts[bhw]2[wd] to decodetree
2022-02-22 14:36 ` [PATCH v4 07/47] target/ppc: Move vexts[bhw]2[wd] to decodetree matheus.ferst
@ 2022-02-22 18:34 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 18:34 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: Lucas Coutinho, groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> +static void gen_vexts_i64(TCGv_i64 t, TCGv_i64 b, int64_t s)
> +{
> + tcg_gen_shli_i64(t, b, s);
> + tcg_gen_sari_i64(t, t, s);
> +}
> +
> +static void gen_vexts_i32(TCGv_i32 t, TCGv_i32 b, int32_t s)
> +{
> + tcg_gen_shli_i32(t, b, s);
> + tcg_gen_sari_i32(t, t, s);
> +}
Use tcg_gen_sextract_* here instead of the shift pair.
With that,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
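To illustrate why the shift pair and a signed bit-field extract are interchangeable, here is a host-side C sketch. The function names are hypothetical, and the code relies on right shifts of negative values being arithmetic, which is implementation-defined in ISO C but holds on the compilers QEMU supports.

```c
#include <stdint.h>

/* Shift-pair form used in the quoted patch: shl by s, then sar by s. */
static int64_t sext_shift_pair(uint64_t b, unsigned s)
{
    return (int64_t)(b << s) >> s;
}

/* Signed extract of 'len' bits starting at bit 'ofs' (1 <= len, ofs + len <= 64). */
static int64_t sextract_model(uint64_t b, unsigned ofs, unsigned len)
{
    return (int64_t)(b << (64 - len - ofs)) >> (64 - len);
}
```

With ofs = 0 and len = 64 - s the two functions agree, so the shift counts 56, 48 and 32 in the MO_64 entries correspond to extract lengths of 8, 16 and 32 bits. The corresponding TCG call would be tcg_gen_sextract_i64(t, b, 0, len), with tcg_gen_sextract_i32 taking lengths relative to a 32-bit element.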
* [PATCH v4 08/47] target/ppc: Implement vextsd2q
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (6 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 07/47] target/ppc: Move vexts[bhw]2[wd] to decodetree matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 14:36 ` [PATCH v4 09/47] target/ppc: Move Vector Compare Equal/Not Equal/Greater Than to decodetree matheus.ferst
` (38 subsequent siblings)
46 siblings, 0 replies; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst,
Lucas Coutinho, david
From: Lucas Coutinho <lucas.coutinho@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Lucas Coutinho <lucas.coutinho@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 1 +
target/ppc/translate/vmx-impl.c.inc | 18 ++++++++++++++++++
2 files changed, 19 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 1dcf9c61e9..cba680075b 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -426,6 +426,7 @@ VEXTSH2W 000100 ..... 10001 ..... 11000000010 @VX_tb
VEXTSB2D 000100 ..... 11000 ..... 11000000010 @VX_tb
VEXTSH2D 000100 ..... 11001 ..... 11000000010 @VX_tb
VEXTSW2D 000100 ..... 11010 ..... 11000000010 @VX_tb
+VEXTSD2Q 000100 ..... 11011 ..... 11000000010 @VX_tb
## Vector Mask Manipulation Instructions
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 522f8ac142..cf69f4c412 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1827,6 +1827,24 @@ TRANS(VEXTSB2D, do_vexts, MO_64, 56);
TRANS(VEXTSH2D, do_vexts, MO_64, 48);
TRANS(VEXTSW2D, do_vexts, MO_64, 32);
+static bool trans_VEXTSD2Q(DisasContext *ctx, arg_VX_tb *a)
+{
+ TCGv_i64 tmp;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ tmp = tcg_temp_new_i64();
+
+ get_avr64(tmp, a->vrb, false);
+ set_avr64(a->vrt, tmp, false);
+ tcg_gen_sari_i64(tmp, tmp, 63);
+ set_avr64(a->vrt, tmp, true);
+
+ tcg_temp_free_i64(tmp);
+ return true;
+}
+
GEN_VXFORM_NOA_2(vctzb, 1, 24, 28)
GEN_VXFORM_NOA_2(vctzh, 1, 24, 29)
GEN_VXFORM_NOA_2(vctzw, 1, 24, 30)
--
2.25.1
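The doubleword-to-quadword extension in trans_VEXTSD2Q can be modeled on the host in a few lines. This is a sketch only: q128 and vextsd2q_model are hypothetical names, and the arithmetic right shift of a negative value is implementation-defined in ISO C (it behaves as expected on the compilers QEMU supports).

```c
#include <stdint.h>

/* Hypothetical 128-bit result split into two doublewords. */
typedef struct {
    uint64_t hi, lo;
} q128;

/* Sign-extend the low doubleword of VRB to a full quadword. */
static q128 vextsd2q_model(uint64_t vrb_lo)
{
    q128 r;

    r.lo = vrb_lo;
    /* Shifting right by 63 replicates the sign bit into all 64 bits. */
    r.hi = (uint64_t)((int64_t)vrb_lo >> 63);
    return r;
}
```

The high doubleword is therefore either all zeros or all ones, which matches the single tcg_gen_sari_i64(tmp, tmp, 63) in the patch.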
* [PATCH v4 09/47] target/ppc: Move Vector Compare Equal/Not Equal/Greater Than to decodetree
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (7 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 08/47] target/ppc: Implement vextsd2q matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 18:37 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 10/47] target/ppc: Move Vector Compare Not Equal or Zero " matheus.ferst
` (37 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/helper.h | 30 ----------
target/ppc/insn32.decode | 24 ++++++++
target/ppc/int_helper.c | 54 -----------------
target/ppc/translate/vmx-impl.c.inc | 89 ++++++++++++++++++++---------
target/ppc/translate/vmx-ops.c.inc | 15 +----
5 files changed, 88 insertions(+), 124 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 6ac72868bb..fb421dd343 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -140,46 +140,16 @@ DEF_HELPER_3(vabsduw, void, avr, avr, avr)
DEF_HELPER_3(vavgsb, void, avr, avr, avr)
DEF_HELPER_3(vavgsh, void, avr, avr, avr)
DEF_HELPER_3(vavgsw, void, avr, avr, avr)
-DEF_HELPER_4(vcmpequb, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpequh, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpequw, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpequd, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpneb, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpneh, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpnew, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpnezb, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpnezh, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpnezw, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtub, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtuh, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtuw, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtud, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsb, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsh, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsw, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsd, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpeqfp, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpgefp, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpgtfp, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpbfp, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpequb_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpequh_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpequw_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpequd_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpneb_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpneh_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpnew_dot, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpnezb_dot, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpnezh_dot, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpnezw_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtub_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtuh_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtuw_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtud_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsb_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsh_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsw_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsd_dot, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpeqfp_dot, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpgefp_dot, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpgtfp_dot, void, env, avr, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index cba680075b..5443ee0394 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -51,6 +51,9 @@
&VA vrt vra vrb rc
@VA ...... vrt:5 vra:5 vrb:5 rc:5 ...... &VA
+&VC vrt vra vrb rc:bool
+@VC ...... vrt:5 vra:5 vrb:5 rc:1 .......... &VC
+
&VN vrt vra vrb sh
@VN ...... vrt:5 vra:5 vrb:5 .. sh:3 ...... &VN
@@ -373,6 +376,27 @@ DSCLIQ 111111 ..... ..... ...... 001000010 . @Z22_tap_sh_rc
DSCRI 111011 ..... ..... ...... 001100010 . @Z22_ta_sh_rc
DSCRIQ 111111 ..... ..... ...... 001100010 . @Z22_tap_sh_rc
+## Vector Integer Instructions
+
+VCMPEQUB 000100 ..... ..... ..... . 0000000110 @VC
+VCMPEQUH 000100 ..... ..... ..... . 0001000110 @VC
+VCMPEQUW 000100 ..... ..... ..... . 0010000110 @VC
+VCMPEQUD 000100 ..... ..... ..... . 0011000111 @VC
+
+VCMPGTSB 000100 ..... ..... ..... . 1100000110 @VC
+VCMPGTSH 000100 ..... ..... ..... . 1101000110 @VC
+VCMPGTSW 000100 ..... ..... ..... . 1110000110 @VC
+VCMPGTSD 000100 ..... ..... ..... . 1111000111 @VC
+
+VCMPGTUB 000100 ..... ..... ..... . 1000000110 @VC
+VCMPGTUH 000100 ..... ..... ..... . 1001000110 @VC
+VCMPGTUW 000100 ..... ..... ..... . 1010000110 @VC
+VCMPGTUD 000100 ..... ..... ..... . 1011000111 @VC
+
+VCMPNEB 000100 ..... ..... ..... . 0000000111 @VC
+VCMPNEH 000100 ..... ..... ..... . 0001000111 @VC
+VCMPNEW 000100 ..... ..... ..... . 0010000111 @VC
+
## Vector Bit Manipulation Instruction
VCFUGED 000100 ..... ..... ..... 10101001101 @VX
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index ade2b28795..c9e64014dc 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -662,57 +662,6 @@ VCF(ux, uint32_to_float32, u32)
VCF(sx, int32_to_float32, s32)
#undef VCF
-#define VCMP_DO(suffix, compare, element, record) \
- void helper_vcmp##suffix(CPUPPCState *env, ppc_avr_t *r, \
- ppc_avr_t *a, ppc_avr_t *b) \
- { \
- uint64_t ones = (uint64_t)-1; \
- uint64_t all = ones; \
- uint64_t none = 0; \
- int i; \
- \
- for (i = 0; i < ARRAY_SIZE(r->element); i++) { \
- uint64_t result = (a->element[i] compare b->element[i] ? \
- ones : 0x0); \
- switch (sizeof(a->element[0])) { \
- case 8: \
- r->u64[i] = result; \
- break; \
- case 4: \
- r->u32[i] = result; \
- break; \
- case 2: \
- r->u16[i] = result; \
- break; \
- case 1: \
- r->u8[i] = result; \
- break; \
- } \
- all &= result; \
- none |= result; \
- } \
- if (record) { \
- env->crf[6] = ((all != 0) << 3) | ((none == 0) << 1); \
- } \
- }
-#define VCMP(suffix, compare, element) \
- VCMP_DO(suffix, compare, element, 0) \
- VCMP_DO(suffix##_dot, compare, element, 1)
-VCMP(equb, ==, u8)
-VCMP(equh, ==, u16)
-VCMP(equw, ==, u32)
-VCMP(equd, ==, u64)
-VCMP(gtub, >, u8)
-VCMP(gtuh, >, u16)
-VCMP(gtuw, >, u32)
-VCMP(gtud, >, u64)
-VCMP(gtsb, >, s8)
-VCMP(gtsh, >, s16)
-VCMP(gtsw, >, s32)
-VCMP(gtsd, >, s64)
-#undef VCMP_DO
-#undef VCMP
-
#define VCMPNE_DO(suffix, element, etype, cmpzero, record) \
void helper_vcmpne##suffix(CPUPPCState *env, ppc_avr_t *r, \
ppc_avr_t *a, ppc_avr_t *b) \
@@ -751,9 +700,6 @@ void helper_vcmpne##suffix(CPUPPCState *env, ppc_avr_t *r, \
VCMPNE(zb, u8, uint8_t, 1)
VCMPNE(zh, u16, uint16_t, 1)
VCMPNE(zw, u32, uint32_t, 1)
-VCMPNE(b, u8, uint8_t, 0)
-VCMPNE(h, u16, uint16_t, 0)
-VCMPNE(w, u32, uint32_t, 0)
#undef VCMPNE_DO
#undef VCMPNE
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index cf69f4c412..e007003f14 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -985,41 +985,74 @@ static void glue(gen_, name0##_##name1)(DisasContext *ctx) \
} \
}
-GEN_VXRFORM(vcmpequb, 3, 0)
-GEN_VXRFORM(vcmpequh, 3, 1)
-GEN_VXRFORM(vcmpequw, 3, 2)
-GEN_VXRFORM(vcmpequd, 3, 3)
GEN_VXRFORM(vcmpnezb, 3, 4)
GEN_VXRFORM(vcmpnezh, 3, 5)
GEN_VXRFORM(vcmpnezw, 3, 6)
-GEN_VXRFORM(vcmpgtsb, 3, 12)
-GEN_VXRFORM(vcmpgtsh, 3, 13)
-GEN_VXRFORM(vcmpgtsw, 3, 14)
-GEN_VXRFORM(vcmpgtsd, 3, 15)
-GEN_VXRFORM(vcmpgtub, 3, 8)
-GEN_VXRFORM(vcmpgtuh, 3, 9)
-GEN_VXRFORM(vcmpgtuw, 3, 10)
-GEN_VXRFORM(vcmpgtud, 3, 11)
+
+static void do_vcmp_rc(int vrt)
+{
+ TCGv_i64 tmp, set, clr;
+
+ tmp = tcg_temp_new_i64();
+ set = tcg_temp_new_i64();
+ clr = tcg_temp_new_i64();
+
+ get_avr64(tmp, vrt, true);
+ tcg_gen_mov_i64(set, tmp);
+ get_avr64(tmp, vrt, false);
+ tcg_gen_or_i64(clr, set, tmp);
+ tcg_gen_and_i64(set, set, tmp);
+
+ tcg_gen_setcondi_i64(TCG_COND_EQ, clr, clr, 0);
+ tcg_gen_shli_i64(clr, clr, 1);
+
+ tcg_gen_setcondi_i64(TCG_COND_EQ, set, set, -1);
+ tcg_gen_shli_i64(set, set, 3);
+
+ tcg_gen_or_i64(tmp, set, clr);
+ tcg_gen_extrl_i64_i32(cpu_crf[6], tmp);
+
+ tcg_temp_free_i64(tmp);
+ tcg_temp_free_i64(set);
+ tcg_temp_free_i64(clr);
+}
+
+static bool do_vcmp(DisasContext *ctx, arg_VC *a, TCGCond cond, int vece)
+{
+ REQUIRE_VECTOR(ctx);
+
+ tcg_gen_gvec_cmp(cond, vece, avr_full_offset(a->vrt),
+ avr_full_offset(a->vra), avr_full_offset(a->vrb), 16, 16);
+
+ if (a->rc) {
+ do_vcmp_rc(a->vrt);
+ }
+
+ return true;
+}
+
+TRANS_FLAGS(ALTIVEC, VCMPEQUB, do_vcmp, TCG_COND_EQ, MO_8)
+TRANS_FLAGS(ALTIVEC, VCMPEQUH, do_vcmp, TCG_COND_EQ, MO_16)
+TRANS_FLAGS(ALTIVEC, VCMPEQUW, do_vcmp, TCG_COND_EQ, MO_32)
+TRANS_FLAGS2(ALTIVEC_207, VCMPEQUD, do_vcmp, TCG_COND_EQ, MO_64)
+
+TRANS_FLAGS(ALTIVEC, VCMPGTSB, do_vcmp, TCG_COND_GT, MO_8)
+TRANS_FLAGS(ALTIVEC, VCMPGTSH, do_vcmp, TCG_COND_GT, MO_16)
+TRANS_FLAGS(ALTIVEC, VCMPGTSW, do_vcmp, TCG_COND_GT, MO_32)
+TRANS_FLAGS2(ALTIVEC_207, VCMPGTSD, do_vcmp, TCG_COND_GT, MO_64)
+TRANS_FLAGS(ALTIVEC, VCMPGTUB, do_vcmp, TCG_COND_GTU, MO_8)
+TRANS_FLAGS(ALTIVEC, VCMPGTUH, do_vcmp, TCG_COND_GTU, MO_16)
+TRANS_FLAGS(ALTIVEC, VCMPGTUW, do_vcmp, TCG_COND_GTU, MO_32)
+TRANS_FLAGS2(ALTIVEC_207, VCMPGTUD, do_vcmp, TCG_COND_GTU, MO_64)
+
+TRANS_FLAGS2(ISA300, VCMPNEB, do_vcmp, TCG_COND_NE, MO_8)
+TRANS_FLAGS2(ISA300, VCMPNEH, do_vcmp, TCG_COND_NE, MO_16)
+TRANS_FLAGS2(ISA300, VCMPNEW, do_vcmp, TCG_COND_NE, MO_32)
+
GEN_VXRFORM(vcmpeqfp, 3, 3)
GEN_VXRFORM(vcmpgefp, 3, 7)
GEN_VXRFORM(vcmpgtfp, 3, 11)
GEN_VXRFORM(vcmpbfp, 3, 15)
-GEN_VXRFORM(vcmpneb, 3, 0)
-GEN_VXRFORM(vcmpneh, 3, 1)
-GEN_VXRFORM(vcmpnew, 3, 2)
-
-GEN_VXRFORM_DUAL(vcmpequb, PPC_ALTIVEC, PPC_NONE, \
- vcmpneb, PPC_NONE, PPC2_ISA300)
-GEN_VXRFORM_DUAL(vcmpequh, PPC_ALTIVEC, PPC_NONE, \
- vcmpneh, PPC_NONE, PPC2_ISA300)
-GEN_VXRFORM_DUAL(vcmpequw, PPC_ALTIVEC, PPC_NONE, \
- vcmpnew, PPC_NONE, PPC2_ISA300)
-GEN_VXRFORM_DUAL(vcmpeqfp, PPC_ALTIVEC, PPC_NONE, \
- vcmpequd, PPC_NONE, PPC2_ALTIVEC_207)
-GEN_VXRFORM_DUAL(vcmpbfp, PPC_ALTIVEC, PPC_NONE, \
- vcmpgtsd, PPC_NONE, PPC2_ALTIVEC_207)
-GEN_VXRFORM_DUAL(vcmpgtfp, PPC_ALTIVEC, PPC_NONE, \
- vcmpgtud, PPC_NONE, PPC2_ALTIVEC_207)
static void gen_vsplti(DisasContext *ctx, int vece)
{
diff --git a/target/ppc/translate/vmx-ops.c.inc b/target/ppc/translate/vmx-ops.c.inc
index 6787327f56..80d460c34e 100644
--- a/target/ppc/translate/vmx-ops.c.inc
+++ b/target/ppc/translate/vmx-ops.c.inc
@@ -187,19 +187,10 @@ GEN_HANDLER2_E(name, str, 0x4, opc2, opc3, 0x00000000, PPC_NONE, PPC2_ISA300),
GEN_VXRFORM_300(vcmpnezb, 3, 4)
GEN_VXRFORM_300(vcmpnezh, 3, 5)
GEN_VXRFORM_300(vcmpnezw, 3, 6)
-GEN_VXRFORM(vcmpgtsb, 3, 12)
-GEN_VXRFORM(vcmpgtsh, 3, 13)
-GEN_VXRFORM(vcmpgtsw, 3, 14)
-GEN_VXRFORM(vcmpgtub, 3, 8)
-GEN_VXRFORM(vcmpgtuh, 3, 9)
-GEN_VXRFORM(vcmpgtuw, 3, 10)
-GEN_VXRFORM_DUAL(vcmpeqfp, vcmpequd, 3, 3, PPC_ALTIVEC, PPC_NONE)
+GEN_VXRFORM(vcmpeqfp, 3, 3)
GEN_VXRFORM(vcmpgefp, 3, 7)
-GEN_VXRFORM_DUAL(vcmpgtfp, vcmpgtud, 3, 11, PPC_ALTIVEC, PPC_NONE)
-GEN_VXRFORM_DUAL(vcmpbfp, vcmpgtsd, 3, 15, PPC_ALTIVEC, PPC_NONE)
-GEN_VXRFORM_DUAL(vcmpequb, vcmpneb, 3, 0, PPC_ALTIVEC, PPC_NONE)
-GEN_VXRFORM_DUAL(vcmpequh, vcmpneh, 3, 1, PPC_ALTIVEC, PPC_NONE)
-GEN_VXRFORM_DUAL(vcmpequw, vcmpnew, 3, 2, PPC_ALTIVEC, PPC_NONE)
+GEN_VXRFORM(vcmpgtfp, 3, 11)
+GEN_VXRFORM(vcmpbfp, 3, 15)
#define GEN_VXFORM_DUAL_INV(name0, name1, opc2, opc3, inval0, inval1, type) \
GEN_OPCODE_DUAL(name0##_##name1, 0x04, opc2, opc3, inval0, inval1, type, \
--
2.25.1
* Re: [PATCH v4 09/47] target/ppc: Move Vector Compare Equal/Not Equal/Greater Than to decodetree
2022-02-22 14:36 ` [PATCH v4 09/47] target/ppc: Move Vector Compare Equal/Not Equal/Greater Than to decodetree matheus.ferst
@ 2022-02-22 18:37 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 18:37 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst<matheus.ferst@eldorado.org.br>
>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> target/ppc/helper.h | 30 ----------
> target/ppc/insn32.decode | 24 ++++++++
> target/ppc/int_helper.c | 54 -----------------
> target/ppc/translate/vmx-impl.c.inc | 89 ++++++++++++++++++++---------
> target/ppc/translate/vmx-ops.c.inc | 15 +----
> 5 files changed, 88 insertions(+), 124 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
* [PATCH v4 10/47] target/ppc: Move Vector Compare Not Equal or Zero to decodetree
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (8 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 09/47] target/ppc: Move Vector Compare Equal/Not Equal/Greater Than to decodetree matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 19:04 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 11/47] target/ppc: Implement Vector Compare Equal Quadword matheus.ferst
` (36 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/helper.h | 9 ++--
target/ppc/insn32.decode | 4 ++
target/ppc/int_helper.c | 50 +++++-----------------
target/ppc/translate/vmx-impl.c.inc | 66 +++++++++++++++++++++++++++--
target/ppc/translate/vmx-ops.c.inc | 3 --
5 files changed, 80 insertions(+), 52 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index fb421dd343..303a29fb5a 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -140,16 +140,13 @@ DEF_HELPER_3(vabsduw, void, avr, avr, avr)
DEF_HELPER_3(vavgsb, void, avr, avr, avr)
DEF_HELPER_3(vavgsh, void, avr, avr, avr)
DEF_HELPER_3(vavgsw, void, avr, avr, avr)
-DEF_HELPER_4(vcmpnezb, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpnezh, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpnezw, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpeqfp, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpgefp, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpgtfp, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpbfp, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpnezb_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpnezh_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpnezw_dot, void, env, avr, avr, avr)
+DEF_HELPER_4(VCMPNEZB, void, avr, avr, avr, i32)
+DEF_HELPER_4(VCMPNEZH, void, avr, avr, avr, i32)
+DEF_HELPER_4(VCMPNEZW, void, avr, avr, avr, i32)
DEF_HELPER_4(vcmpeqfp_dot, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpgefp_dot, void, env, avr, avr, avr)
DEF_HELPER_4(vcmpgtfp_dot, void, env, avr, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 5443ee0394..be9e05cc73 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -397,6 +397,10 @@ VCMPNEB 000100 ..... ..... ..... . 0000000111 @VC
VCMPNEH 000100 ..... ..... ..... . 0001000111 @VC
VCMPNEW 000100 ..... ..... ..... . 0010000111 @VC
+VCMPNEZB 000100 ..... ..... ..... . 0100000111 @VC
+VCMPNEZH 000100 ..... ..... ..... . 0101000111 @VC
+VCMPNEZW 000100 ..... ..... ..... . 0110000111 @VC
+
## Vector Bit Manipulation Instruction
VCFUGED 000100 ..... ..... ..... 10101001101 @VX
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index c9e64014dc..fce782499f 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -662,46 +662,18 @@ VCF(ux, uint32_to_float32, u32)
VCF(sx, int32_to_float32, s32)
#undef VCF
-#define VCMPNE_DO(suffix, element, etype, cmpzero, record) \
-void helper_vcmpne##suffix(CPUPPCState *env, ppc_avr_t *r, \
- ppc_avr_t *a, ppc_avr_t *b) \
-{ \
- etype ones = (etype)-1; \
- etype all = ones; \
- etype result, none = 0; \
- int i; \
- \
- for (i = 0; i < ARRAY_SIZE(r->element); i++) { \
- if (cmpzero) { \
- result = ((a->element[i] == 0) \
- || (b->element[i] == 0) \
- || (a->element[i] != b->element[i]) ? \
- ones : 0x0); \
- } else { \
- result = (a->element[i] != b->element[i]) ? ones : 0x0; \
- } \
- r->element[i] = result; \
- all &= result; \
- none |= result; \
- } \
- if (record) { \
- env->crf[6] = ((all != 0) << 3) | ((none == 0) << 1); \
- } \
+#define VCMPNEZ(NAME, ELEM) \
+void helper_##NAME(ppc_vsr_t *t, ppc_vsr_t *a, ppc_vsr_t *b, uint32_t desc) \
+{ \
+ for (int i = 0; i < ARRAY_SIZE(t->ELEM); i++) { \
+ t->ELEM[i] = ((a->ELEM[i] == 0) || (b->ELEM[i] == 0) || \
+ (a->ELEM[i] != b->ELEM[i])) ? -1 : 0; \
+ } \
}
-
-/*
- * VCMPNEZ - Vector compare not equal to zero
- * suffix - instruction mnemonic suffix (b: byte, h: halfword, w: word)
- * element - element type to access from vector
- */
-#define VCMPNE(suffix, element, etype, cmpzero) \
- VCMPNE_DO(suffix, element, etype, cmpzero, 0) \
- VCMPNE_DO(suffix##_dot, element, etype, cmpzero, 1)
-VCMPNE(zb, u8, uint8_t, 1)
-VCMPNE(zh, u16, uint16_t, 1)
-VCMPNE(zw, u32, uint32_t, 1)
-#undef VCMPNE_DO
-#undef VCMPNE
+VCMPNEZ(VCMPNEZB, u8)
+VCMPNEZ(VCMPNEZH, u16)
+VCMPNEZ(VCMPNEZW, u32)
+#undef VCMPNEZ
#define VCMPFP_DO(suffix, compare, order, record) \
void helper_vcmp##suffix(CPUPPCState *env, ppc_avr_t *r, \
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index e007003f14..d7f807b81d 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -985,10 +985,6 @@ static void glue(gen_, name0##_##name1)(DisasContext *ctx) \
} \
}
-GEN_VXRFORM(vcmpnezb, 3, 4)
-GEN_VXRFORM(vcmpnezh, 3, 5)
-GEN_VXRFORM(vcmpnezw, 3, 6)
-
static void do_vcmp_rc(int vrt)
{
TCGv_i64 tmp, set, clr;
@@ -1049,6 +1045,68 @@ TRANS_FLAGS2(ISA300, VCMPNEB, do_vcmp, TCG_COND_NE, MO_8)
TRANS_FLAGS2(ISA300, VCMPNEH, do_vcmp, TCG_COND_NE, MO_16)
TRANS_FLAGS2(ISA300, VCMPNEW, do_vcmp, TCG_COND_NE, MO_32)
+static void gen_vcmpnez_vec(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b)
+{
+ TCGv_vec t0, t1, zero;
+
+ t0 = tcg_temp_new_vec_matching(t);
+ t1 = tcg_temp_new_vec_matching(t);
+ zero = tcg_constant_vec_matching(t, vece, 0);
+
+ tcg_gen_cmp_vec(TCG_COND_EQ, vece, t0, a, zero);
+ tcg_gen_cmp_vec(TCG_COND_EQ, vece, t1, b, zero);
+ tcg_gen_cmp_vec(TCG_COND_NE, vece, t, a, b);
+
+ tcg_gen_or_vec(vece, t, t, t0);
+ tcg_gen_or_vec(vece, t, t, t1);
+
+ tcg_temp_free_vec(t0);
+ tcg_temp_free_vec(t1);
+}
+
+static bool do_vcmpnez(DisasContext *ctx, arg_VC *a, int vece)
+{
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_cmp_vec, 0
+ };
+ static const GVecGen3 ops[3] = {
+ {
+ .fniv = gen_vcmpnez_vec,
+ .fno = gen_helper_VCMPNEZB,
+ .opt_opc = vecop_list,
+ .vece = MO_8
+ },
+ {
+ .fniv = gen_vcmpnez_vec,
+ .fno = gen_helper_VCMPNEZH,
+ .opt_opc = vecop_list,
+ .vece = MO_16
+ },
+ {
+ .fniv = gen_vcmpnez_vec,
+ .fno = gen_helper_VCMPNEZW,
+ .opt_opc = vecop_list,
+ .vece = MO_32
+ }
+ };
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+ REQUIRE_VECTOR(ctx);
+
+ tcg_gen_gvec_3(avr_full_offset(a->vrt), avr_full_offset(a->vra),
+ avr_full_offset(a->vrb), 16, 16, &ops[vece]);
+
+ if (a->rc) {
+ do_vcmp_rc(a->vrt);
+ }
+
+ return true;
+}
+
+TRANS(VCMPNEZB, do_vcmpnez, MO_8)
+TRANS(VCMPNEZH, do_vcmpnez, MO_16)
+TRANS(VCMPNEZW, do_vcmpnez, MO_32)
+
GEN_VXRFORM(vcmpeqfp, 3, 3)
GEN_VXRFORM(vcmpgefp, 3, 7)
GEN_VXRFORM(vcmpgtfp, 3, 11)
diff --git a/target/ppc/translate/vmx-ops.c.inc b/target/ppc/translate/vmx-ops.c.inc
index 80d460c34e..cb4c5bb953 100644
--- a/target/ppc/translate/vmx-ops.c.inc
+++ b/target/ppc/translate/vmx-ops.c.inc
@@ -184,9 +184,6 @@ GEN_HANDLER2_E(name, str, 0x4, opc2, opc3, 0x00000000, PPC_NONE, PPC2_ISA300),
GEN_VXRFORM1_300(name, name, #name, opc2, opc3) \
GEN_VXRFORM1_300(name##_dot, name##_, #name ".", opc2, (opc3 | (0x1 << 4)))
-GEN_VXRFORM_300(vcmpnezb, 3, 4)
-GEN_VXRFORM_300(vcmpnezh, 3, 5)
-GEN_VXRFORM_300(vcmpnezw, 3, 6)
GEN_VXRFORM(vcmpeqfp, 3, 3)
GEN_VXRFORM(vcmpgefp, 3, 7)
GEN_VXRFORM(vcmpgtfp, 3, 11)
--
2.25.1
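
The per-element rule that the new VCMPNEZ helper applies can be modeled in
plain C outside of QEMU. The sketch below covers only the byte variant and
uses a bare 16-byte array in place of ppc_vsr_t; the name vcmpnezb_model is
hypothetical and not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Scalar model of "compare not equal or zero" on bytes: a result element is
 * all ones when either input element is zero or when the two elements
 * differ, and all zeros otherwise.
 * vcmpnezb_model is a hypothetical name, not a QEMU function.
 */
void vcmpnezb_model(uint8_t t[16], const uint8_t a[16], const uint8_t b[16])
{
    for (size_t i = 0; i < 16; i++) {
        t[i] = (a[i] == 0 || b[i] == 0 || a[i] != b[i]) ? 0xff : 0x00;
    }
}
```

Only elements that are equal and non-zero produce a zero result, which is why
the vector expansion ORs an inequality compare with two compare-to-zero
results.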
* Re: [PATCH v4 10/47] target/ppc: Move Vector Compare Not Equal or Zero to decodetree
2022-02-22 14:36 ` [PATCH v4 10/47] target/ppc: Move Vector Compare Not Equal or Zero " matheus.ferst
@ 2022-02-22 19:04 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 19:04 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst<matheus.ferst@eldorado.org.br>
>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> target/ppc/helper.h | 9 ++--
> target/ppc/insn32.decode | 4 ++
> target/ppc/int_helper.c | 50 +++++-----------------
> target/ppc/translate/vmx-impl.c.inc | 66 +++++++++++++++++++++++++++--
> target/ppc/translate/vmx-ops.c.inc | 3 --
> 5 files changed, 80 insertions(+), 52 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
* [PATCH v4 11/47] target/ppc: Implement Vector Compare Equal Quadword
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (9 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 10/47] target/ppc: Move Vector Compare Not Equal or Zero " matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 19:05 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 12/47] target/ppc: Implement Vector Compare Greater Than Quadword matheus.ferst
` (35 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Implement the following PowerISA v3.1 instructions:
vcmpequq: Vector Compare Equal Quadword
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
v4:
- Branchless implementation (rth)
---
target/ppc/insn32.decode | 1 +
target/ppc/translate/vmx-impl.c.inc | 36 +++++++++++++++++++++++++++++
2 files changed, 37 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index be9e05cc73..437a3e29e0 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -382,6 +382,7 @@ VCMPEQUB 000100 ..... ..... ..... . 0000000110 @VC
VCMPEQUH 000100 ..... ..... ..... . 0001000110 @VC
VCMPEQUW 000100 ..... ..... ..... . 0010000110 @VC
VCMPEQUD 000100 ..... ..... ..... . 0011000111 @VC
+VCMPEQUQ 000100 ..... ..... ..... . 0111000111 @VC
VCMPGTSB 000100 ..... ..... ..... . 1100000110 @VC
VCMPGTSH 000100 ..... ..... ..... . 1101000110 @VC
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index d7f807b81d..d66a642b67 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1107,6 +1107,42 @@ TRANS(VCMPNEZB, do_vcmpnez, MO_8)
TRANS(VCMPNEZH, do_vcmpnez, MO_16)
TRANS(VCMPNEZW, do_vcmpnez, MO_32)
+static bool trans_VCMPEQUQ(DisasContext *ctx, arg_VC *a)
+{
+ TCGv_i64 t0, t1, t2;
+
+ t0 = tcg_temp_new_i64();
+ t1 = tcg_temp_new_i64();
+ t2 = tcg_temp_new_i64();
+
+ get_avr64(t0, a->vra, true);
+ get_avr64(t1, a->vrb, true);
+ tcg_gen_xor_i64(t2, t0, t1);
+
+ get_avr64(t0, a->vra, false);
+ get_avr64(t1, a->vrb, false);
+ tcg_gen_xor_i64(t1, t0, t1);
+
+ tcg_gen_or_i64(t1, t1, t2);
+ tcg_gen_setcondi_i64(TCG_COND_EQ, t1, t1, 0);
+ tcg_gen_neg_i64(t1, t1);
+
+ set_avr64(a->vrt, t1, true);
+ set_avr64(a->vrt, t1, false);
+
+ if (a->rc) {
+ tcg_gen_extrl_i64_i32(cpu_crf[6], t1);
+ tcg_gen_andi_i32(cpu_crf[6], cpu_crf[6], 0xa);
+ tcg_gen_xori_i32(cpu_crf[6], cpu_crf[6], 0x2);
+ }
+
+ tcg_temp_free_i64(t0);
+ tcg_temp_free_i64(t1);
+ tcg_temp_free_i64(t2);
+
+ return true;
+}
+
GEN_VXRFORM(vcmpeqfp, 3, 3)
GEN_VXRFORM(vcmpgefp, 3, 7)
GEN_VXRFORM(vcmpgtfp, 3, 11)
--
2.25.1
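
The branchless sequence in trans_VCMPEQUQ can be checked with a small
standalone model. This is a sketch under the assumption that a quadword is
held as two 64-bit halves; u128_model, vcmpequq_mask and vcmp_cr6_from_mask
are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit vector register. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} u128_model;

/*
 * XOR each half, OR the differences together, then turn "no difference"
 * into an all-ones mask by negating the 0/1 comparison result.
 */
uint64_t vcmpequq_mask(u128_model a, u128_model b)
{
    uint64_t diff = (a.hi ^ b.hi) | (a.lo ^ b.lo);
    return -(uint64_t)(diff == 0);
}

/*
 * CR6 derivation used when the record bit is set: keep bits 3 and 1 of the
 * mask and flip bit 1, giving 0x8 for "all true" and 0x2 for "all false".
 */
uint32_t vcmp_cr6_from_mask(uint64_t mask)
{
    return ((uint32_t)mask & 0xa) ^ 0x2;
}
```

Because the result is a single element, the mask is either all ones or all
zeros, so the two CR6 outcomes above are the only ones possible.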
* Re: [PATCH v4 11/47] target/ppc: Implement Vector Compare Equal Quadword
2022-02-22 14:36 ` [PATCH v4 11/47] target/ppc: Implement Vector Compare Equal Quadword matheus.ferst
@ 2022-02-22 19:05 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 19:05 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst<matheus.ferst@eldorado.org.br>
>
> Implement the following PowerISA v3.1 instructions:
> vcmpequq: Vector Compare Equal Quadword
>
> Suggested-by: Richard Henderson<richard.henderson@linaro.org>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> v4:
> - Branchless implementation (rth)
> ---
> target/ppc/insn32.decode | 1 +
> target/ppc/translate/vmx-impl.c.inc | 36 +++++++++++++++++++++++++++++
> 2 files changed, 37 insertions(+)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
* [PATCH v4 12/47] target/ppc: Implement Vector Compare Greater Than Quadword
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (10 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 11/47] target/ppc: Implement Vector Compare Equal Quadword matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 19:07 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 13/47] target/ppc: Implement Vector Compare Quadword matheus.ferst
` (34 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Implement the following PowerISA v3.1 instructions:
vcmpgtsq: Vector Compare Greater Than Signed Quadword
vcmpgtuq: Vector Compare Greater Than Unsigned Quadword
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
v4:
- Branchless implementation (rth)
---
target/ppc/insn32.decode | 2 ++
target/ppc/translate/vmx-impl.c.inc | 39 +++++++++++++++++++++++++++++
2 files changed, 41 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 437a3e29e0..07a4ef9103 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -388,11 +388,13 @@ VCMPGTSB 000100 ..... ..... ..... . 1100000110 @VC
VCMPGTSH 000100 ..... ..... ..... . 1101000110 @VC
VCMPGTSW 000100 ..... ..... ..... . 1110000110 @VC
VCMPGTSD 000100 ..... ..... ..... . 1111000111 @VC
+VCMPGTSQ 000100 ..... ..... ..... . 1110000111 @VC
VCMPGTUB 000100 ..... ..... ..... . 1000000110 @VC
VCMPGTUH 000100 ..... ..... ..... . 1001000110 @VC
VCMPGTUW 000100 ..... ..... ..... . 1010000110 @VC
VCMPGTUD 000100 ..... ..... ..... . 1011000111 @VC
+VCMPGTUQ 000100 ..... ..... ..... . 1010000111 @VC
VCMPNEB 000100 ..... ..... ..... . 0000000111 @VC
VCMPNEH 000100 ..... ..... ..... . 0001000111 @VC
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index d66a642b67..4a76e370fc 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1143,6 +1143,45 @@ static bool trans_VCMPEQUQ(DisasContext *ctx, arg_VC *a)
return true;
}
+static bool do_vcmpgtq(DisasContext *ctx, arg_VC *a, bool sign)
+{
+ TCGv_i64 t0, t1, t2;
+
+ t0 = tcg_temp_new_i64();
+ t1 = tcg_temp_new_i64();
+ t2 = tcg_temp_new_i64();
+
+ get_avr64(t0, a->vra, false);
+ get_avr64(t1, a->vrb, false);
+ tcg_gen_setcond_i64(TCG_COND_GTU, t2, t0, t1);
+
+ get_avr64(t0, a->vra, true);
+ get_avr64(t1, a->vrb, true);
+ tcg_gen_movcond_i64(TCG_COND_EQ, t2, t0, t1, t2, tcg_constant_i64(0));
+ tcg_gen_setcond_i64(sign ? TCG_COND_GT : TCG_COND_GTU, t1, t0, t1);
+
+ tcg_gen_or_i64(t1, t1, t2);
+ tcg_gen_neg_i64(t1, t1);
+
+ set_avr64(a->vrt, t1, true);
+ set_avr64(a->vrt, t1, false);
+
+ if (a->rc) {
+ tcg_gen_extrl_i64_i32(cpu_crf[6], t1);
+ tcg_gen_andi_i32(cpu_crf[6], cpu_crf[6], 0xa);
+ tcg_gen_xori_i32(cpu_crf[6], cpu_crf[6], 0x2);
+ }
+
+ tcg_temp_free_i64(t0);
+ tcg_temp_free_i64(t1);
+ tcg_temp_free_i64(t2);
+
+ return true;
+}
+
+TRANS(VCMPGTSQ, do_vcmpgtq, true)
+TRANS(VCMPGTUQ, do_vcmpgtq, false)
+
GEN_VXRFORM(vcmpeqfp, 3, 3)
GEN_VXRFORM(vcmpgefp, 3, 7)
GEN_VXRFORM(vcmpgtfp, 3, 11)
--
2.25.1
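
The logic of do_vcmpgtq can be sketched in plain C as follows. The low halves
are always compared unsigned and only count when the high halves are equal;
the high halves are compared signed or unsigned depending on the instruction.
The function name and the split into hi/lo arguments are assumptions made for
this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of a 128-bit greater-than compare that returns an all-ones mask when
 * a > b and zero otherwise. vcmpgtq_mask is a hypothetical name.
 * The casts to int64_t assume the usual two's complement conversion.
 */
uint64_t vcmpgtq_mask(uint64_t a_hi, uint64_t a_lo,
                      uint64_t b_hi, uint64_t b_lo, bool is_signed)
{
    /* Low halves: unsigned compare in both the signed and unsigned case. */
    uint64_t lo_gt = a_lo > b_lo;

    /* The low result is only relevant when the high halves tie. */
    uint64_t tie = (a_hi == b_hi) ? lo_gt : 0;

    /* High halves: signedness follows the instruction. */
    uint64_t hi_gt = is_signed ? ((int64_t)a_hi > (int64_t)b_hi)
                               : (a_hi > b_hi);

    /* Negating the 0/1 result yields 0 or all ones. */
    return -(hi_gt | tie);
}
```

The conditional on the tie corresponds to the movcond in the patch, and the
final negation corresponds to tcg_gen_neg_i64.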
* Re: [PATCH v4 12/47] target/ppc: Implement Vector Compare Greater Than Quadword
2022-02-22 14:36 ` [PATCH v4 12/47] target/ppc: Implement Vector Compare Greater Than Quadword matheus.ferst
@ 2022-02-22 19:07 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 19:07 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst<matheus.ferst@eldorado.org.br>
>
> Implement the following PowerISA v3.1 instructions:
> vcmpgtsq: Vector Compare Greater Than Signed Quadword
> vcmpgtuq: Vector Compare Greater Than Unsigned Quadword
>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> v4:
> - Branchless implementation (rth)
> ---
> target/ppc/insn32.decode | 2 ++
> target/ppc/translate/vmx-impl.c.inc | 39 +++++++++++++++++++++++++++++
> 2 files changed, 41 insertions(+)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
* [PATCH v4 13/47] target/ppc: Implement Vector Compare Quadword
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (11 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 12/47] target/ppc: Implement Vector Compare Greater Than Quadword matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 14:36 ` [PATCH v4 14/47] target/ppc: implement vstri[bh][lr] matheus.ferst
` (33 subsequent siblings)
46 siblings, 0 replies; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Implement the following PowerISA v3.1 instructions:
vcmpsq: Vector Compare Signed Quadword
vcmpuq: Vector Compare Unsigned Quadword
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 6 ++++
target/ppc/translate/vmx-impl.c.inc | 45 +++++++++++++++++++++++++++++
2 files changed, 51 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 07a4ef9103..f0cb6602e2 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -60,6 +60,9 @@
&VX vrt vra vrb
@VX ...... vrt:5 vra:5 vrb:5 .......... . &VX
+&VX_bf bf vra vrb
+@VX_bf ...... bf:3 .. vra:5 vrb:5 ........... &VX_bf
+
&VX_uim4 vrt uim vrb
@VX_uim4 ...... vrt:5 . uim:4 vrb:5 ........... &VX_uim4
@@ -404,6 +407,9 @@ VCMPNEZB 000100 ..... ..... ..... . 0100000111 @VC
VCMPNEZH 000100 ..... ..... ..... . 0101000111 @VC
VCMPNEZW 000100 ..... ..... ..... . 0110000111 @VC
+VCMPSQ 000100 ... -- ..... ..... 00101000001 @VX_bf
+VCMPUQ 000100 ... -- ..... ..... 00100000001 @VX_bf
+
## Vector Bit Manipulation Instruction
VCFUGED 000100 ..... ..... ..... 10101001101 @VX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 4a76e370fc..335bef56ff 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1182,6 +1182,51 @@ static bool do_vcmpgtq(DisasContext *ctx, arg_VC *a, bool sign)
TRANS(VCMPGTSQ, do_vcmpgtq, true)
TRANS(VCMPGTUQ, do_vcmpgtq, false)
+static bool do_vcmpq(DisasContext *ctx, arg_VX_bf *a, bool sign)
+{
+ TCGv_i64 vra, vrb;
+ TCGLabel *gt, *lt, *done;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ vra = tcg_temp_local_new_i64();
+ vrb = tcg_temp_local_new_i64();
+ gt = gen_new_label();
+ lt = gen_new_label();
+ done = gen_new_label();
+
+ get_avr64(vra, a->vra, true);
+ get_avr64(vrb, a->vrb, true);
+ tcg_gen_brcond_i64((sign ? TCG_COND_GT : TCG_COND_GTU), vra, vrb, gt);
+ tcg_gen_brcond_i64((sign ? TCG_COND_LT : TCG_COND_LTU), vra, vrb, lt);
+
+ get_avr64(vra, a->vra, false);
+ get_avr64(vrb, a->vrb, false);
+ tcg_gen_brcond_i64(TCG_COND_GTU, vra, vrb, gt);
+ tcg_gen_brcond_i64(TCG_COND_LTU, vra, vrb, lt);
+
+ tcg_gen_movi_i32(cpu_crf[a->bf], CRF_EQ);
+ tcg_gen_br(done);
+
+ gen_set_label(gt);
+ tcg_gen_movi_i32(cpu_crf[a->bf], CRF_GT);
+ tcg_gen_br(done);
+
+ gen_set_label(lt);
+ tcg_gen_movi_i32(cpu_crf[a->bf], CRF_LT);
+ tcg_gen_br(done);
+
+ gen_set_label(done);
+ tcg_temp_free_i64(vra);
+ tcg_temp_free_i64(vrb);
+
+ return true;
+}
+
+TRANS(VCMPSQ, do_vcmpq, true)
+TRANS(VCMPUQ, do_vcmpq, false)
+
GEN_VXRFORM(vcmpeqfp, 3, 3)
GEN_VXRFORM(vcmpgefp, 3, 7)
GEN_VXRFORM(vcmpgtfp, 3, 11)
--
2.25.1
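
The branch structure of do_vcmpq amounts to a three-way compare: decide on the
high halves first and fall back to an unsigned compare of the low halves on a
tie. A minimal standalone model is shown below. The MODEL_CRF_* constants are
assumed to mirror the values of QEMU's CRF_LT, CRF_GT and CRF_EQ; they and the
function name are hypothetical and exist only for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match QEMU's CR field encoding (LT = bit 3, GT = 2, EQ = 1). */
enum {
    MODEL_CRF_LT = 8,
    MODEL_CRF_GT = 4,
    MODEL_CRF_EQ = 2
};

/*
 * Three-way 128-bit compare returning the CR field value.
 * The casts to int64_t assume the usual two's complement conversion.
 */
uint32_t vcmpq_crf(uint64_t a_hi, uint64_t a_lo,
                   uint64_t b_hi, uint64_t b_lo, bool is_signed)
{
    bool hi_gt = is_signed ? ((int64_t)a_hi > (int64_t)b_hi) : (a_hi > b_hi);
    bool hi_lt = is_signed ? ((int64_t)a_hi < (int64_t)b_hi) : (a_hi < b_hi);

    if (hi_gt) {
        return MODEL_CRF_GT;
    }
    if (hi_lt) {
        return MODEL_CRF_LT;
    }
    /* High halves are equal: the low halves always compare unsigned. */
    if (a_lo > b_lo) {
        return MODEL_CRF_GT;
    }
    if (a_lo < b_lo) {
        return MODEL_CRF_LT;
    }
    return MODEL_CRF_EQ;
}
```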
* [PATCH v4 14/47] target/ppc: implement vstri[bh][lr]
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (12 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 13/47] target/ppc: Implement Vector Compare Quadword matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 19:13 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 15/47] target/ppc: implement vclrlb matheus.ferst
` (32 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
v4:
- vstri helpers return CR field (rth)
---
target/ppc/helper.h | 4 ++++
target/ppc/insn32.decode | 10 ++++++++++
target/ppc/int_helper.c | 28 +++++++++++++++++++++++++++
target/ppc/translate/vmx-impl.c.inc | 30 +++++++++++++++++++++++++++++
4 files changed, 72 insertions(+)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 303a29fb5a..269150b197 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -211,6 +211,10 @@ DEF_HELPER_4(VINSBLX, void, env, avr, i64, tl)
DEF_HELPER_4(VINSHLX, void, env, avr, i64, tl)
DEF_HELPER_4(VINSWLX, void, env, avr, i64, tl)
DEF_HELPER_4(VINSDLX, void, env, avr, i64, tl)
+DEF_HELPER_2(VSTRIBL, i32, avr, avr)
+DEF_HELPER_2(VSTRIBR, i32, avr, avr)
+DEF_HELPER_2(VSTRIHL, i32, avr, avr)
+DEF_HELPER_2(VSTRIHR, i32, avr, avr)
DEF_HELPER_2(vnegw, void, avr, avr)
DEF_HELPER_2(vnegd, void, avr, avr)
DEF_HELPER_2(vupkhpx, void, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index f0cb6602e2..d844d86829 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -63,6 +63,9 @@
&VX_bf bf vra vrb
@VX_bf ...... bf:3 .. vra:5 vrb:5 ........... &VX_bf
+&VX_tb_rc vrt vrb rc:bool
+@VX_tb_rc ...... vrt:5 ..... vrb:5 rc:1 .......... &VX_tb_rc
+
&VX_uim4 vrt uim vrb
@VX_uim4 ...... vrt:5 . uim:4 vrb:5 ........... &VX_uim4
@@ -519,6 +522,13 @@ VMULLD 000100 ..... ..... ..... 00111001001 @VX
VMSUMCUD 000100 ..... ..... ..... ..... 010111 @VA
VMSUMUDM 000100 ..... ..... ..... ..... 100011 @VA
+## Vector String Instructions
+
+VSTRIBL 000100 ..... 00000 ..... . 0000001101 @VX_tb_rc
+VSTRIBR 000100 ..... 00001 ..... . 0000001101 @VX_tb_rc
+VSTRIHL 000100 ..... 00010 ..... . 0000001101 @VX_tb_rc
+VSTRIHR 000100 ..... 00011 ..... . 0000001101 @VX_tb_rc
+
# VSX Load/Store Instructions
LXV 111101 ..... ..... ............ . 001 @DQ_TSX
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index fce782499f..0a094b535a 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1518,6 +1518,34 @@ VEXTRACT(uw, u32)
VEXTRACT(d, u64)
#undef VEXTRACT
+#define VSTRI(NAME, ELEM, NUM_ELEMS, LEFT) \
+uint32_t helper_##NAME(ppc_avr_t *t, ppc_avr_t *b) \
+{ \
+ int i, idx, crf = 0; \
+ \
+ for (i = 0; i < NUM_ELEMS; i++) { \
+ idx = LEFT ? i : NUM_ELEMS - i - 1; \
+ if (b->Vsr##ELEM(idx)) { \
+ t->Vsr##ELEM(idx) = b->Vsr##ELEM(idx); \
+ } else { \
+ crf = 0b0010; \
+ break; \
+ } \
+ } \
+ \
+ for (; i < NUM_ELEMS; i++) { \
+ idx = LEFT ? i : NUM_ELEMS - i - 1; \
+ t->Vsr##ELEM(idx) = 0; \
+ } \
+ \
+ return crf; \
+}
+VSTRI(VSTRIBL, B, 16, true)
+VSTRI(VSTRIBR, B, 16, false)
+VSTRI(VSTRIHL, H, 8, true)
+VSTRI(VSTRIHR, H, 8, false)
+#undef VSTRI
+
void helper_xxextractuw(CPUPPCState *env, ppc_vsr_t *xt,
ppc_vsr_t *xb, uint32_t index)
{
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 335bef56ff..1a69931d36 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1910,6 +1910,36 @@ static bool trans_MTVSRBMI(DisasContext *ctx, arg_DX_b *a)
return true;
}
+static bool do_vstri(DisasContext *ctx, arg_VX_tb_rc *a,
+ void (*gen_helper)(TCGv_i32, TCGv_ptr, TCGv_ptr))
+{
+ TCGv_ptr vrt, vrb;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ vrt = gen_avr_ptr(a->vrt);
+ vrb = gen_avr_ptr(a->vrb);
+
+ if (a->rc) {
+ gen_helper(cpu_crf[6], vrt, vrb);
+ } else {
+ TCGv_i32 discard = tcg_temp_new_i32();
+ gen_helper(discard, vrt, vrb);
+ tcg_temp_free_i32(discard);
+ }
+
+ tcg_temp_free_ptr(vrt);
+ tcg_temp_free_ptr(vrb);
+
+ return true;
+}
+
+TRANS(VSTRIBL, do_vstri, gen_helper_VSTRIBL)
+TRANS(VSTRIBR, do_vstri, gen_helper_VSTRIBR)
+TRANS(VSTRIHL, do_vstri, gen_helper_VSTRIHL)
+TRANS(VSTRIHR, do_vstri, gen_helper_VSTRIHR)
+
#define GEN_VAFORM_PAIRED(name0, name1, opc2) \
static void glue(gen_, name0##_##name1)(DisasContext *ctx) \
{ \
--
2.25.1
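
The behaviour of the VSTRI helper macro can be illustrated with a standalone
model of the byte variants. The sketch assumes that index 0 is the leftmost
byte of the register, which is how the VsrB() accessor is used in the patch;
the function name and the plain array type are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Copy elements from b to t starting at the left (or right) end until a zero
 * element is found, then clear every remaining element. The return value is
 * the CR field: 0b0010 when a zero element was found, 0 otherwise.
 * vstrib_model is a hypothetical name, not a QEMU function.
 */
uint32_t vstrib_model(uint8_t t[16], const uint8_t b[16], bool left)
{
    uint32_t crf = 0;
    int i, idx;

    for (i = 0; i < 16; i++) {
        idx = left ? i : 16 - i - 1;
        if (b[idx]) {
            t[idx] = b[idx];
        } else {
            crf = 0x2;
            break;
        }
    }

    /* Everything from the first zero element onwards is cleared. */
    for (; i < 16; i++) {
        idx = left ? i : 16 - i - 1;
        t[idx] = 0;
    }

    return crf;
}
```

Returning the CR field from the helper, as in v4 of the patch, lets the
translator write it to cpu_crf[6] only when the record bit is set.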
* Re: [PATCH v4 14/47] target/ppc: implement vstri[bh][lr]
2022-02-22 14:36 ` [PATCH v4 14/47] target/ppc: implement vstri[bh][lr] matheus.ferst
@ 2022-02-22 19:13 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 19:13 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst <matheus.ferst@eldorado.org.br>
>
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> v4:
> - vstri helpers return CR field (rth)
> ---
> target/ppc/helper.h | 4 ++++
> target/ppc/insn32.decode | 10 ++++++++++
> target/ppc/int_helper.c | 28 +++++++++++++++++++++++++++
> target/ppc/translate/vmx-impl.c.inc | 30 +++++++++++++++++++++++++++++
> 4 files changed, 72 insertions(+)
>
> diff --git a/target/ppc/helper.h b/target/ppc/helper.h
> index 303a29fb5a..269150b197 100644
> --- a/target/ppc/helper.h
> +++ b/target/ppc/helper.h
> @@ -211,6 +211,10 @@ DEF_HELPER_4(VINSBLX, void, env, avr, i64, tl)
> DEF_HELPER_4(VINSHLX, void, env, avr, i64, tl)
> DEF_HELPER_4(VINSWLX, void, env, avr, i64, tl)
> DEF_HELPER_4(VINSDLX, void, env, avr, i64, tl)
> +DEF_HELPER_2(VSTRIBL, i32, avr, avr)
> +DEF_HELPER_2(VSTRIBR, i32, avr, avr)
> +DEF_HELPER_2(VSTRIHL, i32, avr, avr)
> +DEF_HELPER_2(VSTRIHR, i32, avr, avr)
Oh, these should use DEF_HELPER_FLAGS_2 with TCG_CALL_NO_RWG.
I should have thought of this with respect to the other helpers you're touching in this
series: those that only modify vector registers should use this flag.
Otherwise,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
* [PATCH v4 15/47] target/ppc: implement vclrlb
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (13 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 14/47] target/ppc: implement vstri[bh][lr] matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 19:15 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 16/47] target/ppc: implement vclrrb matheus.ferst
` (31 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
v4:
- Branchless implementation (rth)
---
target/ppc/insn32.decode | 2 ++
target/ppc/translate/vmx-impl.c.inc | 40 +++++++++++++++++++++++++++++
2 files changed, 42 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index d844d86829..31cdbba86b 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -529,6 +529,8 @@ VSTRIBR 000100 ..... 00001 ..... . 0000001101 @VX_tb_rc
VSTRIHL 000100 ..... 00010 ..... . 0000001101 @VX_tb_rc
VSTRIHR 000100 ..... 00011 ..... . 0000001101 @VX_tb_rc
+VCLRLB 000100 ..... ..... ..... 00110001101 @VX
+
# VSX Load/Store Instructions
LXV 111101 ..... ..... ............ . 001 @DQ_TSX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 1a69931d36..8f12d78071 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1940,6 +1940,46 @@ TRANS(VSTRIBR, do_vstri, gen_helper_VSTRIBR)
TRANS(VSTRIHL, do_vstri, gen_helper_VSTRIHL)
TRANS(VSTRIHR, do_vstri, gen_helper_VSTRIHR)
+static bool trans_VCLRLB(DisasContext *ctx, arg_VX *a)
+{
+ TCGv_i64 rb, mh, ml, tmp,
+ ones = tcg_constant_i64(-1),
+ zero = tcg_constant_i64(0);
+
+ rb = tcg_temp_new_i64();
+ mh = tcg_temp_new_i64();
+ ml = tcg_temp_new_i64();
+ tmp = tcg_temp_new_i64();
+
+ tcg_gen_extu_tl_i64(rb, cpu_gpr[a->vrb]);
+ tcg_gen_andi_i64(tmp, rb, 7);
+ tcg_gen_shli_i64(tmp, tmp, 3);
+ tcg_gen_shl_i64(tmp, tcg_constant_i64(-1), tmp);
+ tcg_gen_not_i64(tmp, tmp);
+
+ tcg_gen_movcond_i64(TCG_COND_LTU, ml, rb, tcg_constant_i64(8),
+ tmp, ones);
+ tcg_gen_movcond_i64(TCG_COND_LTU, mh, rb, tcg_constant_i64(8),
+ zero, tmp);
+ tcg_gen_movcond_i64(TCG_COND_LTU, mh, rb, tcg_constant_i64(16),
+ mh, ones);
+
+ get_avr64(tmp, a->vra, true);
+ tcg_gen_and_i64(tmp, tmp, mh);
+ set_avr64(a->vrt, tmp, true);
+
+ get_avr64(tmp, a->vra, false);
+ tcg_gen_and_i64(tmp, tmp, ml);
+ set_avr64(a->vrt, tmp, false);
+
+ tcg_temp_free_i64(rb);
+ tcg_temp_free_i64(mh);
+ tcg_temp_free_i64(ml);
+ tcg_temp_free_i64(tmp);
+
+ return true;
+}
+
#define GEN_VAFORM_PAIRED(name0, name1, opc2) \
static void glue(gen_, name0##_##name1)(DisasContext *ctx) \
{ \
--
2.25.1
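
The mask computation in trans_VCLRLB can be modeled in plain C to see which
bytes survive for a given count. In this sketch n stands for the value read
from the general-purpose register, mh masks the high doubleword and ml the
low doubleword; the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Compute the two 64-bit masks that keep the rightmost n bytes of a 128-bit
 * value and clear the rest. For n >= 16 both masks are all ones.
 * vclrlb_masks is a hypothetical name, not a QEMU function.
 */
void vclrlb_masks(uint64_t n, uint64_t *mh, uint64_t *ml)
{
    const uint64_t ones = UINT64_MAX;

    /* Mask with the low (n mod 8) bytes set; the shift is at most 56 bits. */
    uint64_t partial = ~(ones << ((n & 7) * 8));

    /* n < 8: only part of the low doubleword is kept. */
    *ml = (n < 8) ? partial : ones;

    /* 8 <= n < 16: the low doubleword is kept whole, the high one in part. */
    *mh = (n < 8) ? 0 : partial;

    /* n >= 16: nothing is cleared. */
    *mh = (n < 16) ? *mh : ones;
}
```

The three conditional assignments correspond one to one to the three movcond
operations in the patch, so the model has no data-dependent branches in the
generated-code sense either.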
* Re: [PATCH v4 15/47] target/ppc: implement vclrlb
2022-02-22 14:36 ` [PATCH v4 15/47] target/ppc: implement vclrlb matheus.ferst
@ 2022-02-22 19:15 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 19:15 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> +static bool trans_VCLRLB(DisasContext *ctx, arg_VX *a)
> +{
> + TCGv_i64 rb, mh, ml, tmp,
> + ones = tcg_constant_i64(-1),
> + zero = tcg_constant_i64(0);
> +
> + rb = tcg_temp_new_i64();
> + mh = tcg_temp_new_i64();
> + ml = tcg_temp_new_i64();
> + tmp = tcg_temp_new_i64();
> +
> + tcg_gen_extu_tl_i64(rb, cpu_gpr[a->vrb]);
> + tcg_gen_andi_i64(tmp, rb, 7);
> + tcg_gen_shli_i64(tmp, tmp, 3);
> + tcg_gen_shl_i64(tmp, tcg_constant_i64(-1), tmp);
Reuse ones here. Otherwise,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
* [PATCH v4 16/47] target/ppc: implement vclrrb
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (14 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 15/47] target/ppc: implement vclrlb matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 19:17 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 17/47] target/ppc: implement vcntmb[bhwd] matheus.ferst
` (30 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 1 +
target/ppc/translate/vmx-impl.c.inc | 32 +++++++++++++++++++++--------
2 files changed, 25 insertions(+), 8 deletions(-)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 31cdbba86b..b20f1eaa8e 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -530,6 +530,7 @@ VSTRIHL 000100 ..... 00010 ..... . 0000001101 @VX_tb_rc
VSTRIHR 000100 ..... 00011 ..... . 0000001101 @VX_tb_rc
VCLRLB 000100 ..... ..... ..... 00110001101 @VX
+VCLRRB 000100 ..... ..... ..... 00111001101 @VX
# VSX Load/Store Instructions
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 8f12d78071..4510b4ecde 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1940,7 +1940,7 @@ TRANS(VSTRIBR, do_vstri, gen_helper_VSTRIBR)
TRANS(VSTRIHL, do_vstri, gen_helper_VSTRIHL)
TRANS(VSTRIHR, do_vstri, gen_helper_VSTRIHR)
-static bool trans_VCLRLB(DisasContext *ctx, arg_VX *a)
+static bool do_vclrb(DisasContext *ctx, arg_VX *a, bool right)
{
TCGv_i64 rb, mh, ml, tmp,
ones = tcg_constant_i64(-1),
@@ -1954,15 +1954,28 @@ static bool trans_VCLRLB(DisasContext *ctx, arg_VX *a)
tcg_gen_extu_tl_i64(rb, cpu_gpr[a->vrb]);
tcg_gen_andi_i64(tmp, rb, 7);
tcg_gen_shli_i64(tmp, tmp, 3);
- tcg_gen_shl_i64(tmp, tcg_constant_i64(-1), tmp);
+ if (right) {
+ tcg_gen_shr_i64(tmp, tcg_constant_i64(-1), tmp);
+ } else {
+ tcg_gen_shl_i64(tmp, tcg_constant_i64(-1), tmp);
+ }
tcg_gen_not_i64(tmp, tmp);
- tcg_gen_movcond_i64(TCG_COND_LTU, ml, rb, tcg_constant_i64(8),
- tmp, ones);
- tcg_gen_movcond_i64(TCG_COND_LTU, mh, rb, tcg_constant_i64(8),
- zero, tmp);
- tcg_gen_movcond_i64(TCG_COND_LTU, mh, rb, tcg_constant_i64(16),
- mh, ones);
+ if (right) {
+ tcg_gen_movcond_i64(TCG_COND_LTU, mh, rb, tcg_constant_i64(8),
+ tmp, ones);
+ tcg_gen_movcond_i64(TCG_COND_LTU, ml, rb, tcg_constant_i64(8),
+ zero, tmp);
+ tcg_gen_movcond_i64(TCG_COND_LTU, ml, rb, tcg_constant_i64(16),
+ ml, ones);
+ } else {
+ tcg_gen_movcond_i64(TCG_COND_LTU, ml, rb, tcg_constant_i64(8),
+ tmp, ones);
+ tcg_gen_movcond_i64(TCG_COND_LTU, mh, rb, tcg_constant_i64(8),
+ zero, tmp);
+ tcg_gen_movcond_i64(TCG_COND_LTU, mh, rb, tcg_constant_i64(16),
+ mh, ones);
+ }
get_avr64(tmp, a->vra, true);
tcg_gen_and_i64(tmp, tmp, mh);
@@ -1980,6 +1993,9 @@ static bool trans_VCLRLB(DisasContext *ctx, arg_VX *a)
return true;
}
+TRANS(VCLRLB, do_vclrb, false)
+TRANS(VCLRRB, do_vclrb, true)
+
#define GEN_VAFORM_PAIRED(name0, name1, opc2) \
static void glue(gen_, name0##_##name1)(DisasContext *ctx) \
{ \
--
2.25.1
^ permalink raw reply related [flat|nested] 97+ messages in thread
* Re: [PATCH v4 16/47] target/ppc: implement vclrrb
2022-02-22 14:36 ` [PATCH v4 16/47] target/ppc: implement vclrrb matheus.ferst
@ 2022-02-22 19:17 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 19:17 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst <matheus.ferst@eldorado.org.br>
>
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> target/ppc/insn32.decode | 1 +
> target/ppc/translate/vmx-impl.c.inc | 32 +++++++++++++++++++++--------
> 2 files changed, 25 insertions(+), 8 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 97+ messages in thread
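The mask selection in do_vclrb() from patches 15 and 16 can be easier to follow as plain C. The sketch below is a scalar model of the three movcond operations, not code from the series; the names `vclrb_mask` and `vclrb_model` are hypothetical, and it assumes `rb` is the zero-extended GPR value and that `hi`/`lo` are the doublewords selected by `get_avr64(..., true)` and `get_avr64(..., false)`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: the two 64-bit halves of the 128-bit AND mask. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} vclrb_mask;

/*
 * Scalar model of the mask that do_vclrb() builds with TCG ops.
 * rb is the byte count taken from the GPR; right selects vclrrb.
 */
static vclrb_mask vclrb_model(uint64_t rb, bool right)
{
    const uint64_t ones = ~0ULL;
    /* (rb & 7) * 8 is in 0..56, so the shifts below are always defined. */
    unsigned sh = (unsigned)(rb & 7) * 8;
    /* Partial mask for the doubleword that is only partly kept. */
    uint64_t part = right ? ~(ones >> sh) : ~(ones << sh);
    vclrb_mask m;

    if (rb >= 16) {
        /* Nothing is cleared. */
        m.hi = ones;
        m.lo = ones;
    } else if (right) {
        /* vclrrb keeps the leftmost rb bytes. */
        m.hi = rb < 8 ? part : ones;
        m.lo = rb < 8 ? 0 : part;
    } else {
        /* vclrlb keeps the rightmost rb bytes. */
        m.lo = rb < 8 ? part : ones;
        m.hi = rb < 8 ? 0 : part;
    }
    return m;
}
```

For example, `vclrb_model(3, false)` keeps only the low three bytes of the low doubleword, while `vclrb_model(3, true)` keeps only the high three bytes of the high doubleword.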
* [PATCH v4 17/47] target/ppc: implement vcntmb[bhwd]
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (15 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 16/47] target/ppc: implement vclrrb matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 14:36 ` [PATCH v4 18/47] target/ppc: implement vgnb matheus.ferst
` (29 subsequent siblings)
46 siblings, 0 replies; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 8 ++++++++
target/ppc/translate/vmx-impl.c.inc | 32 +++++++++++++++++++++++++++++
2 files changed, 40 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index b20f1eaa8e..31a3c3b508 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -63,6 +63,9 @@
&VX_bf bf vra vrb
@VX_bf ...... bf:3 .. vra:5 vrb:5 ........... &VX_bf
+&VX_mp rt mp:bool vrb
+@VX_mp ...... rt:5 .... mp:1 vrb:5 ........... &VX_mp
+
&VX_tb_rc vrt vrb rc:bool
@VX_tb_rc ...... vrt:5 ..... vrb:5 rc:1 .......... &VX_tb_rc
@@ -489,6 +492,11 @@ VEXTRACTWM 000100 ..... 01010 ..... 11001000010 @VX_tb
VEXTRACTDM 000100 ..... 01011 ..... 11001000010 @VX_tb
VEXTRACTQM 000100 ..... 01100 ..... 11001000010 @VX_tb
+VCNTMBB 000100 ..... 1100 . ..... 11001000010 @VX_mp
+VCNTMBH 000100 ..... 1101 . ..... 11001000010 @VX_mp
+VCNTMBW 000100 ..... 1110 . ..... 11001000010 @VX_mp
+VCNTMBD 000100 ..... 1111 . ..... 11001000010 @VX_mp
+
## Vector Multiply Instruction
VMULESB 000100 ..... ..... ..... 01100001000 @VX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 4510b4ecde..17fc25d1bd 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1910,6 +1910,38 @@ static bool trans_MTVSRBMI(DisasContext *ctx, arg_DX_b *a)
return true;
}
+static bool do_vcntmb(DisasContext *ctx, arg_VX_mp *a, int vece)
+{
+ TCGv_i64 rt, vrb, mask;
+ rt = tcg_const_i64(0);
+ vrb = tcg_temp_new_i64();
+ mask = tcg_constant_i64(dup_const(vece, 1ULL << ((8 << vece) - 1)));
+
+ for (int i = 0; i < 2; i++) {
+ get_avr64(vrb, a->vrb, i);
+ if (a->mp) {
+ tcg_gen_and_i64(vrb, mask, vrb);
+ } else {
+ tcg_gen_andc_i64(vrb, mask, vrb);
+ }
+ tcg_gen_ctpop_i64(vrb, vrb);
+ tcg_gen_add_i64(rt, rt, vrb);
+ }
+
+ tcg_gen_shli_i64(rt, rt, TARGET_LONG_BITS - 8 + vece);
+ tcg_gen_trunc_i64_tl(cpu_gpr[a->rt], rt);
+
+ tcg_temp_free_i64(vrb);
+ tcg_temp_free_i64(rt);
+
+ return true;
+}
+
+TRANS(VCNTMBB, do_vcntmb, MO_8)
+TRANS(VCNTMBH, do_vcntmb, MO_16)
+TRANS(VCNTMBW, do_vcntmb, MO_32)
+TRANS(VCNTMBD, do_vcntmb, MO_64)
+
static bool do_vstri(DisasContext *ctx, arg_VX_tb_rc *a,
void (*gen_helper)(TCGv_i32, TCGv_ptr, TCGv_ptr))
{
--
2.25.1
^ permalink raw reply related [flat|nested] 97+ messages in thread
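As a reading aid for do_vcntmb(), here is a minimal scalar sketch of what the TCG sequence computes. It is not part of the patch: `msb_mask`, `popcount64` and `vcntmb_model` are hypothetical names, the loop in `msb_mask` stands in for `dup_const()`, and the final shift assumes a 64-bit target (TARGET_LONG_BITS == 64).

```c
#include <stdbool.h>
#include <stdint.h>

/* Replicate the most significant bit of one element across a doubleword. */
static uint64_t msb_mask(unsigned vece)
{
    unsigned bits = 8u << vece;     /* element width: 8, 16, 32 or 64 */
    uint64_t m = 0;

    for (unsigned pos = bits - 1; pos < 64; pos += bits) {
        m |= 1ULL << pos;
    }
    return m;
}

/* Portable population count (clears the lowest set bit each iteration). */
static unsigned popcount64(uint64_t x)
{
    unsigned n = 0;

    for (; x != 0; x &= x - 1) {
        n++;
    }
    return n;
}

/*
 * Count elements whose most significant bit equals mp, then place the
 * count where the patch's final shift puts it (64-bit target assumed).
 */
static uint64_t vcntmb_model(uint64_t hi, uint64_t lo, unsigned vece, bool mp)
{
    uint64_t mask = msb_mask(vece);
    uint64_t count = 0;

    /* mp set: count MSBs that are 1 (and); mp clear: MSBs that are 0 (andc). */
    count += popcount64(mask & (mp ? hi : ~hi));
    count += popcount64(mask & (mp ? lo : ~lo));

    return count << (64 - 8 + vece);
}
```

With byte elements (`vece` 0) and only the top byte's sign bit set, the model counts one matching element for `mp` set and fifteen for `mp` clear.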
* [PATCH v4 18/47] target/ppc: implement vgnb
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (16 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 17/47] target/ppc: implement vcntmb[bhwd] matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 21:58 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 19/47] target/ppc: move vs[lr][a][bhwd] to decodetree matheus.ferst
` (28 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
v4:
- Optimized implementation (rth)
---
target/ppc/insn32.decode | 5 ++
target/ppc/translate/vmx-impl.c.inc | 135 ++++++++++++++++++++++++++++
2 files changed, 140 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 31a3c3b508..02df4a98e6 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -66,6 +66,9 @@
&VX_mp rt mp:bool vrb
@VX_mp ...... rt:5 .... mp:1 vrb:5 ........... &VX_mp
+&VX_n rt vrb n
+@VX_n ...... rt:5 .. n:3 vrb:5 ........... &VX_n
+
&VX_tb_rc vrt vrb rc:bool
@VX_tb_rc ...... vrt:5 ..... vrb:5 rc:1 .......... &VX_tb_rc
@@ -418,6 +421,8 @@ VCMPUQ 000100 ... -- ..... ..... 00100000001 @VX_bf
## Vector Bit Manipulation Instruction
+VGNB 000100 ..... -- ... ..... 10011001100 @VX_n
+
VCFUGED 000100 ..... ..... ..... 10101001101 @VX
VCLZDM 000100 ..... ..... ..... 11110000100 @VX
VCTZDM 000100 ..... ..... ..... 11111000100 @VX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 17fc25d1bd..19219b0010 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1416,6 +1416,141 @@ GEN_VXFORM_DUAL(vsplth, PPC_ALTIVEC, PPC_NONE,
GEN_VXFORM_DUAL(vspltw, PPC_ALTIVEC, PPC_NONE,
vextractuw, PPC_NONE, PPC2_ISA300);
+static bool trans_VGNB(DisasContext *ctx, arg_VX_n *a)
+{
+ /*
+ * Similar to do_vextractm, we'll use a sequence of mask-shift-or operations
+ * to gather the bits. The masks can be created with
+ *
+ * uint64_t mask(uint64_t n, uint64_t step)
+ * {
+ * uint64_t p = ((1UL << (1UL << step)) - 1UL) << ((n - 1UL) << step),
+ * plen = n << step, m = 0;
+ * for(int i = 0; i < 64/plen; i++) {
+ * m |= p;
+ * m = ror64(m, plen);
+ * }
+ * p >>= plen * DIV_ROUND_UP(64, plen) - 64;
+ * return m | p;
+ * }
+ *
+ * But since there are few values of N, we'll use a lookup table to avoid
+ * these calculations at runtime.
+ */
+ static const uint64_t mask[6][5] = {
+ {
+ 0xAAAAAAAAAAAAAAAAULL, 0xccccccccccccccccULL, 0xf0f0f0f0f0f0f0f0ULL,
+ 0xff00ff00ff00ff00ULL, 0xffff0000ffff0000ULL
+ },
+ {
+ 0x9249249249249249ULL, 0xC30C30C30C30C30CULL, 0xF00F00F00F00F00FULL,
+ 0xFF0000FF0000FF00ULL, 0xFFFF00000000FFFFULL
+ },
+ {
+ /* For N >= 4, some mask operations can be elided */
+ 0x8888888888888888ULL, 0, 0xf000f000f000f000ULL, 0,
+ 0xFFFF000000000000ULL
+ },
+ {
+ 0x8421084210842108ULL, 0, 0xF0000F0000F0000FULL, 0, 0
+ },
+ {
+ 0x8208208208208208ULL, 0, 0xF00000F00000F000ULL, 0, 0
+ },
+ {
+ 0x8102040810204081ULL, 0, 0xF000000F000000F0ULL, 0, 0
+ }
+ };
+ uint64_t m;
+ int i, sh, nbits = DIV_ROUND_UP(64, a->n);
+ TCGv_i64 hi, lo, t0, t1;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ if (a->n < 2) {
+ /*
+ * "N can be any value between 2 and 7, inclusive." Otherwise, the
+ * result is undefined, so we don't need to change RT. Also, N > 7 is
+ * impossible since the immediate field is 3 bits only.
+ */
+ return true;
+ }
+
+ hi = tcg_temp_new_i64();
+ lo = tcg_temp_new_i64();
+ t0 = tcg_temp_new_i64();
+ t1 = tcg_temp_new_i64();
+
+ get_avr64(hi, a->vrb, true);
+ get_avr64(lo, a->vrb, false);
+
+ /* Align the lower doubleword so we can use the same mask */
+ tcg_gen_shli_i64(lo, lo, a->n * nbits - 64);
+
+ /*
+ * Starting from the most significant bit, gather every Nth bit with a
+ * sequence of mask-shift-or operations. E.g.: for N=3
+ * AxxBxxCxxDxxExxFxxGxxHxxIxxJxxKxxLxxMxxNxxOxxPxxQxxRxxSxxTxxUxxV
+ * & rep(0b100)
+ * A..B..C..D..E..F..G..H..I..J..K..L..M..N..O..P..Q..R..S..T..U..V
+ * << 2
+ * .B..C..D..E..F..G..H..I..J..K..L..M..N..O..P..Q..R..S..T..U..V..
+ * |
+ * AB.BC.CD.DE.EF.FG.GH.HI.IJ.JK.KL.LM.MN.NO.OP.PQ.QR.RS.ST.TU.UV.V
+ * & rep(0b110000)
+ * AB....CD....EF....GH....IJ....KL....MN....OP....QR....ST....UV..
+ * << 4
+ * ..CD....EF....GH....IJ....KL....MN....OP....QR....ST....UV......
+ * |
+ * ABCD..CDEF..EFGH..GHIJ..IJKL..KLMN..MNOP..OPQR..QRST..STUV..UV..
+ * & rep(0b111100000000)
+ * ABCD........EFGH........IJKL........MNOP........QRST........UV..
+ * << 8
+ * ....EFGH........IJKL........MNOP........QRST........UV..........
+ * |
+ * ABCDEFGH....EFGHIJKL....IJKLMNOP....MNOPQRST....QRSTUV......UV..
+ * & rep(0b111111110000000000000000)
+ * ABCDEFGH................IJKLMNOP................QRSTUV..........
+ * << 16
+ * ........IJKLMNOP................QRSTUV..........................
+ * |
+ * ABCDEFGHIJKLMNOP........IJKLMNOPQRSTUV..........QRSTUV..........
+ * & rep(0b111111111111111100000000000000000000000000000000)
+ * ABCDEFGHIJKLMNOP................................QRSTUV..........
+ * << 32
+ * ................QRSTUV..........................................
+ * |
+ * ABCDEFGHIJKLMNOPQRSTUV..........................QRSTUV..........
+ */
+ for (i = 0, sh = a->n - 1; i < 5; i++, sh <<= 1) {
+ m = mask[a->n - 2][i];
+ if (m) {
+ tcg_gen_andi_i64(hi, hi, m);
+ tcg_gen_andi_i64(lo, lo, m);
+ }
+ if (sh < 64) {
+ tcg_gen_shli_i64(t0, hi, sh);
+ tcg_gen_shli_i64(t1, lo, sh);
+ tcg_gen_or_i64(hi, t0, hi);
+ tcg_gen_or_i64(lo, t1, lo);
+ }
+ }
+
+ tcg_gen_andi_i64(hi, hi, ~(~0ULL >> nbits));
+ tcg_gen_andi_i64(lo, lo, ~(~0ULL >> nbits));
+ tcg_gen_shri_i64(lo, lo, nbits);
+ tcg_gen_or_i64(hi, hi, lo);
+ tcg_gen_trunc_i64_tl(cpu_gpr[a->rt], hi);
+
+ tcg_temp_free_i64(hi);
+ tcg_temp_free_i64(lo);
+ tcg_temp_free_i64(t0);
+ tcg_temp_free_i64(t1);
+
+ return true;
+}
+
static bool do_vextdx(DisasContext *ctx, arg_VA *a, int size, bool right,
void (*gen_helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv))
{
--
2.25.1
^ permalink raw reply related [flat|nested] 97+ messages in thread
* Re: [PATCH v4 18/47] target/ppc: implement vgnb
2022-02-22 14:36 ` [PATCH v4 18/47] target/ppc: implement vgnb matheus.ferst
@ 2022-02-22 21:58 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 21:58 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst <matheus.ferst@eldorado.org.br>
>
> Suggested-by: Richard Henderson <richard.henderson@linaro.org>
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> v4:
> - Optimized implementation (rth)
> ---
> target/ppc/insn32.decode | 5 ++
> target/ppc/translate/vmx-impl.c.inc | 135 ++++++++++++++++++++++++++++
> 2 files changed, 140 insertions(+)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 97+ messages in thread
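The mask generator quoted in the trans_VGNB comment can be checked against the lookup table with a small standalone program. The sketch below transcribes that generator using `1ULL` instead of `1UL` (so it does not depend on the host's `long` width) and a local `ror64`; the names `vgnb_mask`, `vgnb_table` and `vgnb_mask_mismatches` are hypothetical. Zero table entries are skipped, both because the patch elides those steps and because the generator's shift count would reach 64 or more for some of them.

```c
#include <stdint.h>

/* Rotate right; a rotate count of 64 behaves like a count of 0. */
static uint64_t ror64(uint64_t x, unsigned n)
{
    n &= 63;
    return n ? (x >> n) | (x << (64 - n)) : x;
}

/* Transcription of the mask() pseudo-code from the trans_VGNB comment. */
static uint64_t vgnb_mask(uint64_t n, uint64_t step)
{
    uint64_t p = ((1ULL << (1ULL << step)) - 1ULL) << ((n - 1ULL) << step);
    uint64_t plen = n << step, m = 0;

    for (uint64_t i = 0; i < 64 / plen; i++) {
        m |= p;
        m = ror64(m, (unsigned)plen);
    }
    /* DIV_ROUND_UP(64, plen) written out by hand. */
    p >>= plen * ((64 + plen - 1) / plen) - 64;
    return m | p;
}

/* Copy of the lookup table from the patch, indexed by [N - 2][step]. */
static const uint64_t vgnb_table[6][5] = {
    { 0xAAAAAAAAAAAAAAAAULL, 0xccccccccccccccccULL, 0xf0f0f0f0f0f0f0f0ULL,
      0xff00ff00ff00ff00ULL, 0xffff0000ffff0000ULL },
    { 0x9249249249249249ULL, 0xC30C30C30C30C30CULL, 0xF00F00F00F00F00FULL,
      0xFF0000FF0000FF00ULL, 0xFFFF00000000FFFFULL },
    { 0x8888888888888888ULL, 0, 0xf000f000f000f000ULL, 0,
      0xFFFF000000000000ULL },
    { 0x8421084210842108ULL, 0, 0xF0000F0000F0000FULL, 0, 0 },
    { 0x8208208208208208ULL, 0, 0xF00000F00000F000ULL, 0, 0 },
    { 0x8102040810204081ULL, 0, 0xF000000F000000F0ULL, 0, 0 },
};

/* Number of non-zero table entries that disagree with vgnb_mask(). */
static int vgnb_mask_mismatches(void)
{
    int bad = 0;

    for (int n = 2; n <= 7; n++) {
        for (int step = 0; step < 5; step++) {
            uint64_t want = vgnb_table[n - 2][step];
            if (want != 0 && vgnb_mask((uint64_t)n, (uint64_t)step) != want) {
                bad++;
            }
        }
    }
    return bad;
}
```

Calling `vgnb_mask_mismatches()` from a test harness gives a quick regression check if the table is ever edited by hand.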
* [PATCH v4 19/47] target/ppc: move vs[lr][a][bhwd] to decodetree
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (17 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 18/47] target/ppc: implement vgnb matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 22:01 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 20/47] target/ppc: implement vslq matheus.ferst
` (27 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
v4:
- New in v4.
---
target/ppc/insn32.decode | 17 ++++++++++++
target/ppc/translate/vmx-impl.c.inc | 41 +++++++++++++++++++----------
target/ppc/translate/vmx-ops.c.inc | 13 +--------
3 files changed, 45 insertions(+), 26 deletions(-)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 02df4a98e6..88baebe35e 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -467,6 +467,23 @@ VINSWVRX 000100 ..... ..... ..... 00110001111 @VX
VSLDBI 000100 ..... ..... ..... 00 ... 010110 @VN
VSRDBI 000100 ..... ..... ..... 01 ... 010110 @VN
+## Vector Integer Shift Instruction
+
+VSLB 000100 ..... ..... ..... 00100000100 @VX
+VSLH 000100 ..... ..... ..... 00101000100 @VX
+VSLW 000100 ..... ..... ..... 00110000100 @VX
+VSLD 000100 ..... ..... ..... 10111000100 @VX
+
+VSRB 000100 ..... ..... ..... 01000000100 @VX
+VSRH 000100 ..... ..... ..... 01001000100 @VX
+VSRW 000100 ..... ..... ..... 01010000100 @VX
+VSRD 000100 ..... ..... ..... 11011000100 @VX
+
+VSRAB 000100 ..... ..... ..... 01100000100 @VX
+VSRAH 000100 ..... ..... ..... 01101000100 @VX
+VSRAW 000100 ..... ..... ..... 01110000100 @VX
+VSRAD 000100 ..... ..... ..... 01111000100 @VX
+
## Vector Integer Arithmetic Instructions
VEXTSB2W 000100 ..... 10000 ..... 11000000010 @VX_tb
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 19219b0010..ec4f0e7654 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -799,21 +799,7 @@ static void trans_vclzd(DisasContext *ctx)
}
GEN_VXFORM_V(vmuluwm, MO_32, tcg_gen_gvec_mul, 4, 2);
-GEN_VXFORM_V(vslb, MO_8, tcg_gen_gvec_shlv, 2, 4);
-GEN_VXFORM_V(vslh, MO_16, tcg_gen_gvec_shlv, 2, 5);
-GEN_VXFORM_V(vslw, MO_32, tcg_gen_gvec_shlv, 2, 6);
GEN_VXFORM(vrlwnm, 2, 6);
-GEN_VXFORM_DUAL(vslw, PPC_ALTIVEC, PPC_NONE, \
- vrlwnm, PPC_NONE, PPC2_ISA300)
-GEN_VXFORM_V(vsld, MO_64, tcg_gen_gvec_shlv, 2, 23);
-GEN_VXFORM_V(vsrb, MO_8, tcg_gen_gvec_shrv, 2, 8);
-GEN_VXFORM_V(vsrh, MO_16, tcg_gen_gvec_shrv, 2, 9);
-GEN_VXFORM_V(vsrw, MO_32, tcg_gen_gvec_shrv, 2, 10);
-GEN_VXFORM_V(vsrd, MO_64, tcg_gen_gvec_shrv, 2, 27);
-GEN_VXFORM_V(vsrab, MO_8, tcg_gen_gvec_sarv, 2, 12);
-GEN_VXFORM_V(vsrah, MO_16, tcg_gen_gvec_sarv, 2, 13);
-GEN_VXFORM_V(vsraw, MO_32, tcg_gen_gvec_sarv, 2, 14);
-GEN_VXFORM_V(vsrad, MO_64, tcg_gen_gvec_sarv, 2, 15);
GEN_VXFORM(vsrv, 2, 28);
GEN_VXFORM(vslv, 2, 29);
GEN_VXFORM(vslo, 6, 16);
@@ -821,6 +807,33 @@ GEN_VXFORM(vsro, 6, 17);
GEN_VXFORM(vaddcuw, 0, 6);
GEN_VXFORM(vsubcuw, 0, 22);
+static bool do_vector_gvec3_VX(DisasContext *ctx, arg_VX *a, int vece,
+ void (*gen_gvec)(unsigned, uint32_t, uint32_t,
+ uint32_t, uint32_t, uint32_t))
+{
+ REQUIRE_VECTOR(ctx);
+
+ gen_gvec(vece, avr_full_offset(a->vrt), avr_full_offset(a->vra),
+ avr_full_offset(a->vrb), 16, 16);
+
+ return true;
+}
+
+TRANS_FLAGS(ALTIVEC, VSLB, do_vector_gvec3_VX, MO_8, tcg_gen_gvec_shlv);
+TRANS_FLAGS(ALTIVEC, VSLH, do_vector_gvec3_VX, MO_16, tcg_gen_gvec_shlv);
+TRANS_FLAGS(ALTIVEC, VSLW, do_vector_gvec3_VX, MO_32, tcg_gen_gvec_shlv);
+TRANS_FLAGS2(ALTIVEC_207, VSLD, do_vector_gvec3_VX, MO_64, tcg_gen_gvec_shlv);
+
+TRANS_FLAGS(ALTIVEC, VSRB, do_vector_gvec3_VX, MO_8, tcg_gen_gvec_shrv);
+TRANS_FLAGS(ALTIVEC, VSRH, do_vector_gvec3_VX, MO_16, tcg_gen_gvec_shrv);
+TRANS_FLAGS(ALTIVEC, VSRW, do_vector_gvec3_VX, MO_32, tcg_gen_gvec_shrv);
+TRANS_FLAGS2(ALTIVEC_207, VSRD, do_vector_gvec3_VX, MO_64, tcg_gen_gvec_shrv);
+
+TRANS_FLAGS(ALTIVEC, VSRAB, do_vector_gvec3_VX, MO_8, tcg_gen_gvec_sarv);
+TRANS_FLAGS(ALTIVEC, VSRAH, do_vector_gvec3_VX, MO_16, tcg_gen_gvec_sarv);
+TRANS_FLAGS(ALTIVEC, VSRAW, do_vector_gvec3_VX, MO_32, tcg_gen_gvec_sarv);
+TRANS_FLAGS2(ALTIVEC_207, VSRAD, do_vector_gvec3_VX, MO_64, tcg_gen_gvec_sarv);
+
#define GEN_VXFORM_SAT(NAME, VECE, NORM, SAT, OPC2, OPC3) \
static void glue(glue(gen_, NAME), _vec)(unsigned vece, TCGv_vec t, \
TCGv_vec sat, TCGv_vec a, \
diff --git a/target/ppc/translate/vmx-ops.c.inc b/target/ppc/translate/vmx-ops.c.inc
index cb4c5bb953..878bce92c6 100644
--- a/target/ppc/translate/vmx-ops.c.inc
+++ b/target/ppc/translate/vmx-ops.c.inc
@@ -102,18 +102,7 @@ GEN_VXFORM_300(vextubrx, 6, 28),
GEN_VXFORM_300(vextuhrx, 6, 29),
GEN_VXFORM_DUAL(vmrgew, vextuwrx, 6, 30, PPC_NONE, PPC2_ALTIVEC_207),
GEN_VXFORM_207(vmuluwm, 4, 2),
-GEN_VXFORM(vslb, 2, 4),
-GEN_VXFORM(vslh, 2, 5),
-GEN_VXFORM_DUAL(vslw, vrlwnm, 2, 6, PPC_ALTIVEC, PPC_NONE),
-GEN_VXFORM_207(vsld, 2, 23),
-GEN_VXFORM(vsrb, 2, 8),
-GEN_VXFORM(vsrh, 2, 9),
-GEN_VXFORM(vsrw, 2, 10),
-GEN_VXFORM_207(vsrd, 2, 27),
-GEN_VXFORM(vsrab, 2, 12),
-GEN_VXFORM(vsrah, 2, 13),
-GEN_VXFORM(vsraw, 2, 14),
-GEN_VXFORM_207(vsrad, 2, 15),
+GEN_VXFORM_300(vrlwnm, 2, 6),
GEN_VXFORM_300(vsrv, 2, 28),
GEN_VXFORM_300(vslv, 2, 29),
GEN_VXFORM(vslo, 6, 16),
--
2.25.1
^ permalink raw reply related [flat|nested] 97+ messages in thread
* Re: [PATCH v4 19/47] target/ppc: move vs[lr][a][bhwd] to decodetree
2022-02-22 14:36 ` [PATCH v4 19/47] target/ppc: move vs[lr][a][bhwd] to decodetree matheus.ferst
@ 2022-02-22 22:01 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 22:01 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst <matheus.ferst@eldorado.org.br>
>
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> v4:
> - New in v4.
> ---
> target/ppc/insn32.decode | 17 ++++++++++++
> target/ppc/translate/vmx-impl.c.inc | 41 +++++++++++++++++++----------
> target/ppc/translate/vmx-ops.c.inc | 13 +--------
> 3 files changed, 45 insertions(+), 26 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 97+ messages in thread
* [PATCH v4 20/47] target/ppc: implement vslq
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (18 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 19/47] target/ppc: move vs[lr][a][bhwd] to decodetree matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 22:14 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 21/47] target/ppc: implement vsrq matheus.ferst
` (26 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
v4:
- New in v4.
---
target/ppc/insn32.decode | 1 +
target/ppc/translate/vmx-impl.c.inc | 40 +++++++++++++++++++++++++++++
2 files changed, 41 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 88baebe35e..3799065508 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -473,6 +473,7 @@ VSLB 000100 ..... ..... ..... 00100000100 @VX
VSLH 000100 ..... ..... ..... 00101000100 @VX
VSLW 000100 ..... ..... ..... 00110000100 @VX
VSLD 000100 ..... ..... ..... 10111000100 @VX
+VSLQ 000100 ..... ..... ..... 00100000101 @VX
VSRB 000100 ..... ..... ..... 01000000100 @VX
VSRH 000100 ..... ..... ..... 01001000100 @VX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index ec4f0e7654..ca98a545ef 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -834,6 +834,46 @@ TRANS_FLAGS(ALTIVEC, VSRAH, do_vector_gvec3_VX, MO_16, tcg_gen_gvec_sarv);
TRANS_FLAGS(ALTIVEC, VSRAW, do_vector_gvec3_VX, MO_32, tcg_gen_gvec_sarv);
TRANS_FLAGS2(ALTIVEC_207, VSRAD, do_vector_gvec3_VX, MO_64, tcg_gen_gvec_sarv);
+static bool trans_VSLQ(DisasContext *ctx, arg_VX *a)
+{
+ TCGv_i64 hi, lo, tmp, n, sf = tcg_constant_i64(64);
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VECTOR(ctx);
+
+ n = tcg_temp_new_i64();
+ hi = tcg_temp_new_i64();
+ lo = tcg_temp_new_i64();
+ tmp = tcg_const_i64(0);
+
+ get_avr64(lo, a->vra, false);
+ get_avr64(hi, a->vra, true);
+
+ get_avr64(n, a->vrb, true);
+ tcg_gen_andi_i64(n, n, 0x7F);
+
+ tcg_gen_movcond_i64(TCG_COND_GE, hi, n, sf, lo, hi);
+ tcg_gen_movcond_i64(TCG_COND_GE, lo, n, sf, tmp, lo);
+ tcg_gen_andi_i64(n, n, ~64ULL);
+
+ tcg_gen_shl_i64(tmp, lo, n);
+ set_avr64(a->vrt, tmp, false);
+
+ tcg_gen_shl_i64(hi, hi, n);
+ tcg_gen_xori_i64(n, n, 63);
+ tcg_gen_shr_i64(lo, lo, n);
+ tcg_gen_shri_i64(lo, lo, 1);
+ tcg_gen_or_i64(hi, hi, lo);
+ set_avr64(a->vrt, hi, true);
+
+ tcg_temp_free_i64(hi);
+ tcg_temp_free_i64(lo);
+ tcg_temp_free_i64(tmp);
+ tcg_temp_free_i64(n);
+
+ return true;
+}
+
#define GEN_VXFORM_SAT(NAME, VECE, NORM, SAT, OPC2, OPC3) \
static void glue(glue(gen_, NAME), _vec)(unsigned vece, TCGv_vec t, \
TCGv_vec sat, TCGv_vec a, \
--
2.25.1
^ permalink raw reply related [flat|nested] 97+ messages in thread
* Re: [PATCH v4 20/47] target/ppc: implement vslq
2022-02-22 14:36 ` [PATCH v4 20/47] target/ppc: implement vslq matheus.ferst
@ 2022-02-22 22:14 ` Richard Henderson
2022-02-23 21:53 ` Matheus K. Ferst
0 siblings, 1 reply; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 22:14 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst <matheus.ferst@eldorado.org.br>
>
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> v4:
> - New in v4.
> ---
> target/ppc/insn32.decode | 1 +
> target/ppc/translate/vmx-impl.c.inc | 40 +++++++++++++++++++++++++++++
> 2 files changed, 41 insertions(+)
>
> diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
> index 88baebe35e..3799065508 100644
> --- a/target/ppc/insn32.decode
> +++ b/target/ppc/insn32.decode
> @@ -473,6 +473,7 @@ VSLB 000100 ..... ..... ..... 00100000100 @VX
> VSLH 000100 ..... ..... ..... 00101000100 @VX
> VSLW 000100 ..... ..... ..... 00110000100 @VX
> VSLD 000100 ..... ..... ..... 10111000100 @VX
> +VSLQ 000100 ..... ..... ..... 00100000101 @VX
>
> VSRB 000100 ..... ..... ..... 01000000100 @VX
> VSRH 000100 ..... ..... ..... 01001000100 @VX
> diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
> index ec4f0e7654..ca98a545ef 100644
> --- a/target/ppc/translate/vmx-impl.c.inc
> +++ b/target/ppc/translate/vmx-impl.c.inc
> @@ -834,6 +834,46 @@ TRANS_FLAGS(ALTIVEC, VSRAH, do_vector_gvec3_VX, MO_16, tcg_gen_gvec_sarv);
> TRANS_FLAGS(ALTIVEC, VSRAW, do_vector_gvec3_VX, MO_32, tcg_gen_gvec_sarv);
> TRANS_FLAGS2(ALTIVEC_207, VSRAD, do_vector_gvec3_VX, MO_64, tcg_gen_gvec_sarv);
>
> +static bool trans_VSLQ(DisasContext *ctx, arg_VX *a)
> +{
> + TCGv_i64 hi, lo, tmp, n, sf = tcg_constant_i64(64);
> +
> + REQUIRE_INSNS_FLAGS2(ctx, ISA310);
> + REQUIRE_VECTOR(ctx);
> +
> + n = tcg_temp_new_i64();
> + hi = tcg_temp_new_i64();
> + lo = tcg_temp_new_i64();
> + tmp = tcg_const_i64(0);
> +
> + get_avr64(lo, a->vra, false);
> + get_avr64(hi, a->vra, true);
> +
> + get_avr64(n, a->vrb, true);
> + tcg_gen_andi_i64(n, n, 0x7F);
> +
> + tcg_gen_movcond_i64(TCG_COND_GE, hi, n, sf, lo, hi);
> + tcg_gen_movcond_i64(TCG_COND_GE, lo, n, sf, tmp, lo);
Since you have to mask twice anyway, better use (n & 64) != 0.
Otherwise,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [PATCH v4 20/47] target/ppc: implement vslq
2022-02-22 22:14 ` Richard Henderson
@ 2022-02-23 21:53 ` Matheus K. Ferst
2022-02-23 22:12 ` Richard Henderson
0 siblings, 1 reply; 97+ messages in thread
From: Matheus K. Ferst @ 2022-02-23 21:53 UTC (permalink / raw)
To: Richard Henderson, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 22/02/2022 19:14, Richard Henderson wrote:
> On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
>> From: Matheus Ferst <matheus.ferst@eldorado.org.br>
>>
>> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
>> ---
>> v4:
>> - New in v4.
>> ---
>> target/ppc/insn32.decode | 1 +
>> target/ppc/translate/vmx-impl.c.inc | 40 +++++++++++++++++++++++++++++
>> 2 files changed, 41 insertions(+)
>>
>> diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
>> index 88baebe35e..3799065508 100644
>> --- a/target/ppc/insn32.decode
>> +++ b/target/ppc/insn32.decode
>> @@ -473,6 +473,7 @@ VSLB 000100 ..... ..... ..... 00100000100 @VX
>> VSLH 000100 ..... ..... ..... 00101000100 @VX
>> VSLW 000100 ..... ..... ..... 00110000100 @VX
>> VSLD 000100 ..... ..... ..... 10111000100 @VX
>> +VSLQ 000100 ..... ..... ..... 00100000101 @VX
>>
>> VSRB 000100 ..... ..... ..... 01000000100 @VX
>> VSRH 000100 ..... ..... ..... 01001000100 @VX
>> diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
>> index ec4f0e7654..ca98a545ef 100644
>> --- a/target/ppc/translate/vmx-impl.c.inc
>> +++ b/target/ppc/translate/vmx-impl.c.inc
>> @@ -834,6 +834,46 @@ TRANS_FLAGS(ALTIVEC, VSRAH, do_vector_gvec3_VX, MO_16, tcg_gen_gvec_sarv);
>> TRANS_FLAGS(ALTIVEC, VSRAW, do_vector_gvec3_VX, MO_32, tcg_gen_gvec_sarv);
>> TRANS_FLAGS2(ALTIVEC_207, VSRAD, do_vector_gvec3_VX, MO_64, tcg_gen_gvec_sarv);
>>
>> +static bool trans_VSLQ(DisasContext *ctx, arg_VX *a)
>> +{
>> + TCGv_i64 hi, lo, tmp, n, sf = tcg_constant_i64(64);
>> +
>> + REQUIRE_INSNS_FLAGS2(ctx, ISA310);
>> + REQUIRE_VECTOR(ctx);
>> +
>> + n = tcg_temp_new_i64();
>> + hi = tcg_temp_new_i64();
>> + lo = tcg_temp_new_i64();
>> + tmp = tcg_const_i64(0);
>> +
>> + get_avr64(lo, a->vra, false);
>> + get_avr64(hi, a->vra, true);
>> +
>> + get_avr64(n, a->vrb, true);
>> + tcg_gen_andi_i64(n, n, 0x7F);
>> +
>> + tcg_gen_movcond_i64(TCG_COND_GE, hi, n, sf, lo, hi);
>> + tcg_gen_movcond_i64(TCG_COND_GE, lo, n, sf, tmp, lo);
>
> Since you have to mask twice anyway, better use (n & 64) != 0.
>
Hmm, I'm not sure if I understood. To check != 0 we'll need a temp to
hold n&64. We could use tmp here, but we'll need another one in patch
22. Is that right?
Thanks,
Matheus K. Ferst
Instituto de Pesquisas ELDORADO <http://www.eldorado.org.br/>
Software Analyst
Legal Notice - Disclaimer <https://www.eldorado.org.br/disclaimer.html>
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [PATCH v4 20/47] target/ppc: implement vslq
2022-02-23 21:53 ` Matheus K. Ferst
@ 2022-02-23 22:12 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-23 22:12 UTC (permalink / raw)
To: Matheus K. Ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/23/22 11:53, Matheus K. Ferst wrote:
> On 22/02/2022 19:14, Richard Henderson wrote:
>> On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
>>> From: Matheus Ferst <matheus.ferst@eldorado.org.br>
>>>
>>> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
>>> ---
>>> v4:
>>> - New in v4.
>>> ---
>>> target/ppc/insn32.decode | 1 +
>>> target/ppc/translate/vmx-impl.c.inc | 40 +++++++++++++++++++++++++++++
>>> 2 files changed, 41 insertions(+)
>>>
>>> diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
>>> index 88baebe35e..3799065508 100644
>>> --- a/target/ppc/insn32.decode
>>> +++ b/target/ppc/insn32.decode
>>> @@ -473,6 +473,7 @@ VSLB 000100 ..... ..... ..... 00100000100 @VX
>>> VSLH 000100 ..... ..... ..... 00101000100 @VX
>>> VSLW 000100 ..... ..... ..... 00110000100 @VX
>>> VSLD 000100 ..... ..... ..... 10111000100 @VX
>>> +VSLQ 000100 ..... ..... ..... 00100000101 @VX
>>>
>>> VSRB 000100 ..... ..... ..... 01000000100 @VX
>>> VSRH 000100 ..... ..... ..... 01001000100 @VX
>>> diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
>>> index ec4f0e7654..ca98a545ef 100644
>>> --- a/target/ppc/translate/vmx-impl.c.inc
>>> +++ b/target/ppc/translate/vmx-impl.c.inc
>>> @@ -834,6 +834,46 @@ TRANS_FLAGS(ALTIVEC, VSRAH, do_vector_gvec3_VX, MO_16, tcg_gen_gvec_sarv);
>>> TRANS_FLAGS(ALTIVEC, VSRAW, do_vector_gvec3_VX, MO_32, tcg_gen_gvec_sarv);
>>> TRANS_FLAGS2(ALTIVEC_207, VSRAD, do_vector_gvec3_VX, MO_64, tcg_gen_gvec_sarv);
>>>
>>> +static bool trans_VSLQ(DisasContext *ctx, arg_VX *a)
>>> +{
>>> + TCGv_i64 hi, lo, tmp, n, sf = tcg_constant_i64(64);
>>> +
>>> + REQUIRE_INSNS_FLAGS2(ctx, ISA310);
>>> + REQUIRE_VECTOR(ctx);
>>> +
>>> + n = tcg_temp_new_i64();
>>> + hi = tcg_temp_new_i64();
>>> + lo = tcg_temp_new_i64();
>>> + tmp = tcg_const_i64(0);
>>> +
>>> + get_avr64(lo, a->vra, false);
>>> + get_avr64(hi, a->vra, true);
>>> +
>>> + get_avr64(n, a->vrb, true);
>>> + tcg_gen_andi_i64(n, n, 0x7F);
>>> +
>>> + tcg_gen_movcond_i64(TCG_COND_GE, hi, n, sf, lo, hi);
>>> + tcg_gen_movcond_i64(TCG_COND_GE, lo, n, sf, tmp, lo);
>>
>> Since you have to mask twice anyway, better use (n & 64) != 0.
>>
>
> Hmm, I'm not sure if I understood. To check != 0 we'll need a temp to hold n&64. We could
> use tmp here, but we'll need another one in patch 22. Is that right?
Yes.
r~
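A scalar sketch of the suggestion above, assuming the shift count has already been masked to 7 bits as in trans_VSLQ. For such counts, `n >= 64` and `(n & 64) != 0` select the same cases, so the comparison against the constant 64 could be replaced by a test of bit 6 held in a temporary. The function names are hypothetical and model only the condition, not the TCG ops.

```c
#include <stdbool.h>
#include <stdint.h>

/* Condition as written in the patch: compare the masked count with 64. */
static bool swap_by_compare(uint64_t n)
{
    n &= 0x7F;
    return n >= 64;
}

/* Condition as suggested in review: test bit 6 of the count. */
static bool swap_by_bit(uint64_t n)
{
    uint64_t t = n & 64;   /* the extra temporary discussed above */
    return t != 0;
}

/* Returns true if both forms agree for every 7-bit shift count. */
static bool conditions_agree(void)
{
    for (uint64_t n = 0; n < 128; n++) {
        if (swap_by_compare(n) != swap_by_bit(n)) {
            return false;
        }
    }
    return true;
}
```

The bit test needs its own temporary because `n` itself is still needed afterwards for the 6-bit shift amount.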
* [PATCH v4 21/47] target/ppc: implement vsrq
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (19 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 20/47] target/ppc: implement vslq matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 22:15 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 22/47] target/ppc: implement vsraq matheus.ferst
` (25 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
v4:
- New in v4.
---
target/ppc/insn32.decode | 1 +
target/ppc/translate/vmx-impl.c.inc | 40 +++++++++++++++++++++--------
2 files changed, 31 insertions(+), 10 deletions(-)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 3799065508..96ee730242 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -479,6 +479,7 @@ VSRB 000100 ..... ..... ..... 01000000100 @VX
VSRH 000100 ..... ..... ..... 01001000100 @VX
VSRW 000100 ..... ..... ..... 01010000100 @VX
VSRD 000100 ..... ..... ..... 11011000100 @VX
+VSRQ 000100 ..... ..... ..... 01000000101 @VX
VSRAB 000100 ..... ..... ..... 01100000100 @VX
VSRAH 000100 ..... ..... ..... 01101000100 @VX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index ca98a545ef..ec2b47b4aa 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -834,11 +834,10 @@ TRANS_FLAGS(ALTIVEC, VSRAH, do_vector_gvec3_VX, MO_16, tcg_gen_gvec_sarv);
TRANS_FLAGS(ALTIVEC, VSRAW, do_vector_gvec3_VX, MO_32, tcg_gen_gvec_sarv);
TRANS_FLAGS2(ALTIVEC_207, VSRAD, do_vector_gvec3_VX, MO_64, tcg_gen_gvec_sarv);
-static bool trans_VSLQ(DisasContext *ctx, arg_VX *a)
+static bool do_vector_shift_quad(DisasContext *ctx, arg_VX *a, bool right)
{
TCGv_i64 hi, lo, tmp, n, sf = tcg_constant_i64(64);
- REQUIRE_INSNS_FLAGS2(ctx, ISA310);
REQUIRE_VECTOR(ctx);
n = tcg_temp_new_i64();
@@ -852,19 +851,37 @@ static bool trans_VSLQ(DisasContext *ctx, arg_VX *a)
get_avr64(n, a->vrb, true);
tcg_gen_andi_i64(n, n, 0x7F);
- tcg_gen_movcond_i64(TCG_COND_GE, hi, n, sf, lo, hi);
- tcg_gen_movcond_i64(TCG_COND_GE, lo, n, sf, tmp, lo);
+ if (right) {
+ tcg_gen_movcond_i64(TCG_COND_GE, lo, n, sf, hi, lo);
+ tcg_gen_movcond_i64(TCG_COND_GE, hi, n, sf, tmp, hi);
+ } else {
+ tcg_gen_movcond_i64(TCG_COND_GE, hi, n, sf, lo, hi);
+ tcg_gen_movcond_i64(TCG_COND_GE, lo, n, sf, tmp, lo);
+ }
tcg_gen_andi_i64(n, n, ~64ULL);
- tcg_gen_shl_i64(tmp, lo, n);
- set_avr64(a->vrt, tmp, false);
+ if (right) {
+ tcg_gen_shr_i64(tmp, hi, n);
+ } else {
+ tcg_gen_shl_i64(tmp, lo, n);
+ }
+ set_avr64(a->vrt, tmp, right);
- tcg_gen_shl_i64(hi, hi, n);
+ if (right) {
+ tcg_gen_shr_i64(lo, lo, n);
+ } else {
+ tcg_gen_shl_i64(hi, hi, n);
+ }
tcg_gen_xori_i64(n, n, 63);
- tcg_gen_shr_i64(lo, lo, n);
- tcg_gen_shri_i64(lo, lo, 1);
+ if (right) {
+ tcg_gen_shl_i64(hi, hi, n);
+ tcg_gen_shli_i64(hi, hi, 1);
+ } else {
+ tcg_gen_shr_i64(lo, lo, n);
+ tcg_gen_shri_i64(lo, lo, 1);
+ }
tcg_gen_or_i64(hi, hi, lo);
- set_avr64(a->vrt, hi, true);
+ set_avr64(a->vrt, hi, !right);
tcg_temp_free_i64(hi);
tcg_temp_free_i64(lo);
@@ -874,6 +891,9 @@ static bool trans_VSLQ(DisasContext *ctx, arg_VX *a)
return true;
}
+TRANS_FLAGS2(ISA310, VSLQ, do_vector_shift_quad, false);
+TRANS_FLAGS2(ISA310, VSRQ, do_vector_shift_quad, true);
+
#define GEN_VXFORM_SAT(NAME, VECE, NORM, SAT, OPC2, OPC3) \
static void glue(glue(gen_, NAME), _vec)(unsigned vece, TCGv_vec t, \
TCGv_vec sat, TCGv_vec a, \
--
2.25.1
* Re: [PATCH v4 21/47] target/ppc: implement vsrq
2022-02-22 14:36 ` [PATCH v4 21/47] target/ppc: implement vsrq matheus.ferst
@ 2022-02-22 22:15 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 22:15 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst <matheus.ferst@eldorado.org.br>
>
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> v4:
> - New in v4.
> ---
> target/ppc/insn32.decode | 1 +
> target/ppc/translate/vmx-impl.c.inc | 40 +++++++++++++++++++++--------
> 2 files changed, 31 insertions(+), 10 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
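For readers following the TCG sequence in do_vector_shift_quad, here is a minimal scalar model of the same idea in plain C. It is a sketch, not the QEMU code: the `quad` type and `shift_quad` name are made up for illustration. It mirrors the three steps of the patch: swap halves when the count is 64 or more, reduce the count to 6 bits, then combine the halves, using the `n ^ 63` plus one extra shift by 1 so that a shift by 64 never occurs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 128-bit value split into two 64-bit halves. */
typedef struct { uint64_t hi, lo; } quad;

static quad shift_quad(quad v, unsigned n, bool right)
{
    uint64_t hi = v.hi, lo = v.lo, carry;
    quad r;

    n &= 0x7F;
    if (n >= 64) {
        /* A shift of 64 or more moves one half into the other. */
        if (right) {
            lo = hi;
            hi = 0;
        } else {
            hi = lo;
            lo = 0;
        }
    }
    n &= 63;                                /* remaining count, 0..63 */

    if (right) {
        r.hi = hi >> n;
        carry = (hi << (n ^ 63)) << 1;      /* hi << (64 - n), zero when n == 0 */
        r.lo = (lo >> n) | carry;
    } else {
        r.lo = lo << n;
        carry = (lo >> (n ^ 63)) >> 1;      /* lo >> (64 - n), zero when n == 0 */
        r.hi = (hi << n) | carry;
    }
    return r;
}
```

For counts in 0..63, `n ^ 63` equals `63 - n`, so the two-step shift yields `64 - n` bits in total while each individual shift stays below 64.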
* [PATCH v4 22/47] target/ppc: implement vsraq
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (20 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 21/47] target/ppc: implement vsrq matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 22:19 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 23/47] target/ppc: move vrl[bhwd] to decodetree matheus.ferst
` (24 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
v4:
- New in v4.
---
target/ppc/insn32.decode | 1 +
target/ppc/translate/vmx-impl.c.inc | 17 +++++++++++++----
2 files changed, 14 insertions(+), 4 deletions(-)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 96ee730242..7a9fc1dffa 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -485,6 +485,7 @@ VSRAB 000100 ..... ..... ..... 01100000100 @VX
VSRAH 000100 ..... ..... ..... 01101000100 @VX
VSRAW 000100 ..... ..... ..... 01110000100 @VX
VSRAD 000100 ..... ..... ..... 01111000100 @VX
+VSRAQ 000100 ..... ..... ..... 01100000101 @VX
## Vector Integer Arithmetic Instructions
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index ec2b47b4aa..2eee187499 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -834,7 +834,8 @@ TRANS_FLAGS(ALTIVEC, VSRAH, do_vector_gvec3_VX, MO_16, tcg_gen_gvec_sarv);
TRANS_FLAGS(ALTIVEC, VSRAW, do_vector_gvec3_VX, MO_32, tcg_gen_gvec_sarv);
TRANS_FLAGS2(ALTIVEC_207, VSRAD, do_vector_gvec3_VX, MO_64, tcg_gen_gvec_sarv);
-static bool do_vector_shift_quad(DisasContext *ctx, arg_VX *a, bool right)
+static bool do_vector_shift_quad(DisasContext *ctx, arg_VX *a, bool right,
+ bool alg)
{
TCGv_i64 hi, lo, tmp, n, sf = tcg_constant_i64(64);
@@ -853,6 +854,9 @@ static bool do_vector_shift_quad(DisasContext *ctx, arg_VX *a, bool right)
if (right) {
tcg_gen_movcond_i64(TCG_COND_GE, lo, n, sf, hi, lo);
+ if (alg) {
+ tcg_gen_sari_i64(tmp, lo, 63);
+ }
tcg_gen_movcond_i64(TCG_COND_GE, hi, n, sf, tmp, hi);
} else {
tcg_gen_movcond_i64(TCG_COND_GE, hi, n, sf, lo, hi);
@@ -861,7 +865,11 @@ static bool do_vector_shift_quad(DisasContext *ctx, arg_VX *a, bool right)
tcg_gen_andi_i64(n, n, ~64ULL);
if (right) {
- tcg_gen_shr_i64(tmp, hi, n);
+ if (alg) {
+ tcg_gen_sar_i64(tmp, hi, n);
+ } else {
+ tcg_gen_shr_i64(tmp, hi, n);
+ }
} else {
tcg_gen_shl_i64(tmp, lo, n);
}
@@ -891,8 +899,9 @@ static bool do_vector_shift_quad(DisasContext *ctx, arg_VX *a, bool right)
return true;
}
-TRANS_FLAGS2(ISA310, VSLQ, do_vector_shift_quad, false);
-TRANS_FLAGS2(ISA310, VSRQ, do_vector_shift_quad, true);
+TRANS_FLAGS2(ISA310, VSLQ, do_vector_shift_quad, false, false);
+TRANS_FLAGS2(ISA310, VSRQ, do_vector_shift_quad, true, false);
+TRANS_FLAGS2(ISA310, VSRAQ, do_vector_shift_quad, true, true);
#define GEN_VXFORM_SAT(NAME, VECE, NORM, SAT, OPC2, OPC3) \
static void glue(glue(gen_, NAME), _vec)(unsigned vece, TCGv_vec t, \
--
2.25.1
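The way TRANS_FLAGS2 forwards the extra `right` and `alg` arguments can be seen in isolation with a small mock. This is only a sketch: DisasContext, arg_VX, the flag value and the body of do_vector_shift_quad below are simplified stand-ins invented for illustration, and only the macro shapes follow the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the QEMU types, for illustration only. */
typedef struct { uint64_t insns_flags2; } DisasContext;
typedef struct { int vrt, vra, vrb; } arg_VX;
typedef arg_VX arg_VSLQ, arg_VSRQ, arg_VSRAQ;

#define PPC2_ISA310 (1ULL << 0)   /* illustrative bit value */

#define REQUIRE_INSNS_FLAGS2(CTX, NAME)                 \
    do {                                                \
        if (((CTX)->insns_flags2 & PPC2_##NAME) == 0) { \
            return false;                               \
        }                                               \
    } while (0)

#define TRANS_FLAGS2(FLAGS2, NAME, FUNC, ...)                   \
    static bool trans_##NAME(DisasContext *ctx, arg_##NAME *a)  \
    {                                                           \
        REQUIRE_INSNS_FLAGS2(ctx, FLAGS2);                      \
        return FUNC(ctx, a, __VA_ARGS__);                       \
    }

/* Mock worker: records which variant was requested. */
static int last_right, last_alg;

static bool do_vector_shift_quad(DisasContext *ctx, arg_VX *a, bool right,
                                 bool alg)
{
    (void)ctx;
    (void)a;
    last_right = right;
    last_alg = alg;
    return true;
}

TRANS_FLAGS2(ISA310, VSLQ, do_vector_shift_quad, false, false)
TRANS_FLAGS2(ISA310, VSRQ, do_vector_shift_quad, true, false)
TRANS_FLAGS2(ISA310, VSRAQ, do_vector_shift_quad, true, true)
```

Each invocation expands to one trans_* function; when the flag is absent the function returns false before the worker is reached, which the decoder treats as "not handled".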
* Re: [PATCH v4 22/47] target/ppc: implement vsraq
2022-02-22 14:36 ` [PATCH v4 22/47] target/ppc: implement vsraq matheus.ferst
@ 2022-02-22 22:19 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 22:19 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst <matheus.ferst@eldorado.org.br>
>
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> v4:
> - New in v4.
> ---
> target/ppc/insn32.decode | 1 +
> target/ppc/translate/vmx-impl.c.inc | 17 +++++++++++++----
> 2 files changed, 14 insertions(+), 4 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
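The algebraic variant changes only what fills the upper half. A scalar sketch, with the hypothetical names `quad128`, `sar64` and `vsraq_model`: when the count is 64 or more, the high half is replaced by its sign fill instead of zero, and the high result uses an arithmetic shift. Since right-shifting negative signed values is implementation-defined in C, the arithmetic shift is built from unsigned operations here.

```c
#include <stdint.h>

/* Hypothetical 128-bit value split into two 64-bit halves. */
typedef struct { uint64_t hi, lo; } quad128;

/* Portable arithmetic right shift of a 64-bit value by n in 0..63. */
static uint64_t sar64(uint64_t x, unsigned n)
{
    uint64_t fill = (x >> 63) ? ~(~0ULL >> n) : 0;

    return (x >> n) | fill;
}

static quad128 vsraq_model(quad128 v, unsigned n)
{
    uint64_t hi = v.hi, lo = v.lo;
    quad128 r;

    n &= 0x7F;
    if (n >= 64) {
        lo = hi;
        hi = sar64(hi, 63);     /* all ones or all zeros, like sari by 63 */
    }
    n &= 63;

    r.hi = sar64(hi, n);
    /* Bits leaving the high half enter the low half; never shifts by 64. */
    r.lo = (lo >> n) | ((hi << (n ^ 63)) << 1);
    return r;
}
```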
* [PATCH v4 23/47] target/ppc: move vrl[bhwd] to decodetree
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (21 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 22/47] target/ppc: implement vsraq matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 22:20 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 24/47] target/ppc: move vrl[bhwd]nm/vrl[bhwd]mi " matheus.ferst
` (23 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
v4:
- New in v4.
---
target/ppc/insn32.decode | 5 +++++
target/ppc/translate/vmx-impl.c.inc | 13 +++++--------
target/ppc/translate/vmx-ops.c.inc | 6 ++----
3 files changed, 12 insertions(+), 12 deletions(-)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 7a9fc1dffa..d918e2d0f2 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -487,6 +487,11 @@ VSRAW 000100 ..... ..... ..... 01110000100 @VX
VSRAD 000100 ..... ..... ..... 01111000100 @VX
VSRAQ 000100 ..... ..... ..... 01100000101 @VX
+VRLB 000100 ..... ..... ..... 00000000100 @VX
+VRLH 000100 ..... ..... ..... 00001000100 @VX
+VRLW 000100 ..... ..... ..... 00010000100 @VX
+VRLD 000100 ..... ..... ..... 00011000100 @VX
+
## Vector Integer Arithmetic Instructions
VEXTSB2W 000100 ..... 10000 ..... 11000000010 @VX_tb
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 2eee187499..9dcac4243f 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -834,6 +834,11 @@ TRANS_FLAGS(ALTIVEC, VSRAH, do_vector_gvec3_VX, MO_16, tcg_gen_gvec_sarv);
TRANS_FLAGS(ALTIVEC, VSRAW, do_vector_gvec3_VX, MO_32, tcg_gen_gvec_sarv);
TRANS_FLAGS2(ALTIVEC_207, VSRAD, do_vector_gvec3_VX, MO_64, tcg_gen_gvec_sarv);
+TRANS_FLAGS(ALTIVEC, VRLB, do_vector_gvec3_VX, MO_8, tcg_gen_gvec_rotlv)
+TRANS_FLAGS(ALTIVEC, VRLH, do_vector_gvec3_VX, MO_16, tcg_gen_gvec_rotlv)
+TRANS_FLAGS(ALTIVEC, VRLW, do_vector_gvec3_VX, MO_32, tcg_gen_gvec_rotlv)
+TRANS_FLAGS2(ALTIVEC_207, VRLD, do_vector_gvec3_VX, MO_64, tcg_gen_gvec_rotlv)
+
static bool do_vector_shift_quad(DisasContext *ctx, arg_VX *a, bool right,
bool alg)
{
@@ -968,16 +973,8 @@ GEN_VXFORM3(vsubeuqm, 31, 0);
GEN_VXFORM3(vsubecuq, 31, 0);
GEN_VXFORM_DUAL(vsubeuqm, PPC_NONE, PPC2_ALTIVEC_207, \
vsubecuq, PPC_NONE, PPC2_ALTIVEC_207)
-GEN_VXFORM_V(vrlb, MO_8, tcg_gen_gvec_rotlv, 2, 0);
-GEN_VXFORM_V(vrlh, MO_16, tcg_gen_gvec_rotlv, 2, 1);
-GEN_VXFORM_V(vrlw, MO_32, tcg_gen_gvec_rotlv, 2, 2);
GEN_VXFORM(vrlwmi, 2, 2);
-GEN_VXFORM_DUAL(vrlw, PPC_ALTIVEC, PPC_NONE, \
- vrlwmi, PPC_NONE, PPC2_ISA300)
-GEN_VXFORM_V(vrld, MO_64, tcg_gen_gvec_rotlv, 2, 3);
GEN_VXFORM(vrldmi, 2, 3);
-GEN_VXFORM_DUAL(vrld, PPC_NONE, PPC2_ALTIVEC_207, \
- vrldmi, PPC_NONE, PPC2_ISA300)
GEN_VXFORM_TRANS(vsl, 2, 7);
GEN_VXFORM(vrldnm, 2, 7);
GEN_VXFORM_DUAL(vsl, PPC_ALTIVEC, PPC_NONE, \
diff --git a/target/ppc/translate/vmx-ops.c.inc b/target/ppc/translate/vmx-ops.c.inc
index 878bce92c6..a7acea3ca7 100644
--- a/target/ppc/translate/vmx-ops.c.inc
+++ b/target/ppc/translate/vmx-ops.c.inc
@@ -133,10 +133,8 @@ GEN_VXFORM_DUAL(vaddeuqm, vaddecuq, 30, 0xFF, PPC_NONE, PPC2_ALTIVEC_207),
GEN_VXFORM_DUAL(vsubuqm, bcdtrunc, 0, 20, PPC2_ALTIVEC_207, PPC2_ISA300),
GEN_VXFORM_DUAL(vsubcuq, bcdutrunc, 0, 21, PPC2_ALTIVEC_207, PPC2_ISA300),
GEN_VXFORM_DUAL(vsubeuqm, vsubecuq, 31, 0xFF, PPC_NONE, PPC2_ALTIVEC_207),
-GEN_VXFORM(vrlb, 2, 0),
-GEN_VXFORM(vrlh, 2, 1),
-GEN_VXFORM_DUAL(vrlw, vrlwmi, 2, 2, PPC_ALTIVEC, PPC_NONE),
-GEN_VXFORM_DUAL(vrld, vrldmi, 2, 3, PPC_NONE, PPC2_ALTIVEC_207),
+GEN_VXFORM_300(vrlwmi, 2, 2),
+GEN_VXFORM_300(vrldmi, 2, 3),
GEN_VXFORM_DUAL(vsl, vrldnm, 2, 7, PPC_ALTIVEC, PPC_NONE),
GEN_VXFORM(vsr, 2, 11),
GEN_VXFORM(vpkuhum, 7, 0),
--
2.25.1
* Re: [PATCH v4 23/47] target/ppc: move vrl[bhwd] to decodetree
2022-02-22 14:36 ` [PATCH v4 23/47] target/ppc: move vrl[bhwd] to decodetree matheus.ferst
@ 2022-02-22 22:20 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 22:20 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst <matheus.ferst@eldorado.org.br>
>
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> v4:
> - New in v4.
> ---
> target/ppc/insn32.decode | 5 +++++
> target/ppc/translate/vmx-impl.c.inc | 13 +++++--------
> target/ppc/translate/vmx-ops.c.inc | 6 ++----
> 3 files changed, 12 insertions(+), 12 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
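The insn32.decode lines added for VRLB/VRLH/VRLW/VRLD describe a fixed 6-bit primary opcode, three 5-bit register fields and an 11-bit extended opcode. A hand-written equivalent of what a decoder does with those four patterns might look like the sketch below. The enum and `decode_vrl` are hypothetical; the real decoder is generated by decodetree and is structured differently.

```c
#include <stdint.h>

typedef struct { int vrt, vra, vrb; } arg_VX;

typedef enum {
    INSN_NONE, INSN_VRLB, INSN_VRLH, INSN_VRLW, INSN_VRLD
} vx_insn;

/* Match the four vrl[bhwd] patterns and extract the @VX fields. */
static vx_insn decode_vrl(uint32_t insn, arg_VX *a)
{
    if ((insn >> 26) != 0x04) {         /* primary opcode 000100 */
        return INSN_NONE;
    }

    a->vrt = (insn >> 21) & 0x1F;
    a->vra = (insn >> 16) & 0x1F;
    a->vrb = (insn >> 11) & 0x1F;

    switch (insn & 0x7FF) {             /* 11-bit extended opcode */
    case 0x004:                         /* 00000000100 */
        return INSN_VRLB;
    case 0x044:                         /* 00001000100 */
        return INSN_VRLH;
    case 0x084:                         /* 00010000100 */
        return INSN_VRLW;
    case 0x0C4:                         /* 00011000100 */
        return INSN_VRLD;
    default:
        return INSN_NONE;
    }
}
```

For example, the word 0x10221884 splits into 000100 00001 00010 00011 00010000100 and therefore decodes as vrlw with vrt=1, vra=2, vrb=3.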
* [PATCH v4 24/47] target/ppc: move vrl[bhwd]nm/vrl[bhwd]mi to decodetree
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (22 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 23/47] target/ppc: move vrl[bhwd] to decodetree matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 22:30 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 25/47] target/ppc: implement vrlq matheus.ferst
` (22 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
v4:
- New in v4.
---
target/ppc/helper.h | 8 +-
target/ppc/insn32.decode | 6 ++
target/ppc/int_helper.c | 50 ++++-----
target/ppc/translate/vmx-impl.c.inc | 152 ++++++++++++++++++++++++++--
target/ppc/translate/vmx-ops.c.inc | 5 +-
5 files changed, 182 insertions(+), 39 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 269150b197..a2a0d461dd 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -275,10 +275,10 @@ DEF_HELPER_4(vmaxfp, void, env, avr, avr, avr)
DEF_HELPER_4(vminfp, void, env, avr, avr, avr)
DEF_HELPER_3(vrefp, void, env, avr, avr)
DEF_HELPER_3(vrsqrtefp, void, env, avr, avr)
-DEF_HELPER_3(vrlwmi, void, avr, avr, avr)
-DEF_HELPER_3(vrldmi, void, avr, avr, avr)
-DEF_HELPER_3(vrldnm, void, avr, avr, avr)
-DEF_HELPER_3(vrlwnm, void, avr, avr, avr)
+DEF_HELPER_4(VRLWMI, void, avr, avr, avr, i32)
+DEF_HELPER_4(VRLDMI, void, avr, avr, avr, i32)
+DEF_HELPER_4(VRLDNM, void, avr, avr, avr, i32)
+DEF_HELPER_4(VRLWNM, void, avr, avr, avr, i32)
DEF_HELPER_5(vmaddfp, void, env, avr, avr, avr, avr)
DEF_HELPER_5(vnmsubfp, void, env, avr, avr, avr, avr)
DEF_HELPER_3(vexptefp, void, env, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index d918e2d0f2..e788dc5152 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -492,6 +492,12 @@ VRLH 000100 ..... ..... ..... 00001000100 @VX
VRLW 000100 ..... ..... ..... 00010000100 @VX
VRLD 000100 ..... ..... ..... 00011000100 @VX
+VRLWMI 000100 ..... ..... ..... 00010000101 @VX
+VRLDMI 000100 ..... ..... ..... 00011000101 @VX
+
+VRLWNM 000100 ..... ..... ..... 00110000101 @VX
+VRLDNM 000100 ..... ..... ..... 00111000101 @VX
+
## Vector Integer Arithmetic Instructions
VEXTSB2W 000100 ..... 10000 ..... 11000000010 @VX_tb
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 0a094b535a..58e57b2563 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1291,33 +1291,33 @@ void helper_vrsqrtefp(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *b)
}
}
-#define VRLMI(name, size, element, insert) \
-void helper_##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b) \
-{ \
- int i; \
- for (i = 0; i < ARRAY_SIZE(r->element); i++) { \
- uint##size##_t src1 = a->element[i]; \
- uint##size##_t src2 = b->element[i]; \
- uint##size##_t src3 = r->element[i]; \
- uint##size##_t begin, end, shift, mask, rot_val; \
- \
- shift = extract##size(src2, 0, 6); \
- end = extract##size(src2, 8, 6); \
- begin = extract##size(src2, 16, 6); \
- rot_val = rol##size(src1, shift); \
- mask = mask_u##size(begin, end); \
- if (insert) { \
- r->element[i] = (rot_val & mask) | (src3 & ~mask); \
- } else { \
- r->element[i] = (rot_val & mask); \
- } \
- } \
+#define VRLMI(name, size, element, insert) \
+void helper_##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t desc) \
+{ \
+ int i; \
+ for (i = 0; i < ARRAY_SIZE(r->element); i++) { \
+ uint##size##_t src1 = a->element[i]; \
+ uint##size##_t src2 = b->element[i]; \
+ uint##size##_t src3 = r->element[i]; \
+ uint##size##_t begin, end, shift, mask, rot_val; \
+ \
+ shift = extract##size(src2, 0, 6); \
+ end = extract##size(src2, 8, 6); \
+ begin = extract##size(src2, 16, 6); \
+ rot_val = rol##size(src1, shift); \
+ mask = mask_u##size(begin, end); \
+ if (insert) { \
+ r->element[i] = (rot_val & mask) | (src3 & ~mask); \
+ } else { \
+ r->element[i] = (rot_val & mask); \
+ } \
+ } \
}
-VRLMI(vrldmi, 64, u64, 1);
-VRLMI(vrlwmi, 32, u32, 1);
-VRLMI(vrldnm, 64, u64, 0);
-VRLMI(vrlwnm, 32, u32, 0);
+VRLMI(VRLDMI, 64, u64, 1);
+VRLMI(VRLWMI, 32, u32, 1);
+VRLMI(VRLDNM, 64, u64, 0);
+VRLMI(VRLWNM, 32, u32, 0);
void helper_vsel(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
ppc_avr_t *c)
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 9dcac4243f..a025404032 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -799,7 +799,6 @@ static void trans_vclzd(DisasContext *ctx)
}
GEN_VXFORM_V(vmuluwm, MO_32, tcg_gen_gvec_mul, 4, 2);
-GEN_VXFORM(vrlwnm, 2, 6);
GEN_VXFORM(vsrv, 2, 28);
GEN_VXFORM(vslv, 2, 29);
GEN_VXFORM(vslo, 6, 16);
@@ -839,6 +838,152 @@ TRANS_FLAGS(ALTIVEC, VRLH, do_vector_gvec3_VX, MO_16, tcg_gen_gvec_rotlv)
TRANS_FLAGS(ALTIVEC, VRLW, do_vector_gvec3_VX, MO_32, tcg_gen_gvec_rotlv)
TRANS_FLAGS2(ALTIVEC_207, VRLD, do_vector_gvec3_VX, MO_64, tcg_gen_gvec_rotlv)
+static TCGv_vec do_vrl_mask_vec(unsigned vece, TCGv_vec vrb)
+{
+ TCGv_vec t0 = tcg_temp_new_vec_matching(vrb),
+ t1 = tcg_temp_new_vec_matching(vrb),
+ t2 = tcg_temp_new_vec_matching(vrb),
+ ones = tcg_constant_vec_matching(vrb, vece, -1);
+
+ /* Extract b and e */
+ tcg_gen_dupi_vec(vece, t2, (8 << vece) - 1);
+
+ tcg_gen_shri_vec(vece, t0, vrb, 16);
+ tcg_gen_and_vec(vece, t0, t0, t2);
+
+ tcg_gen_shri_vec(vece, t1, vrb, 8);
+ tcg_gen_and_vec(vece, t1, t1, t2);
+
+ /* Compare b and e to negate the mask where begin > end */
+ tcg_gen_cmp_vec(TCG_COND_GT, vece, t2, t0, t1);
+
+ /* Create the mask with (~0 >> b) ^ ((~0 >> e) >> 1) */
+ tcg_gen_shrv_vec(vece, t0, ones, t0);
+ tcg_gen_shrv_vec(vece, t1, ones, t1);
+ tcg_gen_shri_vec(vece, t1, t1, 1);
+ tcg_gen_xor_vec(vece, t0, t0, t1);
+
+ /* negate the mask */
+ tcg_gen_xor_vec(vece, t0, t0, t2);
+
+ tcg_temp_free_vec(t1);
+ tcg_temp_free_vec(t2);
+
+ return t0;
+}
+
+static void gen_vrlnm_vec(unsigned vece, TCGv_vec vrt, TCGv_vec vra,
+ TCGv_vec vrb)
+{
+ TCGv_vec mask, n = tcg_temp_new_vec_matching(vrt);
+
+ /* Create the mask */
+ mask = do_vrl_mask_vec(vece, vrb);
+
+ /* Extract n */
+ tcg_gen_dupi_vec(vece, n, (8 << vece) - 1);
+ tcg_gen_and_vec(vece, n, vrb, n);
+
+ /* Rotate and mask */
+ tcg_gen_rotlv_vec(vece, vrt, vra, n);
+ tcg_gen_and_vec(vece, vrt, vrt, mask);
+
+ tcg_temp_free_vec(n);
+ tcg_temp_free_vec(mask);
+}
+
+static bool do_vrlnm(DisasContext *ctx, arg_VX *a, int vece)
+{
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_cmp_vec, INDEX_op_rotlv_vec, INDEX_op_sari_vec,
+ INDEX_op_shli_vec, INDEX_op_shri_vec, INDEX_op_shrv_vec, 0
+ };
+ static const GVecGen3 ops[2] = {
+ {
+ .fniv = gen_vrlnm_vec,
+ .fno = gen_helper_VRLWNM,
+ .opt_opc = vecop_list,
+ .load_dest = true,
+ .vece = MO_32
+ },
+ {
+ .fniv = gen_vrlnm_vec,
+ .fno = gen_helper_VRLDNM,
+ .opt_opc = vecop_list,
+ .load_dest = true,
+ .vece = MO_64
+ }
+ };
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+ REQUIRE_VSX(ctx);
+
+ tcg_gen_gvec_3(avr_full_offset(a->vrt), avr_full_offset(a->vra),
+ avr_full_offset(a->vrb), 16, 16, &ops[vece - 2]);
+
+ return true;
+}
+
+TRANS(VRLWNM, do_vrlnm, MO_32)
+TRANS(VRLDNM, do_vrlnm, MO_64)
+
+static void gen_vrlmi_vec(unsigned vece, TCGv_vec vrt, TCGv_vec vra,
+ TCGv_vec vrb)
+{
+ TCGv_vec mask, n = tcg_temp_new_vec_matching(vrt),
+ tmp = tcg_temp_new_vec_matching(vrt);
+
+ /* Create the mask */
+ mask = do_vrl_mask_vec(vece, vrb);
+
+ /* Extract n */
+ tcg_gen_dupi_vec(vece, n, (8 << vece) - 1);
+ tcg_gen_and_vec(vece, n, vrb, n);
+
+ /* Rotate and insert */
+ tcg_gen_rotlv_vec(vece, tmp, vra, n);
+ tcg_gen_bitsel_vec(vece, vrt, mask, tmp, vrt);
+
+ tcg_temp_free_vec(n);
+ tcg_temp_free_vec(tmp);
+ tcg_temp_free_vec(mask);
+}
+
+static bool do_vrlmi(DisasContext *ctx, arg_VX *a, int vece)
+{
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_cmp_vec, INDEX_op_rotlv_vec, INDEX_op_sari_vec,
+ INDEX_op_shli_vec, INDEX_op_shri_vec, INDEX_op_shrv_vec, 0
+ };
+ static const GVecGen3 ops[2] = {
+ {
+ .fniv = gen_vrlmi_vec,
+ .fno = gen_helper_VRLWMI,
+ .opt_opc = vecop_list,
+ .load_dest = true,
+ .vece = MO_32
+ },
+ {
+ .fniv = gen_vrlmi_vec,
+ .fno = gen_helper_VRLDMI,
+ .opt_opc = vecop_list,
+ .load_dest = true,
+ .vece = MO_64
+ }
+ };
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+ REQUIRE_VSX(ctx);
+
+ tcg_gen_gvec_3(avr_full_offset(a->vrt), avr_full_offset(a->vra),
+ avr_full_offset(a->vrb), 16, 16, &ops[vece - 2]);
+
+ return true;
+}
+
+TRANS(VRLWMI, do_vrlmi, MO_32)
+TRANS(VRLDMI, do_vrlmi, MO_64)
+
static bool do_vector_shift_quad(DisasContext *ctx, arg_VX *a, bool right,
bool alg)
{
@@ -973,12 +1118,7 @@ GEN_VXFORM3(vsubeuqm, 31, 0);
GEN_VXFORM3(vsubecuq, 31, 0);
GEN_VXFORM_DUAL(vsubeuqm, PPC_NONE, PPC2_ALTIVEC_207, \
vsubecuq, PPC_NONE, PPC2_ALTIVEC_207)
-GEN_VXFORM(vrlwmi, 2, 2);
-GEN_VXFORM(vrldmi, 2, 3);
GEN_VXFORM_TRANS(vsl, 2, 7);
-GEN_VXFORM(vrldnm, 2, 7);
-GEN_VXFORM_DUAL(vsl, PPC_ALTIVEC, PPC_NONE, \
- vrldnm, PPC_NONE, PPC2_ISA300)
GEN_VXFORM_TRANS(vsr, 2, 11);
GEN_VXFORM_ENV(vpkuhum, 7, 0);
GEN_VXFORM_ENV(vpkuwum, 7, 1);
diff --git a/target/ppc/translate/vmx-ops.c.inc b/target/ppc/translate/vmx-ops.c.inc
index a7acea3ca7..3a8a9cc564 100644
--- a/target/ppc/translate/vmx-ops.c.inc
+++ b/target/ppc/translate/vmx-ops.c.inc
@@ -102,7 +102,6 @@ GEN_VXFORM_300(vextubrx, 6, 28),
GEN_VXFORM_300(vextuhrx, 6, 29),
GEN_VXFORM_DUAL(vmrgew, vextuwrx, 6, 30, PPC_NONE, PPC2_ALTIVEC_207),
GEN_VXFORM_207(vmuluwm, 4, 2),
-GEN_VXFORM_300(vrlwnm, 2, 6),
GEN_VXFORM_300(vsrv, 2, 28),
GEN_VXFORM_300(vslv, 2, 29),
GEN_VXFORM(vslo, 6, 16),
@@ -133,9 +132,7 @@ GEN_VXFORM_DUAL(vaddeuqm, vaddecuq, 30, 0xFF, PPC_NONE, PPC2_ALTIVEC_207),
GEN_VXFORM_DUAL(vsubuqm, bcdtrunc, 0, 20, PPC2_ALTIVEC_207, PPC2_ISA300),
GEN_VXFORM_DUAL(vsubcuq, bcdutrunc, 0, 21, PPC2_ALTIVEC_207, PPC2_ISA300),
GEN_VXFORM_DUAL(vsubeuqm, vsubecuq, 31, 0xFF, PPC_NONE, PPC2_ALTIVEC_207),
-GEN_VXFORM_300(vrlwmi, 2, 2),
-GEN_VXFORM_300(vrldmi, 2, 3),
-GEN_VXFORM_DUAL(vsl, vrldnm, 2, 7, PPC_ALTIVEC, PPC_NONE),
+GEN_VXFORM(vsl, 2, 7),
GEN_VXFORM(vsr, 2, 11),
GEN_VXFORM(vpkuhum, 7, 0),
GEN_VXFORM(vpkuwum, 7, 1),
--
2.25.1
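The mask construction in do_vrl_mask_vec can be checked on a single 32-bit element. The sketch below is a scalar model under the assumption that b and e are 5-bit positions in PowerPC bit numbering (bit 0 is the most significant bit); `vrl_mask32` is a name made up for this illustration.

```c
#include <stdint.h>

/*
 * Scalar model of the vector mask: ones from bit b through bit e
 * (MSB-first numbering), wrapping around when b > e.
 */
static uint32_t vrl_mask32(unsigned b, unsigned e)
{
    uint32_t ones = ~(uint32_t)0;
    uint32_t wrap, mask;

    b &= 31;
    e &= 31;

    /* All ones where begin > end, like the result of the vector compare. */
    wrap = (b > e) ? ones : 0;

    /* (~0 >> b) ^ ((~0 >> e) >> 1); the split shift avoids shifting by 32. */
    mask = (ones >> b) ^ ((ones >> e) >> 1);

    /* Negate the mask only in the wrapped case. */
    return mask ^ wrap;
}
```

With b=8 and e=15 this gives 0x00FF0000; with b=24 and e=7 the XOR step produces 0x00FFFF00 and the negation turns it into the wrapped mask 0xFF0000FF.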
* Re: [PATCH v4 24/47] target/ppc: move vrl[bhwd]nm/vrl[bhwd]mi to decodetree
2022-02-22 14:36 ` [PATCH v4 24/47] target/ppc: move vrl[bhwd]nm/vrl[bhwd]mi " matheus.ferst
@ 2022-02-22 22:30 ` Richard Henderson
2022-02-23 21:43 ` Matheus K. Ferst
0 siblings, 1 reply; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 22:30 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> +static void gen_vrlnm_vec(unsigned vece, TCGv_vec vrt, TCGv_vec vra,
> + TCGv_vec vrb)
> +{
> + TCGv_vec mask, n = tcg_temp_new_vec_matching(vrt);
> +
> + /* Create the mask */
> + mask = do_vrl_mask_vec(vece, vrb);
> +
> + /* Extract n */
> + tcg_gen_dupi_vec(vece, n, (8 << vece) - 1);
> + tcg_gen_and_vec(vece, n, vrb, n);
> +
> + /* Rotate and mask */
> + tcg_gen_rotlv_vec(vece, vrt, vra, n);
Note that rotlv does the masking itself:
/*
* Expand D = A << (B % element bits)
*
* Unlike scalar shifts, where it is easy for the target front end
* to include the modulo as part of the expansion. If the target
* naturally includes the modulo as part of the operation, great!
* If the target has some other behaviour from out-of-range shifts,
* then it could not use this function anyway, and would need to
* do it's own expansion with custom functions.
*/
> +static bool do_vrlnm(DisasContext *ctx, arg_VX *a, int vece)
> +{
> + static const TCGOpcode vecop_list[] = {
> + INDEX_op_cmp_vec, INDEX_op_rotlv_vec, INDEX_op_sari_vec,
> + INDEX_op_shli_vec, INDEX_op_shri_vec, INDEX_op_shrv_vec, 0
> + };
Where is sari used?
r~
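For reference, the rotate-and-mask and rotate-and-insert expansions differ only in their last step. A scalar sketch for one 32-bit element, taking the already-built mask as a parameter; all function names here are hypothetical, and `bitsel32` is written to match the usual bit-select definition (take bits from the second operand where the selector is set, from the third elsewhere).

```c
#include <stdint.h>

/* Rotate left with the count reduced modulo the element width. */
static uint32_t rol32_model(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Bit select: bits of b where sel is 1, bits of c where sel is 0. */
static uint32_t bitsel32(uint32_t sel, uint32_t b, uint32_t c)
{
    return (b & sel) | (c & ~sel);
}

/* vrlwnm-style element: rotate, then keep only the masked bits. */
static uint32_t rotl_and_mask(uint32_t vra, unsigned n, uint32_t mask)
{
    return rol32_model(vra, n) & mask;
}

/* vrlwmi-style element: rotate, then insert into the old destination. */
static uint32_t rotl_and_insert(uint32_t vrt, uint32_t vra, unsigned n,
                                uint32_t mask)
{
    return bitsel32(mask, rol32_model(vra, n), vrt);
}
```

The insert form reads the previous destination value, whereas the mask form overwrites it completely.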
* Re: [PATCH v4 24/47] target/ppc: move vrl[bhwd]nm/vrl[bhwd]mi to decodetree
2022-02-22 22:30 ` Richard Henderson
@ 2022-02-23 21:43 ` Matheus K. Ferst
2022-02-23 22:19 ` Richard Henderson
0 siblings, 1 reply; 97+ messages in thread
From: Matheus K. Ferst @ 2022-02-23 21:43 UTC (permalink / raw)
To: Richard Henderson, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 22/02/2022 19:30, Richard Henderson wrote:
> On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
>> +static void gen_vrlnm_vec(unsigned vece, TCGv_vec vrt, TCGv_vec vra,
>> + TCGv_vec vrb)
>> +{
>> + TCGv_vec mask, n = tcg_temp_new_vec_matching(vrt);
>> +
>> + /* Create the mask */
>> + mask = do_vrl_mask_vec(vece, vrb);
>> +
>> + /* Extract n */
>> + tcg_gen_dupi_vec(vece, n, (8 << vece) - 1);
>> + tcg_gen_and_vec(vece, n, vrb, n);
>> +
>> + /* Rotate and mask */
>> + tcg_gen_rotlv_vec(vece, vrt, vra, n);
>
> Note that rotlv does the masking itself:
>
> /*
> * Expand D = A << (B % element bits)
> *
> * Unlike scalar shifts, where it is easy for the target front end
> * to include the modulo as part of the expansion. If the target
> * naturally includes the modulo as part of the operation, great!
> * If the target has some other behaviour from out-of-range shifts,
> * then it could not use this function anyway, and would need to
> * do it's own expansion with custom functions.
> */
>
Using tcg_gen_rotlv_vec(vece, vrt, vra, vrb) works on PPC but fails on
x86. It looks like a problem in the i386 backend: it uses
VPS[RL]LV[DQ], but instead of applying the modulo, these instructions
write zero to the element when the count is out of range [1]. I'm not
sure how to fix that. Do we need an INDEX_op_shlv_vec case in the i386
tcg_expand_vec_op?
>> +static bool do_vrlnm(DisasContext *ctx, arg_VX *a, int vece)
>> +{
>> + static const TCGOpcode vecop_list[] = {
>> + INDEX_op_cmp_vec, INDEX_op_rotlv_vec, INDEX_op_sari_vec,
>> + INDEX_op_shli_vec, INDEX_op_shri_vec, INDEX_op_shrv_vec, 0
>> + };
>
> Where is sari used?
>
I'll remove it in v5.
[1] Section 5.3 of
https://www.intel.com/content/dam/develop/external/us/en/documents/36945
Thanks,
Matheus K. Ferst
Instituto de Pesquisas ELDORADO <http://www.eldorado.org.br/>
Software Analyst
Legal Notice - Disclaimer <https://www.eldorado.org.br/disclaimer.html>
* Re: [PATCH v4 24/47] target/ppc: move vrl[bhwd]nm/vrl[bhwd]mi to decodetree
2022-02-23 21:43 ` Matheus K. Ferst
@ 2022-02-23 22:19 ` Richard Henderson
2022-02-24 20:23 ` Matheus K. Ferst
0 siblings, 1 reply; 97+ messages in thread
From: Richard Henderson @ 2022-02-23 22:19 UTC (permalink / raw)
To: Matheus K. Ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/23/22 11:43, Matheus K. Ferst wrote:
>> Note that rotlv does the masking itself:
>>
>> /*
>> * Expand D = A << (B % element bits)
>> *
>> * Unlike scalar shifts, where it is easy for the target front end
>> * to include the modulo as part of the expansion. If the target
>> * naturally includes the modulo as part of the operation, great!
>> * If the target has some other behaviour from out-of-range shifts,
>> * then it could not use this function anyway, and would need to
>> * do it's own expansion with custom functions.
>> */
>>
>
> Using tcg_gen_rotlv_vec(vece, vrt, vra, vrb) works on PPC but fails on x86. It looks like
> a problem on the i386 backend. It's using VPS[RL]LV[DQ], but instead of this modulo
> behavior, these instructions write zero to the element[1]. I'm not sure how to fix that.
You don't want to use tcg_gen_rotlv_vec directly, but tcg_gen_rotlv_vec.
The generic modulo is being applied here:
static void tcg_gen_rotlv_mod_vec(unsigned vece, TCGv_vec d,
TCGv_vec a, TCGv_vec b)
{
TCGv_vec t = tcg_temp_new_vec_matching(d);
TCGv_vec m = tcg_constant_vec_matching(d, vece, (8 << vece) - 1);
tcg_gen_and_vec(vece, t, b, m);
tcg_gen_rotlv_vec(vece, d, a, t);
tcg_temp_free_vec(t);
}
r~
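Why the explicit modulo matters can be shown with a scalar model. The sketch assumes a host whose variable shifts produce zero for out-of-range counts (as described for the x86 instructions above) and a rotate that is built from two such shifts; both assumptions and all names below are illustrative only, not the backend code.

```c
#include <stdint.h>

/* Variable shifts that yield zero when the count is out of range. */
static uint32_t shl_zero_oor(uint32_t x, uint32_t n)
{
    return n < 32 ? x << n : 0;
}

static uint32_t shr_zero_oor(uint32_t x, uint32_t n)
{
    return n < 32 ? x >> n : 0;
}

/* Rotate built from two shifts, passing the count through unchanged. */
static uint32_t rotl32_unmasked(uint32_t x, uint32_t n)
{
    return shl_zero_oor(x, n) | shr_zero_oor(x, 32 - n);
}

/* Same rotate, but with the count reduced modulo the element width first. */
static uint32_t rotl32_mod(uint32_t x, uint32_t n)
{
    return rotl32_unmasked(x, n & 31);
}
```

When the count carries extra high bits, as the vrb elements do here because they also hold the mask begin and end fields, the unmasked form collapses to zero while the masked form still rotates by the low bits.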
* Re: [PATCH v4 24/47] target/ppc: move vrl[bhwd]nm/vrl[bhwd]mi to decodetree
2022-02-23 22:19 ` Richard Henderson
@ 2022-02-24 20:23 ` Matheus K. Ferst
2022-02-24 21:26 ` Richard Henderson
0 siblings, 1 reply; 97+ messages in thread
From: Matheus K. Ferst @ 2022-02-24 20:23 UTC (permalink / raw)
To: Richard Henderson, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 23/02/2022 19:19, Richard Henderson wrote:
> On 2/23/22 11:43, Matheus K. Ferst wrote:
>>> Note that rotlv does the masking itself:
>>>
>>> /*
>>> * Expand D = A << (B % element bits)
>>> *
>>> * Unlike scalar shifts, where it is easy for the target front end
>>> * to include the modulo as part of the expansion. If the target
>>> * naturally includes the modulo as part of the operation, great!
>>> * If the target has some other behaviour from out-of-range shifts,
>>> * then it could not use this function anyway, and would need to
>>> * do it's own expansion with custom functions.
>>> */
>>>
>>
>> Using tcg_gen_rotlv_vec(vece, vrt, vra, vrb) works on PPC but fails on
>> x86. It looks like
>> a problem on the i386 backend. It's using VPS[RL]LV[DQ], but instead
>> of this modulo
>> behavior, these instructions write zero to the element[1]. I'm not
>> sure how to fix that.
>
> You don't want to use tcg_gen_rotlv_vec directly, but tcg_gen_rotlv_vec.
>
I guess there is a typo here. Did you mean tcg_gen_gvec_rotlv? Or
tcg_gen_rotlv_mod_vec?
> The generic modulo is being applied here:
>
> static void tcg_gen_rotlv_mod_vec(unsigned vece, TCGv_vec d,
> TCGv_vec a, TCGv_vec b)
> {
> TCGv_vec t = tcg_temp_new_vec_matching(d);
> TCGv_vec m = tcg_constant_vec_matching(d, vece, (8 << vece) - 1);
>
> tcg_gen_and_vec(vece, t, b, m);
> tcg_gen_rotlv_vec(vece, d, a, t);
> tcg_temp_free_vec(t);
> }
I can see that this method is called when we use tcg_gen_gvec_rotlv to
implement vrl[bhwd], and they are working as expected. For vrl[wd]nm and
vrl[wd]mi, however, we can't call tcg_gen_rotlv_mod_vec directly in the
.fniv implementation because it is not exposed in tcg-op.h. Is there any
other way to use this method? Should we add it to the header file?
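To make the difference concrete, here is a minimal scalar sketch for one 32-bit element. It is not QEMU code: shl_zeroing/shr_zeroing are hypothetical models of a variable shift that yields zero for out-of-range counts (the behaviour described above for the x86 instructions), and rotl_mod applies the same kind of mask that tcg_gen_rotlv_mod_vec builds with (8 << vece) - 1.

```c
#include <stdint.h>

/* Hypothetical model: variable shifts that produce 0 when the count is
 * out of range, instead of taking the count modulo the element size. */
static uint32_t shl_zeroing(uint32_t a, uint32_t n)
{
    return n < 32 ? a << n : 0;
}

static uint32_t shr_zeroing(uint32_t a, uint32_t n)
{
    return n < 32 ? a >> n : 0;
}

/* Rotate built from the zeroing shifts; only correct for n in [0, 31]. */
static uint32_t rotl_unmasked(uint32_t a, uint32_t n)
{
    return shl_zeroing(a, n) | shr_zeroing(a, 32 - n);
}

/* Same rotate, with the count reduced modulo the element width first.
 * For 32-bit elements the mask is (8 << 2) - 1 = 31. */
static uint32_t rotl_mod(uint32_t a, uint32_t n)
{
    return rotl_unmasked(a, n & 31);
}
```

With a count of 33, rotl_mod rotates by 1 as PPC expects, while rotl_unmasked returns 0 because both partial shifts are out of range.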
Thanks,
Matheus K. Ferst
Instituto de Pesquisas ELDORADO <http://www.eldorado.org.br/>
Software Analyst
* Re: [PATCH v4 24/47] target/ppc: move vrl[bhwd]nm/vrl[bhwd]mi to decodetree
2022-02-24 20:23 ` Matheus K. Ferst
@ 2022-02-24 21:26 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-24 21:26 UTC (permalink / raw)
To: Matheus K. Ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/24/22 10:23, Matheus K. Ferst wrote:
>> You don't want to use tcg_gen_rotlv_vec directly, but tcg_gen_rotlv_vec.
>>
>
> I guess there is a typo here. Did you mean tcg_gen_gvec_rotlv? Or tcg_gen_rotlv_mod_vec?
Dangit. Paste-paste error. The first: tcg_gen_gvec_rotlv.
r~
* [PATCH v4 25/47] target/ppc: implement vrlq
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (23 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 24/47] target/ppc: move vrl[bhwd]nm/vrl[bhwd]mi " matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 22:33 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 26/47] target/ppc: Move vsel and vperm/vpermr to decodetree matheus.ferst
` (21 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
v4:
- New in v4.
---
target/ppc/insn32.decode | 1 +
target/ppc/translate/vmx-impl.c.inc | 49 +++++++++++++++++++++++++++++
2 files changed, 50 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index e788dc5152..c3d47a8815 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -491,6 +491,7 @@ VRLB 000100 ..... ..... ..... 00000000100 @VX
VRLH 000100 ..... ..... ..... 00001000100 @VX
VRLW 000100 ..... ..... ..... 00010000100 @VX
VRLD 000100 ..... ..... ..... 00011000100 @VX
+VRLQ 000100 ..... ..... ..... 00000000101 @VX
VRLWMI 000100 ..... ..... ..... 00010000101 @VX
VRLDMI 000100 ..... ..... ..... 00011000101 @VX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index a025404032..6b68a81706 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1053,6 +1053,55 @@ TRANS_FLAGS2(ISA310, VSLQ, do_vector_shift_quad, false, false);
TRANS_FLAGS2(ISA310, VSRQ, do_vector_shift_quad, true, false);
TRANS_FLAGS2(ISA310, VSRAQ, do_vector_shift_quad, true, true);
+static bool trans_VRLQ(DisasContext *ctx, arg_VX *a)
+{
+ TCGv_i64 ah, al, n, t0, t1, sf = tcg_constant_i64(64);
+
+ REQUIRE_VECTOR(ctx);
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+
+ ah = tcg_temp_new_i64();
+ al = tcg_temp_new_i64();
+ n = tcg_temp_new_i64();
+ t0 = tcg_temp_new_i64();
+ t1 = tcg_temp_new_i64();
+
+ get_avr64(ah, a->vra, true);
+ get_avr64(al, a->vra, false);
+ get_avr64(n, a->vrb, true);
+
+ tcg_gen_andi_i64(n, n, 0x7F);
+
+ tcg_gen_mov_i64(t0, ah);
+ tcg_gen_movcond_i64(TCG_COND_GE, ah, n, sf, al, ah);
+ tcg_gen_movcond_i64(TCG_COND_GE, al, n, sf, t0, al);
+ tcg_gen_andi_i64(n, n, ~64ULL);
+
+ tcg_gen_shl_i64(t0, ah, n);
+ tcg_gen_shl_i64(t1, al, n);
+
+ tcg_gen_xori_i64(n, n, 63);
+
+ tcg_gen_shr_i64(al, al, n);
+ tcg_gen_shri_i64(al, al, 1);
+ tcg_gen_or_i64(t0, al, t0);
+
+ tcg_gen_shr_i64(ah, ah, n);
+ tcg_gen_shri_i64(ah, ah, 1);
+ tcg_gen_or_i64(t1, ah, t1);
+
+ set_avr64(a->vrt, t0, true);
+ set_avr64(a->vrt, t1, false);
+
+ tcg_temp_free_i64(ah);
+ tcg_temp_free_i64(al);
+ tcg_temp_free_i64(n);
+ tcg_temp_free_i64(t0);
+ tcg_temp_free_i64(t1);
+
+ return true;
+}
+
#define GEN_VXFORM_SAT(NAME, VECE, NORM, SAT, OPC2, OPC3) \
static void glue(glue(gen_, NAME), _vec)(unsigned vece, TCGv_vec t, \
TCGv_vec sat, TCGv_vec a, \
--
2.25.1
* Re: [PATCH v4 25/47] target/ppc: implement vrlq
2022-02-22 14:36 ` [PATCH v4 25/47] target/ppc: implement vrlq matheus.ferst
@ 2022-02-22 22:33 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 22:33 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst<matheus.ferst@eldorado.org.br>
>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> v4:
> - New in v4.
> ---
> target/ppc/insn32.decode | 1 +
> target/ppc/translate/vmx-impl.c.inc | 49 +++++++++++++++++++++++++++++
> 2 files changed, 50 insertions(+)
...
> + tcg_gen_andi_i64(n, n, 0x7F);
> +
> + tcg_gen_mov_i64(t0, ah);
> + tcg_gen_movcond_i64(TCG_COND_GE, ah, n, sf, al, ah);
> + tcg_gen_movcond_i64(TCG_COND_GE, al, n, sf, t0, al);
Similar comment re (n & 64) != 0. Otherwise,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
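For reference, a scalar sketch of the sequence the patch emits, written as plain C rather than TCG ops. The u128 struct and rotl128 name are illustrative only. Once n has been masked to 7 bits, the patch's "n >= 64" test and "(n & 64) != 0" select exactly the same cases, which is why either condition can drive the swap of the two halves.

```c
#include <stdint.h>

/* Illustrative 128-bit value split into two 64-bit halves. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} u128;

static u128 rotl128(u128 a, uint64_t n)
{
    uint64_t ah = a.hi, al = a.lo, t0, t1;
    u128 r;

    n &= 0x7F;                 /* rotate count is taken modulo 128 */

    if (n & 64) {              /* rotating by >= 64 swaps the halves */
        uint64_t t = ah;
        ah = al;
        al = t;
    }
    n &= 63;                   /* remaining count, 0..63 */

    t0 = ah << n;
    t1 = al << n;

    n ^= 63;                   /* 63 - n for n in 0..63 */

    /* (x >> (63 - n)) >> 1 equals x >> (64 - n) without ever shifting
     * by 64, which would be undefined in C when n is 0. */
    t0 |= (al >> n) >> 1;
    t1 |= (ah >> n) >> 1;

    r.hi = t0;
    r.lo = t1;
    return r;
}
```

The two-step right shift is the same trick the patch uses with tcg_gen_shr_i64 followed by tcg_gen_shri_i64(..., 1).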
* [PATCH v4 26/47] target/ppc: Move vsel and vperm/vpermr to decodetree
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (24 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 25/47] target/ppc: implement vrlq matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 22:37 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 27/47] target/ppc: Move xxsel " matheus.ferst
` (20 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/helper.h | 5 +--
target/ppc/insn32.decode | 5 +++
target/ppc/int_helper.c | 13 +-----
target/ppc/translate/vmx-impl.c.inc | 69 ++++++++++++++++++++++-------
target/ppc/translate/vmx-ops.c.inc | 2 -
5 files changed, 62 insertions(+), 32 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index a2a0d461dd..c57b3035ae 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -227,9 +227,8 @@ DEF_HELPER_2(vupklsh, void, avr, avr)
DEF_HELPER_2(vupklsw, void, avr, avr)
DEF_HELPER_5(vmsumubm, void, env, avr, avr, avr, avr)
DEF_HELPER_5(vmsummbm, void, env, avr, avr, avr, avr)
-DEF_HELPER_5(vsel, void, env, avr, avr, avr, avr)
-DEF_HELPER_5(vperm, void, env, avr, avr, avr, avr)
-DEF_HELPER_5(vpermr, void, env, avr, avr, avr, avr)
+DEF_HELPER_4(VPERM, void, avr, avr, avr, avr)
+DEF_HELPER_4(VPERMR, void, avr, avr, avr, avr)
DEF_HELPER_4(vpkshss, void, env, avr, avr, avr)
DEF_HELPER_4(vpkshus, void, env, avr, avr, avr)
DEF_HELPER_4(vpkswss, void, env, avr, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index c3d47a8815..1456fa2b9d 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -467,6 +467,11 @@ VINSWVRX 000100 ..... ..... ..... 00110001111 @VX
VSLDBI 000100 ..... ..... ..... 00 ... 010110 @VN
VSRDBI 000100 ..... ..... ..... 01 ... 010110 @VN
+VPERM 000100 ..... ..... ..... ..... 101011 @VA
+VPERMR 000100 ..... ..... ..... ..... 111011 @VA
+
+VSEL 000100 ..... ..... ..... ..... 101010 @VA
+
## Vector Integer Shift Instruction
VSLB 000100 ..... ..... ..... 00100000100 @VX
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 58e57b2563..05978b686d 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1031,8 +1031,7 @@ void helper_VMULOUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
mulu64(&r->VsrD(1), &r->VsrD(0), a->VsrD(1), b->VsrD(1));
}
-void helper_vperm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
- ppc_avr_t *c)
+void helper_VPERM(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
{
ppc_avr_t result;
int i;
@@ -1050,8 +1049,7 @@ void helper_vperm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
*r = result;
}
-void helper_vpermr(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
- ppc_avr_t *c)
+void helper_VPERMR(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
{
ppc_avr_t result;
int i;
@@ -1319,13 +1317,6 @@ VRLMI(VRLWMI, 32, u32, 1);
VRLMI(VRLDNM, 64, u64, 0);
VRLMI(VRLWNM, 32, u32, 0);
-void helper_vsel(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
- ppc_avr_t *c)
-{
- r->u64[0] = (a->u64[0] & ~c->u64[0]) | (b->u64[0] & c->u64[0]);
- r->u64[1] = (a->u64[1] & ~c->u64[1]) | (b->u64[1] & c->u64[1]);
-}
-
void helper_vexptefp(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *b)
{
int i;
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 6b68a81706..f734f449e0 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -2474,28 +2474,65 @@ static void gen_vmladduhm(DisasContext *ctx)
tcg_temp_free_ptr(rd);
}
-static void gen_vpermr(DisasContext *ctx)
+static bool trans_VPERM(DisasContext *ctx, arg_VA *a)
{
- TCGv_ptr ra, rb, rc, rd;
- if (unlikely(!ctx->altivec_enabled)) {
- gen_exception(ctx, POWERPC_EXCP_VPU);
- return;
- }
- ra = gen_avr_ptr(rA(ctx->opcode));
- rb = gen_avr_ptr(rB(ctx->opcode));
- rc = gen_avr_ptr(rC(ctx->opcode));
- rd = gen_avr_ptr(rD(ctx->opcode));
- gen_helper_vpermr(cpu_env, rd, ra, rb, rc);
- tcg_temp_free_ptr(ra);
- tcg_temp_free_ptr(rb);
- tcg_temp_free_ptr(rc);
- tcg_temp_free_ptr(rd);
+ TCGv_ptr vrt, vra, vrb, vrc;
+
+ REQUIRE_INSNS_FLAGS(ctx, ALTIVEC);
+ REQUIRE_VECTOR(ctx);
+
+ vrt = gen_avr_ptr(a->vrt);
+ vra = gen_avr_ptr(a->vra);
+ vrb = gen_avr_ptr(a->vrb);
+ vrc = gen_avr_ptr(a->rc);
+
+ gen_helper_VPERM(vrt, vra, vrb, vrc);
+
+ tcg_temp_free_ptr(vrt);
+ tcg_temp_free_ptr(vra);
+ tcg_temp_free_ptr(vrb);
+ tcg_temp_free_ptr(vrc);
+
+ return true;
+}
+
+static bool trans_VPERMR(DisasContext *ctx, arg_VA *a)
+{
+ TCGv_ptr vrt, vra, vrb, vrc;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+ REQUIRE_VECTOR(ctx);
+
+ vrt = gen_avr_ptr(a->vrt);
+ vra = gen_avr_ptr(a->vra);
+ vrb = gen_avr_ptr(a->vrb);
+ vrc = gen_avr_ptr(a->rc);
+
+ gen_helper_VPERMR(vrt, vra, vrb, vrc);
+
+ tcg_temp_free_ptr(vrt);
+ tcg_temp_free_ptr(vra);
+ tcg_temp_free_ptr(vrb);
+ tcg_temp_free_ptr(vrc);
+
+ return true;
+}
+
+static bool trans_VSEL(DisasContext *ctx, arg_VA *a)
+{
+ REQUIRE_INSNS_FLAGS(ctx, ALTIVEC);
+ REQUIRE_VECTOR(ctx);
+
+ tcg_gen_gvec_bitsel(MO_64, avr_full_offset(a->vrt), avr_full_offset(a->rc),
+ avr_full_offset(a->vrb), avr_full_offset(a->vra),
+ 16, 16);
+
+ return true;
}
GEN_VAFORM_PAIRED(vmsumubm, vmsummbm, 18)
GEN_VAFORM_PAIRED(vmsumuhm, vmsumuhs, 19)
GEN_VAFORM_PAIRED(vmsumshm, vmsumshs, 20)
-GEN_VAFORM_PAIRED(vsel, vperm, 21)
GEN_VAFORM_PAIRED(vmaddfp, vnmsubfp, 23)
GEN_VXFORM_NOA(vclzb, 1, 28)
diff --git a/target/ppc/translate/vmx-ops.c.inc b/target/ppc/translate/vmx-ops.c.inc
index 3a8a9cc564..d960648d52 100644
--- a/target/ppc/translate/vmx-ops.c.inc
+++ b/target/ppc/translate/vmx-ops.c.inc
@@ -194,7 +194,6 @@ GEN_VXFORM_300_EO(vctzw, 0x01, 0x18, 0x1E),
GEN_VXFORM_300_EO(vctzd, 0x01, 0x18, 0x1F),
GEN_VXFORM_300_EO(vclzlsbb, 0x01, 0x18, 0x0),
GEN_VXFORM_300_EO(vctzlsbb, 0x01, 0x18, 0x1),
-GEN_VXFORM_300(vpermr, 0x1D, 0xFF),
#define GEN_VXFORM_NOA(name, opc2, opc3) \
GEN_HANDLER(name, 0x04, opc2, opc3, 0x001f0000, PPC_ALTIVEC)
@@ -229,7 +228,6 @@ GEN_VAFORM_PAIRED(vmhaddshs, vmhraddshs, 16),
GEN_VAFORM_PAIRED(vmsumubm, vmsummbm, 18),
GEN_VAFORM_PAIRED(vmsumuhm, vmsumuhs, 19),
GEN_VAFORM_PAIRED(vmsumshm, vmsumshs, 20),
-GEN_VAFORM_PAIRED(vsel, vperm, 21),
GEN_VAFORM_PAIRED(vmaddfp, vnmsubfp, 23),
GEN_VXFORM_DUAL(vclzb, vpopcntb, 1, 28, PPC_NONE, PPC2_ALTIVEC_207),
--
2.25.1
* Re: [PATCH v4 26/47] target/ppc: Move vsel and vperm/vpermr to decodetree
2022-02-22 14:36 ` [PATCH v4 26/47] target/ppc: Move vsel and vperm/vpermr to decodetree matheus.ferst
@ 2022-02-22 22:37 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 22:37 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst<matheus.ferst@eldorado.org.br>
>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> target/ppc/helper.h | 5 +--
> target/ppc/insn32.decode | 5 +++
> target/ppc/int_helper.c | 13 +-----
> target/ppc/translate/vmx-impl.c.inc | 69 ++++++++++++++++++++++-------
> target/ppc/translate/vmx-ops.c.inc | 2 -
> 5 files changed, 62 insertions(+), 32 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
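A small sketch of why the operand order in the new trans_VSEL is (vrt, rc, vrb, vra). It assumes tcg_gen_gvec_bitsel computes d = (b & a) | (c & ~a), with its first source operand acting as the selector; the vec128 type and the bitsel/vsel functions below are illustrative stand-ins, checked against the expression in the removed helper_vsel.

```c
#include <stdint.h>

/* Illustrative 128-bit register as two 64-bit lanes. */
typedef struct {
    uint64_t u64[2];
} vec128;

/* Assumed bitsel semantics: take bits from if_set where sel is 1,
 * and from if_clear where sel is 0. */
static vec128 bitsel(vec128 sel, vec128 if_set, vec128 if_clear)
{
    vec128 r;
    int i;

    for (i = 0; i < 2; i++) {
        r.u64[i] = (if_set.u64[i] & sel.u64[i]) |
                   (if_clear.u64[i] & ~sel.u64[i]);
    }
    return r;
}

/* vsel as in the removed helper: r = (a & ~c) | (b & c).
 * Expressed through bitsel, vrc is the selector, vrb supplies the
 * "set" bits and vra the "clear" bits. */
static vec128 vsel(vec128 vra, vec128 vrb, vec128 vrc)
{
    return bitsel(vrc, vrb, vra);
}
```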
* [PATCH v4 27/47] target/ppc: Move xxsel to decodetree
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (25 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 26/47] target/ppc: Move vsel and vperm/vpermr to decodetree matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 22:38 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 28/47] target/ppc: move xxperm/xxpermr " matheus.ferst
` (19 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 6 ++++
target/ppc/insn64.decode | 24 ++++++++--------
target/ppc/translate/vsx-impl.c.inc | 20 ++++++--------
target/ppc/translate/vsx-ops.c.inc | 43 -----------------------------
4 files changed, 26 insertions(+), 67 deletions(-)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 1456fa2b9d..ad2aa0257c 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -148,12 +148,16 @@
%xx_xt 0:1 21:5
%xx_xb 1:1 11:5
%xx_xa 2:1 16:5
+%xx_xc 3:1 6:5
&XX2 xt xb uim:uint8_t
@XX2 ...... ..... ... uim:2 ..... ......... .. &XX2 xt=%xx_xt xb=%xx_xb
&XX3 xt xa xb
@XX3 ...... ..... ..... ..... ........ ... &XX3 xt=%xx_xt xa=%xx_xa xb=%xx_xb
+&XX4 xt xa xb xc
+@XX4 ...... ..... ..... ..... ..... .. .... &XX4 xt=%xx_xt xa=%xx_xa xb=%xx_xb xc=%xx_xc
+
&Z22_bf_fra bf fra dm
@Z22_bf_fra ...... bf:3 .. fra:5 dm:6 ......... . &Z22_bf_fra
@@ -598,6 +602,8 @@ STXVPX 011111 ..... ..... ..... 0111001101 - @X_TSXP
XXSPLTIB 111100 ..... 00 ........ 0101101000 . @X_imm8
XXSPLTW 111100 ..... ---.. ..... 010100100 . . @XX2
+XXSEL 111100 ..... ..... ..... ..... 11 .... @XX4
+
## VSX Vector Load Special Value Instruction
LXVKQ 111100 ..... 11111 ..... 0101101000 . @X_uim5
diff --git a/target/ppc/insn64.decode b/target/ppc/insn64.decode
index 39e610913d..9e4f531fb9 100644
--- a/target/ppc/insn64.decode
+++ b/target/ppc/insn64.decode
@@ -44,15 +44,15 @@
...... ..... .... . ................ \
&8RR_D si=%8rr_si xt=%8rr_xt
-# Format XX4
-&XX4 xt xa xb xc
-%xx4_xt 0:1 21:5
-%xx4_xa 2:1 16:5
-%xx4_xb 1:1 11:5
-%xx4_xc 3:1 6:5
-@XX4 ........ ........ ........ ........ \
+# Format 8RR:XX4
+%8rr_xx_xt 0:1 21:5
+%8rr_xx_xa 2:1 16:5
+%8rr_xx_xb 1:1 11:5
+%8rr_xx_xc 3:1 6:5
+&8RR_XX4 xt xa xb xc
+@8RR_XX4 ........ ........ ........ ........ \
...... ..... ..... ..... ..... .. .... \
- &XX4 xt=%xx4_xt xa=%xx4_xa xb=%xx4_xb xc=%xx4_xc
+ &8RR_XX4 xt=%8rr_xx_xt xa=%8rr_xx_xa xb=%8rr_xx_xb xc=%8rr_xx_xc
### Fixed-Point Load Instructions
@@ -187,10 +187,10 @@ XXSPLTI32DX 000001 01 0000 -- -- ................ \
100000 ..... 000 .. ................ @8RR_D_IX
XXBLENDVD 000001 01 0000 -- ------------------ \
- 100001 ..... ..... ..... ..... 11 .... @XX4
+ 100001 ..... ..... ..... ..... 11 .... @8RR_XX4
XXBLENDVW 000001 01 0000 -- ------------------ \
- 100001 ..... ..... ..... ..... 10 .... @XX4
+ 100001 ..... ..... ..... ..... 10 .... @8RR_XX4
XXBLENDVH 000001 01 0000 -- ------------------ \
- 100001 ..... ..... ..... ..... 01 .... @XX4
+ 100001 ..... ..... ..... ..... 01 .... @8RR_XX4
XXBLENDVB 000001 01 0000 -- ------------------ \
- 100001 ..... ..... ..... ..... 00 .... @XX4
+ 100001 ..... ..... ..... ..... 00 .... @8RR_XX4
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index e8a4ba0cfa..48e4a2e266 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1422,19 +1422,15 @@ static void glue(gen_, name)(DisasContext *ctx) \
VSX_XXMRG(xxmrghw, 1)
VSX_XXMRG(xxmrglw, 0)
-static void gen_xxsel(DisasContext *ctx)
+static bool trans_XXSEL(DisasContext *ctx, arg_XX4 *a)
{
- int rt = xT(ctx->opcode);
- int ra = xA(ctx->opcode);
- int rb = xB(ctx->opcode);
- int rc = xC(ctx->opcode);
+ REQUIRE_INSNS_FLAGS2(ctx, VSX);
+ REQUIRE_VSX(ctx);
- if (unlikely(!ctx->vsx_enabled)) {
- gen_exception(ctx, POWERPC_EXCP_VSXU);
- return;
- }
- tcg_gen_gvec_bitsel(MO_64, vsr_full_offset(rt), vsr_full_offset(rc),
- vsr_full_offset(rb), vsr_full_offset(ra), 16, 16);
+ tcg_gen_gvec_bitsel(MO_64, vsr_full_offset(a->xt), vsr_full_offset(a->xc),
+ vsr_full_offset(a->xb), vsr_full_offset(a->xa), 16, 16);
+
+ return true;
}
static bool trans_XXSPLTW(DisasContext *ctx, arg_XX2 *a)
@@ -2127,7 +2123,7 @@ static void gen_xxblendv_vec(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b,
tcg_temp_free_vec(tmp);
}
-static bool do_xxblendv(DisasContext *ctx, arg_XX4 *a, unsigned vece)
+static bool do_xxblendv(DisasContext *ctx, arg_8RR_XX4 *a, unsigned vece)
{
static const TCGOpcode vecop_list[] = {
INDEX_op_sari_vec, 0
diff --git a/target/ppc/translate/vsx-ops.c.inc b/target/ppc/translate/vsx-ops.c.inc
index c974324c4c..b0dbb38c80 100644
--- a/target/ppc/translate/vsx-ops.c.inc
+++ b/target/ppc/translate/vsx-ops.c.inc
@@ -347,47 +347,4 @@ GEN_XX3FORM_DM(xxsldwi, 0x08, 0x00),
GEN_XX2FORM_EXT(xxextractuw, 0x0A, 0x0A, PPC2_ISA300),
GEN_XX2FORM_EXT(xxinsertw, 0x0A, 0x0B, PPC2_ISA300),
-#define GEN_XXSEL_ROW(opc3) \
-GEN_HANDLER2_E(xxsel, "xxsel", 0x3C, 0x18, opc3, 0, PPC_NONE, PPC2_VSX), \
-GEN_HANDLER2_E(xxsel, "xxsel", 0x3C, 0x19, opc3, 0, PPC_NONE, PPC2_VSX), \
-GEN_HANDLER2_E(xxsel, "xxsel", 0x3C, 0x1A, opc3, 0, PPC_NONE, PPC2_VSX), \
-GEN_HANDLER2_E(xxsel, "xxsel", 0x3C, 0x1B, opc3, 0, PPC_NONE, PPC2_VSX), \
-GEN_HANDLER2_E(xxsel, "xxsel", 0x3C, 0x1C, opc3, 0, PPC_NONE, PPC2_VSX), \
-GEN_HANDLER2_E(xxsel, "xxsel", 0x3C, 0x1D, opc3, 0, PPC_NONE, PPC2_VSX), \
-GEN_HANDLER2_E(xxsel, "xxsel", 0x3C, 0x1E, opc3, 0, PPC_NONE, PPC2_VSX), \
-GEN_HANDLER2_E(xxsel, "xxsel", 0x3C, 0x1F, opc3, 0, PPC_NONE, PPC2_VSX), \
-
-GEN_XXSEL_ROW(0x00)
-GEN_XXSEL_ROW(0x01)
-GEN_XXSEL_ROW(0x02)
-GEN_XXSEL_ROW(0x03)
-GEN_XXSEL_ROW(0x04)
-GEN_XXSEL_ROW(0x05)
-GEN_XXSEL_ROW(0x06)
-GEN_XXSEL_ROW(0x07)
-GEN_XXSEL_ROW(0x08)
-GEN_XXSEL_ROW(0x09)
-GEN_XXSEL_ROW(0x0A)
-GEN_XXSEL_ROW(0x0B)
-GEN_XXSEL_ROW(0x0C)
-GEN_XXSEL_ROW(0x0D)
-GEN_XXSEL_ROW(0x0E)
-GEN_XXSEL_ROW(0x0F)
-GEN_XXSEL_ROW(0x10)
-GEN_XXSEL_ROW(0x11)
-GEN_XXSEL_ROW(0x12)
-GEN_XXSEL_ROW(0x13)
-GEN_XXSEL_ROW(0x14)
-GEN_XXSEL_ROW(0x15)
-GEN_XXSEL_ROW(0x16)
-GEN_XXSEL_ROW(0x17)
-GEN_XXSEL_ROW(0x18)
-GEN_XXSEL_ROW(0x19)
-GEN_XXSEL_ROW(0x1A)
-GEN_XXSEL_ROW(0x1B)
-GEN_XXSEL_ROW(0x1C)
-GEN_XXSEL_ROW(0x1D)
-GEN_XXSEL_ROW(0x1E)
-GEN_XXSEL_ROW(0x1F)
-
GEN_XX3FORM_DM(xxpermdi, 0x08, 0x01),
--
2.25.1
* Re: [PATCH v4 27/47] target/ppc: Move xxsel to decodetree
2022-02-22 14:36 ` [PATCH v4 27/47] target/ppc: Move xxsel " matheus.ferst
@ 2022-02-22 22:38 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 22:38 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst<matheus.ferst@eldorado.org.br>
>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> target/ppc/insn32.decode | 6 ++++
> target/ppc/insn64.decode | 24 ++++++++--------
> target/ppc/translate/vsx-impl.c.inc | 20 ++++++--------
> target/ppc/translate/vsx-ops.c.inc | 43 -----------------------------
> 4 files changed, 26 insertions(+), 67 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
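A sketch of what the shared %xx_* fields extract, as I understand decodetree's field syntax: each pos:len pair gives the least-significant bit position and the width, and when a field lists several pairs they are concatenated with the first one as the most-significant part. The function names below are illustrative, not generated decoder code.

```c
#include <stdint.h>

/* Extract 'len' bits of 'insn' starting at bit 'pos' (bit 0 is the LSB). */
static uint32_t field(uint32_t insn, int pos, int len)
{
    return (insn >> pos) & ((1u << len) - 1);
}

/* %xx_xt 0:1 21:5  ->  TX is bit 0, T is bits 21..25 */
static uint32_t xx_xt(uint32_t insn)
{
    return (field(insn, 0, 1) << 5) | field(insn, 21, 5);
}

/* %xx_xb 1:1 11:5  ->  BX is bit 1, B is bits 11..15 */
static uint32_t xx_xb(uint32_t insn)
{
    return (field(insn, 1, 1) << 5) | field(insn, 11, 5);
}

/* %xx_xa 2:1 16:5  ->  AX is bit 2, A is bits 16..20 */
static uint32_t xx_xa(uint32_t insn)
{
    return (field(insn, 2, 1) << 5) | field(insn, 16, 5);
}

/* %xx_xc 3:1 6:5   ->  CX is bit 3, C is bits 6..10 */
static uint32_t xx_xc(uint32_t insn)
{
    return (field(insn, 3, 1) << 5) | field(insn, 6, 5);
}
```

Each result is a 6-bit VSR number in the range 0..63, with the extension bit selecting the upper 32 registers.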
* [PATCH v4 28/47] target/ppc: move xxperm/xxpermr to decodetree
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (26 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 27/47] target/ppc: Move xxsel " matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 22:40 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 29/47] target/ppc: Move xxpermdi " matheus.ferst
` (18 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/fpu_helper.c | 21 ---------------
target/ppc/helper.h | 2 --
target/ppc/insn32.decode | 5 ++++
target/ppc/translate/vsx-impl.c.inc | 42 +++++++++++++++++++++++++++--
target/ppc/translate/vsx-ops.c.inc | 2 --
5 files changed, 45 insertions(+), 27 deletions(-)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index bd76bee7f1..0fd285defc 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -3055,27 +3055,6 @@ uint64_t helper_xsrsp(CPUPPCState *env, uint64_t xb)
return xt;
}
-#define VSX_XXPERM(op, indexed) \
-void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
- ppc_vsr_t *xa, ppc_vsr_t *pcv) \
-{ \
- ppc_vsr_t t = *xt; \
- int i, idx; \
- \
- for (i = 0; i < 16; i++) { \
- idx = pcv->VsrB(i) & 0x1F; \
- if (indexed) { \
- idx = 31 - idx; \
- } \
- t.VsrB(i) = (idx <= 15) ? xa->VsrB(idx) \
- : xt->VsrB(idx - 16); \
- } \
- *xt = t; \
-}
-
-VSX_XXPERM(xxperm, 0)
-VSX_XXPERM(xxpermr, 1)
-
void helper_xvxsigsp(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb)
{
ppc_vsr_t t = { };
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index c57b3035ae..7514eebf6a 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -496,8 +496,6 @@ DEF_HELPER_3(xvrspic, void, env, vsr, vsr)
DEF_HELPER_3(xvrspim, void, env, vsr, vsr)
DEF_HELPER_3(xvrspip, void, env, vsr, vsr)
DEF_HELPER_3(xvrspiz, void, env, vsr, vsr)
-DEF_HELPER_4(xxperm, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xxpermr, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xxextractuw, void, env, vsr, vsr, i32)
DEF_HELPER_4(xxinsertw, void, env, vsr, vsr, i32)
DEF_HELPER_3(xvxsigsp, void, env, vsr, vsr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index ad2aa0257c..5fc29eabc6 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -602,6 +602,11 @@ STXVPX 011111 ..... ..... ..... 0111001101 - @X_TSXP
XXSPLTIB 111100 ..... 00 ........ 0101101000 . @X_imm8
XXSPLTW 111100 ..... ---.. ..... 010100100 . . @XX2
+## VSX Permute Instructions
+
+XXPERM 111100 ..... ..... ..... 00011010 ... @XX3
+XXPERMR 111100 ..... ..... ..... 00111010 ... @XX3
+
XXSEL 111100 ..... ..... ..... ..... 11 .... @XX4
## VSX Vector Load Special Value Instruction
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 48e4a2e266..7ce90f18a5 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1200,8 +1200,46 @@ GEN_VSX_HELPER_X2(xvrspip, 0x12, 0x0A, 0, PPC2_VSX)
GEN_VSX_HELPER_X2(xvrspiz, 0x12, 0x09, 0, PPC2_VSX)
GEN_VSX_HELPER_2(xvtstdcsp, 0x14, 0x1A, 0, PPC2_VSX)
GEN_VSX_HELPER_2(xvtstdcdp, 0x14, 0x1E, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xxperm, 0x08, 0x03, 0, PPC2_ISA300)
-GEN_VSX_HELPER_X3(xxpermr, 0x08, 0x07, 0, PPC2_ISA300)
+
+static bool trans_XXPERM(DisasContext *ctx, arg_XX3 *a)
+{
+ TCGv_ptr xt, xa, xb;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+ REQUIRE_VSX(ctx);
+
+ xt = gen_vsr_ptr(a->xt);
+ xa = gen_vsr_ptr(a->xa);
+ xb = gen_vsr_ptr(a->xb);
+
+ gen_helper_VPERM(xt, xa, xt, xb);
+
+ tcg_temp_free_ptr(xt);
+ tcg_temp_free_ptr(xa);
+ tcg_temp_free_ptr(xb);
+
+ return true;
+}
+
+static bool trans_XXPERMR(DisasContext *ctx, arg_XX3 *a)
+{
+ TCGv_ptr xt, xa, xb;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+ REQUIRE_VSX(ctx);
+
+ xt = gen_vsr_ptr(a->xt);
+ xa = gen_vsr_ptr(a->xa);
+ xb = gen_vsr_ptr(a->xb);
+
+ gen_helper_VPERMR(xt, xa, xt, xb);
+
+ tcg_temp_free_ptr(xt);
+ tcg_temp_free_ptr(xa);
+ tcg_temp_free_ptr(xb);
+
+ return true;
+}
#define GEN_VSX_HELPER_VSX_MADD(name, op1, aop, mop, inval, type) \
static void gen_##name(DisasContext *ctx) \
diff --git a/target/ppc/translate/vsx-ops.c.inc b/target/ppc/translate/vsx-ops.c.inc
index b0dbb38c80..86ed1a996a 100644
--- a/target/ppc/translate/vsx-ops.c.inc
+++ b/target/ppc/translate/vsx-ops.c.inc
@@ -341,8 +341,6 @@ VSX_LOGICAL(xxlnand, 0x8, 0x16, PPC2_VSX207),
VSX_LOGICAL(xxlorc, 0x8, 0x15, PPC2_VSX207),
GEN_XX3FORM(xxmrghw, 0x08, 0x02, PPC2_VSX),
GEN_XX3FORM(xxmrglw, 0x08, 0x06, PPC2_VSX),
-GEN_XX3FORM(xxperm, 0x08, 0x03, PPC2_ISA300),
-GEN_XX3FORM(xxpermr, 0x08, 0x07, PPC2_ISA300),
GEN_XX3FORM_DM(xxsldwi, 0x08, 0x00),
GEN_XX2FORM_EXT(xxextractuw, 0x0A, 0x0A, PPC2_ISA300),
GEN_XX2FORM_EXT(xxinsertw, 0x0A, 0x0B, PPC2_ISA300),
--
2.25.1
* Re: [PATCH v4 28/47] target/ppc: move xxperm/xxpermr to decodetree
2022-02-22 14:36 ` [PATCH v4 28/47] target/ppc: move xxperm/xxpermr " matheus.ferst
@ 2022-02-22 22:40 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 22:40 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst<matheus.ferst@eldorado.org.br>
>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> target/ppc/fpu_helper.c | 21 ---------------
> target/ppc/helper.h | 2 --
> target/ppc/insn32.decode | 5 ++++
> target/ppc/translate/vsx-impl.c.inc | 42 +++++++++++++++++++++++++++--
> target/ppc/translate/vsx-ops.c.inc | 2 --
> 5 files changed, 45 insertions(+), 27 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
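A sketch of why the VPERM helpers can be reused here, based on the removed VSX_XXPERM macro: xxperm selects bytes from the 32-byte concatenation xa || xt, so calling a generic permute with (a = xa, b = xt, c = xb) gives the same result. This assumes helper_VPERM and helper_VPERMR implement the same selection rule; the types and names below are illustrative, and byte index 0 is the most-significant byte, as with VsrB().

```c
#include <stdint.h>

/* Illustrative 128-bit register, bytes in PowerISA (big-endian) order. */
typedef struct {
    uint8_t b[16];
} vec128;

/* Generic permute over the concatenation a || b, controlled by pcv.
 * 'reversed' models the vpermr/xxpermr variant (index = 31 - index). */
static vec128 perm(vec128 a, vec128 b, vec128 pcv, int reversed)
{
    vec128 r;
    int i;

    for (i = 0; i < 16; i++) {
        int idx = pcv.b[i] & 0x1F;
        if (reversed) {
            idx = 31 - idx;
        }
        r.b[i] = (idx <= 15) ? a.b[idx] : b.b[idx - 16];
    }
    return r;
}

/* xxperm: the old target value xt is the second source. */
static vec128 xxperm(vec128 xt, vec128 xa, vec128 xb)
{
    return perm(xa, xt, xb, 0);
}

static vec128 xxpermr(vec128 xt, vec128 xa, vec128 xb)
{
    return perm(xa, xt, xb, 1);
}

/* Test-vector helper: b[i] = start + i * step. */
static vec128 iota(int start, int step)
{
    vec128 r;
    int i;

    for (i = 0; i < 16; i++) {
        r.b[i] = (uint8_t)(start + i * step);
    }
    return r;
}
```

Because the sketch passes registers by value, the aliasing of xt as both source and destination is harmless; the real helpers handle it by building the result in a temporary before storing.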
* [PATCH v4 29/47] target/ppc: Move xxpermdi to decodetree
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (27 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 28/47] target/ppc: move xxperm/xxpermr " matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 22:42 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 30/47] target/ppc: Implement xxpermx instruction matheus.ferst
` (17 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 4 ++
target/ppc/translate/vsx-impl.c.inc | 71 +++++++++++++----------------
target/ppc/translate/vsx-ops.c.inc | 2 -
3 files changed, 36 insertions(+), 41 deletions(-)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 5fc29eabc6..185d697458 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -155,6 +155,9 @@
&XX3 xt xa xb
@XX3 ...... ..... ..... ..... ........ ... &XX3 xt=%xx_xt xa=%xx_xa xb=%xx_xb
+&XX3_dm xt xa xb dm
+@XX3_dm ...... ..... ..... ..... . dm:2 ..... ... &XX3_dm xt=%xx_xt xa=%xx_xa xb=%xx_xb
+
&XX4 xt xa xb xc
@XX4 ...... ..... ..... ..... ..... .. .... &XX4 xt=%xx_xt xa=%xx_xa xb=%xx_xb xc=%xx_xc
@@ -606,6 +609,7 @@ XXSPLTW 111100 ..... ---.. ..... 010100100 . . @XX2
XXPERM 111100 ..... ..... ..... 00011010 ... @XX3
XXPERMR 111100 ..... ..... ..... 00111010 ... @XX3
+XXPERMDI 111100 ..... ..... ..... 0 .. 01010 ... @XX3_dm
XXSEL 111100 ..... ..... ..... ..... 11 .... @XX4
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 7ce90f18a5..cdefa13590 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -665,45 +665,6 @@ static void gen_mtvsrws(DisasContext *ctx)
#endif
-static void gen_xxpermdi(DisasContext *ctx)
-{
- TCGv_i64 xh, xl;
-
- if (unlikely(!ctx->vsx_enabled)) {
- gen_exception(ctx, POWERPC_EXCP_VSXU);
- return;
- }
-
- xh = tcg_temp_new_i64();
- xl = tcg_temp_new_i64();
-
- if (unlikely((xT(ctx->opcode) == xA(ctx->opcode)) ||
- (xT(ctx->opcode) == xB(ctx->opcode)))) {
- get_cpu_vsr(xh, xA(ctx->opcode), (DM(ctx->opcode) & 2) == 0);
- get_cpu_vsr(xl, xB(ctx->opcode), (DM(ctx->opcode) & 1) == 0);
-
- set_cpu_vsr(xT(ctx->opcode), xh, true);
- set_cpu_vsr(xT(ctx->opcode), xl, false);
- } else {
- if ((DM(ctx->opcode) & 2) == 0) {
- get_cpu_vsr(xh, xA(ctx->opcode), true);
- set_cpu_vsr(xT(ctx->opcode), xh, true);
- } else {
- get_cpu_vsr(xh, xA(ctx->opcode), false);
- set_cpu_vsr(xT(ctx->opcode), xh, true);
- }
- if ((DM(ctx->opcode) & 1) == 0) {
- get_cpu_vsr(xl, xB(ctx->opcode), true);
- set_cpu_vsr(xT(ctx->opcode), xl, false);
- } else {
- get_cpu_vsr(xl, xB(ctx->opcode), false);
- set_cpu_vsr(xT(ctx->opcode), xl, false);
- }
- }
- tcg_temp_free_i64(xh);
- tcg_temp_free_i64(xl);
-}
-
#define OP_ABS 1
#define OP_NABS 2
#define OP_NEG 3
@@ -1241,6 +1202,38 @@ static bool trans_XXPERMR(DisasContext *ctx, arg_XX3 *a)
return true;
}
+static bool trans_XXPERMDI(DisasContext *ctx, arg_XX3_dm *a)
+{
+ TCGv_i64 t0, t1;
+
+ REQUIRE_INSNS_FLAGS2(ctx, VSX);
+ REQUIRE_VSX(ctx);
+
+ t0 = tcg_temp_new_i64();
+
+ if (unlikely(a->xt == a->xa || a->xt == a->xb)) {
+ t1 = tcg_temp_new_i64();
+
+ get_cpu_vsr(t0, a->xa, (a->dm & 2) == 0);
+ get_cpu_vsr(t1, a->xb, (a->dm & 1) == 0);
+
+ set_cpu_vsr(a->xt, t0, true);
+ set_cpu_vsr(a->xt, t1, false);
+
+ tcg_temp_free_i64(t1);
+ } else {
+ get_cpu_vsr(t0, a->xa, (a->dm & 2) == 0);
+ set_cpu_vsr(a->xt, t0, true);
+
+ get_cpu_vsr(t0, a->xb, (a->dm & 1) == 0);
+ set_cpu_vsr(a->xt, t0, false);
+ }
+
+ tcg_temp_free_i64(t0);
+
+ return true;
+}
+
#define GEN_VSX_HELPER_VSX_MADD(name, op1, aop, mop, inval, type) \
static void gen_##name(DisasContext *ctx) \
{ \
diff --git a/target/ppc/translate/vsx-ops.c.inc b/target/ppc/translate/vsx-ops.c.inc
index 86ed1a996a..0a6b2b31ac 100644
--- a/target/ppc/translate/vsx-ops.c.inc
+++ b/target/ppc/translate/vsx-ops.c.inc
@@ -344,5 +344,3 @@ GEN_XX3FORM(xxmrglw, 0x08, 0x06, PPC2_VSX),
GEN_XX3FORM_DM(xxsldwi, 0x08, 0x00),
GEN_XX2FORM_EXT(xxextractuw, 0x0A, 0x0A, PPC2_ISA300),
GEN_XX2FORM_EXT(xxinsertw, 0x0A, 0x0B, PPC2_ISA300),
-
-GEN_XX3FORM_DM(xxpermdi, 0x08, 0x01),
--
2.25.1
* Re: [PATCH v4 29/47] target/ppc: Move xxpermdi to decodetree
2022-02-22 14:36 ` [PATCH v4 29/47] target/ppc: Move xxpermdi " matheus.ferst
@ 2022-02-22 22:42 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 22:42 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst<matheus.ferst@eldorado.org.br>
>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> target/ppc/insn32.decode | 4 ++
> target/ppc/translate/vsx-impl.c.inc | 71 +++++++++++++----------------
> target/ppc/translate/vsx-ops.c.inc | 2 -
> 3 files changed, 36 insertions(+), 41 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
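A scalar sketch of the selection trans_XXPERMDI performs, and of why the aliasing case reads both sources before writing. The vsr struct and function names are illustrative only.

```c
#include <stdint.h>

/* Illustrative VSR as two doublewords. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} vsr;

/* DM bit 1 (value 2) picks the doubleword of xa, DM bit 0 (value 1)
 * picks the doubleword of xb; a clear bit selects the high half. */
static vsr xxpermdi(vsr xa, vsr xb, unsigned dm)
{
    vsr xt;

    xt.hi = (dm & 2) ? xa.lo : xa.hi;
    xt.lo = (dm & 1) ? xb.lo : xb.hi;
    return xt;
}

/* In-place form: xt may alias xa or xb, so both source doublewords are
 * read into temporaries before either half of xt is written. Writing
 * xt->hi first would corrupt a later read of xb->hi when xt == xb. */
static void xxpermdi_inplace(vsr *xt, const vsr *xa, const vsr *xb,
                             unsigned dm)
{
    uint64_t t0 = (dm & 2) ? xa->lo : xa->hi;
    uint64_t t1 = (dm & 1) ? xb->lo : xb->hi;

    xt->hi = t0;
    xt->lo = t1;
}

/* Doubleword swap expressed as xxpermdi x, x, x, 2. */
static vsr swapd_inplace(vsr x)
{
    xxpermdi_inplace(&x, &x, &x, 2);
    return x;
}
```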
* [PATCH v4 30/47] target/ppc: Implement xxpermx instruction
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (28 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 29/47] target/ppc: Move xxpermdi " matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 22:46 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 31/47] tcg/tcg-op-gvec.c: Introduce tcg_gen_gvec_4i matheus.ferst
` (16 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/helper.h | 1 +
target/ppc/insn64.decode | 8 ++++++++
target/ppc/int_helper.c | 20 ++++++++++++++++++++
target/ppc/translate/vsx-impl.c.inc | 22 ++++++++++++++++++++++
4 files changed, 51 insertions(+)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 7514eebf6a..85a13057ca 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -497,6 +497,7 @@ DEF_HELPER_3(xvrspim, void, env, vsr, vsr)
DEF_HELPER_3(xvrspip, void, env, vsr, vsr)
DEF_HELPER_3(xvrspiz, void, env, vsr, vsr)
DEF_HELPER_4(xxextractuw, void, env, vsr, vsr, i32)
+DEF_HELPER_5(XXPERMX, void, vsr, vsr, vsr, vsr, tl)
DEF_HELPER_4(xxinsertw, void, env, vsr, vsr, i32)
DEF_HELPER_3(xvxsigsp, void, env, vsr, vsr)
DEF_HELPER_5(XXBLENDVB, void, vsr, vsr, vsr, vsr, i32)
diff --git a/target/ppc/insn64.decode b/target/ppc/insn64.decode
index 9e4f531fb9..0963e064b1 100644
--- a/target/ppc/insn64.decode
+++ b/target/ppc/insn64.decode
@@ -54,6 +54,11 @@
...... ..... ..... ..... ..... .. .... \
&8RR_XX4 xt=%8rr_xx_xt xa=%8rr_xx_xa xb=%8rr_xx_xb xc=%8rr_xx_xc
+&8RR_XX4_uim3 xt xa xb xc uim3
+@8RR_XX4_uim3 ...... .. .... .. ............... uim3:3 \
+ ...... ..... ..... ..... ..... .. .... \
+ &8RR_XX4_uim3 xt=%8rr_xx_xt xa=%8rr_xx_xa xb=%8rr_xx_xb xc=%8rr_xx_xc
+
### Fixed-Point Load Instructions
PLBZ 000001 10 0--.-- .................. \
@@ -194,3 +199,6 @@ XXBLENDVH 000001 01 0000 -- ------------------ \
100001 ..... ..... ..... ..... 01 .... @8RR_XX4
XXBLENDVB 000001 01 0000 -- ------------------ \
100001 ..... ..... ..... ..... 00 .... @8RR_XX4
+
+XXPERMX 000001 01 0000 -- --------------- ... \
+ 100010 ..... ..... ..... ..... 00 .... @8RR_XX4_uim3
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 05978b686d..a92a006c6d 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1031,6 +1031,26 @@ void helper_VMULOUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
mulu64(&r->VsrD(1), &r->VsrD(0), a->VsrD(1), b->VsrD(1));
}
+void helper_XXPERMX(ppc_vsr_t *t, ppc_vsr_t *s0, ppc_vsr_t *s1, ppc_vsr_t *pcv,
+ target_ulong uim)
+{
+ int i, idx;
+ ppc_vsr_t tmp = { .u64 = {0, 0} };
+
+ for (i = 0; i < ARRAY_SIZE(t->u8); i++) {
+ if ((pcv->VsrB(i) >> 5) == uim) {
+ idx = pcv->VsrB(i) & 0x1f;
+ if (idx < ARRAY_SIZE(t->u8)) {
+ tmp.VsrB(i) = s0->VsrB(idx);
+ } else {
+ tmp.VsrB(i) = s1->VsrB(idx - ARRAY_SIZE(t->u8));
+ }
+ }
+ }
+
+ *t = tmp;
+}
+
void helper_VPERM(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
{
ppc_avr_t result;
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index cdefa13590..92851b8926 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1234,6 +1234,28 @@ static bool trans_XXPERMDI(DisasContext *ctx, arg_XX3_dm *a)
return true;
}
+static bool trans_XXPERMX(DisasContext *ctx, arg_8RR_XX4_uim3 *a)
+{
+ TCGv_ptr xt, xa, xb, xc;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VSX(ctx);
+
+ xt = gen_vsr_ptr(a->xt);
+ xa = gen_vsr_ptr(a->xa);
+ xb = gen_vsr_ptr(a->xb);
+ xc = gen_vsr_ptr(a->xc);
+
+ gen_helper_XXPERMX(xt, xa, xb, xc, tcg_constant_tl(a->uim3));
+
+ tcg_temp_free_ptr(xt);
+ tcg_temp_free_ptr(xa);
+ tcg_temp_free_ptr(xb);
+ tcg_temp_free_ptr(xc);
+
+ return true;
+}
+
#define GEN_VSX_HELPER_VSX_MADD(name, op1, aop, mop, inval, type) \
static void gen_##name(DisasContext *ctx) \
{ \
--
2.25.1
* Re: [PATCH v4 30/47] target/ppc: Implement xxpermx instruction
2022-02-22 14:36 ` [PATCH v4 30/47] target/ppc: Implement xxpermx instruction matheus.ferst
@ 2022-02-22 22:46 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 22:46 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst <matheus.ferst@eldorado.org.br>
>
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> target/ppc/helper.h | 1 +
> target/ppc/insn64.decode | 8 ++++++++
> target/ppc/int_helper.c | 20 ++++++++++++++++++++
> target/ppc/translate/vsx-impl.c.inc | 22 ++++++++++++++++++++++
> 4 files changed, 51 insertions(+)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
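The selection logic of helper_XXPERMX can be tried outside QEMU with a small model. This is a sketch under stated assumptions: plain 16-byte arrays in PowerISA (big-endian) byte order replace ppc_vsr_t and the VsrB() accessor, and the name xxpermx_model is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of the helper: for each byte of the permute control
 * vector, the top three bits must match UIM for the byte to be selected;
 * the low five bits index into the 32-byte concatenation s0 || s1.
 * Unselected result bytes stay zero.
 */
static void xxpermx_model(uint8_t t[16], const uint8_t s0[16],
                          const uint8_t s1[16], const uint8_t pcv[16],
                          unsigned uim)
{
    uint8_t tmp[16] = { 0 };

    assert(uim < 8);
    for (int i = 0; i < 16; i++) {
        if ((unsigned)(pcv[i] >> 5) == uim) {
            unsigned idx = pcv[i] & 0x1f;
            tmp[i] = idx < 16 ? s0[idx] : s1[idx - 16];
        }
    }
    /* Write through a temporary so t may alias any of the inputs. */
    memcpy(t, tmp, 16);
}
```

As in the patch, building the result in a temporary keeps the model correct when the target register is also one of the sources.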
* [PATCH v4 31/47] tcg/tcg-op-gvec.c: Introduce tcg_gen_gvec_4i
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (29 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 30/47] target/ppc: Implement xxpermx instruction matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 23:04 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 32/47] target/ppc: Implement xxeval matheus.ferst
` (15 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Following the implementation of tcg_gen_gvec_3i, add a four-vector and
immediate operand expansion method.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
include/tcg/tcg-op-gvec.h | 22 ++++++
tcg/tcg-op-gvec.c | 146 ++++++++++++++++++++++++++++++++++++++
2 files changed, 168 insertions(+)
diff --git a/include/tcg/tcg-op-gvec.h b/include/tcg/tcg-op-gvec.h
index da55fed870..28cafbcc5c 100644
--- a/include/tcg/tcg-op-gvec.h
+++ b/include/tcg/tcg-op-gvec.h
@@ -218,6 +218,25 @@ typedef struct {
bool write_aofs;
} GVecGen4;
+typedef struct {
+ /*
+ * Expand inline as a 64-bit or 32-bit integer. Only one of these will be
+ * non-NULL.
+ */
+ void (*fni8)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64, int64_t);
+ void (*fni4)(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32, int32_t);
+ /* Expand inline with a host vector type. */
+ void (*fniv)(unsigned, TCGv_vec, TCGv_vec, TCGv_vec, TCGv_vec, int64_t);
+ /* Expand out-of-line helper w/descriptor, data in descriptor. */
+ gen_helper_gvec_4 *fno;
+ /* The optional opcodes, if any, utilized by .fniv. */
+ const TCGOpcode *opt_opc;
+ /* The vector element size, if applicable. */
+ uint8_t vece;
+ /* Prefer i64 to v64. */
+ bool prefer_i64;
+} GVecGen4i;
+
void tcg_gen_gvec_2(uint32_t dofs, uint32_t aofs,
uint32_t oprsz, uint32_t maxsz, const GVecGen2 *);
void tcg_gen_gvec_2i(uint32_t dofs, uint32_t aofs, uint32_t oprsz,
@@ -231,6 +250,9 @@ void tcg_gen_gvec_3i(uint32_t dofs, uint32_t aofs, uint32_t bofs,
const GVecGen3i *);
void tcg_gen_gvec_4(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t cofs,
uint32_t oprsz, uint32_t maxsz, const GVecGen4 *);
+void tcg_gen_gvec_4i(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t cofs,
+ uint32_t oprsz, uint32_t maxsz, int64_t c,
+ const GVecGen4i *);
/* Expand a specific vector operation. */
diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index ffe55e908f..079a761b04 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -836,6 +836,30 @@ static void expand_4_i32(uint32_t dofs, uint32_t aofs, uint32_t bofs,
tcg_temp_free_i32(t0);
}
+static void expand_4i_i32(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+ uint32_t cofs, uint32_t oprsz, int32_t c,
+ void (*fni)(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32,
+ int32_t))
+{
+ TCGv_i32 t0 = tcg_temp_new_i32();
+ TCGv_i32 t1 = tcg_temp_new_i32();
+ TCGv_i32 t2 = tcg_temp_new_i32();
+ TCGv_i32 t3 = tcg_temp_new_i32();
+ uint32_t i;
+
+ for (i = 0; i < oprsz; i += 4) {
+ tcg_gen_ld_i32(t1, cpu_env, aofs + i);
+ tcg_gen_ld_i32(t2, cpu_env, bofs + i);
+ tcg_gen_ld_i32(t3, cpu_env, cofs + i);
+ fni(t0, t1, t2, t3, c);
+ tcg_gen_st_i32(t0, cpu_env, dofs + i);
+ }
+ tcg_temp_free_i32(t3);
+ tcg_temp_free_i32(t2);
+ tcg_temp_free_i32(t1);
+ tcg_temp_free_i32(t0);
+}
+
/* Expand OPSZ bytes worth of two-operand operations using i64 elements. */
static void expand_2_i64(uint32_t dofs, uint32_t aofs, uint32_t oprsz,
bool load_dest, void (*fni)(TCGv_i64, TCGv_i64))
@@ -971,6 +995,30 @@ static void expand_4_i64(uint32_t dofs, uint32_t aofs, uint32_t bofs,
tcg_temp_free_i64(t0);
}
+static void expand_4i_i64(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+ uint32_t cofs, uint32_t oprsz, int64_t c,
+ void (*fni)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64,
+ int64_t))
+{
+ TCGv_i64 t0 = tcg_temp_new_i64();
+ TCGv_i64 t1 = tcg_temp_new_i64();
+ TCGv_i64 t2 = tcg_temp_new_i64();
+ TCGv_i64 t3 = tcg_temp_new_i64();
+ uint32_t i;
+
+ for (i = 0; i < oprsz; i += 8) {
+ tcg_gen_ld_i64(t1, cpu_env, aofs + i);
+ tcg_gen_ld_i64(t2, cpu_env, bofs + i);
+ tcg_gen_ld_i64(t3, cpu_env, cofs + i);
+ fni(t0, t1, t2, t3, c);
+ tcg_gen_st_i64(t0, cpu_env, dofs + i);
+ }
+ tcg_temp_free_i64(t3);
+ tcg_temp_free_i64(t2);
+ tcg_temp_free_i64(t1);
+ tcg_temp_free_i64(t0);
+}
+
/* Expand OPSZ bytes worth of two-operand operations using host vectors. */
static void expand_2_vec(unsigned vece, uint32_t dofs, uint32_t aofs,
uint32_t oprsz, uint32_t tysz, TCGType type,
@@ -1121,6 +1169,35 @@ static void expand_4_vec(unsigned vece, uint32_t dofs, uint32_t aofs,
tcg_temp_free_vec(t0);
}
+/*
+ * Expand OPSZ bytes worth of four-vector operands and an immediate operand
+ * using host vectors.
+ */
+static void expand_4i_vec(unsigned vece, uint32_t dofs, uint32_t aofs,
+ uint32_t bofs, uint32_t cofs, uint32_t oprsz,
+ uint32_t tysz, TCGType type, int64_t c,
+ void (*fni)(unsigned, TCGv_vec, TCGv_vec,
+ TCGv_vec, TCGv_vec, int64_t))
+{
+ TCGv_vec t0 = tcg_temp_new_vec(type);
+ TCGv_vec t1 = tcg_temp_new_vec(type);
+ TCGv_vec t2 = tcg_temp_new_vec(type);
+ TCGv_vec t3 = tcg_temp_new_vec(type);
+ uint32_t i;
+
+ for (i = 0; i < oprsz; i += tysz) {
+ tcg_gen_ld_vec(t1, cpu_env, aofs + i);
+ tcg_gen_ld_vec(t2, cpu_env, bofs + i);
+ tcg_gen_ld_vec(t3, cpu_env, cofs + i);
+ fni(vece, t0, t1, t2, t3, c);
+ tcg_gen_st_vec(t0, cpu_env, dofs + i);
+ }
+ tcg_temp_free_vec(t3);
+ tcg_temp_free_vec(t2);
+ tcg_temp_free_vec(t1);
+ tcg_temp_free_vec(t0);
+}
+
/* Expand a vector two-operand operation. */
void tcg_gen_gvec_2(uint32_t dofs, uint32_t aofs,
uint32_t oprsz, uint32_t maxsz, const GVecGen2 *g)
@@ -1533,6 +1610,75 @@ void tcg_gen_gvec_4(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t cofs,
}
}
+/* Expand a vector operation with four vectors and an immediate. */
+void tcg_gen_gvec_4i(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t cofs,
+ uint32_t oprsz, uint32_t maxsz, int64_t c,
+ const GVecGen4i *g)
+{
+ const TCGOpcode *this_list = g->opt_opc ? : vecop_list_empty;
+ const TCGOpcode *hold_list = tcg_swap_vecop_list(this_list);
+ TCGType type;
+ uint32_t some;
+
+ check_size_align(oprsz, maxsz, dofs | aofs | bofs | cofs);
+ check_overlap_4(dofs, aofs, bofs, cofs, maxsz);
+
+ type = 0;
+ if (g->fniv) {
+ type = choose_vector_type(g->opt_opc, g->vece, oprsz, g->prefer_i64);
+ }
+ switch (type) {
+ case TCG_TYPE_V256:
+ /*
+ * Recall that ARM SVE allows vector sizes that are not a
+ * power of 2, but always a multiple of 16. The intent is
+ * that e.g. size == 80 would be expanded with 2x32 + 1x16.
+ */
+ some = QEMU_ALIGN_DOWN(oprsz, 32);
+ expand_4i_vec(g->vece, dofs, aofs, bofs, cofs, some,
+ 32, TCG_TYPE_V256, c, g->fniv);
+ if (some == oprsz) {
+ break;
+ }
+ dofs += some;
+ aofs += some;
+ bofs += some;
+ cofs += some;
+ oprsz -= some;
+ maxsz -= some;
+ /* fallthru */
+ case TCG_TYPE_V128:
+ expand_4i_vec(g->vece, dofs, aofs, bofs, cofs, oprsz,
+ 16, TCG_TYPE_V128, c, g->fniv);
+ break;
+ case TCG_TYPE_V64:
+ expand_4i_vec(g->vece, dofs, aofs, bofs, cofs, oprsz,
+ 8, TCG_TYPE_V64, c, g->fniv);
+ break;
+
+ case 0:
+ if (g->fni8 && check_size_impl(oprsz, 8)) {
+ expand_4i_i64(dofs, aofs, bofs, cofs, oprsz, c, g->fni8);
+ } else if (g->fni4 && check_size_impl(oprsz, 4)) {
+ expand_4i_i32(dofs, aofs, bofs, cofs, oprsz, c, g->fni4);
+ } else {
+ assert(g->fno != NULL);
+ tcg_gen_gvec_4_ool(dofs, aofs, bofs, cofs,
+ oprsz, maxsz, c, g->fno);
+ oprsz = maxsz;
+ }
+ break;
+
+ default:
+ g_assert_not_reached();
+ }
+ tcg_swap_vecop_list(hold_list);
+
+ if (oprsz < maxsz) {
+ expand_clr(dofs + oprsz, maxsz - oprsz);
+ }
+}
+
/*
* Expand specific vector operations.
*/
--
2.25.1
* Re: [PATCH v4 31/47] tcg/tcg-op-gvec.c: Introduce tcg_gen_gvec_4i
2022-02-22 14:36 ` [PATCH v4 31/47] tcg/tcg-op-gvec.c: Introduce tcg_gen_gvec_4i matheus.ferst
@ 2022-02-22 23:04 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 23:04 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst <matheus.ferst@eldorado.org.br>
>
> Following the implementation of tcg_gen_gvec_3i, add a four-vector and
> immediate operand expansion method.
>
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> include/tcg/tcg-op-gvec.h | 22 ++++++
> tcg/tcg-op-gvec.c | 146 ++++++++++++++++++++++++++++++++++++++
> 2 files changed, 168 insertions(+)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
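As a reading aid for the integer fallback path, the following sketch models what the code generated by expand_4i_i64 plus the trailing expand_clr computes at run time. It assumes ordinary memory buffers instead of cpu_env offsets and runs the operation directly rather than emitting TCG ops; gvec_4i_model, fn4i and example_op are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-lane callback: three 64-bit inputs plus an immediate. */
typedef uint64_t (*fn4i)(uint64_t a, uint64_t b, uint64_t c, int64_t imm);

/*
 * Apply fn to each 64-bit lane of the first oprsz bytes, then zero the
 * bytes between oprsz and maxsz, mirroring the tail clear in the patch.
 */
static void gvec_4i_model(void *d, const void *a, const void *b,
                          const void *c, uint32_t oprsz, uint32_t maxsz,
                          int64_t imm, fn4i fn)
{
    assert(oprsz % 8 == 0 && oprsz <= maxsz);

    for (uint32_t i = 0; i < oprsz; i += 8) {
        uint64_t va, vb, vc, vd;

        /* Load all inputs of a lane before storing its result. */
        memcpy(&va, (const unsigned char *)a + i, 8);
        memcpy(&vb, (const unsigned char *)b + i, 8);
        memcpy(&vc, (const unsigned char *)c + i, 8);
        vd = fn(va, vb, vc, imm);
        memcpy((unsigned char *)d + i, &vd, 8);
    }
    if (oprsz < maxsz) {
        memset((unsigned char *)d + oprsz, 0, maxsz - oprsz);
    }
}

/* Example lane operation, chosen only for illustration. */
static uint64_t example_op(uint64_t a, uint64_t b, uint64_t c, int64_t imm)
{
    return ((a & b) ^ c) + (uint64_t)imm;
}
```

The immediate is a translate-time constant in the real expansion, which is why it is passed by value to the callback rather than loaded from a register.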
* [PATCH v4 32/47] target/ppc: Implement xxeval
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (30 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 31/47] tcg/tcg-op-gvec.c: Introduce tcg_gen_gvec_4i matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 23:43 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 33/47] target/ppc: Implement xxgenpcv[bhwd]m instruction matheus.ferst
` (14 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/helper.h | 1 +
target/ppc/insn64.decode | 8 ++
target/ppc/int_helper.c | 42 ++++++++++
target/ppc/translate/vsx-impl.c.inc | 121 ++++++++++++++++++++++++++++
4 files changed, 172 insertions(+)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 85a13057ca..b8c818f573 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -500,6 +500,7 @@ DEF_HELPER_4(xxextractuw, void, env, vsr, vsr, i32)
DEF_HELPER_5(XXPERMX, void, vsr, vsr, vsr, vsr, tl)
DEF_HELPER_4(xxinsertw, void, env, vsr, vsr, i32)
DEF_HELPER_3(xvxsigsp, void, env, vsr, vsr)
+DEF_HELPER_5(XXEVAL, void, vsr, vsr, vsr, vsr, i32)
DEF_HELPER_5(XXBLENDVB, void, vsr, vsr, vsr, vsr, i32)
DEF_HELPER_5(XXBLENDVH, void, vsr, vsr, vsr, vsr, i32)
DEF_HELPER_5(XXBLENDVW, void, vsr, vsr, vsr, vsr, i32)
diff --git a/target/ppc/insn64.decode b/target/ppc/insn64.decode
index 0963e064b1..fdb859f62d 100644
--- a/target/ppc/insn64.decode
+++ b/target/ppc/insn64.decode
@@ -54,6 +54,11 @@
...... ..... ..... ..... ..... .. .... \
&8RR_XX4 xt=%8rr_xx_xt xa=%8rr_xx_xa xb=%8rr_xx_xb xc=%8rr_xx_xc
+&8RR_XX4_imm xt xa xb xc imm
+@8RR_XX4_imm ........ ........ ........ imm:8 \
+ ...... ..... ..... ..... ..... .. .... \
+ &8RR_XX4_imm xt=%8rr_xx_xt xa=%8rr_xx_xa xb=%8rr_xx_xb xc=%8rr_xx_xc
+
&8RR_XX4_uim3 xt xa xb xc uim3
@8RR_XX4_uim3 ...... .. .... .. ............... uim3:3 \
...... ..... ..... ..... ..... .. .... \
@@ -184,6 +189,9 @@ PLXVP 000001 00 0--.-- .................. \
PSTXVP 000001 00 0--.-- .................. \
111110 ..... ..... ................ @8LS_D_TSXP
+XXEVAL 000001 01 0000 -- ---------- ........ \
+ 100010 ..... ..... ..... ..... 01 .... @8RR_XX4_imm
+
XXSPLTIDP 000001 01 0000 -- -- ................ \
100000 ..... 0010 . ................ @8RR_D
XXSPLTIW 000001 01 0000 -- -- ................ \
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index a92a006c6d..255645ef1d 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -28,6 +28,7 @@
#include "fpu/softfloat.h"
#include "qapi/error.h"
#include "qemu/guest-random.h"
+#include "tcg/tcg-gvec-desc.h"
#include "helper_regs.h"
/*****************************************************************************/
@@ -1588,6 +1589,47 @@ void helper_xxinsertw(CPUPPCState *env, ppc_vsr_t *xt,
*xt = t;
}
+void helper_XXEVAL(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c,
+ uint32_t desc)
+{
+ /*
+ * Instead of processing imm bit-by-bit, we'll skip the computation of
+ * conjunctions whose corresponding bit is unset.
+ */
+ int bit, imm = simd_data(desc);
+ Int128 conj, disj = int128_zero();
+
+ /* Iterate over set bits from the least to the most significant bit */
+ while (imm) {
+ /*
+ * Get the next bit to be processed with ctzl. Invert the result of
+ * ctzl to match the indexing used by PowerISA.
+ */
+ bit = 7 - ctzl(imm);
+ if (bit & 0x4) {
+ conj = a->s128;
+ } else {
+ conj = int128_not(a->s128);
+ }
+ if (bit & 0x2) {
+ conj = int128_and(conj, b->s128);
+ } else {
+ conj = int128_and(conj, int128_not(b->s128));
+ }
+ if (bit & 0x1) {
+ conj = int128_and(conj, c->s128);
+ } else {
+ conj = int128_and(conj, int128_not(c->s128));
+ }
+ disj = int128_or(disj, conj);
+
+ /* Unset the least significant bit that is set */
+ imm &= imm - 1;
+ }
+
+ t->s128 = disj;
+}
+
#define XXBLEND(name, sz) \
void glue(helper_XXBLENDV, name)(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b, \
ppc_avr_t *c, uint32_t desc) \
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 92851b8926..d389ca2a83 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -2167,6 +2167,127 @@ TRANS64_FLAGS2(ISA310, PLXV, do_lstxv_PLS_D, false, false)
TRANS64_FLAGS2(ISA310, PSTXVP, do_lstxv_PLS_D, true, true)
TRANS64_FLAGS2(ISA310, PLXVP, do_lstxv_PLS_D, false, true)
+static void gen_xxeval_i64(TCGv_i64 t, TCGv_i64 a, TCGv_i64 b, TCGv_i64 c,
+ int64_t imm)
+{
+ /*
+ * Instead of processing imm bit-by-bit, we'll skip the computation of
+ * conjunctions whose corresponding bit is unset.
+ */
+ int bit;
+ TCGv_i64 conj, disj;
+
+ conj = tcg_temp_new_i64();
+ disj = tcg_temp_new_i64();
+
+ tcg_gen_movi_i64(disj, 0);
+
+ /* Iterate over set bits from the least to the most significant bit */
+ while (imm) {
+ /*
+ * Get the next bit to be processed with ctz64. Invert the result of
+ * ctz64 to match the indexing used by PowerISA.
+ */
+ bit = 7 - ctz64(imm);
+ if (bit & 0x4) {
+ tcg_gen_mov_i64(conj, a);
+ } else {
+ tcg_gen_not_i64(conj, a);
+ }
+ if (bit & 0x2) {
+ tcg_gen_and_i64(conj, conj, b);
+ } else {
+ tcg_gen_andc_i64(conj, conj, b);
+ }
+ if (bit & 0x1) {
+ tcg_gen_and_i64(conj, conj, c);
+ } else {
+ tcg_gen_andc_i64(conj, conj, c);
+ }
+ tcg_gen_or_i64(disj, disj, conj);
+
+ /* Unset the least significant bit that is set */
+ imm &= imm - 1;
+ }
+
+ tcg_gen_mov_i64(t, disj);
+
+ tcg_temp_free_i64(conj);
+ tcg_temp_free_i64(disj);
+}
+
+static void gen_xxeval_vec(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b,
+ TCGv_vec c, int64_t imm)
+{
+ /*
+ * Instead of processing imm bit-by-bit, we'll skip the computation of
+ * conjunctions whose corresponding bit is unset.
+ */
+ int bit;
+ TCGv_vec disj, conj;
+
+ disj = tcg_temp_new_vec_matching(t);
+ conj = tcg_temp_new_vec_matching(t);
+
+ tcg_gen_dupi_vec(vece, disj, 0);
+
+ /* Iterate over set bits from the least to the most significant bit */
+ while (imm) {
+ /*
+ * Get the next bit to be processed with ctz64. Invert the result of
+ * ctz64 to match the indexing used by PowerISA.
+ */
+ bit = 7 - ctz64(imm);
+ if (bit & 0x4) {
+ tcg_gen_mov_vec(conj, a);
+ } else {
+ tcg_gen_not_vec(vece, conj, a);
+ }
+ if (bit & 0x2) {
+ tcg_gen_and_vec(vece, conj, conj, b);
+ } else {
+ tcg_gen_andc_vec(vece, conj, conj, b);
+ }
+ if (bit & 0x1) {
+ tcg_gen_and_vec(vece, conj, conj, c);
+ } else {
+ tcg_gen_andc_vec(vece, conj, conj, c);
+ }
+ tcg_gen_or_vec(vece, disj, disj, conj);
+
+ /* Unset the least significant bit that is set */
+ imm &= imm - 1;
+ }
+
+ tcg_gen_mov_vec(t, disj);
+
+ tcg_temp_free_vec(disj);
+ tcg_temp_free_vec(conj);
+}
+
+static bool trans_XXEVAL(DisasContext *ctx, arg_8RR_XX4_imm *a)
+{
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VSX(ctx);
+
+ static const TCGOpcode vecop_list[] = {
+ INDEX_op_andc_vec, 0
+ };
+ static const GVecGen4i op = {
+ .fniv = gen_xxeval_vec,
+ .fno = gen_helper_XXEVAL,
+ .fni8 = gen_xxeval_i64,
+ .opt_opc = vecop_list,
+ .vece = MO_64
+ };
+
+ tcg_gen_gvec_4i(vsr_full_offset(a->xt), vsr_full_offset(a->xa),
+ vsr_full_offset(a->xb), vsr_full_offset(a->xc),
+ 16, 16, a->imm, &op);
+
+ return true;
+}
+
static void gen_xxblendv_vec(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b,
TCGv_vec c)
{
--
2.25.1
* Re: [PATCH v4 32/47] target/ppc: Implement xxeval
2022-02-22 14:36 ` [PATCH v4 32/47] target/ppc: Implement xxeval matheus.ferst
@ 2022-02-22 23:43 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 23:43 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> + tcg_gen_movi_i64(disj, 0);
The init here means there's one more OR generated than necessary. Though perhaps it gets
folded away...
> +
> + /* Iterate over set bits from the least to the most significant bit */
> + while (imm) {
> + /*
> + * Get the next bit to be processed with ctz64. Invert the result of
> + * ctz64 to match the indexing used by PowerISA.
> + */
> + bit = 7 - ctz64(imm);
> + if (bit & 0x4) {
> + tcg_gen_mov_i64(conj, a);
> + } else {
> + tcg_gen_not_i64(conj, a);
> + }
> + if (bit & 0x2) {
> + tcg_gen_and_i64(conj, conj, b);
> + } else {
> + tcg_gen_andc_i64(conj, conj, b);
> + }
> + if (bit & 0x1) {
> + tcg_gen_and_i64(conj, conj, c);
> + } else {
> + tcg_gen_andc_i64(conj, conj, c);
> + }
> + tcg_gen_or_i64(disj, disj, conj);
> +
> + /* Unset the least significant bit that is set */
> + imm &= imm - 1;
I guess this works, though it's not nearly optimal.
It's certainly a good fallback for the out-of-line function.
Table 145 has the folded equivalent functions. Implementing all 256 of them as is, twice,
for both i64 and vec, could be tedious. But we could cherry-pick the easiest or most
commonly used ones and let all other imm values go through to the out-of-line function.
r~
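The two review points can be illustrated with a standalone model on 64-bit lanes. This is only a sketch: the function names are hypothetical, the folded cases were derived by hand from the minterm definition used in the patch rather than copied from the ISA table, and the choice of which imm values to fold is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Generic form: OR together one minterm per set bit of imm. The first
 * minterm initialises the accumulator, so no OR with zero is needed.
 */
static uint64_t xxeval_minterms(uint64_t a, uint64_t b, uint64_t c,
                                unsigned imm)
{
    uint64_t disj = 0;
    bool first = true;

    assert(imm <= 0xff);
    for (unsigned pos = 0; pos < 8; pos++) {
        if (!(imm & (1u << pos))) {
            continue;
        }
        /* PowerISA numbers the bits of imm starting from the MSB. */
        unsigned bit = 7 - pos;
        uint64_t conj = (bit & 4) ? a : ~a;
        conj &= (bit & 2) ? b : ~b;
        conj &= (bit & 1) ? c : ~c;
        disj = first ? conj : (disj | conj);
        first = false;
    }
    return disj;
}

/* A few folded forms; returns false when the caller should fall back. */
static bool xxeval_folded(uint64_t *t, uint64_t a, uint64_t b, uint64_t c,
                          unsigned imm)
{
    switch (imm) {
    case 0x00: *t = 0;              return true;
    case 0x01: *t = a & b & c;      return true;
    case 0x0f: *t = a;              return true;
    case 0x33: *t = b;              return true;
    case 0x55: *t = c;              return true;
    case 0x69: *t = a ^ b ^ c;      return true;
    case 0x7f: *t = a | b | c;      return true;
    case 0xff: *t = ~(uint64_t)0;   return true;
    default:                        return false;
    }
}

/* Try the folded form first, otherwise use the generic minterm loop. */
static uint64_t xxeval_model(uint64_t a, uint64_t b, uint64_t c,
                             unsigned imm)
{
    uint64_t t;

    return xxeval_folded(&t, a, b, c, imm) ? t
                                           : xxeval_minterms(a, b, c, imm);
}
```

A convenient check for any folded entry is to feed the lanes 0x0f, 0x33 and 0x55 as a, b and c: the generic loop then reproduces imm itself in every byte, so the folded expression must do the same.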
* [PATCH v4 33/47] target/ppc: Implement xxgenpcv[bhwd]m instruction
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (31 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 32/47] target/ppc: Implement xxeval matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 23:48 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 34/47] target/ppc: move xs[n]madd[am][ds]p/xs[n]msub[am][ds]p to decodetree matheus.ferst
` (13 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/helper.h | 4 ++
target/ppc/insn32.decode | 10 ++++
target/ppc/int_helper.c | 84 +++++++++++++++++++++++++++++
target/ppc/translate/vsx-impl.c.inc | 29 ++++++++++
4 files changed, 127 insertions(+)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index b8c818f573..9751871370 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -496,6 +496,10 @@ DEF_HELPER_3(xvrspic, void, env, vsr, vsr)
DEF_HELPER_3(xvrspim, void, env, vsr, vsr)
DEF_HELPER_3(xvrspip, void, env, vsr, vsr)
DEF_HELPER_3(xvrspiz, void, env, vsr, vsr)
+DEF_HELPER_3(XXGENPCVBM, void, vsr, avr, tl)
+DEF_HELPER_3(XXGENPCVHM, void, vsr, avr, tl)
+DEF_HELPER_3(XXGENPCVWM, void, vsr, avr, tl)
+DEF_HELPER_3(XXGENPCVDM, void, vsr, avr, tl)
DEF_HELPER_4(xxextractuw, void, env, vsr, vsr, i32)
DEF_HELPER_5(XXPERMX, void, vsr, vsr, vsr, vsr, tl)
DEF_HELPER_4(xxinsertw, void, env, vsr, vsr, i32)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 185d697458..b11a3ee29a 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -119,6 +119,9 @@
@X_bfl ...... bf:3 - l:1 ra:5 rb:5 ..........- &X_bfl
%x_xt 0:1 21:5
+&X_imm5 xt imm:uint8_t vrb
+@X_imm5 ...... ..... imm:5 vrb:5 .......... . &X_imm5 xt=%x_xt
+
&X_imm8 xt imm:uint8_t
@X_imm8 ...... ..... .. imm:8 .......... . &X_imm8 xt=%x_xt
@@ -613,6 +616,13 @@ XXPERMDI 111100 ..... ..... ..... 0 .. 01010 ... @XX3_dm
XXSEL 111100 ..... ..... ..... ..... 11 .... @XX4
+## VSX Vector Generate PCV
+
+XXGENPCVBM 111100 ..... ..... ..... 1110010100 . @X_imm5
+XXGENPCVHM 111100 ..... ..... ..... 1110010101 . @X_imm5
+XXGENPCVWM 111100 ..... ..... ..... 1110110100 . @X_imm5
+XXGENPCVDM 111100 ..... ..... ..... 1110110101 . @X_imm5
+
## VSX Vector Load Special Value Instruction
LXVKQ 111100 ..... 11111 ..... 0101101000 . @X_uim5
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 255645ef1d..dc106aaab9 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1088,6 +1088,90 @@ void helper_VPERMR(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
*r = result;
}
+#define XXGENPCV(NAME, SZ) \
+void helper_##NAME(ppc_vsr_t *t, ppc_vsr_t *b, target_ulong imm) \
+{ \
+ ppc_vsr_t tmp = { .u64 = { 0, 0 } }; \
+ \
+ switch (imm) { \
+ case 0b00000: /* Big-Endian expansion */ \
+ /* Initialize tmp with the result of an all-zeros mask */ \
+ tmp.VsrD(0) = 0x1011121314151617; \
+ tmp.VsrD(1) = 0x18191A1B1C1D1E1F; \
+ \
+ /* Iterate over the most significant byte of each element */ \
+ for (int i = 0, j = 0; i < ARRAY_SIZE(b->u8); i += SZ) { \
+ if (b->VsrB(i) & 0x80) { \
+ /* Update each byte of the element */ \
+ for (int k = 0; k < SZ; k++) { \
+ tmp.VsrB(i + k) = j + k; \
+ } \
+ j += SZ; \
+ } \
+ } \
+ \
+ break; \
+ case 0b00001: /* Big-Endian compression */ \
+ /* Iterate over the most significant byte of each element */ \
+ for (int i = 0, j = 0; i < ARRAY_SIZE(b->u8); i += SZ) { \
+ if (b->VsrB(i) & 0x80) { \
+ /* Update each byte of the element */ \
+ for (int k = 0; k < SZ; k++) { \
+ tmp.VsrB(j + k) = i + k; \
+ } \
+ j += SZ; \
+ } \
+ } \
+ \
+ break; \
+ case 0b00010: /* Little-Endian expansion */ \
+ /* Initialize tmp with the result of an all-zeros mask */ \
+ tmp.VsrD(0) = 0x1F1E1D1C1B1A1918; \
+ tmp.VsrD(1) = 0x1716151413121110; \
+ \
+ /* Iterate over the most significant byte of each element */ \
+ for (int i = 0, j = 0; i < ARRAY_SIZE(b->u8); i += SZ) { \
+ /* Reverse indexing of "i" */ \
+ const int idx = ARRAY_SIZE(b->u8) - i - SZ; \
+ if (b->VsrB(idx) & 0x80) { \
+ /* Update each byte of the element */ \
+ for (int k = 0, rk = SZ - 1; k < SZ; k++, rk--) { \
+ tmp.VsrB(idx + rk) = j + k; \
+ } \
+ j += SZ; \
+ } \
+ } \
+ \
+ break; \
+ case 0b00011: /* Little-Endian compression */ \
+ /* Iterate over the most significant byte of each element */ \
+ for (int i = 0, j = 0; i < ARRAY_SIZE(b->u8); i += SZ) { \
+ if (b->VsrB(ARRAY_SIZE(b->u8) - i - SZ) & 0x80) { \
+ /* Update each byte of the element */ \
+ for (int k = 0, rk = SZ - 1; k < SZ; k++, rk--) { \
+ /* Reverse indexing of "j" */ \
+ const int idx = ARRAY_SIZE(b->u8) - j - SZ; \
+ tmp.VsrB(idx + rk) = i + k; \
+ } \
+ j += SZ; \
+ } \
+ } \
+ \
+ break; \
+ default: \
+ /* Translation code validates IMM before calling this helper */ \
+ g_assert_not_reached(); \
+ break; \
+ } \
+ \
+ *t = tmp; \
+}
+XXGENPCV(XXGENPCVBM, 1)
+XXGENPCV(XXGENPCVHM, 2)
+XXGENPCV(XXGENPCVWM, 4)
+XXGENPCV(XXGENPCVDM, 8)
+#undef XXGENPCV
+
#if defined(HOST_WORDS_BIGENDIAN)
#define VBPERMQ_INDEX(avr, i) ((avr)->u8[(i)])
#define VBPERMD_INDEX(i) (i)
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index d389ca2a83..a75c4e68f8 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1256,6 +1256,35 @@ static bool trans_XXPERMX(DisasContext *ctx, arg_8RR_XX4_uim3 *a)
return true;
}
+static bool do_xxgenpcv(DisasContext *ctx, arg_X_imm5 *a,
+ void (*gen_helper)(TCGv_ptr, TCGv_ptr, TCGv))
+{
+ TCGv_ptr xt, vrb;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VSX(ctx);
+
+ if (a->imm & ~0x3) {
+ gen_invalid(ctx);
+ return true;
+ }
+
+ xt = gen_vsr_ptr(a->xt);
+ vrb = gen_avr_ptr(a->vrb);
+
+ gen_helper(xt, vrb, tcg_constant_tl(a->imm));
+
+ tcg_temp_free_ptr(xt);
+ tcg_temp_free_ptr(vrb);
+
+ return true;
+}
+
+TRANS(XXGENPCVBM, do_xxgenpcv, gen_helper_XXGENPCVBM)
+TRANS(XXGENPCVHM, do_xxgenpcv, gen_helper_XXGENPCVHM)
+TRANS(XXGENPCVWM, do_xxgenpcv, gen_helper_XXGENPCVWM)
+TRANS(XXGENPCVDM, do_xxgenpcv, gen_helper_XXGENPCVDM)
+
#define GEN_VSX_HELPER_VSX_MADD(name, op1, aop, mop, inval, type) \
static void gen_##name(DisasContext *ctx) \
{ \
--
2.25.1
* Re: [PATCH v4 33/47] target/ppc: Implement xxgenpcv[bhwd]m instruction
2022-02-22 14:36 ` [PATCH v4 33/47] target/ppc: Implement xxgenpcv[bhwd]m instruction matheus.ferst
@ 2022-02-22 23:48 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 23:48 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> +#define XXGENPCV(NAME, SZ) \
> +void helper_##NAME(ppc_vsr_t *t, ppc_vsr_t *b, target_ulong imm) \
> +{ \
> + ppc_vsr_t tmp = { .u64 = { 0, 0 } }; \
> + \
> + switch (imm) { \
You should split the helper and not pass down imm.
r~
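One possible reading of the suggestion is to give each IMM mode its own function, so the translator can pick the function at translate time and the run-time switch disappears. The sketch below models only the two big-endian modes, assuming 16-byte arrays in PowerISA byte order in place of ppc_vsr_t and VsrB(); the function names and the sz parameter (element size in bytes) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Big-endian expansion, split out of the switch in the patch. */
static void genpcv_be_expand(uint8_t t[16], const uint8_t b[16], int sz)
{
    uint8_t tmp[16];

    assert(sz == 1 || sz == 2 || sz == 4 || sz == 8);
    /* Start from the result of an all-zeros mask: 0x10 .. 0x1f. */
    for (int i = 0; i < 16; i++) {
        tmp[i] = (uint8_t)(0x10 + i);
    }
    /* The mask bit is the MSB of the first byte of each element. */
    for (int i = 0, j = 0; i < 16; i += sz) {
        if (b[i] & 0x80) {
            for (int k = 0; k < sz; k++) {
                tmp[i + k] = (uint8_t)(j + k);
            }
            j += sz;
        }
    }
    memcpy(t, tmp, 16);
}

/* Big-endian compression, split out of the switch in the patch. */
static void genpcv_be_compress(uint8_t t[16], const uint8_t b[16], int sz)
{
    uint8_t tmp[16] = { 0 };

    assert(sz == 1 || sz == 2 || sz == 4 || sz == 8);
    for (int i = 0, j = 0; i < 16; i += sz) {
        if (b[i] & 0x80) {
            for (int k = 0; k < sz; k++) {
                tmp[j + k] = (uint8_t)(i + k);
            }
            j += sz;
        }
    }
    memcpy(t, tmp, 16);
}
```

With one function per mode, the validity check on IMM stays in the translator and the helpers no longer need an unreachable default case.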
* [PATCH v4 34/47] target/ppc: move xs[n]madd[am][ds]p/xs[n]msub[am][ds]p to decodetree
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (32 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 33/47] target/ppc: Implement xxgenpcv[bhwd]m instruction matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 23:52 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 35/47] target/ppc: implement xs[n]maddqp[o]/xs[n]msubqp[o] matheus.ferst
` (12 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/fpu_helper.c | 23 ++++++------
target/ppc/helper.h | 16 ++++-----
target/ppc/insn32.decode | 22 ++++++++++++
target/ppc/translate/vsx-impl.c.inc | 56 ++++++++++++++++++++++++-----
target/ppc/translate/vsx-ops.c.inc | 16 ---------
5 files changed, 90 insertions(+), 43 deletions(-)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 0fd285defc..c8797d8053 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2156,10 +2156,11 @@ VSX_TSQRT(xvtsqrtsp, 4, float32, VsrW(i), -126, 23)
* maddflgs - flags for the float*muladd routine that control the
* various forms (madd, msub, nmadd, nmsub)
* sfprf - set FPRF
+ * r2sp - round intermediate double precision result to single precision
*/
#define VSX_MADD(op, nels, tp, fld, maddflgs, sfprf, r2sp) \
void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
- ppc_vsr_t *xa, ppc_vsr_t *b, ppc_vsr_t *c) \
+ ppc_vsr_t *s1, ppc_vsr_t *s2, ppc_vsr_t *s3) \
{ \
ppc_vsr_t t = *xt; \
int i; \
@@ -2175,12 +2176,12 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
* result to odd. \
*/ \
set_float_rounding_mode(float_round_to_zero, &tstat); \
- t.fld = tp##_muladd(xa->fld, b->fld, c->fld, \
+ t.fld = tp##_muladd(s1->fld, s3->fld, s2->fld, \
maddflgs, &tstat); \
t.fld |= (get_float_exception_flags(&tstat) & \
float_flag_inexact) != 0; \
} else { \
- t.fld = tp##_muladd(xa->fld, b->fld, c->fld, \
+ t.fld = tp##_muladd(s1->fld, s3->fld, s2->fld, \
maddflgs, &tstat); \
} \
env->fp_status.float_exception_flags |= tstat.float_exception_flags; \
@@ -2202,14 +2203,14 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
do_float_check_status(env, GETPC()); \
}
-VSX_MADD(xsmadddp, 1, float64, VsrD(0), MADD_FLGS, 1, 0)
-VSX_MADD(xsmsubdp, 1, float64, VsrD(0), MSUB_FLGS, 1, 0)
-VSX_MADD(xsnmadddp, 1, float64, VsrD(0), NMADD_FLGS, 1, 0)
-VSX_MADD(xsnmsubdp, 1, float64, VsrD(0), NMSUB_FLGS, 1, 0)
-VSX_MADD(xsmaddsp, 1, float64, VsrD(0), MADD_FLGS, 1, 1)
-VSX_MADD(xsmsubsp, 1, float64, VsrD(0), MSUB_FLGS, 1, 1)
-VSX_MADD(xsnmaddsp, 1, float64, VsrD(0), NMADD_FLGS, 1, 1)
-VSX_MADD(xsnmsubsp, 1, float64, VsrD(0), NMSUB_FLGS, 1, 1)
+VSX_MADD(XSMADDDP, 1, float64, VsrD(0), MADD_FLGS, 1, 0)
+VSX_MADD(XSMSUBDP, 1, float64, VsrD(0), MSUB_FLGS, 1, 0)
+VSX_MADD(XSNMADDDP, 1, float64, VsrD(0), NMADD_FLGS, 1, 0)
+VSX_MADD(XSNMSUBDP, 1, float64, VsrD(0), NMSUB_FLGS, 1, 0)
+VSX_MADD(XSMADDSP, 1, float64, VsrD(0), MADD_FLGS, 1, 1)
+VSX_MADD(XSMSUBSP, 1, float64, VsrD(0), MSUB_FLGS, 1, 1)
+VSX_MADD(XSNMADDSP, 1, float64, VsrD(0), NMADD_FLGS, 1, 1)
+VSX_MADD(XSNMSUBSP, 1, float64, VsrD(0), NMSUB_FLGS, 1, 1)
VSX_MADD(xvmadddp, 2, float64, VsrD(i), MADD_FLGS, 0, 0)
VSX_MADD(xvmsubdp, 2, float64, VsrD(i), MSUB_FLGS, 0, 0)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 9751871370..fd249a22f0 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -357,10 +357,10 @@ DEF_HELPER_3(xssqrtdp, void, env, vsr, vsr)
DEF_HELPER_3(xsrsqrtedp, void, env, vsr, vsr)
DEF_HELPER_4(xstdivdp, void, env, i32, vsr, vsr)
DEF_HELPER_3(xstsqrtdp, void, env, i32, vsr)
-DEF_HELPER_5(xsmadddp, void, env, vsr, vsr, vsr, vsr)
-DEF_HELPER_5(xsmsubdp, void, env, vsr, vsr, vsr, vsr)
-DEF_HELPER_5(xsnmadddp, void, env, vsr, vsr, vsr, vsr)
-DEF_HELPER_5(xsnmsubdp, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSMADDDP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSMSUBDP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSNMADDDP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSNMSUBDP, void, env, vsr, vsr, vsr, vsr)
DEF_HELPER_4(xscmpeqdp, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xscmpgtdp, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xscmpgedp, void, env, vsr, vsr, vsr)
@@ -420,10 +420,10 @@ DEF_HELPER_3(xsresp, void, env, vsr, vsr)
DEF_HELPER_2(xsrsp, i64, env, i64)
DEF_HELPER_3(xssqrtsp, void, env, vsr, vsr)
DEF_HELPER_3(xsrsqrtesp, void, env, vsr, vsr)
-DEF_HELPER_5(xsmaddsp, void, env, vsr, vsr, vsr, vsr)
-DEF_HELPER_5(xsmsubsp, void, env, vsr, vsr, vsr, vsr)
-DEF_HELPER_5(xsnmaddsp, void, env, vsr, vsr, vsr, vsr)
-DEF_HELPER_5(xsnmsubsp, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSMADDSP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSMSUBSP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSNMADDSP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSNMSUBSP, void, env, vsr, vsr, vsr, vsr)
DEF_HELPER_4(xvadddp, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xvsubdp, void, env, vsr, vsr, vsr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index b11a3ee29a..881b7093f6 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -603,6 +603,28 @@ STXVX 011111 ..... ..... ..... 0110001100 . @X_TSX
LXVPX 011111 ..... ..... ..... 0101001101 - @X_TSXP
STXVPX 011111 ..... ..... ..... 0111001101 - @X_TSXP
+## VSX Scalar Multiply-Add Instructions
+
+XSMADDADP 111100 ..... ..... ..... 00100001 . . . @XX3
+XSMADDMDP 111100 ..... ..... ..... 00101001 . . . @XX3
+XSMADDASP 111100 ..... ..... ..... 00000001 . . . @XX3
+XSMADDMSP 111100 ..... ..... ..... 00001001 . . . @XX3
+
+XSMSUBADP 111100 ..... ..... ..... 00110001 . . . @XX3
+XSMSUBMDP 111100 ..... ..... ..... 00111001 . . . @XX3
+XSMSUBASP 111100 ..... ..... ..... 00010001 . . . @XX3
+XSMSUBMSP 111100 ..... ..... ..... 00011001 . . . @XX3
+
+XSNMADDASP 111100 ..... ..... ..... 10000001 . . . @XX3
+XSNMADDMSP 111100 ..... ..... ..... 10001001 . . . @XX3
+XSNMADDADP 111100 ..... ..... ..... 10100001 . . . @XX3
+XSNMADDMDP 111100 ..... ..... ..... 10101001 . . . @XX3
+
+XSNMSUBASP 111100 ..... ..... ..... 10010001 . . . @XX3
+XSNMSUBMSP 111100 ..... ..... ..... 10011001 . . . @XX3
+XSNMSUBADP 111100 ..... ..... ..... 10110001 . . . @XX3
+XSNMSUBMDP 111100 ..... ..... ..... 10111001 . . . @XX3
+
## VSX splat instruction
XXSPLTIB 111100 ..... 00 ........ 0101101000 . @X_imm8
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index a75c4e68f8..a54afb4dbb 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1285,6 +1285,54 @@ TRANS(XXGENPCVHM, do_xxgenpcv, gen_helper_XXGENPCVHM)
TRANS(XXGENPCVWM, do_xxgenpcv, gen_helper_XXGENPCVWM)
TRANS(XXGENPCVDM, do_xxgenpcv, gen_helper_XXGENPCVDM)
+static bool do_xsmadd(DisasContext *ctx, int tgt, int src1, int src2, int src3,
+ void (gen_helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
+{
+ TCGv_ptr t, s1, s2, s3;
+
+ t = gen_vsr_ptr(tgt);
+ s1 = gen_vsr_ptr(src1);
+ s2 = gen_vsr_ptr(src2);
+ s3 = gen_vsr_ptr(src3);
+
+ gen_helper(cpu_env, t, s1, s2, s3);
+
+ tcg_temp_free_ptr(t);
+ tcg_temp_free_ptr(s1);
+ tcg_temp_free_ptr(s2);
+ tcg_temp_free_ptr(s3);
+
+ return true;
+}
+
+static bool do_xsmadd_XX3(DisasContext *ctx, arg_XX3 *a, bool type_a,
+ void (gen_helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
+{
+ REQUIRE_VSX(ctx);
+
+ if (type_a) {
+ return do_xsmadd(ctx, a->xt, a->xa, a->xt, a->xb, gen_helper);
+ }
+ return do_xsmadd(ctx, a->xt, a->xa, a->xb, a->xt, gen_helper);
+}
+
+TRANS_FLAGS2(VSX, XSMADDADP, do_xsmadd_XX3, true, gen_helper_XSMADDDP)
+TRANS_FLAGS2(VSX, XSMADDMDP, do_xsmadd_XX3, false, gen_helper_XSMADDDP)
+TRANS_FLAGS2(VSX, XSMSUBADP, do_xsmadd_XX3, true, gen_helper_XSMSUBDP)
+TRANS_FLAGS2(VSX, XSMSUBMDP, do_xsmadd_XX3, false, gen_helper_XSMSUBDP)
+TRANS_FLAGS2(VSX, XSNMADDADP, do_xsmadd_XX3, true, gen_helper_XSNMADDDP)
+TRANS_FLAGS2(VSX, XSNMADDMDP, do_xsmadd_XX3, false, gen_helper_XSNMADDDP)
+TRANS_FLAGS2(VSX, XSNMSUBADP, do_xsmadd_XX3, true, gen_helper_XSNMSUBDP)
+TRANS_FLAGS2(VSX, XSNMSUBMDP, do_xsmadd_XX3, false, gen_helper_XSNMSUBDP)
+TRANS_FLAGS2(VSX207, XSMADDASP, do_xsmadd_XX3, true, gen_helper_XSMADDSP)
+TRANS_FLAGS2(VSX207, XSMADDMSP, do_xsmadd_XX3, false, gen_helper_XSMADDSP)
+TRANS_FLAGS2(VSX207, XSMSUBASP, do_xsmadd_XX3, true, gen_helper_XSMSUBSP)
+TRANS_FLAGS2(VSX207, XSMSUBMSP, do_xsmadd_XX3, false, gen_helper_XSMSUBSP)
+TRANS_FLAGS2(VSX207, XSNMADDASP, do_xsmadd_XX3, true, gen_helper_XSNMADDSP)
+TRANS_FLAGS2(VSX207, XSNMADDMSP, do_xsmadd_XX3, false, gen_helper_XSNMADDSP)
+TRANS_FLAGS2(VSX207, XSNMSUBASP, do_xsmadd_XX3, true, gen_helper_XSNMSUBSP)
+TRANS_FLAGS2(VSX207, XSNMSUBMSP, do_xsmadd_XX3, false, gen_helper_XSNMSUBSP)
+
#define GEN_VSX_HELPER_VSX_MADD(name, op1, aop, mop, inval, type) \
static void gen_##name(DisasContext *ctx) \
{ \
@@ -1315,14 +1363,6 @@ static void gen_##name(DisasContext *ctx) \
tcg_temp_free_ptr(c); \
}
-GEN_VSX_HELPER_VSX_MADD(xsmadddp, 0x04, 0x04, 0x05, 0, PPC2_VSX)
-GEN_VSX_HELPER_VSX_MADD(xsmsubdp, 0x04, 0x06, 0x07, 0, PPC2_VSX)
-GEN_VSX_HELPER_VSX_MADD(xsnmadddp, 0x04, 0x14, 0x15, 0, PPC2_VSX)
-GEN_VSX_HELPER_VSX_MADD(xsnmsubdp, 0x04, 0x16, 0x17, 0, PPC2_VSX)
-GEN_VSX_HELPER_VSX_MADD(xsmaddsp, 0x04, 0x00, 0x01, 0, PPC2_VSX207)
-GEN_VSX_HELPER_VSX_MADD(xsmsubsp, 0x04, 0x02, 0x03, 0, PPC2_VSX207)
-GEN_VSX_HELPER_VSX_MADD(xsnmaddsp, 0x04, 0x10, 0x11, 0, PPC2_VSX207)
-GEN_VSX_HELPER_VSX_MADD(xsnmsubsp, 0x04, 0x12, 0x13, 0, PPC2_VSX207)
GEN_VSX_HELPER_VSX_MADD(xvmadddp, 0x04, 0x0C, 0x0D, 0, PPC2_VSX)
GEN_VSX_HELPER_VSX_MADD(xvmsubdp, 0x04, 0x0E, 0x0F, 0, PPC2_VSX)
GEN_VSX_HELPER_VSX_MADD(xvnmadddp, 0x04, 0x1C, 0x1D, 0, PPC2_VSX)
diff --git a/target/ppc/translate/vsx-ops.c.inc b/target/ppc/translate/vsx-ops.c.inc
index 0a6b2b31ac..9cfec53df0 100644
--- a/target/ppc/translate/vsx-ops.c.inc
+++ b/target/ppc/translate/vsx-ops.c.inc
@@ -186,14 +186,6 @@ GEN_XX2FORM(xssqrtdp, 0x16, 0x04, PPC2_VSX),
GEN_XX2FORM(xsrsqrtedp, 0x14, 0x04, PPC2_VSX),
GEN_XX3FORM(xstdivdp, 0x14, 0x07, PPC2_VSX),
GEN_XX2FORM(xstsqrtdp, 0x14, 0x06, PPC2_VSX),
-GEN_XX3FORM_NAME(xsmadddp, "xsmaddadp", 0x04, 0x04, PPC2_VSX),
-GEN_XX3FORM_NAME(xsmadddp, "xsmaddmdp", 0x04, 0x05, PPC2_VSX),
-GEN_XX3FORM_NAME(xsmsubdp, "xsmsubadp", 0x04, 0x06, PPC2_VSX),
-GEN_XX3FORM_NAME(xsmsubdp, "xsmsubmdp", 0x04, 0x07, PPC2_VSX),
-GEN_XX3FORM_NAME(xsnmadddp, "xsnmaddadp", 0x04, 0x14, PPC2_VSX),
-GEN_XX3FORM_NAME(xsnmadddp, "xsnmaddmdp", 0x04, 0x15, PPC2_VSX),
-GEN_XX3FORM_NAME(xsnmsubdp, "xsnmsubadp", 0x04, 0x16, PPC2_VSX),
-GEN_XX3FORM_NAME(xsnmsubdp, "xsnmsubmdp", 0x04, 0x17, PPC2_VSX),
GEN_XX3FORM(xscmpeqdp, 0x0C, 0x00, PPC2_ISA300),
GEN_XX3FORM(xscmpgtdp, 0x0C, 0x01, PPC2_ISA300),
GEN_XX3FORM(xscmpgedp, 0x0C, 0x02, PPC2_ISA300),
@@ -235,14 +227,6 @@ GEN_XX2FORM(xsresp, 0x14, 0x01, PPC2_VSX207),
GEN_XX2FORM(xsrsp, 0x12, 0x11, PPC2_VSX207),
GEN_XX2FORM(xssqrtsp, 0x16, 0x00, PPC2_VSX207),
GEN_XX2FORM(xsrsqrtesp, 0x14, 0x00, PPC2_VSX207),
-GEN_XX3FORM_NAME(xsmaddsp, "xsmaddasp", 0x04, 0x00, PPC2_VSX207),
-GEN_XX3FORM_NAME(xsmaddsp, "xsmaddmsp", 0x04, 0x01, PPC2_VSX207),
-GEN_XX3FORM_NAME(xsmsubsp, "xsmsubasp", 0x04, 0x02, PPC2_VSX207),
-GEN_XX3FORM_NAME(xsmsubsp, "xsmsubmsp", 0x04, 0x03, PPC2_VSX207),
-GEN_XX3FORM_NAME(xsnmaddsp, "xsnmaddasp", 0x04, 0x10, PPC2_VSX207),
-GEN_XX3FORM_NAME(xsnmaddsp, "xsnmaddmsp", 0x04, 0x11, PPC2_VSX207),
-GEN_XX3FORM_NAME(xsnmsubsp, "xsnmsubasp", 0x04, 0x12, PPC2_VSX207),
-GEN_XX3FORM_NAME(xsnmsubsp, "xsnmsubmsp", 0x04, 0x13, PPC2_VSX207),
GEN_XX2FORM(xscvsxdsp, 0x10, 0x13, PPC2_VSX207),
GEN_XX2FORM(xscvuxdsp, 0x10, 0x12, PPC2_VSX207),
--
2.25.1
^ permalink raw reply related [flat|nested] 97+ messages in thread
* Re: [PATCH v4 34/47] target/ppc: move xs[n]madd[am][ds]p/xs[n]msub[am][ds]p to decodetree
2022-02-22 14:36 ` [PATCH v4 34/47] target/ppc: move xs[n]madd[am][ds]p/xs[n]msub[am][ds]p to decodetree matheus.ferst
@ 2022-02-22 23:52 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 23:52 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> +static bool do_xsmadd(DisasContext *ctx, int tgt, int src1, int src2, int src3,
> + void (gen_helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
Missing a * before gen_helper. Somewhat surprised this compiled...
> +static bool do_xsmadd_XX3(DisasContext *ctx, arg_XX3 *a, bool type_a,
> + void (gen_helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
Likewise.
Otherwise,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
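
A note on why the declaration without the '*' compiles: C adjusts a
parameter declared with function type to a pointer to that function type,
so both spellings declare the same parameter. The explicit pointer form is
the clearer one. The sketch below illustrates the rule in isolation; the
names apply_fn, apply_ptr and twice are hypothetical and are not taken from
the patch.

```c
/* Parameter written with function type, as in the patch (no '*').
 * The compiler adjusts 'op' to type 'int (*)(int)'. */
static int apply_fn(int x, int (op)(int))
{
    return op(x);
}

/* Parameter written as an explicit pointer to function,
 * the form requested in the review. */
static int apply_ptr(int x, int (*op)(int))
{
    return op(x);
}

/* Hypothetical callback used only for this illustration. */
static int twice(int v)
{
    return 2 * v;
}
```

Calling apply_fn(21, twice) and apply_ptr(21, twice) gives the same result
in both cases, because after adjustment the two prototypes are identical.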
^ permalink raw reply [flat|nested] 97+ messages in thread
* [PATCH v4 35/47] target/ppc: implement xs[n]maddqp[o]/xs[n]msubqp[o]
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (33 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 34/47] target/ppc: move xs[n]madd[am][ds]p/xs[n]msub[am][ds]p to decodetree matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 23:56 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 36/47] target/ppc: Implement xvtlsbb instruction matheus.ferst
` (11 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst, david
From: Matheus Ferst <matheus.ferst@eldorado.org.br>
Implement the following PowerISA v3.0 instructions:
xsmaddqp[o]: VSX Scalar Multiply-Add Quad-Precision [using round to Odd]
xsmsubqp[o]: VSX Scalar Multiply-Subtract Quad-Precision [using round
to Odd]
xsnmaddqp[o]: VSX Scalar Negative Multiply-Add Quad-Precision [using
round to Odd]
xsnmsubqp[o]: VSX Scalar Negative Multiply-Subtract Quad-Precision
[using round to Odd]
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/fpu_helper.c | 42 +++++++++++++++++++++++++++++
target/ppc/helper.h | 9 +++++++
target/ppc/insn32.decode | 4 +++
target/ppc/translate/vsx-impl.c.inc | 25 +++++++++++++++++
4 files changed, 80 insertions(+)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index c8797d8053..98e9576608 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2222,6 +2222,48 @@ VSX_MADD(xvmsubsp, 4, float32, VsrW(i), MSUB_FLGS, 0, 0)
VSX_MADD(xvnmaddsp, 4, float32, VsrW(i), NMADD_FLGS, 0, 0)
VSX_MADD(xvnmsubsp, 4, float32, VsrW(i), NMSUB_FLGS, 0, 0)
+/*
+ * VSX_MADDQ - VSX floating point quad-precision multiply/add
+ * op - instruction mnemonic
+ * maddflgs - flags for the float*muladd routine that control the
+ * various forms (madd, msub, nmadd, nmsub)
+ * ro - round to odd
+ */
+#define VSX_MADDQ(op, maddflgs, ro) \
+void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *s1, ppc_vsr_t *s2,\
+ ppc_vsr_t *s3) \
+{ \
+ ppc_vsr_t t = *xt; \
+ \
+ helper_reset_fpstatus(env); \
+ \
+ float_status tstat = env->fp_status; \
+ set_float_exception_flags(0, &tstat); \
+ if (ro) { \
+ tstat.float_rounding_mode = float_round_to_odd; \
+ } \
+ t.f128 = float128_muladd(s1->f128, s3->f128, s2->f128, maddflgs, &tstat); \
+ env->fp_status.float_exception_flags |= tstat.float_exception_flags; \
+ \
+ if (unlikely(tstat.float_exception_flags & float_flag_invalid)) { \
+ float_invalid_op_madd(env, tstat.float_exception_flags, \
+ false, GETPC()); \
+ } \
+ \
+ helper_compute_fprf_float128(env, t.f128); \
+ *xt = t; \
+ do_float_check_status(env, GETPC()); \
+}
+
+VSX_MADDQ(XSMADDQP, MADD_FLGS, 0)
+VSX_MADDQ(XSMADDQPO, MADD_FLGS, 1)
+VSX_MADDQ(XSMSUBQP, MSUB_FLGS, 0)
+VSX_MADDQ(XSMSUBQPO, MSUB_FLGS, 1)
+VSX_MADDQ(XSNMADDQP, NMADD_FLGS, 0)
+VSX_MADDQ(XSNMADDQPO, NMADD_FLGS, 1)
+VSX_MADDQ(XSNMSUBQP, NMSUB_FLGS, 0)
+VSX_MADDQ(XSNMSUBQPO, NMSUB_FLGS, 0)
+
/*
* VSX_SCALAR_CMP_DP - VSX scalar floating point compare double precision
* op - instruction mnemonic
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index fd249a22f0..1649fffff8 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -425,6 +425,15 @@ DEF_HELPER_5(XSMSUBSP, void, env, vsr, vsr, vsr, vsr)
DEF_HELPER_5(XSNMADDSP, void, env, vsr, vsr, vsr, vsr)
DEF_HELPER_5(XSNMSUBSP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSMADDQP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSMADDQPO, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSMSUBQP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSMSUBQPO, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSNMADDQP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSNMADDQPO, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSNMSUBQP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSNMSUBQPO, void, env, vsr, vsr, vsr, vsr)
+
DEF_HELPER_4(xvadddp, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xvsubdp, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xvmuldp, void, env, vsr, vsr, vsr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 881b7093f6..1395a91c44 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -609,21 +609,25 @@ XSMADDADP 111100 ..... ..... ..... 00100001 . . . @XX3
XSMADDMDP 111100 ..... ..... ..... 00101001 . . . @XX3
XSMADDASP 111100 ..... ..... ..... 00000001 . . . @XX3
XSMADDMSP 111100 ..... ..... ..... 00001001 . . . @XX3
+XSMADDQP 111111 ..... ..... ..... 0110000100 . @X_rc
XSMSUBADP 111100 ..... ..... ..... 00110001 . . . @XX3
XSMSUBMDP 111100 ..... ..... ..... 00111001 . . . @XX3
XSMSUBASP 111100 ..... ..... ..... 00010001 . . . @XX3
XSMSUBMSP 111100 ..... ..... ..... 00011001 . . . @XX3
+XSMSUBQP 111111 ..... ..... ..... 0110100100 . @X_rc
XSNMADDASP 111100 ..... ..... ..... 10000001 . . . @XX3
XSNMADDMSP 111100 ..... ..... ..... 10001001 . . . @XX3
XSNMADDADP 111100 ..... ..... ..... 10100001 . . . @XX3
XSNMADDMDP 111100 ..... ..... ..... 10101001 . . . @XX3
+XSNMADDQP 111111 ..... ..... ..... 0111000100 . @X_rc
XSNMSUBASP 111100 ..... ..... ..... 10010001 . . . @XX3
XSNMSUBMSP 111100 ..... ..... ..... 10011001 . . . @XX3
XSNMSUBADP 111100 ..... ..... ..... 10110001 . . . @XX3
XSNMSUBMDP 111100 ..... ..... ..... 10111001 . . . @XX3
+XSNMSUBQP 111111 ..... ..... ..... 0111100100 . @X_rc
## VSX splat instruction
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index a54afb4dbb..9128407365 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1333,6 +1333,31 @@ TRANS_FLAGS2(VSX207, XSNMADDMSP, do_xsmadd_XX3, false, gen_helper_XSNMADDSP)
TRANS_FLAGS2(VSX207, XSNMSUBASP, do_xsmadd_XX3, true, gen_helper_XSNMSUBSP)
TRANS_FLAGS2(VSX207, XSNMSUBMSP, do_xsmadd_XX3, false, gen_helper_XSNMSUBSP)
+static bool do_xsmadd_X(DisasContext *ctx, arg_X_rc *a,
+ void (gen_helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr),
+ void (gen_helper_ro)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
+{
+ int vrt, vra, vrb;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+ REQUIRE_VSX(ctx);
+
+ vrt = a->rt + 32;
+ vra = a->ra + 32;
+ vrb = a->rb + 32;
+
+ if (a->rc) {
+ return do_xsmadd(ctx, vrt, vra, vrt, vrb, gen_helper_ro);
+ }
+
+ return do_xsmadd(ctx, vrt, vra, vrt, vrb, gen_helper);
+}
+
+TRANS(XSMADDQP, do_xsmadd_X, gen_helper_XSMADDQP, gen_helper_XSMADDQPO)
+TRANS(XSMSUBQP, do_xsmadd_X, gen_helper_XSMSUBQP, gen_helper_XSMSUBQPO)
+TRANS(XSNMADDQP, do_xsmadd_X, gen_helper_XSNMADDQP, gen_helper_XSNMADDQPO)
+TRANS(XSNMSUBQP, do_xsmadd_X, gen_helper_XSNMSUBQP, gen_helper_XSNMSUBQPO)
+
#define GEN_VSX_HELPER_VSX_MADD(name, op1, aop, mop, inval, type) \
static void gen_##name(DisasContext *ctx) \
{ \
--
2.25.1
^ permalink raw reply related [flat|nested] 97+ messages in thread
* Re: [PATCH v4 35/47] target/ppc: implement xs[n]maddqp[o]/xs[n]msubqp[o]
2022-02-22 14:36 ` [PATCH v4 35/47] target/ppc: implement xs[n]maddqp[o]/xs[n]msubqp[o] matheus.ferst
@ 2022-02-22 23:56 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-22 23:56 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc; +Cc: groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Matheus Ferst <matheus.ferst@eldorado.org.br>
>
> Implement the following PowerISA v3.0 instructions:
> xsmaddqp[o]: VSX Scalar Multiply-Add Quad-Precision [using round to Odd]
> xsmsubqp[o]: VSX Scalar Multiply-Subtract Quad-Precision [using round
> to Odd]
> xsnmaddqp[o]: VSX Scalar Negative Multiply-Add Quad-Precision [using
> round to Odd]
> xsnmsubqp[o]: VSX Scalar Negative Multiply-Subtract Quad-Precision
> [using round to Odd]
>
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> target/ppc/fpu_helper.c | 42 +++++++++++++++++++++++++++++
> target/ppc/helper.h | 9 +++++++
> target/ppc/insn32.decode | 4 +++
> target/ppc/translate/vsx-impl.c.inc | 25 +++++++++++++++++
> 4 files changed, 80 insertions(+)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 97+ messages in thread
* [PATCH v4 36/47] target/ppc: Implement xvtlsbb instruction
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (34 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 35/47] target/ppc: implement xs[n]maddqp[o]/xs[n]msubqp[o] matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-23 0:07 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 37/47] target/ppc: Remove xscmpnedp instruction matheus.ferst
` (10 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Víctor Colombo, clg,
Matheus Ferst, david
From: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 7 ++++++
target/ppc/translate/vsx-impl.c.inc | 37 +++++++++++++++++++++++++++++
2 files changed, 44 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 1395a91c44..2617ab8ca4 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -155,6 +155,9 @@
&XX2 xt xb uim:uint8_t
@XX2 ...... ..... ... uim:2 ..... ......... .. &XX2 xt=%xx_xt xb=%xx_xb
+&XX2_bf_xb bf xb
+@XX2_bf_xb ...... bf:3 .. ..... ..... ......... . . &XX2_bf_xb xb=%xx_xb
+
&XX3 xt xa xb
@XX3 ...... ..... ..... ..... ........ ... &XX3 xt=%xx_xt xa=%xx_xa xb=%xx_xb
@@ -664,6 +667,10 @@ XSMINJDP 111100 ..... ..... ..... 10011000 ... @XX3
XSCVQPDP 111111 ..... 10100 ..... 1101000100 . @X_tb_rc
+## VSX Vector Test Least-Significant Bit by Byte Instruction
+
+XVTLSBB 111100 ... -- 00010 ..... 111011011 . - @XX2_bf_xb
+
### rfebb
&XL_s s:uint8_t
@XL_s ......-------------- s:1 .......... - &XL_s
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 9128407365..2aecaa8021 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1690,6 +1690,43 @@ static bool trans_LXVKQ(DisasContext *ctx, arg_X_uim5 *a)
return true;
}
+static bool trans_XVTLSBB(DisasContext *ctx, arg_XX2_bf_xb *a)
+{
+ TCGv_i64 xb, tmp, all_true, all_false, mask, zero;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VSX(ctx);
+
+ xb = tcg_temp_new_i64();
+ tmp = tcg_temp_new_i64();
+ all_true = tcg_const_i64(0b1000);
+ all_false = tcg_const_i64(0b0010);
+ mask = tcg_constant_i64(dup_const(MO_8, 1));
+ zero = tcg_constant_i64(0);
+
+ for (int dw = 0; dw < 2; dw++) {
+ get_cpu_vsr(xb, a->xb, dw);
+
+ tcg_gen_and_i64(tmp, mask, xb);
+ tcg_gen_movcond_i64(TCG_COND_EQ, all_true, tmp,
+ mask, all_true, zero);
+
+ tcg_gen_andc_i64(tmp, mask, xb);
+ tcg_gen_movcond_i64(TCG_COND_EQ, all_false, tmp,
+ mask, all_false, zero);
+ }
+
+ tcg_gen_or_i64(tmp, all_false, all_true);
+ tcg_gen_extrl_i64_i32(cpu_crf[a->bf], tmp);
+
+ tcg_temp_free_i64(xb);
+ tcg_temp_free_i64(tmp);
+ tcg_temp_free_i64(all_true);
+ tcg_temp_free_i64(all_false);
+
+ return true;
+}
+
static void gen_xxsldwi(DisasContext *ctx)
{
TCGv_i64 xth, xtl;
--
2.25.1
^ permalink raw reply related [flat|nested] 97+ messages in thread
* Re: [PATCH v4 36/47] target/ppc: Implement xvtlsbb instruction
2022-02-22 14:36 ` [PATCH v4 36/47] target/ppc: Implement xvtlsbb instruction matheus.ferst
@ 2022-02-23 0:07 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-23 0:07 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: Víctor Colombo, groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> + tcg_gen_and_i64(tmp, mask, xb);
> + tcg_gen_movcond_i64(TCG_COND_EQ, all_true, tmp,
> + mask, all_true, zero);
> +
> + tcg_gen_andc_i64(tmp, mask, xb);
> + tcg_gen_movcond_i64(TCG_COND_EQ, all_false, tmp,
> + mask, all_false, zero);
I would unroll this and use fewer conditions.
t0 = mask & xb[0]
t1 = mask & xb[1]
o2 = t0 | t1
a2 = t0 & t1
o2 = (o2 == 0) << 1
a2 = (a2 == mask) << 3
crf = o2 | a2
r~
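
The unrolled form suggested above can be checked on plain host integers.
This is only a sketch in ordinary C, not TCG code: the function name
xvtlsbb_crf and the LSB_MASK macro are hypothetical, and the two arguments
stand for the two doublewords of xb. The CR field layout (0b1000 when every
byte has its least-significant bit set, 0b0010 when no byte does) is the one
used by the patch.

```c
#include <stdint.h>

/* One bit per byte: the least-significant bit of each of the 8 bytes. */
#define LSB_MASK UINT64_C(0x0101010101010101)

/* Hypothetical host-side model of the suggested sequence. */
static unsigned xvtlsbb_crf(uint64_t dw0, uint64_t dw1)
{
    uint64_t t0 = dw0 & LSB_MASK;
    uint64_t t1 = dw1 & LSB_MASK;

    /* No byte in either doubleword has its LSB set: all_false (0b0010). */
    unsigned all_false = (unsigned)((t0 | t1) == 0) << 1;

    /* Every byte in both doublewords has its LSB set: all_true (0b1000). */
    unsigned all_true = (unsigned)((t0 & t1) == LSB_MASK) << 3;

    return all_true | all_false;
}
```

With this formulation each flag needs a single comparison over the combined
doublewords, instead of one conditional move per doubleword per flag.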
^ permalink raw reply [flat|nested] 97+ messages in thread
* [PATCH v4 37/47] target/ppc: Remove xscmpnedp instruction
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (35 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 36/47] target/ppc: Implement xvtlsbb instruction matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-22 14:36 ` [PATCH v4 38/47] target/ppc: Refactor VSX_SCALAR_CMP_DP matheus.ferst
` (9 subsequent siblings)
46 siblings, 0 replies; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Víctor Colombo, clg,
Matheus Ferst, david
From: Víctor Colombo <victor.colombo@eldorado.org.br>
xscmpnedp was added in ISA v3.0 but removed in v3.0B. This patch
removes this instruction as it was not in the final version of v3.0.
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Acked-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/fpu_helper.c | 1 -
target/ppc/helper.h | 1 -
target/ppc/translate/vsx-impl.c.inc | 1 -
target/ppc/translate/vsx-ops.c.inc | 1 -
4 files changed, 4 deletions(-)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 98e9576608..9b034d1fe4 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2313,7 +2313,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
VSX_SCALAR_CMP_DP(xscmpeqdp, eq, 1, 0)
VSX_SCALAR_CMP_DP(xscmpgedp, le, 1, 1)
VSX_SCALAR_CMP_DP(xscmpgtdp, lt, 1, 1)
-VSX_SCALAR_CMP_DP(xscmpnedp, eq, 0, 0)
void helper_xscmpexpdp(CPUPPCState *env, uint32_t opcode,
ppc_vsr_t *xa, ppc_vsr_t *xb)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 1649fffff8..ee2a89b89d 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -364,7 +364,6 @@ DEF_HELPER_5(XSNMSUBDP, void, env, vsr, vsr, vsr, vsr)
DEF_HELPER_4(xscmpeqdp, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xscmpgtdp, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xscmpgedp, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xscmpnedp, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xscmpexpdp, void, env, i32, vsr, vsr)
DEF_HELPER_4(xscmpexpqp, void, env, i32, vsr, vsr)
DEF_HELPER_4(xscmpodp, void, env, i32, vsr, vsr)
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 2aecaa8021..751b941bac 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1055,7 +1055,6 @@ GEN_VSX_HELPER_X1(xstsqrtdp, 0x14, 0x06, 0, PPC2_VSX)
GEN_VSX_HELPER_X3(xscmpeqdp, 0x0C, 0x00, 0, PPC2_ISA300)
GEN_VSX_HELPER_X3(xscmpgtdp, 0x0C, 0x01, 0, PPC2_ISA300)
GEN_VSX_HELPER_X3(xscmpgedp, 0x0C, 0x02, 0, PPC2_ISA300)
-GEN_VSX_HELPER_X3(xscmpnedp, 0x0C, 0x03, 0, PPC2_ISA300)
GEN_VSX_HELPER_X2_AB(xscmpexpdp, 0x0C, 0x07, 0, PPC2_ISA300)
GEN_VSX_HELPER_R2_AB(xscmpexpqp, 0x04, 0x05, 0, PPC2_ISA300)
GEN_VSX_HELPER_X2_AB(xscmpodp, 0x0C, 0x05, 0, PPC2_VSX)
diff --git a/target/ppc/translate/vsx-ops.c.inc b/target/ppc/translate/vsx-ops.c.inc
index 9cfec53df0..34310c1fb5 100644
--- a/target/ppc/translate/vsx-ops.c.inc
+++ b/target/ppc/translate/vsx-ops.c.inc
@@ -189,7 +189,6 @@ GEN_XX2FORM(xstsqrtdp, 0x14, 0x06, PPC2_VSX),
GEN_XX3FORM(xscmpeqdp, 0x0C, 0x00, PPC2_ISA300),
GEN_XX3FORM(xscmpgtdp, 0x0C, 0x01, PPC2_ISA300),
GEN_XX3FORM(xscmpgedp, 0x0C, 0x02, PPC2_ISA300),
-GEN_XX3FORM(xscmpnedp, 0x0C, 0x03, PPC2_ISA300),
GEN_XX3FORM(xscmpexpdp, 0x0C, 0x07, PPC2_ISA300),
GEN_VSX_XFORM_300(xscmpexpqp, 0x04, 0x05, 0x00600001),
GEN_XX2IFORM(xscmpodp, 0x0C, 0x05, PPC2_VSX),
--
2.25.1
^ permalink raw reply related [flat|nested] 97+ messages in thread
* [PATCH v4 38/47] target/ppc: Refactor VSX_SCALAR_CMP_DP
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (36 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 37/47] target/ppc: Remove xscmpnedp instruction matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-23 0:20 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 39/47] target/ppc: Implement xscmp{eq,ge,gt}qp matheus.ferst
` (8 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Víctor Colombo, clg,
Matheus Ferst, david
From: Víctor Colombo <victor.colombo@eldorado.org.br>
Refactor VSX_SCALAR_CMP_DP, changing its name to VSX_SCALAR_CMP and
prepare the helper to be used for quadword comparisons.
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/fpu_helper.c | 31 ++++++++++++++-----------------
1 file changed, 14 insertions(+), 17 deletions(-)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 9b034d1fe4..5ebbcfe3b7 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2265,28 +2265,30 @@ VSX_MADDQ(XSNMSUBQP, NMSUB_FLGS, 0)
VSX_MADDQ(XSNMSUBQPO, NMSUB_FLGS, 0)
/*
- * VSX_SCALAR_CMP_DP - VSX scalar floating point compare double precision
+ * VSX_SCALAR_CMP - VSX scalar floating point compare
* op - instruction mnemonic
+ * tp - type
* cmp - comparison operation
* exp - expected result of comparison
+ * fld - vsr_t field
* svxvc - set VXVC bit
*/
-#define VSX_SCALAR_CMP_DP(op, cmp, exp, svxvc) \
+#define VSX_SCALAR_CMP(op, tp, cmp, fld, exp, svxvc) \
void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
ppc_vsr_t *xa, ppc_vsr_t *xb) \
{ \
- ppc_vsr_t t = *xt; \
+ ppc_vsr_t t = { }; \
bool vxsnan_flag = false, vxvc_flag = false, vex_flag = false; \
\
- if (float64_is_signaling_nan(xa->VsrD(0), &env->fp_status) || \
- float64_is_signaling_nan(xb->VsrD(0), &env->fp_status)) { \
+ if (tp##_is_signaling_nan(xa->fld, &env->fp_status) || \
+ tp##_is_signaling_nan(xb->fld, &env->fp_status)) { \
vxsnan_flag = true; \
if (fpscr_ve == 0 && svxvc) { \
vxvc_flag = true; \
} \
} else if (svxvc) { \
- vxvc_flag = float64_is_quiet_nan(xa->VsrD(0), &env->fp_status) || \
- float64_is_quiet_nan(xb->VsrD(0), &env->fp_status); \
+ vxvc_flag = tp##_is_quiet_nan(xa->fld, &env->fp_status) || \
+ tp##_is_quiet_nan(xb->fld, &env->fp_status); \
} \
if (vxsnan_flag) { \
float_invalid_op_vxsnan(env, GETPC()); \
@@ -2297,22 +2299,17 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
vex_flag = fpscr_ve && (vxvc_flag || vxsnan_flag); \
\
if (!vex_flag) { \
- if (float64_##cmp(xb->VsrD(0), xa->VsrD(0), \
- &env->fp_status) == exp) { \
- t.VsrD(0) = -1; \
- t.VsrD(1) = 0; \
- } else { \
- t.VsrD(0) = 0; \
- t.VsrD(1) = 0; \
+ if (tp##_##cmp(xb->fld, xa->fld, &env->fp_status) == exp) { \
+ memset(&t.fld, 0xFF, sizeof(t.fld)); \
} \
} \
*xt = t; \
do_float_check_status(env, GETPC()); \
}
-VSX_SCALAR_CMP_DP(xscmpeqdp, eq, 1, 0)
-VSX_SCALAR_CMP_DP(xscmpgedp, le, 1, 1)
-VSX_SCALAR_CMP_DP(xscmpgtdp, lt, 1, 1)
+VSX_SCALAR_CMP(xscmpeqdp, float64, eq, VsrD(0), 1, 0)
+VSX_SCALAR_CMP(xscmpgedp, float64, le, VsrD(0), 1, 1)
+VSX_SCALAR_CMP(xscmpgtdp, float64, lt, VsrD(0), 1, 1)
void helper_xscmpexpdp(CPUPPCState *env, uint32_t opcode,
ppc_vsr_t *xa, ppc_vsr_t *xb)
--
2.25.1
^ permalink raw reply related [flat|nested] 97+ messages in thread
* Re: [PATCH v4 38/47] target/ppc: Refactor VSX_SCALAR_CMP_DP
2022-02-22 14:36 ` [PATCH v4 38/47] target/ppc: Refactor VSX_SCALAR_CMP_DP matheus.ferst
@ 2022-02-23 0:20 ` Richard Henderson
2022-02-24 19:16 ` Víctor Colombo
0 siblings, 1 reply; 97+ messages in thread
From: Richard Henderson @ 2022-02-23 0:20 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: Víctor Colombo, groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Víctor Colombo <victor.colombo@eldorado.org.br>
>
> Refactor VSX_SCALAR_CMP_DP, changing its name to VSX_SCALAR_CMP and
> prepare the helper to be used for quadword comparisons.
>
> Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> target/ppc/fpu_helper.c | 31 ++++++++++++++-----------------
> 1 file changed, 14 insertions(+), 17 deletions(-)
>
> diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
> index 9b034d1fe4..5ebbcfe3b7 100644
> --- a/target/ppc/fpu_helper.c
> +++ b/target/ppc/fpu_helper.c
> @@ -2265,28 +2265,30 @@ VSX_MADDQ(XSNMSUBQP, NMSUB_FLGS, 0)
> VSX_MADDQ(XSNMSUBQPO, NMSUB_FLGS, 0)
>
> /*
> - * VSX_SCALAR_CMP_DP - VSX scalar floating point compare double precision
> + * VSX_SCALAR_CMP - VSX scalar floating point compare
> * op - instruction mnemonic
> + * tp - type
> * cmp - comparison operation
> * exp - expected result of comparison
> + * fld - vsr_t field
> * svxvc - set VXVC bit
> */
> -#define VSX_SCALAR_CMP_DP(op, cmp, exp, svxvc) \
> +#define VSX_SCALAR_CMP(op, tp, cmp, fld, exp, svxvc) \
> void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
> ppc_vsr_t *xa, ppc_vsr_t *xb) \
> { \
> - ppc_vsr_t t = *xt; \
> + ppc_vsr_t t = { }; \
> bool vxsnan_flag = false, vxvc_flag = false, vex_flag = false; \
> \
> - if (float64_is_signaling_nan(xa->VsrD(0), &env->fp_status) || \
> - float64_is_signaling_nan(xb->VsrD(0), &env->fp_status)) { \
> + if (tp##_is_signaling_nan(xa->fld, &env->fp_status) || \
> + tp##_is_signaling_nan(xb->fld, &env->fp_status)) { \
> vxsnan_flag = true; \
> if (fpscr_ve == 0 && svxvc) { \
> vxvc_flag = true; \
> } \
> } else if (svxvc) { \
> - vxvc_flag = float64_is_quiet_nan(xa->VsrD(0), &env->fp_status) || \
> - float64_is_quiet_nan(xb->VsrD(0), &env->fp_status); \
> + vxvc_flag = tp##_is_quiet_nan(xa->fld, &env->fp_status) || \
> + tp##_is_quiet_nan(xb->fld, &env->fp_status); \
> }
Note that this can be simplified further, using the full FloatRelation result and
float_flag_invalid_snan.
Note that do_scalar_cmp gets half-way there, only checking for NaNs once we have
float_relation_unordered as a comparison result. But it could go further and check
float_flag_invalid_snan and drop all of the other checks vs snan and qnan.
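As a rough illustration of that idea (a toy model, not QEMU's softfloat): a single compare returns the full relation and records in the status flags whether a NaN, and specifically a signaling NaN, was seen, so the caller never has to classify the operands again. All toy_* names and TOY_* constants below are hypothetical stand-ins for FloatRelation, float_flag_invalid and float_flag_invalid_snan, and the flag behaviour is only assumed to mirror the signaling and quiet softfloat compares.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for FloatRelation and the softfloat flags. */
typedef enum {
    TOY_LESS = -1, TOY_EQUAL = 0, TOY_GREATER = 1, TOY_UNORDERED = 2
} toy_relation;
enum { TOY_INVALID = 1, TOY_INVALID_SNAN = 2 };
typedef struct { int flags; } toy_status;

/* binary64 bit patterns, kept as integers so sNaNs are never quieted. */
#define TOY_ONE  0x3ff0000000000000ULL
#define TOY_TWO  0x4000000000000000ULL
#define TOY_QNAN 0x7ff8000000000000ULL
#define TOY_SNAN 0x7ff0000000000001ULL

static bool toy_is_nan(uint64_t a)
{
    return (a & 0x7fffffffffffffffULL) > 0x7ff0000000000000ULL;
}

static bool toy_is_snan(uint64_t a)
{
    /* Quiet bit (top mantissa bit) clear means signaling. */
    return toy_is_nan(a) && !(a & 0x0008000000000000ULL);
}

static double toy_to_double(uint64_t a)
{
    double d;
    memcpy(&d, &a, sizeof(d));
    return d;
}

/*
 * One compare that both orders the operands and records NaN information:
 * an sNaN always raises INVALID | INVALID_SNAN, a qNaN raises INVALID
 * only for the signaling (non-quiet) variant.
 */
static toy_relation toy_compare(uint64_t a, uint64_t b, bool quiet,
                                toy_status *s)
{
    if (toy_is_nan(a) || toy_is_nan(b)) {
        if (toy_is_snan(a) || toy_is_snan(b)) {
            s->flags |= TOY_INVALID | TOY_INVALID_SNAN;
        } else if (!quiet) {
            s->flags |= TOY_INVALID;
        }
        return TOY_UNORDERED;
    }
    double x = toy_to_double(a), y = toy_to_double(b);
    return x < y ? TOY_LESS : x > y ? TOY_GREATER : TOY_EQUAL;
}

/* Small self-check that can be called from a test harness or main(). */
static void toy_compare_examples(void)
{
    toy_status s = { 0 };
    assert(toy_compare(TOY_ONE, TOY_TWO, false, &s) == TOY_LESS);
    assert(s.flags == 0);
    assert(toy_compare(TOY_SNAN, TOY_ONE, true, &s) == TOY_UNORDERED);
    assert(s.flags == (TOY_INVALID | TOY_INVALID_SNAN));
}
```

With this shape, the caller only looks at the returned relation and at s->flags; the separate is_signaling_nan and is_quiet_nan tests on xa and xb become redundant.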
r~
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [PATCH v4 38/47] target/ppc: Refactor VSX_SCALAR_CMP_DP
2022-02-23 0:20 ` Richard Henderson
@ 2022-02-24 19:16 ` Víctor Colombo
2022-02-24 21:24 ` Richard Henderson
0 siblings, 1 reply; 97+ messages in thread
From: Víctor Colombo @ 2022-02-24 19:16 UTC (permalink / raw)
To: Richard Henderson, matheus.ferst, qemu-devel, qemu-ppc
Cc: groug, danielhb413, clg, david
On 22/02/2022 21:20, Richard Henderson wrote:
> On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
>> From: Víctor Colombo <victor.colombo@eldorado.org.br>
>>
>> Refactor VSX_SCALAR_CMP_DP, changing its name to VSX_SCALAR_CMP and
>> prepare the helper to be used for quadword comparisons.
>>
>> Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
>> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
>> ---
>> target/ppc/fpu_helper.c | 31 ++++++++++++++-----------------
>> 1 file changed, 14 insertions(+), 17 deletions(-)
>>
>> diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
>> index 9b034d1fe4..5ebbcfe3b7 100644
>> --- a/target/ppc/fpu_helper.c
>> +++ b/target/ppc/fpu_helper.c
>> @@ -2265,28 +2265,30 @@ VSX_MADDQ(XSNMSUBQP, NMSUB_FLGS, 0)
>> VSX_MADDQ(XSNMSUBQPO, NMSUB_FLGS, 0)
>>
>> /*
>> - * VSX_SCALAR_CMP_DP - VSX scalar floating point compare double precision
>> + * VSX_SCALAR_CMP - VSX scalar floating point compare
>> * op - instruction mnemonic
>> + * tp - type
>> * cmp - comparison operation
>> * exp - expected result of comparison
>> + * fld - vsr_t field
>> * svxvc - set VXVC bit
>> */
>> -#define VSX_SCALAR_CMP_DP(op, cmp, exp, svxvc) \
>> +#define VSX_SCALAR_CMP(op, tp, cmp, fld, exp, svxvc) \
>> void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
>> ppc_vsr_t *xa, ppc_vsr_t *xb) \
>> { \
>> - ppc_vsr_t t = *xt; \
>> + ppc_vsr_t t = { }; \
>> bool vxsnan_flag = false, vxvc_flag = false, vex_flag = false; \
>> \
>> - if (float64_is_signaling_nan(xa->VsrD(0), &env->fp_status) || \
>> - float64_is_signaling_nan(xb->VsrD(0), &env->fp_status)) { \
>> + if (tp##_is_signaling_nan(xa->fld, &env->fp_status) || \
>> + tp##_is_signaling_nan(xb->fld, &env->fp_status)) { \
>> vxsnan_flag = true; \
>> if (fpscr_ve == 0 && svxvc) { \
>> vxvc_flag = true; \
>> } \
>> } else if (svxvc) { \
>> - vxvc_flag = float64_is_quiet_nan(xa->VsrD(0), &env->fp_status) || \
>> - float64_is_quiet_nan(xb->VsrD(0), &env->fp_status); \
>> + vxvc_flag = tp##_is_quiet_nan(xa->fld, &env->fp_status) || \
>> + tp##_is_quiet_nan(xb->fld, &env->fp_status); \
>> }
>
>
> Note that this can be simplified further, using the full FloatRelation result and
> float_flag_invalid_snan.
>
> Note that do_scalar_cmp gets half-way there, only checking for NaNs once we have
> float_relation_unordered as a comparison result. But it could go further and check
> float_flag_invalid_snan and drop all of the other checks vs snan and qnan.
>
> r~
Hello Richard! Thanks for your review.
Could you please elaborate on how you think using
float*_compare and its FloatRelation result would work here?
I noticed do_scalar_cmp modifies CR and sets the FPCC flag, which
is not what VSX_SCALAR_CMP does. Using that function would require a
rework.
One option I thought of would be to bring the necessary parts into
VSX_SCALAR_CMP, something like this:
#define VSX_SCALAR_CMP(op, tp, cmp, fld, svxvc, expr)
...
    r = tp##_compare(xa->fld, xb->fld, &env->fp_status); \
    if (expr) { \
        memset(&t.fld, 0xFF, sizeof(t.fld)); \
    } else if (r == float_relation_unordered) { \
        if (env->fp_status.float_exception_flags & float_flag_invalid_snan) { \
            float_invalid_op_vxsnan(env, GETPC()); \
            if (fpscr_ve == 0 && svxvc) { \
                float_invalid_op_vxvc(env, 0, GETPC()); \
            } \
        } else if (svxvc) { \
            if (tp##_is_quiet_nan(xa->fld, &env->fp_status) || \
                tp##_is_quiet_nan(xb->fld, &env->fp_status)) { \
                float_invalid_op_vxvc(env, 0, GETPC()); \
            } \
        } \
    } \
...
VSX_SCALAR_CMP(XSCMPEQDP, float64, eq, VsrD(0), 0, r == float_relation_equal)
VSX_SCALAR_CMP(XSCMPGEDP, float64, le, VsrD(0), 1, \
               r == float_relation_equal || r == float_relation_greater)
VSX_SCALAR_CMP(XSCMPGTDP, float64, lt, VsrD(0), 1, r == float_relation_greater)
But this still looks convoluted. Another option I came up with would be:
    ppc_vsr_t t = { }; \
    \
    helper_reset_fpstatus(env); \
    \
    if (tp##_##cmp##_quiet(xb->fld, xa->fld, &env->fp_status)) { \
        memset(&t.fld, 0xFF, sizeof(t.fld)); \
    } \
    \
    if (env->fp_status.float_exception_flags & float_flag_invalid_snan) { \
        float_invalid_op_vxsnan(env, GETPC()); \
        if (fpscr_ve == 0 && svxvc) { \
            float_invalid_op_vxvc(env, 0, GETPC()); \
        } \
    } else if (svxvc) { \
        if (tp##_is_quiet_nan(xa->fld, &env->fp_status) || \
            tp##_is_quiet_nan(xb->fld, &env->fp_status)) { \
            float_invalid_op_vxvc(env, 0, GETPC()); \
        } \
    } \
Is this close to what you were thinking?
Thank you very much!
-- Víctor
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: [PATCH v4 38/47] target/ppc: Refactor VSX_SCALAR_CMP_DP
2022-02-24 19:16 ` Víctor Colombo
@ 2022-02-24 21:24 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-24 21:24 UTC (permalink / raw)
To: Víctor Colombo, matheus.ferst, qemu-devel, qemu-ppc
Cc: groug, danielhb413, clg, david
On 2/24/22 09:16, Víctor Colombo wrote:
> Could you please elaborate on how you think using
> float*_compare and its FloatRelation result would work here?
> I noticed do_scalar_cmp modifies CR and sets the FPCC flag, which
> is not what VSX_SCALAR_CMP does. Using that function would require a
> rework.
>
> One option I thought of would be to bring the necessary parts into
> VSX_SCALAR_CMP, something like this:
>
> #define VSX_SCALAR_CMP(op, tp, cmp, fld, svxvc, expr) ...
> r = tp##_compare(xa->fld, xb->fld, &env->fp_status); \
> if (expr) { \
> memset(&t.fld, 0xFF, sizeof(t.fld)); \
> } else if (r == float_relation_unordered) { \
> if (env->fp_status.float_exception_flags & float_flag_invalid_snan) { \
> float_invalid_op_vxsnan(env, GETPC()); \
> if (fpscr_ve == 0 && svxvc) { \
> float_invalid_op_vxvc(env, 0, GETPC()); \
> } \
> } else if (svxvc) { \
> if (tp##_is_quiet_nan(xa->fld, &env->fp_status) || \
> tp##_is_quiet_nan(xb->fld, &env->fp_status)) { \
> float_invalid_op_vxvc(env, 0, GETPC()); \
> } \
> } \
> } \
> ...
> VSX_SCALAR_CMP(XSCMPEQDP, float64, eq, VsrD(0), 0, r == float_relation_equal)
> VSX_SCALAR_CMP(XSCMPGEDP, float64, le, VsrD(0), 1, \
> r == float_relation_equal || r == float_relation_greater)
> VSX_SCALAR_CMP(XSCMPGTDP, float64, lt, VsrD(0), 1, r == float_relation_greater)
I was thinking along the lines of:
    bool r;
    int flags;
    helper_reset_fpstatus(env);
    if (svxvc) {
        r = tp##_##cmp(...);
    } else {
        r = tp##_##cmp##_quiet(...);
    }
    flags = get_float_exception_flags(&env->fp_status);
    if (unlikely(flags & float_flag_invalid)) {
        bool vxvc = svxvc;
        if (flags & float_flag_invalid_snan) {
            float_invalid_op_vxsnan(...);
            vxvc &= fpscr_ve == 0;
        }
        if (vxvc) {
            float_invalid_op_vxvc(...);
        }
    }
    memset(xt, 0, sizeof(*xt));
    memset(&xt->fld, -r, sizeof(xt->fld));
    do_float_check_status(...);
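To make the flag handling in the sketch above concrete, here is a small standalone model, not the actual helper. It assumes the compare has already run and left its exception flags in an int; TOY_INVALID and TOY_INVALID_SNAN are hypothetical stand-ins for float_flag_invalid and float_flag_invalid_snan, ve stands in for fpscr_ve, and toy_decode_invalid and toy_result_mask are illustrative names only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for float_flag_invalid{,_snan}. */
enum { TOY_INVALID = 1, TOY_INVALID_SNAN = 2 };

typedef struct {
    bool vxsnan;   /* would call float_invalid_op_vxsnan() */
    bool vxvc;     /* would call float_invalid_op_vxvc() */
} toy_invalid_ops;

/*
 * Decide which invalid-operation exceptions to raise from the compare's
 * flags alone. svxvc selects whether the instruction can set VXVC at all;
 * ve models the FPSCR[VE] enable bit.
 */
static toy_invalid_ops toy_decode_invalid(int flags, bool svxvc, bool ve)
{
    toy_invalid_ops out = { false, false };

    if (flags & TOY_INVALID) {
        bool vxvc = svxvc;
        if (flags & TOY_INVALID_SNAN) {
            out.vxsnan = true;
            vxvc &= !ve;     /* with an sNaN, VXVC only when VE is 0 */
        }
        out.vxvc = vxvc;
    }
    return out;
}

/* memset with -r writes 0xFF bytes when r is true and 0x00 otherwise. */
static uint64_t toy_result_mask(bool r)
{
    uint64_t fld;
    memset(&fld, -r, sizeof(fld));
    return fld;
}

/* Small self-check that can be called from a test harness or main(). */
static void toy_decode_examples(void)
{
    toy_invalid_ops o = toy_decode_invalid(TOY_INVALID, true, false);
    assert(!o.vxsnan && o.vxvc);          /* qNaN, signaling compare */
    assert(toy_result_mask(true) == UINT64_MAX);
    assert(toy_result_mask(false) == 0);
}
```

The point of the layout is that the quiet-NaN case needs no operand inspection: a signaling compare raises the invalid flag for any NaN, a quiet compare raises it only for an sNaN, so the choice made by svxvc before the compare already encodes whether VXVC applies.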
r~
^ permalink raw reply [flat|nested] 97+ messages in thread
* [PATCH v4 39/47] target/ppc: Implement xscmp{eq,ge,gt}qp
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (37 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 38/47] target/ppc: Refactor VSX_SCALAR_CMP_DP matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-23 0:21 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 40/47] target/ppc: Move xscmp{eq,ge,gt}dp to decodetree matheus.ferst
` (7 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Víctor Colombo, clg,
Matheus Ferst, david
From: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/fpu_helper.c | 4 ++++
target/ppc/helper.h | 3 +++
target/ppc/insn32.decode | 3 +++
target/ppc/translate/vsx-impl.c.inc | 31 +++++++++++++++++++++++++++++
4 files changed, 41 insertions(+)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 5ebbcfe3b7..eb62ae5455 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2311,6 +2311,10 @@ VSX_SCALAR_CMP(xscmpeqdp, float64, eq, VsrD(0), 1, 0)
VSX_SCALAR_CMP(xscmpgedp, float64, le, VsrD(0), 1, 1)
VSX_SCALAR_CMP(xscmpgtdp, float64, lt, VsrD(0), 1, 1)
+VSX_SCALAR_CMP(XSCMPEQQP, float128, eq, f128, 1, 0)
+VSX_SCALAR_CMP(XSCMPGEQP, float128, le, f128, 1, 1)
+VSX_SCALAR_CMP(XSCMPGTQP, float128, lt, f128, 1, 1)
+
void helper_xscmpexpdp(CPUPPCState *env, uint32_t opcode,
ppc_vsr_t *xa, ppc_vsr_t *xb)
{
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index ee2a89b89d..e44de15d07 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -364,6 +364,9 @@ DEF_HELPER_5(XSNMSUBDP, void, env, vsr, vsr, vsr, vsr)
DEF_HELPER_4(xscmpeqdp, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xscmpgtdp, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xscmpgedp, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSCMPEQQP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSCMPGTQP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSCMPGEQP, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xscmpexpdp, void, env, i32, vsr, vsr)
DEF_HELPER_4(xscmpexpqp, void, env, i32, vsr, vsr)
DEF_HELPER_4(xscmpodp, void, env, i32, vsr, vsr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 2617ab8ca4..d5c3bd13f7 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -662,6 +662,9 @@ XSMAXCDP 111100 ..... ..... ..... 10000000 ... @XX3
XSMINCDP 111100 ..... ..... ..... 10001000 ... @XX3
XSMAXJDP 111100 ..... ..... ..... 10010000 ... @XX3
XSMINJDP 111100 ..... ..... ..... 10011000 ... @XX3
+XSCMPEQQP 111111 ..... ..... ..... 0001000100 - @X
+XSCMPGEQP 111111 ..... ..... ..... 0011000100 - @X
+XSCMPGTQP 111111 ..... ..... ..... 0011100100 - @X
## VSX Binary Floating-Point Convert Instructions
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 751b941bac..f0d02e61fc 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -2499,6 +2499,37 @@ TRANS(XSMINCDP, do_xsmaxmincjdp, gen_helper_xsmincdp)
TRANS(XSMAXJDP, do_xsmaxmincjdp, gen_helper_xsmaxjdp)
TRANS(XSMINJDP, do_xsmaxmincjdp, gen_helper_xsminjdp)
+static bool do_helper_X(arg_X *a,
+ void (*helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
+{
+ TCGv_ptr rt, ra, rb;
+
+ rt = gen_avr_ptr(a->rt);
+ ra = gen_avr_ptr(a->ra);
+ rb = gen_avr_ptr(a->rb);
+
+ helper(cpu_env, rt, ra, rb);
+
+ tcg_temp_free_ptr(rt);
+ tcg_temp_free_ptr(ra);
+ tcg_temp_free_ptr(rb);
+
+ return true;
+}
+
+static bool do_xscmpqp(DisasContext *ctx, arg_X *a,
+ void (*helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
+{
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VSX(ctx);
+
+ return do_helper_X(a, helper);
+}
+
+TRANS(XSCMPEQQP, do_xscmpqp, gen_helper_XSCMPEQQP)
+TRANS(XSCMPGEQP, do_xscmpqp, gen_helper_XSCMPGEQP)
+TRANS(XSCMPGTQP, do_xscmpqp, gen_helper_XSCMPGTQP)
+
#undef GEN_XX2FORM
#undef GEN_XX3FORM
#undef GEN_XX2IFORM
--
2.25.1
^ permalink raw reply related [flat|nested] 97+ messages in thread
* Re: [PATCH v4 39/47] target/ppc: Implement xscmp{eq,ge,gt}qp
2022-02-22 14:36 ` [PATCH v4 39/47] target/ppc: Implement xscmp{eq,ge,gt}qp matheus.ferst
@ 2022-02-23 0:21 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-23 0:21 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: Víctor Colombo, groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Víctor Colombo<victor.colombo@eldorado.org.br>
>
> Signed-off-by: Víctor Colombo<victor.colombo@eldorado.org.br>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> target/ppc/fpu_helper.c | 4 ++++
> target/ppc/helper.h | 3 +++
> target/ppc/insn32.decode | 3 +++
> target/ppc/translate/vsx-impl.c.inc | 31 +++++++++++++++++++++++++++++
> 4 files changed, 41 insertions(+)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 97+ messages in thread
* [PATCH v4 40/47] target/ppc: Move xscmp{eq,ge,gt}dp to decodetree
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (38 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 39/47] target/ppc: Implement xscmp{eq,ge,gt}qp matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-23 0:22 ` [PATCH v4 40/47] target/ppc: Move xscmp{eq, ge, gt}dp " Richard Henderson
2022-02-22 14:36 ` [PATCH v4 41/47] target/ppc: Move xs{max, min}[cj]dp to use do_helper_XX3 matheus.ferst
` (6 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Víctor Colombo, clg,
Matheus Ferst, david
From: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/fpu_helper.c | 7 +++----
target/ppc/helper.h | 6 +++---
target/ppc/insn32.decode | 3 +++
target/ppc/translate/vsx-impl.c.inc | 28 +++++++++++++++++++++++++---
target/ppc/translate/vsx-ops.c.inc | 3 ---
5 files changed, 34 insertions(+), 13 deletions(-)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index eb62ae5455..bfe49a63f8 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2307,10 +2307,9 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
do_float_check_status(env, GETPC()); \
}
-VSX_SCALAR_CMP(xscmpeqdp, float64, eq, VsrD(0), 1, 0)
-VSX_SCALAR_CMP(xscmpgedp, float64, le, VsrD(0), 1, 1)
-VSX_SCALAR_CMP(xscmpgtdp, float64, lt, VsrD(0), 1, 1)
-
+VSX_SCALAR_CMP(XSCMPEQDP, float64, eq, VsrD(0), 1, 0)
+VSX_SCALAR_CMP(XSCMPGEDP, float64, le, VsrD(0), 1, 1)
+VSX_SCALAR_CMP(XSCMPGTDP, float64, lt, VsrD(0), 1, 1)
VSX_SCALAR_CMP(XSCMPEQQP, float128, eq, f128, 1, 0)
VSX_SCALAR_CMP(XSCMPGEQP, float128, le, f128, 1, 1)
VSX_SCALAR_CMP(XSCMPGTQP, float128, lt, f128, 1, 1)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index e44de15d07..8a57a48200 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -361,9 +361,9 @@ DEF_HELPER_5(XSMADDDP, void, env, vsr, vsr, vsr, vsr)
DEF_HELPER_5(XSMSUBDP, void, env, vsr, vsr, vsr, vsr)
DEF_HELPER_5(XSNMADDDP, void, env, vsr, vsr, vsr, vsr)
DEF_HELPER_5(XSNMSUBDP, void, env, vsr, vsr, vsr, vsr)
-DEF_HELPER_4(xscmpeqdp, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xscmpgtdp, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xscmpgedp, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSCMPEQDP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSCMPGTDP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSCMPGEDP, void, env, vsr, vsr, vsr)
DEF_HELPER_4(XSCMPEQQP, void, env, vsr, vsr, vsr)
DEF_HELPER_4(XSCMPGTQP, void, env, vsr, vsr, vsr)
DEF_HELPER_4(XSCMPGEQP, void, env, vsr, vsr, vsr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index d5c3bd13f7..a6e3855958 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -662,6 +662,9 @@ XSMAXCDP 111100 ..... ..... ..... 10000000 ... @XX3
XSMINCDP 111100 ..... ..... ..... 10001000 ... @XX3
XSMAXJDP 111100 ..... ..... ..... 10010000 ... @XX3
XSMINJDP 111100 ..... ..... ..... 10011000 ... @XX3
+XSCMPEQDP 111100 ..... ..... ..... 00000011 ... @XX3
+XSCMPGEDP 111100 ..... ..... ..... 00010011 ... @XX3
+XSCMPGTDP 111100 ..... ..... ..... 00001011 ... @XX3
XSCMPEQQP 111111 ..... ..... ..... 0001000100 - @X
XSCMPGEQP 111111 ..... ..... ..... 0011000100 - @X
XSCMPGTQP 111111 ..... ..... ..... 0011100100 - @X
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index f0d02e61fc..29f04a4178 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1052,9 +1052,6 @@ GEN_VSX_HELPER_X2(xssqrtdp, 0x16, 0x04, 0, PPC2_VSX)
GEN_VSX_HELPER_X2(xsrsqrtedp, 0x14, 0x04, 0, PPC2_VSX)
GEN_VSX_HELPER_X2_AB(xstdivdp, 0x14, 0x07, 0, PPC2_VSX)
GEN_VSX_HELPER_X1(xstsqrtdp, 0x14, 0x06, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xscmpeqdp, 0x0C, 0x00, 0, PPC2_ISA300)
-GEN_VSX_HELPER_X3(xscmpgtdp, 0x0C, 0x01, 0, PPC2_ISA300)
-GEN_VSX_HELPER_X3(xscmpgedp, 0x0C, 0x02, 0, PPC2_ISA300)
GEN_VSX_HELPER_X2_AB(xscmpexpdp, 0x0C, 0x07, 0, PPC2_ISA300)
GEN_VSX_HELPER_R2_AB(xscmpexpqp, 0x04, 0x05, 0, PPC2_ISA300)
GEN_VSX_HELPER_X2_AB(xscmpodp, 0x0C, 0x05, 0, PPC2_VSX)
@@ -2473,6 +2470,31 @@ TRANS(XXBLENDVH, do_xxblendv, MO_16)
TRANS(XXBLENDVW, do_xxblendv, MO_32)
TRANS(XXBLENDVD, do_xxblendv, MO_64)
+static bool do_helper_XX3(DisasContext *ctx, arg_XX3 *a,
+ void (*helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
+{
+ TCGv_ptr xt, xa, xb;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+ REQUIRE_VSX(ctx);
+
+ xt = gen_vsr_ptr(a->xt);
+ xa = gen_vsr_ptr(a->xa);
+ xb = gen_vsr_ptr(a->xb);
+
+ helper(cpu_env, xt, xa, xb);
+
+ tcg_temp_free_ptr(xt);
+ tcg_temp_free_ptr(xa);
+ tcg_temp_free_ptr(xb);
+
+ return true;
+}
+
+TRANS(XSCMPEQDP, do_helper_XX3, gen_helper_XSCMPEQDP)
+TRANS(XSCMPGEDP, do_helper_XX3, gen_helper_XSCMPGEDP)
+TRANS(XSCMPGTDP, do_helper_XX3, gen_helper_XSCMPGTDP)
+
static bool do_xsmaxmincjdp(DisasContext *ctx, arg_XX3 *a,
void (*helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
{
diff --git a/target/ppc/translate/vsx-ops.c.inc b/target/ppc/translate/vsx-ops.c.inc
index 34310c1fb5..b8fd116728 100644
--- a/target/ppc/translate/vsx-ops.c.inc
+++ b/target/ppc/translate/vsx-ops.c.inc
@@ -186,9 +186,6 @@ GEN_XX2FORM(xssqrtdp, 0x16, 0x04, PPC2_VSX),
GEN_XX2FORM(xsrsqrtedp, 0x14, 0x04, PPC2_VSX),
GEN_XX3FORM(xstdivdp, 0x14, 0x07, PPC2_VSX),
GEN_XX2FORM(xstsqrtdp, 0x14, 0x06, PPC2_VSX),
-GEN_XX3FORM(xscmpeqdp, 0x0C, 0x00, PPC2_ISA300),
-GEN_XX3FORM(xscmpgtdp, 0x0C, 0x01, PPC2_ISA300),
-GEN_XX3FORM(xscmpgedp, 0x0C, 0x02, PPC2_ISA300),
GEN_XX3FORM(xscmpexpdp, 0x0C, 0x07, PPC2_ISA300),
GEN_VSX_XFORM_300(xscmpexpqp, 0x04, 0x05, 0x00600001),
GEN_XX2IFORM(xscmpodp, 0x0C, 0x05, PPC2_VSX),
--
2.25.1
^ permalink raw reply related [flat|nested] 97+ messages in thread
* Re: [PATCH v4 40/47] target/ppc: Move xscmp{eq, ge, gt}dp to decodetree
2022-02-22 14:36 ` [PATCH v4 40/47] target/ppc: Move xscmp{eq,ge,gt}dp to decodetree matheus.ferst
@ 2022-02-23 0:22 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-23 0:22 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: Víctor Colombo, groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Víctor Colombo<victor.colombo@eldorado.org.br>
>
> Signed-off-by: Víctor Colombo<victor.colombo@eldorado.org.br>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> target/ppc/fpu_helper.c | 7 +++----
> target/ppc/helper.h | 6 +++---
> target/ppc/insn32.decode | 3 +++
> target/ppc/translate/vsx-impl.c.inc | 28 +++++++++++++++++++++++++---
> target/ppc/translate/vsx-ops.c.inc | 3 ---
> 5 files changed, 34 insertions(+), 13 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 97+ messages in thread
* [PATCH v4 41/47] target/ppc: Move xs{max, min}[cj]dp to use do_helper_XX3
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (39 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 40/47] target/ppc: Move xscmp{eq,ge,gt}dp to decodetree matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-23 0:23 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 42/47] target/ppc: Refactor VSX_MAX_MINC helper matheus.ferst
` (5 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Víctor Colombo, clg,
Matheus Ferst, david
From: Víctor Colombo <victor.colombo@eldorado.org.br>
Also, fix these instruction names not being capitalized.
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/fpu_helper.c | 8 ++++----
target/ppc/helper.h | 8 ++++----
target/ppc/translate/vsx-impl.c.inc | 30 ++++-------------------------
3 files changed, 12 insertions(+), 34 deletions(-)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index bfe49a63f8..7ae576cba9 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2568,8 +2568,8 @@ void helper_##name(CPUPPCState *env, \
} \
} \
-VSX_MAX_MINC(xsmaxcdp, 1);
-VSX_MAX_MINC(xsmincdp, 0);
+VSX_MAX_MINC(XSMAXCDP, 1);
+VSX_MAX_MINC(XSMINCDP, 0);
#define VSX_MAX_MINJ(name, max) \
void helper_##name(CPUPPCState *env, \
@@ -2623,8 +2623,8 @@ void helper_##name(CPUPPCState *env, \
} \
} \
-VSX_MAX_MINJ(xsmaxjdp, 1);
-VSX_MAX_MINJ(xsminjdp, 0);
+VSX_MAX_MINJ(XSMAXJDP, 1);
+VSX_MAX_MINJ(XSMINJDP, 0);
/*
* VSX_CMP - VSX floating point compare
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 8a57a48200..3a1cb9abf5 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -375,10 +375,10 @@ DEF_HELPER_4(xscmpoqp, void, env, i32, vsr, vsr)
DEF_HELPER_4(xscmpuqp, void, env, i32, vsr, vsr)
DEF_HELPER_4(xsmaxdp, void, env, vsr, vsr, vsr)
DEF_HELPER_4(xsmindp, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xsmaxcdp, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xsmincdp, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xsmaxjdp, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xsminjdp, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSMAXCDP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSMINCDP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSMAXJDP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSMINJDP, void, env, vsr, vsr, vsr)
DEF_HELPER_3(xscvdphp, void, env, vsr, vsr)
DEF_HELPER_4(xscvdpqp, void, env, i32, vsr, vsr)
DEF_HELPER_3(xscvdpsp, void, env, vsr, vsr)
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 29f04a4178..730f073cf5 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -2494,32 +2494,10 @@ static bool do_helper_XX3(DisasContext *ctx, arg_XX3 *a,
TRANS(XSCMPEQDP, do_helper_XX3, gen_helper_XSCMPEQDP)
TRANS(XSCMPGEDP, do_helper_XX3, gen_helper_XSCMPGEDP)
TRANS(XSCMPGTDP, do_helper_XX3, gen_helper_XSCMPGTDP)
-
-static bool do_xsmaxmincjdp(DisasContext *ctx, arg_XX3 *a,
- void (*helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
-{
- TCGv_ptr xt, xa, xb;
-
- REQUIRE_INSNS_FLAGS2(ctx, ISA300);
- REQUIRE_VSX(ctx);
-
- xt = gen_vsr_ptr(a->xt);
- xa = gen_vsr_ptr(a->xa);
- xb = gen_vsr_ptr(a->xb);
-
- helper(cpu_env, xt, xa, xb);
-
- tcg_temp_free_ptr(xt);
- tcg_temp_free_ptr(xa);
- tcg_temp_free_ptr(xb);
-
- return true;
-}
-
-TRANS(XSMAXCDP, do_xsmaxmincjdp, gen_helper_xsmaxcdp)
-TRANS(XSMINCDP, do_xsmaxmincjdp, gen_helper_xsmincdp)
-TRANS(XSMAXJDP, do_xsmaxmincjdp, gen_helper_xsmaxjdp)
-TRANS(XSMINJDP, do_xsmaxmincjdp, gen_helper_xsminjdp)
+TRANS(XSMAXCDP, do_helper_XX3, gen_helper_XSMAXCDP)
+TRANS(XSMINCDP, do_helper_XX3, gen_helper_XSMINCDP)
+TRANS(XSMAXJDP, do_helper_XX3, gen_helper_XSMAXJDP)
+TRANS(XSMINJDP, do_helper_XX3, gen_helper_XSMINJDP)
static bool do_helper_X(arg_X *a,
void (*helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
--
2.25.1
^ permalink raw reply related [flat|nested] 97+ messages in thread
* Re: [PATCH v4 41/47] target/ppc: Move xs{max, min}[cj]dp to use do_helper_XX3
2022-02-22 14:36 ` [PATCH v4 41/47] target/ppc: Move xs{max, min}[cj]dp to use do_helper_XX3 matheus.ferst
@ 2022-02-23 0:23 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-23 0:23 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: Víctor Colombo, groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Víctor Colombo<victor.colombo@eldorado.org.br>
>
> Also, fix these instruction names not being capitalized.
>
> Signed-off-by: Víctor Colombo<victor.colombo@eldorado.org.br>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> target/ppc/fpu_helper.c | 8 ++++----
> target/ppc/helper.h | 8 ++++----
> target/ppc/translate/vsx-impl.c.inc | 30 ++++-------------------------
> 3 files changed, 12 insertions(+), 34 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply [flat|nested] 97+ messages in thread
* [PATCH v4 42/47] target/ppc: Refactor VSX_MAX_MINC helper
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (40 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 41/47] target/ppc: Move xs{max, min}[cj]dp to use do_helper_XX3 matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-23 0:40 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 43/47] target/ppc: Implement xs{max,min}cqp matheus.ferst
` (4 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Víctor Colombo, clg,
Matheus Ferst, david
From: Víctor Colombo <victor.colombo@eldorado.org.br>
Refactor xs{max,min}cdp VSX_MAX_MINC helper to prepare for
xs{max,min}cqp implementation.
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/fpu_helper.c | 23 +++++++++--------------
1 file changed, 9 insertions(+), 14 deletions(-)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 7ae576cba9..f6eb8bf2d8 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2536,27 +2536,22 @@ VSX_MAX_MIN(xsmindp, minnum, 1, float64, VsrD(0))
VSX_MAX_MIN(xvmindp, minnum, 2, float64, VsrD(i))
VSX_MAX_MIN(xvminsp, minnum, 4, float32, VsrW(i))
-#define VSX_MAX_MINC(name, max) \
+#define VSX_MAX_MINC(name, op, tp, fld) \
void helper_##name(CPUPPCState *env, \
ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb) \
{ \
ppc_vsr_t t = { }; \
bool vxsnan_flag = false, vex_flag = false; \
\
- if (unlikely(float64_is_any_nan(xa->VsrD(0)) || \
- float64_is_any_nan(xb->VsrD(0)))) { \
- if (float64_is_signaling_nan(xa->VsrD(0), &env->fp_status) || \
- float64_is_signaling_nan(xb->VsrD(0), &env->fp_status)) { \
+ if (unlikely(tp##_is_any_nan(xa->fld) || \
+ tp##_is_any_nan(xb->fld))) { \
+ if (tp##_is_signaling_nan(xa->fld, &env->fp_status) || \
+ tp##_is_signaling_nan(xb->fld, &env->fp_status)) { \
vxsnan_flag = true; \
} \
- t.VsrD(0) = xb->VsrD(0); \
- } else if ((max && \
- !float64_lt(xa->VsrD(0), xb->VsrD(0), &env->fp_status)) || \
- (!max && \
- float64_lt(xa->VsrD(0), xb->VsrD(0), &env->fp_status))) { \
- t.VsrD(0) = xa->VsrD(0); \
+ t.fld = xb->fld; \
} else { \
- t.VsrD(0) = xb->VsrD(0); \
+ t.fld = tp##_##op(xa->fld, xb->fld, &env->fp_status); \
} \
\
vex_flag = fpscr_ve & vxsnan_flag; \
@@ -2568,8 +2563,8 @@ void helper_##name(CPUPPCState *env, \
} \
} \
-VSX_MAX_MINC(XSMAXCDP, 1);
-VSX_MAX_MINC(XSMINCDP, 0);
+VSX_MAX_MINC(XSMAXCDP, maxnum, float64, VsrD(0));
+VSX_MAX_MINC(XSMINCDP, minnum, float64, VsrD(0));
#define VSX_MAX_MINJ(name, max) \
void helper_##name(CPUPPCState *env, \
--
2.25.1
^ permalink raw reply related [flat|nested] 97+ messages in thread
* Re: [PATCH v4 42/47] target/ppc: Refactor VSX_MAX_MINC helper
2022-02-22 14:36 ` [PATCH v4 42/47] target/ppc: Refactor VSX_MAX_MINC helper matheus.ferst
@ 2022-02-23 0:40 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-23 0:40 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: Víctor Colombo, groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> -#define VSX_MAX_MINC(name, max) \
> +#define VSX_MAX_MINC(name, op, tp, fld) \
> void helper_##name(CPUPPCState *env, \
> ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb) \
> { \
> ppc_vsr_t t = { }; \
> bool vxsnan_flag = false, vex_flag = false; \
> \
> - if (unlikely(float64_is_any_nan(xa->VsrD(0)) || \
> - float64_is_any_nan(xb->VsrD(0)))) { \
> - if (float64_is_signaling_nan(xa->VsrD(0), &env->fp_status) || \
> - float64_is_signaling_nan(xb->VsrD(0), &env->fp_status)) { \
> + if (unlikely(tp##_is_any_nan(xa->fld) || \
> + tp##_is_any_nan(xb->fld))) { \
> + if (tp##_is_signaling_nan(xa->fld, &env->fp_status) || \
> + tp##_is_signaling_nan(xb->fld, &env->fp_status)) { \
> vxsnan_flag = true; \
> } \
> - t.VsrD(0) = xb->VsrD(0); \
> - } else if ((max && \
> - !float64_lt(xa->VsrD(0), xb->VsrD(0), &env->fp_status)) || \
> - (!max && \
> - float64_lt(xa->VsrD(0), xb->VsrD(0), &env->fp_status))) { \
> - t.VsrD(0) = xa->VsrD(0); \
> + t.fld = xb->fld; \
> } else { \
> - t.VsrD(0) = xb->VsrD(0); \
> + t.fld = tp##_##op(xa->fld, xb->fld, &env->fp_status); \
> } \
> \
> vex_flag = fpscr_ve & vxsnan_flag; \
I think it would be simpler to use the result of the comparison to handle NaNs:
    bool first;
    if (max) {
        first = tp##_le_quiet(xb->fld, xa->fld, status);
    } else {
        first = tp##_lt_quiet(xa->fld, xb->fld, status);
    }
    if (first) {
        t.fld = xa->fld;
    } else {
        t.fld = xb->fld;
        if (flags & float_flag_invalid_snan) {
            float_invalid_op_vxsnan(env, retaddr);
        }
    }
    *xt = t;
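A standalone toy version of this selection, for illustration only: the toy_* helpers below are hypothetical replacements for the tp##_le_quiet and tp##_lt_quiet softfloat calls, operating on binary64 bit patterns, and TOY_INVALID_SNAN stands in for float_flag_invalid_snan. A quiet compare with any NaN operand returns false, so the result falls through to xb without a separate is_any_nan test, and the sNaN flag tells the caller whether VXSNAN must be raised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { TOY_INVALID_SNAN = 2 };   /* stand-in for float_flag_invalid_snan */
typedef struct { int flags; } toy_status;

#define TOY_ONE  0x3ff0000000000000ULL
#define TOY_TWO  0x4000000000000000ULL
#define TOY_QNAN 0x7ff8000000000000ULL
#define TOY_SNAN 0x7ff0000000000001ULL

static bool toy_is_nan(uint64_t a)
{
    return (a & 0x7fffffffffffffffULL) > 0x7ff0000000000000ULL;
}

/* Returns false for any NaN operand; records an sNaN in the flags. */
static bool toy_ordered(uint64_t a, uint64_t b, toy_status *s)
{
    if (toy_is_nan(a) || toy_is_nan(b)) {
        bool a_snan = toy_is_nan(a) && !(a & 0x0008000000000000ULL);
        bool b_snan = toy_is_nan(b) && !(b & 0x0008000000000000ULL);
        if (a_snan || b_snan) {
            s->flags |= TOY_INVALID_SNAN;
        }
        return false;
    }
    return true;
}

static double toy_to_double(uint64_t a)
{
    double d;
    memcpy(&d, &a, sizeof(d));
    return d;
}

static bool toy_lt_quiet(uint64_t a, uint64_t b, toy_status *s)
{
    return toy_ordered(a, b, s) && toy_to_double(a) < toy_to_double(b);
}

static bool toy_le_quiet(uint64_t a, uint64_t b, toy_status *s)
{
    return toy_ordered(a, b, s) && toy_to_double(a) <= toy_to_double(b);
}

/* Pick xa when the compare says so, otherwise xb (including for NaNs). */
static uint64_t toy_maxmin_c(uint64_t xa, uint64_t xb, bool max,
                             toy_status *s)
{
    bool first = max ? toy_le_quiet(xb, xa, s) : toy_lt_quiet(xa, xb, s);
    return first ? xa : xb;
}

/* Small self-check that can be called from a test harness or main(). */
static void toy_maxmin_examples(void)
{
    toy_status s = { 0 };
    assert(toy_maxmin_c(TOY_ONE, TOY_TWO, true, &s) == TOY_TWO);
    assert(toy_maxmin_c(TOY_ONE, TOY_TWO, false, &s) == TOY_ONE);
    assert(s.flags == 0);
}
```

After the call, a caller modelled on the sketch above would check s->flags for TOY_INVALID_SNAN and raise VXSNAN; that flag can only be set on the path where xb was selected.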
r~
^ permalink raw reply [flat|nested] 97+ messages in thread
* [PATCH v4 43/47] target/ppc: Implement xs{max,min}cqp
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (41 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 42/47] target/ppc: Refactor VSX_MAX_MINC helper matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-23 0:41 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 44/47] target/ppc: Implement xvcvbf16spn and xvcvspbf16 instructions matheus.ferst
` (3 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Víctor Colombo, clg,
Matheus Ferst, david
From: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/fpu_helper.c | 2 ++
target/ppc/helper.h | 2 ++
target/ppc/insn32.decode | 3 +++
target/ppc/translate/vsx-impl.c.inc | 2 ++
4 files changed, 9 insertions(+)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index f6eb8bf2d8..7773333bd7 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2565,6 +2565,8 @@ void helper_##name(CPUPPCState *env, \
VSX_MAX_MINC(XSMAXCDP, maxnum, float64, VsrD(0));
VSX_MAX_MINC(XSMINCDP, minnum, float64, VsrD(0));
+VSX_MAX_MINC(XSMAXCQP, maxnum, float128, f128);
+VSX_MAX_MINC(XSMINCQP, minnum, float128, f128);
#define VSX_MAX_MINJ(name, max) \
void helper_##name(CPUPPCState *env, \
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 3a1cb9abf5..d3af130dc2 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -379,6 +379,8 @@ DEF_HELPER_4(XSMAXCDP, void, env, vsr, vsr, vsr)
DEF_HELPER_4(XSMINCDP, void, env, vsr, vsr, vsr)
DEF_HELPER_4(XSMAXJDP, void, env, vsr, vsr, vsr)
DEF_HELPER_4(XSMINJDP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSMAXCQP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSMINCQP, void, env, vsr, vsr, vsr)
DEF_HELPER_3(xscvdphp, void, env, vsr, vsr)
DEF_HELPER_4(xscvdpqp, void, env, i32, vsr, vsr)
DEF_HELPER_3(xscvdpsp, void, env, vsr, vsr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index a6e3855958..892d4bfd84 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -662,6 +662,9 @@ XSMAXCDP 111100 ..... ..... ..... 10000000 ... @XX3
XSMINCDP 111100 ..... ..... ..... 10001000 ... @XX3
XSMAXJDP 111100 ..... ..... ..... 10010000 ... @XX3
XSMINJDP 111100 ..... ..... ..... 10011000 ... @XX3
+XSMAXCQP 111111 ..... ..... ..... 1010100100 - @X
+XSMINCQP 111111 ..... ..... ..... 1011100100 - @X
+
XSCMPEQDP 111100 ..... ..... ..... 00000011 ... @XX3
XSCMPGEDP 111100 ..... ..... ..... 00010011 ... @XX3
XSCMPGTDP 111100 ..... ..... ..... 00001011 ... @XX3
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 730f073cf5..0546dc736e 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -2529,6 +2529,8 @@ static bool do_xscmpqp(DisasContext *ctx, arg_X *a,
TRANS(XSCMPEQQP, do_xscmpqp, gen_helper_XSCMPEQQP)
TRANS(XSCMPGEQP, do_xscmpqp, gen_helper_XSCMPGEQP)
TRANS(XSCMPGTQP, do_xscmpqp, gen_helper_XSCMPGTQP)
+TRANS(XSMAXCQP, do_xscmpqp, gen_helper_XSMAXCQP)
+TRANS(XSMINCQP, do_xscmpqp, gen_helper_XSMINCQP)
#undef GEN_XX2FORM
#undef GEN_XX3FORM
--
2.25.1
* Re: [PATCH v4 43/47] target/ppc: Implement xs{max,min}cqp
2022-02-22 14:36 ` [PATCH v4 43/47] target/ppc: Implement xs{max,min}cqp matheus.ferst
@ 2022-02-23 0:41 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-23 0:41 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: Víctor Colombo, groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Víctor Colombo<victor.colombo@eldorado.org.br>
>
> Signed-off-by: Víctor Colombo<victor.colombo@eldorado.org.br>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> target/ppc/fpu_helper.c | 2 ++
> target/ppc/helper.h | 2 ++
> target/ppc/insn32.decode | 3 +++
> target/ppc/translate/vsx-impl.c.inc | 2 ++
> 4 files changed, 9 insertions(+)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
* [PATCH v4 44/47] target/ppc: Implement xvcvbf16spn and xvcvspbf16 instructions
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (42 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 43/47] target/ppc: Implement xs{max,min}cqp matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-23 3:08 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 45/47] target/ppc: implement plxsd/pstxsd matheus.ferst
` (2 subsequent siblings)
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, Víctor Colombo, clg,
Matheus Ferst, david
From: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/fpu_helper.c | 21 +++++++++++++++++++
target/ppc/helper.h | 1 +
target/ppc/insn32.decode | 11 +++++++---
target/ppc/translate/vsx-impl.c.inc | 31 ++++++++++++++++++++++++++++-
4 files changed, 60 insertions(+), 4 deletions(-)
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 7773333bd7..d77900fff1 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2790,6 +2790,27 @@ VSX_CVT_FP_TO_FP_HP(xscvhpdp, 1, float16, float64, VsrH(3), VsrD(0), 1)
VSX_CVT_FP_TO_FP_HP(xvcvsphp, 4, float32, float16, VsrW(i), VsrH(2 * i + 1), 0)
VSX_CVT_FP_TO_FP_HP(xvcvhpsp, 4, float16, float32, VsrH(2 * i + 1), VsrW(i), 0)
+void helper_XVCVSPBF16(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb)
+{
+ ppc_vsr_t t = { };
+ int i;
+
+ helper_reset_fpstatus(env);
+ for (i = 0; i < 4; i++) {
+ if (unlikely(float32_is_signaling_nan(xb->VsrW(i), &env->fp_status))) {
+ float_invalid_op_vxsnan(env, GETPC());
+ t.VsrH(2 * i + 1) = float32_to_bfloat16(
+ float32_snan_to_qnan(xb->VsrW(i)), &env->fp_status);
+ } else {
+ t.VsrH(2 * i + 1) =
+ float32_to_bfloat16(xb->VsrW(i), &env->fp_status);
+ }
+ }
+
+ *xt = t;
+ do_float_check_status(env, GETPC());
+}
+
void helper_XSCVQPDP(CPUPPCState *env, uint32_t ro, ppc_vsr_t *xt,
ppc_vsr_t *xb)
{
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index d3af130dc2..805a5046d8 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -494,6 +494,7 @@ DEF_HELPER_FLAGS_4(xvcmpnesp, TCG_CALL_NO_RWG, i32, env, vsr, vsr, vsr)
DEF_HELPER_3(xvcvspdp, void, env, vsr, vsr)
DEF_HELPER_3(xvcvsphp, void, env, vsr, vsr)
DEF_HELPER_3(xvcvhpsp, void, env, vsr, vsr)
+DEF_HELPER_3(XVCVSPBF16, void, env, vsr, vsr)
DEF_HELPER_3(xvcvspsxds, void, env, vsr, vsr)
DEF_HELPER_3(xvcvspsxws, void, env, vsr, vsr)
DEF_HELPER_3(xvcvspuxds, void, env, vsr, vsr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 892d4bfd84..8964898f20 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -152,8 +152,11 @@
%xx_xb 1:1 11:5
%xx_xa 2:1 16:5
%xx_xc 3:1 6:5
-&XX2 xt xb uim:uint8_t
-@XX2 ...... ..... ... uim:2 ..... ......... .. &XX2 xt=%xx_xt xb=%xx_xb
+&XX2 xt xb
+@XX2 ...... ..... ..... ..... ......... .. &XX2 xt=%xx_xt xb=%xx_xb
+
+&XX2_uim2 xt xb uim:uint8_t
+@XX2_uim2 ...... ..... ... uim:2 ..... ......... .. &XX2_uim2 xt=%xx_xt xb=%xx_xb
&XX2_bf_xb bf xb
@XX2_bf_xb ...... bf:3 .. ..... ..... ......... . . &XX2_bf_xb xb=%xx_xb
@@ -635,7 +638,7 @@ XSNMSUBQP 111111 ..... ..... ..... 0111100100 . @X_rc
## VSX splat instruction
XXSPLTIB 111100 ..... 00 ........ 0101101000 . @X_imm8
-XXSPLTW 111100 ..... ---.. ..... 010100100 . . @XX2
+XXSPLTW 111100 ..... ---.. ..... 010100100 . . @XX2_uim2
## VSX Permute Instructions
@@ -675,6 +678,8 @@ XSCMPGTQP 111111 ..... ..... ..... 0011100100 - @X
## VSX Binary Floating-Point Convert Instructions
XSCVQPDP 111111 ..... 10100 ..... 1101000100 . @X_tb_rc
+XVCVBF16SPN 111100 ..... 10000 ..... 111011011 .. @XX2
+XVCVSPBF16 111100 ..... 10001 ..... 111011011 .. @XX2
## VSX Vector Test Least-Significant Bit by Byte Instruction
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 0546dc736e..2930537b8e 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1576,7 +1576,7 @@ static bool trans_XXSEL(DisasContext *ctx, arg_XX4 *a)
return true;
}
-static bool trans_XXSPLTW(DisasContext *ctx, arg_XX2 *a)
+static bool trans_XXSPLTW(DisasContext *ctx, arg_XX2_uim2 *a)
{
int tofs, bofs;
@@ -2532,6 +2532,35 @@ TRANS(XSCMPGTQP, do_xscmpqp, gen_helper_XSCMPGTQP)
TRANS(XSMAXCQP, do_xscmpqp, gen_helper_XSMAXCQP)
TRANS(XSMINCQP, do_xscmpqp, gen_helper_XSMINCQP)
+static bool trans_XVCVSPBF16(DisasContext *ctx, arg_XX2 *a)
+{
+ TCGv_ptr xt, xb;
+
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VSX(ctx);
+
+ xt = gen_vsr_ptr(a->xt);
+ xb = gen_vsr_ptr(a->xb);
+
+ gen_helper_XVCVSPBF16(cpu_env, xt, xb);
+
+ tcg_temp_free_ptr(xt);
+ tcg_temp_free_ptr(xb);
+
+ return true;
+}
+
+static bool trans_XVCVBF16SPN(DisasContext *ctx, arg_XX2 *a)
+{
+ REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+ REQUIRE_VSX(ctx);
+
+ tcg_gen_gvec_shli(MO_32, vsr_full_offset(a->xt), vsr_full_offset(a->xb),
+ 16, 16, 16);
+
+ return true;
+}
+
#undef GEN_XX2FORM
#undef GEN_XX3FORM
#undef GEN_XX2IFORM
--
2.25.1
* Re: [PATCH v4 44/47] target/ppc: Implement xvcvbf16spn and xvcvspbf16 instructions
2022-02-22 14:36 ` [PATCH v4 44/47] target/ppc: Implement xvcvbf16spn and xvcvspbf16 instructions matheus.ferst
@ 2022-02-23 3:08 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-23 3:08 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: Víctor Colombo, groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Víctor Colombo <victor.colombo@eldorado.org.br>
>
> Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br>
> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
> ---
> target/ppc/fpu_helper.c | 21 +++++++++++++++++++
> target/ppc/helper.h | 1 +
> target/ppc/insn32.decode | 11 +++++++---
> target/ppc/translate/vsx-impl.c.inc | 31 ++++++++++++++++++++++++++++-
> 4 files changed, 60 insertions(+), 4 deletions(-)
>
> diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
> index 7773333bd7..d77900fff1 100644
> --- a/target/ppc/fpu_helper.c
> +++ b/target/ppc/fpu_helper.c
> @@ -2790,6 +2790,27 @@ VSX_CVT_FP_TO_FP_HP(xscvhpdp, 1, float16, float64, VsrH(3), VsrD(0), 1)
> VSX_CVT_FP_TO_FP_HP(xvcvsphp, 4, float32, float16, VsrW(i), VsrH(2 * i + 1), 0)
> VSX_CVT_FP_TO_FP_HP(xvcvhpsp, 4, float16, float32, VsrH(2 * i + 1), VsrW(i), 0)
>
> +void helper_XVCVSPBF16(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb)
> +{
> + ppc_vsr_t t = { };
> + int i;
> +
> + helper_reset_fpstatus(env);
> + for (i = 0; i < 4; i++) {
> + if (unlikely(float32_is_signaling_nan(xb->VsrW(i), &env->fp_status))) {
> + float_invalid_op_vxsnan(env, GETPC());
> + t.VsrH(2 * i + 1) = float32_to_bfloat16(
> + float32_snan_to_qnan(xb->VsrW(i)), &env->fp_status);
> + } else {
> + t.VsrH(2 * i + 1) =
> + float32_to_bfloat16(xb->VsrW(i), &env->fp_status);
> + }
> + }
Do not check for snan first; use float_flag_invalid_snan.
And you can move that check outside the loop, before the
writeback of t to *xt.
r~
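
The pattern asked for above (convert every lane unconditionally, let the conversion record a signaling NaN in a flag word, and test that flag once after the loop) might look like the standalone sketch below. It is not QEMU code: FLAG_INVALID_SNAN, f32_to_bf16 and cvt_sp_to_bf16 are hypothetical stand-ins for float_flag_invalid_snan, float32_to_bfloat16 and the helper body, and round-to-nearest-even is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for softfloat's float_flag_invalid_snan. */
#define FLAG_INVALID_SNAN 1u

/* Convert one float32 bit pattern to bfloat16, recording SNaN in *flags. */
static uint16_t f32_to_bf16(uint32_t f, unsigned *flags)
{
    if ((f & 0x7fffffffu) > 0x7f800000u) {      /* any NaN */
        if (!(f & 0x00400000u)) {               /* quiet bit clear: SNaN */
            *flags |= FLAG_INVALID_SNAN;
        }
        return (uint16_t)((f >> 16) | 0x0040u); /* return the quieted NaN */
    }
    /* Round to nearest, ties to even, on the 16 discarded bits. */
    return (uint16_t)((f + 0x7fffu + ((f >> 16) & 1u)) >> 16);
}

/* Convert four lanes; the SNaN test happens once, after the loop. */
static bool cvt_sp_to_bf16(const uint32_t in[4], uint16_t out[4])
{
    unsigned flags = 0;
    int i;

    for (i = 0; i < 4; i++) {
        out[i] = f32_to_bf16(in[i], &flags);
    }
    return (flags & FLAG_INVALID_SNAN) != 0;
}
```

In the real helper the return value would correspond to the point where float_invalid_op_vxsnan() is called, just before t is written back to *xt.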
* [PATCH v4 45/47] target/ppc: implement plxsd/pstxsd
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (43 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 44/47] target/ppc: Implement xvcvbf16spn and xvcvspbf16 instructions matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-23 3:14 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 46/47] target/ppc: implement plxssp/pstxssp matheus.ferst
2022-02-22 14:36 ` [PATCH v4 47/47] target/ppc: implement lxvr[bhwd]/stxvr[bhwd]x matheus.ferst
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: Leandro Lupori, danielhb413, richard.henderson, groug, clg,
Matheus Ferst, david
From: Leandro Lupori <leandro.lupori@eldorado.org.br>
Implement instructions plxsd/pstxsd and port lxsd/stxsd to decode
tree.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 2 ++
target/ppc/insn64.decode | 10 ++++++
target/ppc/translate.c | 14 ++------
target/ppc/translate/vsx-impl.c.inc | 55 +++++++++++++++++++++++++++--
4 files changed, 67 insertions(+), 14 deletions(-)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 8964898f20..d84ff333ec 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -600,6 +600,8 @@ VCLRRB 000100 ..... ..... ..... 00111001101 @VX
# VSX Load/Store Instructions
+LXSD 111001 ..... ..... .............. 10 @DS
+STXSD 111101 ..... ..... .............. 10 @DS
LXV 111101 ..... ..... ............ . 001 @DQ_TSX
STXV 111101 ..... ..... ............ . 101 @DQ_TSX
LXVP 000110 ..... ..... ............ 0000 @DQ_TSXP
diff --git a/target/ppc/insn64.decode b/target/ppc/insn64.decode
index fdb859f62d..b7426f5b24 100644
--- a/target/ppc/insn64.decode
+++ b/target/ppc/insn64.decode
@@ -32,6 +32,10 @@
...... ..... ra:5 ................ \
&PLS_D si=%pls_si rt=%rt_tsxp
+@8LS_D ...... .. . .. r:1 .. .................. \
+ ...... rt:5 ra:5 ................ \
+ &PLS_D si=%pls_si
+
# Format 8RR:D
%8rr_si 32:s16 0:16
%8rr_xt 16:1 21:5
@@ -180,6 +184,12 @@ PSTFD 000001 10 0--.-- .................. \
### VSX instructions
+PLXSD 000001 00 0--.-- .................. \
+ 101010 ..... ..... ................ @8LS_D
+
+PSTXSD 000001 00 0--.-- .................. \
+ 101110 ..... ..... ................ @8LS_D
+
PLXV 000001 00 0--.-- .................. \
11001 ...... ..... ................ @8LS_D_TSX
PSTXV 000001 00 0--.-- .................. \
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index b647430012..aa860d6bf9 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -6668,7 +6668,7 @@ static bool resolve_PLS_D(DisasContext *ctx, arg_D *d, arg_PLS_D *a)
#include "translate/branch-impl.c.inc"
-/* Handles lfdp, lxsd, lxssp */
+/* Handles lfdp, lxssp */
static void gen_dform39(DisasContext *ctx)
{
switch (ctx->opcode & 0x3) {
@@ -6677,11 +6677,6 @@ static void gen_dform39(DisasContext *ctx)
return gen_lfdp(ctx);
}
break;
- case 2: /* lxsd */
- if (ctx->insns_flags2 & PPC2_ISA300) {
- return gen_lxsd(ctx);
- }
- break;
case 3: /* lxssp */
if (ctx->insns_flags2 & PPC2_ISA300) {
return gen_lxssp(ctx);
@@ -6691,7 +6686,7 @@ static void gen_dform39(DisasContext *ctx)
return gen_invalid(ctx);
}
-/* handles stfdp, lxv, stxsd, stxssp lxvx */
+/* handles stfdp, lxv, stxssp lxvx */
static void gen_dform3D(DisasContext *ctx)
{
if ((ctx->opcode & 3) != 1) { /* DS-FORM */
@@ -6701,11 +6696,6 @@ static void gen_dform3D(DisasContext *ctx)
return gen_stfdp(ctx);
}
break;
- case 2: /* stxsd */
- if (ctx->insns_flags2 & PPC2_ISA300) {
- return gen_stxsd(ctx);
- }
- break;
case 3: /* stxssp */
if (ctx->insns_flags2 & PPC2_ISA300) {
return gen_stxssp(ctx);
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 2930537b8e..cabadcf106 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -309,7 +309,6 @@ static void gen_##name(DisasContext *ctx) \
tcg_temp_free_i64(xth); \
}
-VSX_LOAD_SCALAR_DS(lxsd, ld64_i64)
VSX_LOAD_SCALAR_DS(lxssp, ld32fs)
#define VSX_STORE_SCALAR(name, operation) \
@@ -482,7 +481,6 @@ static void gen_##name(DisasContext *ctx) \
tcg_temp_free_i64(xth); \
}
-VSX_STORE_SCALAR_DS(stxsd, st64_i64)
VSX_STORE_SCALAR_DS(stxssp, st32fs)
static void gen_mfvsrwz(DisasContext *ctx)
@@ -2281,6 +2279,57 @@ static bool do_lstxv_X(DisasContext *ctx, arg_X *a, bool store, bool paired)
return do_lstxv(ctx, a->ra, cpu_gpr[a->rb], a->rt, store, paired);
}
+static bool do_lstxsd(DisasContext *ctx, int rt, int ra, TCGv displ, bool store)
+{
+ TCGv ea;
+ TCGv_i64 xt;
+ MemOp mop;
+
+ if (store) {
+ REQUIRE_VECTOR(ctx);
+ } else {
+ REQUIRE_VSX(ctx);
+ }
+
+ xt = tcg_temp_new_i64();
+ mop = DEF_MEMOP(MO_UQ);
+
+ gen_set_access_type(ctx, ACCESS_INT);
+ ea = do_ea_calc(ctx, ra, displ);
+
+ if (store) {
+ get_cpu_vsr(xt, rt + 32, true);
+ tcg_gen_qemu_st_i64(xt, ea, ctx->mem_idx, mop);
+ } else {
+ tcg_gen_qemu_ld_i64(xt, ea, ctx->mem_idx, mop);
+ set_cpu_vsr(rt + 32, xt, true);
+ set_cpu_vsr(rt + 32, tcg_constant_i64(0), false);
+ }
+
+ tcg_temp_free(ea);
+ tcg_temp_free_i64(xt);
+
+ return true;
+}
+
+static bool do_lstxsd_DS(DisasContext *ctx, arg_D *a, bool store)
+{
+ return do_lstxsd(ctx, a->rt, a->ra, tcg_constant_tl(a->si), store);
+}
+
+static bool do_plstxsd_PLS_D(DisasContext *ctx, arg_PLS_D *a, bool store)
+{
+ arg_D d;
+
+ if (!resolve_PLS_D(ctx, &d, a)) {
+ return true;
+ }
+
+ return do_lstxsd(ctx, d.rt, d.ra, tcg_constant_tl(d.si), store);
+}
+
+TRANS_FLAGS2(ISA300, LXSD, do_lstxsd_DS, false)
+TRANS_FLAGS2(ISA300, STXSD, do_lstxsd_DS, true)
TRANS_FLAGS2(ISA300, STXV, do_lstxv_D, true, false)
TRANS_FLAGS2(ISA300, LXV, do_lstxv_D, false, false)
TRANS_FLAGS2(ISA310, STXVP, do_lstxv_D, true, true)
@@ -2289,6 +2338,8 @@ TRANS_FLAGS2(ISA300, STXVX, do_lstxv_X, true, false)
TRANS_FLAGS2(ISA300, LXVX, do_lstxv_X, false, false)
TRANS_FLAGS2(ISA310, STXVPX, do_lstxv_X, true, true)
TRANS_FLAGS2(ISA310, LXVPX, do_lstxv_X, false, true)
+TRANS64_FLAGS2(ISA310, PLXSD, do_plstxsd_PLS_D, false)
+TRANS64_FLAGS2(ISA310, PSTXSD, do_plstxsd_PLS_D, true)
TRANS64_FLAGS2(ISA310, PSTXV, do_lstxv_PLS_D, true, false)
TRANS64_FLAGS2(ISA310, PLXV, do_lstxv_PLS_D, false, false)
TRANS64_FLAGS2(ISA310, PSTXVP, do_lstxv_PLS_D, true, true)
--
2.25.1
* Re: [PATCH v4 45/47] target/ppc: implement plxsd/pstxsd
2022-02-22 14:36 ` [PATCH v4 45/47] target/ppc: implement plxsd/pstxsd matheus.ferst
@ 2022-02-23 3:14 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-23 3:14 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: groug, danielhb413, Leandro Lupori, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Leandro Lupori<leandro.lupori@eldorado.org.br>
>
> Implement instructions plxsd/pstxsd and port lxsd/stxsd to decode
> tree.
>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> target/ppc/insn32.decode | 2 ++
> target/ppc/insn64.decode | 10 ++++++
> target/ppc/translate.c | 14 ++------
> target/ppc/translate/vsx-impl.c.inc | 55 +++++++++++++++++++++++++++--
> 4 files changed, 67 insertions(+), 14 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
* [PATCH v4 46/47] target/ppc: implement plxssp/pstxssp
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (44 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 45/47] target/ppc: implement plxsd/pstxsd matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-23 3:16 ` Richard Henderson
2022-02-22 14:36 ` [PATCH v4 47/47] target/ppc: implement lxvr[bhwd]/stxvr[bhwd]x matheus.ferst
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: Leandro Lupori, danielhb413, richard.henderson, groug, clg,
Matheus Ferst, david
From: Leandro Lupori <leandro.lupori@eldorado.org.br>
Implement instructions plxssp/pstxssp and port lxssp/stxssp to
decode tree.
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
target/ppc/insn32.decode | 2 +
target/ppc/insn64.decode | 6 ++
target/ppc/translate.c | 29 +++------
target/ppc/translate/vsx-impl.c.inc | 93 +++++++++++++++--------------
4 files changed, 62 insertions(+), 68 deletions(-)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index d84ff333ec..5d3cfadfc6 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -602,6 +602,8 @@ VCLRRB 000100 ..... ..... ..... 00111001101 @VX
LXSD 111001 ..... ..... .............. 10 @DS
STXSD 111101 ..... ..... .............. 10 @DS
+LXSSP 111001 ..... ..... .............. 11 @DS
+STXSSP 111101 ..... ..... .............. 11 @DS
LXV 111101 ..... ..... ............ . 001 @DQ_TSX
STXV 111101 ..... ..... ............ . 101 @DQ_TSX
LXVP 000110 ..... ..... ............ 0000 @DQ_TSXP
diff --git a/target/ppc/insn64.decode b/target/ppc/insn64.decode
index b7426f5b24..691e8fe6c0 100644
--- a/target/ppc/insn64.decode
+++ b/target/ppc/insn64.decode
@@ -190,6 +190,12 @@ PLXSD 000001 00 0--.-- .................. \
PSTXSD 000001 00 0--.-- .................. \
101110 ..... ..... ................ @8LS_D
+PLXSSP 000001 00 0--.-- .................. \
+ 101011 ..... ..... ................ @8LS_D
+
+PSTXSSP 000001 00 0--.-- .................. \
+ 101111 ..... ..... ................ @8LS_D
+
PLXV 000001 00 0--.-- .................. \
11001 ...... ..... ................ @8LS_D_TSX
PSTXV 000001 00 0--.-- .................. \
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index aa860d6bf9..589ed8b7c1 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -6668,39 +6668,24 @@ static bool resolve_PLS_D(DisasContext *ctx, arg_D *d, arg_PLS_D *a)
#include "translate/branch-impl.c.inc"
-/* Handles lfdp, lxssp */
+/* Handles lfdp */
static void gen_dform39(DisasContext *ctx)
{
- switch (ctx->opcode & 0x3) {
- case 0: /* lfdp */
+ if ((ctx->opcode & 0x3) == 0) {
if (ctx->insns_flags2 & PPC2_ISA205) {
return gen_lfdp(ctx);
}
- break;
- case 3: /* lxssp */
- if (ctx->insns_flags2 & PPC2_ISA300) {
- return gen_lxssp(ctx);
- }
- break;
}
return gen_invalid(ctx);
}
-/* handles stfdp, lxv, stxssp lxvx */
+/* Handles stfdp */
static void gen_dform3D(DisasContext *ctx)
{
- if ((ctx->opcode & 3) != 1) { /* DS-FORM */
- switch (ctx->opcode & 0x3) {
- case 0: /* stfdp */
- if (ctx->insns_flags2 & PPC2_ISA205) {
- return gen_stfdp(ctx);
- }
- break;
- case 3: /* stxssp */
- if (ctx->insns_flags2 & PPC2_ISA300) {
- return gen_stxssp(ctx);
- }
- break;
+ if ((ctx->opcode & 3) == 0) { /* DS-FORM */
+ /* stfdp */
+ if (ctx->insns_flags2 & PPC2_ISA205) {
+ return gen_stfdp(ctx);
}
}
return gen_invalid(ctx);
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index cabadcf106..48a398da0e 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -288,29 +288,6 @@ VSX_VECTOR_LOAD_STORE_LENGTH(stxvl)
VSX_VECTOR_LOAD_STORE_LENGTH(stxvll)
#endif
-#define VSX_LOAD_SCALAR_DS(name, operation) \
-static void gen_##name(DisasContext *ctx) \
-{ \
- TCGv EA; \
- TCGv_i64 xth; \
- \
- if (unlikely(!ctx->altivec_enabled)) { \
- gen_exception(ctx, POWERPC_EXCP_VPU); \
- return; \
- } \
- xth = tcg_temp_new_i64(); \
- gen_set_access_type(ctx, ACCESS_INT); \
- EA = tcg_temp_new(); \
- gen_addr_imm_index(ctx, EA, 0x03); \
- gen_qemu_##operation(ctx, xth, EA); \
- set_cpu_vsr(rD(ctx->opcode) + 32, xth, true); \
- /* NOTE: cpu_vsrl is undefined */ \
- tcg_temp_free(EA); \
- tcg_temp_free_i64(xth); \
-}
-
-VSX_LOAD_SCALAR_DS(lxssp, ld32fs)
-
#define VSX_STORE_SCALAR(name, operation) \
static void gen_##name(DisasContext *ctx) \
{ \
@@ -460,29 +437,6 @@ static void gen_stxvb16x(DisasContext *ctx)
tcg_temp_free_i64(xsl);
}
-#define VSX_STORE_SCALAR_DS(name, operation) \
-static void gen_##name(DisasContext *ctx) \
-{ \
- TCGv EA; \
- TCGv_i64 xth; \
- \
- if (unlikely(!ctx->altivec_enabled)) { \
- gen_exception(ctx, POWERPC_EXCP_VPU); \
- return; \
- } \
- xth = tcg_temp_new_i64(); \
- get_cpu_vsr(xth, rD(ctx->opcode) + 32, true); \
- gen_set_access_type(ctx, ACCESS_INT); \
- EA = tcg_temp_new(); \
- gen_addr_imm_index(ctx, EA, 0x03); \
- gen_qemu_##operation(ctx, xth, EA); \
- /* NOTE: cpu_vsrl is undefined */ \
- tcg_temp_free(EA); \
- tcg_temp_free_i64(xth); \
-}
-
-VSX_STORE_SCALAR_DS(stxssp, st32fs)
-
static void gen_mfvsrwz(DisasContext *ctx)
{
if (xS(ctx->opcode) < 32) {
@@ -2328,8 +2282,53 @@ static bool do_plstxsd_PLS_D(DisasContext *ctx, arg_PLS_D *a, bool store)
return do_lstxsd(ctx, d.rt, d.ra, tcg_constant_tl(d.si), store);
}
+static bool do_lstxssp(DisasContext *ctx, int rt, int ra, TCGv displ, bool store)
+{
+ TCGv ea;
+ TCGv_i64 xt;
+
+ REQUIRE_VECTOR(ctx);
+
+ xt = tcg_temp_new_i64();
+
+ gen_set_access_type(ctx, ACCESS_INT);
+ ea = do_ea_calc(ctx, ra, displ);
+
+ if (store) {
+ get_cpu_vsr(xt, rt + 32, true);
+ gen_qemu_st32fs(ctx, xt, ea);
+ } else {
+ gen_qemu_ld32fs(ctx, xt, ea);
+ set_cpu_vsr(rt + 32, xt, true);
+ set_cpu_vsr(rt + 32, tcg_constant_i64(0), false);
+ }
+
+ tcg_temp_free(ea);
+ tcg_temp_free_i64(xt);
+
+ return true;
+}
+
+static bool do_lstxssp_DS(DisasContext *ctx, arg_D *a, bool store)
+{
+ return do_lstxssp(ctx, a->rt, a->ra, tcg_constant_tl(a->si), store);
+}
+
+static bool do_plstxssp_PLS_D(DisasContext *ctx, arg_PLS_D *a, bool store)
+{
+ arg_D d;
+
+ if (!resolve_PLS_D(ctx, &d, a)) {
+ return true;
+ }
+
+ return do_lstxssp(ctx, d.rt, d.ra, tcg_constant_tl(d.si), store);
+}
+
TRANS_FLAGS2(ISA300, LXSD, do_lstxsd_DS, false)
TRANS_FLAGS2(ISA300, STXSD, do_lstxsd_DS, true)
+TRANS_FLAGS2(ISA300, LXSSP, do_lstxssp_DS, false)
+TRANS_FLAGS2(ISA300, STXSSP, do_lstxssp_DS, true)
TRANS_FLAGS2(ISA300, STXV, do_lstxv_D, true, false)
TRANS_FLAGS2(ISA300, LXV, do_lstxv_D, false, false)
TRANS_FLAGS2(ISA310, STXVP, do_lstxv_D, true, true)
@@ -2340,6 +2339,8 @@ TRANS_FLAGS2(ISA310, STXVPX, do_lstxv_X, true, true)
TRANS_FLAGS2(ISA310, LXVPX, do_lstxv_X, false, true)
TRANS64_FLAGS2(ISA310, PLXSD, do_plstxsd_PLS_D, false)
TRANS64_FLAGS2(ISA310, PSTXSD, do_plstxsd_PLS_D, true)
+TRANS64_FLAGS2(ISA310, PLXSSP, do_plstxssp_PLS_D, false)
+TRANS64_FLAGS2(ISA310, PSTXSSP, do_plstxssp_PLS_D, true)
TRANS64_FLAGS2(ISA310, PSTXV, do_lstxv_PLS_D, true, false)
TRANS64_FLAGS2(ISA310, PLXV, do_lstxv_PLS_D, false, false)
TRANS64_FLAGS2(ISA310, PSTXVP, do_lstxv_PLS_D, true, true)
--
2.25.1
* Re: [PATCH v4 46/47] target/ppc: implement plxssp/pstxssp
2022-02-22 14:36 ` [PATCH v4 46/47] target/ppc: implement plxssp/pstxssp matheus.ferst
@ 2022-02-23 3:16 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-23 3:16 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: groug, danielhb413, Leandro Lupori, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Leandro Lupori<leandro.lupori@eldorado.org.br>
>
> Implement instructions plxssp/pstxssp and port lxssp/stxssp to
> decode tree.
>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> target/ppc/insn32.decode | 2 +
> target/ppc/insn64.decode | 6 ++
> target/ppc/translate.c | 29 +++------
> target/ppc/translate/vsx-impl.c.inc | 93 +++++++++++++++--------------
> 4 files changed, 62 insertions(+), 68 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
* [PATCH v4 47/47] target/ppc: implement lxvr[bhwd]/stxvr[bhwd]x
2022-02-22 14:35 [PATCH v4 00/47] target/ppc: PowerISA Vector/VSX instruction batch matheus.ferst
` (45 preceding siblings ...)
2022-02-22 14:36 ` [PATCH v4 46/47] target/ppc: implement plxssp/pstxssp matheus.ferst
@ 2022-02-22 14:36 ` matheus.ferst
2022-02-23 3:23 ` Richard Henderson
46 siblings, 1 reply; 97+ messages in thread
From: matheus.ferst @ 2022-02-22 14:36 UTC (permalink / raw)
To: qemu-devel, qemu-ppc
Cc: danielhb413, richard.henderson, groug, clg, Matheus Ferst,
Lucas Coutinho, david
From: Lucas Coutinho <lucas.coutinho@eldorado.org.br>
Implement the following PowerISA v3.1 instructions:
lxvrbx: Load VSX Vector Rightmost Byte Indexed X-form
lxvrhx: Load VSX Vector Rightmost Halfword Indexed X-form
lxvrwx: Load VSX Vector Rightmost Word Indexed X-form
lxvrdx: Load VSX Vector Rightmost Doubleword Indexed X-form
stxvrbx: Store VSX Vector Rightmost Byte Indexed X-form
stxvrhx: Store VSX Vector Rightmost Halfword Indexed X-form
stxvrwx: Store VSX Vector Rightmost Word Indexed X-form
stxvrdx: Store VSX Vector Rightmost Doubleword Indexed X-form
Signed-off-by: Lucas Coutinho <lucas.coutinho@eldorado.org.br>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
---
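As a rough illustration of what these instructions do to the register, the sketch below models a VSR as two doublewords: the load places a zero-extended element in the rightmost doubleword and clears the other one, and the store writes the low-order bytes of the rightmost doubleword. This is a standalone model, not the TCG code; vsr128, load_rightmost and store_rightmost are hypothetical names, and big-endian byte order is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit VSR as two doublewords. */
typedef struct {
    uint64_t hi;    /* doubleword 0 (leftmost) */
    uint64_t lo;    /* doubleword 1 (rightmost) */
} vsr128;

/* lxvr[bhwd]x: load size bytes, zero-extend into the rightmost element. */
static vsr128 load_rightmost(const uint8_t *mem, size_t size)
{
    vsr128 r = { 0, 0 };
    size_t i;

    assert(size == 1 || size == 2 || size == 4 || size == 8);
    for (i = 0; i < size; i++) {
        r.lo = (r.lo << 8) | mem[i];    /* big-endian assembly */
    }
    return r;                           /* r.hi stays zero */
}

/* stxvr[bhwd]x: store the low-order size bytes of the rightmost element. */
static void store_rightmost(const vsr128 *r, uint8_t *mem, size_t size)
{
    size_t i;

    assert(size == 1 || size == 2 || size == 4 || size == 8);
    for (i = 0; i < size; i++) {
        mem[i] = (uint8_t)(r->lo >> (8 * (size - 1 - i)));
    }
}
```

In the patch itself the element size and byte order come from the MemOp passed to do_lstrm (DEF_MEMOP(MO_UB) through DEF_MEMOP(MO_UQ)), so the explicit byte loops here only stand in for tcg_gen_qemu_ld_i64/tcg_gen_qemu_st_i64.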
target/ppc/insn32.decode | 8 +++++++
target/ppc/translate/vsx-impl.c.inc | 35 +++++++++++++++++++++++++++++
2 files changed, 43 insertions(+)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 5d3cfadfc6..00c825b856 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -612,6 +612,14 @@ LXVX 011111 ..... ..... ..... 0100 - 01100 . @X_TSX
STXVX 011111 ..... ..... ..... 0110001100 . @X_TSX
LXVPX 011111 ..... ..... ..... 0101001101 - @X_TSXP
STXVPX 011111 ..... ..... ..... 0111001101 - @X_TSXP
+LXVRBX 011111 ..... ..... ..... 0000001101 . @X_TSX
+LXVRHX 011111 ..... ..... ..... 0000101101 . @X_TSX
+LXVRWX 011111 ..... ..... ..... 0001001101 . @X_TSX
+LXVRDX 011111 ..... ..... ..... 0001101101 . @X_TSX
+STXVRBX 011111 ..... ..... ..... 0010001101 . @X_TSX
+STXVRHX 011111 ..... ..... ..... 0010101101 . @X_TSX
+STXVRWX 011111 ..... ..... ..... 0011001101 . @X_TSX
+STXVRDX 011111 ..... ..... ..... 0011101101 . @X_TSX
## VSX Scalar Multiply-Add Instructions
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 48a398da0e..55a4a9bd27 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -2346,6 +2346,41 @@ TRANS64_FLAGS2(ISA310, PLXV, do_lstxv_PLS_D, false, false)
TRANS64_FLAGS2(ISA310, PSTXVP, do_lstxv_PLS_D, true, true)
TRANS64_FLAGS2(ISA310, PLXVP, do_lstxv_PLS_D, false, true)
+static bool do_lstrm(DisasContext *ctx, arg_X *a, MemOp mop, bool store)
+{
+ TCGv ea;
+ TCGv_i64 xt;
+
+ REQUIRE_VSX(ctx);
+
+ xt = tcg_temp_new_i64();
+
+ gen_set_access_type(ctx, ACCESS_INT);
+ ea = do_ea_calc(ctx, a->ra , cpu_gpr[a->rb]);
+
+ if (store) {
+ get_cpu_vsr(xt, a->rt, false);
+ tcg_gen_qemu_st_i64(xt, ea, ctx->mem_idx, mop);
+ } else {
+ tcg_gen_qemu_ld_i64(xt, ea, ctx->mem_idx, mop);
+ set_cpu_vsr(a->rt, xt, false);
+ set_cpu_vsr(a->rt, tcg_const_i64(0), true);
+ }
+
+ tcg_temp_free(ea);
+ tcg_temp_free_i64(xt);
+ return true;
+}
+
+TRANS_FLAGS2(ISA310, LXVRBX, do_lstrm, DEF_MEMOP(MO_UB), false)
+TRANS_FLAGS2(ISA310, LXVRHX, do_lstrm, DEF_MEMOP(MO_UW), false)
+TRANS_FLAGS2(ISA310, LXVRWX, do_lstrm, DEF_MEMOP(MO_UL), false)
+TRANS_FLAGS2(ISA310, LXVRDX, do_lstrm, DEF_MEMOP(MO_UQ), false)
+TRANS_FLAGS2(ISA310, STXVRBX, do_lstrm, DEF_MEMOP(MO_UB), true)
+TRANS_FLAGS2(ISA310, STXVRHX, do_lstrm, DEF_MEMOP(MO_UW), true)
+TRANS_FLAGS2(ISA310, STXVRWX, do_lstrm, DEF_MEMOP(MO_UL), true)
+TRANS_FLAGS2(ISA310, STXVRDX, do_lstrm, DEF_MEMOP(MO_UQ), true)
+
static void gen_xxeval_i64(TCGv_i64 t, TCGv_i64 a, TCGv_i64 b, TCGv_i64 c,
int64_t imm)
{
--
2.25.1
* Re: [PATCH v4 47/47] target/ppc: implement lxvr[bhwd]/stxvr[bhwd]x
2022-02-22 14:36 ` [PATCH v4 47/47] target/ppc: implement lxvr[bhwd]/stxvr[bhwd]x matheus.ferst
@ 2022-02-23 3:23 ` Richard Henderson
0 siblings, 0 replies; 97+ messages in thread
From: Richard Henderson @ 2022-02-23 3:23 UTC (permalink / raw)
To: matheus.ferst, qemu-devel, qemu-ppc
Cc: Lucas Coutinho, groug, danielhb413, clg, david
On 2/22/22 04:36, matheus.ferst@eldorado.org.br wrote:
> From: Lucas Coutinho<lucas.coutinho@eldorado.org.br>
>
> Implement the following PowerISA v3.1 instructions:
> lxvrbx: Load VSX Vector Rightmost Byte Indexed X-form
> lxvrhx: Load VSX Vector Rightmost Halfword Indexed X-form
> lxvrwx: Load VSX Vector Rightmost Word Indexed X-form
> lxvrdx: Load VSX Vector Rightmost Doubleword Indexed X-form
>
> stxvrbx: Store VSX Vector Rightmost Byte Indexed X-form
> stxvrhx: Store VSX Vector Rightmost Halfword Indexed X-form
> stxvrwx: Store VSX Vector Rightmost Word Indexed X-form
> stxvrdx: Store VSX Vector Rightmost Doubleword Indexed X-form
>
> Signed-off-by: Lucas Coutinho<lucas.coutinho@eldorado.org.br>
> Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br>
> ---
> target/ppc/insn32.decode | 8 +++++++
> target/ppc/translate/vsx-impl.c.inc | 35 +++++++++++++++++++++++++++++
> 2 files changed, 43 insertions(+)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~