* [PATCH v5 01/11] tcg: Implement tcg_gen_{h,w}swap_{i32,i64}
2022-03-23 13:57 [PATCH v5 00/11] s390x/tcg: Implement Vector-Enhancements Facility 2 David Miller
@ 2022-03-23 13:57 ` David Miller
2022-03-23 13:57 ` [PATCH v5 02/11] target/s390x: vxeh2: vector convert short/32b David Miller
` (10 subsequent siblings)
11 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2022-03-23 13:57 UTC (permalink / raw)
To: qemu-s390x, qemu-devel
Cc: thuth, david, cohuck, richard.henderson, farman, David Miller,
pasic, borntraeger
From: Richard Henderson <richard.henderson@linaro.org>
Swap half-words (16-bit) and words (32-bit) within a larger value.
Mirrors functions of the same names within include/qemu/bitops.h.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Miller <dmiller423@gmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
---
include/tcg/tcg-op.h | 6 ++++++
tcg/tcg-op.c | 30 ++++++++++++++++++++++++++++++
2 files changed, 36 insertions(+)
diff --git a/include/tcg/tcg-op.h b/include/tcg/tcg-op.h
index caa0a63612..b09b8b4a05 100644
--- a/include/tcg/tcg-op.h
+++ b/include/tcg/tcg-op.h
@@ -332,6 +332,7 @@ void tcg_gen_ext8u_i32(TCGv_i32 ret, TCGv_i32 arg);
void tcg_gen_ext16u_i32(TCGv_i32 ret, TCGv_i32 arg);
void tcg_gen_bswap16_i32(TCGv_i32 ret, TCGv_i32 arg, int flags);
void tcg_gen_bswap32_i32(TCGv_i32 ret, TCGv_i32 arg);
+void tcg_gen_hswap_i32(TCGv_i32 ret, TCGv_i32 arg);
void tcg_gen_smin_i32(TCGv_i32, TCGv_i32 arg1, TCGv_i32 arg2);
void tcg_gen_smax_i32(TCGv_i32, TCGv_i32 arg1, TCGv_i32 arg2);
void tcg_gen_umin_i32(TCGv_i32, TCGv_i32 arg1, TCGv_i32 arg2);
@@ -531,6 +532,8 @@ void tcg_gen_ext32u_i64(TCGv_i64 ret, TCGv_i64 arg);
void tcg_gen_bswap16_i64(TCGv_i64 ret, TCGv_i64 arg, int flags);
void tcg_gen_bswap32_i64(TCGv_i64 ret, TCGv_i64 arg, int flags);
void tcg_gen_bswap64_i64(TCGv_i64 ret, TCGv_i64 arg);
+void tcg_gen_hswap_i64(TCGv_i64 ret, TCGv_i64 arg);
+void tcg_gen_wswap_i64(TCGv_i64 ret, TCGv_i64 arg);
void tcg_gen_smin_i64(TCGv_i64, TCGv_i64 arg1, TCGv_i64 arg2);
void tcg_gen_smax_i64(TCGv_i64, TCGv_i64 arg1, TCGv_i64 arg2);
void tcg_gen_umin_i64(TCGv_i64, TCGv_i64 arg1, TCGv_i64 arg2);
@@ -1077,6 +1080,8 @@ void tcg_gen_stl_vec(TCGv_vec r, TCGv_ptr base, TCGArg offset, TCGType t);
#define tcg_gen_bswap32_tl tcg_gen_bswap32_i64
#define tcg_gen_bswap64_tl tcg_gen_bswap64_i64
#define tcg_gen_bswap_tl tcg_gen_bswap64_i64
+#define tcg_gen_hswap_tl tcg_gen_hswap_i64
+#define tcg_gen_wswap_tl tcg_gen_wswap_i64
#define tcg_gen_concat_tl_i64 tcg_gen_concat32_i64
#define tcg_gen_extr_i64_tl tcg_gen_extr32_i64
#define tcg_gen_andc_tl tcg_gen_andc_i64
@@ -1192,6 +1197,7 @@ void tcg_gen_stl_vec(TCGv_vec r, TCGv_ptr base, TCGArg offset, TCGType t);
#define tcg_gen_bswap16_tl tcg_gen_bswap16_i32
#define tcg_gen_bswap32_tl(D, S, F) tcg_gen_bswap32_i32(D, S)
#define tcg_gen_bswap_tl tcg_gen_bswap32_i32
+#define tcg_gen_hswap_tl tcg_gen_hswap_i32
#define tcg_gen_concat_tl_i64 tcg_gen_concat_i32_i64
#define tcg_gen_extr_i64_tl tcg_gen_extr_i64_i32
#define tcg_gen_andc_tl tcg_gen_andc_i32
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 65e1c94c2d..ae336ff6c2 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -1056,6 +1056,12 @@ void tcg_gen_bswap32_i32(TCGv_i32 ret, TCGv_i32 arg)
}
}
+void tcg_gen_hswap_i32(TCGv_i32 ret, TCGv_i32 arg)
+{
+ /* Swapping 2 16-bit elements is a rotate. */
+ tcg_gen_rotli_i32(ret, arg, 16);
+}
+
void tcg_gen_smin_i32(TCGv_i32 ret, TCGv_i32 a, TCGv_i32 b)
{
tcg_gen_movcond_i32(TCG_COND_LT, ret, a, b, a, b);
@@ -1792,6 +1798,30 @@ void tcg_gen_bswap64_i64(TCGv_i64 ret, TCGv_i64 arg)
}
}
+void tcg_gen_hswap_i64(TCGv_i64 ret, TCGv_i64 arg)
+{
+ uint64_t m = 0x0000ffff0000ffffull;
+ TCGv_i64 t0 = tcg_temp_new_i64();
+ TCGv_i64 t1 = tcg_temp_new_i64();
+
+ /* See include/qemu/bitops.h, hswap64. */
+ tcg_gen_rotli_i64(t1, arg, 32);
+ tcg_gen_andi_i64(t0, t1, m);
+ tcg_gen_shli_i64(t0, t0, 16);
+ tcg_gen_shri_i64(t1, t1, 16);
+ tcg_gen_andi_i64(t1, t1, m);
+ tcg_gen_or_i64(ret, t0, t1);
+
+ tcg_temp_free_i64(t0);
+ tcg_temp_free_i64(t1);
+}
+
+void tcg_gen_wswap_i64(TCGv_i64 ret, TCGv_i64 arg)
+{
+ /* Swapping 2 32-bit elements is a rotate. */
+ tcg_gen_rotli_i64(ret, arg, 32);
+}
+
void tcg_gen_not_i64(TCGv_i64 ret, TCGv_i64 arg)
{
if (TCG_TARGET_REG_BITS == 32) {
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v5 02/11] target/s390x: vxeh2: vector convert short/32b
2022-03-23 13:57 [PATCH v5 00/11] s390x/tcg: Implement Vector-Enhancements Facility 2 David Miller
2022-03-23 13:57 ` [PATCH v5 01/11] tcg: Implement tcg_gen_{h,w}swap_{i32,i64} David Miller
@ 2022-03-23 13:57 ` David Miller
2022-03-23 13:57 ` [PATCH v5 03/11] target/s390x: vxeh2: vector string search David Miller
` (9 subsequent siblings)
11 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2022-03-23 13:57 UTC (permalink / raw)
To: qemu-s390x, qemu-devel
Cc: thuth, david, cohuck, richard.henderson, farman, David Miller,
pasic, borntraeger
Signed-off-by: David Miller <dmiller423@gmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/s390x/helper.h | 4 +++
target/s390x/tcg/translate_vx.c.inc | 44 ++++++++++++++++++++++++++---
target/s390x/tcg/vec_fpu_helper.c | 31 ++++++++++++++++++++
3 files changed, 75 insertions(+), 4 deletions(-)
diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 69f69cf718..7cbcbd7f0b 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -275,6 +275,10 @@ DEF_HELPER_FLAGS_5(gvec_vfche64, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32
DEF_HELPER_5(gvec_vfche64_cc, void, ptr, cptr, cptr, env, i32)
DEF_HELPER_FLAGS_5(gvec_vfche128, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
DEF_HELPER_5(gvec_vfche128_cc, void, ptr, cptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vcdg32, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vcdlg32, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vcgd32, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
+DEF_HELPER_FLAGS_4(gvec_vclgd32, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
DEF_HELPER_FLAGS_4(gvec_vcdg64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
DEF_HELPER_FLAGS_4(gvec_vcdlg64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
DEF_HELPER_FLAGS_4(gvec_vcgd64, TCG_CALL_NO_WG, void, ptr, cptr, env, i32)
diff --git a/target/s390x/tcg/translate_vx.c.inc b/target/s390x/tcg/translate_vx.c.inc
index 98eb7710a4..ea28e40d4f 100644
--- a/target/s390x/tcg/translate_vx.c.inc
+++ b/target/s390x/tcg/translate_vx.c.inc
@@ -2720,23 +2720,59 @@ static DisasJumpType op_vcdg(DisasContext *s, DisasOps *o)
switch (s->fields.op2) {
case 0xc3:
- if (fpf == FPF_LONG) {
+ switch (fpf) {
+ case FPF_LONG:
fn = gen_helper_gvec_vcdg64;
+ break;
+ case FPF_SHORT:
+ if (s390_has_feat(S390_FEAT_VECTOR_ENH2)) {
+ fn = gen_helper_gvec_vcdg32;
+ }
+ break;
+ default:
+ break;
}
break;
case 0xc1:
- if (fpf == FPF_LONG) {
+ switch (fpf) {
+ case FPF_LONG:
fn = gen_helper_gvec_vcdlg64;
+ break;
+ case FPF_SHORT:
+ if (s390_has_feat(S390_FEAT_VECTOR_ENH2)) {
+ fn = gen_helper_gvec_vcdlg32;
+ }
+ break;
+ default:
+ break;
}
break;
case 0xc2:
- if (fpf == FPF_LONG) {
+ switch (fpf) {
+ case FPF_LONG:
fn = gen_helper_gvec_vcgd64;
+ break;
+ case FPF_SHORT:
+ if (s390_has_feat(S390_FEAT_VECTOR_ENH2)) {
+ fn = gen_helper_gvec_vcgd32;
+ }
+ break;
+ default:
+ break;
}
break;
case 0xc0:
- if (fpf == FPF_LONG) {
+ switch (fpf) {
+ case FPF_LONG:
fn = gen_helper_gvec_vclgd64;
+ break;
+ case FPF_SHORT:
+ if (s390_has_feat(S390_FEAT_VECTOR_ENH2)) {
+ fn = gen_helper_gvec_vclgd32;
+ }
+ break;
+ default:
+ break;
}
break;
case 0xc7:
diff --git a/target/s390x/tcg/vec_fpu_helper.c b/target/s390x/tcg/vec_fpu_helper.c
index 1a77993471..6834dbc540 100644
--- a/target/s390x/tcg/vec_fpu_helper.c
+++ b/target/s390x/tcg/vec_fpu_helper.c
@@ -176,6 +176,30 @@ static void vop128_2(S390Vector *v1, const S390Vector *v2, CPUS390XState *env,
*v1 = tmp;
}
+static float32 vcdg32(float32 a, float_status *s)
+{
+ return int32_to_float32(a, s);
+}
+
+static float32 vcdlg32(float32 a, float_status *s)
+{
+ return uint32_to_float32(a, s);
+}
+
+static float32 vcgd32(float32 a, float_status *s)
+{
+ const float32 tmp = float32_to_int32(a, s);
+
+ return float32_is_any_nan(a) ? INT32_MIN : tmp;
+}
+
+static float32 vclgd32(float32 a, float_status *s)
+{
+ const float32 tmp = float32_to_uint32(a, s);
+
+ return float32_is_any_nan(a) ? 0 : tmp;
+}
+
static float64 vcdg64(float64 a, float_status *s)
{
return int64_to_float64(a, s);
@@ -211,6 +235,9 @@ void HELPER(gvec_##NAME##BITS)(void *v1, const void *v2, CPUS390XState *env, \
vop##BITS##_2(v1, v2, env, se, XxC, erm, FN, GETPC()); \
}
+#define DEF_GVEC_VOP2_32(NAME) \
+DEF_GVEC_VOP2_FN(NAME, NAME##32, 32)
+
#define DEF_GVEC_VOP2_64(NAME) \
DEF_GVEC_VOP2_FN(NAME, NAME##64, 64)
@@ -219,6 +246,10 @@ DEF_GVEC_VOP2_FN(NAME, float32_##OP, 32) \
DEF_GVEC_VOP2_FN(NAME, float64_##OP, 64) \
DEF_GVEC_VOP2_FN(NAME, float128_##OP, 128)
+DEF_GVEC_VOP2_32(vcdg)
+DEF_GVEC_VOP2_32(vcdlg)
+DEF_GVEC_VOP2_32(vcgd)
+DEF_GVEC_VOP2_32(vclgd)
DEF_GVEC_VOP2_64(vcdg)
DEF_GVEC_VOP2_64(vcdlg)
DEF_GVEC_VOP2_64(vcgd)
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v5 03/11] target/s390x: vxeh2: vector string search
2022-03-23 13:57 [PATCH v5 00/11] s390x/tcg: Implement Vector-Enhancements Facility 2 David Miller
2022-03-23 13:57 ` [PATCH v5 01/11] tcg: Implement tcg_gen_{h,w}swap_{i32,i64} David Miller
2022-03-23 13:57 ` [PATCH v5 02/11] target/s390x: vxeh2: vector convert short/32b David Miller
@ 2022-03-23 13:57 ` David Miller
2022-03-23 13:57 ` [PATCH v5 04/11] target/s390x: vxeh2: Update for changes to vector shifts David Miller
` (8 subsequent siblings)
11 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2022-03-23 13:57 UTC (permalink / raw)
To: qemu-s390x, qemu-devel
Cc: thuth, david, cohuck, richard.henderson, farman, David Miller,
pasic, borntraeger
Signed-off-by: David Miller <dmiller423@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
---
target/s390x/helper.h | 6 ++
target/s390x/tcg/insn-data.def | 2 +
target/s390x/tcg/translate.c | 3 +-
target/s390x/tcg/translate_vx.c.inc | 25 +++++++
target/s390x/tcg/vec_string_helper.c | 99 ++++++++++++++++++++++++++++
5 files changed, 134 insertions(+), 1 deletion(-)
diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 7cbcbd7f0b..7412130883 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -246,6 +246,12 @@ DEF_HELPER_6(gvec_vstrc_cc32, void, ptr, cptr, cptr, cptr, env, i32)
DEF_HELPER_6(gvec_vstrc_cc_rt8, void, ptr, cptr, cptr, cptr, env, i32)
DEF_HELPER_6(gvec_vstrc_cc_rt16, void, ptr, cptr, cptr, cptr, env, i32)
DEF_HELPER_6(gvec_vstrc_cc_rt32, void, ptr, cptr, cptr, cptr, env, i32)
+DEF_HELPER_6(gvec_vstrs_8, void, ptr, cptr, cptr, cptr, env, i32)
+DEF_HELPER_6(gvec_vstrs_16, void, ptr, cptr, cptr, cptr, env, i32)
+DEF_HELPER_6(gvec_vstrs_32, void, ptr, cptr, cptr, cptr, env, i32)
+DEF_HELPER_6(gvec_vstrs_zs8, void, ptr, cptr, cptr, cptr, env, i32)
+DEF_HELPER_6(gvec_vstrs_zs16, void, ptr, cptr, cptr, cptr, env, i32)
+DEF_HELPER_6(gvec_vstrs_zs32, void, ptr, cptr, cptr, cptr, env, i32)
/* === Vector Floating-Point Instructions */
DEF_HELPER_FLAGS_5(gvec_vfa32, TCG_CALL_NO_WG, void, ptr, cptr, cptr, env, i32)
diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def
index 6c8a8b229f..46add91a0e 100644
--- a/target/s390x/tcg/insn-data.def
+++ b/target/s390x/tcg/insn-data.def
@@ -1246,6 +1246,8 @@
F(0xe75c, VISTR, VRR_a, V, 0, 0, 0, 0, vistr, 0, IF_VEC)
/* VECTOR STRING RANGE COMPARE */
F(0xe78a, VSTRC, VRR_d, V, 0, 0, 0, 0, vstrc, 0, IF_VEC)
+/* VECTOR STRING SEARCH */
+ F(0xe78b, VSTRS, VRR_d, VE2, 0, 0, 0, 0, vstrs, 0, IF_VEC)
/* === Vector Floating-Point Instructions */
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index 904b51542f..d9ac29573d 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -6222,7 +6222,8 @@ enum DisasInsnEnum {
#define FAC_PCI S390_FEAT_ZPCI /* z/PCI facility */
#define FAC_AIS S390_FEAT_ADAPTER_INT_SUPPRESSION
#define FAC_V S390_FEAT_VECTOR /* vector facility */
-#define FAC_VE S390_FEAT_VECTOR_ENH /* vector enhancements facility 1 */
+#define FAC_VE S390_FEAT_VECTOR_ENH /* vector enhancements facility 1 */
+#define FAC_VE2 S390_FEAT_VECTOR_ENH2 /* vector enhancements facility 2 */
#define FAC_MIE2 S390_FEAT_MISC_INSTRUCTION_EXT2 /* miscellaneous-instruction-extensions facility 2 */
#define FAC_MIE3 S390_FEAT_MISC_INSTRUCTION_EXT3 /* miscellaneous-instruction-extensions facility 3 */
diff --git a/target/s390x/tcg/translate_vx.c.inc b/target/s390x/tcg/translate_vx.c.inc
index ea28e40d4f..29e4dd78a8 100644
--- a/target/s390x/tcg/translate_vx.c.inc
+++ b/target/s390x/tcg/translate_vx.c.inc
@@ -2497,6 +2497,31 @@ static DisasJumpType op_vstrc(DisasContext *s, DisasOps *o)
return DISAS_NEXT;
}
+static DisasJumpType op_vstrs(DisasContext *s, DisasOps *o)
+{
+ typedef void (*helper_vstrs)(TCGv_ptr, TCGv_ptr, TCGv_ptr,
+ TCGv_ptr, TCGv_ptr, TCGv_i32);
+ static const helper_vstrs fns[3][2] = {
+ { gen_helper_gvec_vstrs_8, gen_helper_gvec_vstrs_zs8 },
+ { gen_helper_gvec_vstrs_16, gen_helper_gvec_vstrs_zs16 },
+ { gen_helper_gvec_vstrs_32, gen_helper_gvec_vstrs_zs32 },
+ };
+ const uint8_t es = get_field(s, m5);
+ const uint8_t m6 = get_field(s, m6);
+ const bool zs = extract32(m6, 1, 1);
+
+ if (es > ES_32 || m6 & ~2) {
+ gen_program_exception(s, PGM_SPECIFICATION);
+ return DISAS_NORETURN;
+ }
+
+ gen_gvec_4_ptr(get_field(s, v1), get_field(s, v2),
+ get_field(s, v3), get_field(s, v4),
+ cpu_env, 0, fns[es][zs]);
+ set_cc_static(s);
+ return DISAS_NEXT;
+}
+
static DisasJumpType op_vfa(DisasContext *s, DisasOps *o)
{
const uint8_t fpf = get_field(s, m4);
diff --git a/target/s390x/tcg/vec_string_helper.c b/target/s390x/tcg/vec_string_helper.c
index ac315eb095..00135865c0 100644
--- a/target/s390x/tcg/vec_string_helper.c
+++ b/target/s390x/tcg/vec_string_helper.c
@@ -471,3 +471,102 @@ void HELPER(gvec_vstrc_cc_rt##BITS)(void *v1, const void *v2, const void *v3, \
DEF_VSTRC_CC_RT_HELPER(8)
DEF_VSTRC_CC_RT_HELPER(16)
DEF_VSTRC_CC_RT_HELPER(32)
+
+static int vstrs(S390Vector *v1, const S390Vector *v2, const S390Vector *v3,
+ const S390Vector *v4, uint8_t es, bool zs)
+{
+ int substr_elen, substr_0, str_elen, i, j, k, cc;
+ int nelem = 16 >> es;
+ bool eos = false;
+
+ substr_elen = s390_vec_read_element8(v4, 7) >> es;
+
+ /* If ZS, bound substr length by min(nelem, strlen(v3)). */
+ if (zs) {
+ substr_elen = MIN(substr_elen, nelem);
+ for (i = 0; i < substr_elen; i++) {
+ if (s390_vec_read_element(v3, i, es) == 0) {
+ substr_elen = i;
+ break;
+ }
+ }
+ }
+
+ if (substr_elen == 0) {
+ cc = 2; /* full match for degenerate case of empty substr */
+ k = 0;
+ goto done;
+ }
+
+ /* If ZS, look for eos in the searched string. */
+ if (zs) {
+ for (k = 0; k < nelem; k++) {
+ if (s390_vec_read_element(v2, k, es) == 0) {
+ eos = true;
+ break;
+ }
+ }
+ str_elen = k;
+ } else {
+ str_elen = nelem;
+ }
+
+ substr_0 = s390_vec_read_element(v3, 0, es);
+
+ for (k = 0; ; k++) {
+ for (; k < str_elen; k++) {
+ if (s390_vec_read_element(v2, k, es) == substr_0) {
+ break;
+ }
+ }
+
+ /* If we reached the end of the string, no match. */
+ if (k == str_elen) {
+ cc = eos; /* no match (with or without zero char) */
+ goto done;
+ }
+
+ /* If the substring is only one char, match. */
+ if (substr_elen == 1) {
+ cc = 2; /* full match */
+ goto done;
+ }
+
+ /* If the match begins at the last char, we have a partial match. */
+ if (k == str_elen - 1) {
+ cc = 3; /* partial match */
+ goto done;
+ }
+
+ i = MIN(nelem, k + substr_elen);
+ for (j = k + 1; j < i; j++) {
+ uint32_t e2 = s390_vec_read_element(v2, j, es);
+ uint32_t e3 = s390_vec_read_element(v3, j - k, es);
+ if (e2 != e3) {
+ break;
+ }
+ }
+ if (j == i) {
+ /* Matched up until "end". */
+ cc = i - k == substr_elen ? 2 : 3; /* full or partial match */
+ goto done;
+ }
+ }
+
+ done:
+ s390_vec_write_element64(v1, 0, k << es);
+ s390_vec_write_element64(v1, 1, 0);
+ return cc;
+}
+
+#define DEF_VSTRS_HELPER(BITS) \
+void QEMU_FLATTEN HELPER(gvec_vstrs_##BITS)(void *v1, const void *v2, \
+ const void *v3, const void *v4, CPUS390XState *env, uint32_t desc) \
+ { env->cc_op = vstrs(v1, v2, v3, v4, MO_##BITS, false); } \
+void QEMU_FLATTEN HELPER(gvec_vstrs_zs##BITS)(void *v1, const void *v2, \
+ const void *v3, const void *v4, CPUS390XState *env, uint32_t desc) \
+ { env->cc_op = vstrs(v1, v2, v3, v4, MO_##BITS, true); }
+
+DEF_VSTRS_HELPER(8)
+DEF_VSTRS_HELPER(16)
+DEF_VSTRS_HELPER(32)
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v5 04/11] target/s390x: vxeh2: Update for changes to vector shifts
2022-03-23 13:57 [PATCH v5 00/11] s390x/tcg: Implement Vector-Enhancements Facility 2 David Miller
` (2 preceding siblings ...)
2022-03-23 13:57 ` [PATCH v5 03/11] target/s390x: vxeh2: vector string search David Miller
@ 2022-03-23 13:57 ` David Miller
2022-03-23 13:57 ` [PATCH v5 05/11] target/s390x: vxeh2: vector shift double by bit David Miller
` (7 subsequent siblings)
11 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2022-03-23 13:57 UTC (permalink / raw)
To: qemu-s390x, qemu-devel
Cc: thuth, david, cohuck, richard.henderson, farman, David Miller,
pasic, borntraeger
Signed-off-by: David Miller <dmiller423@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
---
target/s390x/helper.h | 3 ++
target/s390x/tcg/insn-data.def | 12 ++---
target/s390x/tcg/translate_vx.c.inc | 75 ++++++++++++-----------------
target/s390x/tcg/vec_int_helper.c | 55 +++++++++++++++++++++
4 files changed, 95 insertions(+), 50 deletions(-)
diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 7412130883..bf33d86f74 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -203,8 +203,11 @@ DEF_HELPER_FLAGS_3(gvec_vpopct16, TCG_CALL_NO_RWG, void, ptr, cptr, i32)
DEF_HELPER_FLAGS_4(gvec_verim8, TCG_CALL_NO_RWG, void, ptr, cptr, cptr, i32)
DEF_HELPER_FLAGS_4(gvec_verim16, TCG_CALL_NO_RWG, void, ptr, cptr, cptr, i32)
DEF_HELPER_FLAGS_4(gvec_vsl, TCG_CALL_NO_RWG, void, ptr, cptr, i64, i32)
+DEF_HELPER_FLAGS_4(gvec_vsl_ve2, TCG_CALL_NO_RWG, void, ptr, cptr, cptr, i32)
DEF_HELPER_FLAGS_4(gvec_vsra, TCG_CALL_NO_RWG, void, ptr, cptr, i64, i32)
+DEF_HELPER_FLAGS_4(gvec_vsra_ve2, TCG_CALL_NO_RWG, void, ptr, cptr, cptr, i32)
DEF_HELPER_FLAGS_4(gvec_vsrl, TCG_CALL_NO_RWG, void, ptr, cptr, i64, i32)
+DEF_HELPER_FLAGS_4(gvec_vsrl_ve2, TCG_CALL_NO_RWG, void, ptr, cptr, cptr, i32)
DEF_HELPER_FLAGS_4(gvec_vscbi8, TCG_CALL_NO_RWG, void, ptr, cptr, cptr, i32)
DEF_HELPER_FLAGS_4(gvec_vscbi16, TCG_CALL_NO_RWG, void, ptr, cptr, cptr, i32)
DEF_HELPER_4(gvec_vtm, void, ptr, cptr, env, i32)
diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def
index 46add91a0e..f487a64abf 100644
--- a/target/s390x/tcg/insn-data.def
+++ b/target/s390x/tcg/insn-data.def
@@ -1204,19 +1204,19 @@
F(0xe778, VESRLV, VRR_c, V, 0, 0, 0, 0, vesv, 0, IF_VEC)
F(0xe738, VESRL, VRS_a, V, la2, 0, 0, 0, ves, 0, IF_VEC)
/* VECTOR SHIFT LEFT */
- F(0xe774, VSL, VRR_c, V, 0, 0, 0, 0, vsl, 0, IF_VEC)
+ E(0xe774, VSL, VRR_c, V, 0, 0, 0, 0, vsl, 0, 0, IF_VEC)
/* VECTOR SHIFT LEFT BY BYTE */
- F(0xe775, VSLB, VRR_c, V, 0, 0, 0, 0, vsl, 0, IF_VEC)
+ E(0xe775, VSLB, VRR_c, V, 0, 0, 0, 0, vsl, 0, 1, IF_VEC)
/* VECTOR SHIFT LEFT DOUBLE BY BYTE */
F(0xe777, VSLDB, VRI_d, V, 0, 0, 0, 0, vsldb, 0, IF_VEC)
/* VECTOR SHIFT RIGHT ARITHMETIC */
- F(0xe77e, VSRA, VRR_c, V, 0, 0, 0, 0, vsra, 0, IF_VEC)
+ E(0xe77e, VSRA, VRR_c, V, 0, 0, 0, 0, vsra, 0, 0, IF_VEC)
/* VECTOR SHIFT RIGHT ARITHMETIC BY BYTE */
- F(0xe77f, VSRAB, VRR_c, V, 0, 0, 0, 0, vsra, 0, IF_VEC)
+ E(0xe77f, VSRAB, VRR_c, V, 0, 0, 0, 0, vsra, 0, 1, IF_VEC)
/* VECTOR SHIFT RIGHT LOGICAL */
- F(0xe77c, VSRL, VRR_c, V, 0, 0, 0, 0, vsrl, 0, IF_VEC)
+ E(0xe77c, VSRL, VRR_c, V, 0, 0, 0, 0, vsrl, 0, 0, IF_VEC)
/* VECTOR SHIFT RIGHT LOGICAL BY BYTE */
- F(0xe77d, VSRLB, VRR_c, V, 0, 0, 0, 0, vsrl, 0, IF_VEC)
+ E(0xe77d, VSRLB, VRR_c, V, 0, 0, 0, 0, vsrl, 0, 1, IF_VEC)
/* VECTOR SUBTRACT */
F(0xe7f7, VS, VRR_c, V, 0, 0, 0, 0, vs, 0, IF_VEC)
/* VECTOR SUBTRACT COMPUTE BORROW INDICATION */
diff --git a/target/s390x/tcg/translate_vx.c.inc b/target/s390x/tcg/translate_vx.c.inc
index 29e4dd78a8..fd53ddafef 100644
--- a/target/s390x/tcg/translate_vx.c.inc
+++ b/target/s390x/tcg/translate_vx.c.inc
@@ -2018,23 +2018,44 @@ static DisasJumpType op_ves(DisasContext *s, DisasOps *o)
return DISAS_NEXT;
}
-static DisasJumpType op_vsl(DisasContext *s, DisasOps *o)
+static DisasJumpType gen_vsh_by_byte(DisasContext *s, DisasOps *o,
+ gen_helper_gvec_2i *gen,
+ gen_helper_gvec_3 *gen_ve2)
{
- TCGv_i64 shift = tcg_temp_new_i64();
+ bool byte = s->insn->data;
- read_vec_element_i64(shift, get_field(s, v3), 7, ES_8);
- if (s->fields.op2 == 0x74) {
- tcg_gen_andi_i64(shift, shift, 0x7);
+ if (!byte && s390_has_feat(S390_FEAT_VECTOR_ENH2)) {
+ gen_gvec_3_ool(get_field(s, v1), get_field(s, v2),
+ get_field(s, v3), 0, gen_ve2);
} else {
- tcg_gen_andi_i64(shift, shift, 0x78);
- }
+ TCGv_i64 shift = tcg_temp_new_i64();
- gen_gvec_2i_ool(get_field(s, v1), get_field(s, v2),
- shift, 0, gen_helper_gvec_vsl);
- tcg_temp_free_i64(shift);
+ read_vec_element_i64(shift, get_field(s, v3), 7, ES_8);
+ tcg_gen_andi_i64(shift, shift, byte ? 0x78 : 7);
+ gen_gvec_2i_ool(get_field(s, v1), get_field(s, v2), shift, 0, gen);
+ tcg_temp_free_i64(shift);
+ }
return DISAS_NEXT;
}
+static DisasJumpType op_vsl(DisasContext *s, DisasOps *o)
+{
+ return gen_vsh_by_byte(s, o, gen_helper_gvec_vsl,
+ gen_helper_gvec_vsl_ve2);
+}
+
+static DisasJumpType op_vsra(DisasContext *s, DisasOps *o)
+{
+ return gen_vsh_by_byte(s, o, gen_helper_gvec_vsra,
+ gen_helper_gvec_vsra_ve2);
+}
+
+static DisasJumpType op_vsrl(DisasContext *s, DisasOps *o)
+{
+ return gen_vsh_by_byte(s, o, gen_helper_gvec_vsrl,
+ gen_helper_gvec_vsrl_ve2);
+}
+
static DisasJumpType op_vsldb(DisasContext *s, DisasOps *o)
{
const uint8_t i4 = get_field(s, i4) & 0xf;
@@ -2064,40 +2085,6 @@ static DisasJumpType op_vsldb(DisasContext *s, DisasOps *o)
return DISAS_NEXT;
}
-static DisasJumpType op_vsra(DisasContext *s, DisasOps *o)
-{
- TCGv_i64 shift = tcg_temp_new_i64();
-
- read_vec_element_i64(shift, get_field(s, v3), 7, ES_8);
- if (s->fields.op2 == 0x7e) {
- tcg_gen_andi_i64(shift, shift, 0x7);
- } else {
- tcg_gen_andi_i64(shift, shift, 0x78);
- }
-
- gen_gvec_2i_ool(get_field(s, v1), get_field(s, v2),
- shift, 0, gen_helper_gvec_vsra);
- tcg_temp_free_i64(shift);
- return DISAS_NEXT;
-}
-
-static DisasJumpType op_vsrl(DisasContext *s, DisasOps *o)
-{
- TCGv_i64 shift = tcg_temp_new_i64();
-
- read_vec_element_i64(shift, get_field(s, v3), 7, ES_8);
- if (s->fields.op2 == 0x7c) {
- tcg_gen_andi_i64(shift, shift, 0x7);
- } else {
- tcg_gen_andi_i64(shift, shift, 0x78);
- }
-
- gen_gvec_2i_ool(get_field(s, v1), get_field(s, v2),
- shift, 0, gen_helper_gvec_vsrl);
- tcg_temp_free_i64(shift);
- return DISAS_NEXT;
-}
-
static DisasJumpType op_vs(DisasContext *s, DisasOps *o)
{
const uint8_t es = get_field(s, m4);
diff --git a/target/s390x/tcg/vec_int_helper.c b/target/s390x/tcg/vec_int_helper.c
index 5561b3ed90..4b6358c67c 100644
--- a/target/s390x/tcg/vec_int_helper.c
+++ b/target/s390x/tcg/vec_int_helper.c
@@ -540,18 +540,73 @@ void HELPER(gvec_vsl)(void *v1, const void *v2, uint64_t count,
s390_vec_shl(v1, v2, count);
}
+void HELPER(gvec_vsl_ve2)(void *v1, const void *v2, const void *v3,
+ uint32_t desc)
+{
+ S390Vector tmp;
+ uint32_t sh, e0, e1 = 0;
+ int i;
+
+ for (i = 15; i >= 0; --i, e1 = e0) {
+ e0 = s390_vec_read_element8(v2, i);
+ sh = s390_vec_read_element8(v3, i) & 7;
+
+ s390_vec_write_element8(&tmp, i, rol32(e0 | (e1 << 24), sh));
+ }
+
+ *(S390Vector *)v1 = tmp;
+}
+
void HELPER(gvec_vsra)(void *v1, const void *v2, uint64_t count,
uint32_t desc)
{
s390_vec_sar(v1, v2, count);
}
+void HELPER(gvec_vsra_ve2)(void *v1, const void *v2, const void *v3,
+ uint32_t desc)
+{
+ S390Vector tmp;
+ uint32_t sh, e0, e1 = 0;
+ int i = 0;
+
+ /* Byte 0 is special only. */
+ e0 = (int32_t)(int8_t)s390_vec_read_element8(v2, i);
+ sh = s390_vec_read_element8(v3, i) & 7;
+ s390_vec_write_element8(&tmp, i, e0 >> sh);
+
+ e1 = e0;
+ for (i = 1; i < 16; ++i, e1 = e0) {
+ e0 = s390_vec_read_element8(v2, i);
+ sh = s390_vec_read_element8(v3, i) & 7;
+ s390_vec_write_element8(&tmp, i, (e0 | e1 << 8) >> sh);
+ }
+
+ *(S390Vector *)v1 = tmp;
+}
+
void HELPER(gvec_vsrl)(void *v1, const void *v2, uint64_t count,
uint32_t desc)
{
s390_vec_shr(v1, v2, count);
}
+void HELPER(gvec_vsrl_ve2)(void *v1, const void *v2, const void *v3,
+ uint32_t desc)
+{
+ S390Vector tmp;
+ uint32_t sh, e0, e1 = 0;
+
+ for (int i = 0; i < 16; ++i, e1 = e0) {
+ e0 = s390_vec_read_element8(v2, i);
+ sh = s390_vec_read_element8(v3, i) & 7;
+
+ s390_vec_write_element8(&tmp, i, (e0 | (e1 << 8)) >> sh);
+ }
+
+ *(S390Vector *)v1 = tmp;
+}
+
#define DEF_VSCBI(BITS) \
void HELPER(gvec_vscbi##BITS)(void *v1, const void *v2, const void *v3, \
uint32_t desc) \
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v5 05/11] target/s390x: vxeh2: vector shift double by bit
2022-03-23 13:57 [PATCH v5 00/11] s390x/tcg: Implement Vector-Enhancements Facility 2 David Miller
` (3 preceding siblings ...)
2022-03-23 13:57 ` [PATCH v5 04/11] target/s390x: vxeh2: Update for changes to vector shifts David Miller
@ 2022-03-23 13:57 ` David Miller
2022-03-23 13:57 ` [PATCH v5 06/11] target/s390x: vxeh2: vector {load, store} elements reversed David Miller
` (6 subsequent siblings)
11 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2022-03-23 13:57 UTC (permalink / raw)
To: qemu-s390x, qemu-devel
Cc: thuth, david, cohuck, richard.henderson, farman, David Miller,
pasic, borntraeger
Signed-off-by: David Miller <dmiller423@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/s390x/tcg/insn-data.def | 6 +++-
target/s390x/tcg/translate_vx.c.inc | 55 +++++++++++++++++++++++++----
2 files changed, 53 insertions(+), 8 deletions(-)
diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def
index f487a64abf..98a31a557d 100644
--- a/target/s390x/tcg/insn-data.def
+++ b/target/s390x/tcg/insn-data.def
@@ -1207,12 +1207,16 @@
E(0xe774, VSL, VRR_c, V, 0, 0, 0, 0, vsl, 0, 0, IF_VEC)
/* VECTOR SHIFT LEFT BY BYTE */
E(0xe775, VSLB, VRR_c, V, 0, 0, 0, 0, vsl, 0, 1, IF_VEC)
+/* VECTOR SHIFT LEFT DOUBLE BY BIT */
+ E(0xe786, VSLD, VRI_d, VE2, 0, 0, 0, 0, vsld, 0, 0, IF_VEC)
/* VECTOR SHIFT LEFT DOUBLE BY BYTE */
- F(0xe777, VSLDB, VRI_d, V, 0, 0, 0, 0, vsldb, 0, IF_VEC)
+ E(0xe777, VSLDB, VRI_d, V, 0, 0, 0, 0, vsld, 0, 1, IF_VEC)
/* VECTOR SHIFT RIGHT ARITHMETIC */
E(0xe77e, VSRA, VRR_c, V, 0, 0, 0, 0, vsra, 0, 0, IF_VEC)
/* VECTOR SHIFT RIGHT ARITHMETIC BY BYTE */
E(0xe77f, VSRAB, VRR_c, V, 0, 0, 0, 0, vsra, 0, 1, IF_VEC)
+/* VECTOR SHIFT RIGHT DOUBLE BY BIT */
+ F(0xe787, VSRD, VRI_d, VE2, 0, 0, 0, 0, vsrd, 0, IF_VEC)
/* VECTOR SHIFT RIGHT LOGICAL */
E(0xe77c, VSRL, VRR_c, V, 0, 0, 0, 0, vsrl, 0, 0, IF_VEC)
/* VECTOR SHIFT RIGHT LOGICAL BY BYTE */
diff --git a/target/s390x/tcg/translate_vx.c.inc b/target/s390x/tcg/translate_vx.c.inc
index fd53ddafef..bb997de794 100644
--- a/target/s390x/tcg/translate_vx.c.inc
+++ b/target/s390x/tcg/translate_vx.c.inc
@@ -2056,14 +2056,23 @@ static DisasJumpType op_vsrl(DisasContext *s, DisasOps *o)
gen_helper_gvec_vsrl_ve2);
}
-static DisasJumpType op_vsldb(DisasContext *s, DisasOps *o)
+static DisasJumpType op_vsld(DisasContext *s, DisasOps *o)
{
- const uint8_t i4 = get_field(s, i4) & 0xf;
- const int left_shift = (i4 & 7) * 8;
- const int right_shift = 64 - left_shift;
- TCGv_i64 t0 = tcg_temp_new_i64();
- TCGv_i64 t1 = tcg_temp_new_i64();
- TCGv_i64 t2 = tcg_temp_new_i64();
+ const bool byte = s->insn->data;
+ const uint8_t mask = byte ? 15 : 7;
+ const uint8_t mul = byte ? 8 : 1;
+ const uint8_t i4 = get_field(s, i4);
+ const int right_shift = 64 - (i4 & 7) * mul;
+ TCGv_i64 t0, t1, t2;
+
+ if (i4 & ~mask) {
+ gen_program_exception(s, PGM_SPECIFICATION);
+ return DISAS_NORETURN;
+ }
+
+ t0 = tcg_temp_new_i64();
+ t1 = tcg_temp_new_i64();
+ t2 = tcg_temp_new_i64();
if ((i4 & 8) == 0) {
read_vec_element_i64(t0, get_field(s, v2), 0, ES_64);
@@ -2074,8 +2083,40 @@ static DisasJumpType op_vsldb(DisasContext *s, DisasOps *o)
read_vec_element_i64(t1, get_field(s, v3), 0, ES_64);
read_vec_element_i64(t2, get_field(s, v3), 1, ES_64);
}
+
tcg_gen_extract2_i64(t0, t1, t0, right_shift);
tcg_gen_extract2_i64(t1, t2, t1, right_shift);
+
+ write_vec_element_i64(t0, get_field(s, v1), 0, ES_64);
+ write_vec_element_i64(t1, get_field(s, v1), 1, ES_64);
+
+ tcg_temp_free(t0);
+ tcg_temp_free(t1);
+ tcg_temp_free(t2);
+ return DISAS_NEXT;
+}
+
+static DisasJumpType op_vsrd(DisasContext *s, DisasOps *o)
+{
+ const uint8_t i4 = get_field(s, i4);
+ TCGv_i64 t0, t1, t2;
+
+ if (i4 & ~7) {
+ gen_program_exception(s, PGM_SPECIFICATION);
+ return DISAS_NORETURN;
+ }
+
+ t0 = tcg_temp_new_i64();
+ t1 = tcg_temp_new_i64();
+ t2 = tcg_temp_new_i64();
+
+ read_vec_element_i64(t0, get_field(s, v2), 1, ES_64);
+ read_vec_element_i64(t1, get_field(s, v3), 0, ES_64);
+ read_vec_element_i64(t2, get_field(s, v3), 1, ES_64);
+
+ tcg_gen_extract2_i64(t0, t1, t0, i4);
+ tcg_gen_extract2_i64(t1, t2, t1, i4);
+
write_vec_element_i64(t0, get_field(s, v1), 0, ES_64);
write_vec_element_i64(t1, get_field(s, v1), 1, ES_64);
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v5 06/11] target/s390x: vxeh2: vector {load, store} elements reversed
2022-03-23 13:57 [PATCH v5 00/11] s390x/tcg: Implement Vector-Enhancements Facility 2 David Miller
` (4 preceding siblings ...)
2022-03-23 13:57 ` [PATCH v5 05/11] target/s390x: vxeh2: vector shift double by bit David Miller
@ 2022-03-23 13:57 ` David Miller
2022-03-23 13:57 ` [PATCH v5 07/11] target/s390x: vxeh2: vector {load, store} byte reversed elements David Miller
` (5 subsequent siblings)
11 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2022-03-23 13:57 UTC (permalink / raw)
To: qemu-s390x, qemu-devel
Cc: thuth, david, cohuck, richard.henderson, farman, David Miller,
pasic, borntraeger
Signed-off-by: David Miller <dmiller423@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
---
target/s390x/tcg/insn-data.def | 4 ++
target/s390x/tcg/translate_vx.c.inc | 84 +++++++++++++++++++++++++++++
2 files changed, 88 insertions(+)
diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def
index 98a31a557d..b524541a7d 100644
--- a/target/s390x/tcg/insn-data.def
+++ b/target/s390x/tcg/insn-data.def
@@ -1037,6 +1037,8 @@
E(0xe741, VLEIH, VRI_a, V, 0, 0, 0, 0, vlei, 0, ES_16, IF_VEC)
E(0xe743, VLEIF, VRI_a, V, 0, 0, 0, 0, vlei, 0, ES_32, IF_VEC)
E(0xe742, VLEIG, VRI_a, V, 0, 0, 0, 0, vlei, 0, ES_64, IF_VEC)
+/* VECTOR LOAD ELEMENTS REVERSED */
+ F(0xe607, VLER, VRX, VE2, la2, 0, 0, 0, vler, 0, IF_VEC)
/* VECTOR LOAD GR FROM VR ELEMENT */
F(0xe721, VLGV, VRS_c, V, la2, 0, r1, 0, vlgv, 0, IF_VEC)
/* VECTOR LOAD LOGICAL ELEMENT AND ZERO */
@@ -1082,6 +1084,8 @@
E(0xe709, VSTEH, VRX, V, la2, 0, 0, 0, vste, 0, ES_16, IF_VEC)
E(0xe70b, VSTEF, VRX, V, la2, 0, 0, 0, vste, 0, ES_32, IF_VEC)
E(0xe70a, VSTEG, VRX, V, la2, 0, 0, 0, vste, 0, ES_64, IF_VEC)
+/* VECTOR STORE ELEMENTS REVERSED */
+ F(0xe60f, VSTER, VRX, VE2, la2, 0, 0, 0, vster, 0, IF_VEC)
/* VECTOR STORE MULTIPLE */
F(0xe73e, VSTM, VRS_a, V, la2, 0, 0, 0, vstm, 0, IF_VEC)
/* VECTOR STORE WITH LENGTH */
diff --git a/target/s390x/tcg/translate_vx.c.inc b/target/s390x/tcg/translate_vx.c.inc
index bb997de794..0bef1200e3 100644
--- a/target/s390x/tcg/translate_vx.c.inc
+++ b/target/s390x/tcg/translate_vx.c.inc
@@ -492,6 +492,46 @@ static DisasJumpType op_vlei(DisasContext *s, DisasOps *o)
return DISAS_NEXT;
}
+static DisasJumpType op_vler(DisasContext *s, DisasOps *o)
+{
+ const uint8_t es = get_field(s, m3);
+
+ if (es < ES_16 || es > ES_64) {
+ gen_program_exception(s, PGM_SPECIFICATION);
+ return DISAS_NORETURN;
+ }
+
+ TCGv_i64 t0 = tcg_temp_new_i64();
+ TCGv_i64 t1 = tcg_temp_new_i64();
+
+ /* Begin with the two doublewords swapped... */
+ tcg_gen_qemu_ld_i64(t1, o->addr1, get_mem_index(s), MO_TEUQ);
+ gen_addi_and_wrap_i64(s, o->addr1, o->addr1, 8);
+ tcg_gen_qemu_ld_i64(t0, o->addr1, get_mem_index(s), MO_TEUQ);
+
+ /* ... then swap smaller elements within the doublewords as required. */
+ switch (es) {
+ case MO_16:
+ tcg_gen_hswap_i64(t1, t1);
+ tcg_gen_hswap_i64(t0, t0);
+ break;
+ case MO_32:
+ tcg_gen_wswap_i64(t1, t1);
+ tcg_gen_wswap_i64(t0, t0);
+ break;
+ case MO_64:
+ break;
+ default:
+ g_assert_not_reached();
+ }
+
+ write_vec_element_i64(t0, get_field(s, v1), 0, ES_64);
+ write_vec_element_i64(t1, get_field(s, v1), 1, ES_64);
+ tcg_temp_free(t0);
+ tcg_temp_free(t1);
+ return DISAS_NEXT;
+}
+
static DisasJumpType op_vlgv(DisasContext *s, DisasOps *o)
{
const uint8_t es = get_field(s, m4);
@@ -976,6 +1016,50 @@ static DisasJumpType op_vste(DisasContext *s, DisasOps *o)
return DISAS_NEXT;
}
+static DisasJumpType op_vster(DisasContext *s, DisasOps *o)
+{
+ const uint8_t es = get_field(s, m3);
+ TCGv_i64 t0, t1;
+
+ if (es < ES_16 || es > ES_64) {
+ gen_program_exception(s, PGM_SPECIFICATION);
+ return DISAS_NORETURN;
+ }
+
+ /* Probe write access before actually modifying memory */
+ gen_helper_probe_write_access(cpu_env, o->addr1, tcg_constant_i64(16));
+
+ /* Begin with the two doublewords swapped... */
+ t0 = tcg_temp_new_i64();
+ t1 = tcg_temp_new_i64();
+ read_vec_element_i64(t1, get_field(s, v1), 0, ES_64);
+ read_vec_element_i64(t0, get_field(s, v1), 1, ES_64);
+
+ /* ... then swap smaller elements within the doublewords as required. */
+ switch (es) {
+ case MO_16:
+ tcg_gen_hswap_i64(t1, t1);
+ tcg_gen_hswap_i64(t0, t0);
+ break;
+ case MO_32:
+ tcg_gen_wswap_i64(t1, t1);
+ tcg_gen_wswap_i64(t0, t0);
+ break;
+ case MO_64:
+ break;
+ default:
+ g_assert_not_reached();
+ }
+
+ tcg_gen_qemu_st_i64(t0, o->addr1, get_mem_index(s), MO_TEUQ);
+ gen_addi_and_wrap_i64(s, o->addr1, o->addr1, 8);
+ tcg_gen_qemu_st_i64(t1, o->addr1, get_mem_index(s), MO_TEUQ);
+
+ tcg_temp_free(t0);
+ tcg_temp_free(t1);
+ return DISAS_NEXT;
+}
+
static DisasJumpType op_vstm(DisasContext *s, DisasOps *o)
{
const uint8_t v3 = get_field(s, v3);
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v5 07/11] target/s390x: vxeh2: vector {load, store} byte reversed elements
2022-03-23 13:57 [PATCH v5 00/11] s390x/tcg: Implement Vector-Enhancements Facility 2 David Miller
` (5 preceding siblings ...)
2022-03-23 13:57 ` [PATCH v5 06/11] target/s390x: vxeh2: vector {load, store} elements reversed David Miller
@ 2022-03-23 13:57 ` David Miller
2022-03-23 14:28 ` David Hildenbrand
2022-03-23 13:57 ` [PATCH v5 08/11] target/s390x: vxeh2: vector {load, store} byte reversed element David Miller
` (4 subsequent siblings)
11 siblings, 1 reply; 19+ messages in thread
From: David Miller @ 2022-03-23 13:57 UTC (permalink / raw)
To: qemu-s390x, qemu-devel
Cc: thuth, david, cohuck, richard.henderson, farman, David Miller,
pasic, borntraeger
Signed-off-by: David Miller <dmiller423@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
---
target/s390x/tcg/insn-data.def | 4 +
target/s390x/tcg/translate_vx.c.inc | 113 ++++++++++++++++++++++++++++
2 files changed, 117 insertions(+)
diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def
index b524541a7d..ee6e1dc9e5 100644
--- a/target/s390x/tcg/insn-data.def
+++ b/target/s390x/tcg/insn-data.def
@@ -1027,6 +1027,8 @@
F(0xe756, VLR, VRR_a, V, 0, 0, 0, 0, vlr, 0, IF_VEC)
/* VECTOR LOAD AND REPLICATE */
F(0xe705, VLREP, VRX, V, la2, 0, 0, 0, vlrep, 0, IF_VEC)
+/* VECTOR LOAD BYTE REVERSED ELEMENTS */
+ F(0xe606, VLBR, VRX, VE2, la2, 0, 0, 0, vlbr, 0, IF_VEC)
/* VECTOR LOAD ELEMENT */
E(0xe700, VLEB, VRX, V, la2, 0, 0, 0, vle, 0, ES_8, IF_VEC)
E(0xe701, VLEH, VRX, V, la2, 0, 0, 0, vle, 0, ES_16, IF_VEC)
@@ -1079,6 +1081,8 @@
F(0xe75f, VSEG, VRR_a, V, 0, 0, 0, 0, vseg, 0, IF_VEC)
/* VECTOR STORE */
F(0xe70e, VST, VRX, V, la2, 0, 0, 0, vst, 0, IF_VEC)
+/* VECTOR STORE BYTE REVERSED ELEMENTS */
+ F(0xe60e, VSTBR, VRX, VE2, la2, 0, 0, 0, vstbr, 0, IF_VEC)
/* VECTOR STORE ELEMENT */
E(0xe708, VSTEB, VRX, V, la2, 0, 0, 0, vste, 0, ES_8, IF_VEC)
E(0xe709, VSTEH, VRX, V, la2, 0, 0, 0, vste, 0, ES_16, IF_VEC)
diff --git a/target/s390x/tcg/translate_vx.c.inc b/target/s390x/tcg/translate_vx.c.inc
index 0bef1200e3..c0b4a5b9ed 100644
--- a/target/s390x/tcg/translate_vx.c.inc
+++ b/target/s390x/tcg/translate_vx.c.inc
@@ -457,6 +457,62 @@ static DisasJumpType op_vlrep(DisasContext *s, DisasOps *o)
return DISAS_NEXT;
}
+static DisasJumpType op_vlbr(DisasContext *s, DisasOps *o)
+{
+ const uint8_t es = get_field(s, m3);
+ TCGv_i64 t0, t1;
+
+ if (es < ES_16 || es > ES_128) {
+ gen_program_exception(s, PGM_SPECIFICATION);
+ return DISAS_NORETURN;
+ }
+
+ t0 = tcg_temp_new_i64();
+ t1 = tcg_temp_new_i64();
+
+
+ if (es == ES_128) {
+ tcg_gen_qemu_ld_i64(t1, o->addr1, get_mem_index(s), MO_LEUQ);
+ gen_addi_and_wrap_i64(s, o->addr1, o->addr1, 8);
+ tcg_gen_qemu_ld_i64(t0, o->addr1, get_mem_index(s), MO_LEUQ);
+ goto write;
+ }
+
+ /* Begin with byte reversed doublewords... */
+ tcg_gen_qemu_ld_i64(t0, o->addr1, get_mem_index(s), MO_LEUQ);
+ gen_addi_and_wrap_i64(s, o->addr1, o->addr1, 8);
+ tcg_gen_qemu_ld_i64(t1, o->addr1, get_mem_index(s), MO_LEUQ);
+
+ /*
+ * For 16 and 32-bit elements, the doubleword bswap also reversed
+ * the order of the elements. Perform a larger order swap to put
+ * them back into place. For the 128-bit "element", finish the
+ * bswap by swapping the doublewords.
+ */
+ switch (es) {
+ case ES_16:
+ tcg_gen_hswap_i64(t0, t0);
+ tcg_gen_hswap_i64(t1, t1);
+ break;
+ case ES_32:
+ tcg_gen_wswap_i64(t0, t0);
+ tcg_gen_wswap_i64(t1, t1);
+ break;
+ case ES_64:
+ break;
+ default:
+ g_assert_not_reached();
+ }
+
+write:
+ write_vec_element_i64(t0, get_field(s, v1), 0, ES_64);
+ write_vec_element_i64(t1, get_field(s, v1), 1, ES_64);
+
+ tcg_temp_free(t0);
+ tcg_temp_free(t1);
+ return DISAS_NEXT;
+}
+
static DisasJumpType op_vle(DisasContext *s, DisasOps *o)
{
const uint8_t es = s->insn->data;
@@ -998,6 +1054,63 @@ static DisasJumpType op_vst(DisasContext *s, DisasOps *o)
return DISAS_NEXT;
}
+static DisasJumpType op_vstbr(DisasContext *s, DisasOps *o)
+{
+ const uint8_t es = get_field(s, m3);
+ TCGv_i64 t0, t1;
+
+ if (es < ES_16 || es > ES_128) {
+ gen_program_exception(s, PGM_SPECIFICATION);
+ return DISAS_NORETURN;
+ }
+
+ /* Probe write access before actually modifying memory */
+ gen_helper_probe_write_access(cpu_env, o->addr1, tcg_constant_i64(16));
+
+ t0 = tcg_temp_new_i64();
+ t1 = tcg_temp_new_i64();
+
+
+ if (es == ES_128) {
+ read_vec_element_i64(t1, get_field(s, v1), 0, ES_64);
+ read_vec_element_i64(t0, get_field(s, v1), 1, ES_64);
+ goto write;
+ }
+
+ read_vec_element_i64(t0, get_field(s, v1), 0, ES_64);
+ read_vec_element_i64(t1, get_field(s, v1), 1, ES_64);
+
+ /*
+ * For 16 and 32-bit elements, the doubleword bswap below will
+ * reverse the order of the elements. Perform a larger order
+ * swap to put them back into place. For the 128-bit "element",
+ * finish the bswap by swapping the doublewords.
+ */
+ switch (es) {
+ case MO_16:
+ tcg_gen_hswap_i64(t0, t0);
+ tcg_gen_hswap_i64(t1, t1);
+ break;
+ case MO_32:
+ tcg_gen_wswap_i64(t0, t0);
+ tcg_gen_wswap_i64(t1, t1);
+ break;
+ case MO_64:
+ break;
+ default:
+ g_assert_not_reached();
+ }
+
+write:
+ tcg_gen_qemu_st_i64(t0, o->addr1, get_mem_index(s), MO_LEUQ);
+ gen_addi_and_wrap_i64(s, o->addr1, o->addr1, 8);
+ tcg_gen_qemu_st_i64(t1, o->addr1, get_mem_index(s), MO_LEUQ);
+
+ tcg_temp_free(t0);
+ tcg_temp_free(t1);
+ return DISAS_NEXT;
+}
+
static DisasJumpType op_vste(DisasContext *s, DisasOps *o)
{
const uint8_t es = s->insn->data;
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v5 07/11] target/s390x: vxeh2: vector {load, store} byte reversed elements
2022-03-23 13:57 ` [PATCH v5 07/11] target/s390x: vxeh2: vector {load, store} byte reversed elements David Miller
@ 2022-03-23 14:28 ` David Hildenbrand
0 siblings, 0 replies; 19+ messages in thread
From: David Hildenbrand @ 2022-03-23 14:28 UTC (permalink / raw)
To: David Miller, qemu-s390x, qemu-devel
Cc: thuth, cohuck, richard.henderson, farman, pasic, borntraeger
> +static DisasJumpType op_vlbr(DisasContext *s, DisasOps *o)
> +{
> + const uint8_t es = get_field(s, m3);
> + TCGv_i64 t0, t1;
> +
> + if (es < ES_16 || es > ES_128) {
> + gen_program_exception(s, PGM_SPECIFICATION);
> + return DISAS_NORETURN;
> + }
> +
> + t0 = tcg_temp_new_i64();
> + t1 = tcg_temp_new_i64();
> +
> +
> + if (es == ES_128) {
> + tcg_gen_qemu_ld_i64(t1, o->addr1, get_mem_index(s), MO_LEUQ);
> + gen_addi_and_wrap_i64(s, o->addr1, o->addr1, 8);
> + tcg_gen_qemu_ld_i64(t0, o->addr1, get_mem_index(s), MO_LEUQ);
> + goto write;
> + }
> +
> + /* Begin with byte reversed doublewords... */
> + tcg_gen_qemu_ld_i64(t0, o->addr1, get_mem_index(s), MO_LEUQ);
> + gen_addi_and_wrap_i64(s, o->addr1, o->addr1, 8);
> + tcg_gen_qemu_ld_i64(t1, o->addr1, get_mem_index(s), MO_LEUQ);
> +
> + /*
> + * For 16 and 32-bit elements, the doubleword bswap also reversed
> + * the order of the elements. Perform a larger order swap to put
> + * them back into place. For the 128-bit "element", finish the
> + * bswap by swapping the doublewords.
The last sentence still needs to go.
[...]
> +static DisasJumpType op_vstbr(DisasContext *s, DisasOps *o)
> +{
> + const uint8_t es = get_field(s, m3);
> + TCGv_i64 t0, t1;
> +
> + if (es < ES_16 || es > ES_128) {
> + gen_program_exception(s, PGM_SPECIFICATION);
> + return DISAS_NORETURN;
> + }
> +
> + /* Probe write access before actually modifying memory */
> + gen_helper_probe_write_access(cpu_env, o->addr1, tcg_constant_i64(16));
> +
> + t0 = tcg_temp_new_i64();
> + t1 = tcg_temp_new_i64();
> +
> +
> + if (es == ES_128) {
> + read_vec_element_i64(t1, get_field(s, v1), 0, ES_64);
> + read_vec_element_i64(t0, get_field(s, v1), 1, ES_64);
> + goto write;
> + }
> +
> + read_vec_element_i64(t0, get_field(s, v1), 0, ES_64);
> + read_vec_element_i64(t1, get_field(s, v1), 1, ES_64);
> +
> + /*
> + * For 16 and 32-bit elements, the doubleword bswap below will
> + * reverse the order of the elements. Perform a larger order
> + * swap to put them back into place. For the 128-bit "element",
> + * finish the bswap by swapping the doublewords.
Dito.
I assume Thomas can fixup when applying.
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v5 08/11] target/s390x: vxeh2: vector {load, store} byte reversed element
2022-03-23 13:57 [PATCH v5 00/11] s390x/tcg: Implement Vector-Enhancements Facility 2 David Miller
` (6 preceding siblings ...)
2022-03-23 13:57 ` [PATCH v5 07/11] target/s390x: vxeh2: vector {load, store} byte reversed elements David Miller
@ 2022-03-23 13:57 ` David Miller
2022-03-23 13:57 ` [PATCH v5 09/11] target/s390x: add S390_FEAT_VECTOR_ENH2 to qemu CPU model David Miller
` (3 subsequent siblings)
11 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2022-03-23 13:57 UTC (permalink / raw)
To: qemu-s390x, qemu-devel
Cc: thuth, david, cohuck, richard.henderson, farman, David Miller,
pasic, borntraeger
Signed-off-by: David Miller <dmiller423@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
---
target/s390x/tcg/insn-data.def | 12 ++++
target/s390x/tcg/translate_vx.c.inc | 85 +++++++++++++++++++++++++++++
2 files changed, 97 insertions(+)
diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def
index ee6e1dc9e5..5e448bb2c4 100644
--- a/target/s390x/tcg/insn-data.def
+++ b/target/s390x/tcg/insn-data.def
@@ -1027,6 +1027,14 @@
F(0xe756, VLR, VRR_a, V, 0, 0, 0, 0, vlr, 0, IF_VEC)
/* VECTOR LOAD AND REPLICATE */
F(0xe705, VLREP, VRX, V, la2, 0, 0, 0, vlrep, 0, IF_VEC)
+/* VECTOR LOAD BYTE REVERSED ELEMENT */
+ E(0xe601, VLEBRH, VRX, VE2, la2, 0, 0, 0, vlebr, 0, ES_16, IF_VEC)
+ E(0xe603, VLEBRF, VRX, VE2, la2, 0, 0, 0, vlebr, 0, ES_32, IF_VEC)
+ E(0xe602, VLEBRG, VRX, VE2, la2, 0, 0, 0, vlebr, 0, ES_64, IF_VEC)
+/* VECTOR LOAD BYTE REVERSED ELEMENT AND REPLICATE */
+ F(0xe605, VLBRREP, VRX, VE2, la2, 0, 0, 0, vlbrrep, 0, IF_VEC)
+/* VECTOR LOAD BYTE REVERSED ELEMENT AND ZERO */
+ F(0xe604, VLLEBRZ, VRX, VE2, la2, 0, 0, 0, vllebrz, 0, IF_VEC)
/* VECTOR LOAD BYTE REVERSED ELEMENTS */
F(0xe606, VLBR, VRX, VE2, la2, 0, 0, 0, vlbr, 0, IF_VEC)
/* VECTOR LOAD ELEMENT */
@@ -1081,6 +1089,10 @@
F(0xe75f, VSEG, VRR_a, V, 0, 0, 0, 0, vseg, 0, IF_VEC)
/* VECTOR STORE */
F(0xe70e, VST, VRX, V, la2, 0, 0, 0, vst, 0, IF_VEC)
+/* VECTOR STORE BYTE REVERSED ELEMENT */
+ E(0xe609, VSTEBRH, VRX, VE2, la2, 0, 0, 0, vstebr, 0, ES_16, IF_VEC)
+ E(0xe60b, VSTEBRF, VRX, VE2, la2, 0, 0, 0, vstebr, 0, ES_32, IF_VEC)
+ E(0xe60a, VSTEBRG, VRX, VE2, la2, 0, 0, 0, vstebr, 0, ES_64, IF_VEC)
/* VECTOR STORE BYTE REVERSED ELEMENTS */
F(0xe60e, VSTBR, VRX, VE2, la2, 0, 0, 0, vstbr, 0, IF_VEC)
/* VECTOR STORE ELEMENT */
diff --git a/target/s390x/tcg/translate_vx.c.inc b/target/s390x/tcg/translate_vx.c.inc
index c0b4a5b9ed..e57d0395db 100644
--- a/target/s390x/tcg/translate_vx.c.inc
+++ b/target/s390x/tcg/translate_vx.c.inc
@@ -457,6 +457,73 @@ static DisasJumpType op_vlrep(DisasContext *s, DisasOps *o)
return DISAS_NEXT;
}
+static DisasJumpType op_vlebr(DisasContext *s, DisasOps *o)
+{
+ const uint8_t es = s->insn->data;
+ const uint8_t enr = get_field(s, m3);
+ TCGv_i64 tmp;
+
+ if (!valid_vec_element(enr, es)) {
+ gen_program_exception(s, PGM_SPECIFICATION);
+ return DISAS_NORETURN;
+ }
+
+ tmp = tcg_temp_new_i64();
+ tcg_gen_qemu_ld_i64(tmp, o->addr1, get_mem_index(s), MO_LE | es);
+ write_vec_element_i64(tmp, get_field(s, v1), enr, es);
+ tcg_temp_free_i64(tmp);
+ return DISAS_NEXT;
+}
+
+static DisasJumpType op_vlbrrep(DisasContext *s, DisasOps *o)
+{
+ const uint8_t es = get_field(s, m3);
+ TCGv_i64 tmp;
+
+ if (es < ES_16 || es > ES_64) {
+ gen_program_exception(s, PGM_SPECIFICATION);
+ return DISAS_NORETURN;
+ }
+
+ tmp = tcg_temp_new_i64();
+ tcg_gen_qemu_ld_i64(tmp, o->addr1, get_mem_index(s), MO_LE | es);
+ gen_gvec_dup_i64(es, get_field(s, v1), tmp);
+ tcg_temp_free_i64(tmp);
+ return DISAS_NEXT;
+}
+
+static DisasJumpType op_vllebrz(DisasContext *s, DisasOps *o)
+{
+ const uint8_t m3 = get_field(s, m3);
+ TCGv_i64 tmp;
+ int es, lshift;
+
+ switch (m3) {
+ case ES_16:
+ case ES_32:
+ case ES_64:
+ es = m3;
+ lshift = 0;
+ break;
+ case 6:
+ es = ES_32;
+ lshift = 32;
+ break;
+ default:
+ gen_program_exception(s, PGM_SPECIFICATION);
+ return DISAS_NORETURN;
+ }
+
+ tmp = tcg_temp_new_i64();
+ tcg_gen_qemu_ld_i64(tmp, o->addr1, get_mem_index(s), MO_LE | es);
+ tcg_gen_shli_i64(tmp, tmp, lshift);
+
+ write_vec_element_i64(tmp, get_field(s, v1), 0, ES_64);
+ write_vec_element_i64(tcg_constant_i64(0), get_field(s, v1), 1, ES_64);
+ tcg_temp_free_i64(tmp);
+ return DISAS_NEXT;
+}
+
static DisasJumpType op_vlbr(DisasContext *s, DisasOps *o)
{
const uint8_t es = get_field(s, m3);
@@ -1054,6 +1121,24 @@ static DisasJumpType op_vst(DisasContext *s, DisasOps *o)
return DISAS_NEXT;
}
+static DisasJumpType op_vstebr(DisasContext *s, DisasOps *o)
+{
+ const uint8_t es = s->insn->data;
+ const uint8_t enr = get_field(s, m3);
+ TCGv_i64 tmp;
+
+ if (!valid_vec_element(enr, es)) {
+ gen_program_exception(s, PGM_SPECIFICATION);
+ return DISAS_NORETURN;
+ }
+
+ tmp = tcg_temp_new_i64();
+ read_vec_element_i64(tmp, get_field(s, v1), enr, es);
+ tcg_gen_qemu_st_i64(tmp, o->addr1, get_mem_index(s), MO_LE | es);
+ tcg_temp_free_i64(tmp);
+ return DISAS_NEXT;
+}
+
static DisasJumpType op_vstbr(DisasContext *s, DisasOps *o)
{
const uint8_t es = get_field(s, m3);
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v5 09/11] target/s390x: add S390_FEAT_VECTOR_ENH2 to qemu CPU model
2022-03-23 13:57 [PATCH v5 00/11] s390x/tcg: Implement Vector-Enhancements Facility 2 David Miller
` (7 preceding siblings ...)
2022-03-23 13:57 ` [PATCH v5 08/11] target/s390x: vxeh2: vector {load, store} byte reversed element David Miller
@ 2022-03-23 13:57 ` David Miller
2022-03-23 14:03 ` David Miller
2022-03-23 13:57 ` [PATCH v5 10/11] tests/tcg/s390x: Tests for Vector Enhancements Facility 2 David Miller
` (2 subsequent siblings)
11 siblings, 1 reply; 19+ messages in thread
From: David Miller @ 2022-03-23 13:57 UTC (permalink / raw)
To: qemu-s390x, qemu-devel
Cc: thuth, david, cohuck, richard.henderson, farman, David Miller,
pasic, borntraeger
Signed-off-by: David Miller <dmiller423@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/s390x/gen-features.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/target/s390x/gen-features.c b/target/s390x/gen-features.c
index 22846121c4..499a3b10a8 100644
--- a/target/s390x/gen-features.c
+++ b/target/s390x/gen-features.c
@@ -740,7 +740,9 @@ static uint16_t qemu_V6_2[] = {
static uint16_t qemu_LATEST[] = {
S390_FEAT_MISC_INSTRUCTION_EXT3,
+ S390_FEAT_VECTOR_ENH2,
};
+
/* add all new definitions before this point */
static uint16_t qemu_MAX[] = {
/* generates a dependency warning, leave it out for now */
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v5 09/11] target/s390x: add S390_FEAT_VECTOR_ENH2 to qemu CPU model
2022-03-23 13:57 ` [PATCH v5 09/11] target/s390x: add S390_FEAT_VECTOR_ENH2 to qemu CPU model David Miller
@ 2022-03-23 14:03 ` David Miller
2022-03-23 14:25 ` David Hildenbrand
0 siblings, 1 reply; 19+ messages in thread
From: David Miller @ 2022-03-23 14:03 UTC (permalink / raw)
To: qemu-s390x, qemu-devel
Cc: Thomas Huth, David Hildenbrand, cohuck, Richard Henderson,
farman, pasic, Christian Borntraeger
I am sending v5 now as I'm away from my desk until Sunday.
As I'm unsure whether or not this just needs a V7_0 and the patch
shown as an example isn't a direct correlation : I've left it out for
now.
I will dig into it further Monday March 28th.
Thanks
- David Miller
On Wed, Mar 23, 2022 at 9:57 AM David Miller <dmiller423@gmail.com> wrote:
>
> Signed-off-by: David Miller <dmiller423@gmail.com>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/s390x/gen-features.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/target/s390x/gen-features.c b/target/s390x/gen-features.c
> index 22846121c4..499a3b10a8 100644
> --- a/target/s390x/gen-features.c
> +++ b/target/s390x/gen-features.c
> @@ -740,7 +740,9 @@ static uint16_t qemu_V6_2[] = {
>
> static uint16_t qemu_LATEST[] = {
> S390_FEAT_MISC_INSTRUCTION_EXT3,
> + S390_FEAT_VECTOR_ENH2,
> };
> +
> /* add all new definitions before this point */
> static uint16_t qemu_MAX[] = {
> /* generates a dependency warning, leave it out for now */
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v5 09/11] target/s390x: add S390_FEAT_VECTOR_ENH2 to qemu CPU model
2022-03-23 14:03 ` David Miller
@ 2022-03-23 14:25 ` David Hildenbrand
0 siblings, 0 replies; 19+ messages in thread
From: David Hildenbrand @ 2022-03-23 14:25 UTC (permalink / raw)
To: David Miller, qemu-s390x, qemu-devel
Cc: Thomas Huth, cohuck, Richard Henderson, farman, pasic,
Christian Borntraeger
On 23.03.22 15:03, David Miller wrote:
> I am sending v5 now as I'm away from my desk until Sunday.
> As I'm unsure whether or not this just needs a V7_0 and the patch
> shown as an example isn't a direct correlation : I've left it out for
> now.
> I will dig into it further Monday March 28th.
We can tackle this after the 7.0 release, then basing it on the new
compat machines and most probably on [1].
Should be easy to adjust.
[1] https://lkml.kernel.org/r/20220322112256.118417-1-david@redhat.com
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v5 10/11] tests/tcg/s390x: Tests for Vector Enhancements Facility 2
2022-03-23 13:57 [PATCH v5 00/11] s390x/tcg: Implement Vector-Enhancements Facility 2 David Miller
` (8 preceding siblings ...)
2022-03-23 13:57 ` [PATCH v5 09/11] target/s390x: add S390_FEAT_VECTOR_ENH2 to qemu CPU model David Miller
@ 2022-03-23 13:57 ` David Miller
2022-03-23 13:57 ` [PATCH v5 11/11] target/s390x: Fix writeback to v1 in helper_vstl David Miller
2022-04-25 7:43 ` [PATCH v5 00/11] s390x/tcg: Implement Vector-Enhancements Facility 2 Christian Borntraeger
11 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2022-03-23 13:57 UTC (permalink / raw)
To: qemu-s390x, qemu-devel
Cc: thuth, david, cohuck, richard.henderson, farman, David Miller,
pasic, borntraeger
Signed-off-by: David Miller <dmiller423@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tests/tcg/s390x/Makefile.target | 8 ++
tests/tcg/s390x/vx.h | 19 +++++
tests/tcg/s390x/vxeh2_vcvt.c | 88 ++++++++++++++++++++
tests/tcg/s390x/vxeh2_vlstr.c | 139 ++++++++++++++++++++++++++++++++
tests/tcg/s390x/vxeh2_vs.c | 93 +++++++++++++++++++++
5 files changed, 347 insertions(+)
create mode 100644 tests/tcg/s390x/vx.h
create mode 100644 tests/tcg/s390x/vxeh2_vcvt.c
create mode 100644 tests/tcg/s390x/vxeh2_vlstr.c
create mode 100644 tests/tcg/s390x/vxeh2_vs.c
diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target
index 8c9b6a13ce..921a056dd1 100644
--- a/tests/tcg/s390x/Makefile.target
+++ b/tests/tcg/s390x/Makefile.target
@@ -16,6 +16,14 @@ TESTS+=shift
TESTS+=trap
TESTS+=signals-s390x
+VECTOR_TESTS=vxeh2_vs
+VECTOR_TESTS+=vxeh2_vcvt
+VECTOR_TESTS+=vxeh2_vlstr
+
+TESTS+=$(VECTOR_TESTS)
+
+$(VECTOR_TESTS): CFLAGS+=-march=z15 -O2
+
ifneq ($(HAVE_GDB_BIN),)
GDB_SCRIPT=$(SRC_PATH)/tests/guest-debug/run-test.py
diff --git a/tests/tcg/s390x/vx.h b/tests/tcg/s390x/vx.h
new file mode 100644
index 0000000000..2e66f8b714
--- /dev/null
+++ b/tests/tcg/s390x/vx.h
@@ -0,0 +1,19 @@
+#ifndef QEMU_TESTS_S390X_VX_H
+#define QEMU_TESTS_S390X_VX_H
+
+typedef union S390Vector {
+ uint64_t d[2]; /* doubleword */
+ uint32_t w[4]; /* word */
+ uint16_t h[8]; /* halfword */
+ uint8_t b[16]; /* byte */
+ float f[4]; /* float32 */
+ double fd[2]; /* float64 */
+ __uint128_t v;
+} S390Vector;
+
+#define ES8 0
+#define ES16 1
+#define ES32 2
+#define ES64 3
+
+#endif
\ No newline at end of file
diff --git a/tests/tcg/s390x/vxeh2_vcvt.c b/tests/tcg/s390x/vxeh2_vcvt.c
new file mode 100644
index 0000000000..d6e551c16e
--- /dev/null
+++ b/tests/tcg/s390x/vxeh2_vcvt.c
@@ -0,0 +1,88 @@
+/*
+ * vxeh2_vcvt: vector-enhancements facility 2 vector convert *
+ */
+#include <stdint.h>
+#include "vx.h"
+
+#define M_S 8
+#define M4_XxC 4
+#define M4_def M4_XxC
+
+static inline void vcfps(S390Vector *v1, S390Vector *v2,
+ const uint8_t m3, const uint8_t m4, const uint8_t m5)
+{
+ asm volatile("vcfps %[v1], %[v2], %[m3], %[m4], %[m5]\n"
+ : [v1] "=v" (v1->v)
+ : [v2] "v" (v2->v)
+ , [m3] "i" (m3)
+ , [m4] "i" (m4)
+ , [m5] "i" (m5));
+}
+
+static inline void vcfpl(S390Vector *v1, S390Vector *v2,
+ const uint8_t m3, const uint8_t m4, const uint8_t m5)
+{
+ asm volatile("vcfpl %[v1], %[v2], %[m3], %[m4], %[m5]\n"
+ : [v1] "=v" (v1->v)
+ : [v2] "v" (v2->v)
+ , [m3] "i" (m3)
+ , [m4] "i" (m4)
+ , [m5] "i" (m5));
+}
+
+static inline void vcsfp(S390Vector *v1, S390Vector *v2,
+ const uint8_t m3, const uint8_t m4, const uint8_t m5)
+{
+ asm volatile("vcsfp %[v1], %[v2], %[m3], %[m4], %[m5]\n"
+ : [v1] "=v" (v1->v)
+ : [v2] "v" (v2->v)
+ , [m3] "i" (m3)
+ , [m4] "i" (m4)
+ , [m5] "i" (m5));
+}
+
+static inline void vclfp(S390Vector *v1, S390Vector *v2,
+ const uint8_t m3, const uint8_t m4, const uint8_t m5)
+{
+ asm volatile("vclfp %[v1], %[v2], %[m3], %[m4], %[m5]\n"
+ : [v1] "=v" (v1->v)
+ : [v2] "v" (v2->v)
+ , [m3] "i" (m3)
+ , [m4] "i" (m4)
+ , [m5] "i" (m5));
+}
+
+int main(int argc, char *argv[])
+{
+ S390Vector vd;
+ S390Vector vs_i32 = { .w[0] = 1, .w[1] = 64, .w[2] = 1024, .w[3] = -10 };
+ S390Vector vs_u32 = { .w[0] = 2, .w[1] = 32, .w[2] = 4096, .w[3] = 8888 };
+ S390Vector vs_f32 = { .f[0] = 3.987, .f[1] = 5.123,
+ .f[2] = 4.499, .f[3] = 0.512 };
+
+ vd.d[0] = vd.d[1] = 0;
+ vcfps(&vd, &vs_i32, 2, M4_def, 0);
+ if (1 != vd.f[0] || 1024 != vd.f[2] || 64 != vd.f[1] || -10 != vd.f[3]) {
+ return 1;
+ }
+
+ vd.d[0] = vd.d[1] = 0;
+ vcfpl(&vd, &vs_u32, 2, M4_def, 0);
+ if (2 != vd.f[0] || 4096 != vd.f[2] || 32 != vd.f[1] || 8888 != vd.f[3]) {
+ return 1;
+ }
+
+ vd.d[0] = vd.d[1] = 0;
+ vcsfp(&vd, &vs_f32, 2, M4_def, 0);
+ if (4 != vd.w[0] || 4 != vd.w[2] || 5 != vd.w[1] || 1 != vd.w[3]) {
+ return 1;
+ }
+
+ vd.d[0] = vd.d[1] = 0;
+ vclfp(&vd, &vs_f32, 2, M4_def, 0);
+ if (4 != vd.w[0] || 4 != vd.w[2] || 5 != vd.w[1] || 1 != vd.w[3]) {
+ return 1;
+ }
+
+ return 0;
+}
diff --git a/tests/tcg/s390x/vxeh2_vlstr.c b/tests/tcg/s390x/vxeh2_vlstr.c
new file mode 100644
index 0000000000..5677bf7c29
--- /dev/null
+++ b/tests/tcg/s390x/vxeh2_vlstr.c
@@ -0,0 +1,139 @@
+/*
+ * vxeh2_vlstr: vector-enhancements facility 2 vector load/store reversed *
+ */
+#include <stdint.h>
+#include "vx.h"
+
+#define vtst(v1, v2) \
+ if (v1.d[0] != v2.d[0] || v1.d[1] != v2.d[1]) { \
+ return 1; \
+ }
+
+static inline void vler(S390Vector *v1, const void *va, uint8_t m3)
+{
+ asm volatile("vler %[v1], 0(%[va]), %[m3]\n"
+ : [v1] "+v" (v1->v)
+ : [va] "d" (va)
+ , [m3] "i" (m3)
+ : "memory");
+}
+
+static inline void vster(S390Vector *v1, const void *va, uint8_t m3)
+{
+ asm volatile("vster %[v1], 0(%[va]), %[m3]\n"
+ : [va] "+d" (va)
+ : [v1] "v" (v1->v)
+ , [m3] "i" (m3)
+ : "memory");
+}
+
+static inline void vlbr(S390Vector *v1, void *va, const uint8_t m3)
+{
+ asm volatile("vlbr %[v1], 0(%[va]), %[m3]\n"
+ : [v1] "+v" (v1->v)
+ : [va] "d" (va)
+ , [m3] "i" (m3)
+ : "memory");
+}
+
+static inline void vstbr(S390Vector *v1, void *va, const uint8_t m3)
+{
+ asm volatile("vstbr %[v1], 0(%[va]), %[m3]\n"
+ : [va] "+d" (va)
+ : [v1] "v" (v1->v)
+ , [m3] "i" (m3)
+ : "memory");
+}
+
+
+static inline void vlebrh(S390Vector *v1, void *va, const uint8_t m3)
+{
+ asm volatile("vlebrh %[v1], 0(%[va]), %[m3]\n"
+ : [v1] "+v" (v1->v)
+ : [va] "d" (va)
+ , [m3] "i" (m3)
+ : "memory");
+}
+
+static inline void vstebrh(S390Vector *v1, void *va, const uint8_t m3)
+{
+ asm volatile("vstebrh %[v1], 0(%[va]), %[m3]\n"
+ : [va] "+d" (va)
+ : [v1] "v" (v1->v)
+ , [m3] "i" (m3)
+ : "memory");
+}
+
+static inline void vllebrz(S390Vector *v1, void *va, const uint8_t m3)
+{
+ asm volatile("vllebrz %[v1], 0(%[va]), %[m3]\n"
+ : [v1] "+v" (v1->v)
+ : [va] "d" (va)
+ , [m3] "i" (m3)
+ : "memory");
+}
+
+static inline void vlbrrep(S390Vector *v1, void *va, const uint8_t m3)
+{
+ asm volatile("vlbrrep %[v1], 0(%[va]), %[m3]\n"
+ : [v1] "+v" (v1->v)
+ : [va] "d" (va)
+ , [m3] "i" (m3)
+ : "memory");
+}
+
+int main(int argc, char *argv[])
+{
+ S390Vector vd = { .d[0] = 0, .d[1] = 0 };
+ S390Vector vs = { .d[0] = 0x8FEEDDCCBBAA9988ull,
+ .d[1] = 0x7766554433221107ull };
+
+ const S390Vector vt_v_er16 = {
+ .h[0] = 0x1107, .h[1] = 0x3322, .h[2] = 0x5544, .h[3] = 0x7766,
+ .h[4] = 0x9988, .h[5] = 0xBBAA, .h[6] = 0xDDCC, .h[7] = 0x8FEE };
+
+ const S390Vector vt_v_br16 = {
+ .h[0] = 0xEE8F, .h[1] = 0xCCDD, .h[2] = 0xAABB, .h[3] = 0x8899,
+ .h[4] = 0x6677, .h[5] = 0x4455, .h[6] = 0x2233, .h[7] = 0x0711 };
+
+ int ix;
+ uint64_t ss64 = 0xFEEDFACE0BADBEEFull, sd64 = 0;
+
+ vler(&vd, &vs, ES16);
+ vtst(vd, vt_v_er16);
+
+ vster(&vs, &vd, ES16);
+ vtst(vd, vt_v_er16);
+
+ vlbr(&vd, &vs, ES16);
+ vtst(vd, vt_v_br16);
+
+ vstbr(&vs, &vd, ES16);
+ vtst(vd, vt_v_br16);
+
+ vlebrh(&vd, &ss64, 5);
+ if (0xEDFE != vd.h[5]) {
+ return 1;
+ }
+
+ vstebrh(&vs, (uint8_t *)&sd64 + 4, 7);
+ if (0x0000000007110000ull != sd64) {
+ return 1;
+ }
+
+ vllebrz(&vd, (uint8_t *)&ss64 + 3, 2);
+ for (ix = 0; ix < 4; ix++) {
+ if (vd.w[ix] != (ix != 1 ? 0 : 0xBEAD0BCE)) {
+ return 1;
+ }
+ }
+
+ vlbrrep(&vd, (uint8_t *)&ss64 + 4, 1);
+ for (ix = 0; ix < 8; ix++) {
+ if (0xAD0B != vd.h[ix]) {
+ return 1;
+ }
+ }
+
+ return 0;
+}
diff --git a/tests/tcg/s390x/vxeh2_vs.c b/tests/tcg/s390x/vxeh2_vs.c
new file mode 100644
index 0000000000..b7ef419d79
--- /dev/null
+++ b/tests/tcg/s390x/vxeh2_vs.c
@@ -0,0 +1,93 @@
+/*
+ * vxeh2_vs: vector-enhancements facility 2 vector shift
+ */
+#include <stdint.h>
+#include "vx.h"
+
+#define vtst(v1, v2) \
+ if (v1.d[0] != v2.d[0] || v1.d[1] != v2.d[1]) { \
+ return 1; \
+ }
+
+static inline void vsl(S390Vector *v1, S390Vector *v2, S390Vector *v3)
+{
+ asm volatile("vsl %[v1], %[v2], %[v3]\n"
+ : [v1] "=v" (v1->v)
+ : [v2] "v" (v2->v)
+ , [v3] "v" (v3->v));
+}
+
+static inline void vsra(S390Vector *v1, S390Vector *v2, S390Vector *v3)
+{
+ asm volatile("vsra %[v1], %[v2], %[v3]\n"
+ : [v1] "=v" (v1->v)
+ : [v2] "v" (v2->v)
+ , [v3] "v" (v3->v));
+}
+
+static inline void vsrl(S390Vector *v1, S390Vector *v2, S390Vector *v3)
+{
+ asm volatile("vsrl %[v1], %[v2], %[v3]\n"
+ : [v1] "=v" (v1->v)
+ : [v2] "v" (v2->v)
+ , [v3] "v" (v3->v));
+}
+
+static inline void vsld(S390Vector *v1, S390Vector *v2,
+ S390Vector *v3, const uint8_t I)
+{
+ asm volatile("vsld %[v1], %[v2], %[v3], %[I]\n"
+ : [v1] "=v" (v1->v)
+ : [v2] "v" (v2->v)
+ , [v3] "v" (v3->v)
+ , [I] "i" (I & 7));
+}
+
+static inline void vsrd(S390Vector *v1, S390Vector *v2,
+ S390Vector *v3, const uint8_t I)
+{
+ asm volatile("vsrd %[v1], %[v2], %[v3], %[I]\n"
+ : [v1] "=v" (v1->v)
+ : [v2] "v" (v2->v)
+ , [v3] "v" (v3->v)
+ , [I] "i" (I & 7));
+}
+
+int main(int argc, char *argv[])
+{
+ const S390Vector vt_vsl = { .d[0] = 0x7FEDBB32D5AA311Dull,
+ .d[1] = 0xBB65AA10912220C0ull };
+ const S390Vector vt_vsra = { .d[0] = 0xF1FE6E7399AA5466ull,
+ .d[1] = 0x0E762A5188221044ull };
+ const S390Vector vt_vsrl = { .d[0] = 0x11FE6E7399AA5466ull,
+ .d[1] = 0x0E762A5188221044ull };
+ const S390Vector vt_vsld = { .d[0] = 0x7F76EE65DD54CC43ull,
+ .d[1] = 0xBB32AA2199108838ull };
+ const S390Vector vt_vsrd = { .d[0] = 0x0E060802040E000Aull,
+ .d[1] = 0x0C060802040E000Aull };
+ S390Vector vs = { .d[0] = 0x8FEEDDCCBBAA9988ull,
+ .d[1] = 0x7766554433221107ull };
+ S390Vector vd = { .d[0] = 0, .d[1] = 0 };
+ S390Vector vsi = { .d[0] = 0, .d[1] = 0 };
+
+ for (int ix = 0; ix < 16; ix++) {
+ vsi.b[ix] = (1 + (5 ^ ~ix)) & 7;
+ }
+
+ vsl(&vd, &vs, &vsi);
+ vtst(vd, vt_vsl);
+
+ vsra(&vd, &vs, &vsi);
+ vtst(vd, vt_vsra);
+
+ vsrl(&vd, &vs, &vsi);
+ vtst(vd, vt_vsrl);
+
+ vsld(&vd, &vs, &vsi, 3);
+ vtst(vd, vt_vsld);
+
+ vsrd(&vd, &vs, &vsi, 15);
+ vtst(vd, vt_vsrd);
+
+ return 0;
+}
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v5 11/11] target/s390x: Fix writeback to v1 in helper_vstl
2022-03-23 13:57 [PATCH v5 00/11] s390x/tcg: Implement Vector-Enhancements Facility 2 David Miller
` (9 preceding siblings ...)
2022-03-23 13:57 ` [PATCH v5 10/11] tests/tcg/s390x: Tests for Vector Enhancements Facility 2 David Miller
@ 2022-03-23 13:57 ` David Miller
2022-04-25 7:43 ` [PATCH v5 00/11] s390x/tcg: Implement Vector-Enhancements Facility 2 Christian Borntraeger
11 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2022-03-23 13:57 UTC (permalink / raw)
To: qemu-s390x, qemu-devel
Cc: thuth, david, cohuck, richard.henderson, farman, David Miller,
pasic, borntraeger
From: Richard Henderson <richard.henderson@linaro.org>
Fixes: 0e0a5b49ad58 ("s390x/tcg: Implement VECTOR STORE WITH LENGTH")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: David Miller <dmiller423@gmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
---
target/s390x/tcg/vec_helper.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/target/s390x/tcg/vec_helper.c b/target/s390x/tcg/vec_helper.c
index ededf13cf0..48d86722b2 100644
--- a/target/s390x/tcg/vec_helper.c
+++ b/target/s390x/tcg/vec_helper.c
@@ -200,7 +200,6 @@ void HELPER(vstl)(CPUS390XState *env, const void *v1, uint64_t addr,
addr = wrap_address(env, addr + 8);
cpu_stq_data_ra(env, addr, s390_vec_read_element64(v1, 1), GETPC());
} else {
- S390Vector tmp = {};
int i;
for (i = 0; i < bytes; i++) {
@@ -209,6 +208,5 @@ void HELPER(vstl)(CPUS390XState *env, const void *v1, uint64_t addr,
cpu_stb_data_ra(env, addr, byte, GETPC());
addr = wrap_address(env, addr + 1);
}
- *(S390Vector *)v1 = tmp;
}
}
--
2.34.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v5 00/11] s390x/tcg: Implement Vector-Enhancements Facility 2
2022-03-23 13:57 [PATCH v5 00/11] s390x/tcg: Implement Vector-Enhancements Facility 2 David Miller
` (10 preceding siblings ...)
2022-03-23 13:57 ` [PATCH v5 11/11] target/s390x: Fix writeback to v1 in helper_vstl David Miller
@ 2022-04-25 7:43 ` Christian Borntraeger
2022-04-25 7:51 ` David Hildenbrand
11 siblings, 1 reply; 19+ messages in thread
From: Christian Borntraeger @ 2022-04-25 7:43 UTC (permalink / raw)
To: David Miller, qemu-s390x, qemu-devel
Cc: thuth, david, cohuck, richard.henderson, farman, pasic
Am 23.03.22 um 14:57 schrieb David Miller:
> Implement Vector-Enhancements Facility 2 for s390x
>
> resolves: https://gitlab.com/qemu-project/qemu/-/issues/738
>
> implements:
> VECTOR LOAD ELEMENTS REVERSED (VLER)
> VECTOR LOAD BYTE REVERSED ELEMENTS (VLBR)
> VECTOR LOAD BYTE REVERSED ELEMENT (VLEBRH, VLEBRF, VLEBRG)
> VECTOR LOAD BYTE REVERSED ELEMENT AND ZERO (VLLEBRZ)
> VECTOR LOAD BYTE REVERSED ELEMENT AND REPLICATE (VLBRREP)
> VECTOR STORE ELEMENTS REVERSED (VSTER)
> VECTOR STORE BYTE REVERSED ELEMENTS (VSTBR)
> VECTOR STORE BYTE REVERSED ELEMENTS (VSTEBRH, VSTEBRF, VSTEBRG)
> VECTOR SHIFT LEFT DOUBLE BY BIT (VSLD)
> VECTOR SHIFT RIGHT DOUBLE BY BIT (VSRD)
> VECTOR STRING SEARCH (VSTRS)
>
> modifies:
> VECTOR FP CONVERT FROM FIXED (VCFPS)
> VECTOR FP CONVERT FROM LOGICAL (VCFPL)
> VECTOR FP CONVERT TO FIXED (VCSFP)
> VECTOR FP CONVERT TO LOGICAL (VCLFP)
> VECTOR SHIFT LEFT (VSL)
> VECTOR SHIFT RIGHT ARITHMETIC (VSRA)
> VECTOR SHIFT RIGHT LOGICAL (VSRL)
>
>
> David Miller (9):
> tcg: Implement tcg_gen_{h,w}swap_{i32,i64}
> target/s390x: vxeh2: vector convert short/32b
> target/s390x: vxeh2: vector string search
> target/s390x: vxeh2: Update for changes to vector shifts
> target/s390x: vxeh2: vector shift double by bit
> target/s390x: vxeh2: vector {load, store} elements reversed
> target/s390x: vxeh2: vector {load, store} byte reversed elements
> target/s390x: vxeh2: vector {load, store} byte reversed element
> target/s390x: add S390_FEAT_VECTOR_ENH2 to qemu CPU model
> tests/tcg/s390x: Tests for Vector Enhancements Facility 2
> target/s390x: Fix writeback to v1 in helper_vstl
>
> Richard Henderson (2):
> tcg: Implement tcg_gen_{h,w}swap_{i32,i64}
> target/s390x: Fix writeback to v1 in helper_vstl
I guess we can now re-do this series against 7.1-devel (qemu/master) which does
have the machine compat changes. Apart from that this should be ready now?
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v5 00/11] s390x/tcg: Implement Vector-Enhancements Facility 2
2022-04-25 7:43 ` [PATCH v5 00/11] s390x/tcg: Implement Vector-Enhancements Facility 2 Christian Borntraeger
@ 2022-04-25 7:51 ` David Hildenbrand
2022-04-25 9:06 ` Thomas Huth
2022-04-28 0:26 ` David Miller
0 siblings, 2 replies; 19+ messages in thread
From: David Hildenbrand @ 2022-04-25 7:51 UTC (permalink / raw)
To: Christian Borntraeger, David Miller, qemu-s390x, qemu-devel
Cc: pasic, thuth, cohuck, richard.henderson, farman
On 25.04.22 09:43, Christian Borntraeger wrote:
> Am 23.03.22 um 14:57 schrieb David Miller:
>> Implement Vector-Enhancements Facility 2 for s390x
>>
>> resolves: https://gitlab.com/qemu-project/qemu/-/issues/738
>>
>> implements:
>> VECTOR LOAD ELEMENTS REVERSED (VLER)
>> VECTOR LOAD BYTE REVERSED ELEMENTS (VLBR)
>> VECTOR LOAD BYTE REVERSED ELEMENT (VLEBRH, VLEBRF, VLEBRG)
>> VECTOR LOAD BYTE REVERSED ELEMENT AND ZERO (VLLEBRZ)
>> VECTOR LOAD BYTE REVERSED ELEMENT AND REPLICATE (VLBRREP)
>> VECTOR STORE ELEMENTS REVERSED (VSTER)
>> VECTOR STORE BYTE REVERSED ELEMENTS (VSTBR)
>> VECTOR STORE BYTE REVERSED ELEMENTS (VSTEBRH, VSTEBRF, VSTEBRG)
>> VECTOR SHIFT LEFT DOUBLE BY BIT (VSLD)
>> VECTOR SHIFT RIGHT DOUBLE BY BIT (VSRD)
>> VECTOR STRING SEARCH (VSTRS)
>>
>> modifies:
>> VECTOR FP CONVERT FROM FIXED (VCFPS)
>> VECTOR FP CONVERT FROM LOGICAL (VCFPL)
>> VECTOR FP CONVERT TO FIXED (VCSFP)
>> VECTOR FP CONVERT TO LOGICAL (VCLFP)
>> VECTOR SHIFT LEFT (VSL)
>> VECTOR SHIFT RIGHT ARITHMETIC (VSRA)
>> VECTOR SHIFT RIGHT LOGICAL (VSRL)
>>
>>
>> David Miller (9):
>> tcg: Implement tcg_gen_{h,w}swap_{i32,i64}
>> target/s390x: vxeh2: vector convert short/32b
>> target/s390x: vxeh2: vector string search
>> target/s390x: vxeh2: Update for changes to vector shifts
>> target/s390x: vxeh2: vector shift double by bit
>> target/s390x: vxeh2: vector {load, store} elements reversed
>> target/s390x: vxeh2: vector {load, store} byte reversed elements
>> target/s390x: vxeh2: vector {load, store} byte reversed element
>> target/s390x: add S390_FEAT_VECTOR_ENH2 to qemu CPU model
>> tests/tcg/s390x: Tests for Vector Enhancements Facility 2
>> target/s390x: Fix writeback to v1 in helper_vstl
>>
>> Richard Henderson (2):
>> tcg: Implement tcg_gen_{h,w}swap_{i32,i64}
>> target/s390x: Fix writeback to v1 in helper_vstl
>
>
> I guess we can now re-do this series against 7.1-devel (qemu/master) which does
> have the machine compat changes. Apart from that this should be ready now?
>
Yes, I think so. I can respin with the proper compat changes if requested.
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v5 00/11] s390x/tcg: Implement Vector-Enhancements Facility 2
2022-04-25 7:51 ` David Hildenbrand
@ 2022-04-25 9:06 ` Thomas Huth
2022-04-28 0:26 ` David Miller
1 sibling, 0 replies; 19+ messages in thread
From: Thomas Huth @ 2022-04-25 9:06 UTC (permalink / raw)
To: David Hildenbrand, Christian Borntraeger, David Miller,
qemu-s390x, qemu-devel
Cc: pasic, farman, cohuck, richard.henderson
On 25/04/2022 09.51, David Hildenbrand wrote:
> On 25.04.22 09:43, Christian Borntraeger wrote:
>> Am 23.03.22 um 14:57 schrieb David Miller:
>>> Implement Vector-Enhancements Facility 2 for s390x
>>>
>>> resolves: https://gitlab.com/qemu-project/qemu/-/issues/738
>>>
>>> implements:
>>> VECTOR LOAD ELEMENTS REVERSED (VLER)
>>> VECTOR LOAD BYTE REVERSED ELEMENTS (VLBR)
>>> VECTOR LOAD BYTE REVERSED ELEMENT (VLEBRH, VLEBRF, VLEBRG)
>>> VECTOR LOAD BYTE REVERSED ELEMENT AND ZERO (VLLEBRZ)
>>> VECTOR LOAD BYTE REVERSED ELEMENT AND REPLICATE (VLBRREP)
>>> VECTOR STORE ELEMENTS REVERSED (VSTER)
>>> VECTOR STORE BYTE REVERSED ELEMENTS (VSTBR)
>>> VECTOR STORE BYTE REVERSED ELEMENTS (VSTEBRH, VSTEBRF, VSTEBRG)
>>> VECTOR SHIFT LEFT DOUBLE BY BIT (VSLD)
>>> VECTOR SHIFT RIGHT DOUBLE BY BIT (VSRD)
>>> VECTOR STRING SEARCH (VSTRS)
>>>
>>> modifies:
>>> VECTOR FP CONVERT FROM FIXED (VCFPS)
>>> VECTOR FP CONVERT FROM LOGICAL (VCFPL)
>>> VECTOR FP CONVERT TO FIXED (VCSFP)
>>> VECTOR FP CONVERT TO LOGICAL (VCLFP)
>>> VECTOR SHIFT LEFT (VSL)
>>> VECTOR SHIFT RIGHT ARITHMETIC (VSRA)
>>> VECTOR SHIFT RIGHT LOGICAL (VSRL)
>>>
>>>
>>> David Miller (9):
>>> tcg: Implement tcg_gen_{h,w}swap_{i32,i64}
>>> target/s390x: vxeh2: vector convert short/32b
>>> target/s390x: vxeh2: vector string search
>>> target/s390x: vxeh2: Update for changes to vector shifts
>>> target/s390x: vxeh2: vector shift double by bit
>>> target/s390x: vxeh2: vector {load, store} elements reversed
>>> target/s390x: vxeh2: vector {load, store} byte reversed elements
>>> target/s390x: vxeh2: vector {load, store} byte reversed element
>>> target/s390x: add S390_FEAT_VECTOR_ENH2 to qemu CPU model
>>> tests/tcg/s390x: Tests for Vector Enhancements Facility 2
>>> target/s390x: Fix writeback to v1 in helper_vstl
>>>
>>> Richard Henderson (2):
>>> tcg: Implement tcg_gen_{h,w}swap_{i32,i64}
>>> target/s390x: Fix writeback to v1 in helper_vstl
>>
>>
>> I guess we can now re-do this series against 7.1-devel (qemu/master) which does
>> have the machine compat changes. Apart from that this should be ready now?
>>
>
> Yes, I think so. I can respin with the proper compat changes if requested.
FWIW, I played with the series a little bit, and did not spot any obvious
problems anymore, so feel free to add:
Tested-by: Thomas Huth <thuth@redhat.com>
If you send a v6 with the compat stuff fixed, I can queue it for my next
s390x pull request.
Thomas
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v5 00/11] s390x/tcg: Implement Vector-Enhancements Facility 2
2022-04-25 7:51 ` David Hildenbrand
2022-04-25 9:06 ` Thomas Huth
@ 2022-04-28 0:26 ` David Miller
1 sibling, 0 replies; 19+ messages in thread
From: David Miller @ 2022-04-28 0:26 UTC (permalink / raw)
To: David Hildenbrand
Cc: Thomas Huth, farman, cohuck, Richard Henderson, qemu-devel,
pasic, qemu-s390x, Christian Borntraeger
I'm playing catch up a bit here, as I was out sick for a few days.
It would be very much appreciated if you could do so, as I'm not
familiar with what is required.
Thanks
- David Miller
On Mon, Apr 25, 2022 at 3:51 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 25.04.22 09:43, Christian Borntraeger wrote:
> > Am 23.03.22 um 14:57 schrieb David Miller:
> >> Implement Vector-Enhancements Facility 2 for s390x
> >>
> >> resolves: https://gitlab.com/qemu-project/qemu/-/issues/738
> >>
> >> implements:
> >> VECTOR LOAD ELEMENTS REVERSED (VLER)
> >> VECTOR LOAD BYTE REVERSED ELEMENTS (VLBR)
> >> VECTOR LOAD BYTE REVERSED ELEMENT (VLEBRH, VLEBRF, VLEBRG)
> >> VECTOR LOAD BYTE REVERSED ELEMENT AND ZERO (VLLEBRZ)
> >> VECTOR LOAD BYTE REVERSED ELEMENT AND REPLICATE (VLBRREP)
> >> VECTOR STORE ELEMENTS REVERSED (VSTER)
> >> VECTOR STORE BYTE REVERSED ELEMENTS (VSTBR)
> >> VECTOR STORE BYTE REVERSED ELEMENTS (VSTEBRH, VSTEBRF, VSTEBRG)
> >> VECTOR SHIFT LEFT DOUBLE BY BIT (VSLD)
> >> VECTOR SHIFT RIGHT DOUBLE BY BIT (VSRD)
> >> VECTOR STRING SEARCH (VSTRS)
> >>
> >> modifies:
> >> VECTOR FP CONVERT FROM FIXED (VCFPS)
> >> VECTOR FP CONVERT FROM LOGICAL (VCFPL)
> >> VECTOR FP CONVERT TO FIXED (VCSFP)
> >> VECTOR FP CONVERT TO LOGICAL (VCLFP)
> >> VECTOR SHIFT LEFT (VSL)
> >> VECTOR SHIFT RIGHT ARITHMETIC (VSRA)
> >> VECTOR SHIFT RIGHT LOGICAL (VSRL)
> >>
> >>
> >> David Miller (9):
> >> tcg: Implement tcg_gen_{h,w}swap_{i32,i64}
> >> target/s390x: vxeh2: vector convert short/32b
> >> target/s390x: vxeh2: vector string search
> >> target/s390x: vxeh2: Update for changes to vector shifts
> >> target/s390x: vxeh2: vector shift double by bit
> >> target/s390x: vxeh2: vector {load, store} elements reversed
> >> target/s390x: vxeh2: vector {load, store} byte reversed elements
> >> target/s390x: vxeh2: vector {load, store} byte reversed element
> >> target/s390x: add S390_FEAT_VECTOR_ENH2 to qemu CPU model
> >> tests/tcg/s390x: Tests for Vector Enhancements Facility 2
> >> target/s390x: Fix writeback to v1 in helper_vstl
> >>
> >> Richard Henderson (2):
> >> tcg: Implement tcg_gen_{h,w}swap_{i32,i64}
> >> target/s390x: Fix writeback to v1 in helper_vstl
> >
> >
> > I guess we can now re-do this series against 7.1-devel (qemu/master) which does
> > have the machine compat changes. Apart from that this should be ready now?
> >
>
> Yes, I think so. I can respin with the proper compat changes if requested.
>
> --
> Thanks,
>
> David / dhildenb
>
^ permalink raw reply [flat|nested] 19+ messages in thread