* [PATCH 1/9] target/s390x: Use a single return for helper_divs32/u32
2022-10-21 7:29 [PATCH 0/9] target/s390x: Use Int128 for float128 and retxl Richard Henderson
@ 2022-10-21 7:29 ` Richard Henderson
2022-10-21 11:25 ` Ilya Leoshkevich
` (3 more replies)
2022-10-21 7:29 ` [PATCH 2/9] target/s390x: Use a single return for helper_divs64/u64 Richard Henderson
` (7 subsequent siblings)
8 siblings, 4 replies; 35+ messages in thread
From: Richard Henderson @ 2022-10-21 7:29 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-s390x
Pack the quotient and remainder into a single uint64_t.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/s390x/helper.h | 2 +-
target/s390x/tcg/int_helper.c | 26 +++++++++++++-------------
target/s390x/tcg/translate.c | 10 ++++++----
3 files changed, 20 insertions(+), 18 deletions(-)
diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index bf33d86f74..97a9668eef 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -10,7 +10,7 @@ DEF_HELPER_FLAGS_4(clc, TCG_CALL_NO_WG, i32, env, i32, i64, i64)
DEF_HELPER_3(mvcl, i32, env, i32, i32)
DEF_HELPER_3(clcl, i32, env, i32, i32)
DEF_HELPER_FLAGS_4(clm, TCG_CALL_NO_WG, i32, env, i32, i32, i64)
-DEF_HELPER_FLAGS_3(divs32, TCG_CALL_NO_WG, s64, env, s64, s64)
+DEF_HELPER_FLAGS_3(divs32, TCG_CALL_NO_WG, i64, env, s64, s64)
DEF_HELPER_FLAGS_3(divu32, TCG_CALL_NO_WG, i64, env, i64, i64)
DEF_HELPER_FLAGS_3(divs64, TCG_CALL_NO_WG, s64, env, s64, s64)
DEF_HELPER_FLAGS_4(divu64, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
diff --git a/target/s390x/tcg/int_helper.c b/target/s390x/tcg/int_helper.c
index 954542388a..130b8bd4d2 100644
--- a/target/s390x/tcg/int_helper.c
+++ b/target/s390x/tcg/int_helper.c
@@ -34,45 +34,45 @@
#endif
/* 64/32 -> 32 signed division */
-int64_t HELPER(divs32)(CPUS390XState *env, int64_t a, int64_t b64)
+uint64_t HELPER(divs32)(CPUS390XState *env, int64_t a, int64_t b64)
{
- int32_t ret, b = b64;
- int64_t q;
+ int32_t b = b64;
+ int64_t q, r;
if (b == 0) {
tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
}
- ret = q = a / b;
- env->retxl = a % b;
+ q = a / b;
+ r = a % b;
/* Catch non-representable quotient. */
- if (ret != q) {
+ if (q != (int32_t)q) {
tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
}
- return ret;
+ return deposit64(r, 32, 32, q);
}
/* 64/32 -> 32 unsigned division */
uint64_t HELPER(divu32)(CPUS390XState *env, uint64_t a, uint64_t b64)
{
- uint32_t ret, b = b64;
- uint64_t q;
+ uint32_t b = b64;
+ uint64_t q, r;
if (b == 0) {
tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
}
- ret = q = a / b;
- env->retxl = a % b;
+ q = a / b;
+ r = a % b;
/* Catch non-representable quotient. */
- if (ret != q) {
+ if (q != (uint32_t)q) {
tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
}
- return ret;
+ return deposit64(r, 32, 32, q);
}
/* 64/64 -> 64 signed division */
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index 1d2dddab1c..525317c9df 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -2395,15 +2395,17 @@ static DisasJumpType op_diag(DisasContext *s, DisasOps *o)
static DisasJumpType op_divs32(DisasContext *s, DisasOps *o)
{
- gen_helper_divs32(o->out2, cpu_env, o->in1, o->in2);
- return_low128(o->out);
+ gen_helper_divs32(o->out, cpu_env, o->in1, o->in2);
+ tcg_gen_ext32u_i64(o->out2, o->out);
+ tcg_gen_shri_i64(o->out, o->out, 32);
return DISAS_NEXT;
}
static DisasJumpType op_divu32(DisasContext *s, DisasOps *o)
{
- gen_helper_divu32(o->out2, cpu_env, o->in1, o->in2);
- return_low128(o->out);
+ gen_helper_divu32(o->out, cpu_env, o->in1, o->in2);
+ tcg_gen_ext32u_i64(o->out2, o->out);
+ tcg_gen_shri_i64(o->out, o->out, 32);
return DISAS_NEXT;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH 1/9] target/s390x: Use a single return for helper_divs32/u32
2022-10-21 7:29 ` [PATCH 1/9] target/s390x: Use a single return for helper_divs32/u32 Richard Henderson
@ 2022-10-21 11:25 ` Ilya Leoshkevich
2022-10-25 9:59 ` Philippe Mathieu-Daudé
` (2 subsequent siblings)
3 siblings, 0 replies; 35+ messages in thread
From: Ilya Leoshkevich @ 2022-10-21 11:25 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: qemu-s390x
On Fri, 2022-10-21 at 17:29 +1000, Richard Henderson wrote:
> Pack the quotient and remainder into a single uint64_t.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/s390x/helper.h | 2 +-
> target/s390x/tcg/int_helper.c | 26 +++++++++++++-------------
> target/s390x/tcg/translate.c | 10 ++++++----
> 3 files changed, 20 insertions(+), 18 deletions(-)
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 1/9] target/s390x: Use a single return for helper_divs32/u32
2022-10-21 7:29 ` [PATCH 1/9] target/s390x: Use a single return for helper_divs32/u32 Richard Henderson
2022-10-21 11:25 ` Ilya Leoshkevich
@ 2022-10-25 9:59 ` Philippe Mathieu-Daudé
2022-11-01 11:09 ` Ilya Leoshkevich
2022-11-01 11:13 ` [PATCH] tests/tcg/s390x: Add div.c Ilya Leoshkevich
3 siblings, 0 replies; 35+ messages in thread
From: Philippe Mathieu-Daudé @ 2022-10-25 9:59 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: qemu-s390x
On 21/10/22 09:29, Richard Henderson wrote:
> Pack the quotient and remainder into a single uint64_t.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/s390x/helper.h | 2 +-
> target/s390x/tcg/int_helper.c | 26 +++++++++++++-------------
> target/s390x/tcg/translate.c | 10 ++++++----
> 3 files changed, 20 insertions(+), 18 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 1/9] target/s390x: Use a single return for helper_divs32/u32
2022-10-21 7:29 ` [PATCH 1/9] target/s390x: Use a single return for helper_divs32/u32 Richard Henderson
2022-10-21 11:25 ` Ilya Leoshkevich
2022-10-25 9:59 ` Philippe Mathieu-Daudé
@ 2022-11-01 11:09 ` Ilya Leoshkevich
2022-11-01 11:13 ` [PATCH] tests/tcg/s390x: Add div.c Ilya Leoshkevich
3 siblings, 0 replies; 35+ messages in thread
From: Ilya Leoshkevich @ 2022-11-01 11:09 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: qemu-s390x
On Fri, Oct 21, 2022 at 05:29:58PM +1000, Richard Henderson wrote:
> Pack the quotient and remainder into a single uint64_t.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/s390x/helper.h | 2 +-
> target/s390x/tcg/int_helper.c | 26 +++++++++++++-------------
> target/s390x/tcg/translate.c | 10 ++++++----
> 3 files changed, 20 insertions(+), 18 deletions(-)
>
> diff --git a/target/s390x/helper.h b/target/s390x/helper.h
> index bf33d86f74..97a9668eef 100644
> --- a/target/s390x/helper.h
> +++ b/target/s390x/helper.h
> @@ -10,7 +10,7 @@ DEF_HELPER_FLAGS_4(clc, TCG_CALL_NO_WG, i32, env, i32, i64, i64)
> DEF_HELPER_3(mvcl, i32, env, i32, i32)
> DEF_HELPER_3(clcl, i32, env, i32, i32)
> DEF_HELPER_FLAGS_4(clm, TCG_CALL_NO_WG, i32, env, i32, i32, i64)
> -DEF_HELPER_FLAGS_3(divs32, TCG_CALL_NO_WG, s64, env, s64, s64)
> +DEF_HELPER_FLAGS_3(divs32, TCG_CALL_NO_WG, i64, env, s64, s64)
> DEF_HELPER_FLAGS_3(divu32, TCG_CALL_NO_WG, i64, env, i64, i64)
> DEF_HELPER_FLAGS_3(divs64, TCG_CALL_NO_WG, s64, env, s64, s64)
> DEF_HELPER_FLAGS_4(divu64, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
> diff --git a/target/s390x/tcg/int_helper.c b/target/s390x/tcg/int_helper.c
> index 954542388a..130b8bd4d2 100644
> --- a/target/s390x/tcg/int_helper.c
> +++ b/target/s390x/tcg/int_helper.c
> @@ -34,45 +34,45 @@
> #endif
>
> /* 64/32 -> 32 signed division */
> -int64_t HELPER(divs32)(CPUS390XState *env, int64_t a, int64_t b64)
> +uint64_t HELPER(divs32)(CPUS390XState *env, int64_t a, int64_t b64)
> {
> - int32_t ret, b = b64;
> - int64_t q;
> + int32_t b = b64;
> + int64_t q, r;
>
> if (b == 0) {
> tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
> }
>
> - ret = q = a / b;
> - env->retxl = a % b;
> + q = a / b;
> + r = a % b;
>
> /* Catch non-representable quotient. */
> - if (ret != q) {
> + if (q != (int32_t)q) {
> tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
> }
>
> - return ret;
> + return deposit64(r, 32, 32, q);
> }
>
> /* 64/32 -> 32 unsigned division */
> uint64_t HELPER(divu32)(CPUS390XState *env, uint64_t a, uint64_t b64)
> {
> - uint32_t ret, b = b64;
> - uint64_t q;
> + uint32_t b = b64;
> + uint64_t q, r;
>
> if (b == 0) {
> tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
> }
>
> - ret = q = a / b;
> - env->retxl = a % b;
> + q = a / b;
> + r = a % b;
>
> /* Catch non-representable quotient. */
> - if (ret != q) {
> + if (q != (uint32_t)q) {
> tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
> }
>
> - return ret;
> + return deposit64(r, 32, 32, q);
> }
>
> /* 64/64 -> 64 signed division */
> diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
> index 1d2dddab1c..525317c9df 100644
> --- a/target/s390x/tcg/translate.c
> +++ b/target/s390x/tcg/translate.c
> @@ -2395,15 +2395,17 @@ static DisasJumpType op_diag(DisasContext *s, DisasOps *o)
>
> static DisasJumpType op_divs32(DisasContext *s, DisasOps *o)
> {
> - gen_helper_divs32(o->out2, cpu_env, o->in1, o->in2);
> - return_low128(o->out);
> + gen_helper_divs32(o->out, cpu_env, o->in1, o->in2);
> + tcg_gen_ext32u_i64(o->out2, o->out);
> + tcg_gen_shri_i64(o->out, o->out, 32);
> return DISAS_NEXT;
> }
>
> static DisasJumpType op_divu32(DisasContext *s, DisasOps *o)
> {
> - gen_helper_divu32(o->out2, cpu_env, o->in1, o->in2);
> - return_low128(o->out);
> + gen_helper_divu32(o->out, cpu_env, o->in1, o->in2);
> + tcg_gen_ext32u_i64(o->out2, o->out);
> + tcg_gen_shri_i64(o->out, o->out, 32);
> return DISAS_NEXT;
> }
>
> --
> 2.34.1
Hi,
The wasmtime testsuite was failing and bisect pointed to this commit.
Apparently this needs a fixup:
--- a/target/s390x/tcg/int_helper.c
+++ b/target/s390x/tcg/int_helper.c
@@ -51,7 +51,7 @@ uint64_t HELPER(divs32)(CPUS390XState *env, int64_t a, int64_t b64)
tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
}
- return deposit64(r, 32, 32, q);
+ return deposit64(q, 32, 32, r);
}
/* 64/32 -> 32 unsigned division */
@@ -72,7 +72,7 @@ uint64_t HELPER(divu32)(CPUS390XState *env, uint64_t a, uint64_t b64)
tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
}
- return deposit64(r, 32, 32, q);
+ return deposit64(q, 32, 32, r);
}
/* 64/64 -> 64 signed division */
Currently we return out = r | (q << 32) here.
op_divu32 makes out2 = r, out = q from this.
Finally, r1_P32 stores r1 = q, r1+1 = r.
But DLR wants the opposite:
The remainder is placed in bit
positions 32-63 of general register R1, and the quo-
tient is placed in bit positions 32-63 of general regis-
ter R1 + 1.
Ditto DR.
Best regards,
Ilya
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH] tests/tcg/s390x: Add div.c
2022-10-21 7:29 ` [PATCH 1/9] target/s390x: Use a single return for helper_divs32/u32 Richard Henderson
` (2 preceding siblings ...)
2022-11-01 11:09 ` Ilya Leoshkevich
@ 2022-11-01 11:13 ` Ilya Leoshkevich
3 siblings, 0 replies; 35+ messages in thread
From: Ilya Leoshkevich @ 2022-11-01 11:13 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, Ilya Leoshkevich
Add a basic test to prevent regressions.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
tests/tcg/s390x/Makefile.target | 1 +
tests/tcg/s390x/div.c | 40 +++++++++++++++++++++++++++++++++
2 files changed, 41 insertions(+)
create mode 100644 tests/tcg/s390x/div.c
diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target
index c882db7a78a..f2ec7587387 100644
--- a/tests/tcg/s390x/Makefile.target
+++ b/tests/tcg/s390x/Makefile.target
@@ -26,6 +26,7 @@ TESTS+=branch-relative-long
TESTS+=noexec
TESTS+=clst
TESTS+=long-double
+TESTS+=div
Z13_TESTS=vistr
$(Z13_TESTS): CFLAGS+=-march=z13 -O2
diff --git a/tests/tcg/s390x/div.c b/tests/tcg/s390x/div.c
new file mode 100644
index 00000000000..58072956147
--- /dev/null
+++ b/tests/tcg/s390x/div.c
@@ -0,0 +1,40 @@
+#include <assert.h>
+#include <stdint.h>
+
+static void test_dr(void)
+{
+ register int32_t r0 asm("r0") = -1;
+ register int32_t r1 asm("r1") = -4241;
+ int32_t b = 101, q, r;
+
+ asm("dr %[r0],%[b]"
+ : [r0] "+r" (r0), [r1] "+r" (r1)
+ : [b] "r" (b)
+ : "cc");
+ q = r1;
+ r = r0;
+ assert(q == -41);
+ assert(r == -100);
+}
+
+static void test_dlr(void)
+{
+ register uint32_t r0 asm("r0") = 0;
+ register uint32_t r1 asm("r1") = 4243;
+ uint32_t b = 101, q, r;
+
+ asm("dlr %[r0],%[b]"
+ : [r0] "+r" (r0), [r1] "+r" (r1)
+ : [b] "r" (b)
+ : "cc");
+ q = r1;
+ r = r0;
+ assert(q == 42);
+ assert(r == 1);
+}
+
+int main(void)
+{
+ test_dr();
+ test_dlr();
+}
--
2.37.2
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH 2/9] target/s390x: Use a single return for helper_divs64/u64
2022-10-21 7:29 [PATCH 0/9] target/s390x: Use Int128 for float128 and retxl Richard Henderson
2022-10-21 7:29 ` [PATCH 1/9] target/s390x: Use a single return for helper_divs32/u32 Richard Henderson
@ 2022-10-21 7:29 ` Richard Henderson
2022-10-24 13:16 ` Philippe Mathieu-Daudé
2022-10-25 20:47 ` Ilya Leoshkevich
2022-10-21 7:30 ` [PATCH 3/9] target/s390x: Use Int128 for return from CLST Richard Henderson
` (6 subsequent siblings)
8 siblings, 2 replies; 35+ messages in thread
From: Richard Henderson @ 2022-10-21 7:29 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-s390x
Pack the quotient and remainder into a single Int128.
Use the divu128 primitive to remove the cpu_abort on
32-bit hosts.
This is the first use of Int128 as a return value.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/s390x/helper.h | 4 ++--
target/s390x/tcg/int_helper.c | 38 +++++++++--------------------------
target/s390x/tcg/translate.c | 14 +++++++++----
3 files changed, 21 insertions(+), 35 deletions(-)
diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 97a9668eef..29986557ed 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -12,8 +12,8 @@ DEF_HELPER_3(clcl, i32, env, i32, i32)
DEF_HELPER_FLAGS_4(clm, TCG_CALL_NO_WG, i32, env, i32, i32, i64)
DEF_HELPER_FLAGS_3(divs32, TCG_CALL_NO_WG, i64, env, s64, s64)
DEF_HELPER_FLAGS_3(divu32, TCG_CALL_NO_WG, i64, env, i64, i64)
-DEF_HELPER_FLAGS_3(divs64, TCG_CALL_NO_WG, s64, env, s64, s64)
-DEF_HELPER_FLAGS_4(divu64, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
+DEF_HELPER_FLAGS_3(divs64, TCG_CALL_NO_WG, i128, env, s64, s64)
+DEF_HELPER_FLAGS_4(divu64, TCG_CALL_NO_WG, i128, env, i64, i64, i64)
DEF_HELPER_3(srst, void, env, i32, i32)
DEF_HELPER_3(srstu, void, env, i32, i32)
DEF_HELPER_4(clst, i64, env, i64, i64, i64)
diff --git a/target/s390x/tcg/int_helper.c b/target/s390x/tcg/int_helper.c
index 130b8bd4d2..7dc919a102 100644
--- a/target/s390x/tcg/int_helper.c
+++ b/target/s390x/tcg/int_helper.c
@@ -76,46 +76,26 @@ uint64_t HELPER(divu32)(CPUS390XState *env, uint64_t a, uint64_t b64)
}
/* 64/64 -> 64 signed division */
-int64_t HELPER(divs64)(CPUS390XState *env, int64_t a, int64_t b)
+Int128 HELPER(divs64)(CPUS390XState *env, int64_t a, int64_t b)
{
/* Catch divide by zero, and non-representable quotient (MIN / -1). */
if (b == 0 || (b == -1 && a == (1ll << 63))) {
tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
}
- env->retxl = a % b;
- return a / b;
+ return int128_make128(a % b, a / b);
}
/* 128 -> 64/64 unsigned division */
-uint64_t HELPER(divu64)(CPUS390XState *env, uint64_t ah, uint64_t al,
- uint64_t b)
+Int128 HELPER(divu64)(CPUS390XState *env, uint64_t ah, uint64_t al, uint64_t b)
{
- uint64_t ret;
- /* Signal divide by zero. */
- if (b == 0) {
- tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
- }
- if (ah == 0) {
- /* 64 -> 64/64 case */
- env->retxl = al % b;
- ret = al / b;
- } else {
- /* ??? Move i386 idivq helper to host-utils. */
-#ifdef CONFIG_INT128
- __uint128_t a = ((__uint128_t)ah << 64) | al;
- __uint128_t q = a / b;
- env->retxl = a % b;
- ret = q;
- if (ret != q) {
- tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
+ if (b != 0) {
+ uint64_t r = divu128(&al, &ah, b);
+ if (ah == 0) {
+ return int128_make128(r, al);
}
-#else
- /* 32-bit hosts would need special wrapper functionality - just abort if
- we encounter such a case; it's very unlikely anyways. */
- cpu_abort(env_cpu(env), "128 -> 64/64 division not implemented\n");
-#endif
}
- return ret;
+ /* divide by zero or overflow */
+ tcg_s390_program_interrupt(env, PGM_FIXPT_DIVIDE, GETPC());
}
uint64_t HELPER(cvd)(int32_t reg)
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index 525317c9df..a3f5a55891 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -2411,15 +2411,21 @@ static DisasJumpType op_divu32(DisasContext *s, DisasOps *o)
static DisasJumpType op_divs64(DisasContext *s, DisasOps *o)
{
- gen_helper_divs64(o->out2, cpu_env, o->in1, o->in2);
- return_low128(o->out);
+ TCGv_i128 t = tcg_temp_new_i128();
+
+ gen_helper_divs64(t, cpu_env, o->in1, o->in2);
+ tcg_gen_extr_i128_i64(o->out, o->out2, t);
+ tcg_temp_free_i128(t);
return DISAS_NEXT;
}
static DisasJumpType op_divu64(DisasContext *s, DisasOps *o)
{
- gen_helper_divu64(o->out2, cpu_env, o->out, o->out2, o->in2);
- return_low128(o->out);
+ TCGv_i128 t = tcg_temp_new_i128();
+
+ gen_helper_divu64(t, cpu_env, o->out, o->out2, o->in2);
+ tcg_gen_extr_i128_i64(o->out, o->out2, t);
+ tcg_temp_free_i128(t);
return DISAS_NEXT;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH 2/9] target/s390x: Use a single return for helper_divs64/u64
2022-10-21 7:29 ` [PATCH 2/9] target/s390x: Use a single return for helper_divs64/u64 Richard Henderson
@ 2022-10-24 13:16 ` Philippe Mathieu-Daudé
2022-10-25 20:47 ` Ilya Leoshkevich
1 sibling, 0 replies; 35+ messages in thread
From: Philippe Mathieu-Daudé @ 2022-10-24 13:16 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: qemu-s390x
On 21/10/22 09:29, Richard Henderson wrote:
> Pack the quotient and remainder into a single Int128.
> Use the divu128 primitive to remove the cpu_abort on
> 32-bit hosts.
>
> This is the first use of Int128 as a return value.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/s390x/helper.h | 4 ++--
> target/s390x/tcg/int_helper.c | 38 +++++++++--------------------------
> target/s390x/tcg/translate.c | 14 +++++++++----
> 3 files changed, 21 insertions(+), 35 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 2/9] target/s390x: Use a single return for helper_divs64/u64
2022-10-21 7:29 ` [PATCH 2/9] target/s390x: Use a single return for helper_divs64/u64 Richard Henderson
2022-10-24 13:16 ` Philippe Mathieu-Daudé
@ 2022-10-25 20:47 ` Ilya Leoshkevich
1 sibling, 0 replies; 35+ messages in thread
From: Ilya Leoshkevich @ 2022-10-25 20:47 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: qemu-s390x
On Fri, Oct 21, 2022 at 05:29:59PM +1000, Richard Henderson wrote:
> Pack the quotient and remainder into a single Int128.
> Use the divu128 primitive to remove the cpu_abort on
> 32-bit hosts.
>
> This is the first use of Int128 as a return value.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 3/9] target/s390x: Use Int128 for return from CLST
2022-10-21 7:29 [PATCH 0/9] target/s390x: Use Int128 for float128 and retxl Richard Henderson
2022-10-21 7:29 ` [PATCH 1/9] target/s390x: Use a single return for helper_divs32/u32 Richard Henderson
2022-10-21 7:29 ` [PATCH 2/9] target/s390x: Use a single return for helper_divs64/u64 Richard Henderson
@ 2022-10-21 7:30 ` Richard Henderson
2022-10-24 13:15 ` Philippe Mathieu-Daudé
` (2 more replies)
2022-10-21 7:30 ` [PATCH 4/9] target/s390x: Use Int128 for return from CKSM Richard Henderson
` (5 subsequent siblings)
8 siblings, 3 replies; 35+ messages in thread
From: Richard Henderson @ 2022-10-21 7:30 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-s390x
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/s390x/helper.h | 2 +-
target/s390x/tcg/mem_helper.c | 11 ++++-------
target/s390x/tcg/translate.c | 8 ++++++--
3 files changed, 11 insertions(+), 10 deletions(-)
diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 29986557ed..ba9ed11896 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -16,7 +16,7 @@ DEF_HELPER_FLAGS_3(divs64, TCG_CALL_NO_WG, i128, env, s64, s64)
DEF_HELPER_FLAGS_4(divu64, TCG_CALL_NO_WG, i128, env, i64, i64, i64)
DEF_HELPER_3(srst, void, env, i32, i32)
DEF_HELPER_3(srstu, void, env, i32, i32)
-DEF_HELPER_4(clst, i64, env, i64, i64, i64)
+DEF_HELPER_4(clst, i128, env, i64, i64, i64)
DEF_HELPER_FLAGS_4(mvn, TCG_CALL_NO_WG, void, env, i32, i64, i64)
DEF_HELPER_FLAGS_4(mvo, TCG_CALL_NO_WG, void, env, i32, i64, i64)
DEF_HELPER_FLAGS_4(mvpg, TCG_CALL_NO_WG, i32, env, i64, i32, i32)
diff --git a/target/s390x/tcg/mem_helper.c b/target/s390x/tcg/mem_helper.c
index 3758b9e688..5a299ca9a2 100644
--- a/target/s390x/tcg/mem_helper.c
+++ b/target/s390x/tcg/mem_helper.c
@@ -886,7 +886,7 @@ void HELPER(srstu)(CPUS390XState *env, uint32_t r1, uint32_t r2)
}
/* unsigned string compare (c is string terminator) */
-uint64_t HELPER(clst)(CPUS390XState *env, uint64_t c, uint64_t s1, uint64_t s2)
+Int128 HELPER(clst)(CPUS390XState *env, uint64_t c, uint64_t s1, uint64_t s2)
{
uintptr_t ra = GETPC();
uint32_t len;
@@ -904,23 +904,20 @@ uint64_t HELPER(clst)(CPUS390XState *env, uint64_t c, uint64_t s1, uint64_t s2)
if (v1 == c) {
/* Equal. CC=0, and don't advance the registers. */
env->cc_op = 0;
- env->retxl = s2;
- return s1;
+ return int128_make128(s2, s1);
}
} else {
/* Unequal. CC={1,2}, and advance the registers. Note that
the terminator need not be zero, but the string that contains
the terminator is by definition "low". */
env->cc_op = (v1 == c ? 1 : v2 == c ? 2 : v1 < v2 ? 1 : 2);
- env->retxl = s2 + len;
- return s1 + len;
+ return int128_make128(s2 + len, s1 + len);
}
}
/* CPU-determined bytes equal; advance the registers. */
env->cc_op = 3;
- env->retxl = s2 + len;
- return s1 + len;
+ return int128_make128(s2 + len, s1 + len);
}
/* move page */
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index a3f5a55891..63ce131cbf 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -2164,9 +2164,13 @@ static DisasJumpType op_clm(DisasContext *s, DisasOps *o)
static DisasJumpType op_clst(DisasContext *s, DisasOps *o)
{
- gen_helper_clst(o->in1, cpu_env, regs[0], o->in1, o->in2);
+ TCGv_i128 pair = tcg_temp_new_i128();
+
+ gen_helper_clst(pair, cpu_env, regs[0], o->in1, o->in2);
+ tcg_gen_extr_i128_i64(o->in2, o->in1, pair);
+ tcg_temp_free_i128(pair);
+
set_cc_static(s);
- return_low128(o->in2);
return DISAS_NEXT;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH 3/9] target/s390x: Use Int128 for return from CLST
2022-10-21 7:30 ` [PATCH 3/9] target/s390x: Use Int128 for return from CLST Richard Henderson
@ 2022-10-24 13:15 ` Philippe Mathieu-Daudé
2022-10-25 21:22 ` Ilya Leoshkevich
2022-10-25 21:30 ` [PATCH 0/1] " Ilya Leoshkevich
2 siblings, 0 replies; 35+ messages in thread
From: Philippe Mathieu-Daudé @ 2022-10-24 13:15 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: qemu-s390x
On 21/10/22 09:30, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/s390x/helper.h | 2 +-
> target/s390x/tcg/mem_helper.c | 11 ++++-------
> target/s390x/tcg/translate.c | 8 ++++++--
> 3 files changed, 11 insertions(+), 10 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 3/9] target/s390x: Use Int128 for return from CLST
2022-10-21 7:30 ` [PATCH 3/9] target/s390x: Use Int128 for return from CLST Richard Henderson
2022-10-24 13:15 ` Philippe Mathieu-Daudé
@ 2022-10-25 21:22 ` Ilya Leoshkevich
2022-10-25 21:30 ` [PATCH 0/1] " Ilya Leoshkevich
2 siblings, 0 replies; 35+ messages in thread
From: Ilya Leoshkevich @ 2022-10-25 21:22 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: qemu-s390x
On Fri, Oct 21, 2022 at 05:30:00PM +1000, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/s390x/helper.h | 2 +-
> target/s390x/tcg/mem_helper.c | 11 ++++-------
> target/s390x/tcg/translate.c | 8 ++++++--
> 3 files changed, 11 insertions(+), 10 deletions(-)
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
I wanted to make sure the ordering within a pair was right and wrote a
small test. Feel free to add it to the series:
From: Ilya Leoshkevich <iii@linux.ibm.com>
Subject: [PATCH] tests/tcg/s390x: Add clst.c
Add a basic test to prevent regressions.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
tests/tcg/s390x/Makefile.target | 1 +
tests/tcg/s390x/clst.c | 82 +++++++++++++++++++++++++++++++++
2 files changed, 83 insertions(+)
create mode 100644 tests/tcg/s390x/clst.c
diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target
index 627668e1ce9..ad2e34b1859 100644
--- a/tests/tcg/s390x/Makefile.target
+++ b/tests/tcg/s390x/Makefile.target
@@ -18,6 +18,7 @@ TESTS+=signals-s390x
TESTS+=branch-relative-long
TESTS+=noexec
TESTS+=long-double
+TESTS+=clst
Z14_TESTS=vfminmax
vfminmax: LDFLAGS+=-lm
diff --git a/tests/tcg/s390x/clst.c b/tests/tcg/s390x/clst.c
new file mode 100644
index 00000000000..ed2fe7326c3
--- /dev/null
+++ b/tests/tcg/s390x/clst.c
@@ -0,0 +1,82 @@
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <stdlib.h>
+
+static int clst(char sep, const char **s1, const char **s2)
+{
+ const char *r1 = *s1;
+ const char *r2 = *s2;
+ int cc;
+
+ do {
+ register int r0 asm("r0") = sep;
+
+ asm("clst %[r1],%[r2]\n"
+ "ipm %[cc]\n"
+ "srl %[cc],28"
+ : [r1] "+r" (r1), [r2] "+r" (r2), "+r" (r0), [cc] "=r" (cc)
+ :
+ : "cc");
+ *s1 = r1;
+ *s2 = r2;
+ } while (cc == 3);
+
+ return cc;
+}
+
+static const struct test {
+ const char *name;
+ char sep;
+ const char *s1;
+ const char *s2;
+ int exp_cc;
+ int exp_off;
+} tests[] = {
+ {
+ .name = "cc0",
+ .sep = 0,
+ .s1 = "aa",
+ .s2 = "aa",
+ .exp_cc = 0,
+ .exp_off = 0,
+ },
+ {
+ .name = "cc1",
+ .sep = 1,
+ .s1 = "a\x01",
+ .s2 = "aa\x01",
+ .exp_cc = 1,
+ .exp_off = 1,
+ },
+ {
+ .name = "cc2",
+ .sep = 2,
+ .s1 = "abc\x02",
+ .s2 = "abb\x02",
+ .exp_cc = 2,
+ .exp_off = 2,
+ },
+};
+
+int main(void)
+{
+ const struct test *t;
+ const char *s1, *s2;
+ size_t i;
+ int cc;
+
+ for (i = 0; i < sizeof(tests) / sizeof(tests[0]); i++) {
+ t = &tests[i];
+ s1 = t->s1;
+ s2 = t->s2;
+ cc = clst(t->sep, &s1, &s2);
+ if (cc != t->exp_cc ||
+ s1 != t->s1 + t->exp_off ||
+ s2 != t->s2 + t->exp_off) {
+ fprintf(stderr, "%s\n", t->name);
+ return EXIT_FAILURE;
+ }
+ }
+
+ return EXIT_SUCCESS;
+}
--
2.37.2
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH 0/1] Re: target/s390x: Use Int128 for return from CLST
2022-10-21 7:30 ` [PATCH 3/9] target/s390x: Use Int128 for return from CLST Richard Henderson
2022-10-24 13:15 ` Philippe Mathieu-Daudé
2022-10-25 21:22 ` Ilya Leoshkevich
@ 2022-10-25 21:30 ` Ilya Leoshkevich
2022-10-25 21:30 ` [PATCH 1/1] tests/tcg/s390x: Add clst.c Ilya Leoshkevich
2 siblings, 1 reply; 35+ messages in thread
From: Ilya Leoshkevich @ 2022-10-25 21:30 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, Ilya Leoshkevich
I wanted to make sure that the pair elements were not mixed up and
wrote a small test. Feel free to add it to the series.
Ilya Leoshkevich (1):
tests/tcg/s390x: Add clst.c
tests/tcg/s390x/Makefile.target | 1 +
tests/tcg/s390x/clst.c | 82 +++++++++++++++++++++++++++++++++
2 files changed, 83 insertions(+)
create mode 100644 tests/tcg/s390x/clst.c
--
2.37.2
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 1/1] tests/tcg/s390x: Add clst.c
2022-10-25 21:30 ` [PATCH 0/1] " Ilya Leoshkevich
@ 2022-10-25 21:30 ` Ilya Leoshkevich
0 siblings, 0 replies; 35+ messages in thread
From: Ilya Leoshkevich @ 2022-10-25 21:30 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, Ilya Leoshkevich
Add a basic test to prevent regressions.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
tests/tcg/s390x/Makefile.target | 1 +
tests/tcg/s390x/clst.c | 82 +++++++++++++++++++++++++++++++++
2 files changed, 83 insertions(+)
create mode 100644 tests/tcg/s390x/clst.c
diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target
index 627668e1ce9..ad2e34b1859 100644
--- a/tests/tcg/s390x/Makefile.target
+++ b/tests/tcg/s390x/Makefile.target
@@ -18,6 +18,7 @@ TESTS+=signals-s390x
TESTS+=branch-relative-long
TESTS+=noexec
TESTS+=long-double
+TESTS+=clst
Z14_TESTS=vfminmax
vfminmax: LDFLAGS+=-lm
diff --git a/tests/tcg/s390x/clst.c b/tests/tcg/s390x/clst.c
new file mode 100644
index 00000000000..ed2fe7326c3
--- /dev/null
+++ b/tests/tcg/s390x/clst.c
@@ -0,0 +1,82 @@
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <stdlib.h>
+
+static int clst(char sep, const char **s1, const char **s2)
+{
+ const char *r1 = *s1;
+ const char *r2 = *s2;
+ int cc;
+
+ do {
+ register int r0 asm("r0") = sep;
+
+ asm("clst %[r1],%[r2]\n"
+ "ipm %[cc]\n"
+ "srl %[cc],28"
+ : [r1] "+r" (r1), [r2] "+r" (r2), "+r" (r0), [cc] "=r" (cc)
+ :
+ : "cc");
+ *s1 = r1;
+ *s2 = r2;
+ } while (cc == 3);
+
+ return cc;
+}
+
+static const struct test {
+ const char *name;
+ char sep;
+ const char *s1;
+ const char *s2;
+ int exp_cc;
+ int exp_off;
+} tests[] = {
+ {
+ .name = "cc0",
+ .sep = 0,
+ .s1 = "aa",
+ .s2 = "aa",
+ .exp_cc = 0,
+ .exp_off = 0,
+ },
+ {
+ .name = "cc1",
+ .sep = 1,
+ .s1 = "a\x01",
+ .s2 = "aa\x01",
+ .exp_cc = 1,
+ .exp_off = 1,
+ },
+ {
+ .name = "cc2",
+ .sep = 2,
+ .s1 = "abc\x02",
+ .s2 = "abb\x02",
+ .exp_cc = 2,
+ .exp_off = 2,
+ },
+};
+
+int main(void)
+{
+ const struct test *t;
+ const char *s1, *s2;
+ size_t i;
+ int cc;
+
+ for (i = 0; i < sizeof(tests) / sizeof(tests[0]); i++) {
+ t = &tests[i];
+ s1 = t->s1;
+ s2 = t->s2;
+ cc = clst(t->sep, &s1, &s2);
+ if (cc != t->exp_cc ||
+ s1 != t->s1 + t->exp_off ||
+ s2 != t->s2 + t->exp_off) {
+ fprintf(stderr, "%s\n", t->name);
+ return EXIT_FAILURE;
+ }
+ }
+
+ return EXIT_SUCCESS;
+}
--
2.37.2
^ permalink raw reply related [flat|nested] 35+ messages in thread
* [PATCH 4/9] target/s390x: Use Int128 for return from CKSM
2022-10-21 7:29 [PATCH 0/9] target/s390x: Use Int128 for float128 and retxl Richard Henderson
` (2 preceding siblings ...)
2022-10-21 7:30 ` [PATCH 3/9] target/s390x: Use Int128 for return from CLST Richard Henderson
@ 2022-10-21 7:30 ` Richard Henderson
2022-10-24 13:17 ` Philippe Mathieu-Daudé
2022-10-27 10:36 ` Ilya Leoshkevich
2022-10-21 7:30 ` [PATCH 5/9] target/s390x: Use Int128 for return from TRE Richard Henderson
` (4 subsequent siblings)
8 siblings, 2 replies; 35+ messages in thread
From: Richard Henderson @ 2022-10-21 7:30 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-s390x
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/s390x/helper.h | 2 +-
target/s390x/tcg/mem_helper.c | 7 +++----
target/s390x/tcg/translate.c | 6 ++++--
3 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index ba9ed11896..16045b6dbe 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -103,7 +103,7 @@ DEF_HELPER_4(tre, i64, env, i64, i64, i64)
DEF_HELPER_4(trt, i32, env, i32, i64, i64)
DEF_HELPER_4(trtr, i32, env, i32, i64, i64)
DEF_HELPER_5(trXX, i32, env, i32, i32, i32, i32)
-DEF_HELPER_4(cksm, i64, env, i64, i64, i64)
+DEF_HELPER_4(cksm, i128, env, i64, i64, i64)
DEF_HELPER_FLAGS_5(calc_cc, TCG_CALL_NO_RWG_SE, i32, env, i32, i64, i64, i64)
DEF_HELPER_FLAGS_2(sfpc, TCG_CALL_NO_WG, void, env, i64)
DEF_HELPER_FLAGS_2(sfas, TCG_CALL_NO_WG, void, env, i64)
diff --git a/target/s390x/tcg/mem_helper.c b/target/s390x/tcg/mem_helper.c
index 5a299ca9a2..bb6714aee4 100644
--- a/target/s390x/tcg/mem_helper.c
+++ b/target/s390x/tcg/mem_helper.c
@@ -1350,8 +1350,8 @@ uint32_t HELPER(clclu)(CPUS390XState *env, uint32_t r1, uint64_t a2,
}
/* checksum */
-uint64_t HELPER(cksm)(CPUS390XState *env, uint64_t r1,
- uint64_t src, uint64_t src_len)
+Int128 HELPER(cksm)(CPUS390XState *env, uint64_t r1,
+ uint64_t src, uint64_t src_len)
{
uintptr_t ra = GETPC();
uint64_t max_len, len;
@@ -1392,8 +1392,7 @@ uint64_t HELPER(cksm)(CPUS390XState *env, uint64_t r1,
env->cc_op = (len == src_len ? 0 : 3);
/* Return both cksm and processed length. */
- env->retxl = cksm;
- return len;
+ return int128_make128(cksm, len);
}
void HELPER(pack)(CPUS390XState *env, uint32_t len, uint64_t dest, uint64_t src)
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index 63ce131cbf..224046551f 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -2041,11 +2041,13 @@ static DisasJumpType op_cxlgb(DisasContext *s, DisasOps *o)
static DisasJumpType op_cksm(DisasContext *s, DisasOps *o)
{
int r2 = get_field(s, r2);
+ TCGv_i128 pair = tcg_temp_new_i128();
TCGv_i64 len = tcg_temp_new_i64();
- gen_helper_cksm(len, cpu_env, o->in1, o->in2, regs[r2 + 1]);
+ gen_helper_cksm(pair, cpu_env, o->in1, o->in2, regs[r2 + 1]);
set_cc_static(s);
- return_low128(o->out);
+ tcg_gen_extr_i128_i64(o->out, len, pair);
+ tcg_temp_free_i128(pair);
tcg_gen_add_i64(regs[r2], regs[r2], len);
tcg_gen_sub_i64(regs[r2 + 1], regs[r2 + 1], len);
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH 4/9] target/s390x: Use Int128 for return from CKSM
2022-10-21 7:30 ` [PATCH 4/9] target/s390x: Use Int128 for return from CKSM Richard Henderson
@ 2022-10-24 13:17 ` Philippe Mathieu-Daudé
2022-10-27 10:36 ` Ilya Leoshkevich
1 sibling, 0 replies; 35+ messages in thread
From: Philippe Mathieu-Daudé @ 2022-10-24 13:17 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: qemu-s390x
On 21/10/22 09:30, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/s390x/helper.h | 2 +-
> target/s390x/tcg/mem_helper.c | 7 +++----
> target/s390x/tcg/translate.c | 6 ++++--
> 3 files changed, 8 insertions(+), 7 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 4/9] target/s390x: Use Int128 for return from CKSM
2022-10-21 7:30 ` [PATCH 4/9] target/s390x: Use Int128 for return from CKSM Richard Henderson
2022-10-24 13:17 ` Philippe Mathieu-Daudé
@ 2022-10-27 10:36 ` Ilya Leoshkevich
1 sibling, 0 replies; 35+ messages in thread
From: Ilya Leoshkevich @ 2022-10-27 10:36 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: qemu-s390x
On Fri, Oct 21, 2022 at 05:30:01PM +1000, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/s390x/helper.h | 2 +-
> target/s390x/tcg/mem_helper.c | 7 +++----
> target/s390x/tcg/translate.c | 6 ++++--
> 3 files changed, 8 insertions(+), 7 deletions(-)
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 5/9] target/s390x: Use Int128 for return from TRE
2022-10-21 7:29 [PATCH 0/9] target/s390x: Use Int128 for float128 and retxl Richard Henderson
` (3 preceding siblings ...)
2022-10-21 7:30 ` [PATCH 4/9] target/s390x: Use Int128 for return from CKSM Richard Henderson
@ 2022-10-21 7:30 ` Richard Henderson
2022-10-24 13:17 ` Philippe Mathieu-Daudé
2022-10-27 10:40 ` Ilya Leoshkevich
2022-10-21 7:30 ` [PATCH 6/9] target/s390x: Copy wout_x1 to wout_x1_P Richard Henderson
` (3 subsequent siblings)
8 siblings, 2 replies; 35+ messages in thread
From: Richard Henderson @ 2022-10-21 7:30 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-s390x
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/s390x/helper.h | 2 +-
target/s390x/tcg/mem_helper.c | 7 +++----
target/s390x/tcg/translate.c | 7 +++++--
3 files changed, 9 insertions(+), 7 deletions(-)
diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 16045b6dbe..aaea177246 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -99,7 +99,7 @@ DEF_HELPER_FLAGS_4(unpka, TCG_CALL_NO_WG, i32, env, i64, i32, i64)
DEF_HELPER_FLAGS_4(unpku, TCG_CALL_NO_WG, i32, env, i64, i32, i64)
DEF_HELPER_FLAGS_3(tp, TCG_CALL_NO_WG, i32, env, i64, i32)
DEF_HELPER_FLAGS_4(tr, TCG_CALL_NO_WG, void, env, i32, i64, i64)
-DEF_HELPER_4(tre, i64, env, i64, i64, i64)
+DEF_HELPER_4(tre, i128, env, i64, i64, i64)
DEF_HELPER_4(trt, i32, env, i32, i64, i64)
DEF_HELPER_4(trtr, i32, env, i32, i64, i64)
DEF_HELPER_5(trXX, i32, env, i32, i32, i32, i32)
diff --git a/target/s390x/tcg/mem_helper.c b/target/s390x/tcg/mem_helper.c
index bb6714aee4..caf8c408ef 100644
--- a/target/s390x/tcg/mem_helper.c
+++ b/target/s390x/tcg/mem_helper.c
@@ -1632,8 +1632,8 @@ void HELPER(tr)(CPUS390XState *env, uint32_t len, uint64_t array,
do_helper_tr(env, len, array, trans, GETPC());
}
-uint64_t HELPER(tre)(CPUS390XState *env, uint64_t array,
- uint64_t len, uint64_t trans)
+Int128 HELPER(tre)(CPUS390XState *env, uint64_t array,
+ uint64_t len, uint64_t trans)
{
uintptr_t ra = GETPC();
uint8_t end = env->regs[0] & 0xff;
@@ -1668,8 +1668,7 @@ uint64_t HELPER(tre)(CPUS390XState *env, uint64_t array,
}
env->cc_op = cc;
- env->retxl = len - i;
- return array + i;
+ return int128_make128(len - i, array + i);
}
static inline uint32_t do_helper_trt(CPUS390XState *env, int len,
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index 224046551f..8dc4f4b5ec 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -4905,8 +4905,11 @@ static DisasJumpType op_tr(DisasContext *s, DisasOps *o)
static DisasJumpType op_tre(DisasContext *s, DisasOps *o)
{
- gen_helper_tre(o->out, cpu_env, o->out, o->out2, o->in2);
- return_low128(o->out2);
+ TCGv_i128 pair = tcg_temp_new_i128();
+
+ gen_helper_tre(pair, cpu_env, o->out, o->out2, o->in2);
+ tcg_gen_extr_i128_i64(o->out2, o->out, pair);
+ tcg_temp_free_i128(pair);
set_cc_static(s);
return DISAS_NEXT;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH 5/9] target/s390x: Use Int128 for return from TRE
2022-10-21 7:30 ` [PATCH 5/9] target/s390x: Use Int128 for return from TRE Richard Henderson
@ 2022-10-24 13:17 ` Philippe Mathieu-Daudé
2022-10-27 10:40 ` Ilya Leoshkevich
1 sibling, 0 replies; 35+ messages in thread
From: Philippe Mathieu-Daudé @ 2022-10-24 13:17 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: qemu-s390x
On 21/10/22 09:30, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/s390x/helper.h | 2 +-
> target/s390x/tcg/mem_helper.c | 7 +++----
> target/s390x/tcg/translate.c | 7 +++++--
> 3 files changed, 9 insertions(+), 7 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 5/9] target/s390x: Use Int128 for return from TRE
2022-10-21 7:30 ` [PATCH 5/9] target/s390x: Use Int128 for return from TRE Richard Henderson
2022-10-24 13:17 ` Philippe Mathieu-Daudé
@ 2022-10-27 10:40 ` Ilya Leoshkevich
1 sibling, 0 replies; 35+ messages in thread
From: Ilya Leoshkevich @ 2022-10-27 10:40 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: qemu-s390x
On Fri, Oct 21, 2022 at 05:30:02PM +1000, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/s390x/helper.h | 2 +-
> target/s390x/tcg/mem_helper.c | 7 +++----
> target/s390x/tcg/translate.c | 7 +++++--
> 3 files changed, 9 insertions(+), 7 deletions(-)
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 6/9] target/s390x: Copy wout_x1 to wout_x1_P
2022-10-21 7:29 [PATCH 0/9] target/s390x: Use Int128 for float128 and retxl Richard Henderson
` (4 preceding siblings ...)
2022-10-21 7:30 ` [PATCH 5/9] target/s390x: Use Int128 for return from TRE Richard Henderson
@ 2022-10-21 7:30 ` Richard Henderson
2022-10-27 10:57 ` Ilya Leoshkevich
2022-10-21 7:30 ` [PATCH 7/9] tests/tcg/s390x: Add long-double.c Richard Henderson
` (2 subsequent siblings)
8 siblings, 1 reply; 35+ messages in thread
From: Richard Henderson @ 2022-10-21 7:30 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-s390x
Make a copy of wout_x1 before modifying it, as wout_x1_P
emphasizing that it operates on the out/out2 pair. The insns
that use x1_P are data movement that will not change to Int128.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/s390x/tcg/translate.c | 8 ++++++++
target/s390x/tcg/insn-data.def | 12 ++++++------
2 files changed, 14 insertions(+), 6 deletions(-)
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index 8dc4f4b5ec..15c1a2067e 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -5518,6 +5518,14 @@ static void wout_x1(DisasContext *s, DisasOps *o)
}
#define SPEC_wout_x1 SPEC_r1_f128
+static void wout_x1_P(DisasContext *s, DisasOps *o)
+{
+ int f1 = get_field(s, r1);
+ store_freg(f1, o->out);
+ store_freg(f1 + 2, o->out2);
+}
+#define SPEC_wout_x1_P SPEC_r1_f128
+
static void wout_cond_r1r2_32(DisasContext *s, DisasOps *o)
{
if (get_field(s, r1) != get_field(s, r2)) {
diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def
index 6382ceabfc..83e246ab16 100644
--- a/target/s390x/tcg/insn-data.def
+++ b/target/s390x/tcg/insn-data.def
@@ -422,7 +422,7 @@
F(0x3800, LER, RR_a, Z, 0, e2, 0, cond_e1e2, mov2, 0, IF_AFP1 | IF_AFP2)
F(0x7800, LE, RX_a, Z, 0, m2_32u, 0, e1, mov2, 0, IF_AFP1)
F(0xed64, LEY, RXY_a, LD, 0, m2_32u, 0, e1, mov2, 0, IF_AFP1)
- F(0xb365, LXR, RRE, Z, x2h, x2l, 0, x1, movx, 0, IF_AFP1)
+ F(0xb365, LXR, RRE, Z, x2h, x2l, 0, x1_P, movx, 0, IF_AFP1)
/* LOAD IMMEDIATE */
C(0xc001, LGFI, RIL_a, EI, 0, i2, 0, r1, mov2, 0)
/* LOAD RELATIVE LONG */
@@ -461,7 +461,7 @@
C(0xe332, LTGF, RXY_a, GIE, 0, a2, r1, 0, ld32s, s64)
F(0xb302, LTEBR, RRE, Z, 0, e2, 0, cond_e1e2, mov2, f32, IF_BFP)
F(0xb312, LTDBR, RRE, Z, 0, f2, 0, f1, mov2, f64, IF_BFP)
- F(0xb342, LTXBR, RRE, Z, x2h, x2l, 0, x1, movx, f128, IF_BFP)
+ F(0xb342, LTXBR, RRE, Z, x2h, x2l, 0, x1_P, movx, f128, IF_BFP)
/* LOAD AND TRAP */
C(0xe39f, LAT, RXY_a, LAT, 0, m2_32u, r1, 0, lat, 0)
C(0xe385, LGAT, RXY_a, LAT, 0, a2, r1, 0, lgat, 0)
@@ -483,7 +483,7 @@
C(0xb913, LCGFR, RRE, Z, 0, r2_32s, r1, 0, neg, neg64)
F(0xb303, LCEBR, RRE, Z, 0, e2, new, e1, negf32, f32, IF_BFP)
F(0xb313, LCDBR, RRE, Z, 0, f2, new, f1, negf64, f64, IF_BFP)
- F(0xb343, LCXBR, RRE, Z, x2h, x2l, new_P, x1, negf128, f128, IF_BFP)
+ F(0xb343, LCXBR, RRE, Z, x2h, x2l, new_P, x1_P, negf128, f128, IF_BFP)
F(0xb373, LCDFR, RRE, FPSSH, 0, f2, new, f1, negf64, 0, IF_AFP1 | IF_AFP2)
/* LOAD COUNT TO BLOCK BOUNDARY */
C(0xe727, LCBB, RXE, V, la2, 0, r1, 0, lcbb, 0)
@@ -552,7 +552,7 @@
C(0xb911, LNGFR, RRE, Z, 0, r2_32s, r1, 0, nabs, nabs64)
F(0xb301, LNEBR, RRE, Z, 0, e2, new, e1, nabsf32, f32, IF_BFP)
F(0xb311, LNDBR, RRE, Z, 0, f2, new, f1, nabsf64, f64, IF_BFP)
- F(0xb341, LNXBR, RRE, Z, x2h, x2l, new_P, x1, nabsf128, f128, IF_BFP)
+ F(0xb341, LNXBR, RRE, Z, x2h, x2l, new_P, x1_P, nabsf128, f128, IF_BFP)
F(0xb371, LNDFR, RRE, FPSSH, 0, f2, new, f1, nabsf64, 0, IF_AFP1 | IF_AFP2)
/* LOAD ON CONDITION */
C(0xb9f2, LOCR, RRF_c, LOC, r1, r2, new, r1_32, loc, 0)
@@ -577,7 +577,7 @@
C(0xb910, LPGFR, RRE, Z, 0, r2_32s, r1, 0, abs, abs64)
F(0xb300, LPEBR, RRE, Z, 0, e2, new, e1, absf32, f32, IF_BFP)
F(0xb310, LPDBR, RRE, Z, 0, f2, new, f1, absf64, f64, IF_BFP)
- F(0xb340, LPXBR, RRE, Z, x2h, x2l, new_P, x1, absf128, f128, IF_BFP)
+ F(0xb340, LPXBR, RRE, Z, x2h, x2l, new_P, x1_P, absf128, f128, IF_BFP)
F(0xb370, LPDFR, RRE, FPSSH, 0, f2, new, f1, absf64, 0, IF_AFP1 | IF_AFP2)
/* LOAD REVERSED */
C(0xb91f, LRVR, RRE, Z, 0, r2_32u, new, r1_32, rev32, 0)
@@ -588,7 +588,7 @@
/* LOAD ZERO */
F(0xb374, LZER, RRE, Z, 0, 0, 0, e1, zero, 0, IF_AFP1)
F(0xb375, LZDR, RRE, Z, 0, 0, 0, f1, zero, 0, IF_AFP1)
- F(0xb376, LZXR, RRE, Z, 0, 0, 0, x1, zero2, 0, IF_AFP1)
+ F(0xb376, LZXR, RRE, Z, 0, 0, 0, x1_P, zero2, 0, IF_AFP1)
/* LOAD FPC */
F(0xb29d, LFPC, S, Z, 0, m2_32u, 0, 0, sfpc, 0, IF_BFP)
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH 6/9] target/s390x: Copy wout_x1 to wout_x1_P
2022-10-21 7:30 ` [PATCH 6/9] target/s390x: Copy wout_x1 to wout_x1_P Richard Henderson
@ 2022-10-27 10:57 ` Ilya Leoshkevich
0 siblings, 0 replies; 35+ messages in thread
From: Ilya Leoshkevich @ 2022-10-27 10:57 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: qemu-s390x
On Fri, Oct 21, 2022 at 05:30:03PM +1000, Richard Henderson wrote:
> Make a copy of wout_x1 before modifying it, as wout_x1_P
> emphasizing that it operates on the out/out2 pair. The insns
> that use x1_P are data movement that will not change to Int128.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/s390x/tcg/translate.c | 8 ++++++++
> target/s390x/tcg/insn-data.def | 12 ++++++------
> 2 files changed, 14 insertions(+), 6 deletions(-)
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 7/9] tests/tcg/s390x: Add long-double.c
2022-10-21 7:29 [PATCH 0/9] target/s390x: Use Int128 for float128 and retxl Richard Henderson
` (5 preceding siblings ...)
2022-10-21 7:30 ` [PATCH 6/9] target/s390x: Copy wout_x1 to wout_x1_P Richard Henderson
@ 2022-10-21 7:30 ` Richard Henderson
2022-10-24 13:19 ` Philippe Mathieu-Daudé
2022-10-27 11:04 ` Ilya Leoshkevich
2022-10-21 7:30 ` [PATCH 8/9] target/s390x: Use Int128 for returning float128 Richard Henderson
2022-10-21 7:30 ` [PATCH 9/9] target/s390x: Use Int128 for passing float128 Richard Henderson
8 siblings, 2 replies; 35+ messages in thread
From: Richard Henderson @ 2022-10-21 7:30 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-s390x
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tests/tcg/s390x/long-double.c | 24 ++++++++++++++++++++++++
tests/tcg/s390x/Makefile.target | 1 +
2 files changed, 25 insertions(+)
create mode 100644 tests/tcg/s390x/long-double.c
diff --git a/tests/tcg/s390x/long-double.c b/tests/tcg/s390x/long-double.c
new file mode 100644
index 0000000000..757a6262fd
--- /dev/null
+++ b/tests/tcg/s390x/long-double.c
@@ -0,0 +1,24 @@
+/*
+ * Perform some basic arithmetic with long double, as a sanity check.
+ * With small integral numbers, we can cross-check with integers.
+ */
+
+#include <assert.h>
+
+int main()
+{
+ int i, j;
+
+ for (i = 1; i < 5; i++) {
+ for (j = 1; j < 5; j++) {
+ long double la = (long double)i + j;
+ long double lm = (long double)i * j;
+ long double ls = (long double)i - j;
+
+ assert(la == i + j);
+ assert(lm == i * j);
+ assert(ls == i - j);
+ }
+ }
+ return 0;
+}
diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target
index c830313e67..627668e1ce 100644
--- a/tests/tcg/s390x/Makefile.target
+++ b/tests/tcg/s390x/Makefile.target
@@ -17,6 +17,7 @@ TESTS+=trap
TESTS+=signals-s390x
TESTS+=branch-relative-long
TESTS+=noexec
+TESTS+=long-double
Z14_TESTS=vfminmax
vfminmax: LDFLAGS+=-lm
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH 7/9] tests/tcg/s390x: Add long-double.c
2022-10-21 7:30 ` [PATCH 7/9] tests/tcg/s390x: Add long-double.c Richard Henderson
@ 2022-10-24 13:19 ` Philippe Mathieu-Daudé
2022-10-27 11:04 ` Ilya Leoshkevich
1 sibling, 0 replies; 35+ messages in thread
From: Philippe Mathieu-Daudé @ 2022-10-24 13:19 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: qemu-s390x
On 21/10/22 09:30, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> tests/tcg/s390x/long-double.c | 24 ++++++++++++++++++++++++
> tests/tcg/s390x/Makefile.target | 1 +
> 2 files changed, 25 insertions(+)
> create mode 100644 tests/tcg/s390x/long-double.c
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 7/9] tests/tcg/s390x: Add long-double.c
2022-10-21 7:30 ` [PATCH 7/9] tests/tcg/s390x: Add long-double.c Richard Henderson
2022-10-24 13:19 ` Philippe Mathieu-Daudé
@ 2022-10-27 11:04 ` Ilya Leoshkevich
1 sibling, 0 replies; 35+ messages in thread
From: Ilya Leoshkevich @ 2022-10-27 11:04 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: qemu-s390x
On Fri, Oct 21, 2022 at 05:30:04PM +1000, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> tests/tcg/s390x/long-double.c | 24 ++++++++++++++++++++++++
> tests/tcg/s390x/Makefile.target | 1 +
> 2 files changed, 25 insertions(+)
> create mode 100644 tests/tcg/s390x/long-double.c
It might be better to do this in asm in order to be sure that a
compiler doesn't perform any magic. But at least as of today gcc
generates all the "interesting" instructions from this code.
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 8/9] target/s390x: Use Int128 for returning float128
2022-10-21 7:29 [PATCH 0/9] target/s390x: Use Int128 for float128 and retxl Richard Henderson
` (6 preceding siblings ...)
2022-10-21 7:30 ` [PATCH 7/9] tests/tcg/s390x: Add long-double.c Richard Henderson
@ 2022-10-21 7:30 ` Richard Henderson
2022-10-25 9:58 ` Philippe Mathieu-Daudé
2022-10-27 11:18 ` Ilya Leoshkevich
2022-10-21 7:30 ` [PATCH 9/9] target/s390x: Use Int128 for passing float128 Richard Henderson
8 siblings, 2 replies; 35+ messages in thread
From: Richard Henderson @ 2022-10-21 7:30 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-s390x
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/s390x/helper.h | 22 +++++++--------
target/s390x/tcg/fpu_helper.c | 29 ++++++++++----------
target/s390x/tcg/translate.c | 49 +++++++++++++++++++---------------
target/s390x/tcg/insn-data.def | 20 +++++++-------
4 files changed, 63 insertions(+), 57 deletions(-)
diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index aaea177246..429131a85e 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -31,32 +31,32 @@ DEF_HELPER_4(clcle, i32, env, i32, i64, i32)
DEF_HELPER_4(clclu, i32, env, i32, i64, i32)
DEF_HELPER_3(cegb, i64, env, s64, i32)
DEF_HELPER_3(cdgb, i64, env, s64, i32)
-DEF_HELPER_3(cxgb, i64, env, s64, i32)
+DEF_HELPER_3(cxgb, i128, env, s64, i32)
DEF_HELPER_3(celgb, i64, env, i64, i32)
DEF_HELPER_3(cdlgb, i64, env, i64, i32)
-DEF_HELPER_3(cxlgb, i64, env, i64, i32)
+DEF_HELPER_3(cxlgb, i128, env, i64, i32)
DEF_HELPER_4(cdsg, void, env, i64, i32, i32)
DEF_HELPER_4(cdsg_parallel, void, env, i64, i32, i32)
DEF_HELPER_4(csst, i32, env, i32, i64, i64)
DEF_HELPER_4(csst_parallel, i32, env, i32, i64, i64)
DEF_HELPER_FLAGS_3(aeb, TCG_CALL_NO_WG, i64, env, i64, i64)
DEF_HELPER_FLAGS_3(adb, TCG_CALL_NO_WG, i64, env, i64, i64)
-DEF_HELPER_FLAGS_5(axb, TCG_CALL_NO_WG, i64, env, i64, i64, i64, i64)
+DEF_HELPER_FLAGS_5(axb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64)
DEF_HELPER_FLAGS_3(seb, TCG_CALL_NO_WG, i64, env, i64, i64)
DEF_HELPER_FLAGS_3(sdb, TCG_CALL_NO_WG, i64, env, i64, i64)
-DEF_HELPER_FLAGS_5(sxb, TCG_CALL_NO_WG, i64, env, i64, i64, i64, i64)
+DEF_HELPER_FLAGS_5(sxb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64)
DEF_HELPER_FLAGS_3(deb, TCG_CALL_NO_WG, i64, env, i64, i64)
DEF_HELPER_FLAGS_3(ddb, TCG_CALL_NO_WG, i64, env, i64, i64)
-DEF_HELPER_FLAGS_5(dxb, TCG_CALL_NO_WG, i64, env, i64, i64, i64, i64)
+DEF_HELPER_FLAGS_5(dxb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64)
DEF_HELPER_FLAGS_3(meeb, TCG_CALL_NO_WG, i64, env, i64, i64)
DEF_HELPER_FLAGS_3(mdeb, TCG_CALL_NO_WG, i64, env, i64, i64)
DEF_HELPER_FLAGS_3(mdb, TCG_CALL_NO_WG, i64, env, i64, i64)
-DEF_HELPER_FLAGS_5(mxb, TCG_CALL_NO_WG, i64, env, i64, i64, i64, i64)
-DEF_HELPER_FLAGS_4(mxdb, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
+DEF_HELPER_FLAGS_5(mxb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64)
+DEF_HELPER_FLAGS_4(mxdb, TCG_CALL_NO_WG, i128, env, i64, i64, i64)
DEF_HELPER_FLAGS_2(ldeb, TCG_CALL_NO_WG, i64, env, i64)
DEF_HELPER_FLAGS_4(ldxb, TCG_CALL_NO_WG, i64, env, i64, i64, i32)
-DEF_HELPER_FLAGS_2(lxdb, TCG_CALL_NO_WG, i64, env, i64)
-DEF_HELPER_FLAGS_2(lxeb, TCG_CALL_NO_WG, i64, env, i64)
+DEF_HELPER_FLAGS_2(lxdb, TCG_CALL_NO_WG, i128, env, i64)
+DEF_HELPER_FLAGS_2(lxeb, TCG_CALL_NO_WG, i128, env, i64)
DEF_HELPER_FLAGS_3(ledb, TCG_CALL_NO_WG, i64, env, i64, i32)
DEF_HELPER_FLAGS_4(lexb, TCG_CALL_NO_WG, i64, env, i64, i64, i32)
DEF_HELPER_FLAGS_3(ceb, TCG_CALL_NO_WG_SE, i32, env, i64, i64)
@@ -79,7 +79,7 @@ DEF_HELPER_3(clfdb, i64, env, i64, i32)
DEF_HELPER_4(clfxb, i64, env, i64, i64, i32)
DEF_HELPER_FLAGS_3(fieb, TCG_CALL_NO_WG, i64, env, i64, i32)
DEF_HELPER_FLAGS_3(fidb, TCG_CALL_NO_WG, i64, env, i64, i32)
-DEF_HELPER_FLAGS_4(fixb, TCG_CALL_NO_WG, i64, env, i64, i64, i32)
+DEF_HELPER_FLAGS_4(fixb, TCG_CALL_NO_WG, i128, env, i64, i64, i32)
DEF_HELPER_FLAGS_4(maeb, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
DEF_HELPER_FLAGS_4(madb, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
DEF_HELPER_FLAGS_4(mseb, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
@@ -89,7 +89,7 @@ DEF_HELPER_FLAGS_3(tcdb, TCG_CALL_NO_RWG_SE, i32, env, i64, i64)
DEF_HELPER_FLAGS_4(tcxb, TCG_CALL_NO_RWG_SE, i32, env, i64, i64, i64)
DEF_HELPER_FLAGS_2(sqeb, TCG_CALL_NO_WG, i64, env, i64)
DEF_HELPER_FLAGS_2(sqdb, TCG_CALL_NO_WG, i64, env, i64)
-DEF_HELPER_FLAGS_3(sqxb, TCG_CALL_NO_WG, i64, env, i64, i64)
+DEF_HELPER_FLAGS_3(sqxb, TCG_CALL_NO_WG, i128, env, i64, i64)
DEF_HELPER_FLAGS_1(cvd, TCG_CALL_NO_RWG_SE, i64, s32)
DEF_HELPER_FLAGS_4(pack, TCG_CALL_NO_WG, void, env, i32, i64, i64)
DEF_HELPER_FLAGS_4(pka, TCG_CALL_NO_WG, void, env, i64, i64, i32)
diff --git a/target/s390x/tcg/fpu_helper.c b/target/s390x/tcg/fpu_helper.c
index 4067205405..a584794be6 100644
--- a/target/s390x/tcg/fpu_helper.c
+++ b/target/s390x/tcg/fpu_helper.c
@@ -34,7 +34,10 @@
#define HELPER_LOG(x...)
#endif
-#define RET128(F) (env->retxl = F.low, F.high)
+static inline Int128 RET128(float128 f)
+{
+ return int128_make128(f.low, f.high);
+}
uint8_t s390_softfloat_exc_to_ieee(unsigned int exc)
{
@@ -224,7 +227,7 @@ uint64_t HELPER(adb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
}
/* 128-bit FP addition */
-uint64_t HELPER(axb)(CPUS390XState *env, uint64_t ah, uint64_t al,
+Int128 HELPER(axb)(CPUS390XState *env, uint64_t ah, uint64_t al,
uint64_t bh, uint64_t bl)
{
float128 ret = float128_add(make_float128(ah, al),
@@ -251,7 +254,7 @@ uint64_t HELPER(sdb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
}
/* 128-bit FP subtraction */
-uint64_t HELPER(sxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
+Int128 HELPER(sxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
uint64_t bh, uint64_t bl)
{
float128 ret = float128_sub(make_float128(ah, al),
@@ -278,7 +281,7 @@ uint64_t HELPER(ddb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
}
/* 128-bit FP division */
-uint64_t HELPER(dxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
+Int128 HELPER(dxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
uint64_t bh, uint64_t bl)
{
float128 ret = float128_div(make_float128(ah, al),
@@ -314,7 +317,7 @@ uint64_t HELPER(mdeb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
}
/* 128-bit FP multiplication */
-uint64_t HELPER(mxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
+Int128 HELPER(mxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
uint64_t bh, uint64_t bl)
{
float128 ret = float128_mul(make_float128(ah, al),
@@ -325,8 +328,7 @@ uint64_t HELPER(mxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
}
/* 128/64-bit FP multiplication */
-uint64_t HELPER(mxdb)(CPUS390XState *env, uint64_t ah, uint64_t al,
- uint64_t f2)
+Int128 HELPER(mxdb)(CPUS390XState *env, uint64_t ah, uint64_t al, uint64_t f2)
{
float128 ret = float64_to_float128(f2, &env->fpu_status);
ret = float128_mul(make_float128(ah, al), ret, &env->fpu_status);
@@ -355,7 +357,7 @@ uint64_t HELPER(ldxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
}
/* convert 64-bit float to 128-bit float */
-uint64_t HELPER(lxdb)(CPUS390XState *env, uint64_t f2)
+Int128 HELPER(lxdb)(CPUS390XState *env, uint64_t f2)
{
float128 ret = float64_to_float128(f2, &env->fpu_status);
handle_exceptions(env, false, GETPC());
@@ -363,7 +365,7 @@ uint64_t HELPER(lxdb)(CPUS390XState *env, uint64_t f2)
}
/* convert 32-bit float to 128-bit float */
-uint64_t HELPER(lxeb)(CPUS390XState *env, uint64_t f2)
+Int128 HELPER(lxeb)(CPUS390XState *env, uint64_t f2)
{
float128 ret = float32_to_float128(f2, &env->fpu_status);
handle_exceptions(env, false, GETPC());
@@ -486,7 +488,7 @@ uint64_t HELPER(cdgb)(CPUS390XState *env, int64_t v2, uint32_t m34)
}
/* convert 64-bit int to 128-bit float */
-uint64_t HELPER(cxgb)(CPUS390XState *env, int64_t v2, uint32_t m34)
+Int128 HELPER(cxgb)(CPUS390XState *env, int64_t v2, uint32_t m34)
{
int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
float128 ret = int64_to_float128(v2, &env->fpu_status);
@@ -519,7 +521,7 @@ uint64_t HELPER(cdlgb)(CPUS390XState *env, uint64_t v2, uint32_t m34)
}
/* convert 64-bit uint to 128-bit float */
-uint64_t HELPER(cxlgb)(CPUS390XState *env, uint64_t v2, uint32_t m34)
+Int128 HELPER(cxlgb)(CPUS390XState *env, uint64_t v2, uint32_t m34)
{
int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
float128 ret = uint64_to_float128(v2, &env->fpu_status);
@@ -748,8 +750,7 @@ uint64_t HELPER(fidb)(CPUS390XState *env, uint64_t f2, uint32_t m34)
}
/* round to integer 128-bit */
-uint64_t HELPER(fixb)(CPUS390XState *env, uint64_t ah, uint64_t al,
- uint32_t m34)
+Int128 HELPER(fixb)(CPUS390XState *env, uint64_t ah, uint64_t al, uint32_t m34)
{
int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
float128 ret = float128_round_to_int(make_float128(ah, al),
@@ -890,7 +891,7 @@ uint64_t HELPER(sqdb)(CPUS390XState *env, uint64_t f2)
}
/* square root 128-bit */
-uint64_t HELPER(sqxb)(CPUS390XState *env, uint64_t ah, uint64_t al)
+Int128 HELPER(sqxb)(CPUS390XState *env, uint64_t ah, uint64_t al)
{
float128 ret = float128_sqrt(make_float128(ah, al), &env->fpu_status);
handle_exceptions(env, false, GETPC());
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index 15c1a2067e..d1ffbb8710 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -1103,6 +1103,7 @@ typedef struct {
bool g_out, g_out2, g_in1, g_in2;
TCGv_i64 out, out2, in1, in2;
TCGv_i64 addr1;
+ TCGv_i128 out_128;
} DisasOps;
/* Instructions can place constraints on their operands, raising specification
@@ -1461,8 +1462,7 @@ static DisasJumpType op_adb(DisasContext *s, DisasOps *o)
static DisasJumpType op_axb(DisasContext *s, DisasOps *o)
{
- gen_helper_axb(o->out, cpu_env, o->out, o->out2, o->in1, o->in2);
- return_low128(o->out2);
+ gen_helper_axb(o->out_128, cpu_env, o->out, o->out2, o->in1, o->in2);
return DISAS_NEXT;
}
@@ -1995,9 +1995,8 @@ static DisasJumpType op_cxgb(DisasContext *s, DisasOps *o)
if (!m34) {
return DISAS_NORETURN;
}
- gen_helper_cxgb(o->out, cpu_env, o->in2, m34);
+ gen_helper_cxgb(o->out_128, cpu_env, o->in2, m34);
tcg_temp_free_i32(m34);
- return_low128(o->out2);
return DISAS_NEXT;
}
@@ -2032,7 +2031,7 @@ static DisasJumpType op_cxlgb(DisasContext *s, DisasOps *o)
if (!m34) {
return DISAS_NORETURN;
}
- gen_helper_cxlgb(o->out, cpu_env, o->in2, m34);
+ gen_helper_cxlgb(o->out_128, cpu_env, o->in2, m34);
tcg_temp_free_i32(m34);
return_low128(o->out2);
return DISAS_NEXT;
@@ -2449,8 +2448,7 @@ static DisasJumpType op_ddb(DisasContext *s, DisasOps *o)
static DisasJumpType op_dxb(DisasContext *s, DisasOps *o)
{
- gen_helper_dxb(o->out, cpu_env, o->out, o->out2, o->in1, o->in2);
- return_low128(o->out2);
+ gen_helper_dxb(o->out_128, cpu_env, o->out, o->out2, o->in1, o->in2);
return DISAS_NEXT;
}
@@ -2555,8 +2553,7 @@ static DisasJumpType op_fixb(DisasContext *s, DisasOps *o)
if (!m34) {
return DISAS_NORETURN;
}
- gen_helper_fixb(o->out, cpu_env, o->in1, o->in2, m34);
- return_low128(o->out2);
+ gen_helper_fixb(o->out_128, cpu_env, o->in1, o->in2, m34);
tcg_temp_free_i32(m34);
return DISAS_NEXT;
}
@@ -2868,14 +2865,13 @@ static DisasJumpType op_lexb(DisasContext *s, DisasOps *o)
static DisasJumpType op_lxdb(DisasContext *s, DisasOps *o)
{
- gen_helper_lxdb(o->out, cpu_env, o->in2);
- return_low128(o->out2);
+ gen_helper_lxdb(o->out_128, cpu_env, o->in2);
return DISAS_NEXT;
}
static DisasJumpType op_lxeb(DisasContext *s, DisasOps *o)
{
- gen_helper_lxeb(o->out, cpu_env, o->in2);
+ gen_helper_lxeb(o->out_128, cpu_env, o->in2);
return_low128(o->out2);
return DISAS_NEXT;
}
@@ -3590,15 +3586,13 @@ static DisasJumpType op_mdb(DisasContext *s, DisasOps *o)
static DisasJumpType op_mxb(DisasContext *s, DisasOps *o)
{
- gen_helper_mxb(o->out, cpu_env, o->out, o->out2, o->in1, o->in2);
- return_low128(o->out2);
+ gen_helper_mxb(o->out_128, cpu_env, o->out, o->out2, o->in1, o->in2);
return DISAS_NEXT;
}
static DisasJumpType op_mxdb(DisasContext *s, DisasOps *o)
{
- gen_helper_mxdb(o->out, cpu_env, o->out, o->out2, o->in2);
- return_low128(o->out2);
+ gen_helper_mxdb(o->out_128, cpu_env, o->out, o->out2, o->in2);
return DISAS_NEXT;
}
@@ -4063,8 +4057,7 @@ static DisasJumpType op_sdb(DisasContext *s, DisasOps *o)
static DisasJumpType op_sxb(DisasContext *s, DisasOps *o)
{
- gen_helper_sxb(o->out, cpu_env, o->out, o->out2, o->in1, o->in2);
- return_low128(o->out2);
+ gen_helper_sxb(o->out_128, cpu_env, o->out, o->out2, o->in1, o->in2);
return DISAS_NEXT;
}
@@ -4082,8 +4075,7 @@ static DisasJumpType op_sqdb(DisasContext *s, DisasOps *o)
static DisasJumpType op_sqxb(DisasContext *s, DisasOps *o)
{
- gen_helper_sqxb(o->out, cpu_env, o->in1, o->in2);
- return_low128(o->out2);
+ gen_helper_sqxb(o->out_128, cpu_env, o->in1, o->in2);
return DISAS_NEXT;
}
@@ -5395,6 +5387,14 @@ static void prep_new_P(DisasContext *s, DisasOps *o)
}
#define SPEC_prep_new_P 0
+static void prep_new_x(DisasContext *s, DisasOps *o)
+{
+ o->out = tcg_temp_new_i64();
+ o->out2 = tcg_temp_new_i64();
+ o->out_128 = tcg_temp_new_i128();
+}
+#define SPEC_prep_new_x 0
+
static void prep_r1(DisasContext *s, DisasOps *o)
{
o->out = regs[get_field(s, r1)];
@@ -5411,11 +5411,12 @@ static void prep_r1_P(DisasContext *s, DisasOps *o)
}
#define SPEC_prep_r1_P SPEC_r1_even
-/* Whenever we need x1 in addition to other inputs, we'll load it to out/out2 */
static void prep_x1(DisasContext *s, DisasOps *o)
{
o->out = load_freg(get_field(s, r1));
o->out2 = load_freg(get_field(s, r1) + 2);
+ o->out_128 = tcg_temp_new_i128();
+ tcg_gen_concat_i64_i128(o->out_128, o->out2, o->out);
}
#define SPEC_prep_x1 SPEC_r1_f128
@@ -5513,6 +5514,8 @@ static void wout_f1(DisasContext *s, DisasOps *o)
static void wout_x1(DisasContext *s, DisasOps *o)
{
int f1 = get_field(s, r1);
+
+ tcg_gen_extr_i128_i64(o->out2, o->out, o->out_128);
store_freg(f1, o->out);
store_freg(f1 + 2, o->out2);
}
@@ -6582,7 +6585,9 @@ static DisasJumpType translate_one(CPUS390XState *env, DisasContext *s)
if (o.addr1) {
tcg_temp_free_i64(o.addr1);
}
-
+ if (o.out_128) {
+ tcg_temp_free_i128(o.out_128);
+ }
/* io should be the last instruction in tb when icount is enabled */
if (unlikely(icount && ret == DISAS_NEXT)) {
ret = DISAS_TOO_MANY;
diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def
index 83e246ab16..20bf20c766 100644
--- a/target/s390x/tcg/insn-data.def
+++ b/target/s390x/tcg/insn-data.def
@@ -306,10 +306,10 @@
/* CONVERT FROM FIXED */
F(0xb394, CEFBR, RRF_e, Z, 0, r2_32s, new, e1, cegb, 0, IF_BFP)
F(0xb395, CDFBR, RRF_e, Z, 0, r2_32s, new, f1, cdgb, 0, IF_BFP)
- F(0xb396, CXFBR, RRF_e, Z, 0, r2_32s, new_P, x1, cxgb, 0, IF_BFP)
+ F(0xb396, CXFBR, RRF_e, Z, 0, r2_32s, new_x, x1, cxgb, 0, IF_BFP)
F(0xb3a4, CEGBR, RRF_e, Z, 0, r2_o, new, e1, cegb, 0, IF_BFP)
F(0xb3a5, CDGBR, RRF_e, Z, 0, r2_o, new, f1, cdgb, 0, IF_BFP)
- F(0xb3a6, CXGBR, RRF_e, Z, 0, r2_o, new_P, x1, cxgb, 0, IF_BFP)
+ F(0xb3a6, CXGBR, RRF_e, Z, 0, r2_o, new_x, x1, cxgb, 0, IF_BFP)
/* CONVERT TO LOGICAL */
F(0xb39c, CLFEBR, RRF_e, FPE, 0, e2, new, r1_32, clfeb, 0, IF_BFP)
F(0xb39d, CLFDBR, RRF_e, FPE, 0, f2, new, r1_32, clfdb, 0, IF_BFP)
@@ -320,10 +320,10 @@
/* CONVERT FROM LOGICAL */
F(0xb390, CELFBR, RRF_e, FPE, 0, r2_32u, new, e1, celgb, 0, IF_BFP)
F(0xb391, CDLFBR, RRF_e, FPE, 0, r2_32u, new, f1, cdlgb, 0, IF_BFP)
- F(0xb392, CXLFBR, RRF_e, FPE, 0, r2_32u, new_P, x1, cxlgb, 0, IF_BFP)
+ F(0xb392, CXLFBR, RRF_e, FPE, 0, r2_32u, new_x, x1, cxlgb, 0, IF_BFP)
F(0xb3a0, CELGBR, RRF_e, FPE, 0, r2_o, new, e1, celgb, 0, IF_BFP)
F(0xb3a1, CDLGBR, RRF_e, FPE, 0, r2_o, new, f1, cdlgb, 0, IF_BFP)
- F(0xb3a2, CXLGBR, RRF_e, FPE, 0, r2_o, new_P, x1, cxlgb, 0, IF_BFP)
+ F(0xb3a2, CXLGBR, RRF_e, FPE, 0, r2_o, new_x, x1, cxlgb, 0, IF_BFP)
/* CONVERT UTF-8 TO UTF-16 */
D(0xb2a7, CU12, RRF_c, Z, 0, 0, 0, 0, cuXX, 0, 12)
@@ -597,15 +597,15 @@
/* LOAD FP INTEGER */
F(0xb357, FIEBR, RRF_e, Z, 0, e2, new, e1, fieb, 0, IF_BFP)
F(0xb35f, FIDBR, RRF_e, Z, 0, f2, new, f1, fidb, 0, IF_BFP)
- F(0xb347, FIXBR, RRF_e, Z, x2h, x2l, new_P, x1, fixb, 0, IF_BFP)
+ F(0xb347, FIXBR, RRF_e, Z, x2h, x2l, new_x, x1, fixb, 0, IF_BFP)
/* LOAD LENGTHENED */
F(0xb304, LDEBR, RRE, Z, 0, e2, new, f1, ldeb, 0, IF_BFP)
- F(0xb305, LXDBR, RRE, Z, 0, f2, new_P, x1, lxdb, 0, IF_BFP)
- F(0xb306, LXEBR, RRE, Z, 0, e2, new_P, x1, lxeb, 0, IF_BFP)
+ F(0xb305, LXDBR, RRE, Z, 0, f2, new_x, x1, lxdb, 0, IF_BFP)
+ F(0xb306, LXEBR, RRE, Z, 0, e2, new_x, x1, lxeb, 0, IF_BFP)
F(0xed04, LDEB, RXE, Z, 0, m2_32u, new, f1, ldeb, 0, IF_BFP)
- F(0xed05, LXDB, RXE, Z, 0, m2_64, new_P, x1, lxdb, 0, IF_BFP)
- F(0xed06, LXEB, RXE, Z, 0, m2_32u, new_P, x1, lxeb, 0, IF_BFP)
+ F(0xed05, LXDB, RXE, Z, 0, m2_64, new_x, x1, lxdb, 0, IF_BFP)
+ F(0xed06, LXEB, RXE, Z, 0, m2_32u, new_x, x1, lxeb, 0, IF_BFP)
F(0xb324, LDER, RXE, Z, 0, e2, new, f1, lde, 0, IF_AFP1)
F(0xed24, LDE, RXE, Z, 0, m2_32u, new, f1, lde, 0, IF_AFP1)
/* LOAD ROUNDED */
@@ -835,7 +835,7 @@
/* SQUARE ROOT */
F(0xb314, SQEBR, RRE, Z, 0, e2, new, e1, sqeb, 0, IF_BFP)
F(0xb315, SQDBR, RRE, Z, 0, f2, new, f1, sqdb, 0, IF_BFP)
- F(0xb316, SQXBR, RRE, Z, x2h, x2l, new_P, x1, sqxb, 0, IF_BFP)
+ F(0xb316, SQXBR, RRE, Z, x2h, x2l, new_x, x1, sqxb, 0, IF_BFP)
F(0xed14, SQEB, RXE, Z, 0, m2_32u, new, e1, sqeb, 0, IF_BFP)
F(0xed15, SQDB, RXE, Z, 0, m2_64, new, f1, sqdb, 0, IF_BFP)
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH 8/9] target/s390x: Use Int128 for returning float128
2022-10-21 7:30 ` [PATCH 8/9] target/s390x: Use Int128 for returning float128 Richard Henderson
@ 2022-10-25 9:58 ` Philippe Mathieu-Daudé
2022-10-27 11:18 ` Ilya Leoshkevich
1 sibling, 0 replies; 35+ messages in thread
From: Philippe Mathieu-Daudé @ 2022-10-25 9:58 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: qemu-s390x
On 21/10/22 09:30, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/s390x/helper.h | 22 +++++++--------
> target/s390x/tcg/fpu_helper.c | 29 ++++++++++----------
> target/s390x/tcg/translate.c | 49 +++++++++++++++++++---------------
> target/s390x/tcg/insn-data.def | 20 +++++++-------
> 4 files changed, 63 insertions(+), 57 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 8/9] target/s390x: Use Int128 for returning float128
2022-10-21 7:30 ` [PATCH 8/9] target/s390x: Use Int128 for returning float128 Richard Henderson
2022-10-25 9:58 ` Philippe Mathieu-Daudé
@ 2022-10-27 11:18 ` Ilya Leoshkevich
2022-10-27 11:31 ` Richard Henderson
1 sibling, 1 reply; 35+ messages in thread
From: Ilya Leoshkevich @ 2022-10-27 11:18 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: qemu-s390x
On Fri, Oct 21, 2022 at 05:30:05PM +1000, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/s390x/helper.h | 22 +++++++--------
> target/s390x/tcg/fpu_helper.c | 29 ++++++++++----------
> target/s390x/tcg/translate.c | 49 +++++++++++++++++++---------------
> target/s390x/tcg/insn-data.def | 20 +++++++-------
> 4 files changed, 63 insertions(+), 57 deletions(-)
>
> @@ -2032,7 +2031,7 @@ static DisasJumpType op_cxlgb(DisasContext *s, DisasOps *o)
> if (!m34) {
> return DISAS_NORETURN;
> }
> - gen_helper_cxlgb(o->out, cpu_env, o->in2, m34);
> + gen_helper_cxlgb(o->out_128, cpu_env, o->in2, m34);
> tcg_temp_free_i32(m34);
> return_low128(o->out2);
> return DISAS_NEXT;
Do we still need return_low128() here?
> static DisasJumpType op_lxeb(DisasContext *s, DisasOps *o)
> {
> - gen_helper_lxeb(o->out, cpu_env, o->in2);
> + gen_helper_lxeb(o->out_128, cpu_env, o->in2);
> return_low128(o->out2);
> return DISAS_NEXT;
> }
Same question.
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 8/9] target/s390x: Use Int128 for returning float128
2022-10-27 11:18 ` Ilya Leoshkevich
@ 2022-10-27 11:31 ` Richard Henderson
0 siblings, 0 replies; 35+ messages in thread
From: Richard Henderson @ 2022-10-27 11:31 UTC (permalink / raw)
To: Ilya Leoshkevich, qemu-devel; +Cc: qemu-s390x
On 10/27/22 21:18, Ilya Leoshkevich wrote:
> On Fri, Oct 21, 2022 at 05:30:05PM +1000, Richard Henderson wrote:
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>> target/s390x/helper.h | 22 +++++++--------
>> target/s390x/tcg/fpu_helper.c | 29 ++++++++++----------
>> target/s390x/tcg/translate.c | 49 +++++++++++++++++++---------------
>> target/s390x/tcg/insn-data.def | 20 +++++++-------
>> 4 files changed, 63 insertions(+), 57 deletions(-)
>>
>
>> @@ -2032,7 +2031,7 @@ static DisasJumpType op_cxlgb(DisasContext *s, DisasOps *o)
>> if (!m34) {
>> return DISAS_NORETURN;
>> }
>> - gen_helper_cxlgb(o->out, cpu_env, o->in2, m34);
>> + gen_helper_cxlgb(o->out_128, cpu_env, o->in2, m34);
>> tcg_temp_free_i32(m34);
>> return_low128(o->out2);
>> return DISAS_NEXT;
>
> Do we still need return_low128() here?
>
>> static DisasJumpType op_lxeb(DisasContext *s, DisasOps *o)
>> {
>> - gen_helper_lxeb(o->out, cpu_env, o->in2);
>> + gen_helper_lxeb(o->out_128, cpu_env, o->in2);
>> return_low128(o->out2);
>> return DISAS_NEXT;
>> }
>
> Same question.
Nope, that should have gone away. Oops.
r~
^ permalink raw reply [flat|nested] 35+ messages in thread
* [PATCH 9/9] target/s390x: Use Int128 for passing float128
2022-10-21 7:29 [PATCH 0/9] target/s390x: Use Int128 for float128 and retxl Richard Henderson
` (7 preceding siblings ...)
2022-10-21 7:30 ` [PATCH 8/9] target/s390x: Use Int128 for returning float128 Richard Henderson
@ 2022-10-21 7:30 ` Richard Henderson
2022-10-24 18:01 ` Philippe Mathieu-Daudé
2022-11-02 9:38 ` Ilya Leoshkevich
8 siblings, 2 replies; 35+ messages in thread
From: Richard Henderson @ 2022-10-21 7:30 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-s390x
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/s390x/helper.h | 32 ++++++-------
target/s390x/tcg/fpu_helper.c | 88 ++++++++++++++--------------------
target/s390x/tcg/translate.c | 76 ++++++++++++++++++++---------
target/s390x/tcg/insn-data.def | 30 ++++++------
4 files changed, 121 insertions(+), 105 deletions(-)
diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 429131a85e..481b9019f9 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -41,55 +41,55 @@ DEF_HELPER_4(csst, i32, env, i32, i64, i64)
DEF_HELPER_4(csst_parallel, i32, env, i32, i64, i64)
DEF_HELPER_FLAGS_3(aeb, TCG_CALL_NO_WG, i64, env, i64, i64)
DEF_HELPER_FLAGS_3(adb, TCG_CALL_NO_WG, i64, env, i64, i64)
-DEF_HELPER_FLAGS_5(axb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64)
+DEF_HELPER_FLAGS_3(axb, TCG_CALL_NO_WG, i128, env, i128, i128)
DEF_HELPER_FLAGS_3(seb, TCG_CALL_NO_WG, i64, env, i64, i64)
DEF_HELPER_FLAGS_3(sdb, TCG_CALL_NO_WG, i64, env, i64, i64)
-DEF_HELPER_FLAGS_5(sxb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64)
+DEF_HELPER_FLAGS_3(sxb, TCG_CALL_NO_WG, i128, env, i128, i128)
DEF_HELPER_FLAGS_3(deb, TCG_CALL_NO_WG, i64, env, i64, i64)
DEF_HELPER_FLAGS_3(ddb, TCG_CALL_NO_WG, i64, env, i64, i64)
-DEF_HELPER_FLAGS_5(dxb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64)
+DEF_HELPER_FLAGS_3(dxb, TCG_CALL_NO_WG, i128, env, i128, i128)
DEF_HELPER_FLAGS_3(meeb, TCG_CALL_NO_WG, i64, env, i64, i64)
DEF_HELPER_FLAGS_3(mdeb, TCG_CALL_NO_WG, i64, env, i64, i64)
DEF_HELPER_FLAGS_3(mdb, TCG_CALL_NO_WG, i64, env, i64, i64)
-DEF_HELPER_FLAGS_5(mxb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64)
-DEF_HELPER_FLAGS_4(mxdb, TCG_CALL_NO_WG, i128, env, i64, i64, i64)
+DEF_HELPER_FLAGS_3(mxb, TCG_CALL_NO_WG, i128, env, i128, i128)
+DEF_HELPER_FLAGS_3(mxdb, TCG_CALL_NO_WG, i128, env, i128, i64)
DEF_HELPER_FLAGS_2(ldeb, TCG_CALL_NO_WG, i64, env, i64)
-DEF_HELPER_FLAGS_4(ldxb, TCG_CALL_NO_WG, i64, env, i64, i64, i32)
+DEF_HELPER_FLAGS_3(ldxb, TCG_CALL_NO_WG, i64, env, i128, i32)
DEF_HELPER_FLAGS_2(lxdb, TCG_CALL_NO_WG, i128, env, i64)
DEF_HELPER_FLAGS_2(lxeb, TCG_CALL_NO_WG, i128, env, i64)
DEF_HELPER_FLAGS_3(ledb, TCG_CALL_NO_WG, i64, env, i64, i32)
-DEF_HELPER_FLAGS_4(lexb, TCG_CALL_NO_WG, i64, env, i64, i64, i32)
+DEF_HELPER_FLAGS_3(lexb, TCG_CALL_NO_WG, i64, env, i128, i32)
DEF_HELPER_FLAGS_3(ceb, TCG_CALL_NO_WG_SE, i32, env, i64, i64)
DEF_HELPER_FLAGS_3(cdb, TCG_CALL_NO_WG_SE, i32, env, i64, i64)
-DEF_HELPER_FLAGS_5(cxb, TCG_CALL_NO_WG_SE, i32, env, i64, i64, i64, i64)
+DEF_HELPER_FLAGS_3(cxb, TCG_CALL_NO_WG_SE, i32, env, i128, i128)
DEF_HELPER_FLAGS_3(keb, TCG_CALL_NO_WG, i32, env, i64, i64)
DEF_HELPER_FLAGS_3(kdb, TCG_CALL_NO_WG, i32, env, i64, i64)
-DEF_HELPER_FLAGS_5(kxb, TCG_CALL_NO_WG, i32, env, i64, i64, i64, i64)
+DEF_HELPER_FLAGS_3(kxb, TCG_CALL_NO_WG, i32, env, i128, i128)
DEF_HELPER_3(cgeb, i64, env, i64, i32)
DEF_HELPER_3(cgdb, i64, env, i64, i32)
-DEF_HELPER_4(cgxb, i64, env, i64, i64, i32)
+DEF_HELPER_3(cgxb, i64, env, i128, i32)
DEF_HELPER_3(cfeb, i64, env, i64, i32)
DEF_HELPER_3(cfdb, i64, env, i64, i32)
-DEF_HELPER_4(cfxb, i64, env, i64, i64, i32)
+DEF_HELPER_3(cfxb, i64, env, i128, i32)
DEF_HELPER_3(clgeb, i64, env, i64, i32)
DEF_HELPER_3(clgdb, i64, env, i64, i32)
-DEF_HELPER_4(clgxb, i64, env, i64, i64, i32)
+DEF_HELPER_3(clgxb, i64, env, i128, i32)
DEF_HELPER_3(clfeb, i64, env, i64, i32)
DEF_HELPER_3(clfdb, i64, env, i64, i32)
-DEF_HELPER_4(clfxb, i64, env, i64, i64, i32)
+DEF_HELPER_3(clfxb, i64, env, i128, i32)
DEF_HELPER_FLAGS_3(fieb, TCG_CALL_NO_WG, i64, env, i64, i32)
DEF_HELPER_FLAGS_3(fidb, TCG_CALL_NO_WG, i64, env, i64, i32)
-DEF_HELPER_FLAGS_4(fixb, TCG_CALL_NO_WG, i128, env, i64, i64, i32)
+DEF_HELPER_FLAGS_3(fixb, TCG_CALL_NO_WG, i128, env, i128, i32)
DEF_HELPER_FLAGS_4(maeb, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
DEF_HELPER_FLAGS_4(madb, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
DEF_HELPER_FLAGS_4(mseb, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
DEF_HELPER_FLAGS_4(msdb, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
DEF_HELPER_FLAGS_3(tceb, TCG_CALL_NO_RWG_SE, i32, env, i64, i64)
DEF_HELPER_FLAGS_3(tcdb, TCG_CALL_NO_RWG_SE, i32, env, i64, i64)
-DEF_HELPER_FLAGS_4(tcxb, TCG_CALL_NO_RWG_SE, i32, env, i64, i64, i64)
+DEF_HELPER_FLAGS_3(tcxb, TCG_CALL_NO_RWG_SE, i32, env, i128, i64)
DEF_HELPER_FLAGS_2(sqeb, TCG_CALL_NO_WG, i64, env, i64)
DEF_HELPER_FLAGS_2(sqdb, TCG_CALL_NO_WG, i64, env, i64)
-DEF_HELPER_FLAGS_3(sqxb, TCG_CALL_NO_WG, i128, env, i64, i64)
+DEF_HELPER_FLAGS_2(sqxb, TCG_CALL_NO_WG, i128, env, i128)
DEF_HELPER_FLAGS_1(cvd, TCG_CALL_NO_RWG_SE, i64, s32)
DEF_HELPER_FLAGS_4(pack, TCG_CALL_NO_WG, void, env, i32, i64, i64)
DEF_HELPER_FLAGS_4(pka, TCG_CALL_NO_WG, void, env, i64, i64, i32)
diff --git a/target/s390x/tcg/fpu_helper.c b/target/s390x/tcg/fpu_helper.c
index a584794be6..5a322e3f87 100644
--- a/target/s390x/tcg/fpu_helper.c
+++ b/target/s390x/tcg/fpu_helper.c
@@ -39,6 +39,11 @@ static inline Int128 RET128(float128 f)
return int128_make128(f.low, f.high);
}
+static inline float128 ARG128(Int128 i)
+{
+ return make_float128(int128_gethi(i), int128_getlo(i));
+}
+
uint8_t s390_softfloat_exc_to_ieee(unsigned int exc)
{
uint8_t s390_exc = 0;
@@ -227,12 +232,9 @@ uint64_t HELPER(adb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
}
/* 128-bit FP addition */
-Int128 HELPER(axb)(CPUS390XState *env, uint64_t ah, uint64_t al,
- uint64_t bh, uint64_t bl)
+Int128 HELPER(axb)(CPUS390XState *env, Int128 a, Int128 b)
{
- float128 ret = float128_add(make_float128(ah, al),
- make_float128(bh, bl),
- &env->fpu_status);
+ float128 ret = float128_add(ARG128(a), ARG128(b), &env->fpu_status);
handle_exceptions(env, false, GETPC());
return RET128(ret);
}
@@ -254,12 +256,9 @@ uint64_t HELPER(sdb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
}
/* 128-bit FP subtraction */
-Int128 HELPER(sxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
- uint64_t bh, uint64_t bl)
+Int128 HELPER(sxb)(CPUS390XState *env, Int128 a, Int128 b)
{
- float128 ret = float128_sub(make_float128(ah, al),
- make_float128(bh, bl),
- &env->fpu_status);
+ float128 ret = float128_sub(ARG128(a), ARG128(b), &env->fpu_status);
handle_exceptions(env, false, GETPC());
return RET128(ret);
}
@@ -281,12 +280,9 @@ uint64_t HELPER(ddb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
}
/* 128-bit FP division */
-Int128 HELPER(dxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
- uint64_t bh, uint64_t bl)
+Int128 HELPER(dxb)(CPUS390XState *env, Int128 a, Int128 b)
{
- float128 ret = float128_div(make_float128(ah, al),
- make_float128(bh, bl),
- &env->fpu_status);
+ float128 ret = float128_div(ARG128(a), ARG128(b), &env->fpu_status);
handle_exceptions(env, false, GETPC());
return RET128(ret);
}
@@ -317,21 +313,18 @@ uint64_t HELPER(mdeb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
}
/* 128-bit FP multiplication */
-Int128 HELPER(mxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
- uint64_t bh, uint64_t bl)
+Int128 HELPER(mxb)(CPUS390XState *env, Int128 a, Int128 b)
{
- float128 ret = float128_mul(make_float128(ah, al),
- make_float128(bh, bl),
- &env->fpu_status);
+ float128 ret = float128_mul(ARG128(a), ARG128(b), &env->fpu_status);
handle_exceptions(env, false, GETPC());
return RET128(ret);
}
/* 128/64-bit FP multiplication */
-Int128 HELPER(mxdb)(CPUS390XState *env, uint64_t ah, uint64_t al, uint64_t f2)
+Int128 HELPER(mxdb)(CPUS390XState *env, Int128 a, uint64_t f2)
{
float128 ret = float64_to_float128(f2, &env->fpu_status);
- ret = float128_mul(make_float128(ah, al), ret, &env->fpu_status);
+ ret = float128_mul(ARG128(a), ret, &env->fpu_status);
handle_exceptions(env, false, GETPC());
return RET128(ret);
}
@@ -345,11 +338,10 @@ uint64_t HELPER(ldeb)(CPUS390XState *env, uint64_t f2)
}
/* convert 128-bit float to 64-bit float */
-uint64_t HELPER(ldxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
- uint32_t m34)
+uint64_t HELPER(ldxb)(CPUS390XState *env, Int128 a, uint32_t m34)
{
int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
- float64 ret = float128_to_float64(make_float128(ah, al), &env->fpu_status);
+ float64 ret = float128_to_float64(ARG128(a), &env->fpu_status);
s390_restore_bfp_rounding_mode(env, old_mode);
handle_exceptions(env, xxc_from_m34(m34), GETPC());
@@ -384,11 +376,10 @@ uint64_t HELPER(ledb)(CPUS390XState *env, uint64_t f2, uint32_t m34)
}
/* convert 128-bit float to 32-bit float */
-uint64_t HELPER(lexb)(CPUS390XState *env, uint64_t ah, uint64_t al,
- uint32_t m34)
+uint64_t HELPER(lexb)(CPUS390XState *env, Int128 a, uint32_t m34)
{
int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
- float32 ret = float128_to_float32(make_float128(ah, al), &env->fpu_status);
+ float32 ret = float128_to_float32(ARG128(a), &env->fpu_status);
s390_restore_bfp_rounding_mode(env, old_mode);
handle_exceptions(env, xxc_from_m34(m34), GETPC());
@@ -412,11 +403,9 @@ uint32_t HELPER(cdb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
}
/* 128-bit FP compare */
-uint32_t HELPER(cxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
- uint64_t bh, uint64_t bl)
+uint32_t HELPER(cxb)(CPUS390XState *env, Int128 a, Int128 b)
{
- FloatRelation cmp = float128_compare_quiet(make_float128(ah, al),
- make_float128(bh, bl),
+ FloatRelation cmp = float128_compare_quiet(ARG128(a), ARG128(b),
&env->fpu_status);
handle_exceptions(env, false, GETPC());
return float_comp_to_cc(env, cmp);
@@ -564,10 +553,10 @@ uint64_t HELPER(cgdb)(CPUS390XState *env, uint64_t v2, uint32_t m34)
}
/* convert 128-bit float to 64-bit int */
-uint64_t HELPER(cgxb)(CPUS390XState *env, uint64_t h, uint64_t l, uint32_t m34)
+uint64_t HELPER(cgxb)(CPUS390XState *env, Int128 i2, uint32_t m34)
{
int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
- float128 v2 = make_float128(h, l);
+ float128 v2 = ARG128(i2);
int64_t ret = float128_to_int64(v2, &env->fpu_status);
uint32_t cc = set_cc_conv_f128(v2, &env->fpu_status);
@@ -613,10 +602,10 @@ uint64_t HELPER(cfdb)(CPUS390XState *env, uint64_t v2, uint32_t m34)
}
/* convert 128-bit float to 32-bit int */
-uint64_t HELPER(cfxb)(CPUS390XState *env, uint64_t h, uint64_t l, uint32_t m34)
+uint64_t HELPER(cfxb)(CPUS390XState *env, Int128 i2, uint32_t m34)
{
int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
- float128 v2 = make_float128(h, l);
+ float128 v2 = ARG128(i2);
int32_t ret = float128_to_int32(v2, &env->fpu_status);
uint32_t cc = set_cc_conv_f128(v2, &env->fpu_status);
@@ -662,10 +651,10 @@ uint64_t HELPER(clgdb)(CPUS390XState *env, uint64_t v2, uint32_t m34)
}
/* convert 128-bit float to 64-bit uint */
-uint64_t HELPER(clgxb)(CPUS390XState *env, uint64_t h, uint64_t l, uint32_t m34)
+uint64_t HELPER(clgxb)(CPUS390XState *env, Int128 i2, uint32_t m34)
{
int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
- float128 v2 = make_float128(h, l);
+ float128 v2 = ARG128(i2);
uint64_t ret = float128_to_uint64(v2, &env->fpu_status);
uint32_t cc = set_cc_conv_f128(v2, &env->fpu_status);
@@ -711,10 +700,10 @@ uint64_t HELPER(clfdb)(CPUS390XState *env, uint64_t v2, uint32_t m34)
}
/* convert 128-bit float to 32-bit uint */
-uint64_t HELPER(clfxb)(CPUS390XState *env, uint64_t h, uint64_t l, uint32_t m34)
+uint64_t HELPER(clfxb)(CPUS390XState *env, Int128 i2, uint32_t m34)
{
int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
- float128 v2 = make_float128(h, l);
+ float128 v2 = ARG128(i2);
uint32_t ret = float128_to_uint32(v2, &env->fpu_status);
uint32_t cc = set_cc_conv_f128(v2, &env->fpu_status);
@@ -750,11 +739,10 @@ uint64_t HELPER(fidb)(CPUS390XState *env, uint64_t f2, uint32_t m34)
}
/* round to integer 128-bit */
-Int128 HELPER(fixb)(CPUS390XState *env, uint64_t ah, uint64_t al, uint32_t m34)
+Int128 HELPER(fixb)(CPUS390XState *env, Int128 a, uint32_t m34)
{
int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
- float128 ret = float128_round_to_int(make_float128(ah, al),
- &env->fpu_status);
+ float128 ret = float128_round_to_int(ARG128(a), &env->fpu_status);
s390_restore_bfp_rounding_mode(env, old_mode);
handle_exceptions(env, xxc_from_m34(m34), GETPC());
@@ -778,11 +766,9 @@ uint32_t HELPER(kdb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
}
/* 128-bit FP compare and signal */
-uint32_t HELPER(kxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
- uint64_t bh, uint64_t bl)
+uint32_t HELPER(kxb)(CPUS390XState *env, Int128 a, Int128 b)
{
- FloatRelation cmp = float128_compare(make_float128(ah, al),
- make_float128(bh, bl),
+ FloatRelation cmp = float128_compare(ARG128(a), ARG128(b),
&env->fpu_status);
handle_exceptions(env, false, GETPC());
return float_comp_to_cc(env, cmp);
@@ -869,9 +855,9 @@ uint32_t HELPER(tcdb)(CPUS390XState *env, uint64_t v1, uint64_t m2)
}
/* test data class 128-bit */
-uint32_t HELPER(tcxb)(CPUS390XState *env, uint64_t ah, uint64_t al, uint64_t m2)
+uint32_t HELPER(tcxb)(CPUS390XState *env, Int128 a, uint64_t m2)
{
- return (m2 & float128_dcmask(env, make_float128(ah, al))) != 0;
+ return (m2 & float128_dcmask(env, ARG128(a))) != 0;
}
/* square root 32-bit */
@@ -891,9 +877,9 @@ uint64_t HELPER(sqdb)(CPUS390XState *env, uint64_t f2)
}
/* square root 128-bit */
-Int128 HELPER(sqxb)(CPUS390XState *env, uint64_t ah, uint64_t al)
+Int128 HELPER(sqxb)(CPUS390XState *env, Int128 a)
{
- float128 ret = float128_sqrt(make_float128(ah, al), &env->fpu_status);
+ float128 ret = float128_sqrt(ARG128(a), &env->fpu_status);
handle_exceptions(env, false, GETPC());
return RET128(ret);
}
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index d1ffbb8710..8023bbab2f 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -305,6 +305,18 @@ static TCGv_i64 load_freg32_i64(int reg)
return r;
}
+static TCGv_i128 load_freg_128(int reg)
+{
+ TCGv_i64 h = load_freg(reg);
+ TCGv_i64 l = load_freg(reg + 2);
+ TCGv_i128 r = tcg_temp_new_i128();
+
+ tcg_gen_concat_i64_i128(r, l, h);
+ tcg_temp_free_i64(h);
+ tcg_temp_free_i64(l);
+ return r;
+}
+
static void store_reg(int reg, TCGv_i64 v)
{
tcg_gen_mov_i64(regs[reg], v);
@@ -1103,7 +1115,7 @@ typedef struct {
bool g_out, g_out2, g_in1, g_in2;
TCGv_i64 out, out2, in1, in2;
TCGv_i64 addr1;
- TCGv_i128 out_128;
+ TCGv_i128 out_128, in1_128, in2_128;
} DisasOps;
/* Instructions can place constraints on their operands, raising specification
@@ -1462,7 +1474,7 @@ static DisasJumpType op_adb(DisasContext *s, DisasOps *o)
static DisasJumpType op_axb(DisasContext *s, DisasOps *o)
{
- gen_helper_axb(o->out_128, cpu_env, o->out, o->out2, o->in1, o->in2);
+ gen_helper_axb(o->out_128, cpu_env, o->in1_128, o->in2_128);
return DISAS_NEXT;
}
@@ -1778,7 +1790,7 @@ static DisasJumpType op_cdb(DisasContext *s, DisasOps *o)
static DisasJumpType op_cxb(DisasContext *s, DisasOps *o)
{
- gen_helper_cxb(cc_op, cpu_env, o->out, o->out2, o->in1, o->in2);
+ gen_helper_cxb(cc_op, cpu_env, o->in1_128, o->in2_128);
set_cc_static(s);
return DISAS_NEXT;
}
@@ -1841,7 +1853,7 @@ static DisasJumpType op_cfxb(DisasContext *s, DisasOps *o)
if (!m34) {
return DISAS_NORETURN;
}
- gen_helper_cfxb(o->out, cpu_env, o->in1, o->in2, m34);
+ gen_helper_cfxb(o->out, cpu_env, o->in2_128, m34);
tcg_temp_free_i32(m34);
set_cc_static(s);
return DISAS_NEXT;
@@ -1880,7 +1892,7 @@ static DisasJumpType op_cgxb(DisasContext *s, DisasOps *o)
if (!m34) {
return DISAS_NORETURN;
}
- gen_helper_cgxb(o->out, cpu_env, o->in1, o->in2, m34);
+ gen_helper_cgxb(o->out, cpu_env, o->in2_128, m34);
tcg_temp_free_i32(m34);
set_cc_static(s);
return DISAS_NEXT;
@@ -1919,7 +1931,7 @@ static DisasJumpType op_clfxb(DisasContext *s, DisasOps *o)
if (!m34) {
return DISAS_NORETURN;
}
- gen_helper_clfxb(o->out, cpu_env, o->in1, o->in2, m34);
+ gen_helper_clfxb(o->out, cpu_env, o->in2_128, m34);
tcg_temp_free_i32(m34);
set_cc_static(s);
return DISAS_NEXT;
@@ -1958,7 +1970,7 @@ static DisasJumpType op_clgxb(DisasContext *s, DisasOps *o)
if (!m34) {
return DISAS_NORETURN;
}
- gen_helper_clgxb(o->out, cpu_env, o->in1, o->in2, m34);
+ gen_helper_clgxb(o->out, cpu_env, o->in2_128, m34);
tcg_temp_free_i32(m34);
set_cc_static(s);
return DISAS_NEXT;
@@ -2448,7 +2460,7 @@ static DisasJumpType op_ddb(DisasContext *s, DisasOps *o)
static DisasJumpType op_dxb(DisasContext *s, DisasOps *o)
{
- gen_helper_dxb(o->out_128, cpu_env, o->out, o->out2, o->in1, o->in2);
+ gen_helper_dxb(o->out_128, cpu_env, o->in1_128, o->in2_128);
return DISAS_NEXT;
}
@@ -2553,7 +2565,7 @@ static DisasJumpType op_fixb(DisasContext *s, DisasOps *o)
if (!m34) {
return DISAS_NORETURN;
}
- gen_helper_fixb(o->out_128, cpu_env, o->in1, o->in2, m34);
+ gen_helper_fixb(o->out_128, cpu_env, o->in2_128, m34);
tcg_temp_free_i32(m34);
return DISAS_NEXT;
}
@@ -2772,7 +2784,7 @@ static DisasJumpType op_kdb(DisasContext *s, DisasOps *o)
static DisasJumpType op_kxb(DisasContext *s, DisasOps *o)
{
- gen_helper_kxb(cc_op, cpu_env, o->out, o->out2, o->in1, o->in2);
+ gen_helper_kxb(cc_op, cpu_env, o->in1_128, o->in2_128);
set_cc_static(s);
return DISAS_NEXT;
}
@@ -2846,7 +2858,7 @@ static DisasJumpType op_ldxb(DisasContext *s, DisasOps *o)
if (!m34) {
return DISAS_NORETURN;
}
- gen_helper_ldxb(o->out, cpu_env, o->in1, o->in2, m34);
+ gen_helper_ldxb(o->out, cpu_env, o->in2_128, m34);
tcg_temp_free_i32(m34);
return DISAS_NEXT;
}
@@ -2858,7 +2870,7 @@ static DisasJumpType op_lexb(DisasContext *s, DisasOps *o)
if (!m34) {
return DISAS_NORETURN;
}
- gen_helper_lexb(o->out, cpu_env, o->in1, o->in2, m34);
+ gen_helper_lexb(o->out, cpu_env, o->in2_128, m34);
tcg_temp_free_i32(m34);
return DISAS_NEXT;
}
@@ -3586,13 +3598,13 @@ static DisasJumpType op_mdb(DisasContext *s, DisasOps *o)
static DisasJumpType op_mxb(DisasContext *s, DisasOps *o)
{
- gen_helper_mxb(o->out_128, cpu_env, o->out, o->out2, o->in1, o->in2);
+ gen_helper_mxb(o->out_128, cpu_env, o->in1_128, o->in2_128);
return DISAS_NEXT;
}
static DisasJumpType op_mxdb(DisasContext *s, DisasOps *o)
{
- gen_helper_mxdb(o->out_128, cpu_env, o->out, o->out2, o->in2);
+ gen_helper_mxdb(o->out_128, cpu_env, o->in1_128, o->in2);
return DISAS_NEXT;
}
@@ -4057,7 +4069,7 @@ static DisasJumpType op_sdb(DisasContext *s, DisasOps *o)
static DisasJumpType op_sxb(DisasContext *s, DisasOps *o)
{
- gen_helper_sxb(o->out_128, cpu_env, o->out, o->out2, o->in1, o->in2);
+ gen_helper_sxb(o->out_128, cpu_env, o->in1_128, o->in2_128);
return DISAS_NEXT;
}
@@ -4075,7 +4087,7 @@ static DisasJumpType op_sqdb(DisasContext *s, DisasOps *o)
static DisasJumpType op_sqxb(DisasContext *s, DisasOps *o)
{
- gen_helper_sqxb(o->out_128, cpu_env, o->in1, o->in2);
+ gen_helper_sqxb(o->out_128, cpu_env, o->in2_128);
return DISAS_NEXT;
}
@@ -4854,7 +4866,7 @@ static DisasJumpType op_tcdb(DisasContext *s, DisasOps *o)
static DisasJumpType op_tcxb(DisasContext *s, DisasOps *o)
{
- gen_helper_tcxb(cc_op, cpu_env, o->out, o->out2, o->in2);
+ gen_helper_tcxb(cc_op, cpu_env, o->in1_128, o->in2);
set_cc_static(s);
return DISAS_NEXT;
}
@@ -5389,8 +5401,6 @@ static void prep_new_P(DisasContext *s, DisasOps *o)
static void prep_new_x(DisasContext *s, DisasOps *o)
{
- o->out = tcg_temp_new_i64();
- o->out2 = tcg_temp_new_i64();
o->out_128 = tcg_temp_new_i128();
}
#define SPEC_prep_new_x 0
@@ -5413,10 +5423,7 @@ static void prep_r1_P(DisasContext *s, DisasOps *o)
static void prep_x1(DisasContext *s, DisasOps *o)
{
- o->out = load_freg(get_field(s, r1));
- o->out2 = load_freg(get_field(s, r1) + 2);
- o->out_128 = tcg_temp_new_i128();
- tcg_gen_concat_i64_i128(o->out_128, o->out2, o->out);
+ o->out_128 = load_freg_128(get_field(s, r1));
}
#define SPEC_prep_x1 SPEC_r1_f128
@@ -5515,6 +5522,11 @@ static void wout_x1(DisasContext *s, DisasOps *o)
{
int f1 = get_field(s, r1);
+ /* Split out_128 into out+out2 for cout_f128. */
+ tcg_debug_assert(o->out == NULL);
+ o->out = tcg_temp_new_i64();
+ o->out2 = tcg_temp_new_i64();
+
tcg_gen_extr_i128_i64(o->out2, o->out, o->out_128);
store_freg(f1, o->out);
store_freg(f1 + 2, o->out2);
@@ -5757,6 +5769,12 @@ static void in1_f1(DisasContext *s, DisasOps *o)
}
#define SPEC_in1_f1 0
+static void in1_x1(DisasContext *s, DisasOps *o)
+{
+ o->in1_128 = load_freg_128(get_field(s, r1));
+}
+#define SPEC_in1_x1 SPEC_r2_f128
+
/* Load the high double word of an extended (128-bit) format FP number */
static void in1_x2h(DisasContext *s, DisasOps *o)
{
@@ -5966,6 +5984,12 @@ static void in2_f2(DisasContext *s, DisasOps *o)
}
#define SPEC_in2_f2 0
+static void in2_x2(DisasContext *s, DisasOps *o)
+{
+ o->in2_128 = load_freg_128(get_field(s, r2));
+}
+#define SPEC_in2_x2 SPEC_r2_f128
+
/* Load the low double word of an extended (128-bit) format FP number */
static void in2_x2l(DisasContext *s, DisasOps *o)
{
@@ -6588,6 +6612,12 @@ static DisasJumpType translate_one(CPUS390XState *env, DisasContext *s)
if (o.out_128) {
tcg_temp_free_i128(o.out_128);
}
+ if (o.in1_128) {
+ tcg_temp_free_i128(o.in1_128);
+ }
+ if (o.in2_128) {
+ tcg_temp_free_i128(o.in2_128);
+ }
/* io should be the last instruction in tb when icount is enabled */
if (unlikely(icount && ret == DISAS_NEXT)) {
ret = DISAS_TOO_MANY;
diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def
index 20bf20c766..26523746d6 100644
--- a/target/s390x/tcg/insn-data.def
+++ b/target/s390x/tcg/insn-data.def
@@ -34,7 +34,7 @@
C(0xe318, AGF, RXY_a, Z, r1, m2_32s, r1, 0, add, adds64)
F(0xb30a, AEBR, RRE, Z, e1, e2, new, e1, aeb, f32, IF_BFP)
F(0xb31a, ADBR, RRE, Z, f1, f2, new, f1, adb, f64, IF_BFP)
- F(0xb34a, AXBR, RRE, Z, x2h, x2l, x1, x1, axb, f128, IF_BFP)
+ F(0xb34a, AXBR, RRE, Z, x1, x2, new_x, x1, axb, f128, IF_BFP)
F(0xed0a, AEB, RXE, Z, e1, m2_32u, new, e1, aeb, f32, IF_BFP)
F(0xed1a, ADB, RXE, Z, f1, m2_64, new, f1, adb, f64, IF_BFP)
/* ADD HIGH */
@@ -172,13 +172,13 @@
C(0xe330, CGF, RXY_a, Z, r1_o, m2_32s, 0, 0, 0, cmps64)
F(0xb309, CEBR, RRE, Z, e1, e2, 0, 0, ceb, 0, IF_BFP)
F(0xb319, CDBR, RRE, Z, f1, f2, 0, 0, cdb, 0, IF_BFP)
- F(0xb349, CXBR, RRE, Z, x2h, x2l, x1, 0, cxb, 0, IF_BFP)
+ F(0xb349, CXBR, RRE, Z, x1, x2, 0, 0, cxb, 0, IF_BFP)
F(0xed09, CEB, RXE, Z, e1, m2_32u, 0, 0, ceb, 0, IF_BFP)
F(0xed19, CDB, RXE, Z, f1, m2_64, 0, 0, cdb, 0, IF_BFP)
/* COMPARE AND SIGNAL */
F(0xb308, KEBR, RRE, Z, e1, e2, 0, 0, keb, 0, IF_BFP)
F(0xb318, KDBR, RRE, Z, f1, f2, 0, 0, kdb, 0, IF_BFP)
- F(0xb348, KXBR, RRE, Z, x2h, x2l, x1, 0, kxb, 0, IF_BFP)
+ F(0xb348, KXBR, RRE, Z, x1, x2, 0, 0, kxb, 0, IF_BFP)
F(0xed08, KEB, RXE, Z, e1, m2_32u, 0, 0, keb, 0, IF_BFP)
F(0xed18, KDB, RXE, Z, f1, m2_64, 0, 0, kdb, 0, IF_BFP)
/* COMPARE IMMEDIATE */
@@ -299,10 +299,10 @@
/* CONVERT TO FIXED */
F(0xb398, CFEBR, RRF_e, Z, 0, e2, new, r1_32, cfeb, 0, IF_BFP)
F(0xb399, CFDBR, RRF_e, Z, 0, f2, new, r1_32, cfdb, 0, IF_BFP)
- F(0xb39a, CFXBR, RRF_e, Z, x2h, x2l, new, r1_32, cfxb, 0, IF_BFP)
+ F(0xb39a, CFXBR, RRF_e, Z, 0, x2, new, r1_32, cfxb, 0, IF_BFP)
F(0xb3a8, CGEBR, RRF_e, Z, 0, e2, r1, 0, cgeb, 0, IF_BFP)
F(0xb3a9, CGDBR, RRF_e, Z, 0, f2, r1, 0, cgdb, 0, IF_BFP)
- F(0xb3aa, CGXBR, RRF_e, Z, x2h, x2l, r1, 0, cgxb, 0, IF_BFP)
+ F(0xb3aa, CGXBR, RRF_e, Z, 0, x2, r1, 0, cgxb, 0, IF_BFP)
/* CONVERT FROM FIXED */
F(0xb394, CEFBR, RRF_e, Z, 0, r2_32s, new, e1, cegb, 0, IF_BFP)
F(0xb395, CDFBR, RRF_e, Z, 0, r2_32s, new, f1, cdgb, 0, IF_BFP)
@@ -313,10 +313,10 @@
/* CONVERT TO LOGICAL */
F(0xb39c, CLFEBR, RRF_e, FPE, 0, e2, new, r1_32, clfeb, 0, IF_BFP)
F(0xb39d, CLFDBR, RRF_e, FPE, 0, f2, new, r1_32, clfdb, 0, IF_BFP)
- F(0xb39e, CLFXBR, RRF_e, FPE, x2h, x2l, new, r1_32, clfxb, 0, IF_BFP)
+ F(0xb39e, CLFXBR, RRF_e, FPE, 0, x2, new, r1_32, clfxb, 0, IF_BFP)
F(0xb3ac, CLGEBR, RRF_e, FPE, 0, e2, r1, 0, clgeb, 0, IF_BFP)
F(0xb3ad, CLGDBR, RRF_e, FPE, 0, f2, r1, 0, clgdb, 0, IF_BFP)
- F(0xb3ae, CLGXBR, RRF_e, FPE, x2h, x2l, r1, 0, clgxb, 0, IF_BFP)
+ F(0xb3ae, CLGXBR, RRF_e, FPE, 0, x2, r1, 0, clgxb, 0, IF_BFP)
/* CONVERT FROM LOGICAL */
F(0xb390, CELFBR, RRF_e, FPE, 0, r2_32u, new, e1, celgb, 0, IF_BFP)
F(0xb391, CDLFBR, RRF_e, FPE, 0, r2_32u, new, f1, cdlgb, 0, IF_BFP)
@@ -343,7 +343,7 @@
C(0x5d00, D, RX_a, Z, r1_D32, m2_32s, new_P, r1_P32, divs32, 0)
F(0xb30d, DEBR, RRE, Z, e1, e2, new, e1, deb, 0, IF_BFP)
F(0xb31d, DDBR, RRE, Z, f1, f2, new, f1, ddb, 0, IF_BFP)
- F(0xb34d, DXBR, RRE, Z, x2h, x2l, x1, x1, dxb, 0, IF_BFP)
+ F(0xb34d, DXBR, RRE, Z, x1, x2, new_x, x1, dxb, 0, IF_BFP)
F(0xed0d, DEB, RXE, Z, e1, m2_32u, new, e1, deb, 0, IF_BFP)
F(0xed1d, DDB, RXE, Z, f1, m2_64, new, f1, ddb, 0, IF_BFP)
/* DIVIDE LOGICAL */
@@ -597,7 +597,7 @@
/* LOAD FP INTEGER */
F(0xb357, FIEBR, RRF_e, Z, 0, e2, new, e1, fieb, 0, IF_BFP)
F(0xb35f, FIDBR, RRF_e, Z, 0, f2, new, f1, fidb, 0, IF_BFP)
- F(0xb347, FIXBR, RRF_e, Z, x2h, x2l, new_x, x1, fixb, 0, IF_BFP)
+ F(0xb347, FIXBR, RRF_e, Z, 0, x2, new_x, x1, fixb, 0, IF_BFP)
/* LOAD LENGTHENED */
F(0xb304, LDEBR, RRE, Z, 0, e2, new, f1, ldeb, 0, IF_BFP)
@@ -610,8 +610,8 @@
F(0xed24, LDE, RXE, Z, 0, m2_32u, new, f1, lde, 0, IF_AFP1)
/* LOAD ROUNDED */
F(0xb344, LEDBR, RRF_e, Z, 0, f2, new, e1, ledb, 0, IF_BFP)
- F(0xb345, LDXBR, RRF_e, Z, x2h, x2l, new, f1, ldxb, 0, IF_BFP)
- F(0xb346, LEXBR, RRF_e, Z, x2h, x2l, new, e1, lexb, 0, IF_BFP)
+ F(0xb345, LDXBR, RRF_e, Z, 0, x2, new, f1, ldxb, 0, IF_BFP)
+ F(0xb346, LEXBR, RRF_e, Z, 0, x2, new, e1, lexb, 0, IF_BFP)
/* LOAD MULTIPLE */
C(0x9800, LM, RS_a, Z, 0, a2, 0, 0, lm32, 0)
@@ -666,7 +666,7 @@
C(0xe384, MG, RXY_a, MIE2,r1p1_o, m2_64, r1_P, 0, muls128, 0)
F(0xb317, MEEBR, RRE, Z, e1, e2, new, e1, meeb, 0, IF_BFP)
F(0xb31c, MDBR, RRE, Z, f1, f2, new, f1, mdb, 0, IF_BFP)
- F(0xb34c, MXBR, RRE, Z, x2h, x2l, x1, x1, mxb, 0, IF_BFP)
+ F(0xb34c, MXBR, RRE, Z, x1, x2, new_x, x1, mxb, 0, IF_BFP)
F(0xb30c, MDEBR, RRE, Z, f1, e2, new, f1, mdeb, 0, IF_BFP)
F(0xb307, MXDBR, RRE, Z, 0, f2, x1, x1, mxdb, 0, IF_BFP)
F(0xed17, MEEB, RXE, Z, e1, m2_32u, new, e1, meeb, 0, IF_BFP)
@@ -835,7 +835,7 @@
/* SQUARE ROOT */
F(0xb314, SQEBR, RRE, Z, 0, e2, new, e1, sqeb, 0, IF_BFP)
F(0xb315, SQDBR, RRE, Z, 0, f2, new, f1, sqdb, 0, IF_BFP)
- F(0xb316, SQXBR, RRE, Z, x2h, x2l, new_x, x1, sqxb, 0, IF_BFP)
+ F(0xb316, SQXBR, RRE, Z, 0, x2, new_x, x1, sqxb, 0, IF_BFP)
F(0xed14, SQEB, RXE, Z, 0, m2_32u, new, e1, sqeb, 0, IF_BFP)
F(0xed15, SQDB, RXE, Z, 0, m2_64, new, f1, sqdb, 0, IF_BFP)
@@ -913,7 +913,7 @@
C(0xe319, SGF, RXY_a, Z, r1, m2_32s, r1, 0, sub, subs64)
F(0xb30b, SEBR, RRE, Z, e1, e2, new, e1, seb, f32, IF_BFP)
F(0xb31b, SDBR, RRE, Z, f1, f2, new, f1, sdb, f64, IF_BFP)
- F(0xb34b, SXBR, RRE, Z, x2h, x2l, x1, x1, sxb, f128, IF_BFP)
+ F(0xb34b, SXBR, RRE, Z, x1, x2, new_x, x1, sxb, f128, IF_BFP)
F(0xed0b, SEB, RXE, Z, e1, m2_32u, new, e1, seb, f32, IF_BFP)
F(0xed1b, SDB, RXE, Z, f1, m2_64, new, f1, sdb, f64, IF_BFP)
/* SUBTRACT HALFWORD */
@@ -957,7 +957,7 @@
/* TEST DATA CLASS */
F(0xed10, TCEB, RXE, Z, e1, a2, 0, 0, tceb, 0, IF_BFP)
F(0xed11, TCDB, RXE, Z, f1, a2, 0, 0, tcdb, 0, IF_BFP)
- F(0xed12, TCXB, RXE, Z, 0, a2, x1, 0, tcxb, 0, IF_BFP)
+ F(0xed12, TCXB, RXE, Z, x1, a2, 0, 0, tcxb, 0, IF_BFP)
/* TEST DECIMAL */
C(0xebc0, TP, RSL, E2, la1, 0, 0, 0, tp, 0)
--
2.34.1
^ permalink raw reply related [flat|nested] 35+ messages in thread
* Re: [PATCH 9/9] target/s390x: Use Int128 for passing float128
2022-10-21 7:30 ` [PATCH 9/9] target/s390x: Use Int128 for passing float128 Richard Henderson
@ 2022-10-24 18:01 ` Philippe Mathieu-Daudé
2022-10-24 22:31 ` Richard Henderson
2022-11-02 9:38 ` Ilya Leoshkevich
1 sibling, 1 reply; 35+ messages in thread
From: Philippe Mathieu-Daudé @ 2022-10-24 18:01 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: qemu-s390x
On 21/10/22 09:30, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/s390x/helper.h | 32 ++++++-------
> target/s390x/tcg/fpu_helper.c | 88 ++++++++++++++--------------------
> target/s390x/tcg/translate.c | 76 ++++++++++++++++++++---------
> target/s390x/tcg/insn-data.def | 30 ++++++------
> 4 files changed, 121 insertions(+), 105 deletions(-)
> diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
> index d1ffbb8710..8023bbab2f 100644
> --- a/target/s390x/tcg/translate.c
> +++ b/target/s390x/tcg/translate.c
> @@ -305,6 +305,18 @@ static TCGv_i64 load_freg32_i64(int reg)
> return r;
> }
>
> +static TCGv_i128 load_freg_128(int reg)
> +{
> + TCGv_i64 h = load_freg(reg);
> + TCGv_i64 l = load_freg(reg + 2);
> + TCGv_i128 r = tcg_temp_new_i128();
Maybe rename as load_freg_new_128() to make emphasis on the returned
TCGv need to be freed? Otherwise:
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> + tcg_gen_concat_i64_i128(r, l, h);
> + tcg_temp_free_i64(h);
> + tcg_temp_free_i64(l);
> + return r;
> +}
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 9/9] target/s390x: Use Int128 for passing float128
2022-10-24 18:01 ` Philippe Mathieu-Daudé
@ 2022-10-24 22:31 ` Richard Henderson
0 siblings, 0 replies; 35+ messages in thread
From: Richard Henderson @ 2022-10-24 22:31 UTC (permalink / raw)
To: Philippe Mathieu-Daudé, qemu-devel; +Cc: qemu-s390x
On 10/25/22 04:01, Philippe Mathieu-Daudé wrote:
> On 21/10/22 09:30, Richard Henderson wrote:
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>> target/s390x/helper.h | 32 ++++++-------
>> target/s390x/tcg/fpu_helper.c | 88 ++++++++++++++--------------------
>> target/s390x/tcg/translate.c | 76 ++++++++++++++++++++---------
>> target/s390x/tcg/insn-data.def | 30 ++++++------
>> 4 files changed, 121 insertions(+), 105 deletions(-)
>
>> diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
>> index d1ffbb8710..8023bbab2f 100644
>> --- a/target/s390x/tcg/translate.c
>> +++ b/target/s390x/tcg/translate.c
>> @@ -305,6 +305,18 @@ static TCGv_i64 load_freg32_i64(int reg)
>> return r;
>> }
>> +static TCGv_i128 load_freg_128(int reg)
>> +{
>> + TCGv_i64 h = load_freg(reg);
>> + TCGv_i64 l = load_freg(reg + 2);
>> + TCGv_i128 r = tcg_temp_new_i128();
>
> Maybe rename as load_freg_new_128() to make emphasis on the returned
> TCGv need to be freed?
It's no different from the other load_freg* functions just above. As with those, the
result is assigned to one of the DisasOps slots, and all of those slots are freed at the
end of each instruction.
r~
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 9/9] target/s390x: Use Int128 for passing float128
2022-10-21 7:30 ` [PATCH 9/9] target/s390x: Use Int128 for passing float128 Richard Henderson
2022-10-24 18:01 ` Philippe Mathieu-Daudé
@ 2022-11-02 9:38 ` Ilya Leoshkevich
2022-11-02 9:47 ` Richard Henderson
1 sibling, 1 reply; 35+ messages in thread
From: Ilya Leoshkevich @ 2022-11-02 9:38 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: qemu-s390x
On Fri, Oct 21, 2022 at 05:30:06PM +1000, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/s390x/helper.h | 32 ++++++-------
> target/s390x/tcg/fpu_helper.c | 88 ++++++++++++++--------------------
> target/s390x/tcg/translate.c | 76 ++++++++++++++++++++---------
> target/s390x/tcg/insn-data.def | 30 ++++++------
> 4 files changed, 121 insertions(+), 105 deletions(-)
>
> diff --git a/target/s390x/helper.h b/target/s390x/helper.h
> index 429131a85e..481b9019f9 100644
> --- a/target/s390x/helper.h
> +++ b/target/s390x/helper.h
> @@ -41,55 +41,55 @@ DEF_HELPER_4(csst, i32, env, i32, i64, i64)
> DEF_HELPER_4(csst_parallel, i32, env, i32, i64, i64)
> DEF_HELPER_FLAGS_3(aeb, TCG_CALL_NO_WG, i64, env, i64, i64)
> DEF_HELPER_FLAGS_3(adb, TCG_CALL_NO_WG, i64, env, i64, i64)
> -DEF_HELPER_FLAGS_5(axb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64)
> +DEF_HELPER_FLAGS_3(axb, TCG_CALL_NO_WG, i128, env, i128, i128)
> DEF_HELPER_FLAGS_3(seb, TCG_CALL_NO_WG, i64, env, i64, i64)
> DEF_HELPER_FLAGS_3(sdb, TCG_CALL_NO_WG, i64, env, i64, i64)
> -DEF_HELPER_FLAGS_5(sxb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64)
> +DEF_HELPER_FLAGS_3(sxb, TCG_CALL_NO_WG, i128, env, i128, i128)
> DEF_HELPER_FLAGS_3(deb, TCG_CALL_NO_WG, i64, env, i64, i64)
> DEF_HELPER_FLAGS_3(ddb, TCG_CALL_NO_WG, i64, env, i64, i64)
> -DEF_HELPER_FLAGS_5(dxb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64)
> +DEF_HELPER_FLAGS_3(dxb, TCG_CALL_NO_WG, i128, env, i128, i128)
> DEF_HELPER_FLAGS_3(meeb, TCG_CALL_NO_WG, i64, env, i64, i64)
> DEF_HELPER_FLAGS_3(mdeb, TCG_CALL_NO_WG, i64, env, i64, i64)
> DEF_HELPER_FLAGS_3(mdb, TCG_CALL_NO_WG, i64, env, i64, i64)
> -DEF_HELPER_FLAGS_5(mxb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64)
> -DEF_HELPER_FLAGS_4(mxdb, TCG_CALL_NO_WG, i128, env, i64, i64, i64)
> +DEF_HELPER_FLAGS_3(mxb, TCG_CALL_NO_WG, i128, env, i128, i128)
> +DEF_HELPER_FLAGS_3(mxdb, TCG_CALL_NO_WG, i128, env, i128, i64)
> DEF_HELPER_FLAGS_2(ldeb, TCG_CALL_NO_WG, i64, env, i64)
> -DEF_HELPER_FLAGS_4(ldxb, TCG_CALL_NO_WG, i64, env, i64, i64, i32)
> +DEF_HELPER_FLAGS_3(ldxb, TCG_CALL_NO_WG, i64, env, i128, i32)
> DEF_HELPER_FLAGS_2(lxdb, TCG_CALL_NO_WG, i128, env, i64)
> DEF_HELPER_FLAGS_2(lxeb, TCG_CALL_NO_WG, i128, env, i64)
> DEF_HELPER_FLAGS_3(ledb, TCG_CALL_NO_WG, i64, env, i64, i32)
> -DEF_HELPER_FLAGS_4(lexb, TCG_CALL_NO_WG, i64, env, i64, i64, i32)
> +DEF_HELPER_FLAGS_3(lexb, TCG_CALL_NO_WG, i64, env, i128, i32)
> DEF_HELPER_FLAGS_3(ceb, TCG_CALL_NO_WG_SE, i32, env, i64, i64)
> DEF_HELPER_FLAGS_3(cdb, TCG_CALL_NO_WG_SE, i32, env, i64, i64)
> -DEF_HELPER_FLAGS_5(cxb, TCG_CALL_NO_WG_SE, i32, env, i64, i64, i64, i64)
> +DEF_HELPER_FLAGS_3(cxb, TCG_CALL_NO_WG_SE, i32, env, i128, i128)
> DEF_HELPER_FLAGS_3(keb, TCG_CALL_NO_WG, i32, env, i64, i64)
> DEF_HELPER_FLAGS_3(kdb, TCG_CALL_NO_WG, i32, env, i64, i64)
> -DEF_HELPER_FLAGS_5(kxb, TCG_CALL_NO_WG, i32, env, i64, i64, i64, i64)
> +DEF_HELPER_FLAGS_3(kxb, TCG_CALL_NO_WG, i32, env, i128, i128)
> DEF_HELPER_3(cgeb, i64, env, i64, i32)
> DEF_HELPER_3(cgdb, i64, env, i64, i32)
> -DEF_HELPER_4(cgxb, i64, env, i64, i64, i32)
> +DEF_HELPER_3(cgxb, i64, env, i128, i32)
> DEF_HELPER_3(cfeb, i64, env, i64, i32)
> DEF_HELPER_3(cfdb, i64, env, i64, i32)
> -DEF_HELPER_4(cfxb, i64, env, i64, i64, i32)
> +DEF_HELPER_3(cfxb, i64, env, i128, i32)
> DEF_HELPER_3(clgeb, i64, env, i64, i32)
> DEF_HELPER_3(clgdb, i64, env, i64, i32)
> -DEF_HELPER_4(clgxb, i64, env, i64, i64, i32)
> +DEF_HELPER_3(clgxb, i64, env, i128, i32)
> DEF_HELPER_3(clfeb, i64, env, i64, i32)
> DEF_HELPER_3(clfdb, i64, env, i64, i32)
> -DEF_HELPER_4(clfxb, i64, env, i64, i64, i32)
> +DEF_HELPER_3(clfxb, i64, env, i128, i32)
> DEF_HELPER_FLAGS_3(fieb, TCG_CALL_NO_WG, i64, env, i64, i32)
> DEF_HELPER_FLAGS_3(fidb, TCG_CALL_NO_WG, i64, env, i64, i32)
> -DEF_HELPER_FLAGS_4(fixb, TCG_CALL_NO_WG, i128, env, i64, i64, i32)
> +DEF_HELPER_FLAGS_3(fixb, TCG_CALL_NO_WG, i128, env, i128, i32)
> DEF_HELPER_FLAGS_4(maeb, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
> DEF_HELPER_FLAGS_4(madb, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
> DEF_HELPER_FLAGS_4(mseb, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
> DEF_HELPER_FLAGS_4(msdb, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
> DEF_HELPER_FLAGS_3(tceb, TCG_CALL_NO_RWG_SE, i32, env, i64, i64)
> DEF_HELPER_FLAGS_3(tcdb, TCG_CALL_NO_RWG_SE, i32, env, i64, i64)
> -DEF_HELPER_FLAGS_4(tcxb, TCG_CALL_NO_RWG_SE, i32, env, i64, i64, i64)
> +DEF_HELPER_FLAGS_3(tcxb, TCG_CALL_NO_RWG_SE, i32, env, i128, i64)
> DEF_HELPER_FLAGS_2(sqeb, TCG_CALL_NO_WG, i64, env, i64)
> DEF_HELPER_FLAGS_2(sqdb, TCG_CALL_NO_WG, i64, env, i64)
> -DEF_HELPER_FLAGS_3(sqxb, TCG_CALL_NO_WG, i128, env, i64, i64)
> +DEF_HELPER_FLAGS_2(sqxb, TCG_CALL_NO_WG, i128, env, i128)
> DEF_HELPER_FLAGS_1(cvd, TCG_CALL_NO_RWG_SE, i64, s32)
> DEF_HELPER_FLAGS_4(pack, TCG_CALL_NO_WG, void, env, i32, i64, i64)
> DEF_HELPER_FLAGS_4(pka, TCG_CALL_NO_WG, void, env, i64, i64, i32)
> diff --git a/target/s390x/tcg/fpu_helper.c b/target/s390x/tcg/fpu_helper.c
> index a584794be6..5a322e3f87 100644
> --- a/target/s390x/tcg/fpu_helper.c
> +++ b/target/s390x/tcg/fpu_helper.c
> @@ -39,6 +39,11 @@ static inline Int128 RET128(float128 f)
> return int128_make128(f.low, f.high);
> }
>
> +static inline float128 ARG128(Int128 i)
> +{
> + return make_float128(int128_gethi(i), int128_getlo(i));
> +}
> +
> uint8_t s390_softfloat_exc_to_ieee(unsigned int exc)
> {
> uint8_t s390_exc = 0;
> @@ -227,12 +232,9 @@ uint64_t HELPER(adb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
> }
>
> /* 128-bit FP addition */
> -Int128 HELPER(axb)(CPUS390XState *env, uint64_t ah, uint64_t al,
> - uint64_t bh, uint64_t bl)
> +Int128 HELPER(axb)(CPUS390XState *env, Int128 a, Int128 b)
> {
> - float128 ret = float128_add(make_float128(ah, al),
> - make_float128(bh, bl),
> - &env->fpu_status);
> + float128 ret = float128_add(ARG128(a), ARG128(b), &env->fpu_status);
> handle_exceptions(env, false, GETPC());
> return RET128(ret);
> }
> @@ -254,12 +256,9 @@ uint64_t HELPER(sdb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
> }
>
> /* 128-bit FP subtraction */
> -Int128 HELPER(sxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
> - uint64_t bh, uint64_t bl)
> +Int128 HELPER(sxb)(CPUS390XState *env, Int128 a, Int128 b)
> {
> - float128 ret = float128_sub(make_float128(ah, al),
> - make_float128(bh, bl),
> - &env->fpu_status);
> + float128 ret = float128_sub(ARG128(a), ARG128(b), &env->fpu_status);
> handle_exceptions(env, false, GETPC());
> return RET128(ret);
> }
> @@ -281,12 +280,9 @@ uint64_t HELPER(ddb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
> }
>
> /* 128-bit FP division */
> -Int128 HELPER(dxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
> - uint64_t bh, uint64_t bl)
> +Int128 HELPER(dxb)(CPUS390XState *env, Int128 a, Int128 b)
> {
> - float128 ret = float128_div(make_float128(ah, al),
> - make_float128(bh, bl),
> - &env->fpu_status);
> + float128 ret = float128_div(ARG128(a), ARG128(b), &env->fpu_status);
> handle_exceptions(env, false, GETPC());
> return RET128(ret);
> }
> @@ -317,21 +313,18 @@ uint64_t HELPER(mdeb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
> }
>
> /* 128-bit FP multiplication */
> -Int128 HELPER(mxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
> - uint64_t bh, uint64_t bl)
> +Int128 HELPER(mxb)(CPUS390XState *env, Int128 a, Int128 b)
> {
> - float128 ret = float128_mul(make_float128(ah, al),
> - make_float128(bh, bl),
> - &env->fpu_status);
> + float128 ret = float128_mul(ARG128(a), ARG128(b), &env->fpu_status);
> handle_exceptions(env, false, GETPC());
> return RET128(ret);
> }
>
> /* 128/64-bit FP multiplication */
> -Int128 HELPER(mxdb)(CPUS390XState *env, uint64_t ah, uint64_t al, uint64_t f2)
> +Int128 HELPER(mxdb)(CPUS390XState *env, Int128 a, uint64_t f2)
> {
> float128 ret = float64_to_float128(f2, &env->fpu_status);
> - ret = float128_mul(make_float128(ah, al), ret, &env->fpu_status);
> + ret = float128_mul(ARG128(a), ret, &env->fpu_status);
> handle_exceptions(env, false, GETPC());
> return RET128(ret);
> }
> @@ -345,11 +338,10 @@ uint64_t HELPER(ldeb)(CPUS390XState *env, uint64_t f2)
> }
>
> /* convert 128-bit float to 64-bit float */
> -uint64_t HELPER(ldxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
> - uint32_t m34)
> +uint64_t HELPER(ldxb)(CPUS390XState *env, Int128 a, uint32_t m34)
> {
> int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
> - float64 ret = float128_to_float64(make_float128(ah, al), &env->fpu_status);
> + float64 ret = float128_to_float64(ARG128(a), &env->fpu_status);
>
> s390_restore_bfp_rounding_mode(env, old_mode);
> handle_exceptions(env, xxc_from_m34(m34), GETPC());
> @@ -384,11 +376,10 @@ uint64_t HELPER(ledb)(CPUS390XState *env, uint64_t f2, uint32_t m34)
> }
>
> /* convert 128-bit float to 32-bit float */
> -uint64_t HELPER(lexb)(CPUS390XState *env, uint64_t ah, uint64_t al,
> - uint32_t m34)
> +uint64_t HELPER(lexb)(CPUS390XState *env, Int128 a, uint32_t m34)
> {
> int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
> - float32 ret = float128_to_float32(make_float128(ah, al), &env->fpu_status);
> + float32 ret = float128_to_float32(ARG128(a), &env->fpu_status);
>
> s390_restore_bfp_rounding_mode(env, old_mode);
> handle_exceptions(env, xxc_from_m34(m34), GETPC());
> @@ -412,11 +403,9 @@ uint32_t HELPER(cdb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
> }
>
> /* 128-bit FP compare */
> -uint32_t HELPER(cxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
> - uint64_t bh, uint64_t bl)
> +uint32_t HELPER(cxb)(CPUS390XState *env, Int128 a, Int128 b)
> {
> - FloatRelation cmp = float128_compare_quiet(make_float128(ah, al),
> - make_float128(bh, bl),
> + FloatRelation cmp = float128_compare_quiet(ARG128(a), ARG128(b),
> &env->fpu_status);
> handle_exceptions(env, false, GETPC());
> return float_comp_to_cc(env, cmp);
> @@ -564,10 +553,10 @@ uint64_t HELPER(cgdb)(CPUS390XState *env, uint64_t v2, uint32_t m34)
> }
>
> /* convert 128-bit float to 64-bit int */
> -uint64_t HELPER(cgxb)(CPUS390XState *env, uint64_t h, uint64_t l, uint32_t m34)
> +uint64_t HELPER(cgxb)(CPUS390XState *env, Int128 i2, uint32_t m34)
> {
> int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
> - float128 v2 = make_float128(h, l);
> + float128 v2 = ARG128(i2);
> int64_t ret = float128_to_int64(v2, &env->fpu_status);
> uint32_t cc = set_cc_conv_f128(v2, &env->fpu_status);
>
> @@ -613,10 +602,10 @@ uint64_t HELPER(cfdb)(CPUS390XState *env, uint64_t v2, uint32_t m34)
> }
>
> /* convert 128-bit float to 32-bit int */
> -uint64_t HELPER(cfxb)(CPUS390XState *env, uint64_t h, uint64_t l, uint32_t m34)
> +uint64_t HELPER(cfxb)(CPUS390XState *env, Int128 i2, uint32_t m34)
> {
> int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
> - float128 v2 = make_float128(h, l);
> + float128 v2 = ARG128(i2);
> int32_t ret = float128_to_int32(v2, &env->fpu_status);
> uint32_t cc = set_cc_conv_f128(v2, &env->fpu_status);
>
> @@ -662,10 +651,10 @@ uint64_t HELPER(clgdb)(CPUS390XState *env, uint64_t v2, uint32_t m34)
> }
>
> /* convert 128-bit float to 64-bit uint */
> -uint64_t HELPER(clgxb)(CPUS390XState *env, uint64_t h, uint64_t l, uint32_t m34)
> +uint64_t HELPER(clgxb)(CPUS390XState *env, Int128 i2, uint32_t m34)
> {
> int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
> - float128 v2 = make_float128(h, l);
> + float128 v2 = ARG128(i2);
> uint64_t ret = float128_to_uint64(v2, &env->fpu_status);
> uint32_t cc = set_cc_conv_f128(v2, &env->fpu_status);
>
> @@ -711,10 +700,10 @@ uint64_t HELPER(clfdb)(CPUS390XState *env, uint64_t v2, uint32_t m34)
> }
>
> /* convert 128-bit float to 32-bit uint */
> -uint64_t HELPER(clfxb)(CPUS390XState *env, uint64_t h, uint64_t l, uint32_t m34)
> +uint64_t HELPER(clfxb)(CPUS390XState *env, Int128 i2, uint32_t m34)
> {
> int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
> - float128 v2 = make_float128(h, l);
> + float128 v2 = ARG128(i2);
> uint32_t ret = float128_to_uint32(v2, &env->fpu_status);
> uint32_t cc = set_cc_conv_f128(v2, &env->fpu_status);
>
> @@ -750,11 +739,10 @@ uint64_t HELPER(fidb)(CPUS390XState *env, uint64_t f2, uint32_t m34)
> }
>
> /* round to integer 128-bit */
> -Int128 HELPER(fixb)(CPUS390XState *env, uint64_t ah, uint64_t al, uint32_t m34)
> +Int128 HELPER(fixb)(CPUS390XState *env, Int128 a, uint32_t m34)
> {
> int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
> - float128 ret = float128_round_to_int(make_float128(ah, al),
> - &env->fpu_status);
> + float128 ret = float128_round_to_int(ARG128(a), &env->fpu_status);
>
> s390_restore_bfp_rounding_mode(env, old_mode);
> handle_exceptions(env, xxc_from_m34(m34), GETPC());
> @@ -778,11 +766,9 @@ uint32_t HELPER(kdb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
> }
>
> /* 128-bit FP compare and signal */
> -uint32_t HELPER(kxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
> - uint64_t bh, uint64_t bl)
> +uint32_t HELPER(kxb)(CPUS390XState *env, Int128 a, Int128 b)
> {
> - FloatRelation cmp = float128_compare(make_float128(ah, al),
> - make_float128(bh, bl),
> + FloatRelation cmp = float128_compare(ARG128(a), ARG128(b),
> &env->fpu_status);
> handle_exceptions(env, false, GETPC());
> return float_comp_to_cc(env, cmp);
> @@ -869,9 +855,9 @@ uint32_t HELPER(tcdb)(CPUS390XState *env, uint64_t v1, uint64_t m2)
> }
>
> /* test data class 128-bit */
> -uint32_t HELPER(tcxb)(CPUS390XState *env, uint64_t ah, uint64_t al, uint64_t m2)
> +uint32_t HELPER(tcxb)(CPUS390XState *env, Int128 a, uint64_t m2)
> {
> - return (m2 & float128_dcmask(env, make_float128(ah, al))) != 0;
> + return (m2 & float128_dcmask(env, ARG128(a))) != 0;
> }
>
> /* square root 32-bit */
> @@ -891,9 +877,9 @@ uint64_t HELPER(sqdb)(CPUS390XState *env, uint64_t f2)
> }
>
> /* square root 128-bit */
> -Int128 HELPER(sqxb)(CPUS390XState *env, uint64_t ah, uint64_t al)
> +Int128 HELPER(sqxb)(CPUS390XState *env, Int128 a)
> {
> - float128 ret = float128_sqrt(make_float128(ah, al), &env->fpu_status);
> + float128 ret = float128_sqrt(ARG128(a), &env->fpu_status);
> handle_exceptions(env, false, GETPC());
> return RET128(ret);
> }
> diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
> index d1ffbb8710..8023bbab2f 100644
> --- a/target/s390x/tcg/translate.c
> +++ b/target/s390x/tcg/translate.c
> @@ -305,6 +305,18 @@ static TCGv_i64 load_freg32_i64(int reg)
> return r;
> }
>
> +static TCGv_i128 load_freg_128(int reg)
> +{
> + TCGv_i64 h = load_freg(reg);
> + TCGv_i64 l = load_freg(reg + 2);
> + TCGv_i128 r = tcg_temp_new_i128();
> +
> + tcg_gen_concat_i64_i128(r, l, h);
> + tcg_temp_free_i64(h);
> + tcg_temp_free_i64(l);
> + return r;
> +}
> +
> static void store_reg(int reg, TCGv_i64 v)
> {
> tcg_gen_mov_i64(regs[reg], v);
> @@ -1103,7 +1115,7 @@ typedef struct {
> bool g_out, g_out2, g_in1, g_in2;
> TCGv_i64 out, out2, in1, in2;
> TCGv_i64 addr1;
> - TCGv_i128 out_128;
> + TCGv_i128 out_128, in1_128, in2_128;
> } DisasOps;
>
> /* Instructions can place constraints on their operands, raising specification
> @@ -1462,7 +1474,7 @@ static DisasJumpType op_adb(DisasContext *s, DisasOps *o)
>
> static DisasJumpType op_axb(DisasContext *s, DisasOps *o)
> {
> - gen_helper_axb(o->out_128, cpu_env, o->out, o->out2, o->in1, o->in2);
> + gen_helper_axb(o->out_128, cpu_env, o->in1_128, o->in2_128);
> return DISAS_NEXT;
> }
>
> @@ -1778,7 +1790,7 @@ static DisasJumpType op_cdb(DisasContext *s, DisasOps *o)
>
> static DisasJumpType op_cxb(DisasContext *s, DisasOps *o)
> {
> - gen_helper_cxb(cc_op, cpu_env, o->out, o->out2, o->in1, o->in2);
> + gen_helper_cxb(cc_op, cpu_env, o->in1_128, o->in2_128);
> set_cc_static(s);
> return DISAS_NEXT;
> }
> @@ -1841,7 +1853,7 @@ static DisasJumpType op_cfxb(DisasContext *s, DisasOps *o)
> if (!m34) {
> return DISAS_NORETURN;
> }
> - gen_helper_cfxb(o->out, cpu_env, o->in1, o->in2, m34);
> + gen_helper_cfxb(o->out, cpu_env, o->in2_128, m34);
> tcg_temp_free_i32(m34);
> set_cc_static(s);
> return DISAS_NEXT;
> @@ -1880,7 +1892,7 @@ static DisasJumpType op_cgxb(DisasContext *s, DisasOps *o)
> if (!m34) {
> return DISAS_NORETURN;
> }
> - gen_helper_cgxb(o->out, cpu_env, o->in1, o->in2, m34);
> + gen_helper_cgxb(o->out, cpu_env, o->in2_128, m34);
> tcg_temp_free_i32(m34);
> set_cc_static(s);
> return DISAS_NEXT;
> @@ -1919,7 +1931,7 @@ static DisasJumpType op_clfxb(DisasContext *s, DisasOps *o)
> if (!m34) {
> return DISAS_NORETURN;
> }
> - gen_helper_clfxb(o->out, cpu_env, o->in1, o->in2, m34);
> + gen_helper_clfxb(o->out, cpu_env, o->in2_128, m34);
> tcg_temp_free_i32(m34);
> set_cc_static(s);
> return DISAS_NEXT;
> @@ -1958,7 +1970,7 @@ static DisasJumpType op_clgxb(DisasContext *s, DisasOps *o)
> if (!m34) {
> return DISAS_NORETURN;
> }
> - gen_helper_clgxb(o->out, cpu_env, o->in1, o->in2, m34);
> + gen_helper_clgxb(o->out, cpu_env, o->in2_128, m34);
> tcg_temp_free_i32(m34);
> set_cc_static(s);
> return DISAS_NEXT;
> @@ -2448,7 +2460,7 @@ static DisasJumpType op_ddb(DisasContext *s, DisasOps *o)
>
> static DisasJumpType op_dxb(DisasContext *s, DisasOps *o)
> {
> - gen_helper_dxb(o->out_128, cpu_env, o->out, o->out2, o->in1, o->in2);
> + gen_helper_dxb(o->out_128, cpu_env, o->in1_128, o->in2_128);
> return DISAS_NEXT;
> }
>
> @@ -2553,7 +2565,7 @@ static DisasJumpType op_fixb(DisasContext *s, DisasOps *o)
> if (!m34) {
> return DISAS_NORETURN;
> }
> - gen_helper_fixb(o->out_128, cpu_env, o->in1, o->in2, m34);
> + gen_helper_fixb(o->out_128, cpu_env, o->in2_128, m34);
> tcg_temp_free_i32(m34);
> return DISAS_NEXT;
> }
> @@ -2772,7 +2784,7 @@ static DisasJumpType op_kdb(DisasContext *s, DisasOps *o)
>
> static DisasJumpType op_kxb(DisasContext *s, DisasOps *o)
> {
> - gen_helper_kxb(cc_op, cpu_env, o->out, o->out2, o->in1, o->in2);
> + gen_helper_kxb(cc_op, cpu_env, o->in1_128, o->in2_128);
> set_cc_static(s);
> return DISAS_NEXT;
> }
> @@ -2846,7 +2858,7 @@ static DisasJumpType op_ldxb(DisasContext *s, DisasOps *o)
> if (!m34) {
> return DISAS_NORETURN;
> }
> - gen_helper_ldxb(o->out, cpu_env, o->in1, o->in2, m34);
> + gen_helper_ldxb(o->out, cpu_env, o->in2_128, m34);
> tcg_temp_free_i32(m34);
> return DISAS_NEXT;
> }
> @@ -2858,7 +2870,7 @@ static DisasJumpType op_lexb(DisasContext *s, DisasOps *o)
> if (!m34) {
> return DISAS_NORETURN;
> }
> - gen_helper_lexb(o->out, cpu_env, o->in1, o->in2, m34);
> + gen_helper_lexb(o->out, cpu_env, o->in2_128, m34);
> tcg_temp_free_i32(m34);
> return DISAS_NEXT;
> }
> @@ -3586,13 +3598,13 @@ static DisasJumpType op_mdb(DisasContext *s, DisasOps *o)
>
> static DisasJumpType op_mxb(DisasContext *s, DisasOps *o)
> {
> - gen_helper_mxb(o->out_128, cpu_env, o->out, o->out2, o->in1, o->in2);
> + gen_helper_mxb(o->out_128, cpu_env, o->in1_128, o->in2_128);
> return DISAS_NEXT;
> }
>
> static DisasJumpType op_mxdb(DisasContext *s, DisasOps *o)
> {
> - gen_helper_mxdb(o->out_128, cpu_env, o->out, o->out2, o->in2);
> + gen_helper_mxdb(o->out_128, cpu_env, o->in1_128, o->in2);
> return DISAS_NEXT;
> }
>
> @@ -4057,7 +4069,7 @@ static DisasJumpType op_sdb(DisasContext *s, DisasOps *o)
>
> static DisasJumpType op_sxb(DisasContext *s, DisasOps *o)
> {
> - gen_helper_sxb(o->out_128, cpu_env, o->out, o->out2, o->in1, o->in2);
> + gen_helper_sxb(o->out_128, cpu_env, o->in1_128, o->in2_128);
> return DISAS_NEXT;
> }
>
> @@ -4075,7 +4087,7 @@ static DisasJumpType op_sqdb(DisasContext *s, DisasOps *o)
>
> static DisasJumpType op_sqxb(DisasContext *s, DisasOps *o)
> {
> - gen_helper_sqxb(o->out_128, cpu_env, o->in1, o->in2);
> + gen_helper_sqxb(o->out_128, cpu_env, o->in2_128);
> return DISAS_NEXT;
> }
>
> @@ -4854,7 +4866,7 @@ static DisasJumpType op_tcdb(DisasContext *s, DisasOps *o)
>
> static DisasJumpType op_tcxb(DisasContext *s, DisasOps *o)
> {
> - gen_helper_tcxb(cc_op, cpu_env, o->out, o->out2, o->in2);
> + gen_helper_tcxb(cc_op, cpu_env, o->in1_128, o->in2);
> set_cc_static(s);
> return DISAS_NEXT;
> }
> @@ -5389,8 +5401,6 @@ static void prep_new_P(DisasContext *s, DisasOps *o)
>
> static void prep_new_x(DisasContext *s, DisasOps *o)
> {
> - o->out = tcg_temp_new_i64();
> - o->out2 = tcg_temp_new_i64();
> o->out_128 = tcg_temp_new_i128();
> }
> #define SPEC_prep_new_x 0
> @@ -5413,10 +5423,7 @@ static void prep_r1_P(DisasContext *s, DisasOps *o)
>
> static void prep_x1(DisasContext *s, DisasOps *o)
> {
> - o->out = load_freg(get_field(s, r1));
> - o->out2 = load_freg(get_field(s, r1) + 2);
> - o->out_128 = tcg_temp_new_i128();
> - tcg_gen_concat_i64_i128(o->out_128, o->out2, o->out);
> + o->out_128 = load_freg_128(get_field(s, r1));
> }
> #define SPEC_prep_x1 SPEC_r1_f128
>
> @@ -5515,6 +5522,11 @@ static void wout_x1(DisasContext *s, DisasOps *o)
> {
> int f1 = get_field(s, r1);
>
> + /* Split out_128 into out+out2 for cout_f128. */
> + tcg_debug_assert(o->out == NULL);
> + o->out = tcg_temp_new_i64();
> + o->out2 = tcg_temp_new_i64();
> +
> tcg_gen_extr_i128_i64(o->out2, o->out, o->out_128);
> store_freg(f1, o->out);
> store_freg(f1 + 2, o->out2);
> @@ -5757,6 +5769,12 @@ static void in1_f1(DisasContext *s, DisasOps *o)
> }
> #define SPEC_in1_f1 0
>
> +static void in1_x1(DisasContext *s, DisasOps *o)
> +{
> + o->in1_128 = load_freg_128(get_field(s, r1));
> +}
> +#define SPEC_in1_x1 SPEC_r2_f128
> +
> /* Load the high double word of an extended (128-bit) format FP number */
> static void in1_x2h(DisasContext *s, DisasOps *o)
> {
> @@ -5966,6 +5984,12 @@ static void in2_f2(DisasContext *s, DisasOps *o)
> }
> #define SPEC_in2_f2 0
>
> +static void in2_x2(DisasContext *s, DisasOps *o)
> +{
> + o->in2_128 = load_freg_128(get_field(s, r2));
> +}
> +#define SPEC_in2_x2 SPEC_r2_f128
> +
> /* Load the low double word of an extended (128-bit) format FP number */
> static void in2_x2l(DisasContext *s, DisasOps *o)
> {
> @@ -6588,6 +6612,12 @@ static DisasJumpType translate_one(CPUS390XState *env, DisasContext *s)
> if (o.out_128) {
> tcg_temp_free_i128(o.out_128);
> }
> + if (o.in1_128) {
> + tcg_temp_free_i128(o.in1_128);
> + }
> + if (o.in2_128) {
> + tcg_temp_free_i128(o.in2_128);
> + }
> /* io should be the last instruction in tb when icount is enabled */
> if (unlikely(icount && ret == DISAS_NEXT)) {
> ret = DISAS_TOO_MANY;
> diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def
> index 20bf20c766..26523746d6 100644
> --- a/target/s390x/tcg/insn-data.def
> +++ b/target/s390x/tcg/insn-data.def
> @@ -34,7 +34,7 @@
> C(0xe318, AGF, RXY_a, Z, r1, m2_32s, r1, 0, add, adds64)
> F(0xb30a, AEBR, RRE, Z, e1, e2, new, e1, aeb, f32, IF_BFP)
> F(0xb31a, ADBR, RRE, Z, f1, f2, new, f1, adb, f64, IF_BFP)
> - F(0xb34a, AXBR, RRE, Z, x2h, x2l, x1, x1, axb, f128, IF_BFP)
> + F(0xb34a, AXBR, RRE, Z, x1, x2, new_x, x1, axb, f128, IF_BFP)
> F(0xed0a, AEB, RXE, Z, e1, m2_32u, new, e1, aeb, f32, IF_BFP)
> F(0xed1a, ADB, RXE, Z, f1, m2_64, new, f1, adb, f64, IF_BFP)
> /* ADD HIGH */
> @@ -172,13 +172,13 @@
> C(0xe330, CGF, RXY_a, Z, r1_o, m2_32s, 0, 0, 0, cmps64)
> F(0xb309, CEBR, RRE, Z, e1, e2, 0, 0, ceb, 0, IF_BFP)
> F(0xb319, CDBR, RRE, Z, f1, f2, 0, 0, cdb, 0, IF_BFP)
> - F(0xb349, CXBR, RRE, Z, x2h, x2l, x1, 0, cxb, 0, IF_BFP)
> + F(0xb349, CXBR, RRE, Z, x1, x2, 0, 0, cxb, 0, IF_BFP)
> F(0xed09, CEB, RXE, Z, e1, m2_32u, 0, 0, ceb, 0, IF_BFP)
> F(0xed19, CDB, RXE, Z, f1, m2_64, 0, 0, cdb, 0, IF_BFP)
> /* COMPARE AND SIGNAL */
> F(0xb308, KEBR, RRE, Z, e1, e2, 0, 0, keb, 0, IF_BFP)
> F(0xb318, KDBR, RRE, Z, f1, f2, 0, 0, kdb, 0, IF_BFP)
> - F(0xb348, KXBR, RRE, Z, x2h, x2l, x1, 0, kxb, 0, IF_BFP)
> + F(0xb348, KXBR, RRE, Z, x1, x2, 0, 0, kxb, 0, IF_BFP)
> F(0xed08, KEB, RXE, Z, e1, m2_32u, 0, 0, keb, 0, IF_BFP)
> F(0xed18, KDB, RXE, Z, f1, m2_64, 0, 0, kdb, 0, IF_BFP)
> /* COMPARE IMMEDIATE */
> @@ -299,10 +299,10 @@
> /* CONVERT TO FIXED */
> F(0xb398, CFEBR, RRF_e, Z, 0, e2, new, r1_32, cfeb, 0, IF_BFP)
> F(0xb399, CFDBR, RRF_e, Z, 0, f2, new, r1_32, cfdb, 0, IF_BFP)
> - F(0xb39a, CFXBR, RRF_e, Z, x2h, x2l, new, r1_32, cfxb, 0, IF_BFP)
> + F(0xb39a, CFXBR, RRF_e, Z, 0, x2, new, r1_32, cfxb, 0, IF_BFP)
> F(0xb3a8, CGEBR, RRF_e, Z, 0, e2, r1, 0, cgeb, 0, IF_BFP)
> F(0xb3a9, CGDBR, RRF_e, Z, 0, f2, r1, 0, cgdb, 0, IF_BFP)
> - F(0xb3aa, CGXBR, RRF_e, Z, x2h, x2l, r1, 0, cgxb, 0, IF_BFP)
> + F(0xb3aa, CGXBR, RRF_e, Z, 0, x2, r1, 0, cgxb, 0, IF_BFP)
> /* CONVERT FROM FIXED */
> F(0xb394, CEFBR, RRF_e, Z, 0, r2_32s, new, e1, cegb, 0, IF_BFP)
> F(0xb395, CDFBR, RRF_e, Z, 0, r2_32s, new, f1, cdgb, 0, IF_BFP)
> @@ -313,10 +313,10 @@
> /* CONVERT TO LOGICAL */
> F(0xb39c, CLFEBR, RRF_e, FPE, 0, e2, new, r1_32, clfeb, 0, IF_BFP)
> F(0xb39d, CLFDBR, RRF_e, FPE, 0, f2, new, r1_32, clfdb, 0, IF_BFP)
> - F(0xb39e, CLFXBR, RRF_e, FPE, x2h, x2l, new, r1_32, clfxb, 0, IF_BFP)
> + F(0xb39e, CLFXBR, RRF_e, FPE, 0, x2, new, r1_32, clfxb, 0, IF_BFP)
> F(0xb3ac, CLGEBR, RRF_e, FPE, 0, e2, r1, 0, clgeb, 0, IF_BFP)
> F(0xb3ad, CLGDBR, RRF_e, FPE, 0, f2, r1, 0, clgdb, 0, IF_BFP)
> - F(0xb3ae, CLGXBR, RRF_e, FPE, x2h, x2l, r1, 0, clgxb, 0, IF_BFP)
> + F(0xb3ae, CLGXBR, RRF_e, FPE, 0, x2, r1, 0, clgxb, 0, IF_BFP)
> /* CONVERT FROM LOGICAL */
> F(0xb390, CELFBR, RRF_e, FPE, 0, r2_32u, new, e1, celgb, 0, IF_BFP)
> F(0xb391, CDLFBR, RRF_e, FPE, 0, r2_32u, new, f1, cdlgb, 0, IF_BFP)
> @@ -343,7 +343,7 @@
> C(0x5d00, D, RX_a, Z, r1_D32, m2_32s, new_P, r1_P32, divs32, 0)
> F(0xb30d, DEBR, RRE, Z, e1, e2, new, e1, deb, 0, IF_BFP)
> F(0xb31d, DDBR, RRE, Z, f1, f2, new, f1, ddb, 0, IF_BFP)
> - F(0xb34d, DXBR, RRE, Z, x2h, x2l, x1, x1, dxb, 0, IF_BFP)
> + F(0xb34d, DXBR, RRE, Z, x1, x2, new_x, x1, dxb, 0, IF_BFP)
> F(0xed0d, DEB, RXE, Z, e1, m2_32u, new, e1, deb, 0, IF_BFP)
> F(0xed1d, DDB, RXE, Z, f1, m2_64, new, f1, ddb, 0, IF_BFP)
> /* DIVIDE LOGICAL */
> @@ -597,7 +597,7 @@
> /* LOAD FP INTEGER */
> F(0xb357, FIEBR, RRF_e, Z, 0, e2, new, e1, fieb, 0, IF_BFP)
> F(0xb35f, FIDBR, RRF_e, Z, 0, f2, new, f1, fidb, 0, IF_BFP)
> - F(0xb347, FIXBR, RRF_e, Z, x2h, x2l, new_x, x1, fixb, 0, IF_BFP)
> + F(0xb347, FIXBR, RRF_e, Z, 0, x2, new_x, x1, fixb, 0, IF_BFP)
>
> /* LOAD LENGTHENED */
> F(0xb304, LDEBR, RRE, Z, 0, e2, new, f1, ldeb, 0, IF_BFP)
> @@ -610,8 +610,8 @@
> F(0xed24, LDE, RXE, Z, 0, m2_32u, new, f1, lde, 0, IF_AFP1)
> /* LOAD ROUNDED */
> F(0xb344, LEDBR, RRF_e, Z, 0, f2, new, e1, ledb, 0, IF_BFP)
> - F(0xb345, LDXBR, RRF_e, Z, x2h, x2l, new, f1, ldxb, 0, IF_BFP)
> - F(0xb346, LEXBR, RRF_e, Z, x2h, x2l, new, e1, lexb, 0, IF_BFP)
> + F(0xb345, LDXBR, RRF_e, Z, 0, x2, new, f1, ldxb, 0, IF_BFP)
> + F(0xb346, LEXBR, RRF_e, Z, 0, x2, new, e1, lexb, 0, IF_BFP)
>
> /* LOAD MULTIPLE */
> C(0x9800, LM, RS_a, Z, 0, a2, 0, 0, lm32, 0)
> @@ -666,7 +666,7 @@
> C(0xe384, MG, RXY_a, MIE2,r1p1_o, m2_64, r1_P, 0, muls128, 0)
> F(0xb317, MEEBR, RRE, Z, e1, e2, new, e1, meeb, 0, IF_BFP)
> F(0xb31c, MDBR, RRE, Z, f1, f2, new, f1, mdb, 0, IF_BFP)
> - F(0xb34c, MXBR, RRE, Z, x2h, x2l, x1, x1, mxb, 0, IF_BFP)
> + F(0xb34c, MXBR, RRE, Z, x1, x2, new_x, x1, mxb, 0, IF_BFP)
> F(0xb30c, MDEBR, RRE, Z, f1, e2, new, f1, mdeb, 0, IF_BFP)
> F(0xb307, MXDBR, RRE, Z, 0, f2, x1, x1, mxdb, 0, IF_BFP)
> F(0xed17, MEEB, RXE, Z, e1, m2_32u, new, e1, meeb, 0, IF_BFP)
> @@ -835,7 +835,7 @@
> /* SQUARE ROOT */
> F(0xb314, SQEBR, RRE, Z, 0, e2, new, e1, sqeb, 0, IF_BFP)
> F(0xb315, SQDBR, RRE, Z, 0, f2, new, f1, sqdb, 0, IF_BFP)
> - F(0xb316, SQXBR, RRE, Z, x2h, x2l, new_x, x1, sqxb, 0, IF_BFP)
> + F(0xb316, SQXBR, RRE, Z, 0, x2, new_x, x1, sqxb, 0, IF_BFP)
> F(0xed14, SQEB, RXE, Z, 0, m2_32u, new, e1, sqeb, 0, IF_BFP)
> F(0xed15, SQDB, RXE, Z, 0, m2_64, new, f1, sqdb, 0, IF_BFP)
>
> @@ -913,7 +913,7 @@
> C(0xe319, SGF, RXY_a, Z, r1, m2_32s, r1, 0, sub, subs64)
> F(0xb30b, SEBR, RRE, Z, e1, e2, new, e1, seb, f32, IF_BFP)
> F(0xb31b, SDBR, RRE, Z, f1, f2, new, f1, sdb, f64, IF_BFP)
> - F(0xb34b, SXBR, RRE, Z, x2h, x2l, x1, x1, sxb, f128, IF_BFP)
> + F(0xb34b, SXBR, RRE, Z, x1, x2, new_x, x1, sxb, f128, IF_BFP)
> F(0xed0b, SEB, RXE, Z, e1, m2_32u, new, e1, seb, f32, IF_BFP)
> F(0xed1b, SDB, RXE, Z, f1, m2_64, new, f1, sdb, f64, IF_BFP)
> /* SUBTRACT HALFWORD */
> @@ -957,7 +957,7 @@
> /* TEST DATA CLASS */
> F(0xed10, TCEB, RXE, Z, e1, a2, 0, 0, tceb, 0, IF_BFP)
> F(0xed11, TCDB, RXE, Z, f1, a2, 0, 0, tcdb, 0, IF_BFP)
> - F(0xed12, TCXB, RXE, Z, 0, a2, x1, 0, tcxb, 0, IF_BFP)
> + F(0xed12, TCXB, RXE, Z, x1, a2, 0, 0, tcxb, 0, IF_BFP)
>
> /* TEST DECIMAL */
> C(0xebc0, TP, RSL, E2, la1, 0, 0, 0, tp, 0)
> --
> 2.34.1
>
>
Hi,
I ran valgrind's testsuite with this patch, and their bpf-4 test
triggered an assertion in the
(insn->spec & SPEC_r2_f128 && !is_fp_pair(get_field(s, r2)))
condition. The following fixup helped:
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -5771,14 +5771,14 @@ static void in1_x1(DisasContext *s, DisasOps *o)
{
o->in1_128 = load_freg_128(get_field(s, r1));
}
-#define SPEC_in1_x1 SPEC_r2_f128
+#define SPEC_in1_x1 SPEC_r1_f128
/* Load the high double word of an extended (128-bit) format FP number */
static void in1_x2h(DisasContext *s, DisasOps *o)
{
o->in1 = load_freg(get_field(s, r2));
}
-#define SPEC_in1_x2h SPEC_r2_f128
+#define SPEC_in1_x2h SPEC_r1_f128
static void in1_f3(DisasContext *s, DisasOps *o)
{
Best regards,
Ilya
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 9/9] target/s390x: Use Int128 for passing float128
2022-11-02 9:38 ` Ilya Leoshkevich
@ 2022-11-02 9:47 ` Richard Henderson
2022-11-02 10:05 ` Ilya Leoshkevich
0 siblings, 1 reply; 35+ messages in thread
From: Richard Henderson @ 2022-11-02 9:47 UTC (permalink / raw)
To: Ilya Leoshkevich, qemu-devel; +Cc: qemu-s390x
On 11/2/22 20:38, Ilya Leoshkevich wrote:
> On Fri, Oct 21, 2022 at 05:30:06PM +1000, Richard Henderson wrote:
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>> target/s390x/helper.h | 32 ++++++-------
>> target/s390x/tcg/fpu_helper.c | 88 ++++++++++++++--------------------
>> target/s390x/tcg/translate.c | 76 ++++++++++++++++++++---------
>> target/s390x/tcg/insn-data.def | 30 ++++++------
>> 4 files changed, 121 insertions(+), 105 deletions(-)
>>
>> diff --git a/target/s390x/helper.h b/target/s390x/helper.h
>> index 429131a85e..481b9019f9 100644
>> --- a/target/s390x/helper.h
>> +++ b/target/s390x/helper.h
>> @@ -41,55 +41,55 @@ DEF_HELPER_4(csst, i32, env, i32, i64, i64)
>> DEF_HELPER_4(csst_parallel, i32, env, i32, i64, i64)
>> DEF_HELPER_FLAGS_3(aeb, TCG_CALL_NO_WG, i64, env, i64, i64)
>> DEF_HELPER_FLAGS_3(adb, TCG_CALL_NO_WG, i64, env, i64, i64)
>> -DEF_HELPER_FLAGS_5(axb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64)
>> +DEF_HELPER_FLAGS_3(axb, TCG_CALL_NO_WG, i128, env, i128, i128)
>> DEF_HELPER_FLAGS_3(seb, TCG_CALL_NO_WG, i64, env, i64, i64)
>> DEF_HELPER_FLAGS_3(sdb, TCG_CALL_NO_WG, i64, env, i64, i64)
>> -DEF_HELPER_FLAGS_5(sxb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64)
>> +DEF_HELPER_FLAGS_3(sxb, TCG_CALL_NO_WG, i128, env, i128, i128)
>> DEF_HELPER_FLAGS_3(deb, TCG_CALL_NO_WG, i64, env, i64, i64)
>> DEF_HELPER_FLAGS_3(ddb, TCG_CALL_NO_WG, i64, env, i64, i64)
>> -DEF_HELPER_FLAGS_5(dxb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64)
>> +DEF_HELPER_FLAGS_3(dxb, TCG_CALL_NO_WG, i128, env, i128, i128)
>> DEF_HELPER_FLAGS_3(meeb, TCG_CALL_NO_WG, i64, env, i64, i64)
>> DEF_HELPER_FLAGS_3(mdeb, TCG_CALL_NO_WG, i64, env, i64, i64)
>> DEF_HELPER_FLAGS_3(mdb, TCG_CALL_NO_WG, i64, env, i64, i64)
>> -DEF_HELPER_FLAGS_5(mxb, TCG_CALL_NO_WG, i128, env, i64, i64, i64, i64)
>> -DEF_HELPER_FLAGS_4(mxdb, TCG_CALL_NO_WG, i128, env, i64, i64, i64)
>> +DEF_HELPER_FLAGS_3(mxb, TCG_CALL_NO_WG, i128, env, i128, i128)
>> +DEF_HELPER_FLAGS_3(mxdb, TCG_CALL_NO_WG, i128, env, i128, i64)
>> DEF_HELPER_FLAGS_2(ldeb, TCG_CALL_NO_WG, i64, env, i64)
>> -DEF_HELPER_FLAGS_4(ldxb, TCG_CALL_NO_WG, i64, env, i64, i64, i32)
>> +DEF_HELPER_FLAGS_3(ldxb, TCG_CALL_NO_WG, i64, env, i128, i32)
>> DEF_HELPER_FLAGS_2(lxdb, TCG_CALL_NO_WG, i128, env, i64)
>> DEF_HELPER_FLAGS_2(lxeb, TCG_CALL_NO_WG, i128, env, i64)
>> DEF_HELPER_FLAGS_3(ledb, TCG_CALL_NO_WG, i64, env, i64, i32)
>> -DEF_HELPER_FLAGS_4(lexb, TCG_CALL_NO_WG, i64, env, i64, i64, i32)
>> +DEF_HELPER_FLAGS_3(lexb, TCG_CALL_NO_WG, i64, env, i128, i32)
>> DEF_HELPER_FLAGS_3(ceb, TCG_CALL_NO_WG_SE, i32, env, i64, i64)
>> DEF_HELPER_FLAGS_3(cdb, TCG_CALL_NO_WG_SE, i32, env, i64, i64)
>> -DEF_HELPER_FLAGS_5(cxb, TCG_CALL_NO_WG_SE, i32, env, i64, i64, i64, i64)
>> +DEF_HELPER_FLAGS_3(cxb, TCG_CALL_NO_WG_SE, i32, env, i128, i128)
>> DEF_HELPER_FLAGS_3(keb, TCG_CALL_NO_WG, i32, env, i64, i64)
>> DEF_HELPER_FLAGS_3(kdb, TCG_CALL_NO_WG, i32, env, i64, i64)
>> -DEF_HELPER_FLAGS_5(kxb, TCG_CALL_NO_WG, i32, env, i64, i64, i64, i64)
>> +DEF_HELPER_FLAGS_3(kxb, TCG_CALL_NO_WG, i32, env, i128, i128)
>> DEF_HELPER_3(cgeb, i64, env, i64, i32)
>> DEF_HELPER_3(cgdb, i64, env, i64, i32)
>> -DEF_HELPER_4(cgxb, i64, env, i64, i64, i32)
>> +DEF_HELPER_3(cgxb, i64, env, i128, i32)
>> DEF_HELPER_3(cfeb, i64, env, i64, i32)
>> DEF_HELPER_3(cfdb, i64, env, i64, i32)
>> -DEF_HELPER_4(cfxb, i64, env, i64, i64, i32)
>> +DEF_HELPER_3(cfxb, i64, env, i128, i32)
>> DEF_HELPER_3(clgeb, i64, env, i64, i32)
>> DEF_HELPER_3(clgdb, i64, env, i64, i32)
>> -DEF_HELPER_4(clgxb, i64, env, i64, i64, i32)
>> +DEF_HELPER_3(clgxb, i64, env, i128, i32)
>> DEF_HELPER_3(clfeb, i64, env, i64, i32)
>> DEF_HELPER_3(clfdb, i64, env, i64, i32)
>> -DEF_HELPER_4(clfxb, i64, env, i64, i64, i32)
>> +DEF_HELPER_3(clfxb, i64, env, i128, i32)
>> DEF_HELPER_FLAGS_3(fieb, TCG_CALL_NO_WG, i64, env, i64, i32)
>> DEF_HELPER_FLAGS_3(fidb, TCG_CALL_NO_WG, i64, env, i64, i32)
>> -DEF_HELPER_FLAGS_4(fixb, TCG_CALL_NO_WG, i128, env, i64, i64, i32)
>> +DEF_HELPER_FLAGS_3(fixb, TCG_CALL_NO_WG, i128, env, i128, i32)
>> DEF_HELPER_FLAGS_4(maeb, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
>> DEF_HELPER_FLAGS_4(madb, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
>> DEF_HELPER_FLAGS_4(mseb, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
>> DEF_HELPER_FLAGS_4(msdb, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
>> DEF_HELPER_FLAGS_3(tceb, TCG_CALL_NO_RWG_SE, i32, env, i64, i64)
>> DEF_HELPER_FLAGS_3(tcdb, TCG_CALL_NO_RWG_SE, i32, env, i64, i64)
>> -DEF_HELPER_FLAGS_4(tcxb, TCG_CALL_NO_RWG_SE, i32, env, i64, i64, i64)
>> +DEF_HELPER_FLAGS_3(tcxb, TCG_CALL_NO_RWG_SE, i32, env, i128, i64)
>> DEF_HELPER_FLAGS_2(sqeb, TCG_CALL_NO_WG, i64, env, i64)
>> DEF_HELPER_FLAGS_2(sqdb, TCG_CALL_NO_WG, i64, env, i64)
>> -DEF_HELPER_FLAGS_3(sqxb, TCG_CALL_NO_WG, i128, env, i64, i64)
>> +DEF_HELPER_FLAGS_2(sqxb, TCG_CALL_NO_WG, i128, env, i128)
>> DEF_HELPER_FLAGS_1(cvd, TCG_CALL_NO_RWG_SE, i64, s32)
>> DEF_HELPER_FLAGS_4(pack, TCG_CALL_NO_WG, void, env, i32, i64, i64)
>> DEF_HELPER_FLAGS_4(pka, TCG_CALL_NO_WG, void, env, i64, i64, i32)
>> diff --git a/target/s390x/tcg/fpu_helper.c b/target/s390x/tcg/fpu_helper.c
>> index a584794be6..5a322e3f87 100644
>> --- a/target/s390x/tcg/fpu_helper.c
>> +++ b/target/s390x/tcg/fpu_helper.c
>> @@ -39,6 +39,11 @@ static inline Int128 RET128(float128 f)
>> return int128_make128(f.low, f.high);
>> }
>>
>> +static inline float128 ARG128(Int128 i)
>> +{
>> + return make_float128(int128_gethi(i), int128_getlo(i));
>> +}
>> +
>> uint8_t s390_softfloat_exc_to_ieee(unsigned int exc)
>> {
>> uint8_t s390_exc = 0;
>> @@ -227,12 +232,9 @@ uint64_t HELPER(adb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
>> }
>>
>> /* 128-bit FP addition */
>> -Int128 HELPER(axb)(CPUS390XState *env, uint64_t ah, uint64_t al,
>> - uint64_t bh, uint64_t bl)
>> +Int128 HELPER(axb)(CPUS390XState *env, Int128 a, Int128 b)
>> {
>> - float128 ret = float128_add(make_float128(ah, al),
>> - make_float128(bh, bl),
>> - &env->fpu_status);
>> + float128 ret = float128_add(ARG128(a), ARG128(b), &env->fpu_status);
>> handle_exceptions(env, false, GETPC());
>> return RET128(ret);
>> }
>> @@ -254,12 +256,9 @@ uint64_t HELPER(sdb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
>> }
>>
>> /* 128-bit FP subtraction */
>> -Int128 HELPER(sxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
>> - uint64_t bh, uint64_t bl)
>> +Int128 HELPER(sxb)(CPUS390XState *env, Int128 a, Int128 b)
>> {
>> - float128 ret = float128_sub(make_float128(ah, al),
>> - make_float128(bh, bl),
>> - &env->fpu_status);
>> + float128 ret = float128_sub(ARG128(a), ARG128(b), &env->fpu_status);
>> handle_exceptions(env, false, GETPC());
>> return RET128(ret);
>> }
>> @@ -281,12 +280,9 @@ uint64_t HELPER(ddb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
>> }
>>
>> /* 128-bit FP division */
>> -Int128 HELPER(dxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
>> - uint64_t bh, uint64_t bl)
>> +Int128 HELPER(dxb)(CPUS390XState *env, Int128 a, Int128 b)
>> {
>> - float128 ret = float128_div(make_float128(ah, al),
>> - make_float128(bh, bl),
>> - &env->fpu_status);
>> + float128 ret = float128_div(ARG128(a), ARG128(b), &env->fpu_status);
>> handle_exceptions(env, false, GETPC());
>> return RET128(ret);
>> }
>> @@ -317,21 +313,18 @@ uint64_t HELPER(mdeb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
>> }
>>
>> /* 128-bit FP multiplication */
>> -Int128 HELPER(mxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
>> - uint64_t bh, uint64_t bl)
>> +Int128 HELPER(mxb)(CPUS390XState *env, Int128 a, Int128 b)
>> {
>> - float128 ret = float128_mul(make_float128(ah, al),
>> - make_float128(bh, bl),
>> - &env->fpu_status);
>> + float128 ret = float128_mul(ARG128(a), ARG128(b), &env->fpu_status);
>> handle_exceptions(env, false, GETPC());
>> return RET128(ret);
>> }
>>
>> /* 128/64-bit FP multiplication */
>> -Int128 HELPER(mxdb)(CPUS390XState *env, uint64_t ah, uint64_t al, uint64_t f2)
>> +Int128 HELPER(mxdb)(CPUS390XState *env, Int128 a, uint64_t f2)
>> {
>> float128 ret = float64_to_float128(f2, &env->fpu_status);
>> - ret = float128_mul(make_float128(ah, al), ret, &env->fpu_status);
>> + ret = float128_mul(ARG128(a), ret, &env->fpu_status);
>> handle_exceptions(env, false, GETPC());
>> return RET128(ret);
>> }
>> @@ -345,11 +338,10 @@ uint64_t HELPER(ldeb)(CPUS390XState *env, uint64_t f2)
>> }
>>
>> /* convert 128-bit float to 64-bit float */
>> -uint64_t HELPER(ldxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
>> - uint32_t m34)
>> +uint64_t HELPER(ldxb)(CPUS390XState *env, Int128 a, uint32_t m34)
>> {
>> int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
>> - float64 ret = float128_to_float64(make_float128(ah, al), &env->fpu_status);
>> + float64 ret = float128_to_float64(ARG128(a), &env->fpu_status);
>>
>> s390_restore_bfp_rounding_mode(env, old_mode);
>> handle_exceptions(env, xxc_from_m34(m34), GETPC());
>> @@ -384,11 +376,10 @@ uint64_t HELPER(ledb)(CPUS390XState *env, uint64_t f2, uint32_t m34)
>> }
>>
>> /* convert 128-bit float to 32-bit float */
>> -uint64_t HELPER(lexb)(CPUS390XState *env, uint64_t ah, uint64_t al,
>> - uint32_t m34)
>> +uint64_t HELPER(lexb)(CPUS390XState *env, Int128 a, uint32_t m34)
>> {
>> int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
>> - float32 ret = float128_to_float32(make_float128(ah, al), &env->fpu_status);
>> + float32 ret = float128_to_float32(ARG128(a), &env->fpu_status);
>>
>> s390_restore_bfp_rounding_mode(env, old_mode);
>> handle_exceptions(env, xxc_from_m34(m34), GETPC());
>> @@ -412,11 +403,9 @@ uint32_t HELPER(cdb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
>> }
>>
>> /* 128-bit FP compare */
>> -uint32_t HELPER(cxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
>> - uint64_t bh, uint64_t bl)
>> +uint32_t HELPER(cxb)(CPUS390XState *env, Int128 a, Int128 b)
>> {
>> - FloatRelation cmp = float128_compare_quiet(make_float128(ah, al),
>> - make_float128(bh, bl),
>> + FloatRelation cmp = float128_compare_quiet(ARG128(a), ARG128(b),
>> &env->fpu_status);
>> handle_exceptions(env, false, GETPC());
>> return float_comp_to_cc(env, cmp);
>> @@ -564,10 +553,10 @@ uint64_t HELPER(cgdb)(CPUS390XState *env, uint64_t v2, uint32_t m34)
>> }
>>
>> /* convert 128-bit float to 64-bit int */
>> -uint64_t HELPER(cgxb)(CPUS390XState *env, uint64_t h, uint64_t l, uint32_t m34)
>> +uint64_t HELPER(cgxb)(CPUS390XState *env, Int128 i2, uint32_t m34)
>> {
>> int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
>> - float128 v2 = make_float128(h, l);
>> + float128 v2 = ARG128(i2);
>> int64_t ret = float128_to_int64(v2, &env->fpu_status);
>> uint32_t cc = set_cc_conv_f128(v2, &env->fpu_status);
>>
>> @@ -613,10 +602,10 @@ uint64_t HELPER(cfdb)(CPUS390XState *env, uint64_t v2, uint32_t m34)
>> }
>>
>> /* convert 128-bit float to 32-bit int */
>> -uint64_t HELPER(cfxb)(CPUS390XState *env, uint64_t h, uint64_t l, uint32_t m34)
>> +uint64_t HELPER(cfxb)(CPUS390XState *env, Int128 i2, uint32_t m34)
>> {
>> int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
>> - float128 v2 = make_float128(h, l);
>> + float128 v2 = ARG128(i2);
>> int32_t ret = float128_to_int32(v2, &env->fpu_status);
>> uint32_t cc = set_cc_conv_f128(v2, &env->fpu_status);
>>
>> @@ -662,10 +651,10 @@ uint64_t HELPER(clgdb)(CPUS390XState *env, uint64_t v2, uint32_t m34)
>> }
>>
>> /* convert 128-bit float to 64-bit uint */
>> -uint64_t HELPER(clgxb)(CPUS390XState *env, uint64_t h, uint64_t l, uint32_t m34)
>> +uint64_t HELPER(clgxb)(CPUS390XState *env, Int128 i2, uint32_t m34)
>> {
>> int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
>> - float128 v2 = make_float128(h, l);
>> + float128 v2 = ARG128(i2);
>> uint64_t ret = float128_to_uint64(v2, &env->fpu_status);
>> uint32_t cc = set_cc_conv_f128(v2, &env->fpu_status);
>>
>> @@ -711,10 +700,10 @@ uint64_t HELPER(clfdb)(CPUS390XState *env, uint64_t v2, uint32_t m34)
>> }
>>
>> /* convert 128-bit float to 32-bit uint */
>> -uint64_t HELPER(clfxb)(CPUS390XState *env, uint64_t h, uint64_t l, uint32_t m34)
>> +uint64_t HELPER(clfxb)(CPUS390XState *env, Int128 i2, uint32_t m34)
>> {
>> int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
>> - float128 v2 = make_float128(h, l);
>> + float128 v2 = ARG128(i2);
>> uint32_t ret = float128_to_uint32(v2, &env->fpu_status);
>> uint32_t cc = set_cc_conv_f128(v2, &env->fpu_status);
>>
>> @@ -750,11 +739,10 @@ uint64_t HELPER(fidb)(CPUS390XState *env, uint64_t f2, uint32_t m34)
>> }
>>
>> /* round to integer 128-bit */
>> -Int128 HELPER(fixb)(CPUS390XState *env, uint64_t ah, uint64_t al, uint32_t m34)
>> +Int128 HELPER(fixb)(CPUS390XState *env, Int128 a, uint32_t m34)
>> {
>> int old_mode = s390_swap_bfp_rounding_mode(env, round_from_m34(m34));
>> - float128 ret = float128_round_to_int(make_float128(ah, al),
>> - &env->fpu_status);
>> + float128 ret = float128_round_to_int(ARG128(a), &env->fpu_status);
>>
>> s390_restore_bfp_rounding_mode(env, old_mode);
>> handle_exceptions(env, xxc_from_m34(m34), GETPC());
>> @@ -778,11 +766,9 @@ uint32_t HELPER(kdb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
>> }
>>
>> /* 128-bit FP compare and signal */
>> -uint32_t HELPER(kxb)(CPUS390XState *env, uint64_t ah, uint64_t al,
>> - uint64_t bh, uint64_t bl)
>> +uint32_t HELPER(kxb)(CPUS390XState *env, Int128 a, Int128 b)
>> {
>> - FloatRelation cmp = float128_compare(make_float128(ah, al),
>> - make_float128(bh, bl),
>> + FloatRelation cmp = float128_compare(ARG128(a), ARG128(b),
>> &env->fpu_status);
>> handle_exceptions(env, false, GETPC());
>> return float_comp_to_cc(env, cmp);
>> @@ -869,9 +855,9 @@ uint32_t HELPER(tcdb)(CPUS390XState *env, uint64_t v1, uint64_t m2)
>> }
>>
>> /* test data class 128-bit */
>> -uint32_t HELPER(tcxb)(CPUS390XState *env, uint64_t ah, uint64_t al, uint64_t m2)
>> +uint32_t HELPER(tcxb)(CPUS390XState *env, Int128 a, uint64_t m2)
>> {
>> - return (m2 & float128_dcmask(env, make_float128(ah, al))) != 0;
>> + return (m2 & float128_dcmask(env, ARG128(a))) != 0;
>> }
>>
>> /* square root 32-bit */
>> @@ -891,9 +877,9 @@ uint64_t HELPER(sqdb)(CPUS390XState *env, uint64_t f2)
>> }
>>
>> /* square root 128-bit */
>> -Int128 HELPER(sqxb)(CPUS390XState *env, uint64_t ah, uint64_t al)
>> +Int128 HELPER(sqxb)(CPUS390XState *env, Int128 a)
>> {
>> - float128 ret = float128_sqrt(make_float128(ah, al), &env->fpu_status);
>> + float128 ret = float128_sqrt(ARG128(a), &env->fpu_status);
>> handle_exceptions(env, false, GETPC());
>> return RET128(ret);
>> }
>> diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
>> index d1ffbb8710..8023bbab2f 100644
>> --- a/target/s390x/tcg/translate.c
>> +++ b/target/s390x/tcg/translate.c
>> @@ -305,6 +305,18 @@ static TCGv_i64 load_freg32_i64(int reg)
>> return r;
>> }
>>
>> +static TCGv_i128 load_freg_128(int reg)
>> +{
>> + TCGv_i64 h = load_freg(reg);
>> + TCGv_i64 l = load_freg(reg + 2);
>> + TCGv_i128 r = tcg_temp_new_i128();
>> +
>> + tcg_gen_concat_i64_i128(r, l, h);
>> + tcg_temp_free_i64(h);
>> + tcg_temp_free_i64(l);
>> + return r;
>> +}
>> +
>> static void store_reg(int reg, TCGv_i64 v)
>> {
>> tcg_gen_mov_i64(regs[reg], v);
>> @@ -1103,7 +1115,7 @@ typedef struct {
>> bool g_out, g_out2, g_in1, g_in2;
>> TCGv_i64 out, out2, in1, in2;
>> TCGv_i64 addr1;
>> - TCGv_i128 out_128;
>> + TCGv_i128 out_128, in1_128, in2_128;
>> } DisasOps;
>>
>> /* Instructions can place constraints on their operands, raising specification
>> @@ -1462,7 +1474,7 @@ static DisasJumpType op_adb(DisasContext *s, DisasOps *o)
>>
>> static DisasJumpType op_axb(DisasContext *s, DisasOps *o)
>> {
>> - gen_helper_axb(o->out_128, cpu_env, o->out, o->out2, o->in1, o->in2);
>> + gen_helper_axb(o->out_128, cpu_env, o->in1_128, o->in2_128);
>> return DISAS_NEXT;
>> }
>>
>> @@ -1778,7 +1790,7 @@ static DisasJumpType op_cdb(DisasContext *s, DisasOps *o)
>>
>> static DisasJumpType op_cxb(DisasContext *s, DisasOps *o)
>> {
>> - gen_helper_cxb(cc_op, cpu_env, o->out, o->out2, o->in1, o->in2);
>> + gen_helper_cxb(cc_op, cpu_env, o->in1_128, o->in2_128);
>> set_cc_static(s);
>> return DISAS_NEXT;
>> }
>> @@ -1841,7 +1853,7 @@ static DisasJumpType op_cfxb(DisasContext *s, DisasOps *o)
>> if (!m34) {
>> return DISAS_NORETURN;
>> }
>> - gen_helper_cfxb(o->out, cpu_env, o->in1, o->in2, m34);
>> + gen_helper_cfxb(o->out, cpu_env, o->in2_128, m34);
>> tcg_temp_free_i32(m34);
>> set_cc_static(s);
>> return DISAS_NEXT;
>> @@ -1880,7 +1892,7 @@ static DisasJumpType op_cgxb(DisasContext *s, DisasOps *o)
>> if (!m34) {
>> return DISAS_NORETURN;
>> }
>> - gen_helper_cgxb(o->out, cpu_env, o->in1, o->in2, m34);
>> + gen_helper_cgxb(o->out, cpu_env, o->in2_128, m34);
>> tcg_temp_free_i32(m34);
>> set_cc_static(s);
>> return DISAS_NEXT;
>> @@ -1919,7 +1931,7 @@ static DisasJumpType op_clfxb(DisasContext *s, DisasOps *o)
>> if (!m34) {
>> return DISAS_NORETURN;
>> }
>> - gen_helper_clfxb(o->out, cpu_env, o->in1, o->in2, m34);
>> + gen_helper_clfxb(o->out, cpu_env, o->in2_128, m34);
>> tcg_temp_free_i32(m34);
>> set_cc_static(s);
>> return DISAS_NEXT;
>> @@ -1958,7 +1970,7 @@ static DisasJumpType op_clgxb(DisasContext *s, DisasOps *o)
>> if (!m34) {
>> return DISAS_NORETURN;
>> }
>> - gen_helper_clgxb(o->out, cpu_env, o->in1, o->in2, m34);
>> + gen_helper_clgxb(o->out, cpu_env, o->in2_128, m34);
>> tcg_temp_free_i32(m34);
>> set_cc_static(s);
>> return DISAS_NEXT;
>> @@ -2448,7 +2460,7 @@ static DisasJumpType op_ddb(DisasContext *s, DisasOps *o)
>>
>> static DisasJumpType op_dxb(DisasContext *s, DisasOps *o)
>> {
>> - gen_helper_dxb(o->out_128, cpu_env, o->out, o->out2, o->in1, o->in2);
>> + gen_helper_dxb(o->out_128, cpu_env, o->in1_128, o->in2_128);
>> return DISAS_NEXT;
>> }
>>
>> @@ -2553,7 +2565,7 @@ static DisasJumpType op_fixb(DisasContext *s, DisasOps *o)
>> if (!m34) {
>> return DISAS_NORETURN;
>> }
>> - gen_helper_fixb(o->out_128, cpu_env, o->in1, o->in2, m34);
>> + gen_helper_fixb(o->out_128, cpu_env, o->in2_128, m34);
>> tcg_temp_free_i32(m34);
>> return DISAS_NEXT;
>> }
>> @@ -2772,7 +2784,7 @@ static DisasJumpType op_kdb(DisasContext *s, DisasOps *o)
>>
>> static DisasJumpType op_kxb(DisasContext *s, DisasOps *o)
>> {
>> - gen_helper_kxb(cc_op, cpu_env, o->out, o->out2, o->in1, o->in2);
>> + gen_helper_kxb(cc_op, cpu_env, o->in1_128, o->in2_128);
>> set_cc_static(s);
>> return DISAS_NEXT;
>> }
>> @@ -2846,7 +2858,7 @@ static DisasJumpType op_ldxb(DisasContext *s, DisasOps *o)
>> if (!m34) {
>> return DISAS_NORETURN;
>> }
>> - gen_helper_ldxb(o->out, cpu_env, o->in1, o->in2, m34);
>> + gen_helper_ldxb(o->out, cpu_env, o->in2_128, m34);
>> tcg_temp_free_i32(m34);
>> return DISAS_NEXT;
>> }
>> @@ -2858,7 +2870,7 @@ static DisasJumpType op_lexb(DisasContext *s, DisasOps *o)
>> if (!m34) {
>> return DISAS_NORETURN;
>> }
>> - gen_helper_lexb(o->out, cpu_env, o->in1, o->in2, m34);
>> + gen_helper_lexb(o->out, cpu_env, o->in2_128, m34);
>> tcg_temp_free_i32(m34);
>> return DISAS_NEXT;
>> }
>> @@ -3586,13 +3598,13 @@ static DisasJumpType op_mdb(DisasContext *s, DisasOps *o)
>>
>> static DisasJumpType op_mxb(DisasContext *s, DisasOps *o)
>> {
>> - gen_helper_mxb(o->out_128, cpu_env, o->out, o->out2, o->in1, o->in2);
>> + gen_helper_mxb(o->out_128, cpu_env, o->in1_128, o->in2_128);
>> return DISAS_NEXT;
>> }
>>
>> static DisasJumpType op_mxdb(DisasContext *s, DisasOps *o)
>> {
>> - gen_helper_mxdb(o->out_128, cpu_env, o->out, o->out2, o->in2);
>> + gen_helper_mxdb(o->out_128, cpu_env, o->in1_128, o->in2);
>> return DISAS_NEXT;
>> }
>>
>> @@ -4057,7 +4069,7 @@ static DisasJumpType op_sdb(DisasContext *s, DisasOps *o)
>>
>> static DisasJumpType op_sxb(DisasContext *s, DisasOps *o)
>> {
>> - gen_helper_sxb(o->out_128, cpu_env, o->out, o->out2, o->in1, o->in2);
>> + gen_helper_sxb(o->out_128, cpu_env, o->in1_128, o->in2_128);
>> return DISAS_NEXT;
>> }
>>
>> @@ -4075,7 +4087,7 @@ static DisasJumpType op_sqdb(DisasContext *s, DisasOps *o)
>>
>> static DisasJumpType op_sqxb(DisasContext *s, DisasOps *o)
>> {
>> - gen_helper_sqxb(o->out_128, cpu_env, o->in1, o->in2);
>> + gen_helper_sqxb(o->out_128, cpu_env, o->in2_128);
>> return DISAS_NEXT;
>> }
>>
>> @@ -4854,7 +4866,7 @@ static DisasJumpType op_tcdb(DisasContext *s, DisasOps *o)
>>
>> static DisasJumpType op_tcxb(DisasContext *s, DisasOps *o)
>> {
>> - gen_helper_tcxb(cc_op, cpu_env, o->out, o->out2, o->in2);
>> + gen_helper_tcxb(cc_op, cpu_env, o->in1_128, o->in2);
>> set_cc_static(s);
>> return DISAS_NEXT;
>> }
>> @@ -5389,8 +5401,6 @@ static void prep_new_P(DisasContext *s, DisasOps *o)
>>
>> static void prep_new_x(DisasContext *s, DisasOps *o)
>> {
>> - o->out = tcg_temp_new_i64();
>> - o->out2 = tcg_temp_new_i64();
>> o->out_128 = tcg_temp_new_i128();
>> }
>> #define SPEC_prep_new_x 0
>> @@ -5413,10 +5423,7 @@ static void prep_r1_P(DisasContext *s, DisasOps *o)
>>
>> static void prep_x1(DisasContext *s, DisasOps *o)
>> {
>> - o->out = load_freg(get_field(s, r1));
>> - o->out2 = load_freg(get_field(s, r1) + 2);
>> - o->out_128 = tcg_temp_new_i128();
>> - tcg_gen_concat_i64_i128(o->out_128, o->out2, o->out);
>> + o->out_128 = load_freg_128(get_field(s, r1));
>> }
>> #define SPEC_prep_x1 SPEC_r1_f128
>>
>> @@ -5515,6 +5522,11 @@ static void wout_x1(DisasContext *s, DisasOps *o)
>> {
>> int f1 = get_field(s, r1);
>>
>> + /* Split out_128 into out+out2 for cout_f128. */
>> + tcg_debug_assert(o->out == NULL);
>> + o->out = tcg_temp_new_i64();
>> + o->out2 = tcg_temp_new_i64();
>> +
>> tcg_gen_extr_i128_i64(o->out2, o->out, o->out_128);
>> store_freg(f1, o->out);
>> store_freg(f1 + 2, o->out2);
>> @@ -5757,6 +5769,12 @@ static void in1_f1(DisasContext *s, DisasOps *o)
>> }
>> #define SPEC_in1_f1 0
>>
>> +static void in1_x1(DisasContext *s, DisasOps *o)
>> +{
>> + o->in1_128 = load_freg_128(get_field(s, r1));
>> +}
>> +#define SPEC_in1_x1 SPEC_r2_f128
>> +
>> /* Load the high double word of an extended (128-bit) format FP number */
>> static void in1_x2h(DisasContext *s, DisasOps *o)
>> {
>> @@ -5966,6 +5984,12 @@ static void in2_f2(DisasContext *s, DisasOps *o)
>> }
>> #define SPEC_in2_f2 0
>>
>> +static void in2_x2(DisasContext *s, DisasOps *o)
>> +{
>> + o->in2_128 = load_freg_128(get_field(s, r2));
>> +}
>> +#define SPEC_in2_x2 SPEC_r2_f128
>> +
>> /* Load the low double word of an extended (128-bit) format FP number */
>> static void in2_x2l(DisasContext *s, DisasOps *o)
>> {
>> @@ -6588,6 +6612,12 @@ static DisasJumpType translate_one(CPUS390XState *env, DisasContext *s)
>> if (o.out_128) {
>> tcg_temp_free_i128(o.out_128);
>> }
>> + if (o.in1_128) {
>> + tcg_temp_free_i128(o.in1_128);
>> + }
>> + if (o.in2_128) {
>> + tcg_temp_free_i128(o.in2_128);
>> + }
>> /* io should be the last instruction in tb when icount is enabled */
>> if (unlikely(icount && ret == DISAS_NEXT)) {
>> ret = DISAS_TOO_MANY;
>> diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def
>> index 20bf20c766..26523746d6 100644
>> --- a/target/s390x/tcg/insn-data.def
>> +++ b/target/s390x/tcg/insn-data.def
>> @@ -34,7 +34,7 @@
>> C(0xe318, AGF, RXY_a, Z, r1, m2_32s, r1, 0, add, adds64)
>> F(0xb30a, AEBR, RRE, Z, e1, e2, new, e1, aeb, f32, IF_BFP)
>> F(0xb31a, ADBR, RRE, Z, f1, f2, new, f1, adb, f64, IF_BFP)
>> - F(0xb34a, AXBR, RRE, Z, x2h, x2l, x1, x1, axb, f128, IF_BFP)
>> + F(0xb34a, AXBR, RRE, Z, x1, x2, new_x, x1, axb, f128, IF_BFP)
>> F(0xed0a, AEB, RXE, Z, e1, m2_32u, new, e1, aeb, f32, IF_BFP)
>> F(0xed1a, ADB, RXE, Z, f1, m2_64, new, f1, adb, f64, IF_BFP)
>> /* ADD HIGH */
>> @@ -172,13 +172,13 @@
>> C(0xe330, CGF, RXY_a, Z, r1_o, m2_32s, 0, 0, 0, cmps64)
>> F(0xb309, CEBR, RRE, Z, e1, e2, 0, 0, ceb, 0, IF_BFP)
>> F(0xb319, CDBR, RRE, Z, f1, f2, 0, 0, cdb, 0, IF_BFP)
>> - F(0xb349, CXBR, RRE, Z, x2h, x2l, x1, 0, cxb, 0, IF_BFP)
>> + F(0xb349, CXBR, RRE, Z, x1, x2, 0, 0, cxb, 0, IF_BFP)
>> F(0xed09, CEB, RXE, Z, e1, m2_32u, 0, 0, ceb, 0, IF_BFP)
>> F(0xed19, CDB, RXE, Z, f1, m2_64, 0, 0, cdb, 0, IF_BFP)
>> /* COMPARE AND SIGNAL */
>> F(0xb308, KEBR, RRE, Z, e1, e2, 0, 0, keb, 0, IF_BFP)
>> F(0xb318, KDBR, RRE, Z, f1, f2, 0, 0, kdb, 0, IF_BFP)
>> - F(0xb348, KXBR, RRE, Z, x2h, x2l, x1, 0, kxb, 0, IF_BFP)
>> + F(0xb348, KXBR, RRE, Z, x1, x2, 0, 0, kxb, 0, IF_BFP)
>> F(0xed08, KEB, RXE, Z, e1, m2_32u, 0, 0, keb, 0, IF_BFP)
>> F(0xed18, KDB, RXE, Z, f1, m2_64, 0, 0, kdb, 0, IF_BFP)
>> /* COMPARE IMMEDIATE */
>> @@ -299,10 +299,10 @@
>> /* CONVERT TO FIXED */
>> F(0xb398, CFEBR, RRF_e, Z, 0, e2, new, r1_32, cfeb, 0, IF_BFP)
>> F(0xb399, CFDBR, RRF_e, Z, 0, f2, new, r1_32, cfdb, 0, IF_BFP)
>> - F(0xb39a, CFXBR, RRF_e, Z, x2h, x2l, new, r1_32, cfxb, 0, IF_BFP)
>> + F(0xb39a, CFXBR, RRF_e, Z, 0, x2, new, r1_32, cfxb, 0, IF_BFP)
>> F(0xb3a8, CGEBR, RRF_e, Z, 0, e2, r1, 0, cgeb, 0, IF_BFP)
>> F(0xb3a9, CGDBR, RRF_e, Z, 0, f2, r1, 0, cgdb, 0, IF_BFP)
>> - F(0xb3aa, CGXBR, RRF_e, Z, x2h, x2l, r1, 0, cgxb, 0, IF_BFP)
>> + F(0xb3aa, CGXBR, RRF_e, Z, 0, x2, r1, 0, cgxb, 0, IF_BFP)
>> /* CONVERT FROM FIXED */
>> F(0xb394, CEFBR, RRF_e, Z, 0, r2_32s, new, e1, cegb, 0, IF_BFP)
>> F(0xb395, CDFBR, RRF_e, Z, 0, r2_32s, new, f1, cdgb, 0, IF_BFP)
>> @@ -313,10 +313,10 @@
>> /* CONVERT TO LOGICAL */
>> F(0xb39c, CLFEBR, RRF_e, FPE, 0, e2, new, r1_32, clfeb, 0, IF_BFP)
>> F(0xb39d, CLFDBR, RRF_e, FPE, 0, f2, new, r1_32, clfdb, 0, IF_BFP)
>> - F(0xb39e, CLFXBR, RRF_e, FPE, x2h, x2l, new, r1_32, clfxb, 0, IF_BFP)
>> + F(0xb39e, CLFXBR, RRF_e, FPE, 0, x2, new, r1_32, clfxb, 0, IF_BFP)
>> F(0xb3ac, CLGEBR, RRF_e, FPE, 0, e2, r1, 0, clgeb, 0, IF_BFP)
>> F(0xb3ad, CLGDBR, RRF_e, FPE, 0, f2, r1, 0, clgdb, 0, IF_BFP)
>> - F(0xb3ae, CLGXBR, RRF_e, FPE, x2h, x2l, r1, 0, clgxb, 0, IF_BFP)
>> + F(0xb3ae, CLGXBR, RRF_e, FPE, 0, x2, r1, 0, clgxb, 0, IF_BFP)
>> /* CONVERT FROM LOGICAL */
>> F(0xb390, CELFBR, RRF_e, FPE, 0, r2_32u, new, e1, celgb, 0, IF_BFP)
>> F(0xb391, CDLFBR, RRF_e, FPE, 0, r2_32u, new, f1, cdlgb, 0, IF_BFP)
>> @@ -343,7 +343,7 @@
>> C(0x5d00, D, RX_a, Z, r1_D32, m2_32s, new_P, r1_P32, divs32, 0)
>> F(0xb30d, DEBR, RRE, Z, e1, e2, new, e1, deb, 0, IF_BFP)
>> F(0xb31d, DDBR, RRE, Z, f1, f2, new, f1, ddb, 0, IF_BFP)
>> - F(0xb34d, DXBR, RRE, Z, x2h, x2l, x1, x1, dxb, 0, IF_BFP)
>> + F(0xb34d, DXBR, RRE, Z, x1, x2, new_x, x1, dxb, 0, IF_BFP)
>> F(0xed0d, DEB, RXE, Z, e1, m2_32u, new, e1, deb, 0, IF_BFP)
>> F(0xed1d, DDB, RXE, Z, f1, m2_64, new, f1, ddb, 0, IF_BFP)
>> /* DIVIDE LOGICAL */
>> @@ -597,7 +597,7 @@
>> /* LOAD FP INTEGER */
>> F(0xb357, FIEBR, RRF_e, Z, 0, e2, new, e1, fieb, 0, IF_BFP)
>> F(0xb35f, FIDBR, RRF_e, Z, 0, f2, new, f1, fidb, 0, IF_BFP)
>> - F(0xb347, FIXBR, RRF_e, Z, x2h, x2l, new_x, x1, fixb, 0, IF_BFP)
>> + F(0xb347, FIXBR, RRF_e, Z, 0, x2, new_x, x1, fixb, 0, IF_BFP)
>>
>> /* LOAD LENGTHENED */
>> F(0xb304, LDEBR, RRE, Z, 0, e2, new, f1, ldeb, 0, IF_BFP)
>> @@ -610,8 +610,8 @@
>> F(0xed24, LDE, RXE, Z, 0, m2_32u, new, f1, lde, 0, IF_AFP1)
>> /* LOAD ROUNDED */
>> F(0xb344, LEDBR, RRF_e, Z, 0, f2, new, e1, ledb, 0, IF_BFP)
>> - F(0xb345, LDXBR, RRF_e, Z, x2h, x2l, new, f1, ldxb, 0, IF_BFP)
>> - F(0xb346, LEXBR, RRF_e, Z, x2h, x2l, new, e1, lexb, 0, IF_BFP)
>> + F(0xb345, LDXBR, RRF_e, Z, 0, x2, new, f1, ldxb, 0, IF_BFP)
>> + F(0xb346, LEXBR, RRF_e, Z, 0, x2, new, e1, lexb, 0, IF_BFP)
>>
>> /* LOAD MULTIPLE */
>> C(0x9800, LM, RS_a, Z, 0, a2, 0, 0, lm32, 0)
>> @@ -666,7 +666,7 @@
>> C(0xe384, MG, RXY_a, MIE2,r1p1_o, m2_64, r1_P, 0, muls128, 0)
>> F(0xb317, MEEBR, RRE, Z, e1, e2, new, e1, meeb, 0, IF_BFP)
>> F(0xb31c, MDBR, RRE, Z, f1, f2, new, f1, mdb, 0, IF_BFP)
>> - F(0xb34c, MXBR, RRE, Z, x2h, x2l, x1, x1, mxb, 0, IF_BFP)
>> + F(0xb34c, MXBR, RRE, Z, x1, x2, new_x, x1, mxb, 0, IF_BFP)
>> F(0xb30c, MDEBR, RRE, Z, f1, e2, new, f1, mdeb, 0, IF_BFP)
>> F(0xb307, MXDBR, RRE, Z, 0, f2, x1, x1, mxdb, 0, IF_BFP)
>> F(0xed17, MEEB, RXE, Z, e1, m2_32u, new, e1, meeb, 0, IF_BFP)
>> @@ -835,7 +835,7 @@
>> /* SQUARE ROOT */
>> F(0xb314, SQEBR, RRE, Z, 0, e2, new, e1, sqeb, 0, IF_BFP)
>> F(0xb315, SQDBR, RRE, Z, 0, f2, new, f1, sqdb, 0, IF_BFP)
>> - F(0xb316, SQXBR, RRE, Z, x2h, x2l, new_x, x1, sqxb, 0, IF_BFP)
>> + F(0xb316, SQXBR, RRE, Z, 0, x2, new_x, x1, sqxb, 0, IF_BFP)
>> F(0xed14, SQEB, RXE, Z, 0, m2_32u, new, e1, sqeb, 0, IF_BFP)
>> F(0xed15, SQDB, RXE, Z, 0, m2_64, new, f1, sqdb, 0, IF_BFP)
>>
>> @@ -913,7 +913,7 @@
>> C(0xe319, SGF, RXY_a, Z, r1, m2_32s, r1, 0, sub, subs64)
>> F(0xb30b, SEBR, RRE, Z, e1, e2, new, e1, seb, f32, IF_BFP)
>> F(0xb31b, SDBR, RRE, Z, f1, f2, new, f1, sdb, f64, IF_BFP)
>> - F(0xb34b, SXBR, RRE, Z, x2h, x2l, x1, x1, sxb, f128, IF_BFP)
>> + F(0xb34b, SXBR, RRE, Z, x1, x2, new_x, x1, sxb, f128, IF_BFP)
>> F(0xed0b, SEB, RXE, Z, e1, m2_32u, new, e1, seb, f32, IF_BFP)
>> F(0xed1b, SDB, RXE, Z, f1, m2_64, new, f1, sdb, f64, IF_BFP)
>> /* SUBTRACT HALFWORD */
>> @@ -957,7 +957,7 @@
>> /* TEST DATA CLASS */
>> F(0xed10, TCEB, RXE, Z, e1, a2, 0, 0, tceb, 0, IF_BFP)
>> F(0xed11, TCDB, RXE, Z, f1, a2, 0, 0, tcdb, 0, IF_BFP)
>> - F(0xed12, TCXB, RXE, Z, 0, a2, x1, 0, tcxb, 0, IF_BFP)
>> + F(0xed12, TCXB, RXE, Z, x1, a2, 0, 0, tcxb, 0, IF_BFP)
>>
>> /* TEST DECIMAL */
>> C(0xebc0, TP, RSL, E2, la1, 0, 0, 0, tp, 0)
>> --
>> 2.34.1
>>
>>
>
> Hi,
>
> I ran valgrind's testsuite with this patch, and their bpf-4 test
> triggered an assertion in the
>
> (insn->spec & SPEC_r2_f128 && !is_fp_pair(get_field(s, r2)))
>
> condition. The following fixup helped:
>
>
> --- a/target/s390x/tcg/translate.c
> +++ b/target/s390x/tcg/translate.c
> @@ -5771,14 +5771,14 @@ static void in1_x1(DisasContext *s, DisasOps *o)
> {
> o->in1_128 = load_freg_128(get_field(s, r1));
> }
> -#define SPEC_in1_x1 SPEC_r2_f128
> +#define SPEC_in1_x1 SPEC_r1_f128
Looks right, thanks.
>
> /* Load the high double word of an extended (128-bit) format FP number */
> static void in1_x2h(DisasContext *s, DisasOps *o)
> {
> o->in1 = load_freg(get_field(s, r2));
> }
> -#define SPEC_in1_x2h SPEC_r2_f128
> +#define SPEC_in1_x2h SPEC_r1_f128
This looks wrong.
r~
^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [PATCH 9/9] target/s390x: Use Int128 for passing float128
2022-11-02 9:47 ` Richard Henderson
@ 2022-11-02 10:05 ` Ilya Leoshkevich
0 siblings, 0 replies; 35+ messages in thread
From: Ilya Leoshkevich @ 2022-11-02 10:05 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: qemu-s390x
On Wed, Nov 02, 2022 at 08:47:24PM +1100, Richard Henderson wrote:
> On 11/2/22 20:38, Ilya Leoshkevich wrote:
> > On Fri, Oct 21, 2022 at 05:30:06PM +1000, Richard Henderson wrote:
> > > Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> > > ---
> > > target/s390x/helper.h | 32 ++++++-------
> > > target/s390x/tcg/fpu_helper.c | 88 ++++++++++++++--------------------
> > > target/s390x/tcg/translate.c | 76 ++++++++++++++++++++---------
> > > target/s390x/tcg/insn-data.def | 30 ++++++------
> > > 4 files changed, 121 insertions(+), 105 deletions(-)
...
> > Hi,
> >
> > I ran valgrind's testsuite with this patch, and their bpf-4 test
> > triggered an assertion in the
> >
> > (insn->spec & SPEC_r2_f128 && !is_fp_pair(get_field(s, r2)))
> >
> > condition. The following fixup helped:
> >
> >
> > --- a/target/s390x/tcg/translate.c
> > +++ b/target/s390x/tcg/translate.c
> > @@ -5771,14 +5771,14 @@ static void in1_x1(DisasContext *s, DisasOps *o)
> > {
> > o->in1_128 = load_freg_128(get_field(s, r1));
> > }
> > -#define SPEC_in1_x1 SPEC_r2_f128
> > +#define SPEC_in1_x1 SPEC_r1_f128
>
> Looks right, thanks.
>
> > /* Load the high double word of an extended (128-bit) format FP number */
> > static void in1_x2h(DisasContext *s, DisasOps *o)
> > {
> > o->in1 = load_freg(get_field(s, r2));
> > }
> > -#define SPEC_in1_x2h SPEC_r2_f128
> > +#define SPEC_in1_x2h SPEC_r1_f128
>
> This looks wrong.
Oh, right - we do get_field(r2) here.
Only the first hunk is necessary.
> r~
^ permalink raw reply [flat|nested] 35+ messages in thread