* [PATCH 0/2] target/riscv: fixup atomic implementation
From: LIU Zhiwei
Date: 2020-06-29 13:07 UTC
To: qemu-devel, qemu-riscv
Cc: richard.henderson, wxy194768, wenmeng_zhang, Alistair.Francis, palmer, LIU Zhiwei

When I tested RVA with RISU, I found something wrong. In particular,
amo*.w instructions should operate only on the lowest 32 bits. However,
the current implementation uses the whole XLEN bits.

LIU Zhiwei (2):
  tcg/tcg-op: Fix nonatomic_op load with MO_SIGN
  target/riscv: Do amo*.w insns operate with 32 bits

 target/riscv/insn_trans/trans_rva.inc.c | 60 +++++++++++++++++++------
 tcg/tcg-op.c                            |  4 +-
 2 files changed, 49 insertions(+), 15 deletions(-)

-- 
2.23.0
* [PATCH 1/2] tcg/tcg-op: Fix nonatomic_op load with MO_SIGN
From: LIU Zhiwei
Date: 2020-06-29 13:07 UTC
To: qemu-devel, qemu-riscv
Cc: richard.henderson, wxy194768, wenmeng_zhang, Alistair.Francis, palmer, LIU Zhiwei

Since an op follows the load, MO_SIGN should not be cleared. Thus, we
can call tcg_gen_atomic_*_i64 with a smaller MemOp than MO_Q.

Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
 tcg/tcg-op.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index e60b74fb82..75b31048f5 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -3189,7 +3189,7 @@ static void do_nonatomic_op_i32(TCGv_i32 ret, TCGv addr, TCGv_i32 val,
 
     memop = tcg_canonicalize_memop(memop, 0, 0);
 
-    tcg_gen_qemu_ld_i32(t1, addr, idx, memop & ~MO_SIGN);
+    tcg_gen_qemu_ld_i32(t1, addr, idx, memop);
     gen(t2, t1, val);
     tcg_gen_qemu_st_i32(t2, addr, idx, memop);
 
@@ -3232,7 +3232,7 @@ static void do_nonatomic_op_i64(TCGv_i64 ret, TCGv addr, TCGv_i64 val,
 
     memop = tcg_canonicalize_memop(memop, 1, 0);
 
-    tcg_gen_qemu_ld_i64(t1, addr, idx, memop & ~MO_SIGN);
+    tcg_gen_qemu_ld_i64(t1, addr, idx, memop);
     gen(t2, t1, val);
     tcg_gen_qemu_st_i64(t2, addr, idx, memop);
 
-- 
2.23.0
* Re: [PATCH 1/2] tcg/tcg-op: Fix nonatomic_op load with MO_SIGN
From: Richard Henderson
Date: 2020-06-30 14:56 UTC
To: LIU Zhiwei, qemu-devel, qemu-riscv
Cc: palmer, wenmeng_zhang, Alistair.Francis, wxy194768

On 6/29/20 6:07 AM, LIU Zhiwei wrote:
> @@ -3189,7 +3189,7 @@ static void do_nonatomic_op_i32(TCGv_i32 ret, TCGv addr, TCGv_i32 val,
> 
>      memop = tcg_canonicalize_memop(memop, 0, 0);
> 
> -    tcg_gen_qemu_ld_i32(t1, addr, idx, memop & ~MO_SIGN);
> +    tcg_gen_qemu_ld_i32(t1, addr, idx, memop);
>      gen(t2, t1, val);
>      tcg_gen_qemu_st_i32(t2, addr, idx, memop);
> 
> @@ -3232,7 +3232,7 @@ static void do_nonatomic_op_i64(TCGv_i64 ret, TCGv addr, TCGv_i64 val,
> 
>      memop = tcg_canonicalize_memop(memop, 1, 0);
> 
> -    tcg_gen_qemu_ld_i64(t1, addr, idx, memop & ~MO_SIGN);
> +    tcg_gen_qemu_ld_i64(t1, addr, idx, memop);
>      gen(t2, t1, val);
>      tcg_gen_qemu_st_i64(t2, addr, idx, memop);

This is insufficient for smin/smax -- we also need to extend the "val"
input.

r~
* Re: [PATCH 1/2] tcg/tcg-op: Fix nonatomic_op load with MO_SIGN
From: LIU Zhiwei
Date: 2020-06-30 15:22 UTC
To: Richard Henderson, qemu-devel, qemu-riscv
Cc: palmer, wenmeng_zhang, Alistair.Francis, wxy194768

On 2020/6/30 22:56, Richard Henderson wrote:
> This is insufficient for smin/smax -- we also need to extend the "val"
> input.

Do you mean we should call tcg_gen_ext_i64(val, val, memop) before
gen(t2, t1, val) in do_nonatomic_op_i64?

I think that will be fine if it does not have any other side effects.

Zhiwei
* [PATCH 2/2] target/riscv: Do amo*.w insns operate with 32 bits
From: LIU Zhiwei
Date: 2020-06-29 13:07 UTC
To: qemu-devel, qemu-riscv
Cc: richard.henderson, wxy194768, wenmeng_zhang, Alistair.Francis, palmer, LIU Zhiwei

For amo*.w insns, we should calculate only on the 32-bit data, whether
it comes from the register or from memory.

Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
 target/riscv/insn_trans/trans_rva.inc.c | 60 +++++++++++++++++++------
 1 file changed, 47 insertions(+), 13 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rva.inc.c b/target/riscv/insn_trans/trans_rva.inc.c
index be8a9f06dd..6b3fc14436 100644
--- a/target/riscv/insn_trans/trans_rva.inc.c
+++ b/target/riscv/insn_trans/trans_rva.inc.c
@@ -81,19 +81,26 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, MemOp mop)
     return true;
 }
 
-static bool gen_amo(DisasContext *ctx, arg_atomic *a,
-                    void(*func)(TCGv, TCGv, TCGv, TCGArg, MemOp),
-                    MemOp mop)
+static bool
+gen_amo_w(DisasContext *ctx, arg_atomic *a,
+          void(*func)(TCGv, TCGv, TCGv, TCGArg, MemOp),
+          MemOp mop, bool sign)
 {
     TCGv src1 = tcg_temp_new();
     TCGv src2 = tcg_temp_new();
 
     gen_get_gpr(src1, a->rs1);
     gen_get_gpr(src2, a->rs2);
+    if (sign) {
+        tcg_gen_ext32s_tl(src2, src2);
+    } else {
+        tcg_gen_ext32u_tl(src2, src2);
+    }
 
     (*func)(src2, src1, src2, ctx->mem_idx, mop);
-
+    tcg_gen_ext32s_tl(src2, src2);
     gen_set_gpr(a->rd, src2);
+
     tcg_temp_free(src1);
     tcg_temp_free(src2);
     return true;
@@ -114,59 +121,86 @@ static bool trans_sc_w(DisasContext *ctx, arg_sc_w *a)
 
 static bool trans_amoswap_w(DisasContext *ctx, arg_amoswap_w *a)
 {
     REQUIRE_EXT(ctx, RVA);
-    return gen_amo(ctx, a, &tcg_gen_atomic_xchg_tl, (MO_ALIGN | MO_TESL));
+    return gen_amo_w(ctx, a, &tcg_gen_atomic_xchg_tl,
+                     (MO_ALIGN | MO_TESL), true);
 }
 
 static bool trans_amoadd_w(DisasContext *ctx, arg_amoadd_w *a)
 {
     REQUIRE_EXT(ctx, RVA);
-    return gen_amo(ctx, a, &tcg_gen_atomic_fetch_add_tl, (MO_ALIGN | MO_TESL));
+    return gen_amo_w(ctx, a, &tcg_gen_atomic_fetch_add_tl,
+                     (MO_ALIGN | MO_TESL), true);
 }
 
 static bool trans_amoxor_w(DisasContext *ctx, arg_amoxor_w *a)
 {
     REQUIRE_EXT(ctx, RVA);
-    return gen_amo(ctx, a, &tcg_gen_atomic_fetch_xor_tl, (MO_ALIGN | MO_TESL));
+    return gen_amo_w(ctx, a, &tcg_gen_atomic_fetch_xor_tl,
+                     (MO_ALIGN | MO_TESL), true);
 }
 
 static bool trans_amoand_w(DisasContext *ctx, arg_amoand_w *a)
 {
     REQUIRE_EXT(ctx, RVA);
-    return gen_amo(ctx, a, &tcg_gen_atomic_fetch_and_tl, (MO_ALIGN | MO_TESL));
+    return gen_amo_w(ctx, a, &tcg_gen_atomic_fetch_and_tl,
+                     (MO_ALIGN | MO_TESL), true);
 }
 
 static bool trans_amoor_w(DisasContext *ctx, arg_amoor_w *a)
 {
     REQUIRE_EXT(ctx, RVA);
-    return gen_amo(ctx, a, &tcg_gen_atomic_fetch_or_tl, (MO_ALIGN | MO_TESL));
+    return gen_amo_w(ctx, a, &tcg_gen_atomic_fetch_or_tl,
+                     (MO_ALIGN | MO_TESL), true);
 }
 
 static bool trans_amomin_w(DisasContext *ctx, arg_amomin_w *a)
 {
     REQUIRE_EXT(ctx, RVA);
-    return gen_amo(ctx, a, &tcg_gen_atomic_fetch_smin_tl, (MO_ALIGN | MO_TESL));
+    return gen_amo_w(ctx, a, &tcg_gen_atomic_fetch_smin_tl,
+                     (MO_ALIGN | MO_TESL), true);
 }
 
 static bool trans_amomax_w(DisasContext *ctx, arg_amomax_w *a)
 {
     REQUIRE_EXT(ctx, RVA);
-    return gen_amo(ctx, a, &tcg_gen_atomic_fetch_smax_tl, (MO_ALIGN | MO_TESL));
+    return gen_amo_w(ctx, a, &tcg_gen_atomic_fetch_smax_tl,
+                     (MO_ALIGN | MO_TESL), true);
 }
 
 static bool trans_amominu_w(DisasContext *ctx, arg_amominu_w *a)
 {
     REQUIRE_EXT(ctx, RVA);
-    return gen_amo(ctx, a, &tcg_gen_atomic_fetch_umin_tl, (MO_ALIGN | MO_TESL));
+    return gen_amo_w(ctx, a, &tcg_gen_atomic_fetch_umin_tl,
+                     (MO_ALIGN | MO_TEUL), false);
 }
 
 static bool trans_amomaxu_w(DisasContext *ctx, arg_amomaxu_w *a)
 {
     REQUIRE_EXT(ctx, RVA);
-    return gen_amo(ctx, a, &tcg_gen_atomic_fetch_umax_tl, (MO_ALIGN | MO_TESL));
+    return gen_amo_w(ctx, a, &tcg_gen_atomic_fetch_umax_tl,
+                     (MO_ALIGN | MO_TEUL), false);
 }
 
 #ifdef TARGET_RISCV64
+static bool
+gen_amo(DisasContext *ctx, arg_atomic *a,
+        void(*func)(TCGv_i64, TCGv_i64, TCGv_i64, TCGArg, MemOp),
+        MemOp mop)
+{
+    TCGv src1 = tcg_temp_new();
+    TCGv src2 = tcg_temp_new();
+
+    gen_get_gpr(src1, a->rs1);
+    gen_get_gpr(src2, a->rs2);
+
+    (*func)(src2, src1, src2, ctx->mem_idx, mop);
+
+    gen_set_gpr(a->rd, src2);
+    tcg_temp_free(src1);
+    tcg_temp_free(src2);
+    return true;
+}
+
 static bool trans_lr_d(DisasContext *ctx, arg_lr_d *a)
 {
     return gen_lr(ctx, a, MO_ALIGN | MO_TEQ);
-- 
2.23.0
* Re: [PATCH 2/2] target/riscv: Do amo*.w insns operate with 32 bits
From: Richard Henderson
Date: 2020-06-30 15:00 UTC
To: LIU Zhiwei, qemu-devel, qemu-riscv
Cc: palmer, wenmeng_zhang, Alistair.Francis, wxy194768

On 6/29/20 6:07 AM, LIU Zhiwei wrote:
> +static bool
> +gen_amo_w(DisasContext *ctx, arg_atomic *a,
> +          void(*func)(TCGv, TCGv, TCGv, TCGArg, MemOp),
> +          MemOp mop, bool sign)
>  {
>      TCGv src1 = tcg_temp_new();
>      TCGv src2 = tcg_temp_new();
> 
>      gen_get_gpr(src1, a->rs1);
>      gen_get_gpr(src2, a->rs2);
> +    if (sign) {
> +        tcg_gen_ext32s_tl(src2, src2);
> +    } else {
> +        tcg_gen_ext32u_tl(src2, src2);
> +    }
> 
>      (*func)(src2, src1, src2, ctx->mem_idx, mop);
> -
> +    tcg_gen_ext32s_tl(src2, src2);
>      gen_set_gpr(a->rd, src2);
> +
>      tcg_temp_free(src1);
>      tcg_temp_free(src2);
>      return true;

With the fix to tcg, there should be no change required here, since
you're already passing MO_TESL for signed input.

Note that unsigned comparisons work as expected with sign-extended
inputs. That's what the RISC-V ISA does, after all.

r~
* Re: [PATCH 2/2] target/riscv: Do amo*.w insns operate with 32 bits
From: LIU Zhiwei
Date: 2020-06-30 15:38 UTC
To: Richard Henderson, qemu-devel, qemu-riscv
Cc: palmer, wenmeng_zhang, Alistair.Francis, wxy194768

On 2020/6/30 23:00, Richard Henderson wrote:
> With the fix to tcg, there should be no change required here, since
> you're already passing MO_TESL for signed input.
>
> Note that unsigned comparisons work as expected with sign-extended
> inputs. That's what the RISC-V ISA does, after all.

Although somewhat confusing, that is correct for unsigned comparisons.
Thus amominu.w can still be generated by

    gen_amo(ctx, a, &tcg_gen_atomic_fetch_umin_tl, (MO_ALIGN | MO_TESL));

In this way, the only fix is in tcg, and this patch will be dropped.

Zhiwei