* [PATCH v2 00/10] tcg/loongarch64: Reorg goto_tb and cleanups
@ 2023-01-18  1:11 Richard Henderson
  2023-01-18  1:11 ` [PATCH v2 01/10] target/loongarch: Enable the disassembler for host tcg Richard Henderson
                   ` (10 more replies)
  0 siblings, 11 replies; 25+ messages in thread
From: Richard Henderson @ 2023-01-18  1:11 UTC
  To: qemu-devel; +Cc: git

Based-on: 20230117231051.354444-1-richard.henderson@linaro.org
("[PULL 00/22] tcg patch queue")

Includes:
  * Disassembler from target/loongarch/.
  * Improvements to movi by Rui Wang, with minor tweaks.
  * Improvements to setcond.
  * Implement movcond.
  * Fix the same goto_tb bug that affected some others.


r~


Richard Henderson (9):
  target/loongarch: Enable the disassembler for host tcg
  target/loongarch: Disassemble jirl properly
  target/loongarch: Disassemble pcadd* addresses
  tcg/loongarch64: Update tcg-insn-defs.c.inc
  tcg/loongarch64: Introduce tcg_out_addi
  tcg/loongarch64: Improve setcond expansion
  tcg/loongarch64: Implement movcond
  tcg/loongarch64: Use tcg_pcrel_diff in tcg_out_ldst
  tcg/loongarch64: Reorg goto_tb implementation

Rui Wang (1):
  tcg/loongarch64: Optimize immediate loading

 tcg/loongarch64/tcg-target-con-set.h          |   5 +-
 tcg/loongarch64/tcg-target-con-str.h          |   2 +-
 tcg/loongarch64/tcg-target.h                  |  11 +-
 disas.c                                       |   2 +
 target/loongarch/disas.c                      |  39 +-
 .../loongarch/insn_trans/trans_branch.c.inc   |   2 +-
 target/loongarch/insns.decode                 |   3 +-
 target/loongarch/meson.build                  |   3 +-
 tcg/loongarch64/tcg-insn-defs.c.inc           |  10 +-
 tcg/loongarch64/tcg-target.c.inc              | 364 ++++++++++++------
 10 files changed, 300 insertions(+), 141 deletions(-)
 mode change 100644 => 100755 tcg/loongarch64/tcg-insn-defs.c.inc

-- 
2.34.1




* [PATCH v2 01/10] target/loongarch: Enable the disassembler for host tcg
  2023-01-18  1:11 [PATCH v2 00/10] tcg/loongarch64: Reorg goto_tb and cleanups Richard Henderson
@ 2023-01-18  1:11 ` Richard Henderson
  2023-01-22  8:32   ` WANG Xuerui
  2023-01-23  8:37   ` Philippe Mathieu-Daudé
  2023-01-18  1:11 ` [PATCH v2 02/10] target/loongarch: Disassemble jirl properly Richard Henderson
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 25+ messages in thread
From: Richard Henderson @ 2023-01-18  1:11 UTC
  To: qemu-devel; +Cc: git

Reuse the decodetree-based disassembler from
target/loongarch/ for tcg/loongarch64/.

The generation of decode-insns.c.inc into ./libcommon.fa.p/ could
eventually result in a conflict if any other host requires the same
trick, but this is good enough for now.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 disas.c                      | 2 ++
 target/loongarch/meson.build | 3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/disas.c b/disas.c
index 3b31315f40..c9fa38e6d7 100644
--- a/disas.c
+++ b/disas.c
@@ -198,6 +198,8 @@ static void initialize_debug_host(CPUDebug *s)
     s->info.cap_insn_split = 6;
 #elif defined(__hppa__)
     s->info.print_insn = print_insn_hppa;
+#elif defined(__loongarch64)
+    s->info.print_insn = print_insn_loongarch;
 #endif
 }
 
diff --git a/target/loongarch/meson.build b/target/loongarch/meson.build
index 6376f9e84b..690633969f 100644
--- a/target/loongarch/meson.build
+++ b/target/loongarch/meson.build
@@ -3,7 +3,6 @@ gen = decodetree.process('insns.decode')
 loongarch_ss = ss.source_set()
 loongarch_ss.add(files(
   'cpu.c',
-  'disas.c',
 ))
 loongarch_tcg_ss = ss.source_set()
 loongarch_tcg_ss.add(gen)
@@ -24,6 +23,8 @@ loongarch_softmmu_ss.add(files(
   'iocsr_helper.c',
 ))
 
+common_ss.add(when: 'CONFIG_LOONGARCH_DIS', if_true: [files('disas.c'), gen])
+
 loongarch_ss.add_all(when: 'CONFIG_TCG', if_true: [loongarch_tcg_ss])
 
 target_arch += {'loongarch': loongarch_ss}
-- 
2.34.1




* [PATCH v2 02/10] target/loongarch: Disassemble jirl properly
  2023-01-18  1:11 [PATCH v2 00/10] tcg/loongarch64: Reorg goto_tb and cleanups Richard Henderson
  2023-01-18  1:11 ` [PATCH v2 01/10] target/loongarch: Enable the disassembler for host tcg Richard Henderson
@ 2023-01-18  1:11 ` Richard Henderson
  2023-01-22  8:24   ` WANG Xuerui
  2023-01-18  1:11 ` [PATCH v2 03/10] target/loongarch: Disassemble pcadd* addresses Richard Henderson
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2023-01-18  1:11 UTC
  To: qemu-devel; +Cc: git

While jirl shares its instruction format with bne etc.,
it is not assembled the same way.  In particular, rd is printed
first rather than second, and the immediate is not pc-relative.

Decode into the arg_rr_i structure, which prints correctly.
This renames the "offs" member to "imm", so update translate to match.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
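Reader's note, not part of the patch: a minimal standalone sketch of the
operand handling described above.  The field positions follow the jirl
pattern in insns.decode (rd in bits [4:0], rj in [9:5], a 16-bit field in
[25:10]).  The names JirlFields, decode_jirl and format_jirl, and the
output format string, are hypothetical and exist only for illustration.
The sketch also assumes the 16-bit field is scaled by 4, as branch offsets
are.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical holder for the decoded fields; not QEMU's arg_rr_i. */
typedef struct {
    unsigned rd;
    unsigned rj;
    int32_t imm;    /* byte offset: sign-extended 16-bit field times 4 */
} JirlFields;

/* Extract rd, rj and the scaled immediate from a jirl instruction word. */
static JirlFields decode_jirl(uint32_t insn)
{
    JirlFields f;
    uint32_t raw = (insn >> 10) & 0xffff;               /* bits [25:10] */
    int32_t offs16 = (int32_t)(raw ^ 0x8000) - 0x8000;  /* sign-extend */

    assert((insn >> 26) == 0x13);                       /* opcode 0100 11 */
    f.rd = insn & 0x1f;
    f.rj = (insn >> 5) & 0x1f;
    f.imm = offs16 * 4;
    return f;
}

/*
 * Print in assembler operand order: rd first, then rj, then the
 * immediate as a plain number (it is added to rj, not to the pc).
 */
static int format_jirl(char *buf, size_t len, uint32_t insn)
{
    JirlFields f = decode_jirl(insn);

    return snprintf(buf, len, "jirl r%u, r%u, %d", f.rd, f.rj, (int)f.imm);
}
```

For example, the word 0x4c000020 decodes to rd = 0, rj = 1, imm = 0, which
is the usual function-return encoding.
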
 target/loongarch/disas.c                       | 2 +-
 target/loongarch/insn_trans/trans_branch.c.inc | 2 +-
 target/loongarch/insns.decode                  | 3 ++-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 858dfcc53a..7cffd853ec 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -628,7 +628,7 @@ INSN(beqz,         r_offs)
 INSN(bnez,         r_offs)
 INSN(bceqz,        c_offs)
 INSN(bcnez,        c_offs)
-INSN(jirl,         rr_offs)
+INSN(jirl,         rr_i)
 INSN(b,            offs)
 INSN(bl,           offs)
 INSN(beq,          rr_offs)
diff --git a/target/loongarch/insn_trans/trans_branch.c.inc b/target/loongarch/insn_trans/trans_branch.c.inc
index 65dbdff41e..a860f7e733 100644
--- a/target/loongarch/insn_trans/trans_branch.c.inc
+++ b/target/loongarch/insn_trans/trans_branch.c.inc
@@ -23,7 +23,7 @@ static bool trans_jirl(DisasContext *ctx, arg_jirl *a)
     TCGv dest = gpr_dst(ctx, a->rd, EXT_NONE);
     TCGv src1 = gpr_src(ctx, a->rj, EXT_NONE);
 
-    tcg_gen_addi_tl(cpu_pc, src1, a->offs);
+    tcg_gen_addi_tl(cpu_pc, src1, a->imm);
     tcg_gen_movi_tl(dest, ctx->base.pc_next + 4);
     gen_set_gpr(a->rd, dest, EXT_NONE);
     tcg_gen_lookup_and_goto_ptr();
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
index 3fdc6e148c..de7b8f0f3c 100644
--- a/target/loongarch/insns.decode
+++ b/target/loongarch/insns.decode
@@ -67,6 +67,7 @@
 @rr_ui12                 .... ...... imm:12 rj:5 rd:5    &rr_i
 @rr_i14s2         .... ....  .............. rj:5 rd:5    &rr_i imm=%i14s2
 @rr_i16                     .... .. imm:s16 rj:5 rd:5    &rr_i
+@rr_i16s2         .... ..  ................ rj:5 rd:5    &rr_i imm=%offs16
 @hint_r_i12           .... ...... imm:s12 rj:5 hint:5    &hint_r_i
 @rrr_sa2p1        .... ........ ... .. rk:5 rj:5 rd:5    &rrr_sa  sa=%sa2p1
 @rrr_sa2        .... ........ ... sa:2 rk:5 rj:5 rd:5    &rrr_sa
@@ -444,7 +445,7 @@ beqz            0100 00 ................ ..... .....     @r_offs21
 bnez            0100 01 ................ ..... .....     @r_offs21
 bceqz           0100 10 ................ 00 ... .....    @c_offs21
 bcnez           0100 10 ................ 01 ... .....    @c_offs21
-jirl            0100 11 ................ ..... .....     @rr_offs16
+jirl            0100 11 ................ ..... .....     @rr_i16s2
 b               0101 00 ..........................       @offs26
 bl              0101 01 ..........................       @offs26
 beq             0101 10 ................ ..... .....     @rr_offs16
-- 
2.34.1




* [PATCH v2 03/10] target/loongarch: Disassemble pcadd* addresses
  2023-01-18  1:11 [PATCH v2 00/10] tcg/loongarch64: Reorg goto_tb and cleanups Richard Henderson
  2023-01-18  1:11 ` [PATCH v2 01/10] target/loongarch: Enable the disassembler for host tcg Richard Henderson
  2023-01-18  1:11 ` [PATCH v2 02/10] target/loongarch: Disassemble jirl properly Richard Henderson
@ 2023-01-18  1:11 ` Richard Henderson
  2023-01-22  8:24   ` WANG Xuerui
  2023-01-18  1:11 ` [PATCH v2 04/10] tcg/loongarch64: Optimize immediate loading Richard Henderson
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2023-01-18  1:11 UTC
  To: qemu-devel; +Cc: git

Print both the raw field and the resolved pc-relative
address, as we do for branches.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/loongarch/disas.c | 37 +++++++++++++++++++++++++++++++++----
 1 file changed, 33 insertions(+), 4 deletions(-)

diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
index 7cffd853ec..2e93e77e0d 100644
--- a/target/loongarch/disas.c
+++ b/target/loongarch/disas.c
@@ -519,10 +519,6 @@ INSN(fsel,         fffc)
 INSN(addu16i_d,    rr_i)
 INSN(lu12i_w,      r_i)
 INSN(lu32i_d,      r_i)
-INSN(pcaddi,       r_i)
-INSN(pcalau12i,    r_i)
-INSN(pcaddu12i,    r_i)
-INSN(pcaddu18i,    r_i)
 INSN(ll_w,         rr_i)
 INSN(sc_w,         rr_i)
 INSN(ll_d,         rr_i)
@@ -755,3 +751,36 @@ static bool trans_fcmp_cond_##suffix(DisasContext *ctx, \
 
 FCMP_INSN(s)
 FCMP_INSN(d)
+
+#define PCADD_INSN(name)                                        \
+static bool trans_##name(DisasContext *ctx, arg_##name *a)      \
+{                                                               \
+    output(ctx, #name, "r%d, %d # 0x%" PRIx64,                  \
+           a->rd, a->imm, gen_##name(ctx->pc, a->imm));         \
+    return true;                                                \
+}
+
+static uint64_t gen_pcaddi(uint64_t pc, int imm)
+{
+    return pc + (imm << 2);
+}
+
+static uint64_t gen_pcalau12i(uint64_t pc, int imm)
+{
+    return (pc + (imm << 12)) & ~0xfff;
+}
+
+static uint64_t gen_pcaddu12i(uint64_t pc, int imm)
+{
+    return pc + (imm << 12);
+}
+
+static uint64_t gen_pcaddu18i(uint64_t pc, int imm)
+{
+    return pc + ((uint64_t)(imm) << 18);
+}
+
+PCADD_INSN(pcaddi)
+PCADD_INSN(pcalau12i)
+PCADD_INSN(pcaddu12i)
+PCADD_INSN(pcaddu18i)
-- 
2.34.1




* [PATCH v2 04/10] tcg/loongarch64: Optimize immediate loading
  2023-01-18  1:11 [PATCH v2 00/10] tcg/loongarch64: Reorg goto_tb and cleanups Richard Henderson
                   ` (2 preceding siblings ...)
  2023-01-18  1:11 ` [PATCH v2 03/10] target/loongarch: Disassemble pcadd* addresses Richard Henderson
@ 2023-01-18  1:11 ` Richard Henderson
  2023-01-22  8:21   ` WANG Xuerui
  2023-01-18  1:11 ` [PATCH v2 05/10] tcg/loongarch64: Update tcg-insn-defs.c.inc Richard Henderson
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2023-01-18  1:11 UTC
  To: qemu-devel; +Cc: git, Rui Wang

From: Rui Wang <wangrui@loongson.cn>

diff:
  Imm                 Before                  After
  0000000000000000    addi.w  rd, zero, 0     addi.w  rd, zero, 0
                      lu52i.d rd, zero, 0
  00000000fffff800    lu12i.w rd, -1          addi.w  rd, zero, -2048
                      ori     rd, rd, 2048    lu32i.d rd, 0
                      lu32i.d rd, 0
  ...

Signed-off-by: Rui Wang <wangrui@loongson.cn>
Message-Id: <20221107144713.845550-1-wangrui@loongson.cn>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
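Reader's note, not part of the patch: a rough standalone model of the
selection logic after this change, counting how many instructions a given
constant would need.  It is a sketch under stated assumptions: the
pc-relative fast paths in tcg_out_movi are left out entirely, sext_field
stands in for sextreg, and the names count_movi_i32 and count_movi are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extract LEN bits of VAL starting at bit POS (stand-in for sextreg). */
static int64_t sext_field(int64_t val, unsigned pos, unsigned len)
{
    uint64_t field, sign;

    assert(len > 0 && len < 64 && pos + len <= 64);
    field = ((uint64_t)val >> pos) & ((UINT64_C(1) << len) - 1);
    sign = UINT64_C(1) << (len - 1);
    return (int64_t)((field ^ sign) - sign);
}

/* Count trailing zero bits; the caller guarantees v != 0. */
static int count_trailing_zeros(uint64_t v)
{
    int n = 0;

    assert(v != 0);
    while ((v & 1) == 0) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Instructions needed for a value that is already a sign-extended int32. */
static int count_movi_i32(int64_t val)
{
    int64_t lo = sext_field(val, 0, 12);
    int64_t hi12 = sext_field(val, 12, 20);

    if (hi12 == 0) {
        return 1;                       /* ori rd, zero, val */
    }
    if (hi12 == sext_field(lo, 12, 20)) {
        return 1;                       /* addi.w rd, zero, val */
    }
    return lo != 0 ? 2 : 1;             /* lu12i.w, then optional ori */
}

/* Instructions needed for a 64-bit value, ignoring pc-relative paths. */
static int count_movi(int64_t val)
{
    int64_t low32 = sext_field(val, 0, 32);
    int64_t hi12, hi32, hi52;
    int n;

    if (val == low32) {
        return count_movi_i32(val);
    }
    hi12 = sext_field(val, 12, 20);
    hi32 = sext_field(val, 32, 20);
    hi52 = sext_field(val, 52, 12);

    if (hi52 != 0 && count_trailing_zeros((uint64_t)val) >= 52) {
        return 1;                       /* single cu52i.d */
    }
    n = count_movi_i32(low32);
    if (hi32 != sext_field(hi12, 20, 20)) {
        n++;                            /* cu32i.d */
    }
    if (hi52 != sext_field(hi32, 20, 12)) {
        n++;                            /* cu52i.d */
    }
    return n;
}
```

With this model, 0x00000000fffff800 from the table above comes out at two
instructions: the low word loads as -2048 in one instruction, and the
upper bits then have to be cleared explicitly.
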
 tcg/loongarch64/tcg-target.c.inc | 35 +++++++++++---------------------
 1 file changed, 12 insertions(+), 23 deletions(-)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 3174557ce3..428f3abd71 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -274,16 +274,6 @@ static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg)
     return true;
 }
 
-static bool imm_part_needs_loading(bool high_bits_are_ones,
-                                   tcg_target_long part)
-{
-    if (high_bits_are_ones) {
-        return part != -1;
-    } else {
-        return part != 0;
-    }
-}
-
 /* Loads a 32-bit immediate into rd, sign-extended.  */
 static void tcg_out_movi_i32(TCGContext *s, TCGReg rd, int32_t val)
 {
@@ -291,16 +281,16 @@ static void tcg_out_movi_i32(TCGContext *s, TCGReg rd, int32_t val)
     tcg_target_long hi12 = sextreg(val, 12, 20);
 
     /* Single-instruction cases.  */
-    if (lo == val) {
-        /* val fits in simm12: addi.w rd, zero, val */
-        tcg_out_opc_addi_w(s, rd, TCG_REG_ZERO, val);
-        return;
-    }
-    if (0x800 <= val && val <= 0xfff) {
+    if (hi12 == 0) {
         /* val fits in uimm12: ori rd, zero, val */
         tcg_out_opc_ori(s, rd, TCG_REG_ZERO, val);
         return;
     }
+    if (hi12 == sextreg(lo, 12, 20)) {
+        /* val fits in simm12: addi.w rd, zero, val */
+        tcg_out_opc_addi_w(s, rd, TCG_REG_ZERO, val);
+        return;
+    }
 
     /* High bits must be set; load with lu12i.w + optional ori.  */
     tcg_out_opc_lu12i_w(s, rd, hi12);
@@ -334,8 +324,7 @@ static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd,
 
     intptr_t pc_offset;
     tcg_target_long val_lo, val_hi, pc_hi, offset_hi;
-    tcg_target_long hi32, hi52;
-    bool rd_high_bits_are_ones;
+    tcg_target_long hi12, hi32, hi52;
 
     /* Value fits in signed i32.  */
     if (type == TCG_TYPE_I32 || val == (int32_t)val) {
@@ -366,25 +355,25 @@ static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd,
         return;
     }
 
+    hi12 = sextreg(val, 12, 20);
     hi32 = sextreg(val, 32, 20);
     hi52 = sextreg(val, 52, 12);
 
     /* Single cu52i.d case.  */
-    if (ctz64(val) >= 52) {
+    if ((hi52 != 0) && (ctz64(val) >= 52)) {
         tcg_out_opc_cu52i_d(s, rd, TCG_REG_ZERO, hi52);
         return;
     }
 
     /* Slow path.  Initialize the low 32 bits, then concat high bits.  */
     tcg_out_movi_i32(s, rd, val);
-    rd_high_bits_are_ones = (int32_t)val < 0;
 
-    if (imm_part_needs_loading(rd_high_bits_are_ones, hi32)) {
+    /* Load hi32 and hi52 explicitly when they are unexpected values. */
+    if (hi32 != sextreg(hi12, 20, 20)) {
         tcg_out_opc_cu32i_d(s, rd, hi32);
-        rd_high_bits_are_ones = hi32 < 0;
     }
 
-    if (imm_part_needs_loading(rd_high_bits_are_ones, hi52)) {
+    if (hi52 != sextreg(hi32, 20, 12)) {
         tcg_out_opc_cu52i_d(s, rd, rd, hi52);
     }
 }
-- 
2.34.1




* [PATCH v2 05/10] tcg/loongarch64: Update tcg-insn-defs.c.inc
  2023-01-18  1:11 [PATCH v2 00/10] tcg/loongarch64: Reorg goto_tb and cleanups Richard Henderson
                   ` (3 preceding siblings ...)
  2023-01-18  1:11 ` [PATCH v2 04/10] tcg/loongarch64: Optimize immediate loading Richard Henderson
@ 2023-01-18  1:11 ` Richard Henderson
  2023-01-22  8:20   ` WANG Xuerui
  2023-01-23  8:33   ` Philippe Mathieu-Daudé
  2023-01-18  1:11 ` [PATCH v2 06/10] tcg/loongarch64: Introduce tcg_out_addi Richard Henderson
                   ` (5 subsequent siblings)
  10 siblings, 2 replies; 25+ messages in thread
From: Richard Henderson @ 2023-01-18  1:11 UTC
  To: qemu-devel; +Cc: git

Regenerate with ADDU16I included:

   $ cd loongarch-opcodes/scripts/go
   $ go run ./genqemutcgdefs > $QEMU/tcg/loongarch64/tcg-insn-defs.c.inc

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-insn-defs.c.inc | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)
 mode change 100644 => 100755 tcg/loongarch64/tcg-insn-defs.c.inc

diff --git a/tcg/loongarch64/tcg-insn-defs.c.inc b/tcg/loongarch64/tcg-insn-defs.c.inc
old mode 100644
new mode 100755
index d162571856..b5bb0c5e73
--- a/tcg/loongarch64/tcg-insn-defs.c.inc
+++ b/tcg/loongarch64/tcg-insn-defs.c.inc
@@ -4,7 +4,7 @@
  *
  * This file is auto-generated by genqemutcgdefs from
  * https://github.com/loongson-community/loongarch-opcodes,
- * from commit 961f0c60f5b63e574d785995600c71ad5413fdc4.
+ * from commit 25ca7effe9d88101c1cf96c4005423643386d81f.
  * DO NOT EDIT.
  */
 
@@ -74,6 +74,7 @@ typedef enum {
     OPC_ANDI = 0x03400000,
     OPC_ORI = 0x03800000,
     OPC_XORI = 0x03c00000,
+    OPC_ADDU16I_D = 0x10000000,
     OPC_LU12I_W = 0x14000000,
     OPC_CU32I_D = 0x16000000,
     OPC_PCADDU2I = 0x18000000,
@@ -710,6 +711,13 @@ tcg_out_opc_xori(TCGContext *s, TCGReg d, TCGReg j, uint32_t uk12)
     tcg_out32(s, encode_djuk12_insn(OPC_XORI, d, j, uk12));
 }
 
+/* Emits the `addu16i.d d, j, sk16` instruction.  */
+static void __attribute__((unused))
+tcg_out_opc_addu16i_d(TCGContext *s, TCGReg d, TCGReg j, int32_t sk16)
+{
+    tcg_out32(s, encode_djsk16_insn(OPC_ADDU16I_D, d, j, sk16));
+}
+
 /* Emits the `lu12i.w d, sj20` instruction.  */
 static void __attribute__((unused))
 tcg_out_opc_lu12i_w(TCGContext *s, TCGReg d, int32_t sj20)
-- 
2.34.1




* [PATCH v2 06/10] tcg/loongarch64: Introduce tcg_out_addi
  2023-01-18  1:11 [PATCH v2 00/10] tcg/loongarch64: Reorg goto_tb and cleanups Richard Henderson
                   ` (4 preceding siblings ...)
  2023-01-18  1:11 ` [PATCH v2 05/10] tcg/loongarch64: Update tcg-insn-defs.c.inc Richard Henderson
@ 2023-01-18  1:11 ` Richard Henderson
  2023-01-23  6:52   ` WANG Xuerui
  2023-01-18  1:11 ` [PATCH v2 07/10] tcg/loongarch64: Improve setcond expansion Richard Henderson
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2023-01-18  1:11 UTC
  To: qemu-devel; +Cc: git

Adjust the constraints to allow any int32_t for immediate
addition.  Split immediate adds into addu16i + addi, which
covers quite a lot of the immediate space.  For the hole in
the middle, load the constant into TMP0 instead.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
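Reader's note, not part of the patch: a small standalone sketch of the
split described above, showing which constants can be covered by
addu16i + addi and which fall into the hole.  The names sext_field and
split_addi are hypothetical, and the sketch assumes the immediate has
already been constrained to the int32_t range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sign-extract LEN bits of VAL starting at bit POS (stand-in for sextreg). */
static int64_t sext_field(int64_t val, unsigned pos, unsigned len)
{
    uint64_t field, sign;

    assert(len > 0 && len < 64 && pos + len <= 64);
    field = ((uint64_t)val >> pos) & ((UINT64_C(1) << len) - 1);
    sign = UINT64_C(1) << (len - 1);
    return (int64_t)((field ^ sign) - sign);
}

/*
 * Split IMM into a signed 16-bit part for bits [31:16] and a signed
 * 12-bit part for bits [11:0].  Returns true when the two parts add up
 * to IMM, i.e. when nothing is left over in bits [15:12].
 */
static bool split_addi(int64_t imm, int64_t *hi16, int64_t *lo12)
{
    assert(imm >= INT32_MIN && imm <= INT32_MAX);
    *lo12 = sext_field(imm, 0, 12);
    *hi16 = sext_field(imm - *lo12, 16, 16);
    /* Multiply rather than shift so a negative hi16 is well defined. */
    return imm == *hi16 * 65536 + *lo12;
}
```

For instance, 0xf800 splits as hi16 = 1 and lo12 = -2048, because the low
part is sign-extended and borrows from the high part, whereas 0x1000 sits
entirely in the hole and would take the TMP0 path.
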
 tcg/loongarch64/tcg-target-con-set.h |  4 +-
 tcg/loongarch64/tcg-target-con-str.h |  2 +-
 tcg/loongarch64/tcg-target.c.inc     | 57 ++++++++++++++++++++++++----
 3 files changed, 53 insertions(+), 10 deletions(-)

diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
index 349c672687..7b5a7a3f5d 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -23,9 +23,11 @@ C_O1_I1(r, L)
 C_O1_I2(r, r, rC)
 C_O1_I2(r, r, ri)
 C_O1_I2(r, r, rI)
+C_O1_I2(r, r, rJ)
 C_O1_I2(r, r, rU)
 C_O1_I2(r, r, rW)
 C_O1_I2(r, r, rZ)
 C_O1_I2(r, 0, rZ)
-C_O1_I2(r, rZ, rN)
+C_O1_I2(r, rZ, ri)
+C_O1_I2(r, rZ, rJ)
 C_O1_I2(r, rZ, rZ)
diff --git a/tcg/loongarch64/tcg-target-con-str.h b/tcg/loongarch64/tcg-target-con-str.h
index c3986a4fd4..541ff47fa9 100644
--- a/tcg/loongarch64/tcg-target-con-str.h
+++ b/tcg/loongarch64/tcg-target-con-str.h
@@ -21,7 +21,7 @@ REGS('L', ALL_GENERAL_REGS & ~SOFTMMU_RESERVE_REGS)
  * CONST(letter, TCG_CT_CONST_* bit set)
  */
 CONST('I', TCG_CT_CONST_S12)
-CONST('N', TCG_CT_CONST_N12)
+CONST('J', TCG_CT_CONST_S32)
 CONST('U', TCG_CT_CONST_U12)
 CONST('Z', TCG_CT_CONST_ZERO)
 CONST('C', TCG_CT_CONST_C12)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 428f3abd71..8cc6c5eec2 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -126,7 +126,7 @@ static const int tcg_target_call_oarg_regs[] = {
 
 #define TCG_CT_CONST_ZERO  0x100
 #define TCG_CT_CONST_S12   0x200
-#define TCG_CT_CONST_N12   0x400
+#define TCG_CT_CONST_S32   0x400
 #define TCG_CT_CONST_U12   0x800
 #define TCG_CT_CONST_C12   0x1000
 #define TCG_CT_CONST_WSZ   0x2000
@@ -161,7 +161,7 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
     if ((ct & TCG_CT_CONST_S12) && val == sextreg(val, 0, 12)) {
         return true;
     }
-    if ((ct & TCG_CT_CONST_N12) && -val == sextreg(-val, 0, 12)) {
+    if ((ct & TCG_CT_CONST_S32) && val == (int32_t)val) {
         return true;
     }
     if ((ct & TCG_CT_CONST_U12) && val >= 0 && val <= 0xfff) {
@@ -378,6 +378,45 @@ static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd,
     }
 }
 
+static void tcg_out_addi(TCGContext *s, TCGType type, TCGReg rd,
+                         TCGReg rs, tcg_target_long imm)
+{
+    tcg_target_long lo12 = sextreg(imm, 0, 12);
+    tcg_target_long hi16 = sextreg(imm - lo12, 16, 16);
+
+    /*
+     * Note that there's a hole in between hi16 and lo12:
+     *
+     *       3                   2                   1                   0
+     *     1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+     * ...+-------------------------------+-------+-----------------------+
+     *    |             hi16              |       |          lo12         |
+     * ...+-------------------------------+-------+-----------------------+
+     *
+     * For bits within that hole, it's more efficient to use LU12I and ADD.
+     */
+    if (imm == (hi16 << 16) + lo12) {
+        if (hi16) {
+            tcg_out_opc_addu16i_d(s, rd, rs, hi16);
+            rs = rd;
+        }
+        if (type == TCG_TYPE_I32) {
+            tcg_out_opc_addi_w(s, rd, rs, lo12);
+        } else if (lo12) {
+            tcg_out_opc_addi_d(s, rd, rs, lo12);
+        } else {
+            tcg_out_mov(s, type, rd, rs);
+        }
+    } else {
+        tcg_out_movi(s, type, TCG_REG_TMP0, imm);
+        if (type == TCG_TYPE_I32) {
+            tcg_out_opc_add_w(s, rd, rs, TCG_REG_TMP0);
+        } else {
+            tcg_out_opc_add_d(s, rd, rs, TCG_REG_TMP0);
+        }
+    }
+}
+
 static void tcg_out_ext8u(TCGContext *s, TCGReg ret, TCGReg arg)
 {
     tcg_out_opc_andi(s, ret, arg, 0xff);
@@ -1350,14 +1389,14 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
 
     case INDEX_op_add_i32:
         if (c2) {
-            tcg_out_opc_addi_w(s, a0, a1, a2);
+            tcg_out_addi(s, TCG_TYPE_I32, a0, a1, a2);
         } else {
             tcg_out_opc_add_w(s, a0, a1, a2);
         }
         break;
     case INDEX_op_add_i64:
         if (c2) {
-            tcg_out_opc_addi_d(s, a0, a1, a2);
+            tcg_out_addi(s, TCG_TYPE_I64, a0, a1, a2);
         } else {
             tcg_out_opc_add_d(s, a0, a1, a2);
         }
@@ -1365,14 +1404,14 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
 
     case INDEX_op_sub_i32:
         if (c2) {
-            tcg_out_opc_addi_w(s, a0, a1, -a2);
+            tcg_out_addi(s, TCG_TYPE_I32, a0, a1, -a2);
         } else {
             tcg_out_opc_sub_w(s, a0, a1, a2);
         }
         break;
     case INDEX_op_sub_i64:
         if (c2) {
-            tcg_out_opc_addi_d(s, a0, a1, -a2);
+            tcg_out_addi(s, TCG_TYPE_I64, a0, a1, -a2);
         } else {
             tcg_out_opc_sub_d(s, a0, a1, a2);
         }
@@ -1586,8 +1625,9 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
         return C_O1_I2(r, r, ri);
 
     case INDEX_op_add_i32:
+        return C_O1_I2(r, r, ri);
     case INDEX_op_add_i64:
-        return C_O1_I2(r, r, rI);
+        return C_O1_I2(r, r, rJ);
 
     case INDEX_op_and_i32:
     case INDEX_op_and_i64:
@@ -1616,8 +1656,9 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
         return C_O1_I2(r, 0, rZ);
 
     case INDEX_op_sub_i32:
+        return C_O1_I2(r, rZ, ri);
     case INDEX_op_sub_i64:
-        return C_O1_I2(r, rZ, rN);
+        return C_O1_I2(r, rZ, rJ);
 
     case INDEX_op_mul_i32:
     case INDEX_op_mul_i64:
-- 
2.34.1




* [PATCH v2 07/10] tcg/loongarch64: Improve setcond expansion
  2023-01-18  1:11 [PATCH v2 00/10] tcg/loongarch64: Reorg goto_tb and cleanups Richard Henderson
                   ` (5 preceding siblings ...)
  2023-01-18  1:11 ` [PATCH v2 06/10] tcg/loongarch64: Introduce tcg_out_addi Richard Henderson
@ 2023-01-18  1:11 ` Richard Henderson
  2023-01-23  7:10   ` WANG Xuerui
  2023-01-18  1:11 ` [PATCH v2 08/10] tcg/loongarch64: Implement movcond Richard Henderson
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2023-01-18  1:11 UTC
  To: qemu-devel; +Cc: git

Split out a helper function, tcg_out_setcond_int, which
does not always produce the complete boolean result, but
returns a set of flags describing how to complete it.

Accept all int32_t as constant input, so that LE/GT can
adjust the constant to LT.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
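Reader's note, not part of the patch: a standalone model of the two ideas
in the description, evaluated on plain integers instead of emitting
instructions.  The flag values mirror the SETCOND_* macros in the diff,
but NB_REGS = 32 is an assumption made for illustration, and
finish_setcond, leu_const and le_const are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NB_REGS        32                 /* assumed register count */
#define SETCOND_INV    NB_REGS
#define SETCOND_NEZ    (SETCOND_INV << 1)
#define SETCOND_FLAGS  (SETCOND_INV | SETCOND_NEZ)

/*
 * The helper returns a register number with flag bits OR'ed in above it.
 * Given those flags and the intermediate value held in that register,
 * produce the final 0/1 result.
 */
static uint64_t finish_setcond(int tmpflags, uint64_t tmp)
{
    switch (tmpflags & SETCOND_FLAGS) {
    case SETCOND_INV:
        return tmp ^ 1;                   /* boolean: invert */
    case SETCOND_NEZ:
        return tmp != 0;                  /* zero/non-zero: test != 0 */
    case SETCOND_NEZ | SETCOND_INV:
        return tmp == 0;                  /* zero/non-zero: test == 0 */
    default:
        return tmp;                       /* already the final boolean */
    }
}

/* Unsigned x <= c rewritten as x < c + 1, guarding the wrap-around. */
static bool leu_const(uint64_t x, uint64_t c)
{
    if (c == UINT64_MAX) {
        return true;                      /* c + 1 would wrap to 0 */
    }
    return x < c + 1;
}

/* Signed x <= c: c is limited to int32_t, so c + 1 cannot overflow. */
static bool le_const(int64_t x, int32_t c)
{
    return x < (int64_t)c + 1;
}
```

Masking with ~SETCOND_FLAGS recovers the register number, which is why
the flag bits start at the register count rather than at bit 0.
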
 tcg/loongarch64/tcg-target.c.inc | 165 +++++++++++++++++++++----------
 1 file changed, 115 insertions(+), 50 deletions(-)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 8cc6c5eec2..ccc1c0f392 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -469,64 +469,131 @@ static void tcg_out_clzctz(TCGContext *s, LoongArchInsn opc,
     tcg_out_opc_or(s, a0, TCG_REG_TMP0, a0);
 }
 
-static void tcg_out_setcond(TCGContext *s, TCGCond cond, TCGReg ret,
-                            TCGReg arg1, TCGReg arg2, bool c2)
-{
-    TCGReg tmp;
+#define SETCOND_INV    TCG_TARGET_NB_REGS
+#define SETCOND_NEZ    (SETCOND_INV << 1)
+#define SETCOND_FLAGS  (SETCOND_INV | SETCOND_NEZ)
 
-    if (c2) {
-        tcg_debug_assert(arg2 == 0);
+static int tcg_out_setcond_int(TCGContext *s, TCGCond cond, TCGReg ret,
+                               TCGReg arg1, tcg_target_long arg2, bool c2)
+{
+    int flags = 0;
+
+    switch (cond) {
+    case TCG_COND_EQ:    /* -> NE  */
+    case TCG_COND_GE:    /* -> LT  */
+    case TCG_COND_GEU:   /* -> LTU */
+    case TCG_COND_GT:    /* -> LE  */
+    case TCG_COND_GTU:   /* -> LEU */
+        cond = tcg_invert_cond(cond);
+        flags ^= SETCOND_INV;
+        break;
+    default:
+        break;
     }
 
     switch (cond) {
-    case TCG_COND_EQ:
-        if (c2) {
-            tmp = arg1;
-        } else {
-            tcg_out_opc_sub_d(s, ret, arg1, arg2);
-            tmp = ret;
-        }
-        tcg_out_opc_sltui(s, ret, tmp, 1);
-        break;
-    case TCG_COND_NE:
-        if (c2) {
-            tmp = arg1;
-        } else {
-            tcg_out_opc_sub_d(s, ret, arg1, arg2);
-            tmp = ret;
-        }
-        tcg_out_opc_sltu(s, ret, TCG_REG_ZERO, tmp);
-        break;
-    case TCG_COND_LT:
-        tcg_out_opc_slt(s, ret, arg1, arg2);
-        break;
-    case TCG_COND_GE:
-        tcg_out_opc_slt(s, ret, arg1, arg2);
-        tcg_out_opc_xori(s, ret, ret, 1);
-        break;
     case TCG_COND_LE:
-        tcg_out_setcond(s, TCG_COND_GE, ret, arg2, arg1, false);
-        break;
-    case TCG_COND_GT:
-        tcg_out_setcond(s, TCG_COND_LT, ret, arg2, arg1, false);
-        break;
-    case TCG_COND_LTU:
-        tcg_out_opc_sltu(s, ret, arg1, arg2);
-        break;
-    case TCG_COND_GEU:
-        tcg_out_opc_sltu(s, ret, arg1, arg2);
-        tcg_out_opc_xori(s, ret, ret, 1);
-        break;
     case TCG_COND_LEU:
-        tcg_out_setcond(s, TCG_COND_GEU, ret, arg2, arg1, false);
+        /*
+         * If we have a constant input, the most efficient way to implement
+         * LE is by adding 1 and using LT.  Watch out for wrap around for LEU.
+         * We don't need to care for this for LE because the constant input
+         * is still constrained to int32_t, and INT32_MAX+1 is representable
+         * in the 64-bit temporary register.
+         */
+        if (c2) {
+            if (cond == TCG_COND_LEU) {
+                /* unsigned <= -1 is true */
+                if (arg2 == -1) {
+                    tcg_out_movi(s, TCG_TYPE_REG, ret, !(flags & SETCOND_INV));
+                    return ret;
+                }
+                cond = TCG_COND_LTU;
+            } else {
+                cond = TCG_COND_LT;
+            }
+            arg2 += 1;
+        } else {
+            TCGReg tmp = arg2;
+            arg2 = arg1;
+            arg1 = tmp;
+            cond = tcg_swap_cond(cond);    /* LE -> GE */
+            cond = tcg_invert_cond(cond);  /* GE -> LT */
+            flags ^= SETCOND_INV;
+        }
         break;
-    case TCG_COND_GTU:
-        tcg_out_setcond(s, TCG_COND_LTU, ret, arg2, arg1, false);
+    default:
         break;
+    }
+
+    switch (cond) {
+    case TCG_COND_NE:
+        flags |= SETCOND_NEZ;
+        if (!c2) {
+            tcg_out_opc_xor(s, ret, arg1, arg2);
+        } else if (arg2 == 0) {
+            ret = arg1;
+        } else if (arg2 >= 0 && arg2 <= 0xfff) {
+            tcg_out_opc_xori(s, ret, arg1, arg2);
+        } else {
+            tcg_out_addi(s, TCG_TYPE_REG, ret, arg1, -arg2);
+        }
+        break;
+
+    case TCG_COND_LT:
+    case TCG_COND_LTU:
+        if (c2) {
+            if (arg2 >= -0x800 && arg2 <= 0x7ff) {
+                if (cond == TCG_COND_LT) {
+                    tcg_out_opc_slti(s, ret, arg1, arg2);
+                } else {
+                    tcg_out_opc_sltui(s, ret, arg1, arg2);
+                }
+                break;
+            }
+            tcg_out_movi(s, TCG_TYPE_REG, TCG_REG_TMP0, arg2);
+            arg2 = TCG_REG_TMP0;
+        }
+        if (cond == TCG_COND_LT) {
+            tcg_out_opc_slt(s, ret, arg1, arg2);
+        } else {
+            tcg_out_opc_sltu(s, ret, arg1, arg2);
+        }
+        break;
+
     default:
         g_assert_not_reached();
         break;
     }
+
+    return ret | flags;
+}
+
+static void tcg_out_setcond(TCGContext *s, TCGCond cond, TCGReg ret,
+                            TCGReg arg1, tcg_target_long arg2, bool c2)
+{
+    int tmpflags = tcg_out_setcond_int(s, cond, ret, arg1, arg2, c2);
+
+    if (tmpflags != ret) {
+        TCGReg tmp = tmpflags & ~SETCOND_FLAGS;
+
+        switch (tmpflags & SETCOND_FLAGS) {
+        case SETCOND_INV:
+            /* Intermediate result is boolean: simply invert. */
+            tcg_out_opc_xori(s, ret, tmp, 1);
+            break;
+        case SETCOND_NEZ:
+            /* Intermediate result is zero/non-zero: test != 0. */
+            tcg_out_opc_sltu(s, ret, TCG_REG_ZERO, tmp);
+            break;
+        case SETCOND_NEZ | SETCOND_INV:
+            /* Intermediate result is zero/non-zero: test == 0. */
+            tcg_out_opc_sltui(s, ret, tmp, 1);
+            break;
+        default:
+            g_assert_not_reached();
+        }
+    }
 }
 
 /*
@@ -1646,18 +1713,16 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_ctz_i64:
         return C_O1_I2(r, r, rW);
 
-    case INDEX_op_setcond_i32:
-    case INDEX_op_setcond_i64:
-        return C_O1_I2(r, r, rZ);
-
     case INDEX_op_deposit_i32:
     case INDEX_op_deposit_i64:
         /* Must deposit into the same register as input */
         return C_O1_I2(r, 0, rZ);
 
     case INDEX_op_sub_i32:
+    case INDEX_op_setcond_i32:
         return C_O1_I2(r, rZ, ri);
     case INDEX_op_sub_i64:
+    case INDEX_op_setcond_i64:
         return C_O1_I2(r, rZ, rJ);
 
     case INDEX_op_mul_i32:
-- 
2.34.1




* [PATCH v2 08/10] tcg/loongarch64: Implement movcond
  2023-01-18  1:11 [PATCH v2 00/10] tcg/loongarch64: Reorg goto_tb and cleanups Richard Henderson
                   ` (6 preceding siblings ...)
  2023-01-18  1:11 ` [PATCH v2 07/10] tcg/loongarch64: Improve setcond expansion Richard Henderson
@ 2023-01-18  1:11 ` Richard Henderson
  2023-01-22  8:23   ` WANG Xuerui
  2023-01-18  1:11 ` [PATCH v2 09/10] tcg/loongarch64: Use tcg_pcrel_diff in tcg_out_ldst Richard Henderson
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2023-01-18  1:11 UTC
  To: qemu-devel; +Cc: git

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target-con-set.h |  1 +
 tcg/loongarch64/tcg-target.h         |  4 ++--
 tcg/loongarch64/tcg-target.c.inc     | 33 ++++++++++++++++++++++++++++
 3 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
index 7b5a7a3f5d..172c107289 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -31,3 +31,4 @@ C_O1_I2(r, 0, rZ)
 C_O1_I2(r, rZ, ri)
 C_O1_I2(r, rZ, rJ)
 C_O1_I2(r, rZ, rZ)
+C_O1_I4(r, rZ, rJ, rZ, rZ)
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index 1c3e48d662..533a539ce9 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -97,7 +97,7 @@ typedef enum {
 #define TCG_TARGET_CALL_ARG_I64         TCG_CALL_ARG_NORMAL
 
 /* optional instructions */
-#define TCG_TARGET_HAS_movcond_i32      0
+#define TCG_TARGET_HAS_movcond_i32      1
 #define TCG_TARGET_HAS_div_i32          1
 #define TCG_TARGET_HAS_rem_i32          1
 #define TCG_TARGET_HAS_div2_i32         0
@@ -133,7 +133,7 @@ typedef enum {
 #define TCG_TARGET_HAS_qemu_st8_i32     0
 
 /* 64-bit operations */
-#define TCG_TARGET_HAS_movcond_i64      0
+#define TCG_TARGET_HAS_movcond_i64      1
 #define TCG_TARGET_HAS_div_i64          1
 #define TCG_TARGET_HAS_rem_i64          1
 #define TCG_TARGET_HAS_div2_i64         0
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index ccc1c0f392..29d75c80eb 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -596,6 +596,30 @@ static void tcg_out_setcond(TCGContext *s, TCGCond cond, TCGReg ret,
     }
 }
 
+static void tcg_out_movcond(TCGContext *s, TCGCond cond, TCGReg ret,
+                            TCGReg c1, tcg_target_long c2, bool const2,
+                            TCGReg v1, TCGReg v2)
+{
+    int tmpflags = tcg_out_setcond_int(s, cond, TCG_REG_TMP0, c1, c2, const2);
+    TCGReg t;
+
+    /* Standardize the test below to t != 0. */
+    if (tmpflags & SETCOND_INV) {
+        t = v1, v1 = v2, v2 = t;
+    }
+
+    t = tmpflags & ~SETCOND_FLAGS;
+    if (v1 == TCG_REG_ZERO) {
+        tcg_out_opc_masknez(s, ret, v2, t);
+    } else if (v2 == TCG_REG_ZERO) {
+        tcg_out_opc_maskeqz(s, ret, v1, t);
+    } else {
+        tcg_out_opc_masknez(s, TCG_REG_TMP2, v2, t); /* t ? 0 : v2 */
+        tcg_out_opc_maskeqz(s, TCG_REG_TMP1, v1, t); /* t ? v1 : 0 */
+        tcg_out_opc_or(s, ret, TCG_REG_TMP1, TCG_REG_TMP2);
+    }
+}
+
 /*
  * Branch helpers
  */
@@ -1538,6 +1562,11 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_setcond(s, args[3], a0, a1, a2, c2);
         break;
 
+    case INDEX_op_movcond_i32:
+    case INDEX_op_movcond_i64:
+        tcg_out_movcond(s, args[5], a0, a1, a2, c2, args[3], args[4]);
+        break;
+
     case INDEX_op_ld8s_i32:
     case INDEX_op_ld8s_i64:
         tcg_out_ldst(s, OPC_LD_B, a0, a1, a2);
@@ -1741,6 +1770,10 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_remu_i64:
         return C_O1_I2(r, rZ, rZ);
 
+    case INDEX_op_movcond_i32:
+    case INDEX_op_movcond_i64:
+        return C_O1_I4(r, rZ, rJ, rZ, rZ);
+
     default:
         g_assert_not_reached();
     }
-- 
2.34.1
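
The select performed by tcg_out_movcond above can be modelled in plain C. This is only a sketch: the sketch_ helpers are hypothetical stand-ins for the maskeqz/masknez instructions, and the `inverted` argument stands in for the SETCOND_INV flag returned by tcg_out_setcond_int.

```c
#include <assert.h>
#include <stdint.h>

/* maskeqz rd, rj, rk: rd = (rk == 0) ? 0 : rj */
static uint64_t sketch_maskeqz(uint64_t rj, uint64_t rk)
{
    return rk == 0 ? 0 : rj;
}

/* masknez rd, rj, rk: rd = (rk != 0) ? 0 : rj */
static uint64_t sketch_masknez(uint64_t rj, uint64_t rk)
{
    return rk != 0 ? 0 : rj;
}

/*
 * t is the intermediate test value.  If the comparison helper reported
 * an inverted sense, swap the two values so the test is always t != 0.
 */
static uint64_t sketch_movcond(uint64_t t, int inverted,
                               uint64_t v1, uint64_t v2)
{
    if (inverted) {
        uint64_t swap = v1;
        v1 = v2;
        v2 = swap;
    }
    uint64_t lo = sketch_masknez(v2, t);  /* t ? 0 : v2 */
    uint64_t hi = sketch_maskeqz(v1, t);  /* t ? v1 : 0 */
    return hi | lo;                       /* exactly one side is non-masked */
}
```

When one of the two values is the zero register, the patch emits a single mask instruction; the general case in this model corresponds to the three-instruction sequence.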




* [PATCH v2 09/10] tcg/loongarch64: Use tcg_pcrel_diff in tcg_out_ldst
  2023-01-18  1:11 [PATCH v2 00/10] tcg/loongarch64: Reorg goto_tb and cleanups Richard Henderson
                   ` (7 preceding siblings ...)
  2023-01-18  1:11 ` [PATCH v2 08/10] tcg/loongarch64: Implement movcond Richard Henderson
@ 2023-01-18  1:11 ` Richard Henderson
  2023-01-22  8:22   ` WANG Xuerui
  2023-01-23  8:32   ` Philippe Mathieu-Daudé
  2023-01-18  1:11 ` [PATCH v2 10/10] tcg/loongarch64: Reorg goto_tb implementation Richard Henderson
  2023-01-22  8:28 ` [PATCH v2 00/10] tcg/loongarch64: Reorg goto_tb and cleanups WANG Xuerui
  10 siblings, 2 replies; 25+ messages in thread
From: Richard Henderson @ 2023-01-18  1:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: git

Take the w^x split into account when computing the
pc-relative distance to an absolute pointer.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target.c.inc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 29d75c80eb..d6926bdb83 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -702,7 +702,7 @@ static void tcg_out_ldst(TCGContext *s, LoongArchInsn opc, TCGReg data,
     intptr_t imm12 = sextreg(offset, 0, 12);
 
     if (offset != imm12) {
-        intptr_t diff = offset - (uintptr_t)s->code_ptr;
+        intptr_t diff = tcg_pcrel_diff(s, (void *)offset);
 
         if (addr == TCG_REG_ZERO && diff == (int32_t)diff) {
             imm12 = sextreg(diff, 0, 12);
-- 
2.34.1
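
To illustrate why the w^x split matters here, the following sketch compares the two computations. It assumes a simplified context in which the executable mapping sits at a fixed offset from the writable one; the SketchCtx type and its field names are hypothetical and are not QEMU's.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the code generator state. */
typedef struct {
    uintptr_t code_ptr_rw;   /* where the emitter writes (rw mapping) */
    intptr_t  splitwx_diff;  /* rx address minus rw address; 0 if not split */
} SketchCtx;

/* Old computation: distance from the writable alias of the code. */
static intptr_t sketch_naive_diff(const SketchCtx *s, uintptr_t target)
{
    return (intptr_t)(target - s->code_ptr_rw);
}

/* New computation: distance from the address the code executes at. */
static intptr_t sketch_pcrel_diff(const SketchCtx *s, uintptr_t target)
{
    uintptr_t pc_rx = s->code_ptr_rw + (uintptr_t)s->splitwx_diff;
    return (intptr_t)(target - pc_rx);
}
```

With splitwx_diff == 0 the two agree; once the buffer is mapped twice, only the second is relative to the pc that the emitted instruction will actually see.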




* [PATCH v2 10/10] tcg/loongarch64: Reorg goto_tb implementation
  2023-01-18  1:11 [PATCH v2 00/10] tcg/loongarch64: Reorg goto_tb and cleanups Richard Henderson
                   ` (8 preceding siblings ...)
  2023-01-18  1:11 ` [PATCH v2 09/10] tcg/loongarch64: Use tcg_pcrel_diff in tcg_out_ldst Richard Henderson
@ 2023-01-18  1:11 ` Richard Henderson
  2023-01-23  8:12   ` WANG Xuerui
  2023-01-22  8:28 ` [PATCH v2 00/10] tcg/loongarch64: Reorg goto_tb and cleanups WANG Xuerui
  10 siblings, 1 reply; 25+ messages in thread
From: Richard Henderson @ 2023-01-18  1:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: git

The old implementation replaces two insns, swapping between

        b       <dest>
        nop
and
        pcaddu18i tmp, <dest>
        jirl      zero, tmp, <dest> & 0xffff

There is a race condition in which a thread could be stopped at
the jirl, i.e. with the top of the address loaded, and when
restarted we have re-linked to a different TB, so that the top
half no longer matches the bottom half.

Note that while we never directly re-link to a different TB, we
can link, unlink, and link again all while the stopped thread
remains stopped.

The new implementation replaces only one insn, swapping between

        b       <dest>
and
        pcaddu2i tmp, <jmp_addr>

falling through to load the address from tmp, and branch.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target.h     |  7 +---
 tcg/loongarch64/tcg-target.c.inc | 72 ++++++++++++++------------------
 2 files changed, 33 insertions(+), 46 deletions(-)

diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index 533a539ce9..8b151e7f6f 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -42,11 +42,8 @@
 
 #define TCG_TARGET_INSN_UNIT_SIZE 4
 #define TCG_TARGET_NB_REGS 32
-/*
- * PCADDU18I + JIRL sequence can give 20 + 16 + 2 = 38 bits
- * signed offset, which is +/- 128 GiB.
- */
-#define MAX_CODE_GEN_BUFFER_SIZE  (128 * GiB)
+
+#define MAX_CODE_GEN_BUFFER_SIZE  ((size_t)-1)
 
 typedef enum {
     TCG_REG_ZERO,
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index d6926bdb83..ce4a153887 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -1151,37 +1151,6 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args)
 #endif
 }
 
-/* LoongArch uses `andi zero, zero, 0` as NOP.  */
-#define NOP OPC_ANDI
-static void tcg_out_nop(TCGContext *s)
-{
-    tcg_out32(s, NOP);
-}
-
-void tb_target_set_jmp_target(const TranslationBlock *tb, int n,
-                              uintptr_t jmp_rx, uintptr_t jmp_rw)
-{
-    tcg_insn_unit i1, i2;
-    ptrdiff_t upper, lower;
-    uintptr_t addr = tb->jmp_target_addr[n];
-    ptrdiff_t offset = (ptrdiff_t)(addr - jmp_rx) >> 2;
-
-    if (offset == sextreg(offset, 0, 26)) {
-        i1 = encode_sd10k16_insn(OPC_B, offset);
-        i2 = NOP;
-    } else {
-        tcg_debug_assert(offset == sextreg(offset, 0, 36));
-        lower = (int16_t)offset;
-        upper = (offset - lower) >> 16;
-
-        i1 = encode_dsj20_insn(OPC_PCADDU18I, TCG_REG_TMP0, upper);
-        i2 = encode_djsk16_insn(OPC_JIRL, TCG_REG_ZERO, TCG_REG_TMP0, lower);
-    }
-    uint64_t pair = ((uint64_t)i2 << 32) | i1;
-    qatomic_set((uint64_t *)jmp_rw, pair);
-    flush_idcache_range(jmp_rx, jmp_rw, 8);
-}
-
 /*
  * Entry-points
  */
@@ -1202,22 +1171,43 @@ static void tcg_out_exit_tb(TCGContext *s, uintptr_t a0)
 static void tcg_out_goto_tb(TCGContext *s, int which)
 {
     /*
-     * Ensure that patch area is 8-byte aligned so that an
-     * atomic write can be used to patch the target address.
+     * Direct branch, or load indirect address, to be patched
+     * by tb_target_set_jmp_target.  Check indirect load offset
+     * in range early, regardless of direct branch distance,
+     * via assert within tcg_out_opc_pcaddu2i.
      */
-    if ((uintptr_t)s->code_ptr & 7) {
-        tcg_out_nop(s);
-    }
+    uintptr_t i_addr = get_jmp_target_addr(s, which);
+    intptr_t i_disp = tcg_pcrel_diff(s, (void *)i_addr);
+
     set_jmp_insn_offset(s, which);
-    /*
-     * actual branch destination will be patched by
-     * tb_target_set_jmp_target later
-     */
-    tcg_out_opc_pcaddu18i(s, TCG_REG_TMP0, 0);
+    tcg_out_opc_pcaddu2i(s, TCG_REG_TMP0, i_disp >> 2);
+
+    /* Finish the load and indirect branch. */
+    tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP0, TCG_REG_TMP0, 0);
     tcg_out_opc_jirl(s, TCG_REG_ZERO, TCG_REG_TMP0, 0);
     set_jmp_reset_offset(s, which);
 }
 
+void tb_target_set_jmp_target(const TranslationBlock *tb, int n,
+                              uintptr_t jmp_rx, uintptr_t jmp_rw)
+{
+    uintptr_t d_addr = tb->jmp_target_addr[n];
+    ptrdiff_t d_disp = (ptrdiff_t)(d_addr - jmp_rx) >> 2;
+    tcg_insn_unit insn;
+
+    /* Either directly branch, or load slot address for indirect branch. */
+    if (d_disp == sextreg(d_disp, 0, 26)) {
+        insn = encode_sd10k16_insn(OPC_B, d_disp);
+    } else {
+        uintptr_t i_addr = (uintptr_t)&tb->jmp_target_addr[n];
+        intptr_t i_disp = i_addr - jmp_rx;
+        insn = encode_dsj20_insn(OPC_PCADDU2I, TCG_REG_TMP0, i_disp >> 2);
+    }
+
+    qatomic_set((tcg_insn_unit *)jmp_rw, insn);
+    flush_idcache_range(jmp_rx, jmp_rw, 4);
+}
+
 static void tcg_out_op(TCGContext *s, TCGOpcode opc,
                        const TCGArg args[TCG_MAX_OP_ARGS],
                        const int const_args[TCG_MAX_OP_ARGS])
-- 
2.34.1
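
The choice made by the new tb_target_set_jmp_target can be sketched as a standalone function that returns the single 32-bit instruction to patch in. The sketch_ names are hypothetical, and the register number passed by the caller is an assumption. The opcode constants follow the values visible elsewhere in this thread (PCADDU2I in tcg-insn-defs.c.inc, `b` as 0101 00 in insns.decode).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_OPC_B         0x50000000u
#define SKETCH_OPC_PCADDU2I  0x18000000u

/* True if v fits in a signed field of the given width. */
static bool sketch_fits_signed(int64_t v, int bits)
{
    int64_t lim = (int64_t)1 << (bits - 1);
    return v >= -lim && v < lim;
}

/* b offs26: offs[15:0] goes to bits 25:10, offs[25:16] to bits 9:0. */
static uint32_t sketch_encode_b(int64_t disp_insns)
{
    assert(sketch_fits_signed(disp_insns, 26));
    uint32_t d = (uint32_t)disp_insns;
    return SKETCH_OPC_B | ((d & 0xffff) << 10) | ((d >> 16) & 0x3ff);
}

/* pcaddu2i rd, si20: rd in bits 4:0, si20 in bits 24:5. */
static uint32_t sketch_encode_pcaddu2i(unsigned rd, int64_t disp_insns)
{
    assert(sketch_fits_signed(disp_insns, 20));
    return SKETCH_OPC_PCADDU2I
         | (((uint32_t)disp_insns & 0xfffff) << 5) | (rd & 0x1f);
}

/*
 * Pick the one instruction to store at jmp_rx: a direct branch when the
 * destination is in range, otherwise the address of the slot holding the
 * destination, for the load + jirl that follow.  Addresses are assumed
 * to be 4-byte aligned, so dividing by 4 is exact.
 */
static uint32_t sketch_pick_jmp_insn(uint64_t jmp_rx, uint64_t dest,
                                     uint64_t slot, unsigned tmp_reg)
{
    int64_t d_disp = (int64_t)(dest - jmp_rx) / 4;

    if (sketch_fits_signed(d_disp, 26)) {
        return sketch_encode_b(d_disp);
    }
    return sketch_encode_pcaddu2i(tmp_reg, (int64_t)(slot - jmp_rx) / 4);
}
```

Because only one aligned 32-bit word ever changes, a thread stopped anywhere in the sequence sees either the complete old instruction or the complete new one, which is the property the old two-instruction scheme lacked.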




* Re: [PATCH v2 05/10] tcg/loongarch64: Update tcg-insn-defs.c.inc
  2023-01-18  1:11 ` [PATCH v2 05/10] tcg/loongarch64: Update tcg-insn-defs.c.inc Richard Henderson
@ 2023-01-22  8:20   ` WANG Xuerui
  2023-01-23  8:33   ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 25+ messages in thread
From: WANG Xuerui @ 2023-01-22  8:20 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: git

On 1/18/23 09:11, Richard Henderson wrote:
> Regenerate with ADDU16I included:
>
>     $ cd loongarch-opcodes/scripts/go
>     $ go run ./genqemutcgdefs > $QEMU/tcg/loongarch64/tcg-insn-defs.c.inc
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   tcg/loongarch64/tcg-insn-defs.c.inc | 10 +++++++++-
>   1 file changed, 9 insertions(+), 1 deletion(-)
>   mode change 100644 => 100755 tcg/loongarch64/tcg-insn-defs.c.inc
>
> diff --git a/tcg/loongarch64/tcg-insn-defs.c.inc b/tcg/loongarch64/tcg-insn-defs.c.inc
> old mode 100644
> new mode 100755
> index d162571856..b5bb0c5e73
> --- a/tcg/loongarch64/tcg-insn-defs.c.inc
> +++ b/tcg/loongarch64/tcg-insn-defs.c.inc
> @@ -4,7 +4,7 @@
>    *
>    * This file is auto-generated by genqemutcgdefs from
>    * https://github.com/loongson-community/loongarch-opcodes,
> - * from commit 961f0c60f5b63e574d785995600c71ad5413fdc4.
> + * from commit 25ca7effe9d88101c1cf96c4005423643386d81f.
>    * DO NOT EDIT.
>    */
>   
> @@ -74,6 +74,7 @@ typedef enum {
>       OPC_ANDI = 0x03400000,
>       OPC_ORI = 0x03800000,
>       OPC_XORI = 0x03c00000,
> +    OPC_ADDU16I_D = 0x10000000,
>       OPC_LU12I_W = 0x14000000,
>       OPC_CU32I_D = 0x16000000,
>       OPC_PCADDU2I = 0x18000000,
> @@ -710,6 +711,13 @@ tcg_out_opc_xori(TCGContext *s, TCGReg d, TCGReg j, uint32_t uk12)
>       tcg_out32(s, encode_djuk12_insn(OPC_XORI, d, j, uk12));
>   }
>   
> +/* Emits the `addu16i.d d, j, sk16` instruction.  */
> +static void __attribute__((unused))
> +tcg_out_opc_addu16i_d(TCGContext *s, TCGReg d, TCGReg j, int32_t sk16)
> +{
> +    tcg_out32(s, encode_djsk16_insn(OPC_ADDU16I_D, d, j, sk16));
> +}
> +
>   /* Emits the `lu12i.w d, sj20` instruction.  */
>   static void __attribute__((unused))
>   tcg_out_opc_lu12i_w(TCGContext *s, TCGReg d, int32_t sj20)

Reviewed-by: WANG Xuerui <git@xen0n.name>




* Re: [PATCH v2 04/10] tcg/loongarch64: Optimize immediate loading
  2023-01-18  1:11 ` [PATCH v2 04/10] tcg/loongarch64: Optimize immediate loading Richard Henderson
@ 2023-01-22  8:21   ` WANG Xuerui
  0 siblings, 0 replies; 25+ messages in thread
From: WANG Xuerui @ 2023-01-22  8:21 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: git, Rui Wang

On 1/18/23 09:11, Richard Henderson wrote:
> From: Rui Wang <wangrui@loongson.cn>
>
> diff:
>    Imm                 Before                  After
>    0000000000000000    addi.w  rd, zero, 0     addi.w  rd, zero, 0
>                        lu52i.d rd, zero, 0
>    00000000fffff800    lu12i.w rd, -1          addi.w  rd, zero, -2048
>                        ori     rd, rd, 2048    lu32i.d rd, 0
>                        lu32i.d rd, 0
>    ...
>
> Signed-off-by: Rui Wang <wangrui@loongson.cn>
> Message-Id: <20221107144713.845550-1-wangrui@loongson.cn>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   tcg/loongarch64/tcg-target.c.inc | 35 +++++++++++---------------------
>   1 file changed, 12 insertions(+), 23 deletions(-)

Reviewed-by: WANG Xuerui <git@xen0n.name>

Thanks!




* Re: [PATCH v2 09/10] tcg/loongarch64: Use tcg_pcrel_diff in tcg_out_ldst
  2023-01-18  1:11 ` [PATCH v2 09/10] tcg/loongarch64: Use tcg_pcrel_diff in tcg_out_ldst Richard Henderson
@ 2023-01-22  8:22   ` WANG Xuerui
  2023-01-23  8:32   ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 25+ messages in thread
From: WANG Xuerui @ 2023-01-22  8:22 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: git

On 1/18/23 09:11, Richard Henderson wrote:
> Take the w^x split into account when computing the
> pc-relative distance to an absolute pointer.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   tcg/loongarch64/tcg-target.c.inc | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
> index 29d75c80eb..d6926bdb83 100644
> --- a/tcg/loongarch64/tcg-target.c.inc
> +++ b/tcg/loongarch64/tcg-target.c.inc
> @@ -702,7 +702,7 @@ static void tcg_out_ldst(TCGContext *s, LoongArchInsn opc, TCGReg data,
>       intptr_t imm12 = sextreg(offset, 0, 12);
>   
>       if (offset != imm12) {
> -        intptr_t diff = offset - (uintptr_t)s->code_ptr;
> +        intptr_t diff = tcg_pcrel_diff(s, (void *)offset);
>   
>           if (addr == TCG_REG_ZERO && diff == (int32_t)diff) {
>               imm12 = sextreg(diff, 0, 12);

Reviewed-by: WANG Xuerui <git@xen0n.name>

Thanks for the catch!




* Re: [PATCH v2 08/10] tcg/loongarch64: Implement movcond
  2023-01-18  1:11 ` [PATCH v2 08/10] tcg/loongarch64: Implement movcond Richard Henderson
@ 2023-01-22  8:23   ` WANG Xuerui
  0 siblings, 0 replies; 25+ messages in thread
From: WANG Xuerui @ 2023-01-22  8:23 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: git

On 1/18/23 09:11, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   tcg/loongarch64/tcg-target-con-set.h |  1 +
>   tcg/loongarch64/tcg-target.h         |  4 ++--
>   tcg/loongarch64/tcg-target.c.inc     | 33 ++++++++++++++++++++++++++++
>   3 files changed, 36 insertions(+), 2 deletions(-)

Reviewed-by: WANG Xuerui <git@xen0n.name>




* Re: [PATCH v2 02/10] target/loongarch: Disassemble jirl properly
  2023-01-18  1:11 ` [PATCH v2 02/10] target/loongarch: Disassemble jirl properly Richard Henderson
@ 2023-01-22  8:24   ` WANG Xuerui
  0 siblings, 0 replies; 25+ messages in thread
From: WANG Xuerui @ 2023-01-22  8:24 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: git

On 1/18/23 09:11, Richard Henderson wrote:
> While jirl shares the same instruction format as bne etc,
> it is not assembled the same.  In particular, rd is printed
> first not second and the immediate is not pc-relative.
>
> Decode into the arg_rr_i structure, which prints correctly.
> This changes the "offs" member to "imm", to update translate.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   target/loongarch/disas.c                       | 2 +-
>   target/loongarch/insn_trans/trans_branch.c.inc | 2 +-
>   target/loongarch/insns.decode                  | 3 ++-
>   3 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/target/loongarch/disas.c b/target/loongarch/disas.c
> index 858dfcc53a..7cffd853ec 100644
> --- a/target/loongarch/disas.c
> +++ b/target/loongarch/disas.c
> @@ -628,7 +628,7 @@ INSN(beqz,         r_offs)
>   INSN(bnez,         r_offs)
>   INSN(bceqz,        c_offs)
>   INSN(bcnez,        c_offs)
> -INSN(jirl,         rr_offs)
> +INSN(jirl,         rr_i)
>   INSN(b,            offs)
>   INSN(bl,           offs)
>   INSN(beq,          rr_offs)
> diff --git a/target/loongarch/insn_trans/trans_branch.c.inc b/target/loongarch/insn_trans/trans_branch.c.inc
> index 65dbdff41e..a860f7e733 100644
> --- a/target/loongarch/insn_trans/trans_branch.c.inc
> +++ b/target/loongarch/insn_trans/trans_branch.c.inc
> @@ -23,7 +23,7 @@ static bool trans_jirl(DisasContext *ctx, arg_jirl *a)
>       TCGv dest = gpr_dst(ctx, a->rd, EXT_NONE);
>       TCGv src1 = gpr_src(ctx, a->rj, EXT_NONE);
>   
> -    tcg_gen_addi_tl(cpu_pc, src1, a->offs);
> +    tcg_gen_addi_tl(cpu_pc, src1, a->imm);
>       tcg_gen_movi_tl(dest, ctx->base.pc_next + 4);
>       gen_set_gpr(a->rd, dest, EXT_NONE);
>       tcg_gen_lookup_and_goto_ptr();
> diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
> index 3fdc6e148c..de7b8f0f3c 100644
> --- a/target/loongarch/insns.decode
> +++ b/target/loongarch/insns.decode
> @@ -67,6 +67,7 @@
>   @rr_ui12                 .... ...... imm:12 rj:5 rd:5    &rr_i
>   @rr_i14s2         .... ....  .............. rj:5 rd:5    &rr_i imm=%i14s2
>   @rr_i16                     .... .. imm:s16 rj:5 rd:5    &rr_i
> +@rr_i16s2         .... ..  ................ rj:5 rd:5    &rr_i imm=%offs16
>   @hint_r_i12           .... ...... imm:s12 rj:5 hint:5    &hint_r_i
>   @rrr_sa2p1        .... ........ ... .. rk:5 rj:5 rd:5    &rrr_sa  sa=%sa2p1
>   @rrr_sa2        .... ........ ... sa:2 rk:5 rj:5 rd:5    &rrr_sa
> @@ -444,7 +445,7 @@ beqz            0100 00 ................ ..... .....     @r_offs21
>   bnez            0100 01 ................ ..... .....     @r_offs21
>   bceqz           0100 10 ................ 00 ... .....    @c_offs21
>   bcnez           0100 10 ................ 01 ... .....    @c_offs21
> -jirl            0100 11 ................ ..... .....     @rr_offs16
> +jirl            0100 11 ................ ..... .....     @rr_i16s2
>   b               0101 00 ..........................       @offs26
>   bl              0101 01 ..........................       @offs26
>   beq             0101 10 ................ ..... .....     @rr_offs16

Reviewed-by: WANG Xuerui <git@xen0n.name>

Thanks for the catch!
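
As an illustration of the field layout the new @rr_i16s2 format describes, here is a small standalone decoder. The SketchJirl type and function name are hypothetical; the immediate is taken to be the sign-extended 16-bit field scaled by 4, which is what the %offs16 extractor is assumed to do.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical holder for the decoded operands of jirl. */
typedef struct {
    unsigned rd;   /* link register, printed first */
    unsigned rj;   /* base register, printed second */
    int32_t  imm;  /* byte offset added to rj; not pc-relative */
} SketchJirl;

static SketchJirl sketch_decode_jirl(uint32_t insn)
{
    SketchJirl a;
    int32_t offs16 = (int32_t)((insn >> 10) & 0xffff);

    /* Sign-extend the 16-bit field without relying on narrowing casts. */
    if (offs16 & 0x8000) {
        offs16 -= 0x10000;
    }
    a.rd = insn & 0x1f;
    a.rj = (insn >> 5) & 0x1f;
    a.imm = offs16 * 4;
    return a;
}
```

For example, 0x4c000020 decodes to rd = 0, rj = 1, imm = 0, that is `jirl zero, ra, 0`.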




* Re: [PATCH v2 03/10] target/loongarch: Disassemble pcadd* addresses
  2023-01-18  1:11 ` [PATCH v2 03/10] target/loongarch: Disassemble pcadd* addresses Richard Henderson
@ 2023-01-22  8:24   ` WANG Xuerui
  0 siblings, 0 replies; 25+ messages in thread
From: WANG Xuerui @ 2023-01-22  8:24 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: git

On 1/18/23 09:11, Richard Henderson wrote:
> Print both the raw field and the resolved pc-relative
> address, as we do for branches.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   target/loongarch/disas.c | 37 +++++++++++++++++++++++++++++++++----
>   1 file changed, 33 insertions(+), 4 deletions(-)

Reviewed-by: WANG Xuerui <git@xen0n.name>

Thanks!




* Re: [PATCH v2 00/10] tcg/loongarch64: Reorg goto_tb and cleanups
  2023-01-18  1:11 [PATCH v2 00/10] tcg/loongarch64: Reorg goto_tb and cleanups Richard Henderson
                   ` (9 preceding siblings ...)
  2023-01-18  1:11 ` [PATCH v2 10/10] tcg/loongarch64: Reorg goto_tb implementation Richard Henderson
@ 2023-01-22  8:28 ` WANG Xuerui
  10 siblings, 0 replies; 25+ messages in thread
From: WANG Xuerui @ 2023-01-22  8:28 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: git

Hi,

On 1/18/23 09:11, Richard Henderson wrote:
> Based-on: 20230117231051.354444-1-richard.henderson@linaro.org
> ("[PULL 00/22] tcg patch queue")
>
> Includes:
>    * Disassembler from target/loongarch/.
>    * Improvements to movi by Rui Wang, with minor tweaks.
>    * Improvements to setcond.
>    * Implement movcond.
>    * Fix the same goto_tb bug that affected some others.
>
>
> r~
>
>
> Richard Henderson (9):
>    target/loongarch: Enable the disassembler for host tcg
>    target/loongarch: Disassemble jirl properly
>    target/loongarch: Disassemble pcadd* addresses
>    tcg/loongarch64: Update tcg-insn-defs.c.inc
>    tcg/loongarch64: Introduce tcg_out_addi
>    tcg/loongarch64: Improve setcond expansion
>    tcg/loongarch64: Implement movcond
>    tcg/loongarch64: Use tcg_pcrel_diff in tcg_out_ldst
>    tcg/loongarch64: Reorg goto_tb implementation
>
> Rui Wang (1):
>    tcg/loongarch64: Optimize immediate loading
>
>   tcg/loongarch64/tcg-target-con-set.h          |   5 +-
>   tcg/loongarch64/tcg-target-con-str.h          |   2 +-
>   tcg/loongarch64/tcg-target.h                  |  11 +-
>   disas.c                                       |   2 +
>   target/loongarch/disas.c                      |  39 +-
>   .../loongarch/insn_trans/trans_branch.c.inc   |   2 +-
>   target/loongarch/insns.decode                 |   3 +-
>   target/loongarch/meson.build                  |   3 +-
>   tcg/loongarch64/tcg-insn-defs.c.inc           |  10 +-
>   tcg/loongarch64/tcg-target.c.inc              | 364 ++++++++++++------
>   10 files changed, 300 insertions(+), 141 deletions(-)
>   mode change 100644 => 100755 tcg/loongarch64/tcg-insn-defs.c.inc
>
Sorry for the late review; I have been focusing more on LLVM and my day
job these days. I've reviewed some of these and will take a look at the
rest (and test all of them on native HW) tonight. Thanks very much for
all the refactoring!



* Re: [PATCH v2 01/10] target/loongarch: Enable the disassembler for host tcg
  2023-01-18  1:11 ` [PATCH v2 01/10] target/loongarch: Enable the disassembler for host tcg Richard Henderson
@ 2023-01-22  8:32   ` WANG Xuerui
  2023-01-23  8:37   ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 25+ messages in thread
From: WANG Xuerui @ 2023-01-22  8:32 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: git

On 1/18/23 09:11, Richard Henderson wrote:
> Reuse the decodetree based disassembler from
> target/loongarch/ for tcg/loongarch64/.
>
> The generation of decode-insns.c.inc into ./libcommon.fa.p/ could
> eventually result in conflict, if any other host requires the same
> trick, but this is good enough for now.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   disas.c                      | 2 ++
>   target/loongarch/meson.build | 3 ++-
>   2 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/disas.c b/disas.c
> index 3b31315f40..c9fa38e6d7 100644
> --- a/disas.c
> +++ b/disas.c
> @@ -198,6 +198,8 @@ static void initialize_debug_host(CPUDebug *s)
>       s->info.cap_insn_split = 6;
>   #elif defined(__hppa__)
>       s->info.print_insn = print_insn_hppa;
> +#elif defined(__loongarch64)

This could just be `__loongarch__` because both LA32 and LA64 share the
same encoding, so although LA32 userland isn't quite there yet it
wouldn't do any harm.

> +    s->info.print_insn = print_insn_loongarch;
>   #endif
>   }
>   
> diff --git a/target/loongarch/meson.build b/target/loongarch/meson.build
> index 6376f9e84b..690633969f 100644
> --- a/target/loongarch/meson.build
> +++ b/target/loongarch/meson.build
> @@ -3,7 +3,6 @@ gen = decodetree.process('insns.decode')
>   loongarch_ss = ss.source_set()
>   loongarch_ss.add(files(
>     'cpu.c',
> -  'disas.c',
>   ))
>   loongarch_tcg_ss = ss.source_set()
>   loongarch_tcg_ss.add(gen)
> @@ -24,6 +23,8 @@ loongarch_softmmu_ss.add(files(
>     'iocsr_helper.c',
>   ))
>   
> +common_ss.add(when: 'CONFIG_LOONGARCH_DIS', if_true: [files('disas.c'), gen])
> +
>   loongarch_ss.add_all(when: 'CONFIG_TCG', if_true: [loongarch_tcg_ss])
>   
>   target_arch += {'loongarch': loongarch_ss}

Apart from the minor suggestion above,

Reviewed-by: WANG Xuerui <git@xen0n.name>

Thanks!




* Re: [PATCH v2 06/10] tcg/loongarch64: Introduce tcg_out_addi
  2023-01-18  1:11 ` [PATCH v2 06/10] tcg/loongarch64: Introduce tcg_out_addi Richard Henderson
@ 2023-01-23  6:52   ` WANG Xuerui
  0 siblings, 0 replies; 25+ messages in thread
From: WANG Xuerui @ 2023-01-23  6:52 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: git

On 1/18/23 09:11, Richard Henderson wrote:
> Adjust the constraints to allow any int32_t for immediate
> addition.  Split immediate adds into addu16i + addi, which
> covers quite a lot of the immediate space.  For the hole in
> the middle, load the constant into TMP0 instead.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   tcg/loongarch64/tcg-target-con-set.h |  4 +-
>   tcg/loongarch64/tcg-target-con-str.h |  2 +-
>   tcg/loongarch64/tcg-target.c.inc     | 57 ++++++++++++++++++++++++----
>   3 files changed, 53 insertions(+), 10 deletions(-)

I've checked some of the generated code, and this is indeed an improvement.

Reviewed-by: WANG Xuerui <git@xen0n.name>
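
The split described in the commit message can be sketched as follows. This is a model under stated assumptions, not the patch's code: the sketch_ names are hypothetical, and a false return stands for the "hole in the middle" case where the constant would be loaded into a temporary instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract bits [pos, pos + len) of v and sign-extend them. */
static int64_t sketch_sext(int64_t v, int pos, int len)
{
    assert(len > 0 && len < 64);
    uint64_t field = ((uint64_t)v >> pos) & (((uint64_t)1 << len) - 1);
    uint64_t sign = (uint64_t)1 << (len - 1);
    return (int64_t)(field ^ sign) - (int64_t)sign;
}

/*
 * Try to express imm as (hi16 << 16) + lo12, i.e. one addu16i.d plus one
 * addi.  Returns false when bits 12..15 cannot be covered that way.
 */
static bool sketch_split_addi(int32_t imm, int32_t *hi16, int32_t *lo12)
{
    int64_t lo = sketch_sext(imm, 0, 12);
    int64_t hi = sketch_sext(imm - lo, 16, 16);

    *lo12 = (int32_t)lo;
    *hi16 = (int32_t)hi;
    return imm == hi * 65536 + lo;
}
```

For instance, 0x12340678 splits into hi16 = 0x1234 and lo12 = 0x678, and 0xf800 splits into hi16 = 1 and lo12 = -2048, while 0x5000 falls into the hole.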




* Re: [PATCH v2 07/10] tcg/loongarch64: Improve setcond expansion
  2023-01-18  1:11 ` [PATCH v2 07/10] tcg/loongarch64: Improve setcond expansion Richard Henderson
@ 2023-01-23  7:10   ` WANG Xuerui
  0 siblings, 0 replies; 25+ messages in thread
From: WANG Xuerui @ 2023-01-23  7:10 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: git

On 1/18/23 09:11, Richard Henderson wrote:
> Split out a helper function, tcg_out_setcond_int, which
> does not always produce the complete boolean result, but
> returns a set of flags to do so.
>
> Accept all int32_t as constant input, so that LE/GT can
> adjust the constant to LT.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   tcg/loongarch64/tcg-target.c.inc | 165 +++++++++++++++++++++----------
>   1 file changed, 115 insertions(+), 50 deletions(-)

Reviewed-by: WANG Xuerui <git@xen0n.name>

Thanks!




* Re: [PATCH v2 10/10] tcg/loongarch64: Reorg goto_tb implementation
  2023-01-18  1:11 ` [PATCH v2 10/10] tcg/loongarch64: Reorg goto_tb implementation Richard Henderson
@ 2023-01-23  8:12   ` WANG Xuerui
  0 siblings, 0 replies; 25+ messages in thread
From: WANG Xuerui @ 2023-01-23  8:12 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: git

On 1/18/23 09:11, Richard Henderson wrote:
> The old implementation replaces two insns, swapping between
>
>          b       <dest>
>          nop
> and
>          pcaddu18i tmp, <dest>
>          jirl      zero, tmp, <dest> & 0xffff
>
> There is a race condition in which a thread could be stopped at
> the jirl, i.e. with the top of the address loaded, and when
> restarted we have re-linked to a different TB, so that the top
> half no longer matches the bottom half.
>
> Note that while we never directly re-link to a different TB, we
> can link, unlink, and link again all while the stopped thread
> remains stopped.
>
> The new implementation replaces only one insn, swapping between
>
>          b       <dest>
> and
>          pcaddu2i tmp, <jmp_addr>
>
> falling through to load the address from tmp, and branch.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   tcg/loongarch64/tcg-target.h     |  7 +---
>   tcg/loongarch64/tcg-target.c.inc | 72 ++++++++++++++------------------
>   2 files changed, 33 insertions(+), 46 deletions(-)

I've tested this on my 3A5000 box and things seem to work, thanks.

Reviewed-by: WANG Xuerui <git@xen0n.name>




* Re: [PATCH v2 09/10] tcg/loongarch64: Use tcg_pcrel_diff in tcg_out_ldst
  2023-01-18  1:11 ` [PATCH v2 09/10] tcg/loongarch64: Use tcg_pcrel_diff in tcg_out_ldst Richard Henderson
  2023-01-22  8:22   ` WANG Xuerui
@ 2023-01-23  8:32   ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 25+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-01-23  8:32 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: git

On 18/1/23 02:11, Richard Henderson wrote:
> Take the w^x split into account when computing the
> pc-relative distance to an absolute pointer.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   tcg/loongarch64/tcg-target.c.inc | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>





* Re: [PATCH v2 05/10] tcg/loongarch64: Update tcg-insn-defs.c.inc
  2023-01-18  1:11 ` [PATCH v2 05/10] tcg/loongarch64: Update tcg-insn-defs.c.inc Richard Henderson
  2023-01-22  8:20   ` WANG Xuerui
@ 2023-01-23  8:33   ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 25+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-01-23  8:33 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: git

On 18/1/23 02:11, Richard Henderson wrote:
> Regenerate with ADDU16I included:
> 
>     $ cd loongarch-opcodes/scripts/go
>     $ go run ./genqemutcgdefs > $QEMU/tcg/loongarch64/tcg-insn-defs.c.inc
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   tcg/loongarch64/tcg-insn-defs.c.inc | 10 +++++++++-
>   1 file changed, 9 insertions(+), 1 deletion(-)
>   mode change 100644 => 100755 tcg/loongarch64/tcg-insn-defs.c.inc

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>




* Re: [PATCH v2 01/10] target/loongarch: Enable the disassembler for host tcg
  2023-01-18  1:11 ` [PATCH v2 01/10] target/loongarch: Enable the disassembler for host tcg Richard Henderson
  2023-01-22  8:32   ` WANG Xuerui
@ 2023-01-23  8:37   ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 25+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-01-23  8:37 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: git

On 18/1/23 02:11, Richard Henderson wrote:
> Reuse the decodetree based disassembler from
> target/loongarch/ for tcg/loongarch64/.
> 
> The generation of decode-insns.c.inc into ./libcommon.fa.p/ could
> eventually result in conflict, if any other host requires the same
> trick, but this is good enough for now.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   disas.c                      | 2 ++
>   target/loongarch/meson.build | 3 ++-
>   2 files changed, 4 insertions(+), 1 deletion(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>





end of thread, other threads:[~2023-01-23  8:37 UTC | newest]

Thread overview: 25+ messages
2023-01-18  1:11 [PATCH v2 00/10] tcg/loongarch64: Reorg goto_tb and cleanups Richard Henderson
2023-01-18  1:11 ` [PATCH v2 01/10] target/loongarch: Enable the disassembler for host tcg Richard Henderson
2023-01-22  8:32   ` WANG Xuerui
2023-01-23  8:37   ` Philippe Mathieu-Daudé
2023-01-18  1:11 ` [PATCH v2 02/10] target/loongarch: Disassemble jirl properly Richard Henderson
2023-01-22  8:24   ` WANG Xuerui
2023-01-18  1:11 ` [PATCH v2 03/10] target/loongarch: Disassemble pcadd* addresses Richard Henderson
2023-01-22  8:24   ` WANG Xuerui
2023-01-18  1:11 ` [PATCH v2 04/10] tcg/loongarch64: Optimize immediate loading Richard Henderson
2023-01-22  8:21   ` WANG Xuerui
2023-01-18  1:11 ` [PATCH v2 05/10] tcg/loongarch64: Update tcg-insn-defs.c.inc Richard Henderson
2023-01-22  8:20   ` WANG Xuerui
2023-01-23  8:33   ` Philippe Mathieu-Daudé
2023-01-18  1:11 ` [PATCH v2 06/10] tcg/loongarch64: Introduce tcg_out_addi Richard Henderson
2023-01-23  6:52   ` WANG Xuerui
2023-01-18  1:11 ` [PATCH v2 07/10] tcg/loongarch64: Improve setcond expansion Richard Henderson
2023-01-23  7:10   ` WANG Xuerui
2023-01-18  1:11 ` [PATCH v2 08/10] tcg/loongarch64: Implement movcond Richard Henderson
2023-01-22  8:23   ` WANG Xuerui
2023-01-18  1:11 ` [PATCH v2 09/10] tcg/loongarch64: Use tcg_pcrel_diff in tcg_out_ldst Richard Henderson
2023-01-22  8:22   ` WANG Xuerui
2023-01-23  8:32   ` Philippe Mathieu-Daudé
2023-01-18  1:11 ` [PATCH v2 10/10] tcg/loongarch64: Reorg goto_tb implementation Richard Henderson
2023-01-23  8:12   ` WANG Xuerui
2023-01-22  8:28 ` [PATCH v2 00/10] tcg/loongarch64: Reorg goto_tb and cleanups WANG Xuerui
