* [PATCH v4] tcg/loongarch64: Add direct jump support
@ 2022-10-15 9:27 Qi Hu
2022-10-15 15:06 ` WANG Xuerui
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Qi Hu @ 2022-10-15 9:27 UTC
To: WANG Xuerui; +Cc: qemu-devel, Richard Henderson
Similar to ARM64, LoongArch has PC-relative instructions such as
PCADDU18I. These instructions can be used to support direct jumps on
LoongArch. Additionally, if the instruction "B offset" can reach the target
address (the target is within the ±128MB range), a single "B offset" plus a nop
will be used by "tb_target_set_jmp_target".
Cc: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Qi Hu <huqi@loongson.cn>
---
Changes since v3:
- Fix the offset check error pointed out by WANG Xuerui.
- Use TMP0 instead of T0.
- Remove useless block due to direct jump support.
- Add some assertions.
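Note: the range checks described in the commit message can be sketched as a small standalone C fragment. This is only an illustration under stated assumptions, not the code in the patch: `sext`, `fits_b` and `fits_pcaddu18i_jirl` are hypothetical names, `sext` stands in for QEMU's `sextreg`, and `>>` on a negative value is assumed to be an arithmetic shift (as it is on the compilers QEMU supports).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: sign-extend the low `bits` bits of v. */
static int64_t sext(int64_t v, unsigned bits)
{
    assert(bits > 0 && bits < 64);
    uint64_t sign = UINT64_C(1) << (bits - 1);
    uint64_t low = (uint64_t)v & ((sign << 1) - 1);
    return (int64_t)(low ^ sign) - (int64_t)sign;
}

/*
 * True when a byte displacement fits the 26-bit word offset of
 * "B offset", i.e. the target is within [-128 MiB, +128 MiB - 4].
 */
static bool fits_b(int64_t byte_disp)
{
    int64_t words = byte_disp >> 2;   /* instructions are 4 bytes */
    return words == sext(words, 26);
}

/*
 * Mirrors the 36-bit check that the patch asserts before it falls
 * back to the PCADDU18I + JIRL pair.
 */
static bool fits_pcaddu18i_jirl(int64_t byte_disp)
{
    int64_t words = byte_disp >> 2;
    return words == sext(words, 36);
}
```

With these helpers, a displacement of exactly +128 MiB is rejected by `fits_b` while -128 MiB is accepted, which is the usual asymmetry of a two's-complement field.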
---
tcg/loongarch64/tcg-target.c.inc | 48 +++++++++++++++++++++++++++++---
tcg/loongarch64/tcg-target.h | 9 ++++--
2 files changed, 50 insertions(+), 7 deletions(-)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index f5a214a17f..8facd78137 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -1031,6 +1031,36 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args)
#endif
}
+/* LoongArch uses `andi zero, zero, 0` as NOP. */
+#define NOP OPC_ANDI
+static void tcg_out_nop(TCGContext *s)
+{
+    tcg_out32(s, NOP);
+}
+
+void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
+                              uintptr_t jmp_rw, uintptr_t addr)
+{
+    tcg_insn_unit i1, i2;
+    ptrdiff_t upper, lower;
+    ptrdiff_t offset = (ptrdiff_t)(addr - jmp_rx) >> 2;
+
+    if (offset == sextreg(offset, 0, 26)) {
+        i1 = encode_sd10k16_insn(OPC_B, offset);
+        i2 = NOP;
+    } else {
+        tcg_debug_assert(offset == sextreg(offset, 0, 36));
+        lower = (int16_t)offset;
+        upper = (offset - lower) >> 16;
+
+        i1 = encode_dsj20_insn(OPC_PCADDU18I, TCG_REG_TMP0, upper);
+        i2 = encode_djsk16_insn(OPC_JIRL, TCG_REG_ZERO, TCG_REG_TMP0, lower);
+    }
+    uint64_t pair = ((uint64_t)i2 << 32) | i1;
+    qatomic_set((uint64_t *)jmp_rw, pair);
+    flush_idcache_range(jmp_rx, jmp_rw, 8);
+}
+
/*
* Entry-points
*/
@@ -1058,10 +1088,20 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
     case INDEX_op_goto_tb:
-        assert(s->tb_jmp_insn_offset == 0);
-        /* indirect jump method */
-        tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP0, TCG_REG_ZERO,
-                   (uintptr_t)(s->tb_jmp_target_addr + a0));
+        tcg_debug_assert(s->tb_jmp_insn_offset != NULL);
+        /*
+         * Ensure that patch area is 8-byte aligned so that an
+         * atomic write can be used to patch the target address.
+         */
+        if ((uintptr_t)s->code_ptr & 7) {
+            tcg_out_nop(s);
+        }
+        s->tb_jmp_insn_offset[a0] = tcg_current_code_size(s);
+        /*
+         * actual branch destination will be patched by
+         * tb_target_set_jmp_target later
+         */
+        tcg_out_opc_pcaddu18i(s, TCG_REG_TMP0, 0);
         tcg_out_opc_jirl(s, TCG_REG_ZERO, TCG_REG_TMP0, 0);
         set_jmp_reset_offset(s, a0);
         break;
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index 67380b2432..ba05ba552e 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -42,7 +42,11 @@
#define TCG_TARGET_INSN_UNIT_SIZE 4
#define TCG_TARGET_NB_REGS 32
-#define MAX_CODE_GEN_BUFFER_SIZE SIZE_MAX
+/*
+ * PCADDU18I + JIRL sequence can give 20 + 16 + 2 = 38 bits
+ * signed offset, which is +/- 128 GiB.
+ */
+#define MAX_CODE_GEN_BUFFER_SIZE (128 * GiB)
typedef enum {
TCG_REG_ZERO,
@@ -123,7 +127,7 @@ typedef enum {
#define TCG_TARGET_HAS_clz_i32 1
#define TCG_TARGET_HAS_ctz_i32 1
#define TCG_TARGET_HAS_ctpop_i32 0
-#define TCG_TARGET_HAS_direct_jump 0
+#define TCG_TARGET_HAS_direct_jump 1
#define TCG_TARGET_HAS_brcond2 0
#define TCG_TARGET_HAS_setcond2 0
#define TCG_TARGET_HAS_qemu_st8_i32 0
@@ -166,7 +170,6 @@ typedef enum {
#define TCG_TARGET_HAS_muluh_i64 1
#define TCG_TARGET_HAS_mulsh_i64 1
-/* not defined -- call should be eliminated at compile time */
void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
#define TCG_TARGET_DEFAULT_MO (0)
--
2.38.0
* Re: [PATCH v4] tcg/loongarch64: Add direct jump support
From: WANG Xuerui @ 2022-10-15 15:06 UTC
To: Qi Hu, WANG Xuerui; +Cc: qemu-devel, Richard Henderson
On 10/15/22 17:27, Qi Hu wrote:
> Similar to ARM64, LoongArch has PC-relative instructions such as
> PCADDU18I. These instructions can be used to support direct jumps on
> LoongArch. Additionally, if the instruction "B offset" can reach the target
> address (the target is within the ±128MB range), a single "B offset" plus a nop
> will be used by "tb_target_set_jmp_target".
>
> Cc: Richard Henderson <richard.henderson@linaro.org>
> Signed-off-by: Qi Hu <huqi@loongson.cn>
> ---
> Changes since v3:
> - Fix the offset check error pointed out by WANG Xuerui.
> - Use TMP0 instead of T0.
> - Remove useless block due to direct jump support.
> - Add some assertions.
> ---
> tcg/loongarch64/tcg-target.c.inc | 48 +++++++++++++++++++++++++++++---
> tcg/loongarch64/tcg-target.h | 9 ++++--
> 2 files changed, 50 insertions(+), 7 deletions(-)
Richard may want to double-check for restoring his R-b, but this looks
good to me now. Thanks!
Reviewed-by: WANG Xuerui <git@xen0n.name>
* Re: [PATCH v4] tcg/loongarch64: Add direct jump support
From: Richard Henderson @ 2022-10-16 10:04 UTC
To: Qi Hu, WANG Xuerui; +Cc: qemu-devel
On 10/15/22 19:27, Qi Hu wrote:
> Similar to ARM64, LoongArch has PC-relative instructions such as
> PCADDU18I. These instructions can be used to support direct jumps on
> LoongArch. Additionally, if the instruction "B offset" can reach the target
> address (the target is within the ±128MB range), a single "B offset" plus a nop
> will be used by "tb_target_set_jmp_target".
>
> Cc: Richard Henderson <richard.henderson@linaro.org>
> Signed-off-by: Qi Hu <huqi@loongson.cn>
> ---
> Changes since v3:
> - Fix the offset check error pointed out by WANG Xuerui.
> - Use TMP0 instead of T0.
> - Remove useless block due to direct jump support.
> - Add some assertions.
> ---
> tcg/loongarch64/tcg-target.c.inc | 48 +++++++++++++++++++++++++++++---
> tcg/loongarch64/tcg-target.h | 9 ++++--
> 2 files changed, 50 insertions(+), 7 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
* Re: [PATCH v4] tcg/loongarch64: Add direct jump support
From: Richard Henderson @ 2022-10-20 22:42 UTC
To: Qi Hu, WANG Xuerui; +Cc: qemu-devel
On 10/15/22 19:27, Qi Hu wrote:
> Similar to ARM64, LoongArch has PC-relative instructions such as
> PCADDU18I. These instructions can be used to support direct jumps on
> LoongArch. Additionally, if the instruction "B offset" can reach the target
> address (the target is within the ±128MB range), a single "B offset" plus a nop
> will be used by "tb_target_set_jmp_target".
>
> Cc: Richard Henderson <richard.henderson@linaro.org>
> Signed-off-by: Qi Hu <huqi@loongson.cn>
> ---
> Changes since v3:
> - Fix the offset check error pointed out by WANG Xuerui.
> - Use TMP0 instead of T0.
> - Remove useless block due to direct jump support.
> - Add some assertions.
> ---
Queued to tcg-next.
Fixed a minor nit:
> +void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
> + uintptr_t jmp_rw, uintptr_t addr)
> +{
> + tcg_insn_unit i1, i2;
> + ptrdiff_t upper, lower;
> + ptrdiff_t offset = (ptrdiff_t)(addr - jmp_rx) >> 2;
> +
> + if (offset == sextreg(offset, 0, 26)) {
> + i1 = encode_sd10k16_insn(OPC_B, offset);
> + i2 = NOP;
> + } else {
> + tcg_debug_assert(offset == sextreg(offset, 0, 36));
This assert is smaller...
> +/*
> + * PCADDU18I + JIRL sequence can give 20 + 16 + 2 = 38 bits
> + * signed offset, which is +/- 128 GiB.
> + */
> +#define MAX_CODE_GEN_BUFFER_SIZE (128 * GiB)
... than the correct calculation here.
r~
* Re: [PATCH v4] tcg/loongarch64: Add direct jump support
From: WANG Xuerui @ 2022-10-21 0:04 UTC
To: Richard Henderson, Qi Hu, WANG Xuerui; +Cc: qemu-devel
On October 21, 2022 6:42:58 AM GMT+08:00, Richard Henderson <richard.henderson@linaro.org> wrote:
>Fixed a minor nit:
>
>> +void tb_target_set_jmp_target(uintptr_t tc_ptr, uintptr_t jmp_rx,
>> + uintptr_t jmp_rw, uintptr_t addr)
>> +{
>> + tcg_insn_unit i1, i2;
>> + ptrdiff_t upper, lower;
>> + ptrdiff_t offset = (ptrdiff_t)(addr - jmp_rx) >> 2;
>> +
>> + if (offset == sextreg(offset, 0, 26)) {
>> + i1 = encode_sd10k16_insn(OPC_B, offset);
>> + i2 = NOP;
>> + } else {
>> + tcg_debug_assert(offset == sextreg(offset, 0, 36));
>
>This assert is smaller...
>
>> +/*
>> + * PCADDU18I + JIRL sequence can give 20 + 16 + 2 = 38 bits
>> + * signed offset, which is +/- 128 GiB.
>> + */
>> +#define MAX_CODE_GEN_BUFFER_SIZE (128 * GiB)
>
>... than the correct calculation here.
Actually no... the offset above is already shifted right by 2 (it counts 4-byte instructions, not bytes), so 36 bits is exactly 20 (pcaddu18i) + 16 (jirl). The LoongArch assembly gotchas hit hard...
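The arithmetic can be checked with a minimal standalone sketch. It assumes the architectural behaviour of the two instructions (PCADDU18I adds its 20-bit immediate shifted left by 18 to the PC; JIRL adds its 16-bit immediate shifted left by 2 to the base register). `jump_split`, `split_word_offset` and `pair_byte_disp` are hypothetical names used only here; the split itself follows the two lines in the patch, including the narrowing cast to `int16_t`, which is assumed to wrap as it does on mainstream compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the two immediates. */
typedef struct {
    int64_t upper;   /* si20 for PCADDU18I */
    int64_t lower;   /* offs16 for JIRL */
} jump_split;

/* Split a word (instruction-count) offset the way the patch does. */
static jump_split split_word_offset(int64_t words)
{
    jump_split r;
    r.lower = (int16_t)words;            /* signed low 16 bits */
    r.upper = (words - r.lower) >> 16;   /* what remains, in 64 Ki-word units */
    return r;
}

/* Byte displacement produced by the PCADDU18I + JIRL pair. */
static int64_t pair_byte_disp(jump_split s)
{
    /* Multiply instead of shifting so negative values stay well defined. */
    return s.upper * (INT64_C(1) << 18) + s.lower * 4;
}
```

For every word offset tried, `pair_byte_disp(split_word_offset(w))` equals `w * 4`, so a 36-bit word offset corresponds to a 38-bit byte offset, i.e. the +/- 128 GiB in the header comment. One caveat follows from the same arithmetic: because `lower` is sign-extended, word offsets within 32768 of the positive 36-bit limit would need `upper` equal to 2^19, which is one more than a signed 20-bit field can hold.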