qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 00/30] LoongArch64 port of QEMU TCG
@ 2021-09-22 18:08 WANG Xuerui
  2021-09-22 18:08 ` [PATCH v3 01/30] elf: Add machine type value for LoongArch WANG Xuerui
                   ` (29 more replies)
  0 siblings, 30 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Hi all,

This is a port of QEMU TCG to the brand-new CPU architecture LoongArch,
introduced by Loongson with their 3A5000 chips. Test suite all passed
except one timeout that is test-crypto-tlssession, but this particular
case runs well when relatively few targets are enabled, so it may be
just a case of low performance (4C4T 2.5GHz). I also boot-tested x86_64
(Debian and Gentoo installation CDs) and install-tested aarch64 (Debian
netboot installer), and ran riscv64 linux-user emulation with a chroot;
everything seems fine so far.

## About the series

Only the LP64 ABI is supported, as this is the only one fully
implemented and supported by Loongson. 32-bit support is incomplete from
outset, and removed from the very latest upstream submissions, so you
can't even configure for that.

The architecture's documentation is already translated into English;
it can be browsed at https://loongson.github.io/LoongArch-Documentation/.
The LoongArch ELF psABI doc (version 1.00) could be found at [1];
if anything is missing there, it's most likely the same as RISC-V, but
you can always raise an issue over their issue tracker at [2].

[1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html
[2]: https://github.com/loongson/LoongArch-Documentation/issues

In this series I made use of generated instruction encodings and
emitters from https://github.com/loongson-community/loongarch-opcodes
(a community project started by myself, something I must admit), as the
LoongArch encoding is highly irregular even for a fixed 32-bit ISA, and
I want to minimize the maintenance burden for future collaboration. This
is something not seen in any of the other TCG ports out there, so I'd
like to see if this is acceptable practice (and also maybe bikeshed the
file name).

This series touches some of the same files as Song Gao's previous
submission of LoongArch *target* support, which is a bit unfortunate;
one of us will have to rebase after either series gets in. Actual
conflict should only happen on build system bits and include/elf.h,
though, as we're working on entirely different areas.

## How to build and test this

Upstream support for LoongArch is largely WIP for now, which means you
must apply a lot of patches if you want to even cross-build for this arch.
The main sources I used are as follows:

* binutils: https://github.com/xen0n/binutils-gdb/tree/for-gentoo-2.37-v2
  based on https://github.com/loongson/binutils-gdb/tree/loongarch/upstream_v6_a1d65b3
* gcc: https://github.com/xen0n/gcc/tree/for-gentoo-gcc-12-v2
  based on https://github.com/loongson/gcc/tree/loongarch_upstream
* glibc: https://github.com/xen0n/glibc/tree/for-gentoo-glibc-2.34
  based on https://github.com/loongson/glibc/tree/loongarch_2_34_for_upstream
* Linux: https://github.com/xen0n/linux/tree/loongarch-playground
  based on https://github.com/loongson/linux/tree/loongarch-next
* Gentoo overlay: https://github.com/xen0n/loongson-overlay

I have made ready-to-use Gentoo stage3 tarballs, but they're served with
CDN off my personal cloud account, and I don't want the link to be
exposed so that my bills skyrocket; you can reach me off-list to get the
links if you're interested.

As for the hardware availability, the boards can already be bought in
China on Taobao, and I think some people at Loongson might be able to
arrange for testing environments, if testing on real hardware other than
mine is required before merging; they have their in-house Debian spin-off
from the early days of this architecture. Their kernel is
ABI-incompatible with the version being upstreamed and used by me, but
QEMU should work there regardless.

Lastly, I'm new to QEMU development and this is my first patch series
here; apologizes if I get anything wrong, and any help or suggestion is
certainly appreciated!

## Changelog

v3 -> v2:

- Addressed all review comments from v2
  - Re-organized changes to tcg-target.h so that it's incrementally
    updated in each commit implementing ops
  - Removed support for the eqv op
  - Added support for bswap16_i{32,64} ops
  - Fixed and refactored various places as pointed out during review
- Updated generated instruction definitions to latest

v2 -> v1:

- Addressed all review comments from v1
  - Use "loongarch64" everywhere, tcg directory renamed to "tcg/loongarch64"
  - Removed all redundant TCG_TARGET_REG_BITS conditional
  - Removed support for the neg op
  - Added support for eqv and bswap32_i64 ops
  - Added safe syscall handling for linux-user
  - Fixed everything else I could see
- Updated generated instruction definitions to latest
- Reordered the configure/meson.build changes to come last

v2: https://patchew.org/QEMU/20210921201915.601245-1-git@xen0n.name/
v1: https://patchew.org/QEMU/20210920080451.408655-1-git@xen0n.name/

WANG Xuerui (30):
  elf: Add machine type value for LoongArch
  MAINTAINERS: Add tcg/loongarch64 entry with myself as maintainer
  tcg/loongarch64: Add the tcg-target.h file
  tcg/loongarch64: Add generated instruction opcodes and encoding
    helpers
  tcg/loongarch64: Add register names, allocation order and input/output
    sets
  tcg/loongarch64: Define the operand constraints
  tcg/loongarch64: Implement necessary relocation operations
  tcg/loongarch64: Implement the memory barrier op
  tcg/loongarch64: Implement tcg_out_mov and tcg_out_movi
  tcg/loongarch64: Implement goto_ptr
  tcg/loongarch64: Implement sign-/zero-extension ops
  tcg/loongarch64: Implement not/and/or/xor/nor/andc/orc ops
  tcg/loongarch64: Implement deposit/extract ops
  tcg/loongarch64: Implement bswap{16,32,64} ops
  tcg/loongarch64: Implement clz/ctz ops
  tcg/loongarch64: Implement shl/shr/sar/rotl/rotr ops
  tcg/loongarch64: Implement add/sub ops
  tcg/loongarch64: Implement mul/mulsh/muluh/div/divu/rem/remu ops
  tcg/loongarch64: Implement br/brcond ops
  tcg/loongarch64: Implement setcond ops
  tcg/loongarch64: Implement tcg_out_call
  tcg/loongarch64: Implement simple load/store ops
  tcg/loongarch64: Add softmmu load/store helpers, implement
    qemu_ld/qemu_st ops
  tcg/loongarch64: Implement tcg_target_qemu_prologue
  tcg/loongarch64: Implement exit_tb/goto_tb
  tcg/loongarch64: Implement tcg_target_init
  tcg/loongarch64: Register the JIT
  linux-user: Add safe syscall handling for loongarch64 hosts
  accel/tcg/user-exec: Implement CPU-specific signal handler for
    loongarch64 hosts
  configure, meson.build: Mark support for loongarch64 hosts

 MAINTAINERS                                   |    5 +
 accel/tcg/user-exec.c                         |   73 +
 configure                                     |    7 +-
 include/elf.h                                 |    2 +
 linux-user/host/loongarch64/hostdep.h         |   34 +
 .../host/loongarch64/safe-syscall.inc.S       |   80 +
 meson.build                                   |    2 +-
 tcg/loongarch64/tcg-insn-defs.c.inc           |  881 +++++++++
 tcg/loongarch64/tcg-target-con-set.h          |   30 +
 tcg/loongarch64/tcg-target-con-str.h          |   28 +
 tcg/loongarch64/tcg-target.c.inc              | 1616 +++++++++++++++++
 tcg/loongarch64/tcg-target.h                  |  180 ++
 12 files changed, 2936 insertions(+), 2 deletions(-)
 create mode 100644 linux-user/host/loongarch64/hostdep.h
 create mode 100644 linux-user/host/loongarch64/safe-syscall.inc.S
 create mode 100644 tcg/loongarch64/tcg-insn-defs.c.inc
 create mode 100644 tcg/loongarch64/tcg-target-con-set.h
 create mode 100644 tcg/loongarch64/tcg-target-con-str.h
 create mode 100644 tcg/loongarch64/tcg-target.c.inc
 create mode 100644 tcg/loongarch64/tcg-target.h

-- 
2.33.0



^ permalink raw reply	[flat|nested] 53+ messages in thread

* [PATCH v3 01/30] elf: Add machine type value for LoongArch
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
@ 2021-09-22 18:08 ` WANG Xuerui
  2021-09-22 18:17   ` Richard Henderson
  2021-09-22 18:23   ` Philippe Mathieu-Daudé
  2021-09-22 18:08 ` [PATCH v3 02/30] MAINTAINERS: Add tcg/loongarch64 entry with myself as maintainer WANG Xuerui
                   ` (28 subsequent siblings)
  29 siblings, 2 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

This is already officially allocated as recorded in GNU binutils
repo [1], and the description is updated in [2]. Add to enable further
work.

[1]: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=4cf2ad720078a9f490dd5b5bc8893a926479196e
[2]: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=01a8c731aacbdbed0eb5682d13cc074dc7e25fb3

Signed-off-by: WANG Xuerui <git@xen0n.name>
---
 include/elf.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/elf.h b/include/elf.h
index 811bf4a1cb..3a4bcb646a 100644
--- a/include/elf.h
+++ b/include/elf.h
@@ -182,6 +182,8 @@ typedef struct mips_elf_abiflags_v0 {
 
 #define EM_NANOMIPS     249     /* Wave Computing nanoMIPS */
 
+#define EM_LOONGARCH    258     /* LoongArch */
+
 /*
  * This is an interim value that we will use until the committee comes
  * up with a final number.
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 02/30] MAINTAINERS: Add tcg/loongarch64 entry with myself as maintainer
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
  2021-09-22 18:08 ` [PATCH v3 01/30] elf: Add machine type value for LoongArch WANG Xuerui
@ 2021-09-22 18:08 ` WANG Xuerui
  2021-09-22 18:09 ` [PATCH v3 03/30] tcg/loongarch64: Add the tcg-target.h file WANG Xuerui
                   ` (27 subsequent siblings)
  29 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:08 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

I ported the initial code, so I should maintain it of course.

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 MAINTAINERS | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index d7915ec128..859e5b5ba2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3114,6 +3114,11 @@ S: Maintained
 F: tcg/i386/
 F: disas/i386.c
 
+LoongArch64 TCG target
+M: WANG Xuerui <git@xen0n.name>
+S: Maintained
+F: tcg/loongarch64/
+
 MIPS TCG target
 M: Philippe Mathieu-Daudé <f4bug@amsat.org>
 R: Aurelien Jarno <aurelien@aurel32.net>
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 03/30] tcg/loongarch64: Add the tcg-target.h file
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
  2021-09-22 18:08 ` [PATCH v3 01/30] elf: Add machine type value for LoongArch WANG Xuerui
  2021-09-22 18:08 ` [PATCH v3 02/30] MAINTAINERS: Add tcg/loongarch64 entry with myself as maintainer WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:34   ` Philippe Mathieu-Daudé
  2021-09-22 18:09 ` [PATCH v3 04/30] tcg/loongarch64: Add generated instruction opcodes and encoding helpers WANG Xuerui
                   ` (26 subsequent siblings)
  29 siblings, 1 reply; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Support for all optional TCG ops are initially marked disabled; the bits
are to be set in individual commits later.

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target.h | 180 +++++++++++++++++++++++++++++++++++
 1 file changed, 180 insertions(+)
 create mode 100644 tcg/loongarch64/tcg-target.h

diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
new file mode 100644
index 0000000000..0fd9b61e6d
--- /dev/null
+++ b/tcg/loongarch64/tcg-target.h
@@ -0,0 +1,180 @@
+/*
+ * Tiny Code Generator for QEMU
+ *
+ * Copyright (c) 2021 WANG Xuerui <git@xen0n.name>
+ *
+ * Based on tcg/riscv/tcg-target.h
+ *
+ * Copyright (c) 2018 SiFive, Inc
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef LOONGARCH_TCG_TARGET_H
+#define LOONGARCH_TCG_TARGET_H
+
+/*
+ * Loongson removed the (incomplete) 32-bit support from kernel and toolchain
+ * for the initial upstreaming of this architecture, so don't bother and just
+ * support the LP64 ABI for now.
+ */
+#if defined(__loongarch64)
+# define TCG_TARGET_REG_BITS 64
+#else
+# error unsupported LoongArch register size
+#endif
+
+#define TCG_TARGET_INSN_UNIT_SIZE 4
+#define TCG_TARGET_NB_REGS 32
+#define MAX_CODE_GEN_BUFFER_SIZE  ((size_t)-1)
+
+typedef enum {
+    TCG_REG_ZERO,
+    TCG_REG_RA,
+    TCG_REG_TP,
+    TCG_REG_SP,
+    TCG_REG_A0,
+    TCG_REG_A1,
+    TCG_REG_A2,
+    TCG_REG_A3,
+    TCG_REG_A4,
+    TCG_REG_A5,
+    TCG_REG_A6,
+    TCG_REG_A7,
+    TCG_REG_T0,
+    TCG_REG_T1,
+    TCG_REG_T2,
+    TCG_REG_T3,
+    TCG_REG_T4,
+    TCG_REG_T5,
+    TCG_REG_T6,
+    TCG_REG_T7,
+    TCG_REG_T8,
+    TCG_REG_RESERVED,
+    TCG_REG_S9,
+    TCG_REG_S0,
+    TCG_REG_S1,
+    TCG_REG_S2,
+    TCG_REG_S3,
+    TCG_REG_S4,
+    TCG_REG_S5,
+    TCG_REG_S6,
+    TCG_REG_S7,
+    TCG_REG_S8,
+
+    /* aliases */
+    TCG_AREG0    = TCG_REG_S0,
+    TCG_REG_TMP0 = TCG_REG_T8,
+    TCG_REG_TMP1 = TCG_REG_T7,
+    TCG_REG_TMP2 = TCG_REG_T6,
+} TCGReg;
+
+/* used for function call generation */
+#define TCG_REG_CALL_STACK              TCG_REG_SP
+#define TCG_TARGET_STACK_ALIGN          16
+#define TCG_TARGET_CALL_ALIGN_ARGS      1
+#define TCG_TARGET_CALL_STACK_OFFSET    0
+
+/* optional instructions */
+#define TCG_TARGET_HAS_movcond_i32      0
+#define TCG_TARGET_HAS_div_i32          0
+#define TCG_TARGET_HAS_rem_i32          0
+#define TCG_TARGET_HAS_div2_i32         0
+#define TCG_TARGET_HAS_rot_i32          0
+#define TCG_TARGET_HAS_deposit_i32      0
+#define TCG_TARGET_HAS_extract_i32      0
+#define TCG_TARGET_HAS_sextract_i32     0
+#define TCG_TARGET_HAS_extract2_i32     0
+#define TCG_TARGET_HAS_add2_i32         0
+#define TCG_TARGET_HAS_sub2_i32         0
+#define TCG_TARGET_HAS_mulu2_i32        0
+#define TCG_TARGET_HAS_muls2_i32        0
+#define TCG_TARGET_HAS_muluh_i32        0
+#define TCG_TARGET_HAS_mulsh_i32        0
+#define TCG_TARGET_HAS_ext8s_i32        0
+#define TCG_TARGET_HAS_ext16s_i32       0
+#define TCG_TARGET_HAS_ext8u_i32        0
+#define TCG_TARGET_HAS_ext16u_i32       0
+#define TCG_TARGET_HAS_bswap16_i32      0
+#define TCG_TARGET_HAS_bswap32_i32      0
+#define TCG_TARGET_HAS_not_i32          0
+#define TCG_TARGET_HAS_neg_i32          0
+#define TCG_TARGET_HAS_andc_i32         0
+#define TCG_TARGET_HAS_orc_i32          0
+#define TCG_TARGET_HAS_eqv_i32          0
+#define TCG_TARGET_HAS_nand_i32         0
+#define TCG_TARGET_HAS_nor_i32          0
+#define TCG_TARGET_HAS_clz_i32          0
+#define TCG_TARGET_HAS_ctz_i32          0
+#define TCG_TARGET_HAS_ctpop_i32        0
+#define TCG_TARGET_HAS_direct_jump      0
+#define TCG_TARGET_HAS_brcond2          0
+#define TCG_TARGET_HAS_setcond2         0
+#define TCG_TARGET_HAS_qemu_st8_i32     0
+
+/* 64-bit operations */
+#define TCG_TARGET_HAS_movcond_i64      0
+#define TCG_TARGET_HAS_div_i64          0
+#define TCG_TARGET_HAS_rem_i64          0
+#define TCG_TARGET_HAS_div2_i64         0
+#define TCG_TARGET_HAS_rot_i64          0
+#define TCG_TARGET_HAS_deposit_i64      0
+#define TCG_TARGET_HAS_extract_i64      0
+#define TCG_TARGET_HAS_sextract_i64     0
+#define TCG_TARGET_HAS_extract2_i64     0
+#define TCG_TARGET_HAS_extrl_i64_i32    0
+#define TCG_TARGET_HAS_extrh_i64_i32    0
+#define TCG_TARGET_HAS_ext8s_i64        0
+#define TCG_TARGET_HAS_ext16s_i64       0
+#define TCG_TARGET_HAS_ext32s_i64       0
+#define TCG_TARGET_HAS_ext8u_i64        0
+#define TCG_TARGET_HAS_ext16u_i64       0
+#define TCG_TARGET_HAS_ext32u_i64       0
+#define TCG_TARGET_HAS_bswap16_i64      0
+#define TCG_TARGET_HAS_bswap32_i64      0
+#define TCG_TARGET_HAS_bswap64_i64      0
+#define TCG_TARGET_HAS_not_i64          0
+#define TCG_TARGET_HAS_neg_i64          0
+#define TCG_TARGET_HAS_andc_i64         0
+#define TCG_TARGET_HAS_orc_i64          0
+#define TCG_TARGET_HAS_eqv_i64          0
+#define TCG_TARGET_HAS_nand_i64         0
+#define TCG_TARGET_HAS_nor_i64          0
+#define TCG_TARGET_HAS_clz_i64          0
+#define TCG_TARGET_HAS_ctz_i64          0
+#define TCG_TARGET_HAS_ctpop_i64        0
+#define TCG_TARGET_HAS_add2_i64         0
+#define TCG_TARGET_HAS_sub2_i64         0
+#define TCG_TARGET_HAS_mulu2_i64        0
+#define TCG_TARGET_HAS_muls2_i64        0
+#define TCG_TARGET_HAS_muluh_i64        0
+#define TCG_TARGET_HAS_mulsh_i64        0
+
+/* not defined -- call should be eliminated at compile time */
+void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
+
+#define TCG_TARGET_DEFAULT_MO (0)
+
+#ifdef CONFIG_SOFTMMU
+#define TCG_TARGET_NEED_LDST_LABELS
+#endif
+
+#define TCG_TARGET_HAS_MEMORY_BSWAP 0
+
+#endif /* LOONGARCH_TCG_TARGET_H */
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 04/30] tcg/loongarch64: Add generated instruction opcodes and encoding helpers
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (2 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 03/30] tcg/loongarch64: Add the tcg-target.h file WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:37   ` Philippe Mathieu-Daudé
  2021-09-22 18:09 ` [PATCH v3 05/30] tcg/loongarch64: Add register names, allocation order and input/output sets WANG Xuerui
                   ` (25 subsequent siblings)
  29 siblings, 1 reply; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-insn-defs.c.inc | 881 ++++++++++++++++++++++++++++
 1 file changed, 881 insertions(+)
 create mode 100644 tcg/loongarch64/tcg-insn-defs.c.inc

diff --git a/tcg/loongarch64/tcg-insn-defs.c.inc b/tcg/loongarch64/tcg-insn-defs.c.inc
new file mode 100644
index 0000000000..7d73136b1b
--- /dev/null
+++ b/tcg/loongarch64/tcg-insn-defs.c.inc
@@ -0,0 +1,881 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * LoongArch instruction formats, opcodes, and encoders for TCG use.
+ *
+ * Code generated by genqemutcgdefs from
+ * https://github.com/loongson-community/loongarch-opcodes,
+ * from commit 308d38087401eb333063556f027ce0b24d86b416.
+ * DO NOT EDIT.
+ */
+
+typedef enum {
+    OPC_CLZ_W = 0x00001400,
+    OPC_CTZ_W = 0x00001c00,
+    OPC_CLZ_D = 0x00002400,
+    OPC_CTZ_D = 0x00002c00,
+    OPC_REVB_2H = 0x00003000,
+    OPC_REVB_2W = 0x00003800,
+    OPC_REVB_D = 0x00003c00,
+    OPC_SEXT_H = 0x00005800,
+    OPC_SEXT_B = 0x00005c00,
+    OPC_ADD_W = 0x00100000,
+    OPC_ADD_D = 0x00108000,
+    OPC_SUB_W = 0x00110000,
+    OPC_SUB_D = 0x00118000,
+    OPC_SLT = 0x00120000,
+    OPC_SLTU = 0x00128000,
+    OPC_MASKEQZ = 0x00130000,
+    OPC_MASKNEZ = 0x00138000,
+    OPC_NOR = 0x00140000,
+    OPC_AND = 0x00148000,
+    OPC_OR = 0x00150000,
+    OPC_XOR = 0x00158000,
+    OPC_ORN = 0x00160000,
+    OPC_ANDN = 0x00168000,
+    OPC_SLL_W = 0x00170000,
+    OPC_SRL_W = 0x00178000,
+    OPC_SRA_W = 0x00180000,
+    OPC_SLL_D = 0x00188000,
+    OPC_SRL_D = 0x00190000,
+    OPC_SRA_D = 0x00198000,
+    OPC_ROTR_W = 0x001b0000,
+    OPC_ROTR_D = 0x001b8000,
+    OPC_MUL_W = 0x001c0000,
+    OPC_MULH_W = 0x001c8000,
+    OPC_MULH_WU = 0x001d0000,
+    OPC_MUL_D = 0x001d8000,
+    OPC_MULH_D = 0x001e0000,
+    OPC_MULH_DU = 0x001e8000,
+    OPC_DIV_W = 0x00200000,
+    OPC_MOD_W = 0x00208000,
+    OPC_DIV_WU = 0x00210000,
+    OPC_MOD_WU = 0x00218000,
+    OPC_DIV_D = 0x00220000,
+    OPC_MOD_D = 0x00228000,
+    OPC_DIV_DU = 0x00230000,
+    OPC_MOD_DU = 0x00238000,
+    OPC_SLLI_W = 0x00408000,
+    OPC_SLLI_D = 0x00410000,
+    OPC_SRLI_W = 0x00448000,
+    OPC_SRLI_D = 0x00450000,
+    OPC_SRAI_W = 0x00488000,
+    OPC_SRAI_D = 0x00490000,
+    OPC_ROTRI_W = 0x004c8000,
+    OPC_ROTRI_D = 0x004d0000,
+    OPC_BSTRINS_W = 0x00600000,
+    OPC_BSTRPICK_W = 0x00608000,
+    OPC_BSTRINS_D = 0x00800000,
+    OPC_BSTRPICK_D = 0x00c00000,
+    OPC_SLTI = 0x02000000,
+    OPC_SLTUI = 0x02400000,
+    OPC_ADDI_W = 0x02800000,
+    OPC_ADDI_D = 0x02c00000,
+    OPC_CU52I_D = 0x03000000,
+    OPC_ANDI = 0x03400000,
+    OPC_ORI = 0x03800000,
+    OPC_XORI = 0x03c00000,
+    OPC_LU12I_W = 0x14000000,
+    OPC_CU32I_D = 0x16000000,
+    OPC_PCADDU2I = 0x18000000,
+    OPC_PCADDU12I = 0x1c000000,
+    OPC_PCADDU18I = 0x1e000000,
+    OPC_LD_B = 0x28000000,
+    OPC_LD_H = 0x28400000,
+    OPC_LD_W = 0x28800000,
+    OPC_LD_D = 0x28c00000,
+    OPC_ST_B = 0x29000000,
+    OPC_ST_H = 0x29400000,
+    OPC_ST_W = 0x29800000,
+    OPC_ST_D = 0x29c00000,
+    OPC_LD_BU = 0x2a000000,
+    OPC_LD_HU = 0x2a400000,
+    OPC_LD_WU = 0x2a800000,
+    OPC_DBAR = 0x38720000,
+    OPC_JIRL = 0x4c000000,
+    OPC_B = 0x50000000,
+    OPC_BL = 0x54000000,
+    OPC_BEQ = 0x58000000,
+    OPC_BNE = 0x5c000000,
+    OPC_BGT = 0x60000000,
+    OPC_BLE = 0x64000000,
+    OPC_BGTU = 0x68000000,
+    OPC_BLEU = 0x6c000000,
+} LoongArchInsn;
+
+static int32_t __attribute__((unused))
+encode_d_slot(LoongArchInsn opc, uint32_t d)
+{
+    return opc | d;
+}
+
+static int32_t __attribute__((unused))
+encode_dj_slots(LoongArchInsn opc, uint32_t d, uint32_t j)
+{
+    return opc | d | j << 5;
+}
+
+static int32_t __attribute__((unused))
+encode_djk_slots(LoongArchInsn opc, uint32_t d, uint32_t j, uint32_t k)
+{
+    return opc | d | j << 5 | k << 10;
+}
+
+static int32_t __attribute__((unused))
+encode_djkm_slots(LoongArchInsn opc, uint32_t d, uint32_t j, uint32_t k,
+                  uint32_t m)
+{
+    return opc | d | j << 5 | k << 10 | m << 16;
+}
+
+static int32_t __attribute__((unused))
+encode_dk_slots(LoongArchInsn opc, uint32_t d, uint32_t k)
+{
+    return opc | d | k << 10;
+}
+
+static int32_t __attribute__((unused))
+encode_dj_insn(LoongArchInsn opc, TCGReg d, TCGReg j)
+{
+    tcg_debug_assert(d >= 0 && d <= 0x1f);
+    tcg_debug_assert(j >= 0 && j <= 0x1f);
+    return encode_dj_slots(opc, d, j);
+}
+
+static int32_t __attribute__((unused))
+encode_djk_insn(LoongArchInsn opc, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_debug_assert(d >= 0 && d <= 0x1f);
+    tcg_debug_assert(j >= 0 && j <= 0x1f);
+    tcg_debug_assert(k >= 0 && k <= 0x1f);
+    return encode_djk_slots(opc, d, j, k);
+}
+
+static int32_t __attribute__((unused))
+encode_djsk12_insn(LoongArchInsn opc, TCGReg d, TCGReg j, int32_t sk12)
+{
+    tcg_debug_assert(d >= 0 && d <= 0x1f);
+    tcg_debug_assert(j >= 0 && j <= 0x1f);
+    tcg_debug_assert(sk12 >= -0x800 && sk12 <= 0x7ff);
+    return encode_djk_slots(opc, d, j, sk12 & 0xfff);
+}
+
+static int32_t __attribute__((unused))
+encode_djsk16_insn(LoongArchInsn opc, TCGReg d, TCGReg j, int32_t sk16)
+{
+    tcg_debug_assert(d >= 0 && d <= 0x1f);
+    tcg_debug_assert(j >= 0 && j <= 0x1f);
+    tcg_debug_assert(sk16 >= -0x8000 && sk16 <= 0x7fff);
+    return encode_djk_slots(opc, d, j, sk16 & 0xffff);
+}
+
+static int32_t __attribute__((unused))
+encode_djuk12_insn(LoongArchInsn opc, TCGReg d, TCGReg j, uint32_t uk12)
+{
+    tcg_debug_assert(d >= 0 && d <= 0x1f);
+    tcg_debug_assert(j >= 0 && j <= 0x1f);
+    tcg_debug_assert(uk12 <= 0xfff);
+    return encode_djk_slots(opc, d, j, uk12);
+}
+
+static int32_t __attribute__((unused))
+encode_djuk5_insn(LoongArchInsn opc, TCGReg d, TCGReg j, uint32_t uk5)
+{
+    tcg_debug_assert(d >= 0 && d <= 0x1f);
+    tcg_debug_assert(j >= 0 && j <= 0x1f);
+    tcg_debug_assert(uk5 <= 0x1f);
+    return encode_djk_slots(opc, d, j, uk5);
+}
+
+static int32_t __attribute__((unused))
+encode_djuk5um5_insn(LoongArchInsn opc, TCGReg d, TCGReg j, uint32_t uk5,
+                     uint32_t um5)
+{
+    tcg_debug_assert(d >= 0 && d <= 0x1f);
+    tcg_debug_assert(j >= 0 && j <= 0x1f);
+    tcg_debug_assert(uk5 <= 0x1f);
+    tcg_debug_assert(um5 <= 0x1f);
+    return encode_djkm_slots(opc, d, j, uk5, um5);
+}
+
+static int32_t __attribute__((unused))
+encode_djuk6_insn(LoongArchInsn opc, TCGReg d, TCGReg j, uint32_t uk6)
+{
+    tcg_debug_assert(d >= 0 && d <= 0x1f);
+    tcg_debug_assert(j >= 0 && j <= 0x1f);
+    tcg_debug_assert(uk6 <= 0x3f);
+    return encode_djk_slots(opc, d, j, uk6);
+}
+
+static int32_t __attribute__((unused))
+encode_djuk6um6_insn(LoongArchInsn opc, TCGReg d, TCGReg j, uint32_t uk6,
+                     uint32_t um6)
+{
+    tcg_debug_assert(d >= 0 && d <= 0x1f);
+    tcg_debug_assert(j >= 0 && j <= 0x1f);
+    tcg_debug_assert(uk6 <= 0x3f);
+    tcg_debug_assert(um6 <= 0x3f);
+    return encode_djkm_slots(opc, d, j, uk6, um6);
+}
+
+static int32_t __attribute__((unused))
+encode_dsj20_insn(LoongArchInsn opc, TCGReg d, int32_t sj20)
+{
+    tcg_debug_assert(d >= 0 && d <= 0x1f);
+    tcg_debug_assert(sj20 >= -0x80000 && sj20 <= 0x7ffff);
+    return encode_dj_slots(opc, d, sj20 & 0xfffff);
+}
+
+static int32_t __attribute__((unused))
+encode_sd10k16_insn(LoongArchInsn opc, int32_t sd10k16)
+{
+    tcg_debug_assert(sd10k16 >= -0x2000000 && sd10k16 <= 0x1ffffff);
+    return encode_dk_slots(opc, (sd10k16 >> 16) & 0x3ff, sd10k16 & 0xffff);
+}
+
+static int32_t __attribute__((unused))
+encode_ud15_insn(LoongArchInsn opc, uint32_t ud15)
+{
+    tcg_debug_assert(ud15 <= 0x7fff);
+    return encode_d_slot(opc, ud15);
+}
+
+/* Emits the `clz.w d, j` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_clz_w(TCGContext *s, TCGReg d, TCGReg j)
+{
+    tcg_out32(s, encode_dj_insn(OPC_CLZ_W, d, j));
+}
+
+/* Emits the `ctz.w d, j` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_ctz_w(TCGContext *s, TCGReg d, TCGReg j)
+{
+    tcg_out32(s, encode_dj_insn(OPC_CTZ_W, d, j));
+}
+
+/* Emits the `clz.d d, j` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_clz_d(TCGContext *s, TCGReg d, TCGReg j)
+{
+    tcg_out32(s, encode_dj_insn(OPC_CLZ_D, d, j));
+}
+
+/* Emits the `ctz.d d, j` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_ctz_d(TCGContext *s, TCGReg d, TCGReg j)
+{
+    tcg_out32(s, encode_dj_insn(OPC_CTZ_D, d, j));
+}
+
+/* Emits the `revb.2h d, j` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_revb_2h(TCGContext *s, TCGReg d, TCGReg j)
+{
+    tcg_out32(s, encode_dj_insn(OPC_REVB_2H, d, j));
+}
+
+/* Emits the `revb.2w d, j` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_revb_2w(TCGContext *s, TCGReg d, TCGReg j)
+{
+    tcg_out32(s, encode_dj_insn(OPC_REVB_2W, d, j));
+}
+
+/* Emits the `revb.d d, j` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_revb_d(TCGContext *s, TCGReg d, TCGReg j)
+{
+    tcg_out32(s, encode_dj_insn(OPC_REVB_D, d, j));
+}
+
+/* Emits the `sext.h d, j` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_sext_h(TCGContext *s, TCGReg d, TCGReg j)
+{
+    tcg_out32(s, encode_dj_insn(OPC_SEXT_H, d, j));
+}
+
+/* Emits the `sext.b d, j` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_sext_b(TCGContext *s, TCGReg d, TCGReg j)
+{
+    tcg_out32(s, encode_dj_insn(OPC_SEXT_B, d, j));
+}
+
+/* Emits the `add.w d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_add_w(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_ADD_W, d, j, k));
+}
+
+/* Emits the `add.d d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_add_d(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_ADD_D, d, j, k));
+}
+
+/* Emits the `sub.w d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_sub_w(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_SUB_W, d, j, k));
+}
+
+/* Emits the `sub.d d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_sub_d(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_SUB_D, d, j, k));
+}
+
+/* Emits the `slt d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_slt(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_SLT, d, j, k));
+}
+
+/* Emits the `sltu d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_sltu(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_SLTU, d, j, k));
+}
+
+/* Emits the `maskeqz d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_maskeqz(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_MASKEQZ, d, j, k));
+}
+
+/* Emits the `masknez d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_masknez(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_MASKNEZ, d, j, k));
+}
+
+/* Emits the `nor d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_nor(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_NOR, d, j, k));
+}
+
+/* Emits the `and d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_and(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_AND, d, j, k));
+}
+
+/* Emits the `or d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_or(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_OR, d, j, k));
+}
+
+/* Emits the `xor d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_xor(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_XOR, d, j, k));
+}
+
+/* Emits the `orn d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_orn(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_ORN, d, j, k));
+}
+
+/* Emits the `andn d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_andn(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_ANDN, d, j, k));
+}
+
+/* Emits the `sll.w d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_sll_w(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_SLL_W, d, j, k));
+}
+
+/* Emits the `srl.w d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_srl_w(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_SRL_W, d, j, k));
+}
+
+/* Emits the `sra.w d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_sra_w(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_SRA_W, d, j, k));
+}
+
+/* Emits the `sll.d d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_sll_d(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_SLL_D, d, j, k));
+}
+
+/* Emits the `srl.d d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_srl_d(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_SRL_D, d, j, k));
+}
+
+/* Emits the `sra.d d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_sra_d(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_SRA_D, d, j, k));
+}
+
+/* Emits the `rotr.w d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_rotr_w(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_ROTR_W, d, j, k));
+}
+
+/* Emits the `rotr.d d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_rotr_d(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_ROTR_D, d, j, k));
+}
+
+/* Emits the `mul.w d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_mul_w(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_MUL_W, d, j, k));
+}
+
+/* Emits the `mulh.w d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_mulh_w(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_MULH_W, d, j, k));
+}
+
+/* Emits the `mulh.wu d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_mulh_wu(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_MULH_WU, d, j, k));
+}
+
+/* Emits the `mul.d d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_mul_d(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_MUL_D, d, j, k));
+}
+
+/* Emits the `mulh.d d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_mulh_d(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_MULH_D, d, j, k));
+}
+
+/* Emits the `mulh.du d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_mulh_du(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_MULH_DU, d, j, k));
+}
+
+/* Emits the `div.w d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_div_w(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_DIV_W, d, j, k));
+}
+
+/* Emits the `mod.w d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_mod_w(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_MOD_W, d, j, k));
+}
+
+/* Emits the `div.wu d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_div_wu(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_DIV_WU, d, j, k));
+}
+
+/* Emits the `mod.wu d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_mod_wu(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_MOD_WU, d, j, k));
+}
+
+/* Emits the `div.d d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_div_d(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_DIV_D, d, j, k));
+}
+
+/* Emits the `mod.d d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_mod_d(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_MOD_D, d, j, k));
+}
+
+/* Emits the `div.du d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_div_du(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_DIV_DU, d, j, k));
+}
+
+/* Emits the `mod.du d, j, k` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_mod_du(TCGContext *s, TCGReg d, TCGReg j, TCGReg k)
+{
+    tcg_out32(s, encode_djk_insn(OPC_MOD_DU, d, j, k));
+}
+
+/* Emits the `slli.w d, j, uk5` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_slli_w(TCGContext *s, TCGReg d, TCGReg j, uint32_t uk5)
+{
+    tcg_out32(s, encode_djuk5_insn(OPC_SLLI_W, d, j, uk5));
+}
+
+/* Emits the `slli.d d, j, uk6` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_slli_d(TCGContext *s, TCGReg d, TCGReg j, uint32_t uk6)
+{
+    tcg_out32(s, encode_djuk6_insn(OPC_SLLI_D, d, j, uk6));
+}
+
+/* Emits the `srli.w d, j, uk5` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_srli_w(TCGContext *s, TCGReg d, TCGReg j, uint32_t uk5)
+{
+    tcg_out32(s, encode_djuk5_insn(OPC_SRLI_W, d, j, uk5));
+}
+
+/* Emits the `srli.d d, j, uk6` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_srli_d(TCGContext *s, TCGReg d, TCGReg j, uint32_t uk6)
+{
+    tcg_out32(s, encode_djuk6_insn(OPC_SRLI_D, d, j, uk6));
+}
+
+/* Emits the `srai.w d, j, uk5` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_srai_w(TCGContext *s, TCGReg d, TCGReg j, uint32_t uk5)
+{
+    tcg_out32(s, encode_djuk5_insn(OPC_SRAI_W, d, j, uk5));
+}
+
+/* Emits the `srai.d d, j, uk6` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_srai_d(TCGContext *s, TCGReg d, TCGReg j, uint32_t uk6)
+{
+    tcg_out32(s, encode_djuk6_insn(OPC_SRAI_D, d, j, uk6));
+}
+
+/* Emits the `rotri.w d, j, uk5` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_rotri_w(TCGContext *s, TCGReg d, TCGReg j, uint32_t uk5)
+{
+    tcg_out32(s, encode_djuk5_insn(OPC_ROTRI_W, d, j, uk5));
+}
+
+/* Emits the `rotri.d d, j, uk6` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_rotri_d(TCGContext *s, TCGReg d, TCGReg j, uint32_t uk6)
+{
+    tcg_out32(s, encode_djuk6_insn(OPC_ROTRI_D, d, j, uk6));
+}
+
+/* Emits the `bstrins.w d, j, uk5, um5` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_bstrins_w(TCGContext *s, TCGReg d, TCGReg j, uint32_t uk5,
+                      uint32_t um5)
+{
+    tcg_out32(s, encode_djuk5um5_insn(OPC_BSTRINS_W, d, j, uk5, um5));
+}
+
+/* Emits the `bstrpick.w d, j, uk5, um5` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_bstrpick_w(TCGContext *s, TCGReg d, TCGReg j, uint32_t uk5,
+                       uint32_t um5)
+{
+    tcg_out32(s, encode_djuk5um5_insn(OPC_BSTRPICK_W, d, j, uk5, um5));
+}
+
+/* Emits the `bstrins.d d, j, uk6, um6` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_bstrins_d(TCGContext *s, TCGReg d, TCGReg j, uint32_t uk6,
+                      uint32_t um6)
+{
+    tcg_out32(s, encode_djuk6um6_insn(OPC_BSTRINS_D, d, j, uk6, um6));
+}
+
+/* Emits the `bstrpick.d d, j, uk6, um6` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_bstrpick_d(TCGContext *s, TCGReg d, TCGReg j, uint32_t uk6,
+                       uint32_t um6)
+{
+    tcg_out32(s, encode_djuk6um6_insn(OPC_BSTRPICK_D, d, j, uk6, um6));
+}
+
+/* Emits the `slti d, j, sk12` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_slti(TCGContext *s, TCGReg d, TCGReg j, int32_t sk12)
+{
+    tcg_out32(s, encode_djsk12_insn(OPC_SLTI, d, j, sk12));
+}
+
+/* Emits the `sltui d, j, sk12` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_sltui(TCGContext *s, TCGReg d, TCGReg j, int32_t sk12)
+{
+    tcg_out32(s, encode_djsk12_insn(OPC_SLTUI, d, j, sk12));
+}
+
+/* Emits the `addi.w d, j, sk12` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_addi_w(TCGContext *s, TCGReg d, TCGReg j, int32_t sk12)
+{
+    tcg_out32(s, encode_djsk12_insn(OPC_ADDI_W, d, j, sk12));
+}
+
+/* Emits the `addi.d d, j, sk12` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_addi_d(TCGContext *s, TCGReg d, TCGReg j, int32_t sk12)
+{
+    tcg_out32(s, encode_djsk12_insn(OPC_ADDI_D, d, j, sk12));
+}
+
+/* Emits the `cu52i.d d, j, sk12` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_cu52i_d(TCGContext *s, TCGReg d, TCGReg j, int32_t sk12)
+{
+    tcg_out32(s, encode_djsk12_insn(OPC_CU52I_D, d, j, sk12));
+}
+
+/* Emits the `andi d, j, uk12` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_andi(TCGContext *s, TCGReg d, TCGReg j, uint32_t uk12)
+{
+    tcg_out32(s, encode_djuk12_insn(OPC_ANDI, d, j, uk12));
+}
+
+/* Emits the `ori d, j, uk12` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_ori(TCGContext *s, TCGReg d, TCGReg j, uint32_t uk12)
+{
+    tcg_out32(s, encode_djuk12_insn(OPC_ORI, d, j, uk12));
+}
+
+/* Emits the `xori d, j, uk12` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_xori(TCGContext *s, TCGReg d, TCGReg j, uint32_t uk12)
+{
+    tcg_out32(s, encode_djuk12_insn(OPC_XORI, d, j, uk12));
+}
+
+/* Emits the `lu12i.w d, sj20` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_lu12i_w(TCGContext *s, TCGReg d, int32_t sj20)
+{
+    tcg_out32(s, encode_dsj20_insn(OPC_LU12I_W, d, sj20));
+}
+
+/* Emits the `cu32i.d d, sj20` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_cu32i_d(TCGContext *s, TCGReg d, int32_t sj20)
+{
+    tcg_out32(s, encode_dsj20_insn(OPC_CU32I_D, d, sj20));
+}
+
+/* Emits the `pcaddu2i d, sj20` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_pcaddu2i(TCGContext *s, TCGReg d, int32_t sj20)
+{
+    tcg_out32(s, encode_dsj20_insn(OPC_PCADDU2I, d, sj20));
+}
+
+/* Emits the `pcaddu12i d, sj20` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_pcaddu12i(TCGContext *s, TCGReg d, int32_t sj20)
+{
+    tcg_out32(s, encode_dsj20_insn(OPC_PCADDU12I, d, sj20));
+}
+
+/* Emits the `pcaddu18i d, sj20` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_pcaddu18i(TCGContext *s, TCGReg d, int32_t sj20)
+{
+    tcg_out32(s, encode_dsj20_insn(OPC_PCADDU18I, d, sj20));
+}
+
+/* Emits the `ld.b d, j, sk12` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_ld_b(TCGContext *s, TCGReg d, TCGReg j, int32_t sk12)
+{
+    tcg_out32(s, encode_djsk12_insn(OPC_LD_B, d, j, sk12));
+}
+
+/* Emits the `ld.h d, j, sk12` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_ld_h(TCGContext *s, TCGReg d, TCGReg j, int32_t sk12)
+{
+    tcg_out32(s, encode_djsk12_insn(OPC_LD_H, d, j, sk12));
+}
+
+/* Emits the `ld.w d, j, sk12` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_ld_w(TCGContext *s, TCGReg d, TCGReg j, int32_t sk12)
+{
+    tcg_out32(s, encode_djsk12_insn(OPC_LD_W, d, j, sk12));
+}
+
+/* Emits the `ld.d d, j, sk12` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_ld_d(TCGContext *s, TCGReg d, TCGReg j, int32_t sk12)
+{
+    tcg_out32(s, encode_djsk12_insn(OPC_LD_D, d, j, sk12));
+}
+
+/* Emits the `st.b d, j, sk12` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_st_b(TCGContext *s, TCGReg d, TCGReg j, int32_t sk12)
+{
+    tcg_out32(s, encode_djsk12_insn(OPC_ST_B, d, j, sk12));
+}
+
+/* Emits the `st.h d, j, sk12` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_st_h(TCGContext *s, TCGReg d, TCGReg j, int32_t sk12)
+{
+    tcg_out32(s, encode_djsk12_insn(OPC_ST_H, d, j, sk12));
+}
+
+/* Emits the `st.w d, j, sk12` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_st_w(TCGContext *s, TCGReg d, TCGReg j, int32_t sk12)
+{
+    tcg_out32(s, encode_djsk12_insn(OPC_ST_W, d, j, sk12));
+}
+
+/* Emits the `st.d d, j, sk12` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_st_d(TCGContext *s, TCGReg d, TCGReg j, int32_t sk12)
+{
+    tcg_out32(s, encode_djsk12_insn(OPC_ST_D, d, j, sk12));
+}
+
+/* Emits the `ld.bu d, j, sk12` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_ld_bu(TCGContext *s, TCGReg d, TCGReg j, int32_t sk12)
+{
+    tcg_out32(s, encode_djsk12_insn(OPC_LD_BU, d, j, sk12));
+}
+
+/* Emits the `ld.hu d, j, sk12` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_ld_hu(TCGContext *s, TCGReg d, TCGReg j, int32_t sk12)
+{
+    tcg_out32(s, encode_djsk12_insn(OPC_LD_HU, d, j, sk12));
+}
+
+/* Emits the `ld.wu d, j, sk12` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_ld_wu(TCGContext *s, TCGReg d, TCGReg j, int32_t sk12)
+{
+    tcg_out32(s, encode_djsk12_insn(OPC_LD_WU, d, j, sk12));
+}
+
+/* Emits the `dbar ud15` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_dbar(TCGContext *s, uint32_t ud15)
+{
+    tcg_out32(s, encode_ud15_insn(OPC_DBAR, ud15));
+}
+
+/* Emits the `jirl d, j, sk16` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_jirl(TCGContext *s, TCGReg d, TCGReg j, int32_t sk16)
+{
+    tcg_out32(s, encode_djsk16_insn(OPC_JIRL, d, j, sk16));
+}
+
+/* Emits the `b sd10k16` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_b(TCGContext *s, int32_t sd10k16)
+{
+    tcg_out32(s, encode_sd10k16_insn(OPC_B, sd10k16));
+}
+
+/* Emits the `bl sd10k16` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_bl(TCGContext *s, int32_t sd10k16)
+{
+    tcg_out32(s, encode_sd10k16_insn(OPC_BL, sd10k16));
+}
+
+/* Emits the `beq d, j, sk16` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_beq(TCGContext *s, TCGReg d, TCGReg j, int32_t sk16)
+{
+    tcg_out32(s, encode_djsk16_insn(OPC_BEQ, d, j, sk16));
+}
+
+/* Emits the `bne d, j, sk16` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_bne(TCGContext *s, TCGReg d, TCGReg j, int32_t sk16)
+{
+    tcg_out32(s, encode_djsk16_insn(OPC_BNE, d, j, sk16));
+}
+
+/* Emits the `bgt d, j, sk16` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_bgt(TCGContext *s, TCGReg d, TCGReg j, int32_t sk16)
+{
+    tcg_out32(s, encode_djsk16_insn(OPC_BGT, d, j, sk16));
+}
+
+/* Emits the `ble d, j, sk16` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_ble(TCGContext *s, TCGReg d, TCGReg j, int32_t sk16)
+{
+    tcg_out32(s, encode_djsk16_insn(OPC_BLE, d, j, sk16));
+}
+
+/* Emits the `bgtu d, j, sk16` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_bgtu(TCGContext *s, TCGReg d, TCGReg j, int32_t sk16)
+{
+    tcg_out32(s, encode_djsk16_insn(OPC_BGTU, d, j, sk16));
+}
+
+/* Emits the `bleu d, j, sk16` instruction. */
+static void __attribute__((unused))
+tcg_out_opc_bleu(TCGContext *s, TCGReg d, TCGReg j, int32_t sk16)
+{
+    tcg_out32(s, encode_djsk16_insn(OPC_BLEU, d, j, sk16));
+}
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 05/30] tcg/loongarch64: Add register names, allocation order and input/output sets
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (3 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 04/30] tcg/loongarch64: Add generated instruction opcodes and encoding helpers WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:09 ` [PATCH v3 06/30] tcg/loongarch64: Define the operand constraints WANG Xuerui
                   ` (24 subsequent siblings)
  29 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target.c.inc | 118 +++++++++++++++++++++++++++++++
 1 file changed, 118 insertions(+)
 create mode 100644 tcg/loongarch64/tcg-target.c.inc

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
new file mode 100644
index 0000000000..42eebef78e
--- /dev/null
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -0,0 +1,118 @@
+/*
+ * Tiny Code Generator for QEMU
+ *
+ * Copyright (c) 2021 WANG Xuerui <git@xen0n.name>
+ *
+ * Based on tcg/riscv/tcg-target.c.inc
+ *
+ * Copyright (c) 2018 SiFive, Inc
+ * Copyright (c) 2008-2009 Arnaud Patard <arnaud.patard@rtp-net.org>
+ * Copyright (c) 2009 Aurelien Jarno <aurelien@aurel32.net>
+ * Copyright (c) 2008 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifdef CONFIG_DEBUG_TCG
+static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = {
+    "zero",
+    "ra",
+    "tp",
+    "sp",
+    "a0",
+    "a1",
+    "a2",
+    "a3",
+    "a4",
+    "a5",
+    "a6",
+    "a7",
+    "t0",
+    "t1",
+    "t2",
+    "t3",
+    "t4",
+    "t5",
+    "t6",
+    "t7",
+    "t8",
+    "r21", /* reserved in the LP64 ABI, hence no ABI name */
+    "s9",
+    "s0",
+    "s1",
+    "s2",
+    "s3",
+    "s4",
+    "s5",
+    "s6",
+    "s7",
+    "s8"
+};
+#endif
+
+static const int tcg_target_reg_alloc_order[] = {
+    /* Registers preserved across calls */
+    /* TCG_REG_S0 reserved for TCG_AREG0 */
+    TCG_REG_S1,
+    TCG_REG_S2,
+    TCG_REG_S3,
+    TCG_REG_S4,
+    TCG_REG_S5,
+    TCG_REG_S6,
+    TCG_REG_S7,
+    TCG_REG_S8,
+    TCG_REG_S9,
+
+    /* Registers (potentially) clobbered across calls */
+    TCG_REG_T0,
+    TCG_REG_T1,
+    TCG_REG_T2,
+    TCG_REG_T3,
+    TCG_REG_T4,
+    TCG_REG_T5,
+    TCG_REG_T6,
+    TCG_REG_T7,
+    TCG_REG_T8,
+
+    /* Argument registers, opposite order of allocation.  */
+    TCG_REG_A7,
+    TCG_REG_A6,
+    TCG_REG_A5,
+    TCG_REG_A4,
+    TCG_REG_A3,
+    TCG_REG_A2,
+    TCG_REG_A1,
+    TCG_REG_A0,
+};
+
+static const int tcg_target_call_iarg_regs[] = {
+    TCG_REG_A0,
+    TCG_REG_A1,
+    TCG_REG_A2,
+    TCG_REG_A3,
+    TCG_REG_A4,
+    TCG_REG_A5,
+    TCG_REG_A6,
+    TCG_REG_A7,
+};
+
+static const int tcg_target_call_oarg_regs[] = {
+    TCG_REG_A0,
+    TCG_REG_A1,
+};
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 06/30] tcg/loongarch64: Define the operand constraints
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (4 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 05/30] tcg/loongarch64: Add register names, allocation order and input/output sets WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:09 ` [PATCH v3 07/30] tcg/loongarch64: Implement necessary relocation operations WANG Xuerui
                   ` (23 subsequent siblings)
  29 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target-con-str.h | 28 +++++++++++++++
 tcg/loongarch64/tcg-target.c.inc     | 52 ++++++++++++++++++++++++++++
 2 files changed, 80 insertions(+)
 create mode 100644 tcg/loongarch64/tcg-target-con-str.h

diff --git a/tcg/loongarch64/tcg-target-con-str.h b/tcg/loongarch64/tcg-target-con-str.h
new file mode 100644
index 0000000000..c3986a4fd4
--- /dev/null
+++ b/tcg/loongarch64/tcg-target-con-str.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Define LoongArch target-specific operand constraints.
+ *
+ * Copyright (c) 2021 WANG Xuerui <git@xen0n.name>
+ *
+ * Based on tcg/riscv/tcg-target-con-str.h
+ *
+ * Copyright (c) 2021 Linaro
+ */
+
+/*
+ * Define constraint letters for register sets:
+ * REGS(letter, register_mask)
+ */
+REGS('r', ALL_GENERAL_REGS)
+REGS('L', ALL_GENERAL_REGS & ~SOFTMMU_RESERVE_REGS)
+
+/*
+ * Define constraint letters for constants:
+ * CONST(letter, TCG_CT_CONST_* bit set)
+ */
+CONST('I', TCG_CT_CONST_S12)
+CONST('N', TCG_CT_CONST_N12)
+CONST('U', TCG_CT_CONST_U12)
+CONST('Z', TCG_CT_CONST_ZERO)
+CONST('C', TCG_CT_CONST_C12)
+CONST('W', TCG_CT_CONST_WSZ)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 42eebef78e..f0930f77ef 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -116,3 +116,55 @@ static const int tcg_target_call_oarg_regs[] = {
     TCG_REG_A0,
     TCG_REG_A1,
 };
+
+#define TCG_CT_CONST_ZERO  0x100
+#define TCG_CT_CONST_S12   0x200
+#define TCG_CT_CONST_N12   0x400
+#define TCG_CT_CONST_U12   0x800
+#define TCG_CT_CONST_C12   0x1000
+#define TCG_CT_CONST_WSZ   0x2000
+
+#define ALL_GENERAL_REGS      MAKE_64BIT_MASK(0, 32)
+/*
+ * For softmmu, we need to avoid conflicts with the first 5
+ * argument registers to call the helper.  Some of these are
+ * also used for the tlb lookup.
+ */
+#ifdef CONFIG_SOFTMMU
+#define SOFTMMU_RESERVE_REGS  MAKE_64BIT_MASK(TCG_REG_A0, 5)
+#else
+#define SOFTMMU_RESERVE_REGS  0
+#endif
+
+
+static inline tcg_target_long sextreg(tcg_target_long val, int pos, int len)
+{
+    return sextract64(val, pos, len);
+}
+
+/* test if a constant matches the constraint */
+static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
+{
+    if (ct & TCG_CT_CONST) {
+        return 1;
+    }
+    if ((ct & TCG_CT_CONST_ZERO) && val == 0) {
+        return 1;
+    }
+    if ((ct & TCG_CT_CONST_S12) && val == sextreg(val, 0, 12)) {
+        return 1;
+    }
+    if ((ct & TCG_CT_CONST_N12) && -val == sextreg(-val, 0, 12)) {
+        return 1;
+    }
+    if ((ct & TCG_CT_CONST_U12) && val >= 0 && val <= 0xfff) {
+        return 1;
+    }
+    if ((ct & TCG_CT_CONST_C12) && ~val >= 0 && ~val <= 0xfff) {
+        return 1;
+    }
+    if ((ct & TCG_CT_CONST_WSZ) && val == (type == TCG_TYPE_I32 ? 32 : 64)) {
+        return 1;
+    }
+    return 0;
+}
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 07/30] tcg/loongarch64: Implement necessary relocation operations
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (5 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 06/30] tcg/loongarch64: Define the operand constraints WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:41   ` Philippe Mathieu-Daudé
  2021-09-22 18:09 ` [PATCH v3 08/30] tcg/loongarch64: Implement the memory barrier op WANG Xuerui
                   ` (22 subsequent siblings)
  29 siblings, 1 reply; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target.c.inc | 66 ++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index f0930f77ef..69e882ba5d 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -168,3 +168,69 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
     }
     return 0;
 }
+
+/*
+ * Relocations
+ */
+
+/*
+ * Relocation records defined in LoongArch ELF psABI v1.00 is way too
+ * complicated; a whopping stack machine is needed to stuff the fields, at
+ * the very least one SOP_PUSH and one SOP_POP (of the correct format) are
+ * needed.
+ *
+ * Hence, define our own simpler relocation types. Numbers are chosen as to
+ * not collide with potential future additions to the true ELF relocation
+ * type enum.
+ */
+
+/* Field Sk16, shifted right by 2; suitable for conditional jumps */
+#define R_LOONGARCH_BR_SK16     256
+/* Field Sd10k16, shifted right by 2; suitable for B and BL */
+#define R_LOONGARCH_BR_SD10K16  257
+
+static bool reloc_br_sk16(tcg_insn_unit *src_rw, const tcg_insn_unit *target)
+{
+    const tcg_insn_unit *src_rx = tcg_splitwx_to_rx(src_rw);
+    intptr_t offset = (intptr_t)target - (intptr_t)src_rx;
+
+    tcg_debug_assert((offset & 3) == 0);
+    offset >>= 2;
+    if (offset == sextreg(offset, 0, 16)) {
+        *src_rw |= (offset << 10) & 0x3fffc00;
+        return true;
+    }
+
+    return false;
+}
+
+static bool reloc_br_sd10k16(tcg_insn_unit *src_rw,
+                             const tcg_insn_unit *target)
+{
+    const tcg_insn_unit *src_rx = tcg_splitwx_to_rx(src_rw);
+    intptr_t offset = (intptr_t)target - (intptr_t)src_rx;
+
+    tcg_debug_assert((offset & 3) == 0);
+    offset >>= 2;
+    if (offset == sextreg(offset, 0, 26)) {
+        *src_rw |= (offset >> 16) & 0x3ff; /* slot d10 */
+        *src_rw |= ((offset & 0xffff) << 10) & 0x3fffc00; /* slot k16 */
+        return true;
+    }
+
+    return false;
+}
+
+static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
+                        intptr_t value, intptr_t addend)
+{
+    tcg_debug_assert(addend == 0);
+    switch (type) {
+    case R_LOONGARCH_BR_SK16:
+        return reloc_br_sk16(code_ptr, (tcg_insn_unit *)value);
+    case R_LOONGARCH_BR_SD10K16:
+        return reloc_br_sd10k16(code_ptr, (tcg_insn_unit *)value);
+    default:
+        g_assert_not_reached();
+    }
+}
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 08/30] tcg/loongarch64: Implement the memory barrier op
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (6 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 07/30] tcg/loongarch64: Implement necessary relocation operations WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:09 ` [PATCH v3 09/30] tcg/loongarch64: Implement tcg_out_mov and tcg_out_movi WANG Xuerui
                   ` (21 subsequent siblings)
  29 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target.c.inc | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 69e882ba5d..338b772732 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -234,3 +234,35 @@ static bool patch_reloc(tcg_insn_unit *code_ptr, int type,
         g_assert_not_reached();
     }
 }
+
+#include "tcg-insn-defs.c.inc"
+
+/*
+ * TCG intrinsics
+ */
+
+static void tcg_out_mb(TCGContext *s, TCGArg a0)
+{
+    /* Baseline LoongArch only has the full barrier, unfortunately.  */
+    tcg_out_opc_dbar(s, 0);
+}
+
+/*
+ * Entry-points
+ */
+
+static void tcg_out_op(TCGContext *s, TCGOpcode opc,
+                       const TCGArg args[TCG_MAX_OP_ARGS],
+                       const int const_args[TCG_MAX_OP_ARGS])
+{
+    TCGArg a0 = args[0];
+
+    switch (opc) {
+    case INDEX_op_mb:
+        tcg_out_mb(s, a0);
+        break;
+
+    default:
+        g_assert_not_reached();
+    }
+}
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 09/30] tcg/loongarch64: Implement tcg_out_mov and tcg_out_movi
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (7 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 08/30] tcg/loongarch64: Implement the memory barrier op WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:39   ` Richard Henderson
                     ` (2 more replies)
  2021-09-22 18:09 ` [PATCH v3 10/30] tcg/loongarch64: Implement goto_ptr WANG Xuerui
                   ` (20 subsequent siblings)
  29 siblings, 3 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
---
 tcg/loongarch64/tcg-target.c.inc | 89 ++++++++++++++++++++++++++++++++
 1 file changed, 89 insertions(+)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 338b772732..6d28a29070 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -247,6 +247,93 @@ static void tcg_out_mb(TCGContext *s, TCGArg a0)
     tcg_out_opc_dbar(s, 0);
 }
 
+static bool tcg_out_mov(TCGContext *s, TCGType type, TCGReg ret, TCGReg arg)
+{
+    if (ret == arg) {
+        return true;
+    }
+    switch (type) {
+    case TCG_TYPE_I32:
+    case TCG_TYPE_I64:
+        /*
+         * Conventional register-register move used in LoongArch is
+         * `or dst, src, zero`.
+         */
+        tcg_out_opc_or(s, ret, arg, TCG_REG_ZERO);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    return true;
+}
+
+static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd,
+                         tcg_target_long val)
+{
+    if (type == TCG_TYPE_I32) {
+        val = (int32_t)val;
+    }
+
+    /* Single-instruction cases.  */
+    tcg_target_long low = sextreg(val, 0, 12);
+    if (low == val) {
+        /* val fits in simm12: addi.w rd, zero, val */
+        tcg_out_opc_addi_w(s, rd, TCG_REG_ZERO, val);
+        return;
+    }
+    if (0x800 <= val && val <= 0xfff) {
+        /* val fits in uimm12: ori rd, zero, val */
+        tcg_out_opc_ori(s, rd, TCG_REG_ZERO, val);
+        return;
+    }
+
+    /* Test for PC-relative values that can be loaded faster.  */
+    intptr_t pc_offset = tcg_pcrel_diff(s, (void *)val);
+    if (pc_offset == sextreg(pc_offset, 0, 22) && (pc_offset & 3) == 0) {
+        tcg_out_opc_pcaddu2i(s, rd, pc_offset >> 2);
+        return;
+    }
+    if (pc_offset == (int32_t)pc_offset) {
+        tcg_target_long lo = sextreg(pc_offset, 0, 12);
+        tcg_target_long hi = pc_offset - lo;
+        tcg_out_opc_pcaddu12i(s, rd, hi >> 12);
+        tcg_out_opc_addi_d(s, rd, rd, lo);
+        return;
+    }
+
+    /*
+     * Slow path: at most lu12i.w + ori + cu32i.d + cu52i.d.
+     *
+     * Chop upper bits into 3 immediate-field-sized segments respectively.
+     */
+    tcg_target_long upper = sextreg(val, 12, 20);
+    tcg_target_long higher = sextreg(val, 32, 20);
+    tcg_target_long top = sextreg(val, 52, 12);
+
+    tcg_out_opc_lu12i_w(s, rd, upper);
+    if (low != 0) {
+        tcg_out_opc_ori(s, rd, rd, low & 0xfff);
+    }
+
+    if (sextreg(val, 0, 32) == val) {
+        /*
+         * Fits in 32-bits, upper bits are already properly sign-extended by
+         * lu12i.w.
+         */
+        return;
+    }
+    tcg_out_opc_cu32i_d(s, rd, higher);
+
+    if (sextreg(val, 0, 52) == val) {
+        /*
+         * Fits in 52-bits, upper bits are already properly sign-extended by
+         * cu32i.d.
+         */
+        return;
+    }
+    tcg_out_opc_cu52i_d(s, rd, rd, top);
+}
+
 /*
  * Entry-points
  */
@@ -262,6 +349,8 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_mb(s, a0);
         break;
 
+    case INDEX_op_mov_i32:  /* Always emitted via tcg_out_mov.  */
+    case INDEX_op_mov_i64:
     default:
         g_assert_not_reached();
     }
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 10/30] tcg/loongarch64: Implement goto_ptr
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (8 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 09/30] tcg/loongarch64: Implement tcg_out_mov and tcg_out_movi WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:09 ` [PATCH v3 11/30] tcg/loongarch64: Implement sign-/zero-extension ops WANG Xuerui
                   ` (19 subsequent siblings)
  29 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target-con-set.h | 17 +++++++++++++++++
 tcg/loongarch64/tcg-target.c.inc     | 15 +++++++++++++++
 2 files changed, 32 insertions(+)
 create mode 100644 tcg/loongarch64/tcg-target-con-set.h

diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
new file mode 100644
index 0000000000..5cc4407367
--- /dev/null
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Define LoongArch target-specific constraint sets.
+ *
+ * Copyright (c) 2021 WANG Xuerui <git@xen0n.name>
+ *
+ * Based on tcg/riscv/tcg-target-con-set.h
+ *
+ * Copyright (c) 2021 Linaro
+ */
+
+/*
+ * C_On_Im(...) defines a constraint set with <n> outputs and <m> inputs.
+ * Each operand should be a sequence of constraint letters as defined by
+ * tcg-target-con-str.h; the constraint combination is inclusive or.
+ */
+C_O0_I1(r)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 6d28a29070..54c8dd4459 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -349,9 +349,24 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_mb(s, a0);
         break;
 
+    case INDEX_op_goto_ptr:
+        tcg_out_opc_jirl(s, TCG_REG_ZERO, a0, 0);
+        break;
+
     case INDEX_op_mov_i32:  /* Always emitted via tcg_out_mov.  */
     case INDEX_op_mov_i64:
     default:
         g_assert_not_reached();
     }
 }
+
+static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
+{
+    switch (op) {
+    case INDEX_op_goto_ptr:
+        return C_O0_I1(r);
+
+    default:
+        g_assert_not_reached();
+    }
+}
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 11/30] tcg/loongarch64: Implement sign-/zero-extension ops
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (9 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 10/30] tcg/loongarch64: Implement goto_ptr WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:09 ` [PATCH v3 12/30] tcg/loongarch64: Implement not/and/or/xor/nor/andc/orc ops WANG Xuerui
                   ` (18 subsequent siblings)
  29 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target-con-set.h |  1 +
 tcg/loongarch64/tcg-target.c.inc     | 82 ++++++++++++++++++++++++++++
 tcg/loongarch64/tcg-target.h         | 24 ++++----
 3 files changed, 95 insertions(+), 12 deletions(-)

diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
index 5cc4407367..7e459490ea 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -15,3 +15,4 @@
  * tcg-target-con-str.h; the constraint combination is inclusive or.
  */
 C_O0_I1(r)
+C_O1_I1(r, r)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 54c8dd4459..665c04af74 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -334,6 +334,36 @@ static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd,
     tcg_out_opc_cu52i_d(s, rd, rd, top);
 }
 
+static void tcg_out_ext8u(TCGContext *s, TCGReg ret, TCGReg arg)
+{
+    tcg_out_opc_andi(s, ret, arg, 0xff);
+}
+
+static void tcg_out_ext16u(TCGContext *s, TCGReg ret, TCGReg arg)
+{
+    tcg_out_opc_bstrpick_w(s, ret, arg, 0, 15);
+}
+
+static void tcg_out_ext32u(TCGContext *s, TCGReg ret, TCGReg arg)
+{
+    tcg_out_opc_bstrpick_d(s, ret, arg, 0, 31);
+}
+
+static void tcg_out_ext8s(TCGContext *s, TCGReg ret, TCGReg arg)
+{
+    tcg_out_opc_sext_b(s, ret, arg);
+}
+
+static void tcg_out_ext16s(TCGContext *s, TCGReg ret, TCGReg arg)
+{
+    tcg_out_opc_sext_h(s, ret, arg);
+}
+
+static void tcg_out_ext32s(TCGContext *s, TCGReg ret, TCGReg arg)
+{
+    tcg_out_opc_addi_w(s, ret, arg, 0);
+}
+
 /*
  * Entry-points
  */
@@ -343,6 +373,7 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
                        const int const_args[TCG_MAX_OP_ARGS])
 {
     TCGArg a0 = args[0];
+    TCGArg a1 = args[1];
 
     switch (opc) {
     case INDEX_op_mb:
@@ -353,6 +384,41 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_opc_jirl(s, TCG_REG_ZERO, a0, 0);
         break;
 
+    case INDEX_op_ext8s_i32:
+    case INDEX_op_ext8s_i64:
+        tcg_out_ext8s(s, a0, a1);
+        break;
+
+    case INDEX_op_ext8u_i32:
+    case INDEX_op_ext8u_i64:
+        tcg_out_ext8u(s, a0, a1);
+        break;
+
+    case INDEX_op_ext16s_i32:
+    case INDEX_op_ext16s_i64:
+        tcg_out_ext16s(s, a0, a1);
+        break;
+
+    case INDEX_op_ext16u_i32:
+    case INDEX_op_ext16u_i64:
+        tcg_out_ext16u(s, a0, a1);
+        break;
+
+    case INDEX_op_ext32u_i64:
+    case INDEX_op_extu_i32_i64:
+        tcg_out_ext32u(s, a0, a1);
+        break;
+
+    case INDEX_op_ext32s_i64:
+    case INDEX_op_extrl_i64_i32:
+    case INDEX_op_ext_i32_i64:
+        tcg_out_ext32s(s, a0, a1);
+        break;
+
+    case INDEX_op_extrh_i64_i32:
+        tcg_out_opc_srai_d(s, a0, a1, 32);
+        break;
+
     case INDEX_op_mov_i32:  /* Always emitted via tcg_out_mov.  */
     case INDEX_op_mov_i64:
     default:
@@ -366,6 +432,22 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_goto_ptr:
         return C_O0_I1(r);
 
+    case INDEX_op_ext8s_i32:
+    case INDEX_op_ext8s_i64:
+    case INDEX_op_ext8u_i32:
+    case INDEX_op_ext8u_i64:
+    case INDEX_op_ext16s_i32:
+    case INDEX_op_ext16s_i64:
+    case INDEX_op_ext16u_i32:
+    case INDEX_op_ext16u_i64:
+    case INDEX_op_ext32s_i64:
+    case INDEX_op_ext32u_i64:
+    case INDEX_op_extu_i32_i64:
+    case INDEX_op_extrl_i64_i32:
+    case INDEX_op_extrh_i64_i32:
+    case INDEX_op_ext_i32_i64:
+        return C_O1_I1(r, r);
+
     default:
         g_assert_not_reached();
     }
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index 0fd9b61e6d..c047fc598a 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -107,10 +107,10 @@ typedef enum {
 #define TCG_TARGET_HAS_muls2_i32        0
 #define TCG_TARGET_HAS_muluh_i32        0
 #define TCG_TARGET_HAS_mulsh_i32        0
-#define TCG_TARGET_HAS_ext8s_i32        0
-#define TCG_TARGET_HAS_ext16s_i32       0
-#define TCG_TARGET_HAS_ext8u_i32        0
-#define TCG_TARGET_HAS_ext16u_i32       0
+#define TCG_TARGET_HAS_ext8s_i32        1
+#define TCG_TARGET_HAS_ext16s_i32       1
+#define TCG_TARGET_HAS_ext8u_i32        1
+#define TCG_TARGET_HAS_ext16u_i32       1
 #define TCG_TARGET_HAS_bswap16_i32      0
 #define TCG_TARGET_HAS_bswap32_i32      0
 #define TCG_TARGET_HAS_not_i32          0
@@ -138,14 +138,14 @@ typedef enum {
 #define TCG_TARGET_HAS_extract_i64      0
 #define TCG_TARGET_HAS_sextract_i64     0
 #define TCG_TARGET_HAS_extract2_i64     0
-#define TCG_TARGET_HAS_extrl_i64_i32    0
-#define TCG_TARGET_HAS_extrh_i64_i32    0
-#define TCG_TARGET_HAS_ext8s_i64        0
-#define TCG_TARGET_HAS_ext16s_i64       0
-#define TCG_TARGET_HAS_ext32s_i64       0
-#define TCG_TARGET_HAS_ext8u_i64        0
-#define TCG_TARGET_HAS_ext16u_i64       0
-#define TCG_TARGET_HAS_ext32u_i64       0
+#define TCG_TARGET_HAS_extrl_i64_i32    1
+#define TCG_TARGET_HAS_extrh_i64_i32    1
+#define TCG_TARGET_HAS_ext8s_i64        1
+#define TCG_TARGET_HAS_ext16s_i64       1
+#define TCG_TARGET_HAS_ext32s_i64       1
+#define TCG_TARGET_HAS_ext8u_i64        1
+#define TCG_TARGET_HAS_ext16u_i64       1
+#define TCG_TARGET_HAS_ext32u_i64       1
 #define TCG_TARGET_HAS_bswap16_i64      0
 #define TCG_TARGET_HAS_bswap32_i64      0
 #define TCG_TARGET_HAS_bswap64_i64      0
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 12/30] tcg/loongarch64: Implement not/and/or/xor/nor/andc/orc ops
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (10 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 11/30] tcg/loongarch64: Implement sign-/zero-extension ops WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:09 ` [PATCH v3 13/30] tcg/loongarch64: Implement deposit/extract ops WANG Xuerui
                   ` (17 subsequent siblings)
  29 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target-con-set.h |  2 +
 tcg/loongarch64/tcg-target.c.inc     | 88 ++++++++++++++++++++++++++++
 tcg/loongarch64/tcg-target.h         | 16 ++---
 3 files changed, 98 insertions(+), 8 deletions(-)

diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
index 7e459490ea..9ac24b8ad0 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -16,3 +16,5 @@
  */
 C_O0_I1(r)
 C_O1_I1(r, r)
+C_O1_I2(r, r, rC)
+C_O1_I2(r, r, rU)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 665c04af74..8d08a61899 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -374,6 +374,8 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
 {
     TCGArg a0 = args[0];
     TCGArg a1 = args[1];
+    TCGArg a2 = args[2];
+    int c2 = const_args[2];
 
     switch (opc) {
     case INDEX_op_mb:
@@ -419,6 +421,68 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_opc_srai_d(s, a0, a1, 32);
         break;
 
+    case INDEX_op_not_i32:
+    case INDEX_op_not_i64:
+        tcg_out_opc_nor(s, a0, a1, TCG_REG_ZERO);
+        break;
+
+    case INDEX_op_nor_i32:
+    case INDEX_op_nor_i64:
+        if (c2) {
+            tcg_out_opc_ori(s, a0, a1, a2);
+            tcg_out_opc_nor(s, a0, a0, TCG_REG_ZERO);
+        } else {
+            tcg_out_opc_nor(s, a0, a1, a2);
+        }
+        break;
+
+    case INDEX_op_andc_i32:
+    case INDEX_op_andc_i64:
+        if (c2) {
+            /* guaranteed to fit due to constraint */
+            tcg_out_opc_andi(s, a0, a1, ~a2);
+        } else {
+            tcg_out_opc_andn(s, a0, a1, a2);
+        }
+        break;
+
+    case INDEX_op_orc_i32:
+    case INDEX_op_orc_i64:
+        if (c2) {
+            /* guaranteed to fit due to constraint */
+            tcg_out_opc_ori(s, a0, a1, ~a2);
+        } else {
+            tcg_out_opc_orn(s, a0, a1, a2);
+        }
+        break;
+
+    case INDEX_op_and_i32:
+    case INDEX_op_and_i64:
+        if (c2) {
+            tcg_out_opc_andi(s, a0, a1, a2);
+        } else {
+            tcg_out_opc_and(s, a0, a1, a2);
+        }
+        break;
+
+    case INDEX_op_or_i32:
+    case INDEX_op_or_i64:
+        if (c2) {
+            tcg_out_opc_ori(s, a0, a1, a2);
+        } else {
+            tcg_out_opc_or(s, a0, a1, a2);
+        }
+        break;
+
+    case INDEX_op_xor_i32:
+    case INDEX_op_xor_i64:
+        if (c2) {
+            tcg_out_opc_xori(s, a0, a1, a2);
+        } else {
+            tcg_out_opc_xor(s, a0, a1, a2);
+        }
+        break;
+
     case INDEX_op_mov_i32:  /* Always emitted via tcg_out_mov.  */
     case INDEX_op_mov_i64:
     default:
@@ -446,8 +510,32 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_extrl_i64_i32:
     case INDEX_op_extrh_i64_i32:
     case INDEX_op_ext_i32_i64:
+    case INDEX_op_not_i32:
+    case INDEX_op_not_i64:
         return C_O1_I1(r, r);
 
+    case INDEX_op_andc_i32:
+    case INDEX_op_andc_i64:
+    case INDEX_op_orc_i32:
+    case INDEX_op_orc_i64:
+        /*
+         * LoongArch insns for these ops don't have reg-imm forms, but we
+         * can express using andi/ori if ~constant satisfies
+         * TCG_CT_CONST_U12.
+         */
+        return C_O1_I2(r, r, rC);
+
+    case INDEX_op_and_i32:
+    case INDEX_op_and_i64:
+    case INDEX_op_nor_i32:
+    case INDEX_op_nor_i64:
+    case INDEX_op_or_i32:
+    case INDEX_op_or_i64:
+    case INDEX_op_xor_i32:
+    case INDEX_op_xor_i64:
+        /* LoongArch reg-imm bitops have their imms ZERO-extended */
+        return C_O1_I2(r, r, rU);
+
     default:
         g_assert_not_reached();
     }
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index c047fc598a..48c198dcff 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -113,13 +113,13 @@ typedef enum {
 #define TCG_TARGET_HAS_ext16u_i32       1
 #define TCG_TARGET_HAS_bswap16_i32      0
 #define TCG_TARGET_HAS_bswap32_i32      0
-#define TCG_TARGET_HAS_not_i32          0
+#define TCG_TARGET_HAS_not_i32          1
 #define TCG_TARGET_HAS_neg_i32          0
-#define TCG_TARGET_HAS_andc_i32         0
-#define TCG_TARGET_HAS_orc_i32          0
+#define TCG_TARGET_HAS_andc_i32         1
+#define TCG_TARGET_HAS_orc_i32          1
 #define TCG_TARGET_HAS_eqv_i32          0
 #define TCG_TARGET_HAS_nand_i32         0
-#define TCG_TARGET_HAS_nor_i32          0
+#define TCG_TARGET_HAS_nor_i32          1
 #define TCG_TARGET_HAS_clz_i32          0
 #define TCG_TARGET_HAS_ctz_i32          0
 #define TCG_TARGET_HAS_ctpop_i32        0
@@ -149,13 +149,13 @@ typedef enum {
 #define TCG_TARGET_HAS_bswap16_i64      0
 #define TCG_TARGET_HAS_bswap32_i64      0
 #define TCG_TARGET_HAS_bswap64_i64      0
-#define TCG_TARGET_HAS_not_i64          0
+#define TCG_TARGET_HAS_not_i64          1
 #define TCG_TARGET_HAS_neg_i64          0
-#define TCG_TARGET_HAS_andc_i64         0
-#define TCG_TARGET_HAS_orc_i64          0
+#define TCG_TARGET_HAS_andc_i64         1
+#define TCG_TARGET_HAS_orc_i64          1
 #define TCG_TARGET_HAS_eqv_i64          0
 #define TCG_TARGET_HAS_nand_i64         0
-#define TCG_TARGET_HAS_nor_i64          0
+#define TCG_TARGET_HAS_nor_i64          1
 #define TCG_TARGET_HAS_clz_i64          0
 #define TCG_TARGET_HAS_ctz_i64          0
 #define TCG_TARGET_HAS_ctpop_i64        0
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 13/30] tcg/loongarch64: Implement deposit/extract ops
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (11 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 12/30] tcg/loongarch64: Implement not/and/or/xor/nor/andc/orc ops WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:09 ` [PATCH v3 14/30] tcg/loongarch64: Implement bswap{16,32,64} ops WANG Xuerui
                   ` (16 subsequent siblings)
  29 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target-con-set.h |  1 +
 tcg/loongarch64/tcg-target.c.inc     | 21 +++++++++++++++++++++
 tcg/loongarch64/tcg-target.h         |  8 ++++----
 3 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
index 9ac24b8ad0..d958183020 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -18,3 +18,4 @@ C_O0_I1(r)
 C_O1_I1(r, r)
 C_O1_I2(r, r, rC)
 C_O1_I2(r, r, rU)
+C_O1_I2(r, 0, rZ)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 8d08a61899..938de8fe6f 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -483,6 +483,20 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
 
+    case INDEX_op_extract_i32:
+        tcg_out_opc_bstrpick_w(s, a0, a1, a2, a2 + args[3] - 1);
+        break;
+    case INDEX_op_extract_i64:
+        tcg_out_opc_bstrpick_d(s, a0, a1, a2, a2 + args[3] - 1);
+        break;
+
+    case INDEX_op_deposit_i32:
+        tcg_out_opc_bstrins_w(s, a0, a2, args[3], args[3] + args[4] - 1);
+        break;
+    case INDEX_op_deposit_i64:
+        tcg_out_opc_bstrins_d(s, a0, a2, args[3], args[3] + args[4] - 1);
+        break;
+
     case INDEX_op_mov_i32:  /* Always emitted via tcg_out_mov.  */
     case INDEX_op_mov_i64:
     default:
@@ -512,6 +526,8 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_ext_i32_i64:
     case INDEX_op_not_i32:
     case INDEX_op_not_i64:
+    case INDEX_op_extract_i32:
+    case INDEX_op_extract_i64:
         return C_O1_I1(r, r);
 
     case INDEX_op_andc_i32:
@@ -536,6 +552,11 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
         /* LoongArch reg-imm bitops have their imms ZERO-extended */
         return C_O1_I2(r, r, rU);
 
+    case INDEX_op_deposit_i32:
+    case INDEX_op_deposit_i64:
+        /* Must deposit into the same register as input */
+        return C_O1_I2(r, 0, rZ);
+
     default:
         g_assert_not_reached();
     }
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index 48c198dcff..06a8f22792 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -97,8 +97,8 @@ typedef enum {
 #define TCG_TARGET_HAS_rem_i32          0
 #define TCG_TARGET_HAS_div2_i32         0
 #define TCG_TARGET_HAS_rot_i32          0
-#define TCG_TARGET_HAS_deposit_i32      0
-#define TCG_TARGET_HAS_extract_i32      0
+#define TCG_TARGET_HAS_deposit_i32      1
+#define TCG_TARGET_HAS_extract_i32      1
 #define TCG_TARGET_HAS_sextract_i32     0
 #define TCG_TARGET_HAS_extract2_i32     0
 #define TCG_TARGET_HAS_add2_i32         0
@@ -134,8 +134,8 @@ typedef enum {
 #define TCG_TARGET_HAS_rem_i64          0
 #define TCG_TARGET_HAS_div2_i64         0
 #define TCG_TARGET_HAS_rot_i64          0
-#define TCG_TARGET_HAS_deposit_i64      0
-#define TCG_TARGET_HAS_extract_i64      0
+#define TCG_TARGET_HAS_deposit_i64      1
+#define TCG_TARGET_HAS_extract_i64      1
 #define TCG_TARGET_HAS_sextract_i64     0
 #define TCG_TARGET_HAS_extract2_i64     0
 #define TCG_TARGET_HAS_extrl_i64_i32    1
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 14/30] tcg/loongarch64: Implement bswap{16,32,64} ops
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (12 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 13/30] tcg/loongarch64: Implement deposit/extract ops WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:23   ` Richard Henderson
  2021-09-22 18:09 ` [PATCH v3 15/30] tcg/loongarch64: Implement clz/ctz ops WANG Xuerui
                   ` (15 subsequent siblings)
  29 siblings, 1 reply; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
---
 tcg/loongarch64/tcg-target.c.inc | 32 ++++++++++++++++++++++++++++++++
 tcg/loongarch64/tcg-target.h     | 10 +++++-----
 2 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 938de8fe6f..be12cab2e3 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -497,6 +497,33 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_opc_bstrins_d(s, a0, a2, args[3], args[3] + args[4] - 1);
         break;
 
+    case INDEX_op_bswap16_i32:
+    case INDEX_op_bswap16_i64:
+        tcg_out_opc_revb_2h(s, a0, a1);
+        if (a2 & TCG_BSWAP_OS) {
+            tcg_out_ext16s(s, a0, a0);
+        } else if ((a2 & (TCG_BSWAP_IZ | TCG_BSWAP_OZ)) == TCG_BSWAP_OZ) {
+            tcg_out_ext16u(s, a0, a0);
+        }
+        break;
+
+    case INDEX_op_bswap32_i32:
+        /* All 32-bit values are computed sign-extended in the register.  */
+        a2 = TCG_BSWAP_OS;
+        /* fallthrough */
+    case INDEX_op_bswap32_i64:
+        tcg_out_opc_revb_2w(s, a0, a1);
+        if (a2 & TCG_BSWAP_OS) {
+            tcg_out_ext32s(s, a0, a0);
+        } else if ((a2 & (TCG_BSWAP_IZ | TCG_BSWAP_OZ)) == TCG_BSWAP_OZ) {
+            tcg_out_ext32u(s, a0, a0);
+        }
+        break;
+
+    case INDEX_op_bswap64_i64:
+        tcg_out_opc_revb_d(s, a0, a1);
+        break;
+
     case INDEX_op_mov_i32:  /* Always emitted via tcg_out_mov.  */
     case INDEX_op_mov_i64:
     default:
@@ -528,6 +555,11 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_not_i64:
     case INDEX_op_extract_i32:
     case INDEX_op_extract_i64:
+    case INDEX_op_bswap16_i32:
+    case INDEX_op_bswap16_i64:
+    case INDEX_op_bswap32_i32:
+    case INDEX_op_bswap32_i64:
+    case INDEX_op_bswap64_i64:
         return C_O1_I1(r, r);
 
     case INDEX_op_andc_i32:
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index 06a8f22792..cee457737a 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -111,8 +111,8 @@ typedef enum {
 #define TCG_TARGET_HAS_ext16s_i32       1
 #define TCG_TARGET_HAS_ext8u_i32        1
 #define TCG_TARGET_HAS_ext16u_i32       1
-#define TCG_TARGET_HAS_bswap16_i32      0
-#define TCG_TARGET_HAS_bswap32_i32      0
+#define TCG_TARGET_HAS_bswap16_i32      1
+#define TCG_TARGET_HAS_bswap32_i32      1
 #define TCG_TARGET_HAS_not_i32          1
 #define TCG_TARGET_HAS_neg_i32          0
 #define TCG_TARGET_HAS_andc_i32         1
@@ -146,9 +146,9 @@ typedef enum {
 #define TCG_TARGET_HAS_ext8u_i64        1
 #define TCG_TARGET_HAS_ext16u_i64       1
 #define TCG_TARGET_HAS_ext32u_i64       1
-#define TCG_TARGET_HAS_bswap16_i64      0
-#define TCG_TARGET_HAS_bswap32_i64      0
-#define TCG_TARGET_HAS_bswap64_i64      0
+#define TCG_TARGET_HAS_bswap16_i64      1
+#define TCG_TARGET_HAS_bswap32_i64      1
+#define TCG_TARGET_HAS_bswap64_i64      1
 #define TCG_TARGET_HAS_not_i64          1
 #define TCG_TARGET_HAS_neg_i64          0
 #define TCG_TARGET_HAS_andc_i64         1
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 15/30] tcg/loongarch64: Implement clz/ctz ops
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (13 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 14/30] tcg/loongarch64: Implement bswap{16,32,64} ops WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:09 ` [PATCH v3 16/30] tcg/loongarch64: Implement shl/shr/sar/rotl/rotr ops WANG Xuerui
                   ` (14 subsequent siblings)
  29 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target-con-set.h |  1 +
 tcg/loongarch64/tcg-target.c.inc     | 42 ++++++++++++++++++++++++++++
 tcg/loongarch64/tcg-target.h         |  8 +++---
 3 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
index d958183020..2975e03127 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -18,4 +18,5 @@ C_O0_I1(r)
 C_O1_I1(r, r)
 C_O1_I2(r, r, rC)
 C_O1_I2(r, r, rU)
+C_O1_I2(r, r, rW)
 C_O1_I2(r, 0, rZ)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index be12cab2e3..4e83167b99 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -364,6 +364,28 @@ static void tcg_out_ext32s(TCGContext *s, TCGReg ret, TCGReg arg)
     tcg_out_opc_addi_w(s, ret, arg, 0);
 }
 
+static void tcg_out_clzctz(TCGContext *s, LoongArchInsn opc,
+                           TCGReg a0, TCGReg a1, TCGReg a2,
+                           bool c2, bool is_32bit)
+{
+    if (c2) {
+        /*
+         * Fast path: semantics already satisfied due to constraint and
+         * insn behavior, single instruction is enough.
+         */
+        tcg_debug_assert(a2 == (is_32bit ? 32 : 64));
+        /* all clz/ctz insns belong to DJ-format */
+        tcg_out32(s, encode_dj_insn(opc, a0, a1));
+        return;
+    }
+
+    tcg_out32(s, encode_dj_insn(opc, TCG_REG_TMP0, a1));
+    /* a0 = a1 ? REG_TMP0 : a2 */
+    tcg_out_opc_maskeqz(s, TCG_REG_TMP0, TCG_REG_TMP0, a1);
+    tcg_out_opc_masknez(s, a0, a2, a1);
+    tcg_out_opc_or(s, a0, TCG_REG_TMP0, a0);
+}
+
 /*
  * Entry-points
  */
@@ -524,6 +546,20 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_opc_revb_d(s, a0, a1);
         break;
 
+    case INDEX_op_clz_i32:
+        tcg_out_clzctz(s, OPC_CLZ_W, a0, a1, a2, c2, true);
+        break;
+    case INDEX_op_clz_i64:
+        tcg_out_clzctz(s, OPC_CLZ_D, a0, a1, a2, c2, false);
+        break;
+
+    case INDEX_op_ctz_i32:
+        tcg_out_clzctz(s, OPC_CTZ_W, a0, a1, a2, c2, true);
+        break;
+    case INDEX_op_ctz_i64:
+        tcg_out_clzctz(s, OPC_CTZ_D, a0, a1, a2, c2, false);
+        break;
+
     case INDEX_op_mov_i32:  /* Always emitted via tcg_out_mov.  */
     case INDEX_op_mov_i64:
     default:
@@ -584,6 +620,12 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
         /* LoongArch reg-imm bitops have their imms ZERO-extended */
         return C_O1_I2(r, r, rU);
 
+    case INDEX_op_clz_i32:
+    case INDEX_op_clz_i64:
+    case INDEX_op_ctz_i32:
+    case INDEX_op_ctz_i64:
+        return C_O1_I2(r, r, rW);
+
     case INDEX_op_deposit_i32:
     case INDEX_op_deposit_i64:
         /* Must deposit into the same register as input */
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index cee457737a..fe152c17ac 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -120,8 +120,8 @@ typedef enum {
 #define TCG_TARGET_HAS_eqv_i32          0
 #define TCG_TARGET_HAS_nand_i32         0
 #define TCG_TARGET_HAS_nor_i32          1
-#define TCG_TARGET_HAS_clz_i32          0
-#define TCG_TARGET_HAS_ctz_i32          0
+#define TCG_TARGET_HAS_clz_i32          1
+#define TCG_TARGET_HAS_ctz_i32          1
 #define TCG_TARGET_HAS_ctpop_i32        0
 #define TCG_TARGET_HAS_direct_jump      0
 #define TCG_TARGET_HAS_brcond2          0
@@ -156,8 +156,8 @@ typedef enum {
 #define TCG_TARGET_HAS_eqv_i64          0
 #define TCG_TARGET_HAS_nand_i64         0
 #define TCG_TARGET_HAS_nor_i64          1
-#define TCG_TARGET_HAS_clz_i64          0
-#define TCG_TARGET_HAS_ctz_i64          0
+#define TCG_TARGET_HAS_clz_i64          1
+#define TCG_TARGET_HAS_ctz_i64          1
 #define TCG_TARGET_HAS_ctpop_i64        0
 #define TCG_TARGET_HAS_add2_i64         0
 #define TCG_TARGET_HAS_sub2_i64         0
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 16/30] tcg/loongarch64: Implement shl/shr/sar/rotl/rotr ops
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (14 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 15/30] tcg/loongarch64: Implement clz/ctz ops WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:09 ` [PATCH v3 17/30] tcg/loongarch64: Implement add/sub ops WANG Xuerui
                   ` (13 subsequent siblings)
  29 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target-con-set.h |  1 +
 tcg/loongarch64/tcg-target.c.inc     | 91 ++++++++++++++++++++++++++++
 tcg/loongarch64/tcg-target.h         |  4 +-
 3 files changed, 94 insertions(+), 2 deletions(-)

diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
index 2975e03127..42f8e28741 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -17,6 +17,7 @@
 C_O0_I1(r)
 C_O1_I1(r, r)
 C_O1_I2(r, r, rC)
+C_O1_I2(r, r, ri)
 C_O1_I2(r, r, rU)
 C_O1_I2(r, r, rW)
 C_O1_I2(r, 0, rZ)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 4e83167b99..373cf22447 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -560,6 +560,85 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_clzctz(s, OPC_CTZ_D, a0, a1, a2, c2, false);
         break;
 
+    case INDEX_op_shl_i32:
+        if (c2) {
+            tcg_out_opc_slli_w(s, a0, a1, a2 & 0x1f);
+        } else {
+            tcg_out_opc_sll_w(s, a0, a1, a2);
+        }
+        break;
+    case INDEX_op_shl_i64:
+        if (c2) {
+            tcg_out_opc_slli_d(s, a0, a1, a2 & 0x3f);
+        } else {
+            tcg_out_opc_sll_d(s, a0, a1, a2);
+        }
+        break;
+
+    case INDEX_op_shr_i32:
+        if (c2) {
+            tcg_out_opc_srli_w(s, a0, a1, a2 & 0x1f);
+        } else {
+            tcg_out_opc_srl_w(s, a0, a1, a2);
+        }
+        break;
+    case INDEX_op_shr_i64:
+        if (c2) {
+            tcg_out_opc_srli_d(s, a0, a1, a2 & 0x3f);
+        } else {
+            tcg_out_opc_srl_d(s, a0, a1, a2);
+        }
+        break;
+
+    case INDEX_op_sar_i32:
+        if (c2) {
+            tcg_out_opc_srai_w(s, a0, a1, a2 & 0x1f);
+        } else {
+            tcg_out_opc_sra_w(s, a0, a1, a2);
+        }
+        break;
+    case INDEX_op_sar_i64:
+        if (c2) {
+            tcg_out_opc_srai_d(s, a0, a1, a2 & 0x3f);
+        } else {
+            tcg_out_opc_sra_d(s, a0, a1, a2);
+        }
+        break;
+
+    case INDEX_op_rotl_i32:
+        /* transform into equivalent rotr/rotri */
+        if (c2) {
+            tcg_out_opc_rotri_w(s, a0, a1, (32 - a2) & 0x1f);
+        } else {
+            tcg_out_opc_sub_w(s, TCG_REG_TMP0, TCG_REG_ZERO, a2);
+            tcg_out_opc_rotr_w(s, a0, a1, TCG_REG_TMP0);
+        }
+        break;
+    case INDEX_op_rotl_i64:
+        /* transform into equivalent rotr/rotri */
+        if (c2) {
+            tcg_out_opc_rotri_d(s, a0, a1, (64 - a2) & 0x3f);
+        } else {
+            tcg_out_opc_sub_w(s, TCG_REG_TMP0, TCG_REG_ZERO, a2);
+            tcg_out_opc_rotr_d(s, a0, a1, TCG_REG_TMP0);
+        }
+        break;
+
+    case INDEX_op_rotr_i32:
+        if (c2) {
+            tcg_out_opc_rotri_w(s, a0, a1, a2 & 0x1f);
+        } else {
+            tcg_out_opc_rotr_w(s, a0, a1, a2);
+        }
+        break;
+    case INDEX_op_rotr_i64:
+        if (c2) {
+            tcg_out_opc_rotri_d(s, a0, a1, a2 & 0x3f);
+        } else {
+            tcg_out_opc_rotr_d(s, a0, a1, a2);
+        }
+        break;
+
     case INDEX_op_mov_i32:  /* Always emitted via tcg_out_mov.  */
     case INDEX_op_mov_i64:
     default:
@@ -609,6 +688,18 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
          */
         return C_O1_I2(r, r, rC);
 
+    case INDEX_op_shl_i32:
+    case INDEX_op_shl_i64:
+    case INDEX_op_shr_i32:
+    case INDEX_op_shr_i64:
+    case INDEX_op_sar_i32:
+    case INDEX_op_sar_i64:
+    case INDEX_op_rotl_i32:
+    case INDEX_op_rotl_i64:
+    case INDEX_op_rotr_i32:
+    case INDEX_op_rotr_i64:
+        return C_O1_I2(r, r, ri);
+
     case INDEX_op_and_i32:
     case INDEX_op_and_i64:
     case INDEX_op_nor_i32:
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index fe152c17ac..cc17e96ad8 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -96,7 +96,7 @@ typedef enum {
 #define TCG_TARGET_HAS_div_i32          0
 #define TCG_TARGET_HAS_rem_i32          0
 #define TCG_TARGET_HAS_div2_i32         0
-#define TCG_TARGET_HAS_rot_i32          0
+#define TCG_TARGET_HAS_rot_i32          1
 #define TCG_TARGET_HAS_deposit_i32      1
 #define TCG_TARGET_HAS_extract_i32      1
 #define TCG_TARGET_HAS_sextract_i32     0
@@ -133,7 +133,7 @@ typedef enum {
 #define TCG_TARGET_HAS_div_i64          0
 #define TCG_TARGET_HAS_rem_i64          0
 #define TCG_TARGET_HAS_div2_i64         0
-#define TCG_TARGET_HAS_rot_i64          0
+#define TCG_TARGET_HAS_rot_i64          1
 #define TCG_TARGET_HAS_deposit_i64      1
 #define TCG_TARGET_HAS_extract_i64      1
 #define TCG_TARGET_HAS_sextract_i64     0
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 17/30] tcg/loongarch64: Implement add/sub ops
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (15 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 16/30] tcg/loongarch64: Implement shl/shr/sar/rotl/rotr ops WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:09 ` [PATCH v3 18/30] tcg/loongarch64: Implement mul/mulsh/muluh/div/divu/rem/remu ops WANG Xuerui
                   ` (12 subsequent siblings)
  29 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

The neg_i{32,64} ops is fully expressible with sub, so omitted for
simplicity.

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target-con-set.h |  2 ++
 tcg/loongarch64/tcg-target.c.inc     | 38 ++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+)

diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
index 42f8e28741..4b8ce85897 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -18,6 +18,8 @@ C_O0_I1(r)
 C_O1_I1(r, r)
 C_O1_I2(r, r, rC)
 C_O1_I2(r, r, ri)
+C_O1_I2(r, r, rI)
 C_O1_I2(r, r, rU)
 C_O1_I2(r, r, rW)
 C_O1_I2(r, 0, rZ)
+C_O1_I2(r, rZ, rN)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 373cf22447..a651c09dd7 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -639,6 +639,36 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
 
+    case INDEX_op_add_i32:
+        if (c2) {
+            tcg_out_opc_addi_w(s, a0, a1, a2);
+        } else {
+            tcg_out_opc_add_w(s, a0, a1, a2);
+        }
+        break;
+    case INDEX_op_add_i64:
+        if (c2) {
+            tcg_out_opc_addi_d(s, a0, a1, a2);
+        } else {
+            tcg_out_opc_add_d(s, a0, a1, a2);
+        }
+        break;
+
+    case INDEX_op_sub_i32:
+        if (c2) {
+            tcg_out_opc_addi_w(s, a0, a1, -a2);
+        } else {
+            tcg_out_opc_sub_w(s, a0, a1, a2);
+        }
+        break;
+    case INDEX_op_sub_i64:
+        if (c2) {
+            tcg_out_opc_addi_d(s, a0, a1, -a2);
+        } else {
+            tcg_out_opc_sub_d(s, a0, a1, a2);
+        }
+        break;
+
     case INDEX_op_mov_i32:  /* Always emitted via tcg_out_mov.  */
     case INDEX_op_mov_i64:
     default:
@@ -700,6 +730,10 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_rotr_i64:
         return C_O1_I2(r, r, ri);
 
+    case INDEX_op_add_i32:
+    case INDEX_op_add_i64:
+        return C_O1_I2(r, r, rI);
+
     case INDEX_op_and_i32:
     case INDEX_op_and_i64:
     case INDEX_op_nor_i32:
@@ -722,6 +756,10 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
         /* Must deposit into the same register as input */
         return C_O1_I2(r, 0, rZ);
 
+    case INDEX_op_sub_i32:
+    case INDEX_op_sub_i64:
+        return C_O1_I2(r, rZ, rN);
+
     default:
         g_assert_not_reached();
     }
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 18/30] tcg/loongarch64: Implement mul/mulsh/muluh/div/divu/rem/remu ops
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (16 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 17/30] tcg/loongarch64: Implement add/sub ops WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:09 ` [PATCH v3 19/30] tcg/loongarch64: Implement br/brcond ops WANG Xuerui
                   ` (11 subsequent siblings)
  29 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target-con-set.h |  1 +
 tcg/loongarch64/tcg-target.c.inc     | 65 ++++++++++++++++++++++++++++
 tcg/loongarch64/tcg-target.h         | 16 +++----
 3 files changed, 74 insertions(+), 8 deletions(-)

diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
index 4b8ce85897..fb56f3a295 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -23,3 +23,4 @@ C_O1_I2(r, r, rU)
 C_O1_I2(r, r, rW)
 C_O1_I2(r, 0, rZ)
 C_O1_I2(r, rZ, rN)
+C_O1_I2(r, rZ, rZ)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index a651c09dd7..11fd31ff9d 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -669,6 +669,55 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
 
+    case INDEX_op_mul_i32:
+        tcg_out_opc_mul_w(s, a0, a1, a2);
+        break;
+    case INDEX_op_mul_i64:
+        tcg_out_opc_mul_d(s, a0, a1, a2);
+        break;
+
+    case INDEX_op_mulsh_i32:
+        tcg_out_opc_mulh_w(s, a0, a1, a2);
+        break;
+    case INDEX_op_mulsh_i64:
+        tcg_out_opc_mulh_d(s, a0, a1, a2);
+        break;
+
+    case INDEX_op_muluh_i32:
+        tcg_out_opc_mulh_wu(s, a0, a1, a2);
+        break;
+    case INDEX_op_muluh_i64:
+        tcg_out_opc_mulh_du(s, a0, a1, a2);
+        break;
+
+    case INDEX_op_div_i32:
+        tcg_out_opc_div_w(s, a0, a1, a2);
+        break;
+    case INDEX_op_div_i64:
+        tcg_out_opc_div_d(s, a0, a1, a2);
+        break;
+
+    case INDEX_op_divu_i32:
+        tcg_out_opc_div_wu(s, a0, a1, a2);
+        break;
+    case INDEX_op_divu_i64:
+        tcg_out_opc_div_du(s, a0, a1, a2);
+        break;
+
+    case INDEX_op_rem_i32:
+        tcg_out_opc_mod_w(s, a0, a1, a2);
+        break;
+    case INDEX_op_rem_i64:
+        tcg_out_opc_mod_d(s, a0, a1, a2);
+        break;
+
+    case INDEX_op_remu_i32:
+        tcg_out_opc_mod_wu(s, a0, a1, a2);
+        break;
+    case INDEX_op_remu_i64:
+        tcg_out_opc_mod_du(s, a0, a1, a2);
+        break;
+
     case INDEX_op_mov_i32:  /* Always emitted via tcg_out_mov.  */
     case INDEX_op_mov_i64:
     default:
@@ -760,6 +809,22 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_sub_i64:
         return C_O1_I2(r, rZ, rN);
 
+    case INDEX_op_mul_i32:
+    case INDEX_op_mul_i64:
+    case INDEX_op_mulsh_i32:
+    case INDEX_op_mulsh_i64:
+    case INDEX_op_muluh_i32:
+    case INDEX_op_muluh_i64:
+    case INDEX_op_div_i32:
+    case INDEX_op_div_i64:
+    case INDEX_op_divu_i32:
+    case INDEX_op_divu_i64:
+    case INDEX_op_rem_i32:
+    case INDEX_op_rem_i64:
+    case INDEX_op_remu_i32:
+    case INDEX_op_remu_i64:
+        return C_O1_I2(r, rZ, rZ);
+
     default:
         g_assert_not_reached();
     }
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index cc17e96ad8..dbb089aeeb 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -93,8 +93,8 @@ typedef enum {
 
 /* optional instructions */
 #define TCG_TARGET_HAS_movcond_i32      0
-#define TCG_TARGET_HAS_div_i32          0
-#define TCG_TARGET_HAS_rem_i32          0
+#define TCG_TARGET_HAS_div_i32          1
+#define TCG_TARGET_HAS_rem_i32          1
 #define TCG_TARGET_HAS_div2_i32         0
 #define TCG_TARGET_HAS_rot_i32          1
 #define TCG_TARGET_HAS_deposit_i32      1
@@ -105,8 +105,8 @@ typedef enum {
 #define TCG_TARGET_HAS_sub2_i32         0
 #define TCG_TARGET_HAS_mulu2_i32        0
 #define TCG_TARGET_HAS_muls2_i32        0
-#define TCG_TARGET_HAS_muluh_i32        0
-#define TCG_TARGET_HAS_mulsh_i32        0
+#define TCG_TARGET_HAS_muluh_i32        1
+#define TCG_TARGET_HAS_mulsh_i32        1
 #define TCG_TARGET_HAS_ext8s_i32        1
 #define TCG_TARGET_HAS_ext16s_i32       1
 #define TCG_TARGET_HAS_ext8u_i32        1
@@ -130,8 +130,8 @@ typedef enum {
 
 /* 64-bit operations */
 #define TCG_TARGET_HAS_movcond_i64      0
-#define TCG_TARGET_HAS_div_i64          0
-#define TCG_TARGET_HAS_rem_i64          0
+#define TCG_TARGET_HAS_div_i64          1
+#define TCG_TARGET_HAS_rem_i64          1
 #define TCG_TARGET_HAS_div2_i64         0
 #define TCG_TARGET_HAS_rot_i64          1
 #define TCG_TARGET_HAS_deposit_i64      1
@@ -163,8 +163,8 @@ typedef enum {
 #define TCG_TARGET_HAS_sub2_i64         0
 #define TCG_TARGET_HAS_mulu2_i64        0
 #define TCG_TARGET_HAS_muls2_i64        0
-#define TCG_TARGET_HAS_muluh_i64        0
-#define TCG_TARGET_HAS_mulsh_i64        0
+#define TCG_TARGET_HAS_muluh_i64        1
+#define TCG_TARGET_HAS_mulsh_i64        1
 
 /* not defined -- call should be eliminated at compile time */
 void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 19/30] tcg/loongarch64: Implement br/brcond ops
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (17 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 18/30] tcg/loongarch64: Implement mul/mulsh/muluh/div/divu/rem/remu ops WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:09 ` [PATCH v3 20/30] tcg/loongarch64: Implement setcond ops WANG Xuerui
                   ` (10 subsequent siblings)
  29 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target-con-set.h |  1 +
 tcg/loongarch64/tcg-target.c.inc     | 53 ++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+)

diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
index fb56f3a295..367689c2e2 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -15,6 +15,7 @@
  * tcg-target-con-str.h; the constraint combination is inclusive or.
  */
 C_O0_I1(r)
+C_O0_I2(rZ, rZ)
 C_O1_I1(r, r)
 C_O1_I2(r, r, rC)
 C_O1_I2(r, r, ri)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 11fd31ff9d..b65852ef2f 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -386,6 +386,44 @@ static void tcg_out_clzctz(TCGContext *s, LoongArchInsn opc,
     tcg_out_opc_or(s, a0, TCG_REG_TMP0, a0);
 }
 
+/*
+ * Branch helpers
+ */
+
+static const struct {
+    LoongArchInsn op;
+    bool swap;
+} tcg_brcond_to_loongarch[] = {
+    [TCG_COND_EQ] =  { OPC_BEQ,  false },
+    [TCG_COND_NE] =  { OPC_BNE,  false },
+    [TCG_COND_LT] =  { OPC_BGT,  true  },
+    [TCG_COND_GE] =  { OPC_BLE,  true  },
+    [TCG_COND_LE] =  { OPC_BLE,  false },
+    [TCG_COND_GT] =  { OPC_BGT,  false },
+    [TCG_COND_LTU] = { OPC_BGTU, true  },
+    [TCG_COND_GEU] = { OPC_BLEU, true  },
+    [TCG_COND_LEU] = { OPC_BLEU, false },
+    [TCG_COND_GTU] = { OPC_BGTU, false }
+};
+
+static void tcg_out_brcond(TCGContext *s, TCGCond cond, TCGReg arg1,
+                           TCGReg arg2, TCGLabel *l)
+{
+    LoongArchInsn op = tcg_brcond_to_loongarch[cond].op;
+
+    tcg_debug_assert(op != 0);
+
+    if (tcg_brcond_to_loongarch[cond].swap) {
+        TCGReg t = arg1;
+        arg1 = arg2;
+        arg2 = t;
+    }
+
+    /* all conditional branch insns belong to DJSk16-format */
+    tcg_out_reloc(s, s->code_ptr, R_LOONGARCH_BR_SK16, l, 0);
+    tcg_out32(s, encode_djsk16_insn(op, arg1, arg2, 0));
+}
+
 /*
  * Entry-points
  */
@@ -408,6 +446,17 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_opc_jirl(s, TCG_REG_ZERO, a0, 0);
         break;
 
+    case INDEX_op_br:
+        tcg_out_reloc(s, s->code_ptr, R_LOONGARCH_BR_SD10K16, arg_label(a0),
+                      0);
+        tcg_out_opc_b(s, 0);
+        break;
+
+    case INDEX_op_brcond_i32:
+    case INDEX_op_brcond_i64:
+        tcg_out_brcond(s, a2, a0, a1, arg_label(args[3]));
+        break;
+
     case INDEX_op_ext8s_i32:
     case INDEX_op_ext8s_i64:
         tcg_out_ext8s(s, a0, a1);
@@ -731,6 +780,10 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_goto_ptr:
         return C_O0_I1(r);
 
+    case INDEX_op_brcond_i32:
+    case INDEX_op_brcond_i64:
+        return C_O0_I2(rZ, rZ);
+
     case INDEX_op_ext8s_i32:
     case INDEX_op_ext8s_i64:
     case INDEX_op_ext8u_i32:
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 20/30] tcg/loongarch64: Implement setcond ops
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (18 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 19/30] tcg/loongarch64: Implement br/brcond ops WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:25   ` Richard Henderson
  2021-09-22 18:09 ` [PATCH v3 21/30] tcg/loongarch64: Implement tcg_out_call WANG Xuerui
                   ` (9 subsequent siblings)
  29 siblings, 1 reply; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
---
 tcg/loongarch64/tcg-target.c.inc | 67 ++++++++++++++++++++++++++++++++
 1 file changed, 67 insertions(+)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index b65852ef2f..e30277b43a 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -386,6 +386,66 @@ static void tcg_out_clzctz(TCGContext *s, LoongArchInsn opc,
     tcg_out_opc_or(s, a0, TCG_REG_TMP0, a0);
 }
 
+static void tcg_out_setcond(TCGContext *s, TCGCond cond, TCGReg ret,
+                            TCGReg arg1, TCGReg arg2, bool c2)
+{
+    TCGReg tmp;
+
+    if (c2) {
+        tcg_debug_assert(arg2 == 0);
+    }
+
+    switch (cond) {
+    case TCG_COND_EQ:
+        if (c2) {
+            tmp = arg1;
+        } else {
+            tcg_out_opc_sub_d(s, ret, arg1, arg2);
+            tmp = ret;
+        }
+        tcg_out_opc_sltui(s, ret, tmp, 1);
+        break;
+    case TCG_COND_NE:
+        if (c2) {
+            tmp = arg1;
+        } else {
+            tcg_out_opc_sub_d(s, ret, arg1, arg2);
+            tmp = ret;
+        }
+        tcg_out_opc_sltu(s, ret, TCG_REG_ZERO, tmp);
+        break;
+    case TCG_COND_LT:
+        tcg_out_opc_slt(s, ret, arg1, arg2);
+        break;
+    case TCG_COND_GE:
+        tcg_out_opc_slt(s, ret, arg1, arg2);
+        tcg_out_opc_xori(s, ret, ret, 1);
+        break;
+    case TCG_COND_LE:
+        tcg_out_setcond(s, TCG_COND_GE, ret, arg2, arg1, false);
+        break;
+    case TCG_COND_GT:
+        tcg_out_setcond(s, TCG_COND_LT, ret, arg2, arg1, false);
+        break;
+    case TCG_COND_LTU:
+        tcg_out_opc_sltu(s, ret, arg1, arg2);
+        break;
+    case TCG_COND_GEU:
+        tcg_out_opc_sltu(s, ret, arg1, arg2);
+        tcg_out_opc_xori(s, ret, ret, 1);
+        break;
+    case TCG_COND_LEU:
+        tcg_out_setcond(s, TCG_COND_GEU, ret, arg2, arg1, false);
+        break;
+    case TCG_COND_GTU:
+        tcg_out_setcond(s, TCG_COND_LTU, ret, arg2, arg1, false);
+        break;
+    default:
+        g_assert_not_reached();
+        break;
+    }
+}
+
 /*
  * Branch helpers
  */
@@ -767,6 +827,11 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_opc_mod_du(s, a0, a1, a2);
         break;
 
+    case INDEX_op_setcond_i32:
+    case INDEX_op_setcond_i64:
+        tcg_out_setcond(s, args[3], a0, a1, a2, c2);
+        break;
+
     case INDEX_op_mov_i32:  /* Always emitted via tcg_out_mov.  */
     case INDEX_op_mov_i64:
     default:
@@ -876,6 +941,8 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_rem_i64:
     case INDEX_op_remu_i32:
     case INDEX_op_remu_i64:
+    case INDEX_op_setcond_i32:
+    case INDEX_op_setcond_i64:
         return C_O1_I2(r, rZ, rZ);
 
     default:
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 21/30] tcg/loongarch64: Implement tcg_out_call
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (19 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 20/30] tcg/loongarch64: Implement setcond ops WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:09 ` [PATCH v3 22/30] tcg/loongarch64: Implement simple load/store ops WANG Xuerui
                   ` (8 subsequent siblings)
  29 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target.c.inc | 34 ++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index e30277b43a..66b19bb973 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -484,6 +484,39 @@ static void tcg_out_brcond(TCGContext *s, TCGCond cond, TCGReg arg1,
     tcg_out32(s, encode_djsk16_insn(op, arg1, arg2, 0));
 }
 
+static void tcg_out_call_int(TCGContext *s, const tcg_insn_unit *arg, bool tail)
+{
+    TCGReg link = tail ? TCG_REG_ZERO : TCG_REG_RA;
+    ptrdiff_t offset = tcg_pcrel_diff(s, arg);
+
+    tcg_debug_assert((offset & 3) == 0);
+    if (offset == sextreg(offset, 0, 28)) {
+        /* short jump: +/- 256MiB */
+        if (tail) {
+            tcg_out_opc_b(s, offset >> 2);
+        } else {
+            tcg_out_opc_bl(s, offset >> 2);
+        }
+    } else if (offset == sextreg(offset, 0, 38)) {
+        /* long jump: +/- 256GiB */
+        tcg_target_long lo = sextreg(offset, 0, 18);
+        tcg_target_long hi = offset - lo;
+        tcg_out_opc_pcaddu18i(s, TCG_REG_TMP0, hi >> 18);
+        tcg_out_opc_jirl(s, link, TCG_REG_TMP0, lo >> 2);
+    } else {
+        /* far jump: 64-bit */
+        tcg_target_long lo = sextreg((tcg_target_long)arg, 0, 18);
+        tcg_target_long hi = (tcg_target_long)arg - lo;
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_TMP0, hi);
+        tcg_out_opc_jirl(s, link, TCG_REG_TMP0, lo >> 2);
+    }
+}
+
+static void tcg_out_call(TCGContext *s, const tcg_insn_unit *arg)
+{
+    tcg_out_call_int(s, arg, false);
+}
+
 /*
  * Entry-points
  */
@@ -834,6 +867,7 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
 
     case INDEX_op_mov_i32:  /* Always emitted via tcg_out_mov.  */
     case INDEX_op_mov_i64:
+    case INDEX_op_call:     /* Always emitted via tcg_out_call.  */
     default:
         g_assert_not_reached();
     }
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 22/30] tcg/loongarch64: Implement simple load/store ops
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (20 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 21/30] tcg/loongarch64: Implement tcg_out_call WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:09 ` [PATCH v3 23/30] tcg/loongarch64: Add softmmu load/store helpers, implement qemu_ld/qemu_st ops WANG Xuerui
                   ` (7 subsequent siblings)
  29 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target-con-set.h |   1 +
 tcg/loongarch64/tcg-target.c.inc     | 131 +++++++++++++++++++++++++++
 2 files changed, 132 insertions(+)

diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
index 367689c2e2..3ab0416d9f 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -15,6 +15,7 @@
  * tcg-target-con-str.h; the constraint combination is inclusive or.
  */
 C_O0_I1(r)
+C_O0_I2(rZ, r)
 C_O0_I2(rZ, rZ)
 C_O1_I1(r, r)
 C_O1_I2(r, r, rC)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 66b19bb973..cc1123b015 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -517,6 +517,73 @@ static void tcg_out_call(TCGContext *s, const tcg_insn_unit *arg)
     tcg_out_call_int(s, arg, false);
 }
 
+/*
+ * Load/store helpers
+ */
+
+static void tcg_out_ldst(TCGContext *s, LoongArchInsn opc, TCGReg data,
+                         TCGReg addr, intptr_t offset)
+{
+    intptr_t imm12 = sextreg(offset, 0, 12);
+
+    if (offset != imm12) {
+        intptr_t diff = offset - (uintptr_t)s->code_ptr;
+
+        if (addr == TCG_REG_ZERO && diff == (int32_t)diff) {
+            imm12 = sextreg(diff, 0, 12);
+            tcg_out_opc_pcaddu12i(s, TCG_REG_TMP2, (diff - imm12) >> 12);
+        } else {
+            tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_TMP2, offset - imm12);
+            if (addr != TCG_REG_ZERO) {
+                tcg_out_opc_add_d(s, TCG_REG_TMP2, TCG_REG_TMP2, addr);
+            }
+        }
+        addr = TCG_REG_TMP2;
+    }
+
+    switch (opc) {
+    case OPC_LD_B:
+    case OPC_LD_BU:
+    case OPC_LD_H:
+    case OPC_LD_HU:
+    case OPC_LD_W:
+    case OPC_LD_WU:
+    case OPC_LD_D:
+    case OPC_ST_B:
+    case OPC_ST_H:
+    case OPC_ST_W:
+    case OPC_ST_D:
+        tcg_out32(s, encode_djsk12_insn(opc, data, addr, imm12));
+        break;
+    default:
+        g_assert_not_reached();
+    }
+}
+
+static void tcg_out_ld(TCGContext *s, TCGType type, TCGReg arg,
+                       TCGReg arg1, intptr_t arg2)
+{
+    bool is_32bit = type == TCG_TYPE_I32;
+    tcg_out_ldst(s, is_32bit ? OPC_LD_W : OPC_LD_D, arg, arg1, arg2);
+}
+
+static void tcg_out_st(TCGContext *s, TCGType type, TCGReg arg,
+                       TCGReg arg1, intptr_t arg2)
+{
+    bool is_32bit = type == TCG_TYPE_I32;
+    tcg_out_ldst(s, is_32bit ? OPC_ST_W : OPC_ST_D, arg, arg1, arg2);
+}
+
+static bool tcg_out_sti(TCGContext *s, TCGType type, TCGArg val,
+                        TCGReg base, intptr_t ofs)
+{
+    if (val == 0) {
+        tcg_out_st(s, type, TCG_REG_ZERO, base, ofs);
+        return true;
+    }
+    return false;
+}
+
 /*
  * Entry-points
  */
@@ -865,6 +932,49 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_setcond(s, args[3], a0, a1, a2, c2);
         break;
 
+    case INDEX_op_ld8s_i32:
+    case INDEX_op_ld8s_i64:
+        tcg_out_ldst(s, OPC_LD_B, a0, a1, a2);
+        break;
+    case INDEX_op_ld8u_i32:
+    case INDEX_op_ld8u_i64:
+        tcg_out_ldst(s, OPC_LD_BU, a0, a1, a2);
+        break;
+    case INDEX_op_ld16s_i32:
+    case INDEX_op_ld16s_i64:
+        tcg_out_ldst(s, OPC_LD_H, a0, a1, a2);
+        break;
+    case INDEX_op_ld16u_i32:
+    case INDEX_op_ld16u_i64:
+        tcg_out_ldst(s, OPC_LD_HU, a0, a1, a2);
+        break;
+    case INDEX_op_ld_i32:
+    case INDEX_op_ld32s_i64:
+        tcg_out_ldst(s, OPC_LD_W, a0, a1, a2);
+        break;
+    case INDEX_op_ld32u_i64:
+        tcg_out_ldst(s, OPC_LD_WU, a0, a1, a2);
+        break;
+    case INDEX_op_ld_i64:
+        tcg_out_ldst(s, OPC_LD_D, a0, a1, a2);
+        break;
+
+    case INDEX_op_st8_i32:
+    case INDEX_op_st8_i64:
+        tcg_out_ldst(s, OPC_ST_B, a0, a1, a2);
+        break;
+    case INDEX_op_st16_i32:
+    case INDEX_op_st16_i64:
+        tcg_out_ldst(s, OPC_ST_H, a0, a1, a2);
+        break;
+    case INDEX_op_st_i32:
+    case INDEX_op_st32_i64:
+        tcg_out_ldst(s, OPC_ST_W, a0, a1, a2);
+        break;
+    case INDEX_op_st_i64:
+        tcg_out_ldst(s, OPC_ST_D, a0, a1, a2);
+        break;
+
     case INDEX_op_mov_i32:  /* Always emitted via tcg_out_mov.  */
     case INDEX_op_mov_i64:
     case INDEX_op_call:     /* Always emitted via tcg_out_call.  */
@@ -879,6 +989,15 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_goto_ptr:
         return C_O0_I1(r);
 
+    case INDEX_op_st8_i32:
+    case INDEX_op_st8_i64:
+    case INDEX_op_st16_i32:
+    case INDEX_op_st16_i64:
+    case INDEX_op_st32_i64:
+    case INDEX_op_st_i32:
+    case INDEX_op_st_i64:
+        return C_O0_I2(rZ, r);
+
     case INDEX_op_brcond_i32:
     case INDEX_op_brcond_i64:
         return C_O0_I2(rZ, rZ);
@@ -906,6 +1025,18 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_bswap32_i32:
     case INDEX_op_bswap32_i64:
     case INDEX_op_bswap64_i64:
+    case INDEX_op_ld8s_i32:
+    case INDEX_op_ld8s_i64:
+    case INDEX_op_ld8u_i32:
+    case INDEX_op_ld8u_i64:
+    case INDEX_op_ld16s_i32:
+    case INDEX_op_ld16s_i64:
+    case INDEX_op_ld16u_i32:
+    case INDEX_op_ld16u_i64:
+    case INDEX_op_ld32s_i64:
+    case INDEX_op_ld32u_i64:
+    case INDEX_op_ld_i32:
+    case INDEX_op_ld_i64:
         return C_O1_I1(r, r);
 
     case INDEX_op_andc_i32:
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 23/30] tcg/loongarch64: Add softmmu load/store helpers, implement qemu_ld/qemu_st ops
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (21 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 22/30] tcg/loongarch64: Implement simple load/store ops WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-23 17:25   ` Richard Henderson
  2021-09-22 18:09 ` [PATCH v3 24/30] tcg/loongarch64: Implement tcg_target_qemu_prologue WANG Xuerui
                   ` (6 subsequent siblings)
  29 siblings, 1 reply; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
---
 tcg/loongarch64/tcg-target-con-set.h |   2 +
 tcg/loongarch64/tcg-target.c.inc     | 342 +++++++++++++++++++++++++++
 2 files changed, 344 insertions(+)

diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
index 3ab0416d9f..8fd3a2f4a1 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -17,7 +17,9 @@
 C_O0_I1(r)
 C_O0_I2(rZ, r)
 C_O0_I2(rZ, rZ)
+C_O0_I2(LZ, L)
 C_O1_I1(r, r)
+C_O1_I1(r, L)
 C_O1_I2(r, r, rC)
 C_O1_I2(r, r, ri)
 C_O1_I2(r, r, rI)
diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index cc1123b015..4c3feb2258 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -117,6 +117,11 @@ static const int tcg_target_call_oarg_regs[] = {
     TCG_REG_A1,
 };
 
+#ifndef CONFIG_SOFTMMU
+#define USE_GUEST_BASE     (guest_base != 0)
+#define TCG_GUEST_BASE_REG TCG_REG_S1
+#endif
+
 #define TCG_CT_CONST_ZERO  0x100
 #define TCG_CT_CONST_S12   0x200
 #define TCG_CT_CONST_N12   0x400
@@ -584,6 +589,322 @@ static bool tcg_out_sti(TCGContext *s, TCGType type, TCGArg val,
     return false;
 }
 
+/*
+ * Load/store helpers for SoftMMU, and qemu_ld/st implementations
+ */
+
+#if defined(CONFIG_SOFTMMU)
+#include "../tcg-ldst.c.inc"
+
+/*
+ * helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr,
+ *                                     TCGMemOpIdx oi, uintptr_t ra)
+ */
+static void * const qemu_ld_helpers[4] = {
+    [MO_8]  = helper_ret_ldub_mmu,
+    [MO_16] = helper_le_lduw_mmu,
+    [MO_32] = helper_le_ldul_mmu,
+    [MO_64] = helper_le_ldq_mmu,
+};
+
+/*
+ * helper signature: helper_ret_st_mmu(CPUState *env, target_ulong addr,
+ *                                     uintxx_t val, TCGMemOpIdx oi,
+ *                                     uintptr_t ra)
+ */
+static void * const qemu_st_helpers[4] = {
+    [MO_8]  = helper_ret_stb_mmu,
+    [MO_16] = helper_le_stw_mmu,
+    [MO_32] = helper_le_stl_mmu,
+    [MO_64] = helper_le_stq_mmu,
+};
+
+/* We expect to use a 12-bit negative offset from ENV.  */
+QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
+QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 11));
+
+static bool tcg_out_goto(TCGContext *s, const tcg_insn_unit *target)
+{
+    tcg_out_opc_b(s, 0);
+    return reloc_br_sd10k16(s->code_ptr - 1, target);
+}
+
+static void tcg_out_tlb_load(TCGContext *s, TCGReg addrl, TCGMemOpIdx oi,
+                             tcg_insn_unit **label_ptr, bool is_load)
+{
+    MemOp opc = get_memop(oi);
+    unsigned s_bits = opc & MO_SIZE;
+    unsigned a_bits = get_alignment_bits(opc);
+    tcg_target_long compare_mask;
+    int mem_index = get_mmuidx(oi);
+    int fast_ofs = TLB_MASK_TABLE_OFS(mem_index);
+    int mask_ofs = fast_ofs + offsetof(CPUTLBDescFast, mask);
+    int table_ofs = fast_ofs + offsetof(CPUTLBDescFast, table);
+
+    tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP0, TCG_AREG0, mask_ofs);
+    tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, TCG_AREG0, table_ofs);
+
+    tcg_out_opc_srli_d(s, TCG_REG_TMP2, addrl,
+                    TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
+    tcg_out_opc_and(s, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP0);
+    tcg_out_opc_add_d(s, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP1);
+
+    /* Load the tlb comparator and the addend.  */
+    tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP0, TCG_REG_TMP2,
+               is_load ? offsetof(CPUTLBEntry, addr_read)
+               : offsetof(CPUTLBEntry, addr_write));
+    tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP2, TCG_REG_TMP2,
+               offsetof(CPUTLBEntry, addend));
+
+    /* We don't support unaligned accesses.  */
+    if (a_bits < s_bits) {
+        a_bits = s_bits;
+    }
+    /* Clear the non-page, non-alignment bits from the address.  */
+    compare_mask = (tcg_target_long)TARGET_PAGE_MASK | ((1 << a_bits) - 1);
+    tcg_out_movi(s, TCG_TYPE_TL, TCG_REG_TMP1, compare_mask);
+    tcg_out_opc_and(s, TCG_REG_TMP1, TCG_REG_TMP1, addrl);
+
+    /* Compare masked address with the TLB entry.  */
+    label_ptr[0] = s->code_ptr;
+    tcg_out_opc_bne(s, TCG_REG_TMP0, TCG_REG_TMP1, 0);
+
+    /* TLB Hit - translate address using addend.  */
+    if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
+        tcg_out_ext32u(s, TCG_REG_TMP0, addrl);
+        addrl = TCG_REG_TMP0;
+    }
+    tcg_out_opc_add_d(s, TCG_REG_TMP0, TCG_REG_TMP2, addrl);
+}
+
+static void add_qemu_ldst_label(TCGContext *s, int is_ld, TCGMemOpIdx oi,
+                                TCGType type,
+                                TCGReg datalo, TCGReg addrlo,
+                                void *raddr, tcg_insn_unit **label_ptr)
+{
+    TCGLabelQemuLdst *label = new_ldst_label(s);
+
+    label->is_ld = is_ld;
+    label->oi = oi;
+    label->type = type;
+    label->datalo_reg = datalo;
+    label->datahi_reg = 0; /* unused */
+    label->addrlo_reg = addrlo;
+    label->addrhi_reg = 0; /* unused */
+    label->raddr = tcg_splitwx_to_rx(raddr);
+    label->label_ptr[0] = label_ptr[0];
+}
+
+static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
+{
+    TCGMemOpIdx oi = l->oi;
+    MemOp opc = get_memop(oi);
+    MemOp size = opc & MO_SIZE;
+    TCGType type = l->type;
+
+    /* resolve label address */
+    if (!reloc_br_sk16(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
+        return false;
+    }
+
+    /* call load helper */
+    tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_A0, TCG_AREG0);
+    tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_A1, l->addrlo_reg);
+    tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_A2, oi);
+    tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_A3, (tcg_target_long)l->raddr);
+
+    tcg_out_call(s, qemu_ld_helpers[size]);
+
+    switch (opc & MO_SSIZE) {
+    case MO_SB:
+        tcg_out_ext8s(s, l->datalo_reg, TCG_REG_A0);
+        break;
+    case MO_SW:
+        tcg_out_ext16s(s, l->datalo_reg, TCG_REG_A0);
+        break;
+    case MO_SL:
+        tcg_out_ext32s(s, l->datalo_reg, TCG_REG_A0);
+        break;
+    case MO_UL:
+        if (type == TCG_TYPE_I32) {
+            /* MO_UL loads of i32 should be sign-extended too */
+            tcg_out_ext32s(s, l->datalo_reg, TCG_REG_A0);
+            break;
+        }
+        /* fallthrough */
+    default:
+        tcg_out_mov(s, size == MO_64, l->datalo_reg, TCG_REG_A0);
+        break;
+    }
+
+    return tcg_out_goto(s, l->raddr);
+}
+
+static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
+{
+    TCGMemOpIdx oi = l->oi;
+    MemOp opc = get_memop(oi);
+    MemOp size = opc & MO_SIZE;
+
+    /* resolve label address */
+    if (!reloc_br_sk16(l->label_ptr[0], tcg_splitwx_to_rx(s->code_ptr))) {
+        return false;
+    }
+
+    /* call store helper */
+    tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_A0, TCG_AREG0);
+    tcg_out_mov(s, TCG_TYPE_PTR, TCG_REG_A1, l->addrlo_reg);
+    switch (size) {
+    case MO_8:
+        tcg_out_ext8u(s, TCG_REG_A2, l->datalo_reg);
+        break;
+    case MO_16:
+        tcg_out_ext16u(s, TCG_REG_A2, l->datalo_reg);
+        break;
+    case MO_32:
+        tcg_out_ext32u(s, TCG_REG_A2, l->datalo_reg);
+        break;
+    case MO_64:
+        tcg_out_mov(s, TCG_TYPE_I64, TCG_REG_A2, l->datalo_reg);
+        break;
+    default:
+        g_assert_not_reached();
+        break;
+    }
+    tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_A3, oi);
+    tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_A4, (tcg_target_long)l->raddr);
+
+    tcg_out_call(s, qemu_st_helpers[size]);
+
+    return tcg_out_goto(s, l->raddr);
+}
+#endif /* CONFIG_SOFTMMU */
+
+static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg lo, TCGReg base,
+                                   MemOp opc, TCGType type)
+{
+    /* Byte swapping is left to middle-end expansion.  */
+    tcg_debug_assert((opc & MO_BSWAP) == 0);
+
+    switch (opc & MO_SSIZE) {
+    case MO_UB:
+        tcg_out_opc_ld_bu(s, lo, base, 0);
+        break;
+    case MO_SB:
+        tcg_out_opc_ld_b(s, lo, base, 0);
+        break;
+    case MO_UW:
+        tcg_out_opc_ld_hu(s, lo, base, 0);
+        break;
+    case MO_SW:
+        tcg_out_opc_ld_h(s, lo, base, 0);
+        break;
+    case MO_UL:
+        if (type == TCG_TYPE_I64) {
+            tcg_out_opc_ld_wu(s, lo, base, 0);
+            break;
+        }
+        /* fallthrough */
+    case MO_SL:
+        tcg_out_opc_ld_w(s, lo, base, 0);
+        break;
+    case MO_Q:
+        tcg_out_opc_ld_d(s, lo, base, 0);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+}
+
+static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, TCGType type)
+{
+    TCGReg addr_regl;
+    TCGReg data_regl;
+    TCGMemOpIdx oi;
+    MemOp opc;
+#if defined(CONFIG_SOFTMMU)
+    tcg_insn_unit *label_ptr[1];
+#endif
+    TCGReg base = TCG_REG_TMP0;
+
+    data_regl = *args++;
+    addr_regl = *args++;
+    oi = *args++;
+    opc = get_memop(oi);
+
+#if defined(CONFIG_SOFTMMU)
+    tcg_out_tlb_load(s, addr_regl, oi, label_ptr, 1);
+    tcg_out_qemu_ld_direct(s, data_regl, base, opc, type);
+    add_qemu_ldst_label(s, 1, oi, type,
+                        data_regl, addr_regl,
+                        s->code_ptr, label_ptr);
+#else
+    if (USE_GUEST_BASE) {
+        tcg_out_opc_add_d(s, base, TCG_GUEST_BASE_REG, addr_regl);
+    } else {
+        base = addr_regl;
+    }
+    tcg_out_qemu_ld_direct(s, data_regl, base, opc, type);
+#endif
+}
+
+static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg lo,
+                                   TCGReg base, MemOp opc)
+{
+    /* Byte swapping is left to middle-end expansion.  */
+    tcg_debug_assert((opc & MO_BSWAP) == 0);
+
+    switch (opc & MO_SIZE) {
+    case MO_8:
+        tcg_out_opc_st_b(s, lo, base, 0);
+        break;
+    case MO_16:
+        tcg_out_opc_st_h(s, lo, base, 0);
+        break;
+    case MO_32:
+        tcg_out_opc_st_w(s, lo, base, 0);
+        break;
+    case MO_64:
+        tcg_out_opc_st_d(s, lo, base, 0);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+}
+
+static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args)
+{
+    TCGReg addr_regl;
+    TCGReg data_regl;
+    TCGMemOpIdx oi;
+    MemOp opc;
+#if defined(CONFIG_SOFTMMU)
+    tcg_insn_unit *label_ptr[1];
+#endif
+    TCGReg base = TCG_REG_TMP0;
+
+    data_regl = *args++;
+    addr_regl = *args++;
+    oi = *args++;
+    opc = get_memop(oi);
+
+#if defined(CONFIG_SOFTMMU)
+    tcg_out_tlb_load(s, addr_regl, oi, label_ptr, 0);
+    tcg_out_qemu_st_direct(s, data_regl, base, opc);
+    add_qemu_ldst_label(s, 0, oi,
+                        0, /* type param is unused for stores */
+                        data_regl, addr_regl,
+                        s->code_ptr, label_ptr);
+#else
+    if (USE_GUEST_BASE) {
+        tcg_out_opc_add_d(s, base, TCG_GUEST_BASE_REG, addr_regl);
+    } else {
+        base = addr_regl;
+    }
+    tcg_out_qemu_st_direct(s, data_regl, base, opc);
+#endif
+}
+
 /*
  * Entry-points
  */
@@ -975,6 +1296,19 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
         tcg_out_ldst(s, OPC_ST_D, a0, a1, a2);
         break;
 
+    case INDEX_op_qemu_ld_i32:
+        tcg_out_qemu_ld(s, args, TCG_TYPE_I32);
+        break;
+    case INDEX_op_qemu_ld_i64:
+        tcg_out_qemu_ld(s, args, TCG_TYPE_I64);
+        break;
+    case INDEX_op_qemu_st_i32:
+        tcg_out_qemu_st(s, args);
+        break;
+    case INDEX_op_qemu_st_i64:
+        tcg_out_qemu_st(s, args);
+        break;
+
     case INDEX_op_mov_i32:  /* Always emitted via tcg_out_mov.  */
     case INDEX_op_mov_i64:
     case INDEX_op_call:     /* Always emitted via tcg_out_call.  */
@@ -1002,6 +1336,10 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_brcond_i64:
         return C_O0_I2(rZ, rZ);
 
+    case INDEX_op_qemu_st_i32:
+    case INDEX_op_qemu_st_i64:
+        return C_O0_I2(LZ, L);
+
     case INDEX_op_ext8s_i32:
     case INDEX_op_ext8s_i64:
     case INDEX_op_ext8u_i32:
@@ -1039,6 +1377,10 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
     case INDEX_op_ld_i64:
         return C_O1_I1(r, r);
 
+    case INDEX_op_qemu_ld_i32:
+    case INDEX_op_qemu_ld_i64:
+        return C_O1_I1(r, L);
+
     case INDEX_op_andc_i32:
     case INDEX_op_andc_i64:
     case INDEX_op_orc_i32:
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 24/30] tcg/loongarch64: Implement tcg_target_qemu_prologue
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (22 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 23/30] tcg/loongarch64: Add softmmu load/store helpers, implement qemu_ld/qemu_st ops WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:09 ` [PATCH v3 25/30] tcg/loongarch64: Implement exit_tb/goto_tb WANG Xuerui
                   ` (5 subsequent siblings)
  29 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target.c.inc | 68 ++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 4c3feb2258..590f2b7c2f 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -909,6 +909,8 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args)
  * Entry-points
  */
 
+static const tcg_insn_unit *tb_ret_addr;
+
 static void tcg_out_op(TCGContext *s, TCGOpcode opc,
                        const TCGArg args[TCG_MAX_OP_ARGS],
                        const int const_args[TCG_MAX_OP_ARGS])
@@ -1456,3 +1458,69 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
         g_assert_not_reached();
     }
 }
+
+static const int tcg_target_callee_save_regs[] = {
+    TCG_REG_S0,     /* used for the global env (TCG_AREG0) */
+    TCG_REG_S1,
+    TCG_REG_S2,
+    TCG_REG_S3,
+    TCG_REG_S4,
+    TCG_REG_S5,
+    TCG_REG_S6,
+    TCG_REG_S7,
+    TCG_REG_S8,
+    TCG_REG_S9,
+    TCG_REG_RA,     /* should be last for ABI compliance */
+};
+
+/* Stack frame parameters.  */
+#define REG_SIZE   (TCG_TARGET_REG_BITS / 8)
+#define SAVE_SIZE  ((int)ARRAY_SIZE(tcg_target_callee_save_regs) * REG_SIZE)
+#define TEMP_SIZE  (CPU_TEMP_BUF_NLONGS * (int)sizeof(long))
+#define FRAME_SIZE ((TCG_STATIC_CALL_ARGS_SIZE + TEMP_SIZE + SAVE_SIZE \
+                     + TCG_TARGET_STACK_ALIGN - 1) \
+                    & -TCG_TARGET_STACK_ALIGN)
+#define SAVE_OFS   (TCG_STATIC_CALL_ARGS_SIZE + TEMP_SIZE)
+
+/* We're expecting to be able to use an immediate for frame allocation.  */
+QEMU_BUILD_BUG_ON(FRAME_SIZE > 0x7ff);
+
+/* Generate global QEMU prologue and epilogue code */
+static void tcg_target_qemu_prologue(TCGContext *s)
+{
+    int i;
+
+    tcg_set_frame(s, TCG_REG_SP, TCG_STATIC_CALL_ARGS_SIZE, TEMP_SIZE);
+
+    /* TB prologue */
+    tcg_out_opc_addi_d(s, TCG_REG_SP, TCG_REG_SP, -FRAME_SIZE);
+    for (i = 0; i < ARRAY_SIZE(tcg_target_callee_save_regs); i++) {
+        tcg_out_st(s, TCG_TYPE_REG, tcg_target_callee_save_regs[i],
+                   TCG_REG_SP, SAVE_OFS + i * REG_SIZE);
+    }
+
+#if !defined(CONFIG_SOFTMMU)
+    if (USE_GUEST_BASE) {
+        tcg_out_movi(s, TCG_TYPE_PTR, TCG_GUEST_BASE_REG, guest_base);
+        tcg_regset_set_reg(s->reserved_regs, TCG_GUEST_BASE_REG);
+    }
+#endif
+
+    /* Call generated code */
+    tcg_out_mov(s, TCG_TYPE_PTR, TCG_AREG0, tcg_target_call_iarg_regs[0]);
+    tcg_out_opc_jirl(s, TCG_REG_ZERO, tcg_target_call_iarg_regs[1], 0);
+
+    /* Return path for goto_ptr. Set return value to 0 */
+    tcg_code_gen_epilogue = tcg_splitwx_to_rx(s->code_ptr);
+    tcg_out_mov(s, TCG_TYPE_REG, TCG_REG_A0, TCG_REG_ZERO);
+
+    /* TB epilogue */
+    tb_ret_addr = tcg_splitwx_to_rx(s->code_ptr);
+    for (i = 0; i < ARRAY_SIZE(tcg_target_callee_save_regs); i++) {
+        tcg_out_ld(s, TCG_TYPE_REG, tcg_target_callee_save_regs[i],
+                   TCG_REG_SP, SAVE_OFS + i * REG_SIZE);
+    }
+
+    tcg_out_opc_addi_d(s, TCG_REG_SP, TCG_REG_SP, FRAME_SIZE);
+    tcg_out_opc_jirl(s, TCG_REG_ZERO, TCG_REG_RA, 0);
+}
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 25/30] tcg/loongarch64: Implement exit_tb/goto_tb
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (23 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 24/30] tcg/loongarch64: Implement tcg_target_qemu_prologue WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:09 ` [PATCH v3 26/30] tcg/loongarch64: Implement tcg_target_init WANG Xuerui
                   ` (4 subsequent siblings)
  29 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target.c.inc | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 590f2b7c2f..405da21234 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -921,6 +921,25 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
     int c2 = const_args[2];
 
     switch (opc) {
+    case INDEX_op_exit_tb:
+        /* Reuse the zeroing that exists for goto_ptr.  */
+        if (a0 == 0) {
+            tcg_out_call_int(s, tcg_code_gen_epilogue, true);
+        } else {
+            tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_A0, a0);
+            tcg_out_call_int(s, tb_ret_addr, true);
+        }
+        break;
+
+    case INDEX_op_goto_tb:
+        assert(s->tb_jmp_insn_offset == 0);
+        /* indirect jump method */
+        tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP0, TCG_REG_ZERO,
+                   (uintptr_t)(s->tb_jmp_target_addr + a0));
+        tcg_out_opc_jirl(s, TCG_REG_ZERO, TCG_REG_TMP0, 0);
+        set_jmp_reset_offset(s, a0);
+        break;
+
     case INDEX_op_mb:
         tcg_out_mb(s, a0);
         break;
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 26/30] tcg/loongarch64: Implement tcg_target_init
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (24 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 25/30] tcg/loongarch64: Implement exit_tb/goto_tb WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:09 ` [PATCH v3 27/30] tcg/loongarch64: Register the JIT WANG Xuerui
                   ` (3 subsequent siblings)
  29 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target.c.inc | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index 405da21234..b1de1529fa 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -1543,3 +1543,30 @@ static void tcg_target_qemu_prologue(TCGContext *s)
     tcg_out_opc_addi_d(s, TCG_REG_SP, TCG_REG_SP, FRAME_SIZE);
     tcg_out_opc_jirl(s, TCG_REG_ZERO, TCG_REG_RA, 0);
 }
+
+static void tcg_target_init(TCGContext *s)
+{
+    tcg_target_available_regs[TCG_TYPE_I32] = ALL_GENERAL_REGS;
+    tcg_target_available_regs[TCG_TYPE_I64] = ALL_GENERAL_REGS;
+
+    tcg_target_call_clobber_regs = ALL_GENERAL_REGS;
+    tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_S0);
+    tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_S1);
+    tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_S2);
+    tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_S3);
+    tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_S4);
+    tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_S5);
+    tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_S6);
+    tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_S7);
+    tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_S8);
+    tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_S9);
+
+    s->reserved_regs = 0;
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_ZERO);
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP0);
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP1);
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP2);
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_SP);
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_TP);
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_RESERVED);
+}
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 27/30] tcg/loongarch64: Register the JIT
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (25 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 26/30] tcg/loongarch64: Implement tcg_target_init WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:09 ` [PATCH v3 28/30] linux-user: Add safe syscall handling for loongarch64 hosts WANG Xuerui
                   ` (2 subsequent siblings)
  29 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/loongarch64/tcg-target.c.inc | 44 ++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index b1de1529fa..025ed525e6 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -1570,3 +1570,47 @@ static void tcg_target_init(TCGContext *s)
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_TP);
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_RESERVED);
 }
+
+typedef struct {
+    DebugFrameHeader h;
+    uint8_t fde_def_cfa[4];
+    uint8_t fde_reg_ofs[ARRAY_SIZE(tcg_target_callee_save_regs) * 2];
+} DebugFrame;
+
+#define ELF_HOST_MACHINE EM_LOONGARCH
+
+static const DebugFrame debug_frame = {
+    .h.cie.len = sizeof(DebugFrameCIE) - 4, /* length after .len member */
+    .h.cie.id = -1,
+    .h.cie.version = 1,
+    .h.cie.code_align = 1,
+    .h.cie.data_align = -(TCG_TARGET_REG_BITS / 8) & 0x7f, /* sleb128 */
+    .h.cie.return_column = TCG_REG_RA,
+
+    /* Total FDE size does not include the "len" member.  */
+    .h.fde.len = sizeof(DebugFrame) - offsetof(DebugFrame, h.fde.cie_offset),
+
+    .fde_def_cfa = {
+        12, TCG_REG_SP,                 /* DW_CFA_def_cfa sp, ...  */
+        (FRAME_SIZE & 0x7f) | 0x80,     /* ... uleb128 FRAME_SIZE */
+        (FRAME_SIZE >> 7)
+    },
+    .fde_reg_ofs = {
+        0x80 + 23, 11,                  /* DW_CFA_offset, s0, -88 */
+        0x80 + 24, 10,                  /* DW_CFA_offset, s1, -80 */
+        0x80 + 25, 9,                   /* DW_CFA_offset, s2, -72 */
+        0x80 + 26, 8,                   /* DW_CFA_offset, s3, -64 */
+        0x80 + 27, 7,                   /* DW_CFA_offset, s4, -56 */
+        0x80 + 28, 6,                   /* DW_CFA_offset, s5, -48 */
+        0x80 + 29, 5,                   /* DW_CFA_offset, s6, -40 */
+        0x80 + 30, 4,                   /* DW_CFA_offset, s7, -32 */
+        0x80 + 31, 3,                   /* DW_CFA_offset, s8, -24 */
+        0x80 + 22, 2,                   /* DW_CFA_offset, s9, -16 */
+        0x80 + 1 , 1,                   /* DW_CFA_offset, ra, -8 */
+    }
+};
+
+void tcg_register_jit(const void *buf, size_t buf_size)
+{
+    tcg_register_jit_int(buf, buf_size, &debug_frame, sizeof(debug_frame));
+}
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 28/30] linux-user: Add safe syscall handling for loongarch64 hosts
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (26 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 27/30] tcg/loongarch64: Register the JIT WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:09 ` [PATCH v3 29/30] accel/tcg/user-exec: Implement CPU-specific signal handler " WANG Xuerui
  2021-09-22 18:09 ` [PATCH v3 30/30] configure, meson.build: Mark support " WANG Xuerui
  29 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/host/loongarch64/hostdep.h         | 34 ++++++++
 .../host/loongarch64/safe-syscall.inc.S       | 80 +++++++++++++++++++
 2 files changed, 114 insertions(+)
 create mode 100644 linux-user/host/loongarch64/hostdep.h
 create mode 100644 linux-user/host/loongarch64/safe-syscall.inc.S

diff --git a/linux-user/host/loongarch64/hostdep.h b/linux-user/host/loongarch64/hostdep.h
new file mode 100644
index 0000000000..e3d5fa703f
--- /dev/null
+++ b/linux-user/host/loongarch64/hostdep.h
@@ -0,0 +1,34 @@
+/*
+ * hostdep.h : things which are dependent on the host architecture
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef LOONGARCH64_HOSTDEP_H
+#define LOONGARCH64_HOSTDEP_H
+
+/* We have a safe-syscall.inc.S */
+#define HAVE_SAFE_SYSCALL
+
+#ifndef __ASSEMBLER__
+
+/* These are defined by the safe-syscall.inc.S file */
+extern char safe_syscall_start[];
+extern char safe_syscall_end[];
+
+/* Adjust the signal context to rewind out of safe-syscall if we're in it */
+static inline void rewind_if_in_safe_syscall(void *puc)
+{
+    ucontext_t *uc = puc;
+    unsigned long long *pcreg = &uc->uc_mcontext.__pc;
+
+    if (*pcreg > (uintptr_t)safe_syscall_start
+        && *pcreg < (uintptr_t)safe_syscall_end) {
+        *pcreg = (uintptr_t)safe_syscall_start;
+    }
+}
+
+#endif /* __ASSEMBLER__ */
+
+#endif
diff --git a/linux-user/host/loongarch64/safe-syscall.inc.S b/linux-user/host/loongarch64/safe-syscall.inc.S
new file mode 100644
index 0000000000..bb530248b3
--- /dev/null
+++ b/linux-user/host/loongarch64/safe-syscall.inc.S
@@ -0,0 +1,80 @@
+/*
+ * safe-syscall.inc.S : host-specific assembly fragment
+ * to handle signals occurring at the same time as system calls.
+ * This is intended to be included by linux-user/safe-syscall.S
+ *
+ * Ported to LoongArch by WANG Xuerui <git@xen0n.name>
+ *
+ * Based on safe-syscall.inc.S code for every other architecture,
+ * originally written by Richard Henderson <rth@twiddle.net>
+ * Copyright (C) 2018 Linaro, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+	.global safe_syscall_base
+	.global safe_syscall_start
+	.global safe_syscall_end
+	.type	safe_syscall_base, @function
+	.type	safe_syscall_start, @function
+	.type	safe_syscall_end, @function
+
+	/*
+	 * This is the entry point for making a system call. The calling
+	 * convention here is that of a C varargs function with the
+	 * first argument an 'int *' to the signal_pending flag, the
+	 * second one the system call number (as a 'long'), and all further
+	 * arguments being syscall arguments (also 'long').
+	 * We return a long which is the syscall's return value, which
+	 * may be negative-errno on failure. Conversion to the
+	 * -1-and-errno-set convention is done by the calling wrapper.
+	 */
+safe_syscall_base:
+	.cfi_startproc
+	/*
+	 * The syscall calling convention is nearly the same as C:
+	 * we enter with a0 == *signal_pending
+	 *               a1 == syscall number
+	 *               a2 ... a7 == syscall arguments
+	 *               and return the result in a0
+	 * and the syscall instruction needs
+	 *               a7 == syscall number
+	 *               a0 ... a5 == syscall arguments
+	 *               and returns the result in a0
+	 * Shuffle everything around appropriately.
+	 */
+	move	$t0, $a0	/* signal_pending pointer */
+	move	$t1, $a1	/* syscall number */
+	move	$a0, $a2	/* syscall arguments */
+	move	$a1, $a3
+	move	$a2, $a4
+	move	$a3, $a5
+	move	$a4, $a6
+	move	$a5, $a7
+	move	$a7, $t1
+
+	/*
+	 * This next sequence of code works in conjunction with the
+	 * rewind_if_safe_syscall_function(). If a signal is taken
+	 * and the interrupted PC is anywhere between 'safe_syscall_start'
+	 * and 'safe_syscall_end' then we rewind it to 'safe_syscall_start'.
+	 * The code sequence must therefore be able to cope with this, and
+	 * the syscall instruction must be the final one in the sequence.
+	 */
+safe_syscall_start:
+	/* If signal_pending is non-zero, don't do the call */
+	ld.w	$t1, $t0, 0
+	bnez	$t1, 0f
+	syscall	0
+safe_syscall_end:
+	/* code path for having successfully executed the syscall */
+	jr	$ra
+
+0:
+	/* code path when we didn't execute the syscall */
+	li.w	$a0, -TARGET_ERESTARTSYS
+	jr	$ra
+	.cfi_endproc
+
+	.size	safe_syscall_base, .-safe_syscall_base
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 29/30] accel/tcg/user-exec: Implement CPU-specific signal handler for loongarch64 hosts
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (27 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 28/30] linux-user: Add safe syscall handling for loongarch64 hosts WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  2021-09-22 18:09 ` [PATCH v3 30/30] configure, meson.build: Mark support " WANG Xuerui
  29 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/user-exec.c | 73 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 8fed542622..38d4ad8a7d 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -878,6 +878,79 @@ int cpu_signal_handler(int host_signum, void *pinfo,
     return handle_cpu_signal(pc, info, is_write, &uc->uc_sigmask);
 }
 
+#elif defined(__loongarch64)
+
+int cpu_signal_handler(int host_signum, void *pinfo,
+                       void *puc)
+{
+    siginfo_t *info = pinfo;
+    ucontext_t *uc = puc;
+    greg_t pc = uc->uc_mcontext.__pc;
+    uint32_t insn = *(uint32_t *)pc;
+    int is_write = 0;
+
+    /* Detect store by reading the instruction at the program counter.  */
+    switch ((insn >> 26) & 0b111111) {
+    case 0b001000: /* {ll,sc}.[wd] */
+        switch ((insn >> 24) & 0b11) {
+        case 0b01: /* sc.w */
+        case 0b11: /* sc.d */
+            is_write = 1;
+            break;
+        }
+        break;
+    case 0b001001: /* {ld,st}ox4.[wd] ({ld,st}ptr.[wd]) */
+        switch ((insn >> 24) & 0b11) {
+        case 0b01: /* stox4.w (stptr.w) */
+        case 0b11: /* stox4.d (stptr.d) */
+            is_write = 1;
+            break;
+        }
+        break;
+    case 0b001010: /* {ld,st}.* family */
+        switch ((insn >> 22) & 0b1111) {
+        case 0b0100: /* st.b */
+        case 0b0101: /* st.h */
+        case 0b0110: /* st.w */
+        case 0b0111: /* st.d */
+        case 0b1101: /* fst.s */
+        case 0b1111: /* fst.d */
+            is_write = 1;
+            break;
+        }
+        break;
+    case 0b001110: /* indexed, atomic, bounds-checking memory operations */
+        uint32_t sel = (insn >> 15) & 0b11111111111;
+
+        switch (sel) {
+        case 0b00000100000: /* stx.b */
+        case 0b00000101000: /* stx.h */
+        case 0b00000110000: /* stx.w */
+        case 0b00000111000: /* stx.d */
+        case 0b00001110000: /* fstx.s */
+        case 0b00001111000: /* fstx.d */
+        case 0b00011101100: /* fstgt.s */
+        case 0b00011101101: /* fstgt.d */
+        case 0b00011101110: /* fstle.s */
+        case 0b00011101111: /* fstle.d */
+        case 0b00011111000: /* stgt.b */
+        case 0b00011111001: /* stgt.h */
+        case 0b00011111010: /* stgt.w */
+        case 0b00011111011: /* stgt.d */
+        case 0b00011111100: /* stle.b */
+        case 0b00011111101: /* stle.h */
+        case 0b00011111110: /* stle.w */
+        case 0b00011111111: /* stle.d */
+        case 0b00011000000 ... 0b00011100011: /* am* insns */
+            is_write = 1;
+            break;
+        }
+        break;
+    }
+
+    return handle_cpu_signal(pc, info, is_write, &uc->uc_sigmask);
+}
+
 #else
 
 #error host CPU specific signal handler needed
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH v3 30/30] configure, meson.build: Mark support for loongarch64 hosts
  2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
                   ` (28 preceding siblings ...)
  2021-09-22 18:09 ` [PATCH v3 29/30] accel/tcg/user-exec: Implement CPU-specific signal handler " WANG Xuerui
@ 2021-09-22 18:09 ` WANG Xuerui
  29 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: WANG Xuerui, Peter Maydell, Richard Henderson,
	Philippe Mathieu-Daudé,
	Laurent Vivier

Signed-off-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 configure   | 7 ++++++-
 meson.build | 2 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 1043ccce4f..3a9035385d 100755
--- a/configure
+++ b/configure
@@ -659,6 +659,8 @@ elif check_define __arm__ ; then
   cpu="arm"
 elif check_define __aarch64__ ; then
   cpu="aarch64"
+elif check_define __loongarch64 ; then
+  cpu="loongarch64"
 else
   cpu=$(uname -m)
 fi
@@ -667,7 +669,7 @@ ARCH=
 # Normalise host CPU name and set ARCH.
 # Note that this case should only have supported host CPUs, not guests.
 case "$cpu" in
-  ppc|ppc64|s390x|sparc64|x32|riscv32|riscv64)
+  ppc|ppc64|s390x|sparc64|x32|riscv32|riscv64|loongarch64)
   ;;
   ppc64le)
     ARCH="ppc64"
@@ -4969,6 +4971,9 @@ if test "$linux" = "yes" ; then
   aarch64)
     linux_arch=arm64
     ;;
+  loongarch*)
+    linux_arch=loongarch
+    ;;
   mips64)
     linux_arch=mips
     ;;
diff --git a/meson.build b/meson.build
index 15ef4d3c41..fc55712ac3 100644
--- a/meson.build
+++ b/meson.build
@@ -57,7 +57,7 @@ python = import('python').find_installation()
 
 supported_oses = ['windows', 'freebsd', 'netbsd', 'openbsd', 'darwin', 'sunos', 'linux']
 supported_cpus = ['ppc', 'ppc64', 's390x', 'riscv32', 'riscv64', 'x86', 'x86_64',
-  'arm', 'aarch64', 'mips', 'mips64', 'sparc', 'sparc64']
+  'arm', 'aarch64', 'loongarch64', 'mips', 'mips64', 'sparc', 'sparc64']
 
 cpu = host_machine.cpu_family()
 targetos = host_machine.system()
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 01/30] elf: Add machine type value for LoongArch
  2021-09-22 18:08 ` [PATCH v3 01/30] elf: Add machine type value for LoongArch WANG Xuerui
@ 2021-09-22 18:17   ` Richard Henderson
  2021-09-22 18:23   ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 53+ messages in thread
From: Richard Henderson @ 2021-09-22 18:17 UTC (permalink / raw)
  To: WANG Xuerui, qemu-devel
  Cc: Peter Maydell, Philippe Mathieu-Daudé, Laurent Vivier

On 9/22/21 11:08 AM, WANG Xuerui wrote:
> This is already officially allocated as recorded in GNU binutils
> repo [1], and the description is updated in [2]. Add to enable further
> work.
> 
> [1]:https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=4cf2ad720078a9f490dd5b5bc8893a926479196e
> [2]:https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=01a8c731aacbdbed0eb5682d13cc074dc7e25fb3
> 
> Signed-off-by: WANG Xuerui<git@xen0n.name>
> ---
>   include/elf.h | 2 ++
>   1 file changed, 2 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 14/30] tcg/loongarch64: Implement bswap{16,32,64} ops
  2021-09-22 18:09 ` [PATCH v3 14/30] tcg/loongarch64: Implement bswap{16,32,64} ops WANG Xuerui
@ 2021-09-22 18:23   ` Richard Henderson
  0 siblings, 0 replies; 53+ messages in thread
From: Richard Henderson @ 2021-09-22 18:23 UTC (permalink / raw)
  To: WANG Xuerui, qemu-devel
  Cc: Peter Maydell, Philippe Mathieu-Daudé, Laurent Vivier

On 9/22/21 11:09 AM, WANG Xuerui wrote:
> Signed-off-by: WANG Xuerui<git@xen0n.name>
> ---
>   tcg/loongarch64/tcg-target.c.inc | 32 ++++++++++++++++++++++++++++++++
>   tcg/loongarch64/tcg-target.h     | 10 +++++-----
>   2 files changed, 37 insertions(+), 5 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 01/30] elf: Add machine type value for LoongArch
  2021-09-22 18:08 ` [PATCH v3 01/30] elf: Add machine type value for LoongArch WANG Xuerui
  2021-09-22 18:17   ` Richard Henderson
@ 2021-09-22 18:23   ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 53+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-09-22 18:23 UTC (permalink / raw)
  To: WANG Xuerui
  Cc: Peter Maydell, Richard Henderson,
	qemu-devel@nongnu.org Developers, Laurent Vivier

On Wed, Sep 22, 2021 at 8:09 PM WANG Xuerui <git@xen0n.name> wrote:
>
> This is already officially allocated as recorded in GNU binutils
> repo [1], and the description is updated in [2]. Add to enable further
> work.
>
> [1]: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=4cf2ad720078a9f490dd5b5bc8893a926479196e
> [2]: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=01a8c731aacbdbed0eb5682d13cc074dc7e25fb3
>
> Signed-off-by: WANG Xuerui <git@xen0n.name>
> ---
>  include/elf.h | 2 ++
>  1 file changed, 2 insertions(+)

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 20/30] tcg/loongarch64: Implement setcond ops
  2021-09-22 18:09 ` [PATCH v3 20/30] tcg/loongarch64: Implement setcond ops WANG Xuerui
@ 2021-09-22 18:25   ` Richard Henderson
  0 siblings, 0 replies; 53+ messages in thread
From: Richard Henderson @ 2021-09-22 18:25 UTC (permalink / raw)
  To: WANG Xuerui, qemu-devel
  Cc: Peter Maydell, Philippe Mathieu-Daudé, Laurent Vivier

On 9/22/21 11:09 AM, WANG Xuerui wrote:
> +    case INDEX_op_setcond_i32:
> +    case INDEX_op_setcond_i64:
>           return C_O1_I2(r, rZ, rZ);

It shouldn't ever happen, but just for safety, remove the Z from arg1.

With that,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 03/30] tcg/loongarch64: Add the tcg-target.h file
  2021-09-22 18:09 ` [PATCH v3 03/30] tcg/loongarch64: Add the tcg-target.h file WANG Xuerui
@ 2021-09-22 18:34   ` Philippe Mathieu-Daudé
  2021-09-22 18:47     ` WANG Xuerui
  2021-09-22 18:58     ` Richard Henderson
  0 siblings, 2 replies; 53+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-09-22 18:34 UTC (permalink / raw)
  To: WANG Xuerui, qemu-devel; +Cc: Peter Maydell, Richard Henderson, Laurent Vivier

On 9/22/21 20:09, WANG Xuerui wrote:
> Support for all optional TCG ops are initially marked disabled; the bits
> are to be set in individual commits later.
> 
> Signed-off-by: WANG Xuerui <git@xen0n.name>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   tcg/loongarch64/tcg-target.h | 180 +++++++++++++++++++++++++++++++++++
>   1 file changed, 180 insertions(+)
>   create mode 100644 tcg/loongarch64/tcg-target.h
> 
> diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
> new file mode 100644
> index 0000000000..0fd9b61e6d
> --- /dev/null
> +++ b/tcg/loongarch64/tcg-target.h
> @@ -0,0 +1,180 @@
> +/*
> + * Tiny Code Generator for QEMU
> + *
> + * Copyright (c) 2021 WANG Xuerui <git@xen0n.name>
> + *
> + * Based on tcg/riscv/tcg-target.h
> + *
> + * Copyright (c) 2018 SiFive, Inc

I thought you could drop this line.

> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to deal
> + * in the Software without restriction, including without limitation the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +
> +#ifndef LOONGARCH_TCG_TARGET_H
> +#define LOONGARCH_TCG_TARGET_H
> +
> +/*
> + * Loongson removed the (incomplete) 32-bit support from kernel and toolchain
> + * for the initial upstreaming of this architecture, so don't bother and just
> + * support the LP64 ABI for now.
> + */
> +#if defined(__loongarch64)
> +# define TCG_TARGET_REG_BITS 64
> +#else
> +# error unsupported LoongArch register size
> +#endif
> +
> +#define TCG_TARGET_INSN_UNIT_SIZE 4
> +#define TCG_TARGET_NB_REGS 32
> +#define MAX_CODE_GEN_BUFFER_SIZE  ((size_t)-1)

Is this SIZE_MAX?

> +
> +typedef enum {
> +    TCG_REG_ZERO,
> +    TCG_REG_RA,
> +    TCG_REG_TP,
> +    TCG_REG_SP,
> +    TCG_REG_A0,
> +    TCG_REG_A1,
> +    TCG_REG_A2,
> +    TCG_REG_A3,
> +    TCG_REG_A4,
> +    TCG_REG_A5,
> +    TCG_REG_A6,
> +    TCG_REG_A7,
> +    TCG_REG_T0,
> +    TCG_REG_T1,
> +    TCG_REG_T2,
> +    TCG_REG_T3,
> +    TCG_REG_T4,
> +    TCG_REG_T5,
> +    TCG_REG_T6,
> +    TCG_REG_T7,
> +    TCG_REG_T8,
> +    TCG_REG_RESERVED,
> +    TCG_REG_S9,
> +    TCG_REG_S0,
> +    TCG_REG_S1,
> +    TCG_REG_S2,
> +    TCG_REG_S3,
> +    TCG_REG_S4,
> +    TCG_REG_S5,
> +    TCG_REG_S6,
> +    TCG_REG_S7,
> +    TCG_REG_S8,

Here could go:

        TCG_TARGET_NB_REGS,

> +
> +    /* aliases */
> +    TCG_AREG0    = TCG_REG_S0,
> +    TCG_REG_TMP0 = TCG_REG_T8,
> +    TCG_REG_TMP1 = TCG_REG_T7,
> +    TCG_REG_TMP2 = TCG_REG_T6,
> +} TCGReg;

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 04/30] tcg/loongarch64: Add generated instruction opcodes and encoding helpers
  2021-09-22 18:09 ` [PATCH v3 04/30] tcg/loongarch64: Add generated instruction opcodes and encoding helpers WANG Xuerui
@ 2021-09-22 18:37   ` Philippe Mathieu-Daudé
  2021-09-22 18:51     ` WANG Xuerui
  0 siblings, 1 reply; 53+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-09-22 18:37 UTC (permalink / raw)
  To: WANG Xuerui, qemu-devel; +Cc: Peter Maydell, Richard Henderson, Laurent Vivier

On 9/22/21 20:09, WANG Xuerui wrote:
> Signed-off-by: WANG Xuerui <git@xen0n.name>
> Acked-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   tcg/loongarch64/tcg-insn-defs.c.inc | 881 ++++++++++++++++++++++++++++
>   1 file changed, 881 insertions(+)
>   create mode 100644 tcg/loongarch64/tcg-insn-defs.c.inc
> 
> diff --git a/tcg/loongarch64/tcg-insn-defs.c.inc b/tcg/loongarch64/tcg-insn-defs.c.inc
> new file mode 100644
> index 0000000000..7d73136b1b
> --- /dev/null
> +++ b/tcg/loongarch64/tcg-insn-defs.c.inc
> @@ -0,0 +1,881 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * LoongArch instruction formats, opcodes, and encoders for TCG use.
> + *
> + * Code generated by genqemutcgdefs from
> + * https://github.com/loongson-community/loongarch-opcodes,
> + * from commit 308d38087401eb333063556f027ce0b24d86b416.
> + * DO NOT EDIT.

The generated code ...

> + */
> +
> +typedef enum {
> +    OPC_CLZ_W = 0x00001400,
> +    OPC_CTZ_W = 0x00001c00,
> +    OPC_CLZ_D = 0x00002400,
> +    OPC_CTZ_D = 0x00002c00,
> +    OPC_REVB_2H = 0x00003000,
> +    OPC_REVB_2W = 0x00003800,
> +    OPC_REVB_D = 0x00003c00,
> +    OPC_SEXT_H = 0x00005800,
> +    OPC_SEXT_B = 0x00005c00,
> +    OPC_ADD_W = 0x00100000,
> +    OPC_ADD_D = 0x00108000,
> +    OPC_SUB_W = 0x00110000,
> +    OPC_SUB_D = 0x00118000,
> +    OPC_SLT = 0x00120000,
> +    OPC_SLTU = 0x00128000,
> +    OPC_MASKEQZ = 0x00130000,
> +    OPC_MASKNEZ = 0x00138000,
> +    OPC_NOR = 0x00140000,
> +    OPC_AND = 0x00148000,
> +    OPC_OR = 0x00150000,
> +    OPC_XOR = 0x00158000,
> +    OPC_ORN = 0x00160000,
> +    OPC_ANDN = 0x00168000,
> +    OPC_SLL_W = 0x00170000,
> +    OPC_SRL_W = 0x00178000,
> +    OPC_SRA_W = 0x00180000,
> +    OPC_SLL_D = 0x00188000,
> +    OPC_SRL_D = 0x00190000,
> +    OPC_SRA_D = 0x00198000,
> +    OPC_ROTR_W = 0x001b0000,
> +    OPC_ROTR_D = 0x001b8000,
> +    OPC_MUL_W = 0x001c0000,
> +    OPC_MULH_W = 0x001c8000,
> +    OPC_MULH_WU = 0x001d0000,
> +    OPC_MUL_D = 0x001d8000,
> +    OPC_MULH_D = 0x001e0000,
> +    OPC_MULH_DU = 0x001e8000,
> +    OPC_DIV_W = 0x00200000,
> +    OPC_MOD_W = 0x00208000,
> +    OPC_DIV_WU = 0x00210000,
> +    OPC_MOD_WU = 0x00218000,
> +    OPC_DIV_D = 0x00220000,
> +    OPC_MOD_D = 0x00228000,
> +    OPC_DIV_DU = 0x00230000,
> +    OPC_MOD_DU = 0x00238000,
> +    OPC_SLLI_W = 0x00408000,
> +    OPC_SLLI_D = 0x00410000,
> +    OPC_SRLI_W = 0x00448000,
> +    OPC_SRLI_D = 0x00450000,
> +    OPC_SRAI_W = 0x00488000,
> +    OPC_SRAI_D = 0x00490000,
> +    OPC_ROTRI_W = 0x004c8000,
> +    OPC_ROTRI_D = 0x004d0000,
> +    OPC_BSTRINS_W = 0x00600000,
> +    OPC_BSTRPICK_W = 0x00608000,
> +    OPC_BSTRINS_D = 0x00800000,
> +    OPC_BSTRPICK_D = 0x00c00000,
> +    OPC_SLTI = 0x02000000,
> +    OPC_SLTUI = 0x02400000,
> +    OPC_ADDI_W = 0x02800000,
> +    OPC_ADDI_D = 0x02c00000,
> +    OPC_CU52I_D = 0x03000000,
> +    OPC_ANDI = 0x03400000,
> +    OPC_ORI = 0x03800000,
> +    OPC_XORI = 0x03c00000,
> +    OPC_LU12I_W = 0x14000000,
> +    OPC_CU32I_D = 0x16000000,
> +    OPC_PCADDU2I = 0x18000000,
> +    OPC_PCADDU12I = 0x1c000000,
> +    OPC_PCADDU18I = 0x1e000000,
> +    OPC_LD_B = 0x28000000,
> +    OPC_LD_H = 0x28400000,
> +    OPC_LD_W = 0x28800000,
> +    OPC_LD_D = 0x28c00000,
> +    OPC_ST_B = 0x29000000,
> +    OPC_ST_H = 0x29400000,
> +    OPC_ST_W = 0x29800000,
> +    OPC_ST_D = 0x29c00000,
> +    OPC_LD_BU = 0x2a000000,
> +    OPC_LD_HU = 0x2a400000,
> +    OPC_LD_WU = 0x2a800000,
> +    OPC_DBAR = 0x38720000,
> +    OPC_JIRL = 0x4c000000,
> +    OPC_B = 0x50000000,
> +    OPC_BL = 0x54000000,
> +    OPC_BEQ = 0x58000000,
> +    OPC_BNE = 0x5c000000,
> +    OPC_BGT = 0x60000000,
> +    OPC_BLE = 0x64000000,
> +    OPC_BGTU = 0x68000000,
> +    OPC_BLEU = 0x6c000000,
> +} LoongArchInsn;

... ends here, right? ...

> +
> +static int32_t __attribute__((unused))
> +encode_d_slot(LoongArchInsn opc, uint32_t d)
> +{
> +    return opc | d;
> +}
> +
> +static int32_t __attribute__((unused))
> +encode_dj_slots(LoongArchInsn opc, uint32_t d, uint32_t j)
> +{
> +    return opc | d | j << 5;
> +}
> +
> +static int32_t __attribute__((unused))
> +encode_djk_slots(LoongArchInsn opc, uint32_t d, uint32_t j, uint32_t k)
> +{
> +    return opc | d | j << 5 | k << 10;
> +}
> +
> +static int32_t __attribute__((unused))
> +encode_djkm_slots(LoongArchInsn opc, uint32_t d, uint32_t j, uint32_t k,
> +                  uint32_t m)
> +{
> +    return opc | d | j << 5 | k << 10 | m << 16;
> +}
> +
> +static int32_t __attribute__((unused))
> +encode_dk_slots(LoongArchInsn opc, uint32_t d, uint32_t k)
> +{
> +    return opc | d | k << 10;
> +}
> +
> +static int32_t __attribute__((unused))
> +encode_dj_insn(LoongArchInsn opc, TCGReg d, TCGReg j)
> +{
> +    tcg_debug_assert(d >= 0 && d <= 0x1f);
> +    tcg_debug_assert(j >= 0 && j <= 0x1f);
> +    return encode_dj_slots(opc, d, j);
> +}

... or are these helpers also generated?


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 09/30] tcg/loongarch64: Implement tcg_out_mov and tcg_out_movi
  2021-09-22 18:09 ` [PATCH v3 09/30] tcg/loongarch64: Implement tcg_out_mov and tcg_out_movi WANG Xuerui
@ 2021-09-22 18:39   ` Richard Henderson
  2021-09-22 19:02     ` WANG Xuerui
  2021-09-22 18:51   ` Richard Henderson
  2021-09-23 16:50   ` Richard Henderson
  2 siblings, 1 reply; 53+ messages in thread
From: Richard Henderson @ 2021-09-22 18:39 UTC (permalink / raw)
  To: WANG Xuerui, qemu-devel
  Cc: Peter Maydell, Philippe Mathieu-Daudé, Laurent Vivier

On 9/22/21 11:09 AM, WANG Xuerui wrote:
> +    if (pc_offset == (int32_t)pc_offset) {
> +        tcg_target_long lo = sextreg(pc_offset, 0, 12);
> +        tcg_target_long hi = pc_offset - lo;
> +        tcg_out_opc_pcaddu12i(s, rd, hi >> 12);
> +        tcg_out_opc_addi_d(s, rd, rd, lo);

pc_offset = 0x7ffff800 will fail:

lo = 0xfffffffffffff800
hi = 0x0000000080000000

but hi will be interpreted as negative by pcaddu12i.

This is the same problem I pointed out with tcg_out_call, but with different constants.


r~


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 07/30] tcg/loongarch64: Implement necessary relocation operations
  2021-09-22 18:09 ` [PATCH v3 07/30] tcg/loongarch64: Implement necessary relocation operations WANG Xuerui
@ 2021-09-22 18:41   ` Philippe Mathieu-Daudé
  2021-09-22 18:55     ` WANG Xuerui
  0 siblings, 1 reply; 53+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-09-22 18:41 UTC (permalink / raw)
  To: WANG Xuerui, qemu-devel; +Cc: Peter Maydell, Richard Henderson, Laurent Vivier

On 9/22/21 20:09, WANG Xuerui wrote:
> Signed-off-by: WANG Xuerui <git@xen0n.name>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   tcg/loongarch64/tcg-target.c.inc | 66 ++++++++++++++++++++++++++++++++
>   1 file changed, 66 insertions(+)

> +/* Field Sk16, shifted right by 2; suitable for conditional jumps */
> +#define R_LOONGARCH_BR_SK16     256
> +/* Field Sd10k16, shifted right by 2; suitable for B and BL */
> +#define R_LOONGARCH_BR_SD10K16  257
> +
> +static bool reloc_br_sk16(tcg_insn_unit *src_rw, const tcg_insn_unit *target)
> +{
> +    const tcg_insn_unit *src_rx = tcg_splitwx_to_rx(src_rw);
> +    intptr_t offset = (intptr_t)target - (intptr_t)src_rx;
> +
> +    tcg_debug_assert((offset & 3) == 0);
> +    offset >>= 2;
> +    if (offset == sextreg(offset, 0, 16)) {
> +        *src_rw |= (offset << 10) & 0x3fffc00;

Alternatively easier to read:

            *src_rw = deposit64(*src_rw , 10, 16, offset);

> +        return true;
> +    }
> +
> +    return false;
> +}


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 03/30] tcg/loongarch64: Add the tcg-target.h file
  2021-09-22 18:34   ` Philippe Mathieu-Daudé
@ 2021-09-22 18:47     ` WANG Xuerui
  2021-09-22 18:58     ` Richard Henderson
  1 sibling, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:47 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, qemu-devel
  Cc: Peter Maydell, Richard Henderson, Laurent Vivier

Hi Philippe,

On 9/23/21 02:34, Philippe Mathieu-Daudé wrote:
> On 9/22/21 20:09, WANG Xuerui wrote:
>> Support for all optional TCG ops are initially marked disabled; the bits
>> are to be set in individual commits later.
>>
>> Signed-off-by: WANG Xuerui <git@xen0n.name>
>> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>   tcg/loongarch64/tcg-target.h | 180 +++++++++++++++++++++++++++++++++++
>>   1 file changed, 180 insertions(+)
>>   create mode 100644 tcg/loongarch64/tcg-target.h
>>
>> diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
>> new file mode 100644
>> index 0000000000..0fd9b61e6d
>> --- /dev/null
>> +++ b/tcg/loongarch64/tcg-target.h
>> @@ -0,0 +1,180 @@
>> +/*
>> + * Tiny Code Generator for QEMU
>> + *
>> + * Copyright (c) 2021 WANG Xuerui <git@xen0n.name>
>> + *
>> + * Based on tcg/riscv/tcg-target.h
>> + *
>> + * Copyright (c) 2018 SiFive, Inc
>
> I thought you could drop this line.
That's the original file's copyright line, and I always thought dropping 
it in derivative files wouldn't be nice?
>
>> + *
>> + * Permission is hereby granted, free of charge, to any person 
>> obtaining a copy
>> + * of this software and associated documentation files (the 
>> "Software"), to deal
>> + * in the Software without restriction, including without limitation 
>> the rights
>> + * to use, copy, modify, merge, publish, distribute, sublicense, 
>> and/or sell
>> + * copies of the Software, and to permit persons to whom the 
>> Software is
>> + * furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice shall be 
>> included in
>> + * all copies or substantial portions of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 
>> EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
>> MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT 
>> SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES 
>> OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, 
>> ARISING FROM,
>> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 
>> DEALINGS IN
>> + * THE SOFTWARE.
>> + */
>> +
>> +#ifndef LOONGARCH_TCG_TARGET_H
>> +#define LOONGARCH_TCG_TARGET_H
>> +
>> +/*
>> + * Loongson removed the (incomplete) 32-bit support from kernel and 
>> toolchain
>> + * for the initial upstreaming of this architecture, so don't bother 
>> and just
>> + * support the LP64 ABI for now.
>> + */
>> +#if defined(__loongarch64)
>> +# define TCG_TARGET_REG_BITS 64
>> +#else
>> +# error unsupported LoongArch register size
>> +#endif
>> +
>> +#define TCG_TARGET_INSN_UNIT_SIZE 4
>> +#define TCG_TARGET_NB_REGS 32
>> +#define MAX_CODE_GEN_BUFFER_SIZE  ((size_t)-1)
>
> Is this SIZE_MAX?

I just did a quick grep across the tcg ports and found little similarity 
so far...

     aarch64      (2 * GiB)
     arm          UINT32_MAX
     i386         (2 * GiB)
     i386         UINT32_MAX
     loongarch64  ((size_t)-1)
     mips         (128 * MiB)
     ppc          (2 * GiB)
     ppc          (32 * MiB)
     riscv        ((size_t)-1)
     s390         (3 * GiB)
     sparc        (2 * GiB)
     tci          ((size_t)-1)

In that case, I think maybe SIZE_MAX would indeed be better for 
readability, so I'm going to change that...

>
>> +
>> +typedef enum {
>> +    TCG_REG_ZERO,
>> +    TCG_REG_RA,
>> +    TCG_REG_TP,
>> +    TCG_REG_SP,
>> +    TCG_REG_A0,
>> +    TCG_REG_A1,
>> +    TCG_REG_A2,
>> +    TCG_REG_A3,
>> +    TCG_REG_A4,
>> +    TCG_REG_A5,
>> +    TCG_REG_A6,
>> +    TCG_REG_A7,
>> +    TCG_REG_T0,
>> +    TCG_REG_T1,
>> +    TCG_REG_T2,
>> +    TCG_REG_T3,
>> +    TCG_REG_T4,
>> +    TCG_REG_T5,
>> +    TCG_REG_T6,
>> +    TCG_REG_T7,
>> +    TCG_REG_T8,
>> +    TCG_REG_RESERVED,
>> +    TCG_REG_S9,
>> +    TCG_REG_S0,
>> +    TCG_REG_S1,
>> +    TCG_REG_S2,
>> +    TCG_REG_S3,
>> +    TCG_REG_S4,
>> +    TCG_REG_S5,
>> +    TCG_REG_S6,
>> +    TCG_REG_S7,
>> +    TCG_REG_S8,
>
> Here could go:
>
>        TCG_TARGET_NB_REGS,
Good idea, something no other TCG ports has done... maybe we could 
refactor them all to avoid a little redundancy. I'll do this in v4.
>
>> +
>> +    /* aliases */
>> +    TCG_AREG0    = TCG_REG_S0,
>> +    TCG_REG_TMP0 = TCG_REG_T8,
>> +    TCG_REG_TMP1 = TCG_REG_T7,
>> +    TCG_REG_TMP2 = TCG_REG_T6,
>> +} TCGReg;
>
> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 04/30] tcg/loongarch64: Add generated instruction opcodes and encoding helpers
  2021-09-22 18:37   ` Philippe Mathieu-Daudé
@ 2021-09-22 18:51     ` WANG Xuerui
  2021-09-22 19:32       ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:51 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, qemu-devel
  Cc: Peter Maydell, Richard Henderson, Laurent Vivier

Hi Philippe,

On 9/23/21 02:37, Philippe Mathieu-Daudé wrote:
> On 9/22/21 20:09, WANG Xuerui wrote:
>> <snip>
>
> The generated code ...
>
>> <snip>
>
> ... ends here, right? ...
>
>> <snip>
>
> ... or are these helpers also generated?

I actively hate writing these by hand, so of course the whole file is 
auto-generated with some quick Go scripting in that repo. Even including 
auto-formatting with clang-format so I could directly pipe the output 
into this file...



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 09/30] tcg/loongarch64: Implement tcg_out_mov and tcg_out_movi
  2021-09-22 18:09 ` [PATCH v3 09/30] tcg/loongarch64: Implement tcg_out_mov and tcg_out_movi WANG Xuerui
  2021-09-22 18:39   ` Richard Henderson
@ 2021-09-22 18:51   ` Richard Henderson
  2021-09-23 15:38     ` WANG Xuerui
  2021-09-23 16:50   ` Richard Henderson
  2 siblings, 1 reply; 53+ messages in thread
From: Richard Henderson @ 2021-09-22 18:51 UTC (permalink / raw)
  To: WANG Xuerui, qemu-devel
  Cc: Peter Maydell, Philippe Mathieu-Daudé, Laurent Vivier

On 9/22/21 11:09 AM, WANG Xuerui wrote:
> +    if (sextreg(val, 0, 52) == val) {
> +        /*
> +         * Fits in 52-bits, upper bits are already properly sign-extended by
> +         * cu32i.d.
> +         */
> +        return;
> +    }
> +    tcg_out_opc_cu52i_d(s, rd, rd, top);
> +}

Oh, future improvement: constants with 52 low zero bits can be loaded with cu52i(rd, zero, 
val >> 52).

Which means there's a set of interesting constants:

   abc0_0000_0000_0def

	ori	rd, zero, 0xdef
	cu52i	rd, rd, 0xabc

   abcf_ffff_ffff_fdef

	cu52i	rd, zero, 0xabc - 1
	addi.d	rd, rd, 0xdef

Also,

> +    tcg_out_opc_lu12i_w(s, rd, upper);
> +    if (low != 0) {
> +        tcg_out_opc_ori(s, rd, rd, low & 0xfff);
> +    }

when upper == 0 and low != 0, we can omit the lu12i.


r~


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 07/30] tcg/loongarch64: Implement necessary relocation operations
  2021-09-22 18:41   ` Philippe Mathieu-Daudé
@ 2021-09-22 18:55     ` WANG Xuerui
  0 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 18:55 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, WANG Xuerui, qemu-devel
  Cc: Peter Maydell, Richard Henderson, Laurent Vivier

Hi Philippe,

On 9/23/21 02:41, Philippe Mathieu-Daudé wrote:
> On 9/22/21 20:09, WANG Xuerui wrote:
>> Signed-off-by: WANG Xuerui <git@xen0n.name>
>> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>   tcg/loongarch64/tcg-target.c.inc | 66 ++++++++++++++++++++++++++++++++
>>   1 file changed, 66 insertions(+)
>
>> +/* Field Sk16, shifted right by 2; suitable for conditional jumps */
>> +#define R_LOONGARCH_BR_SK16     256
>> +/* Field Sd10k16, shifted right by 2; suitable for B and BL */
>> +#define R_LOONGARCH_BR_SD10K16  257
>> +
>> +static bool reloc_br_sk16(tcg_insn_unit *src_rw, const tcg_insn_unit 
>> *target)
>> +{
>> +    const tcg_insn_unit *src_rx = tcg_splitwx_to_rx(src_rw);
>> +    intptr_t offset = (intptr_t)target - (intptr_t)src_rx;
>> +
>> +    tcg_debug_assert((offset & 3) == 0);
>> +    offset >>= 2;
>> +    if (offset == sextreg(offset, 0, 16)) {
>> +        *src_rw |= (offset << 10) & 0x3fffc00;
>
> Alternatively easier to read:
>
>            *src_rw = deposit64(*src_rw , 10, 16, offset);
Hmm, this helper could surely be reused at multiple places... I'll try 
to replace other operations like this in v4.
>
>> +        return true;
>> +    }
>> +
>> +    return false;
>> +}


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 03/30] tcg/loongarch64: Add the tcg-target.h file
  2021-09-22 18:34   ` Philippe Mathieu-Daudé
  2021-09-22 18:47     ` WANG Xuerui
@ 2021-09-22 18:58     ` Richard Henderson
  2021-09-23 10:35       ` Philippe Mathieu-Daudé
  1 sibling, 1 reply; 53+ messages in thread
From: Richard Henderson @ 2021-09-22 18:58 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, WANG Xuerui, qemu-devel
  Cc: Peter Maydell, Laurent Vivier

On 9/22/21 11:34 AM, Philippe Mathieu-Daudé wrote:
> 
> Here could go:
> 
>         TCG_TARGET_NB_REGS,

Not a fan of putting NFOO in the enumeration.
It interferes with -Wenum stuff.


r~


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 09/30] tcg/loongarch64: Implement tcg_out_mov and tcg_out_movi
  2021-09-22 18:39   ` Richard Henderson
@ 2021-09-22 19:02     ` WANG Xuerui
  0 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-22 19:02 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: Peter Maydell, Philippe Mathieu-Daudé, Laurent Vivier

Hi Richard,

On 9/23/21 02:39, Richard Henderson wrote:
> On 9/22/21 11:09 AM, WANG Xuerui wrote:
>> +    if (pc_offset == (int32_t)pc_offset) {
>> +        tcg_target_long lo = sextreg(pc_offset, 0, 12);
>> +        tcg_target_long hi = pc_offset - lo;
>> +        tcg_out_opc_pcaddu12i(s, rd, hi >> 12);
>> +        tcg_out_opc_addi_d(s, rd, rd, lo);
>
> pc_offset = 0x7ffff800 will fail:
>
> lo = 0xfffffffffffff800
> hi = 0x0000000080000000
>
> but hi will be interpreted as negative by pcaddu12i.
>
> This is the same problem I pointed out with tcg_out_call, but with 
> different constants.
>
Hmm, I think I'll have to look into this, but not now (it's again 3 am 
here)... I'll try to come up with something tomorrow.
>
> r~


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 04/30] tcg/loongarch64: Add generated instruction opcodes and encoding helpers
  2021-09-22 18:51     ` WANG Xuerui
@ 2021-09-22 19:32       ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 53+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-09-22 19:32 UTC (permalink / raw)
  To: WANG Xuerui
  Cc: Peter Maydell, Richard Henderson,
	qemu-devel@nongnu.org Developers, Laurent Vivier

On Wed, Sep 22, 2021 at 8:51 PM WANG Xuerui <i.qemu@xen0n.name> wrote:
>
> Hi Philippe,
>
> On 9/23/21 02:37, Philippe Mathieu-Daudé wrote:
> > On 9/22/21 20:09, WANG Xuerui wrote:
> >> <snip>
> >
> > The generated code ...
> >
> >> <snip>
> >
> > ... ends here, right? ...
> >
> >> <snip>
> >
> > ... or are these helpers also generated?
>
> I actively hate writing these by hand, so of course the whole file is
> auto-generated with some quick Go scripting in that repo. Even including
> auto-formatting with clang-format so I could directly pipe the output
> into this file...

Awesome then :) Maybe add /* End of generated code */ comment at the
end to make it obvious?


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 03/30] tcg/loongarch64: Add the tcg-target.h file
  2021-09-22 18:58     ` Richard Henderson
@ 2021-09-23 10:35       ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 53+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-09-23 10:35 UTC (permalink / raw)
  To: Richard Henderson, WANG Xuerui, qemu-devel; +Cc: Peter Maydell, Laurent Vivier

On 9/22/21 20:58, Richard Henderson wrote:
> On 9/22/21 11:34 AM, Philippe Mathieu-Daudé wrote:
>>
>> Here could go:
>>
>>         TCG_TARGET_NB_REGS,
> 
> Not a fan of putting NFOO in the enumeration.
> It interferes with -Wenum stuff.

Oh good point... TIL :)


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 09/30] tcg/loongarch64: Implement tcg_out_mov and tcg_out_movi
  2021-09-22 18:51   ` Richard Henderson
@ 2021-09-23 15:38     ` WANG Xuerui
  0 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-23 15:38 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: Peter Maydell, Philippe Mathieu-Daudé, Laurent Vivier

Hi Richard,

On 9/23/21 02:51, Richard Henderson wrote:
> On 9/22/21 11:09 AM, WANG Xuerui wrote:
>> +    if (sextreg(val, 0, 52) == val) {
>> +        /*
>> +         * Fits in 52-bits, upper bits are already properly 
>> sign-extended by
>> +         * cu32i.d.
>> +         */
>> +        return;
>> +    }
>> +    tcg_out_opc_cu52i_d(s, rd, rd, top);
>> +}
>
> Oh, future improvement: constants with 52 low zero bits can be loaded 
> with cu52i(rd, zero, val >> 52).
>
> Which means there's a set of interesting constants:
>
>   abc0_0000_0000_0def
>
>     ori    rd, zero, 0xdef
>     cu52i    rd, rd, 0xabc
>
>   abcf_ffff_ffff_fdef
>
>     cu52i    rd, zero, 0xabc - 1
>     addi.d    rd, rd, 0xdef
I think I'll try to implement this in some kind of follow-up patch, yeah.
>
> Also,
>
>> +    tcg_out_opc_lu12i_w(s, rd, upper);
>> +    if (low != 0) {
>> +        tcg_out_opc_ori(s, rd, rd, low & 0xfff);
>> +    }
>
> when upper == 0 and low != 0, we can omit the lu12i.
>
This optimization can't be blindly implemented though; because cu32i.d 
only takes one input register, in case upper == low == 0 but higher != 
0, we would have no proper input. So in that case something like "move 
rd, zero" is necessary, and logic is a little bit complicated. I'll 
include this in v4 though.
>
> r~


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 09/30] tcg/loongarch64: Implement tcg_out_mov and tcg_out_movi
  2021-09-22 18:09 ` [PATCH v3 09/30] tcg/loongarch64: Implement tcg_out_mov and tcg_out_movi WANG Xuerui
  2021-09-22 18:39   ` Richard Henderson
  2021-09-22 18:51   ` Richard Henderson
@ 2021-09-23 16:50   ` Richard Henderson
  2021-09-24 15:08     ` WANG Xuerui
  2 siblings, 1 reply; 53+ messages in thread
From: Richard Henderson @ 2021-09-23 16:50 UTC (permalink / raw)
  To: WANG Xuerui, qemu-devel
  Cc: Peter Maydell, Philippe Mathieu-Daudé, Laurent Vivier

On 9/22/21 11:09 AM, WANG Xuerui wrote:

Following up on previous, I suggest:

> +static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd,
> +                         tcg_target_long val)
> +{
> +    if (type == TCG_TYPE_I32) {
> +        val = (int32_t)val;
> +    }
> +
> +    /* Single-instruction cases.  */
> +    tcg_target_long low = sextreg(val, 0, 12);
> +    if (low == val) {
> +        /* val fits in simm12: addi.w rd, zero, val */
> +        tcg_out_opc_addi_w(s, rd, TCG_REG_ZERO, val);
> +        return;
> +    }
> +    if (0x800 <= val && val <= 0xfff) {
> +        /* val fits in uimm12: ori rd, zero, val */
> +        tcg_out_opc_ori(s, rd, TCG_REG_ZERO, val);
> +        return;
> +    }

> +    /* Test for PC-relative values that can be loaded faster.  */
> +    intptr_t pc_offset = tcg_pcrel_diff(s, (void *)val);
> +    if (pc_offset == sextreg(pc_offset, 0, 22) && (pc_offset & 3) == 0) {
> +        tcg_out_opc_pcaddu2i(s, rd, pc_offset >> 2);
> +        return;
> +    }

     /* Handle all 32-bit constants. */
     if (val == (int32_t)val) {
         tcg_out_opc_lu12i(s, rd, val >> 12);
         if (low) {
             tcg_out_opc_ori(s, rd, rd, val & 0xfff);
         }
         return;
     }

     /* Handle pc-relative values requiring 2 instructions. */
     intptr_t pc_lo = sextract64(pc_offset, 0, 12);
     intptr_t pc_hi = pc_offset - pc_low;
     if (pc_hi == (int32_t)pc_hi) {
         tcg_out_opc_pcaddu12i(s, rd, pc_hi >> 12);
         tcg_out_opc_addi_d(s, rd, rd, pc_lo);
         return;
     }

     /*
      * Choose signed low part if bit 13 is also set,
      * which gives us a chance of making more zeros.
      * Otherwise, let low be unsigned.
      */
     if ((val & 0x1800) != 0x1800) {
         low = val & 0xfff;
     }
     val -= low;

     tcg_target_long hi20 = sextract64(val, 12, 20);
     tcg_target_long hi32 = sextract64(val, 32, 20);
     tcg_target_long hi52 = sextract64(val, 52, 12);

     /*
      * If we can use the sign-extension of a previous
      * operation, suppress higher -1.
      */
     if (hi32 < 0 && hi52 == -1) {
         hi52 = 0;
     }
     if (hi20 < 0 && hi32 == -1) {
         hi32 = 0;
     }

     /* Initialize RD with the least non-zero component. */
     if (hi20) {
         tcg_out_opc_lu12i_w(s, rd, hi20 >> 12);
     } else if (hi32) {
         /* CU32I_D is modify in place, so RD must be initialized. */
         if (low < 0) {
             tcg_out_opc_addi_w(s, rd, TCG_REG_ZERO, low);
         } else {
             tcg_out_opc_ori(s, rd, TCG_REG_ZERO, low);
         }
         low = 0;
     } else {
         tcg_out_opc_cu52i_d(s, rd, TCG_REG_ZERO, hi52);
         hi52 = 0;
     }

     /* Assume that lu12i + ori are fusable */
     if (low > 0) {
         tcg_out_opc_ori(s, rd, rd, low);
     }

     /* Set the high 32 bits */
     if (hi32) {
         tcg_out_opc_cu32i_d(s, rd, hi32);
     }
     if (hi52) {
         tcg_out_opc_cu52i(s, rd, rd, hi52);
     }

     /*
      * Note that any subtraction must come last,
      * because cu32i and cu52i overwrite high bits,
      * and we have computed them as val - low.
      */
     if (low < 0) {
         tcg_out_opc_addi_d(s, rd, rd, low);
     }

Untested, and all bugs are mine, of course.

Try "qemu-system-ppc64 -D z -d in_asm,op_opt,out_asm".
You should see some masking constants like

  ---- 000000001daf2898
  and_i64 CA,r9,$0x7fffffffffffffff        dead: 2  pref=0xffff

   cu52i.d rd, zero, 0x800
   addi.d  rd, rd, -1

  ---- 000000001db0775c
  mov_i64 r26,$0x300000002                 sync: 0  dead: 0 1  pref=0xffff

   ori     rd, zero, 2
   cu32i   rd, 3


r~


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 23/30] tcg/loongarch64: Add softmmu load/store helpers,  implement qemu_ld/qemu_st ops
  2021-09-22 18:09 ` [PATCH v3 23/30] tcg/loongarch64: Add softmmu load/store helpers, implement qemu_ld/qemu_st ops WANG Xuerui
@ 2021-09-23 17:25   ` Richard Henderson
  0 siblings, 0 replies; 53+ messages in thread
From: Richard Henderson @ 2021-09-23 17:25 UTC (permalink / raw)
  To: WANG Xuerui, qemu-devel
  Cc: Peter Maydell, Philippe Mathieu-Daudé, Laurent Vivier

> +        /* fallthrough */
> +    default:
> +        tcg_out_mov(s, size == MO_64, l->datalo_reg, TCG_REG_A0);
> +        break;

Here in tcg_out_qemu_ld_slow_path, "size == MO_64" is "type".

> +    /* TLB Hit - translate address using addend.  */
> +    if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
> +        tcg_out_ext32u(s, TCG_REG_TMP0, addrl);
> +        addrl = TCG_REG_TMP0;
> +    }
> +    tcg_out_opc_add_d(s, TCG_REG_TMP0, TCG_REG_TMP2, addrl);

Note for future optimization: Unlike RISC-V, LoongArch has indexed addressing, and we 
should make use of it to eliminate this final add ...

> +    if (USE_GUEST_BASE) {
> +        tcg_out_opc_add_d(s, base, TCG_GUEST_BASE_REG, addr_regl);

... as well as these adds.

Compare tcg/ppc/ or tcg/sparc/, both of which always use indexed addressing (and indeed, 
their reverse-endian memory ops do not have an offset form):

     tcg_out_ldst_rr(s, data, addr,
                     (guest_base ? TCG_GUEST_BASE_REG : TCG_REG_G0),
                     qemu_ld_opc[memop & (MO_BSWAP | MO_SSIZE)]);

(It is not useful to compare tcg/aarch64/, which cannot represent "zero" with indexed 
addressing, and so has to swap between offset and indexed addressing, depending on the 
size of the guest address space and guest_base == 0.)

Oops, I've just noticed that the !CONFIG_SOFTMMU case is not zero-extending the guest 
address for TARGET_LONG_BITS == 32.


r~


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 09/30] tcg/loongarch64: Implement tcg_out_mov and tcg_out_movi
  2021-09-23 16:50   ` Richard Henderson
@ 2021-09-24 15:08     ` WANG Xuerui
  2021-09-24 15:53       ` Richard Henderson
  0 siblings, 1 reply; 53+ messages in thread
From: WANG Xuerui @ 2021-09-24 15:08 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: Peter Maydell, Philippe Mathieu-Daudé, Laurent Vivier

Hi Richard,

On 9/24/21 00:50, Richard Henderson wrote:
> On 9/22/21 11:09 AM, WANG Xuerui wrote:
>
> Following up on previous, I suggest:
>
>> +static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd,
>> +                         tcg_target_long val)
>> +{
>> +    if (type == TCG_TYPE_I32) {
>> +        val = (int32_t)val;
>> +    }
>> +
>> +    /* Single-instruction cases.  */
>> +    tcg_target_long low = sextreg(val, 0, 12);
>> +    if (low == val) {
>> +        /* val fits in simm12: addi.w rd, zero, val */
>> +        tcg_out_opc_addi_w(s, rd, TCG_REG_ZERO, val);
>> +        return;
>> +    }
>> +    if (0x800 <= val && val <= 0xfff) {
>> +        /* val fits in uimm12: ori rd, zero, val */
>> +        tcg_out_opc_ori(s, rd, TCG_REG_ZERO, val);
>> +        return;
>> +    }
>
>> +    /* Test for PC-relative values that can be loaded faster.  */
>> +    intptr_t pc_offset = tcg_pcrel_diff(s, (void *)val);
>> +    if (pc_offset == sextreg(pc_offset, 0, 22) && (pc_offset & 3) == 
>> 0) {
>> +        tcg_out_opc_pcaddu2i(s, rd, pc_offset >> 2);
>> +        return;
>> +    }
>
>     /* Handle all 32-bit constants. */
>     if (val == (int32_t)val) {
>         tcg_out_opc_lu12i(s, rd, val >> 12);
>         if (low) {
>             tcg_out_opc_ori(s, rd, rd, val & 0xfff);
>         }
>         return;
>     }
>
>     /* Handle pc-relative values requiring 2 instructions. */
>     intptr_t pc_lo = sextract64(pc_offset, 0, 12);
>     intptr_t pc_hi = pc_offset - pc_low;
>     if (pc_hi == (int32_t)pc_hi) {
>         tcg_out_opc_pcaddu12i(s, rd, pc_hi >> 12);
>         tcg_out_opc_addi_d(s, rd, rd, pc_lo);
>         return;
>     }
>
>     /*
>      * Choose signed low part if bit 13 is also set,
>      * which gives us a chance of making more zeros.
>      * Otherwise, let low be unsigned.
>      */
>     if ((val & 0x1800) != 0x1800) {
>         low = val & 0xfff;
>     }
>     val -= low;
>
>     tcg_target_long hi20 = sextract64(val, 12, 20);
>     tcg_target_long hi32 = sextract64(val, 32, 20);
>     tcg_target_long hi52 = sextract64(val, 52, 12);
>
>     /*
>      * If we can use the sign-extension of a previous
>      * operation, suppress higher -1.
>      */
>     if (hi32 < 0 && hi52 == -1) {
>         hi52 = 0;
>     }
>     if (hi20 < 0 && hi32 == -1) {
>         hi32 = 0;
>     }
>
>     /* Initialize RD with the least non-zero component. */
>     if (hi20) {
>         tcg_out_opc_lu12i_w(s, rd, hi20 >> 12);
>     } else if (hi32) {
>         /* CU32I_D is modify in place, so RD must be initialized. */
>         if (low < 0) {
>             tcg_out_opc_addi_w(s, rd, TCG_REG_ZERO, low);
>         } else {
>             tcg_out_opc_ori(s, rd, TCG_REG_ZERO, low);
>         }
>         low = 0;
>     } else {
>         tcg_out_opc_cu52i_d(s, rd, TCG_REG_ZERO, hi52);
>         hi52 = 0;
>     }
>
>     /* Assume that lu12i + ori are fusable */
>     if (low > 0) {
>         tcg_out_opc_ori(s, rd, rd, low);
>     }
>
>     /* Set the high 32 bits */
>     if (hi32) {
>         tcg_out_opc_cu32i_d(s, rd, hi32);
>     }
>     if (hi52) {
>         tcg_out_opc_cu52i(s, rd, rd, hi52);
>     }
>
>     /*
>      * Note that any subtraction must come last,
>      * because cu32i and cu52i overwrite high bits,
>      * and we have computed them as val - low.
>      */
>     if (low < 0) {
>         tcg_out_opc_addi_d(s, rd, rd, low);
>     }
>
> Untested, and all bugs are mine, of course.
>
> Try "qemu-system-ppc64 -D z -d in_asm,op_opt,out_asm".
> You should see some masking constants like
>
>  ---- 000000001daf2898
>  and_i64 CA,r9,$0x7fffffffffffffff        dead: 2  pref=0xffff
>
>   cu52i.d rd, zero, 0x800
>   addi.d  rd, rd, -1
>
>  ---- 000000001db0775c
>  mov_i64 r26,$0x300000002                 sync: 0  dead: 0 1 pref=0xffff
>
>   ori     rd, zero, 2
>   cu32i   rd, 3
>
Oops, for some reason I only received this at about 8 pm... I'll of 
course take advantage of the Saturday and compare the generated code for 
the cases, hopefully incorporating some of your ideas presented here. 
Thanks for the detailed reply!
>
> r~


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 09/30] tcg/loongarch64: Implement tcg_out_mov and tcg_out_movi
  2021-09-24 15:08     ` WANG Xuerui
@ 2021-09-24 15:53       ` Richard Henderson
  2021-09-24 16:26         ` WANG Xuerui
  0 siblings, 1 reply; 53+ messages in thread
From: Richard Henderson @ 2021-09-24 15:53 UTC (permalink / raw)
  To: WANG Xuerui, qemu-devel

On 9/24/21 11:08 AM, WANG Xuerui wrote:
> Oops, for some reason I only received this at about 8 pm...

That was my fault.  I wrote a bunch of stuff off-line yesterday while traveling, and the 
mail queue only flushed this morning.

I'll note there's a bug in my example code wrt initializing rd with addi, then overwriting 
with cu32i.d.

I like your v4 version of movi, with the high-bit-set predicate.  The only case I can 
think of that you miss is e.g. 0x7fffffffffffffff, which can be

	addi.w	rd, zero, -1
	cu52i.d	rd, rd, 0x7ff

One possibility is to extract a subroutine:

static void tcg_out_movi_i32(TCGContext *s, TCGReg rd, int32_t val)
{
     /* Single instruction cases */
     /* else lu12i.w + ori */
}

static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd,
                          tcg_target_long val)
{
     if (type == TCG_TYPE_I32 || val == (int32_t)val) {
         tcg_out_movi_i32(s, rd, val);
         return;
     }

     /* PC-relative cases */

     if (ctz64(val) >= 52) {
         tcg_out_opc_cu52i_d(s, rd, TCG_REG_ZERO, val >> 52);
         return;
     }

     /* Slow path.  Initialize the low 32-bits, then concat high bits. */
     tcg_out_movi_i32(s, rd, val);

     rd_high_bits_are_ones = (int32_t)val < 0);

     /* Your imm_part_needs_loading checks; rd is always written. */
}


r~


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v3 09/30] tcg/loongarch64: Implement tcg_out_mov and tcg_out_movi
  2021-09-24 15:53       ` Richard Henderson
@ 2021-09-24 16:26         ` WANG Xuerui
  0 siblings, 0 replies; 53+ messages in thread
From: WANG Xuerui @ 2021-09-24 16:26 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel

Hi Richard,

On 9/24/21 23:53, Richard Henderson wrote:
> On 9/24/21 11:08 AM, WANG Xuerui wrote:
>> Oops, for some reason I only received this at about 8 pm...
>
> That was my fault.  I wrote a bunch of stuff off-line yesterday while 
> traveling, and the mail queue only flushed this morning.
>
> I'll note there's a bug in my example code wrt initializing rd with 
> addi, then overwriting with cu32i.d.
>
> I like your v4 version of movi, with the high-bit-set predicate. The 
> only case I can think of that you miss is e.g. 0x7fffffffffffffff, 
> which can be
>
>     addi.w    rd, zero, -1
>     cu52i.d    rd, rd, 0x7ff
>
> One possibility is to extract a subroutine:
>
> static void tcg_out_movi_i32(TCGContext *s, TCGReg rd, int32_t val)
> {
>     /* Single instruction cases */
>     /* else lu12i.w + ori */
> }
>
> static void tcg_out_movi(TCGContext *s, TCGType type, TCGReg rd,
>                          tcg_target_long val)
> {
>     if (type == TCG_TYPE_I32 || val == (int32_t)val) {
>         tcg_out_movi_i32(s, rd, val);
>         return;
>     }
>
>     /* PC-relative cases */
>
>     if (ctz64(val) >= 52) {
>         tcg_out_opc_cu52i_d(s, rd, TCG_REG_ZERO, val >> 52);
>         return;
>     }
>
>     /* Slow path.  Initialize the low 32-bits, then concat high bits. */
>     tcg_out_movi_i32(s, rd, val);
>
>     rd_high_bits_are_ones = (int32_t)val < 0);
>
>     /* Your imm_part_needs_loading checks; rd is always written. */
> }
>
This is so impressive understanding of the LoongArch assembly, working 
around cu32i.d's single operand limitation so nicely. I'll just 
(shamelessly) take this for v5, thanks again!
>
> r~


^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2021-09-24 16:28 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-22 18:08 [PATCH v3 00/30] LoongArch64 port of QEMU TCG WANG Xuerui
2021-09-22 18:08 ` [PATCH v3 01/30] elf: Add machine type value for LoongArch WANG Xuerui
2021-09-22 18:17   ` Richard Henderson
2021-09-22 18:23   ` Philippe Mathieu-Daudé
2021-09-22 18:08 ` [PATCH v3 02/30] MAINTAINERS: Add tcg/loongarch64 entry with myself as maintainer WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 03/30] tcg/loongarch64: Add the tcg-target.h file WANG Xuerui
2021-09-22 18:34   ` Philippe Mathieu-Daudé
2021-09-22 18:47     ` WANG Xuerui
2021-09-22 18:58     ` Richard Henderson
2021-09-23 10:35       ` Philippe Mathieu-Daudé
2021-09-22 18:09 ` [PATCH v3 04/30] tcg/loongarch64: Add generated instruction opcodes and encoding helpers WANG Xuerui
2021-09-22 18:37   ` Philippe Mathieu-Daudé
2021-09-22 18:51     ` WANG Xuerui
2021-09-22 19:32       ` Philippe Mathieu-Daudé
2021-09-22 18:09 ` [PATCH v3 05/30] tcg/loongarch64: Add register names, allocation order and input/output sets WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 06/30] tcg/loongarch64: Define the operand constraints WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 07/30] tcg/loongarch64: Implement necessary relocation operations WANG Xuerui
2021-09-22 18:41   ` Philippe Mathieu-Daudé
2021-09-22 18:55     ` WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 08/30] tcg/loongarch64: Implement the memory barrier op WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 09/30] tcg/loongarch64: Implement tcg_out_mov and tcg_out_movi WANG Xuerui
2021-09-22 18:39   ` Richard Henderson
2021-09-22 19:02     ` WANG Xuerui
2021-09-22 18:51   ` Richard Henderson
2021-09-23 15:38     ` WANG Xuerui
2021-09-23 16:50   ` Richard Henderson
2021-09-24 15:08     ` WANG Xuerui
2021-09-24 15:53       ` Richard Henderson
2021-09-24 16:26         ` WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 10/30] tcg/loongarch64: Implement goto_ptr WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 11/30] tcg/loongarch64: Implement sign-/zero-extension ops WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 12/30] tcg/loongarch64: Implement not/and/or/xor/nor/andc/orc ops WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 13/30] tcg/loongarch64: Implement deposit/extract ops WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 14/30] tcg/loongarch64: Implement bswap{16,32,64} ops WANG Xuerui
2021-09-22 18:23   ` Richard Henderson
2021-09-22 18:09 ` [PATCH v3 15/30] tcg/loongarch64: Implement clz/ctz ops WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 16/30] tcg/loongarch64: Implement shl/shr/sar/rotl/rotr ops WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 17/30] tcg/loongarch64: Implement add/sub ops WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 18/30] tcg/loongarch64: Implement mul/mulsh/muluh/div/divu/rem/remu ops WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 19/30] tcg/loongarch64: Implement br/brcond ops WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 20/30] tcg/loongarch64: Implement setcond ops WANG Xuerui
2021-09-22 18:25   ` Richard Henderson
2021-09-22 18:09 ` [PATCH v3 21/30] tcg/loongarch64: Implement tcg_out_call WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 22/30] tcg/loongarch64: Implement simple load/store ops WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 23/30] tcg/loongarch64: Add softmmu load/store helpers, implement qemu_ld/qemu_st ops WANG Xuerui
2021-09-23 17:25   ` Richard Henderson
2021-09-22 18:09 ` [PATCH v3 24/30] tcg/loongarch64: Implement tcg_target_qemu_prologue WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 25/30] tcg/loongarch64: Implement exit_tb/goto_tb WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 26/30] tcg/loongarch64: Implement tcg_target_init WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 27/30] tcg/loongarch64: Register the JIT WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 28/30] linux-user: Add safe syscall handling for loongarch64 hosts WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 29/30] accel/tcg/user-exec: Implement CPU-specific signal handler " WANG Xuerui
2021-09-22 18:09 ` [PATCH v3 30/30] configure, meson.build: Mark support " WANG Xuerui

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).