qemu-devel.nongnu.org archive mirror
* [PATCH v3 00/21] Adding partial support for 128-bit riscv target
@ 2021-10-19  9:47 Frédéric Pétrot
  2021-10-19  9:47 ` [PATCH v3 01/21] memory: change define name for consistency Frédéric Pétrot
                   ` (20 more replies)
  0 siblings, 21 replies; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-19  9:47 UTC
  To: qemu-devel, qemu-riscv
  Cc: bin.meng, richard.henderson, alistair.francis, fabien.portas,
	palmer, Frédéric Pétrot, philmd

This patch series adds partial 128-bit support to the riscv target
architecture, namely RVI and RVM, with minimal CSR support.

This v3 is based on Richard's proposal for correctly handling the various
register sizes (v4 of his series).
Compared to v2, it simplifies the API somewhat and reuses existing
generation functions where wrappers were previously needed.
It should also correctly handle 128-bit sign extension when RV128 runs
as RV32 or RV64, but I did not (and could not) run any tests to confirm
this.
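
The sign-extension requirement mentioned above can be illustrated with a
small standalone sketch. It is not code from this series: the Reg128 type
and the reg128_from_* helpers are hypothetical names, and the model simply
assumes that a 128-bit register is kept as two 64-bit halves, as the patch
title "array for the 64 upper bits of 128-bit registers" suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one 128-bit register kept as two 64-bit halves. */
typedef struct {
    uint64_t lo;   /* bits 0..63 */
    uint64_t hi;   /* bits 64..127 */
} Reg128;

/* Fill the upper half with copies of bit 63 of the lower half. */
static Reg128 reg128_from_sext64(uint64_t lo)
{
    Reg128 r;
    r.lo = lo;
    r.hi = (lo >> 63) ? UINT64_MAX : 0;
    return r;
}

/* Sign-extend a 32-bit result through all 128 bits. */
static Reg128 reg128_from_sext32(uint32_t v)
{
    uint64_t lo = v;
    if (v & 0x80000000u) {
        lo |= 0xFFFFFFFF00000000ull;   /* replicate bit 31 into bits 32..63 */
    }
    return reg128_from_sext64(lo);
}
```

Under these assumptions, a 32-bit result with bit 31 set ends up with an
all-ones upper half, while a non-negative result leaves the upper half
zero. The upper half is derived from the lower half rather than computed
separately, so the two can never disagree.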

Based-on: 20211019000108.3678724-1-richard.henderson@linaro.org

Frédéric Pétrot (21):
  memory: change define name for consistency
  memory: add a few defines for octo (128-bit) values
  Int128.h: addition of a few 128-bit operations
  target/riscv: additional macros to check instruction support
  target/riscv: separation of bitwise logic and arithmetic helpers
  target/riscv: array for the 64 upper bits of 128-bit registers
  target/riscv: setup everything so that riscv128-softmmu compiles
  target/riscv: adding accessors to the registers upper part
  target/riscv: moving some insns close to similar insns
  target/riscv: support for 128-bit loads and store
  target/riscv: support for 128-bit bitwise instructions
  target/riscv: support for 128-bit U-type instructions
  target/riscv: support for 128-bit shift instructions
  target/riscv: support for 128-bit arithmetic instructions
  target/riscv: support for 128-bit M extension
  target/riscv: adding high part of some csrs
  target/riscv: helper functions to wrap calls to 128-bit csr insns
  target/riscv: modification of the trans_csrxx for 128-bit support
  target/riscv: actual functions to realize csr 128-bit insns
  target/riscv: adding 128-bit access functions for some csrs
  target/riscv: support for 128-bit satp

 configs/devices/riscv128-softmmu/default.mak |  17 +
 configs/targets/riscv128-softmmu.mak         |   6 +
 include/disas/dis-asm.h                      |   1 +
 include/exec/memop.h                         |  12 +-
 include/hw/riscv/sifive_cpu.h                |   3 +
 include/qemu/int128.h                        | 264 ++++++
 target/arm/translate-a32.h                   |   4 +-
 target/riscv/cpu-param.h                     |  10 +
 target/riscv/cpu.h                           |  32 +
 target/riscv/cpu_bits.h                      |  11 +
 target/riscv/helper.h                        |   9 +
 target/riscv/insn16.decode                   |  32 +-
 target/riscv/insn32.decode                   |  24 +
 disas/riscv.c                                |   5 +
 target/arm/translate-a64.c                   |   8 +-
 target/arm/translate-neon.c                  |   6 +-
 target/arm/translate-sve.c                   |   2 +-
 target/arm/translate-vfp.c                   |   8 +-
 target/arm/translate.c                       |   2 +-
 target/ppc/translate.c                       |  24 +-
 target/riscv/cpu.c                           |  23 +-
 target/riscv/cpu_helper.c                    |  54 +-
 target/riscv/csr.c                           | 329 +++++++-
 target/riscv/gdbstub.c                       |   3 +
 target/riscv/m128_helper.c                   | 109 +++
 target/riscv/op_helper.c                     |  44 +
 target/riscv/translate.c                     | 308 ++++++-
 target/sparc/translate.c                     |   4 +-
 target/ppc/translate/fixedpoint-impl.c.inc   |  20 +-
 target/ppc/translate/fp-impl.c.inc           |   4 +-
 target/ppc/translate/vsx-impl.c.inc          |   4 +-
 target/riscv/insn_trans/trans_rvb.c.inc      |  48 +-
 target/riscv/insn_trans/trans_rvd.c.inc      |  12 +-
 target/riscv/insn_trans/trans_rvf.c.inc      |   6 +-
 target/riscv/insn_trans/trans_rvi.c.inc      | 803 ++++++++++++++++---
 target/riscv/insn_trans/trans_rvm.c.inc      | 273 ++++++-
 tcg/aarch64/tcg-target.c.inc                 |   2 +-
 tcg/arm/tcg-target.c.inc                     |  10 +-
 tcg/i386/tcg-target.c.inc                    |   4 +-
 tcg/mips/tcg-target.c.inc                    |   4 +-
 tcg/ppc/tcg-target.c.inc                     |   8 +-
 tcg/riscv/tcg-target.c.inc                   |   6 +-
 tcg/s390x/tcg-target.c.inc                   |  10 +-
 gdb-xml/riscv-128bit-cpu.xml                 |  48 ++
 gdb-xml/riscv-128bit-virtual.xml             |  12 +
 target/riscv/Kconfig                         |   3 +
 target/riscv/meson.build                     |   1 +
 47 files changed, 2358 insertions(+), 274 deletions(-)
 create mode 100644 configs/devices/riscv128-softmmu/default.mak
 create mode 100644 configs/targets/riscv128-softmmu.mak
 create mode 100644 target/riscv/m128_helper.c
 create mode 100644 gdb-xml/riscv-128bit-cpu.xml
 create mode 100644 gdb-xml/riscv-128bit-virtual.xml

-- 
2.33.0




* [PATCH v3 01/21] memory: change define name for consistency
  2021-10-19  9:47 [PATCH v3 00/21] Adding partial support for 128-bit riscv target Frédéric Pétrot
@ 2021-10-19  9:47 ` Frédéric Pétrot
  2021-10-20 15:07   ` Philippe Mathieu-Daudé
  2021-10-19  9:47 ` [PATCH v3 02/21] memory: add a few defines for octo (128-bit) values Frédéric Pétrot
                   ` (19 subsequent siblings)
  20 siblings, 1 reply; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-19  9:47 UTC
  To: qemu-devel, qemu-riscv
  Cc: bin.meng, richard.henderson, alistair.francis, fabien.portas,
	palmer, Frédéric Pétrot, philmd

Rename MO_Q to MO_UQ to avoid confusion and to match the other unsigned
names (MO_UB, MO_UW, MO_UL), as suggested by Philippe Mathieu-Daudé.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
---
 include/exec/memop.h                       |  8 ++++----
 target/arm/translate-a32.h                 |  4 ++--
 target/arm/translate-a64.c                 |  8 ++++----
 target/arm/translate-neon.c                |  6 +++---
 target/arm/translate-sve.c                 |  2 +-
 target/arm/translate-vfp.c                 |  8 ++++----
 target/arm/translate.c                     |  2 +-
 target/ppc/translate.c                     | 24 +++++++++++-----------
 target/sparc/translate.c                   |  4 ++--
 target/ppc/translate/fixedpoint-impl.c.inc | 20 +++++++++---------
 target/ppc/translate/fp-impl.c.inc         |  4 ++--
 target/ppc/translate/vsx-impl.c.inc        |  4 ++--
 tcg/aarch64/tcg-target.c.inc               |  2 +-
 tcg/arm/tcg-target.c.inc                   | 10 ++++-----
 tcg/i386/tcg-target.c.inc                  |  4 ++--
 tcg/mips/tcg-target.c.inc                  |  4 ++--
 tcg/ppc/tcg-target.c.inc                   |  8 ++++----
 tcg/riscv/tcg-target.c.inc                 |  6 +++---
 tcg/s390x/tcg-target.c.inc                 | 10 ++++-----
 19 files changed, 69 insertions(+), 69 deletions(-)

diff --git a/include/exec/memop.h b/include/exec/memop.h
index 04264ffd6b..c554bb0ee8 100644
--- a/include/exec/memop.h
+++ b/include/exec/memop.h
@@ -88,26 +88,26 @@ typedef enum MemOp {
     MO_SB    = MO_SIGN | MO_8,
     MO_SW    = MO_SIGN | MO_16,
     MO_SL    = MO_SIGN | MO_32,
-    MO_Q     = MO_64,
+    MO_UQ     = MO_64,
 
     MO_LEUW  = MO_LE | MO_UW,
     MO_LEUL  = MO_LE | MO_UL,
     MO_LESW  = MO_LE | MO_SW,
     MO_LESL  = MO_LE | MO_SL,
-    MO_LEQ   = MO_LE | MO_Q,
+    MO_LEQ   = MO_LE | MO_UQ,
 
     MO_BEUW  = MO_BE | MO_UW,
     MO_BEUL  = MO_BE | MO_UL,
     MO_BESW  = MO_BE | MO_SW,
     MO_BESL  = MO_BE | MO_SL,
-    MO_BEQ   = MO_BE | MO_Q,
+    MO_BEQ   = MO_BE | MO_UQ,
 
 #ifdef NEED_CPU_H
     MO_TEUW  = MO_TE | MO_UW,
     MO_TEUL  = MO_TE | MO_UL,
     MO_TESW  = MO_TE | MO_SW,
     MO_TESL  = MO_TE | MO_SL,
-    MO_TEQ   = MO_TE | MO_Q,
+    MO_TEQ   = MO_TE | MO_UQ,
 #endif
 
     MO_SSIZE = MO_SIZE | MO_SIGN,
diff --git a/target/arm/translate-a32.h b/target/arm/translate-a32.h
index 88f15df60e..ec0330ea0f 100644
--- a/target/arm/translate-a32.h
+++ b/target/arm/translate-a32.h
@@ -114,13 +114,13 @@ void gen_aa32_st_i64(DisasContext *s, TCGv_i64 val, TCGv_i32 a32,
 static inline void gen_aa32_ld64(DisasContext *s, TCGv_i64 val,
                                  TCGv_i32 a32, int index)
 {
-    gen_aa32_ld_i64(s, val, a32, index, MO_Q);
+    gen_aa32_ld_i64(s, val, a32, index, MO_UQ);
 }
 
 static inline void gen_aa32_st64(DisasContext *s, TCGv_i64 val,
                                  TCGv_i32 a32, int index)
 {
-    gen_aa32_st_i64(s, val, a32, index, MO_Q);
+    gen_aa32_st_i64(s, val, a32, index, MO_UQ);
 }
 
 DO_GEN_LD(8u, MO_UB)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index cec672f229..1411fdfb6f 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -973,7 +973,7 @@ static void do_fp_st(DisasContext *s, int srcidx, TCGv_i64 tcg_addr, int size)
 
         tcg_gen_ld_i64(tmphi, cpu_env, fp_reg_hi_offset(s, srcidx));
 
-        mop = s->be_data | MO_Q;
+        mop = s->be_data | MO_UQ;
         tcg_gen_qemu_st_i64(be ? tmphi : tmplo, tcg_addr, get_mem_index(s),
                             mop | (s->align_mem ? MO_ALIGN_16 : 0));
         tcg_gen_addi_i64(tcg_hiaddr, tcg_addr, 8);
@@ -1007,7 +1007,7 @@ static void do_fp_ld(DisasContext *s, int destidx, TCGv_i64 tcg_addr, int size)
         tmphi = tcg_temp_new_i64();
         tcg_hiaddr = tcg_temp_new_i64();
 
-        mop = s->be_data | MO_Q;
+        mop = s->be_data | MO_UQ;
         tcg_gen_qemu_ld_i64(be ? tmphi : tmplo, tcg_addr, get_mem_index(s),
                             mop | (s->align_mem ? MO_ALIGN_16 : 0));
         tcg_gen_addi_i64(tcg_hiaddr, tcg_addr, 8);
@@ -4099,10 +4099,10 @@ static void disas_ldst_tag(DisasContext *s, uint32_t insn)
         int i, n = (1 + is_pair) << LOG2_TAG_GRANULE;
 
         tcg_gen_qemu_st_i64(tcg_zero, clean_addr, mem_index,
-                            MO_Q | MO_ALIGN_16);
+                            MO_UQ | MO_ALIGN_16);
         for (i = 8; i < n; i += 8) {
             tcg_gen_addi_i64(clean_addr, clean_addr, 8);
-            tcg_gen_qemu_st_i64(tcg_zero, clean_addr, mem_index, MO_Q);
+            tcg_gen_qemu_st_i64(tcg_zero, clean_addr, mem_index, MO_UQ);
         }
         tcg_temp_free_i64(tcg_zero);
     }
diff --git a/target/arm/translate-neon.c b/target/arm/translate-neon.c
index dd43de558e..3854dd3516 100644
--- a/target/arm/translate-neon.c
+++ b/target/arm/translate-neon.c
@@ -73,7 +73,7 @@ static void neon_load_element64(TCGv_i64 var, int reg, int ele, MemOp mop)
     case MO_UL:
         tcg_gen_ld32u_i64(var, cpu_env, offset);
         break;
-    case MO_Q:
+    case MO_UQ:
         tcg_gen_ld_i64(var, cpu_env, offset);
         break;
     default:
@@ -1830,7 +1830,7 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a,
         return false;
     }
 
-    if ((a->vd & 1) || (src1_mop == MO_Q && (a->vn & 1))) {
+    if ((a->vd & 1) || (src1_mop == MO_UQ && (a->vn & 1))) {
         return false;
     }
 
@@ -1910,7 +1910,7 @@ static bool do_prewiden_3d(DisasContext *s, arg_3diff *a,
         };                                                              \
         int narrow_mop = a->size == MO_32 ? MO_32 | SIGN : -1;          \
         return do_prewiden_3d(s, a, widenfn[a->size], addfn[a->size],   \
-                              SRC1WIDE ? MO_Q : narrow_mop,             \
+                              SRC1WIDE ? MO_UQ : narrow_mop,             \
                               narrow_mop);                              \
     }
 
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index bc91a64171..86104b857e 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -5284,7 +5284,7 @@ static const MemOp dtype_mop[16] = {
     MO_UB, MO_UB, MO_UB, MO_UB,
     MO_SL, MO_UW, MO_UW, MO_UW,
     MO_SW, MO_SW, MO_UL, MO_UL,
-    MO_SB, MO_SB, MO_SB, MO_Q
+    MO_SB, MO_SB, MO_SB, MO_UQ
 };
 
 #define dtype_msz(x)  (dtype_mop[x] & MO_SIZE)
diff --git a/target/arm/translate-vfp.c b/target/arm/translate-vfp.c
index 59bcaec5be..17f796e32a 100644
--- a/target/arm/translate-vfp.c
+++ b/target/arm/translate-vfp.c
@@ -1170,11 +1170,11 @@ static bool trans_VLDR_VSTR_dp(DisasContext *s, arg_VLDR_VSTR_dp *a)
     addr = add_reg_for_lit(s, a->rn, offset);
     tmp = tcg_temp_new_i64();
     if (a->l) {
-        gen_aa32_ld_i64(s, tmp, addr, get_mem_index(s), MO_Q | MO_ALIGN_4);
+        gen_aa32_ld_i64(s, tmp, addr, get_mem_index(s), MO_UQ | MO_ALIGN_4);
         vfp_store_reg64(tmp, a->vd);
     } else {
         vfp_load_reg64(tmp, a->vd);
-        gen_aa32_st_i64(s, tmp, addr, get_mem_index(s), MO_Q | MO_ALIGN_4);
+        gen_aa32_st_i64(s, tmp, addr, get_mem_index(s), MO_UQ | MO_ALIGN_4);
     }
     tcg_temp_free_i64(tmp);
     tcg_temp_free_i32(addr);
@@ -1322,12 +1322,12 @@ static bool trans_VLDM_VSTM_dp(DisasContext *s, arg_VLDM_VSTM_dp *a)
     for (i = 0; i < n; i++) {
         if (a->l) {
             /* load */
-            gen_aa32_ld_i64(s, tmp, addr, get_mem_index(s), MO_Q | MO_ALIGN_4);
+            gen_aa32_ld_i64(s, tmp, addr, get_mem_index(s), MO_UQ | MO_ALIGN_4);
             vfp_store_reg64(tmp, a->vd + i);
         } else {
             /* store */
             vfp_load_reg64(tmp, a->vd + i);
-            gen_aa32_st_i64(s, tmp, addr, get_mem_index(s), MO_Q | MO_ALIGN_4);
+            gen_aa32_st_i64(s, tmp, addr, get_mem_index(s), MO_UQ | MO_ALIGN_4);
         }
         tcg_gen_addi_i32(addr, addr, offset);
     }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index d6af5b1b03..0390e9d48e 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -1220,7 +1220,7 @@ void read_neon_element64(TCGv_i64 dest, int reg, int ele, MemOp memop)
     case MO_UL:
         tcg_gen_ld32u_i64(dest, cpu_env, off);
         break;
-    case MO_Q:
+    case MO_UQ:
         tcg_gen_ld_i64(dest, cpu_env, off);
         break;
     default:
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index c3c6cb9589..8133f7dea0 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -3228,10 +3228,10 @@ GEN_QEMU_LOAD_64(ld8u,  DEF_MEMOP(MO_UB))
 GEN_QEMU_LOAD_64(ld16u, DEF_MEMOP(MO_UW))
 GEN_QEMU_LOAD_64(ld32u, DEF_MEMOP(MO_UL))
 GEN_QEMU_LOAD_64(ld32s, DEF_MEMOP(MO_SL))
-GEN_QEMU_LOAD_64(ld64,  DEF_MEMOP(MO_Q))
+GEN_QEMU_LOAD_64(ld64,  DEF_MEMOP(MO_UQ))
 
 #if defined(TARGET_PPC64)
-GEN_QEMU_LOAD_64(ld64ur, BSWAP_MEMOP(MO_Q))
+GEN_QEMU_LOAD_64(ld64ur, BSWAP_MEMOP(MO_UQ))
 #endif
 
 #define GEN_QEMU_STORE_TL(stop, op)                                     \
@@ -3262,10 +3262,10 @@ static void glue(gen_qemu_, glue(stop, _i64))(DisasContext *ctx,  \
 GEN_QEMU_STORE_64(st8,  DEF_MEMOP(MO_UB))
 GEN_QEMU_STORE_64(st16, DEF_MEMOP(MO_UW))
 GEN_QEMU_STORE_64(st32, DEF_MEMOP(MO_UL))
-GEN_QEMU_STORE_64(st64, DEF_MEMOP(MO_Q))
+GEN_QEMU_STORE_64(st64, DEF_MEMOP(MO_UQ))
 
 #if defined(TARGET_PPC64)
-GEN_QEMU_STORE_64(st64r, BSWAP_MEMOP(MO_Q))
+GEN_QEMU_STORE_64(st64r, BSWAP_MEMOP(MO_UQ))
 #endif
 
 #define GEN_LDX_E(name, ldop, opc2, opc3, type, type2, chk)                   \
@@ -3302,7 +3302,7 @@ GEN_LDEPX(lb, DEF_MEMOP(MO_UB), 0x1F, 0x02)
 GEN_LDEPX(lh, DEF_MEMOP(MO_UW), 0x1F, 0x08)
 GEN_LDEPX(lw, DEF_MEMOP(MO_UL), 0x1F, 0x00)
 #if defined(TARGET_PPC64)
-GEN_LDEPX(ld, DEF_MEMOP(MO_Q), 0x1D, 0x00)
+GEN_LDEPX(ld, DEF_MEMOP(MO_UQ), 0x1D, 0x00)
 #endif
 
 #if defined(TARGET_PPC64)
@@ -3411,7 +3411,7 @@ GEN_STEPX(stb, DEF_MEMOP(MO_UB), 0x1F, 0x06)
 GEN_STEPX(sth, DEF_MEMOP(MO_UW), 0x1F, 0x0C)
 GEN_STEPX(stw, DEF_MEMOP(MO_UL), 0x1F, 0x04)
 #if defined(TARGET_PPC64)
-GEN_STEPX(std, DEF_MEMOP(MO_Q), 0x1d, 0x04)
+GEN_STEPX(std, DEF_MEMOP(MO_UQ), 0x1d, 0x04)
 #endif
 
 #if defined(TARGET_PPC64)
@@ -3905,7 +3905,7 @@ static void gen_lwat(DisasContext *ctx)
 #ifdef TARGET_PPC64
 static void gen_ldat(DisasContext *ctx)
 {
-    gen_ld_atomic(ctx, DEF_MEMOP(MO_Q));
+    gen_ld_atomic(ctx, DEF_MEMOP(MO_UQ));
 }
 #endif
 
@@ -3988,7 +3988,7 @@ static void gen_stwat(DisasContext *ctx)
 #ifdef TARGET_PPC64
 static void gen_stdat(DisasContext *ctx)
 {
-    gen_st_atomic(ctx, DEF_MEMOP(MO_Q));
+    gen_st_atomic(ctx, DEF_MEMOP(MO_UQ));
 }
 #endif
 
@@ -4040,9 +4040,9 @@ STCX(stwcx_, DEF_MEMOP(MO_UL))
 
 #if defined(TARGET_PPC64)
 /* ldarx */
-LARX(ldarx, DEF_MEMOP(MO_Q))
+LARX(ldarx, DEF_MEMOP(MO_UQ))
 /* stdcx. */
-STCX(stdcx_, DEF_MEMOP(MO_Q))
+STCX(stdcx_, DEF_MEMOP(MO_UQ))
 
 /* lqarx */
 static void gen_lqarx(DisasContext *ctx)
@@ -8050,7 +8050,7 @@ GEN_LDEPX(lb, DEF_MEMOP(MO_UB), 0x1F, 0x02)
 GEN_LDEPX(lh, DEF_MEMOP(MO_UW), 0x1F, 0x08)
 GEN_LDEPX(lw, DEF_MEMOP(MO_UL), 0x1F, 0x00)
 #if defined(TARGET_PPC64)
-GEN_LDEPX(ld, DEF_MEMOP(MO_Q), 0x1D, 0x00)
+GEN_LDEPX(ld, DEF_MEMOP(MO_UQ), 0x1D, 0x00)
 #endif
 
 #undef GEN_STX_E
@@ -8076,7 +8076,7 @@ GEN_STEPX(stb, DEF_MEMOP(MO_UB), 0x1F, 0x06)
 GEN_STEPX(sth, DEF_MEMOP(MO_UW), 0x1F, 0x0C)
 GEN_STEPX(stw, DEF_MEMOP(MO_UL), 0x1F, 0x04)
 #if defined(TARGET_PPC64)
-GEN_STEPX(std, DEF_MEMOP(MO_Q), 0x1D, 0x04)
+GEN_STEPX(std, DEF_MEMOP(MO_UQ), 0x1D, 0x04)
 #endif
 
 #undef GEN_CRLOGIC
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index fdb8bbe5dc..7dfb33f867 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -2830,7 +2830,7 @@ static void gen_ldda_asi(DisasContext *dc, TCGv addr, int insn, int rd)
     default:
         {
             TCGv_i32 r_asi = tcg_const_i32(da.asi);
-            TCGv_i32 r_mop = tcg_const_i32(MO_Q);
+            TCGv_i32 r_mop = tcg_const_i32(MO_UQ);
 
             save_state(dc);
             gen_helper_ld_asi(t64, cpu_env, addr, r_asi, r_mop);
@@ -2886,7 +2886,7 @@ static void gen_stda_asi(DisasContext *dc, TCGv hi, TCGv addr,
     default:
         {
             TCGv_i32 r_asi = tcg_const_i32(da.asi);
-            TCGv_i32 r_mop = tcg_const_i32(MO_Q);
+            TCGv_i32 r_mop = tcg_const_i32(MO_UQ);
 
             save_state(dc);
             gen_helper_st_asi(cpu_env, addr, t64, r_asi, r_mop);
diff --git a/target/ppc/translate/fixedpoint-impl.c.inc b/target/ppc/translate/fixedpoint-impl.c.inc
index 2e2518ee15..33ce041d0b 100644
--- a/target/ppc/translate/fixedpoint-impl.c.inc
+++ b/target/ppc/translate/fixedpoint-impl.c.inc
@@ -131,11 +131,11 @@ TRANS64(LWAUX, do_ldst_X, true, false, MO_SL)
 TRANS64(PLWA, do_ldst_PLS_D, false, false, MO_SL)
 
 /* Load Doubleword */
-TRANS64(LD, do_ldst_D, false, false, MO_Q)
-TRANS64(LDX, do_ldst_X, false, false, MO_Q)
-TRANS64(LDU, do_ldst_D, true, false, MO_Q)
-TRANS64(LDUX, do_ldst_X, true, false, MO_Q)
-TRANS64(PLD, do_ldst_PLS_D, false, false, MO_Q)
+TRANS64(LD, do_ldst_D, false, false, MO_UQ)
+TRANS64(LDX, do_ldst_X, false, false, MO_UQ)
+TRANS64(LDU, do_ldst_D, true, false, MO_UQ)
+TRANS64(LDUX, do_ldst_X, true, false, MO_UQ)
+TRANS64(PLD, do_ldst_PLS_D, false, false, MO_UQ)
 
 /* Store Byte */
 TRANS(STB, do_ldst_D, false, true, MO_UB)
@@ -159,11 +159,11 @@ TRANS(STWUX, do_ldst_X, true, true, MO_UL)
 TRANS(PSTW, do_ldst_PLS_D, false, true, MO_UL)
 
 /* Store Doubleword */
-TRANS64(STD, do_ldst_D, false, true, MO_Q)
-TRANS64(STDX, do_ldst_X, false, true, MO_Q)
-TRANS64(STDU, do_ldst_D, true, true, MO_Q)
-TRANS64(STDUX, do_ldst_X, true, true, MO_Q)
-TRANS64(PSTD, do_ldst_PLS_D, false, true, MO_Q)
+TRANS64(STD, do_ldst_D, false, true, MO_UQ)
+TRANS64(STDX, do_ldst_X, false, true, MO_UQ)
+TRANS64(STDU, do_ldst_D, true, true, MO_UQ)
+TRANS64(STDUX, do_ldst_X, true, true, MO_UQ)
+TRANS64(PSTD, do_ldst_PLS_D, false, true, MO_UQ)
 
 /*
  * Fixed-Point Compare Instructions
diff --git a/target/ppc/translate/fp-impl.c.inc b/target/ppc/translate/fp-impl.c.inc
index 9f7868ee28..01b5c53bf4 100644
--- a/target/ppc/translate/fp-impl.c.inc
+++ b/target/ppc/translate/fp-impl.c.inc
@@ -974,7 +974,7 @@ static void gen_lfdepx(DisasContext *ctx)
     EA = tcg_temp_new();
     t0 = tcg_temp_new_i64();
     gen_addr_reg_index(ctx, EA);
-    tcg_gen_qemu_ld_i64(t0, EA, PPC_TLB_EPID_LOAD, DEF_MEMOP(MO_Q));
+    tcg_gen_qemu_ld_i64(t0, EA, PPC_TLB_EPID_LOAD, DEF_MEMOP(MO_UQ));
     set_fpr(rD(ctx->opcode), t0);
     tcg_temp_free(EA);
     tcg_temp_free_i64(t0);
@@ -1210,7 +1210,7 @@ static void gen_stfdepx(DisasContext *ctx)
     t0 = tcg_temp_new_i64();
     gen_addr_reg_index(ctx, EA);
     get_fpr(t0, rD(ctx->opcode));
-    tcg_gen_qemu_st_i64(t0, EA, PPC_TLB_EPID_STORE, DEF_MEMOP(MO_Q));
+    tcg_gen_qemu_st_i64(t0, EA, PPC_TLB_EPID_STORE, DEF_MEMOP(MO_UQ));
     tcg_temp_free(EA);
     tcg_temp_free_i64(t0);
 }
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 57a7f73bba..c1b1dde01c 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -162,8 +162,8 @@ static void gen_lxvdsx(DisasContext *ctx)
     gen_addr_reg_index(ctx, EA);
 
     data = tcg_temp_new_i64();
-    tcg_gen_qemu_ld_i64(data, EA, ctx->mem_idx, DEF_MEMOP(MO_Q));
-    tcg_gen_gvec_dup_i64(MO_Q, vsr_full_offset(xT(ctx->opcode)), 16, 16, data);
+    tcg_gen_qemu_ld_i64(data, EA, ctx->mem_idx, DEF_MEMOP(MO_UQ));
+    tcg_gen_gvec_dup_i64(MO_UQ, vsr_full_offset(xT(ctx->opcode)), 16, 16, data);
 
     tcg_temp_free(EA);
     tcg_temp_free_i64(data);
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 5edca8d44d..a8db553287 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -1744,7 +1744,7 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, MemOp memop, TCGType ext,
     case MO_SL:
         tcg_out_ldst_r(s, I3312_LDRSWX, data_r, addr_r, otype, off_r);
         break;
-    case MO_Q:
+    case MO_UQ:
         tcg_out_ldst_r(s, I3312_LDRX, data_r, addr_r, otype, off_r);
         break;
     default:
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 633b8a37ba..e31f454695 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -1443,13 +1443,13 @@ static void * const qemu_ld_helpers[MO_SSIZE + 1] = {
 #ifdef HOST_WORDS_BIGENDIAN
     [MO_UW] = helper_be_lduw_mmu,
     [MO_UL] = helper_be_ldul_mmu,
-    [MO_Q]  = helper_be_ldq_mmu,
+    [MO_UQ]  = helper_be_ldq_mmu,
     [MO_SW] = helper_be_ldsw_mmu,
     [MO_SL] = helper_be_ldul_mmu,
 #else
     [MO_UW] = helper_le_lduw_mmu,
     [MO_UL] = helper_le_ldul_mmu,
-    [MO_Q]  = helper_le_ldq_mmu,
+    [MO_UQ]  = helper_le_ldq_mmu,
     [MO_SW] = helper_le_ldsw_mmu,
     [MO_SL] = helper_le_ldul_mmu,
 #endif
@@ -1694,7 +1694,7 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
     default:
         tcg_out_mov_reg(s, COND_AL, datalo, TCG_REG_R0);
         break;
-    case MO_Q:
+    case MO_UQ:
         if (datalo != TCG_REG_R1) {
             tcg_out_mov_reg(s, COND_AL, datalo, TCG_REG_R0);
             tcg_out_mov_reg(s, COND_AL, datahi, TCG_REG_R1);
@@ -1781,7 +1781,7 @@ static void tcg_out_qemu_ld_index(TCGContext *s, MemOp opc,
     case MO_UL:
         tcg_out_ld32_r(s, COND_AL, datalo, addrlo, addend);
         break;
-    case MO_Q:
+    case MO_UQ:
         /* Avoid ldrd for user-only emulation, to handle unaligned.  */
         if (USING_SOFTMMU && use_armv6_instructions
             && (datalo & 1) == 0 && datahi == datalo + 1) {
@@ -1824,7 +1824,7 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, MemOp opc, TCGReg datalo,
     case MO_UL:
         tcg_out_ld32_12(s, COND_AL, datalo, addrlo, 0);
         break;
-    case MO_Q:
+    case MO_UQ:
         /* Avoid ldrd for user-only emulation, to handle unaligned.  */
         if (USING_SOFTMMU && use_armv6_instructions
             && (datalo & 1) == 0 && datahi == datalo + 1) {
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 84b109bb84..0b5d385ad6 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1827,7 +1827,7 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
     case MO_UL:
         tcg_out_mov(s, TCG_TYPE_I32, data_reg, TCG_REG_EAX);
         break;
-    case MO_Q:
+    case MO_UQ:
         if (TCG_TARGET_REG_BITS == 64) {
             tcg_out_mov(s, TCG_TYPE_I64, data_reg, TCG_REG_RAX);
         } else if (data_reg == TCG_REG_EDX) {
@@ -2019,7 +2019,7 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
         }
         break;
 #endif
-    case MO_Q:
+    case MO_UQ:
         if (TCG_TARGET_REG_BITS == 64) {
             tcg_out_modrm_sib_offset(s, movop + P_REXW + seg, datalo,
                                      base, index, 0, ofs);
diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
index d8f6914f03..15704c84fa 100644
--- a/tcg/mips/tcg-target.c.inc
+++ b/tcg/mips/tcg-target.c.inc
@@ -1384,7 +1384,7 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg lo, TCGReg hi,
     case MO_SL:
         tcg_out_opc_imm(s, OPC_LW, lo, base, 0);
         break;
-    case MO_Q | MO_BSWAP:
+    case MO_UQ | MO_BSWAP:
         if (TCG_TARGET_REG_BITS == 64) {
             if (use_mips32r2_instructions) {
                 tcg_out_opc_imm(s, OPC_LD, lo, base, 0);
@@ -1413,7 +1413,7 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg lo, TCGReg hi,
             tcg_out_mov(s, TCG_TYPE_I32, MIPS_BE ? hi : lo, TCG_TMP3);
         }
         break;
-    case MO_Q:
+    case MO_UQ:
         /* Prefer to load from offset 0 first, but allow for overlap.  */
         if (TCG_TARGET_REG_BITS == 64) {
             tcg_out_opc_imm(s, OPC_LD, lo, base, 0);
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 3e4ca2be88..6802cb06a3 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -1935,24 +1935,24 @@ static const uint32_t qemu_ldx_opc[(MO_SSIZE + MO_BSWAP) + 1] = {
     [MO_UB] = LBZX,
     [MO_UW] = LHZX,
     [MO_UL] = LWZX,
-    [MO_Q]  = LDX,
+    [MO_UQ]  = LDX,
     [MO_SW] = LHAX,
     [MO_SL] = LWAX,
     [MO_BSWAP | MO_UB] = LBZX,
     [MO_BSWAP | MO_UW] = LHBRX,
     [MO_BSWAP | MO_UL] = LWBRX,
-    [MO_BSWAP | MO_Q]  = LDBRX,
+    [MO_BSWAP | MO_UQ]  = LDBRX,
 };
 
 static const uint32_t qemu_stx_opc[(MO_SIZE + MO_BSWAP) + 1] = {
     [MO_UB] = STBX,
     [MO_UW] = STHX,
     [MO_UL] = STWX,
-    [MO_Q]  = STDX,
+    [MO_UQ]  = STDX,
     [MO_BSWAP | MO_UB] = STBX,
     [MO_BSWAP | MO_UW] = STHBRX,
     [MO_BSWAP | MO_UL] = STWBRX,
-    [MO_BSWAP | MO_Q]  = STDBRX,
+    [MO_BSWAP | MO_UQ]  = STDBRX,
 };
 
 static const uint32_t qemu_exts_opc[4] = {
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
index 9b13a46fb4..b621694321 100644
--- a/tcg/riscv/tcg-target.c.inc
+++ b/tcg/riscv/tcg-target.c.inc
@@ -862,7 +862,7 @@ static void * const qemu_ld_helpers[MO_SSIZE + 1] = {
 #if TCG_TARGET_REG_BITS == 64
     [MO_SL] = helper_be_ldsl_mmu,
 #endif
-    [MO_Q]  = helper_be_ldq_mmu,
+    [MO_UQ]  = helper_be_ldq_mmu,
 #else
     [MO_UW] = helper_le_lduw_mmu,
     [MO_SW] = helper_le_ldsw_mmu,
@@ -870,7 +870,7 @@ static void * const qemu_ld_helpers[MO_SSIZE + 1] = {
 #if TCG_TARGET_REG_BITS == 64
     [MO_SL] = helper_le_ldsl_mmu,
 #endif
-    [MO_Q]  = helper_le_ldq_mmu,
+    [MO_UQ]  = helper_le_ldq_mmu,
 #endif
 };
 
@@ -1083,7 +1083,7 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg lo, TCGReg hi,
     case MO_SL:
         tcg_out_opc_imm(s, OPC_LW, lo, base, 0);
         break;
-    case MO_Q:
+    case MO_UQ:
         /* Prefer to load from offset 0 first, but allow for overlap.  */
         if (TCG_TARGET_REG_BITS == 64) {
             tcg_out_opc_imm(s, OPC_LD, lo, base, 0);
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index 8938c446c8..61d6694268 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -1745,10 +1745,10 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, MemOp opc, TCGReg data,
         tcg_out_insn(s, RXY, LGF, data, base, index, disp);
         break;
 
-    case MO_Q | MO_BSWAP:
+    case MO_UQ | MO_BSWAP:
         tcg_out_insn(s, RXY, LRVG, data, base, index, disp);
         break;
-    case MO_Q:
+    case MO_UQ:
         tcg_out_insn(s, RXY, LG, data, base, index, disp);
         break;
 
@@ -1791,10 +1791,10 @@ static void tcg_out_qemu_st_direct(TCGContext *s, MemOp opc, TCGReg data,
         }
         break;
 
-    case MO_Q | MO_BSWAP:
+    case MO_UQ | MO_BSWAP:
         tcg_out_insn(s, RXY, STRVG, data, base, index, disp);
         break;
-    case MO_Q:
+    case MO_UQ:
         tcg_out_insn(s, RXY, STG, data, base, index, disp);
         break;
 
@@ -1928,7 +1928,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
     case MO_UL:
         tgen_ext32u(s, TCG_REG_R4, data_reg);
         break;
-    case MO_Q:
+    case MO_UQ:
         tcg_out_mov(s, TCG_TYPE_I64, TCG_REG_R4, data_reg);
         break;
     default:
-- 
2.33.0




* [PATCH v3 02/21] memory: add a few defines for octo (128-bit) values
  2021-10-19  9:47 [PATCH v3 00/21] Adding partial support for 128-bit riscv target Frédéric Pétrot
  2021-10-19  9:47 ` [PATCH v3 01/21] memory: change define name for consistency Frédéric Pétrot
@ 2021-10-19  9:47 ` Frédéric Pétrot
  2021-10-19 18:00   ` Richard Henderson
  2021-10-19  9:47 ` [PATCH v3 03/21] Int128.h: addition of a few 128-bit operations Frédéric Pétrot
                   ` (18 subsequent siblings)
  20 siblings, 1 reply; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-19  9:47 UTC
  To: qemu-devel, qemu-riscv
  Cc: bin.meng, richard.henderson, alistair.francis, fabien.portas,
	palmer, Frédéric Pétrot, philmd

Introduce unsigned quad, signed quad, and octo access types
to handle loads and stores on 128-bit processors.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
---
 include/exec/memop.h | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
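
To show how the new names compose, here is a minimal standalone sketch of
the size/sign encoding. It is a simplified stand-in, not the contents of
include/exec/memop.h: the SK_ names are hypothetical, and the numeric
values (size stored as log2 of the byte count, a separate sign bit) are
assumptions modeled on that header.

```c
#include <assert.h>

/* Simplified stand-in for QEMU's MemOp; the SK_ names are hypothetical. */
typedef enum {
    SK_MO_8    = 0,      /* size field: log2 of the access size in bytes */
    SK_MO_16   = 1,
    SK_MO_32   = 2,
    SK_MO_64   = 3,
    SK_MO_128  = 4,
    SK_MO_SIZE = 0x07,   /* mask selecting the size field (assumed) */
    SK_MO_SIGN = 0x08,   /* set when a loaded value is sign-extended */

    SK_MO_UQ = SK_MO_64,                 /* unsigned quad */
    SK_MO_SQ = SK_MO_SIGN | SK_MO_64,    /* signed quad */
    SK_MO_O  = SK_MO_128                 /* octo, 128 bits */
} SketchMemOp;

/* Access size in bytes, derived from the size field. */
static unsigned sk_memop_bytes(SketchMemOp op)
{
    return 1u << (op & SK_MO_SIZE);
}

/* Nonzero when the access sign-extends the loaded value. */
static int sk_memop_is_signed(SketchMemOp op)
{
    return (op & SK_MO_SIGN) != 0;
}
```

In this model a signed quad only becomes distinguishable from an unsigned
one once the destination is wider than 64 bits, which is the situation a
128-bit target creates. An octo access needs no signed variant here
because nothing wider exists to extend into.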

diff --git a/include/exec/memop.h b/include/exec/memop.h
index c554bb0ee8..476ea6cdae 100644
--- a/include/exec/memop.h
+++ b/include/exec/memop.h
@@ -85,10 +85,13 @@ typedef enum MemOp {
     MO_UB    = MO_8,
     MO_UW    = MO_16,
     MO_UL    = MO_32,
+    MO_UQ    = MO_64,
     MO_SB    = MO_SIGN | MO_8,
     MO_SW    = MO_SIGN | MO_16,
     MO_SL    = MO_SIGN | MO_32,
-    MO_UQ     = MO_64,
+    MO_SQ    = MO_SIGN | MO_64,
+    MO_Q     = MO_64,
+    MO_O     = MO_128,
 
     MO_LEUW  = MO_LE | MO_UW,
     MO_LEUL  = MO_LE | MO_UL,
@@ -105,9 +108,12 @@ typedef enum MemOp {
 #ifdef NEED_CPU_H
     MO_TEUW  = MO_TE | MO_UW,
     MO_TEUL  = MO_TE | MO_UL,
+    MO_TEUQ  = MO_TE | MO_UQ,
     MO_TESW  = MO_TE | MO_SW,
     MO_TESL  = MO_TE | MO_SL,
     MO_TEQ   = MO_TE | MO_UQ,
+    MO_TESQ  = MO_TE | MO_SQ,
+    MO_TEO   = MO_TE | MO_O,
 #endif
 
     MO_SSIZE = MO_SIZE | MO_SIGN,
-- 
2.33.0




* [PATCH v3 03/21] Int128.h: addition of a few 128-bit operations
  2021-10-19  9:47 [PATCH v3 00/21] Adding partial support for 128-bit riscv target Frédéric Pétrot
  2021-10-19  9:47 ` [PATCH v3 01/21] memory: change define name for consistency Frédéric Pétrot
  2021-10-19  9:47 ` [PATCH v3 02/21] memory: add a few defines for octo (128-bit) values Frédéric Pétrot
@ 2021-10-19  9:47 ` Frédéric Pétrot
  2021-10-19 18:15   ` Richard Henderson
  2021-10-19  9:47 ` [PATCH v3 04/21] target/riscv: additional macros to check instruction support Frédéric Pétrot
                   ` (17 subsequent siblings)
  20 siblings, 1 reply; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-19  9:47 UTC
  To: qemu-devel, qemu-riscv
  Cc: bin.meng, richard.henderson, alistair.francis, fabien.portas,
	palmer, Frédéric Pétrot, philmd

Add not, xor, div and rem operations on 128-bit integers. They are used in
particular by the div/rem and csr helpers that perform computations on
128-bit registers in the 128-bit riscv target.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
 include/qemu/int128.h | 264 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 264 insertions(+)
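
Note on the approach: the sketch below illustrates the two ideas used in
the !CONFIG_INT128 path, under simplifying assumptions. Bitwise operations
act on each 64-bit half independently, and signed division is obtained from
unsigned division on the magnitudes followed by a sign fix-up (quotient
negative when the operand signs differ, remainder taking the sign of the
dividend). The Ex128 type and all ex_ helpers are hypothetical, and the
division is shown on 64-bit values only to keep the example short.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-limb value standing in for the struct-based Int128. */
typedef struct {
    uint64_t lo;
    int64_t hi;
} Ex128;

/* Bitwise ops never mix the halves. */
static Ex128 ex_not(Ex128 a)
{
    return (Ex128){ ~a.lo, ~a.hi };
}

static Ex128 ex_xor(Ex128 a, Ex128 b)
{
    return (Ex128){ a.lo ^ b.lo, a.hi ^ b.hi };
}

/* Magnitude of x as an unsigned value (well defined for INT64_MIN too). */
static uint64_t ex_abs_u64(int64_t x)
{
    return x < 0 ? 0 - (uint64_t)x : (uint64_t)x;
}

/* Signed division built on unsigned division; b must not be 0. */
static int64_t ex_divs(int64_t a, int64_t b)
{
    bool sgna = a < 0, sgnb = b < 0;
    uint64_t q = ex_abs_u64(a) / ex_abs_u64(b);

    return (int64_t)(sgna != sgnb ? 0 - q : q);
}

/* Signed remainder: takes the sign of the dividend; b must not be 0. */
static int64_t ex_rems(int64_t a, int64_t b)
{
    uint64_t r = ex_abs_u64(a) % ex_abs_u64(b);

    return (int64_t)(a < 0 ? 0 - r : r);
}
```

The same fix-up rule is what int128_divs and int128_rems apply around
divrem128 in the diff below.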

diff --git a/include/qemu/int128.h b/include/qemu/int128.h
index 2ac0746426..b3236d85ad 100644
--- a/include/qemu/int128.h
+++ b/include/qemu/int128.h
@@ -58,6 +58,11 @@ static inline Int128 int128_exts64(int64_t a)
     return a;
 }
 
+static inline Int128 int128_not(Int128 a)
+{
+    return ~a;
+}
+
 static inline Int128 int128_and(Int128 a, Int128 b)
 {
     return a & b;
@@ -68,6 +73,11 @@ static inline Int128 int128_or(Int128 a, Int128 b)
     return a | b;
 }
 
+static inline Int128 int128_xor(Int128 a, Int128 b)
+{
+    return a ^ b;
+}
+
 static inline Int128 int128_rshift(Int128 a, int n)
 {
     return a >> n;
@@ -162,6 +172,26 @@ static inline Int128 bswap128(Int128 a)
 #endif
 }
 
+static inline Int128 int128_divu(Int128 a, Int128 b)
+{
+    return (__uint128_t)a / (__uint128_t)b;
+}
+
+static inline Int128 int128_remu(Int128 a, Int128 b)
+{
+    return (__uint128_t)a % (__uint128_t)b;
+}
+
+static inline Int128 int128_divs(Int128 a, Int128 b)
+{
+    return a / b;
+}
+
+static inline Int128 int128_rems(Int128 a, Int128 b)
+{
+    return a % b;
+}
+
 #else /* !CONFIG_INT128 */
 
 typedef struct Int128 Int128;
@@ -235,6 +265,11 @@ static inline Int128 int128_exts64(int64_t a)
     return int128_make128(a, (a < 0) ? -1 : 0);
 }
 
+static inline Int128 int128_not(Int128 a)
+{
+    return int128_make128(~a.lo, ~a.hi);
+}
+
 static inline Int128 int128_and(Int128 a, Int128 b)
 {
     return int128_make128(a.lo & b.lo, a.hi & b.hi);
@@ -245,6 +280,11 @@ static inline Int128 int128_or(Int128 a, Int128 b)
     return int128_make128(a.lo | b.lo, a.hi | b.hi);
 }
 
+static inline Int128 int128_xor(Int128 a, Int128 b)
+{
+    return int128_make128(a.lo ^ b.lo, a.hi ^ b.hi);
+}
+
 static inline Int128 int128_rshift(Int128 a, int n)
 {
     int64_t h;
@@ -359,6 +399,228 @@ static inline Int128 bswap128(Int128 a)
     return int128_make128(bswap64(a.hi), bswap64(a.lo));
 }
 
+#include "qemu/host-utils.h"
+/*
+ * Division and remainder algorithms for 128-bit.
+ * Naïve implementation of Knuth's Algorithm D; it can be optimized quite a
+ * bit if it becomes a bottleneck.
+ * Precondition: the function is never called with v equal to 0; that case
+ *               has to be dealt with beforehand.
+ */
+static inline void divrem128(uint64_t ul, uint64_t uh,
+                             uint64_t vl, uint64_t vh,
+                             uint64_t *ql, uint64_t *qh,
+                             uint64_t *rl, uint64_t *rh)
+{
+    const uint64_t b = ((uint64_t) 1) << 32;
+    const int m = 4;
+    uint64_t qhat, rhat, p;
+    int n, s, i;
+    int64_t j, t, k;
+
+    /* Build arrays of 32-bit words for u and v */
+    uint32_t u[4] = {ul & 0xffffffff, (ul >> 32) & 0xffffffff,
+                     uh & 0xffffffff, (uh >> 32) & 0xffffffff};
+    uint32_t v[4] = {vl & 0xffffffff, (vl >> 32) & 0xffffffff,
+                     vh & 0xffffffff, (vh >> 32) & 0xffffffff};
+
+    uint32_t q[4] = {0}, r[4] = {0}, un[5] = {0}, vn[4] = {0};
+
+    if (v[3]) {
+        n = 4;
+    } else if (v[2]) {
+        n = 3;
+    } else if (v[1]) {
+        n = 2;
+    } else if (v[0]) {
+        n = 1;
+    } else {
+        /* never happens, but keeps gcc quiet */
+        n = 0;
+    }
+
+    if (n == 1) {
+        /* Take care of the case of a single-digit divisor here */
+        k = 0;
+        for (j = m - 1; j >= 0; j--) {
+            q[j] = (k * b + u[j]) / v[0];
+            k = (k * b + u[j]) - q[j] * v[0];
+        }
+        if (r != NULL) {
+            r[0] = k;
+        }
+    } else {
+        s = clz32(v[n - 1]); /* 0 <= s <= 31, as v[n - 1] != 0 */
+        if (s != 0) {
+            for (i = n - 1; i > 0; i--) {
+                vn[i] = ((v[i] << s) | (v[i - 1] >> (32 - s)));
+            }
+            vn[0] = v[0] << s;
+
+            un[m] = u[m - 1] >> (32 - s);
+            for (i = m - 1; i > 0; i--) {
+                un[i] = (u[i] << s) | (u[i - 1] >> (32 - s));
+            }
+            un[0] = u[0] << s;
+        } else {
+            for (i = 0; i < n; i++) {
+                vn[i] = v[i];
+            }
+
+            for (i = 0; i < m; i++) {
+                un[i] = u[i];
+            }
+            un[m] = 0;
+        }
+
+        /* Step D2 : loop on j */
+        for (j = m - n; j >= 0; j--) { /* Main loop */
+            /* Step D3 : Compute estimate qhat of q[j] */
+            qhat = (un[j + n] * b + un[j + n - 1]) / vn[n - 1];
+            /* Optimized mod vn[n - 1] */
+            rhat = (un[j + n] * b + un[j + n - 1]) - qhat * vn[n - 1];
+
+            while (true) {
+                if (qhat == b
+                    || qhat * vn[n - 2] > b * rhat + un[j + n - 2]) {
+                    qhat = qhat - 1;
+                    rhat = rhat + vn[n - 1];
+                    if (rhat < b) {
+                        continue;
+                    }
+                }
+                break;
+            }
+
+            /* Step D4 : Multiply and subtract */
+            k = 0;
+            for (i = 0; i < n; i++) {
+                p = qhat * vn[i];
+                t = un[i + j] - k - (p & 0xffffffff);
+                un[i + j] = t;
+                k = (p >> 32) - (t >> 32);
+            }
+            t = un[j + n] - k;
+            un[j + n] = t;
+
+            /* Step D5 */
+            q[j] = qhat;         /* Store quotient digit */
+            /* Step D6 */
+            if (t < 0) {         /* If we subtracted too much, add back */
+                q[j] = q[j] - 1;
+                k = 0;
+                for (i = 0; i < n; i++) {
+                    t = un[i + j] + vn[i] + k;
+                    un[i + j] = t;
+                    k = t >> 32;
+                }
+                un[j + n] = un[j + n] + k;
+            }
+        } /* D7 Loop */
+
+        /* Step D8 : Unnormalize */
+        if (rl && rh) {
+            if (s != 0) {
+                for (i = 0; i < n; i++) {
+                    r[i] = (un[i] >> s) | (un[i + 1] << (32 - s));
+                }
+            } else {
+                for (i = 0; i < n; i++) {
+                    r[i] = un[i];
+                }
+            }
+        }
+    }
+
+    if (ql && qh) {
+        *ql = q[0] | ((uint64_t)q[1] << 32);
+        *qh = q[2] | ((uint64_t)q[3] << 32);
+    }
+
+    if (rl && rh) {
+        *rl = r[0] | ((uint64_t)r[1] << 32);
+        *rh = r[2] | ((uint64_t)r[3] << 32);
+    }
+}
+
+static inline Int128 int128_divu(Int128 a, Int128 b)
+{
+    uint64_t qh, ql;
+
+    divrem128(int128_getlo(a), int128_gethi(a),
+              int128_getlo(b), int128_gethi(b),
+              &ql, &qh,
+              NULL, NULL);
+
+    return int128_make128(ql, qh);
+}
+
+static inline Int128 int128_remu(Int128 a, Int128 b)
+{
+    uint64_t rh, rl;
+
+    divrem128(int128_getlo(a), int128_gethi(a),
+              int128_getlo(b), int128_gethi(b),
+              NULL, NULL,
+              &rl, &rh);
+
+    return int128_make128(rl, rh);
+}
+
+static inline Int128 int128_divs(Int128 a, Int128 b)
+{
+    uint64_t qh, ql;
+    bool sgna = !int128_nonneg(a),
+         sgnb = !int128_nonneg(b);
+
+    if (sgna) {
+        a = int128_neg(a);
+    }
+
+    if (sgnb) {
+        b = int128_neg(b);
+    }
+
+    divrem128(int128_getlo(a), int128_gethi(a),
+              int128_getlo(b), int128_gethi(b),
+              &ql, &qh,
+              NULL, NULL);
+    Int128 q = int128_make128(ql, qh);
+
+    if (sgna != sgnb) {
+        q = int128_neg(q);
+    }
+
+    return q;
+}
+
+static inline Int128 int128_rems(Int128 a, Int128 b)
+{
+    uint64_t rh, rl;
+    bool sgna = !int128_nonneg(a),
+         sgnb = !int128_nonneg(b);
+
+    if (sgna) {
+        a = int128_neg(a);
+    }
+
+    if (sgnb) {
+        b = int128_neg(b);
+    }
+
+    divrem128(int128_getlo(a), int128_gethi(a),
+              int128_getlo(b), int128_gethi(b),
+              NULL, NULL,
+              &rl, &rh);
+    Int128 r = int128_make128(rl, rh);
+
+    if (sgna) {
+        r = int128_neg(r);
+    }
+
+    return r;
+}
+
 #endif /* CONFIG_INT128 */
 
 static inline void bswap128s(Int128 *s)
@@ -366,4 +628,6 @@ static inline void bswap128s(Int128 *s)
     *s = bswap128(*s);
 }
 
+#define UINT128_MAX int128_make128(~0LL, ~0LL)
+
 #endif /* INT128_H */
-- 
2.33.0
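
For readers following the single-digit branch of divrem128 in the patch
above, here is a minimal standalone sketch of that step: schoolbook short
division of four 32-bit limbs by one 32-bit digit, carrying the running
remainder from the most significant limb down. The function name and
signature are hypothetical; only the technique is taken from the patch.

```c
#include <stdint.h>

/*
 * Divide the little-endian 4-limb value u by the single digit v (v != 0).
 * The quotient limbs are written to q and the remainder is returned.
 */
static uint32_t ex_short_div(const uint32_t u[4], uint32_t v, uint32_t q[4])
{
    uint64_t k = 0; /* running remainder, always < v */

    for (int j = 3; j >= 0; j--) {
        /* Two-digit partial dividend: previous remainder, then limb j. */
        uint64_t cur = (k << 32) | u[j];

        q[j] = (uint32_t)(cur / v);
        k = cur % v;
    }
    return (uint32_t)k;
}
```

Because k stays below v, each partial quotient fits in 32 bits, so no
estimate-and-correct step (D3 to D6) is needed in this case.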




* [PATCH v3 04/21] target/riscv: additional macros to check instruction support
  2021-10-19  9:47 [PATCH v3 00/21] Adding partial support for 128-bit riscv target Frédéric Pétrot
                   ` (2 preceding siblings ...)
  2021-10-19  9:47 ` [PATCH v3 03/21] Int128.h: addition of a few 128-bit operations Frédéric Pétrot
@ 2021-10-19  9:47 ` Frédéric Pétrot
  2021-10-20 14:08   ` Richard Henderson
  2021-10-19  9:47 ` [PATCH v3 05/21] target/riscv: separation of bitwise logic and aritmetic helpers Frédéric Pétrot
                   ` (16 subsequent siblings)
  20 siblings, 1 reply; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-19  9:47 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: bin.meng, richard.henderson, alistair.francis, fabien.portas,
	palmer, Frédéric Pétrot, philmd

Given that the 128-bit version of the riscv spec adds new instructions, and
that some instructions that were previously only available in 64-bit mode
are now available for both 64-bit and 128-bit, we added new macros to check
for the processor mode during translation.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
 target/riscv/translate.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)
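
Note on usage: the sketch below shows how such macros are meant to be
dropped at the top of a trans_* function, so that returning false reports
the instruction as not supported in the current mode. The context struct,
the ex_get_xl() accessor and the two translator functions are stubs made up
for the example; the enum values follow the misa.MXL encoding (1, 2, 3).

```c
#include <stdbool.h>

/* Stub types for the example; not the QEMU definitions. */
typedef enum {
    EX_MXL_RV32 = 1,
    EX_MXL_RV64 = 2,
    EX_MXL_RV128 = 3
} ExMXL;

typedef struct {
    ExMXL xl;
} ExCtx;

static ExMXL ex_get_xl(const ExCtx *ctx)
{
    return ctx->xl;
}

/* do { ... } while (0) lets the macro behave like a single statement. */
#define EX_REQUIRE_128BIT(ctx) do {       \
    if (ex_get_xl(ctx) < EX_MXL_RV128) {  \
        return false;                     \
    }                                     \
} while (0)

#define EX_REQUIRE_64_OR_128BIT(ctx) do { \
    if (ex_get_xl(ctx) == EX_MXL_RV32) {  \
        return false;                     \
    }                                     \
} while (0)

/* Hypothetical translator for an instruction that only exists in RV128. */
static bool ex_trans_only128(ExCtx *ctx)
{
    EX_REQUIRE_128BIT(ctx);
    return true; /* code generation would go here */
}

/* Hypothetical translator for an instruction valid in RV64 and RV128. */
static bool ex_trans_64_or_128(ExCtx *ctx)
{
    EX_REQUIRE_64_OR_128BIT(ctx);
    return true;
}
```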

diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 35245aafa7..121fcd71fe 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -350,6 +350,24 @@ EX_SH(12)
     }                              \
 } while (0)
 
+#define REQUIRE_128BIT(ctx) do {   \
+    if (get_xl(ctx) < MXL_RV128) { \
+        return false;              \
+    }                              \
+} while (0)
+
+#define REQUIRE_32_OR_64BIT(ctx) do { \
+    if (get_xl(ctx) == MXL_RV128) {   \
+        return false;                 \
+    }                                 \
+} while (0)
+
+#define REQUIRE_64_OR_128BIT(ctx) do { \
+    if (get_xl(ctx) == MXL_RV32) {     \
+        return false;                  \
+    }                                  \
+} while (0)
+
 static int ex_rvc_register(DisasContext *ctx, int reg)
 {
     return 8 + reg;
-- 
2.33.0




* [PATCH v3 05/21] target/riscv: separation of bitwise logic and aritmetic helpers
  2021-10-19  9:47 [PATCH v3 00/21] Adding partial support for 128-bit riscv target Frédéric Pétrot
                   ` (3 preceding siblings ...)
  2021-10-19  9:47 ` [PATCH v3 04/21] target/riscv: additional macros to check instruction support Frédéric Pétrot
@ 2021-10-19  9:47 ` Frédéric Pétrot
  2021-10-20 14:14   ` Richard Henderson
  2021-10-19  9:47 ` [PATCH v3 06/21] target/riscv: array for the 64 upper bits of 128-bit registers Frédéric Pétrot
                   ` (15 subsequent siblings)
  20 siblings, 1 reply; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-19  9:47 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: bin.meng, richard.henderson, alistair.francis, fabien.portas,
	palmer, Frédéric Pétrot, philmd

Introduce a gen_logic function for bitwise logic, to implement
instructions in which no propagation of information occurs between bits,
and use this function for the bitwise instructions.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
 target/riscv/translate.c                | 27 +++++++++++++++++++++++++
 target/riscv/insn_trans/trans_rvb.c.inc |  6 +++---
 target/riscv/insn_trans/trans_rvi.c.inc | 12 +++++------
 3 files changed, 36 insertions(+), 9 deletions(-)
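
Note on the rationale: a small sketch of why the distinction matters once
a register is split into two 64-bit halves. A bitwise operation can be
applied to each half on its own, whereas an arithmetic operation has to
propagate a carry from the low half into the high half. The ExU128 type and
the helper names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Hypothetical 128-bit value held as two 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} ExU128;

/* Logic: no information flows between bits, so the halves are independent. */
static ExU128 ex_xor128(ExU128 a, ExU128 b)
{
    ExU128 r = { a.lo ^ b.lo, a.hi ^ b.hi };
    return r;
}

/* Arithmetic: the carry out of the low half feeds the high half. */
static ExU128 ex_add128(ExU128 a, ExU128 b)
{
    ExU128 r;

    r.lo = a.lo + b.lo;
    r.hi = a.hi + b.hi + (r.lo < a.lo); /* (r.lo < a.lo) is the carry */
    return r;
}
```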

diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 121fcd71fe..3c2e9fb790 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -382,6 +382,33 @@ static int ex_rvc_shifti(DisasContext *ctx, int imm)
 /* Include the auto-generated decoder for 32 bit insn */
 #include "decode-insn32.c.inc"
 
+static bool gen_logic_imm_fn(DisasContext *ctx, arg_i *a, DisasExtend ext,
+                             void (*func)(TCGv, TCGv, target_long))
+{
+    TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv src1 = get_gpr(ctx, a->rs1, ext);
+
+    func(dest, src1, a->imm);
+
+    gen_set_gpr(ctx, a->rd, dest);
+
+    return true;
+}
+
+static bool gen_logic(DisasContext *ctx, arg_r *a, DisasExtend ext,
+                      void (*func)(TCGv, TCGv, TCGv))
+{
+    TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv src1 = get_gpr(ctx, a->rs1, ext);
+    TCGv src2 = get_gpr(ctx, a->rs2, ext);
+
+    func(dest, src1, src2);
+
+    gen_set_gpr(ctx, a->rd, dest);
+
+    return true;
+}
+
 static bool gen_arith_imm_fn(DisasContext *ctx, arg_i *a, DisasExtend ext,
                              void (*func)(TCGv, TCGv, target_long))
 {
diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
index cc39e6033b..28f911f95d 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -86,19 +86,19 @@ static bool trans_cpop(DisasContext *ctx, arg_cpop *a)
 static bool trans_andn(DisasContext *ctx, arg_andn *a)
 {
     REQUIRE_ZBB(ctx);
-    return gen_arith(ctx, a, EXT_NONE, tcg_gen_andc_tl);
+    return gen_logic(ctx, a, EXT_NONE, tcg_gen_andc_tl);
 }
 
 static bool trans_orn(DisasContext *ctx, arg_orn *a)
 {
     REQUIRE_ZBB(ctx);
-    return gen_arith(ctx, a, EXT_NONE, tcg_gen_orc_tl);
+    return gen_logic(ctx, a, EXT_NONE, tcg_gen_orc_tl);
 }
 
 static bool trans_xnor(DisasContext *ctx, arg_xnor *a)
 {
     REQUIRE_ZBB(ctx);
-    return gen_arith(ctx, a, EXT_NONE, tcg_gen_eqv_tl);
+    return gen_logic(ctx, a, EXT_NONE, tcg_gen_eqv_tl);
 }
 
 static bool trans_min(DisasContext *ctx, arg_min *a)
diff --git a/target/riscv/insn_trans/trans_rvi.c.inc b/target/riscv/insn_trans/trans_rvi.c.inc
index 91dc438a3a..ed138f748e 100644
--- a/target/riscv/insn_trans/trans_rvi.c.inc
+++ b/target/riscv/insn_trans/trans_rvi.c.inc
@@ -250,17 +250,17 @@ static bool trans_sltiu(DisasContext *ctx, arg_sltiu *a)
 
 static bool trans_xori(DisasContext *ctx, arg_xori *a)
 {
-    return gen_arith_imm_fn(ctx, a, EXT_NONE, tcg_gen_xori_tl);
+    return gen_logic_imm_fn(ctx, a, EXT_NONE, tcg_gen_xori_tl);
 }
 
 static bool trans_ori(DisasContext *ctx, arg_ori *a)
 {
-    return gen_arith_imm_fn(ctx, a, EXT_NONE, tcg_gen_ori_tl);
+    return gen_logic_imm_fn(ctx, a, EXT_NONE, tcg_gen_ori_tl);
 }
 
 static bool trans_andi(DisasContext *ctx, arg_andi *a)
 {
-    return gen_arith_imm_fn(ctx, a, EXT_NONE, tcg_gen_andi_tl);
+    return gen_logic_imm_fn(ctx, a, EXT_NONE, tcg_gen_andi_tl);
 }
 
 static bool trans_slli(DisasContext *ctx, arg_slli *a)
@@ -317,7 +317,7 @@ static bool trans_sltu(DisasContext *ctx, arg_sltu *a)
 
 static bool trans_xor(DisasContext *ctx, arg_xor *a)
 {
-    return gen_arith(ctx, a, EXT_NONE, tcg_gen_xor_tl);
+    return gen_logic(ctx, a, EXT_NONE, tcg_gen_xor_tl);
 }
 
 static bool trans_srl(DisasContext *ctx, arg_srl *a)
@@ -332,12 +332,12 @@ static bool trans_sra(DisasContext *ctx, arg_sra *a)
 
 static bool trans_or(DisasContext *ctx, arg_or *a)
 {
-    return gen_arith(ctx, a, EXT_NONE, tcg_gen_or_tl);
+    return gen_logic(ctx, a, EXT_NONE, tcg_gen_or_tl);
 }
 
 static bool trans_and(DisasContext *ctx, arg_and *a)
 {
-    return gen_arith(ctx, a, EXT_NONE, tcg_gen_and_tl);
+    return gen_logic(ctx, a, EXT_NONE, tcg_gen_and_tl);
 }
 
 static bool trans_addiw(DisasContext *ctx, arg_addiw *a)
-- 
2.33.0




* [PATCH v3 06/21] target/riscv: array for the 64 upper bits of 128-bit registers
  2021-10-19  9:47 [PATCH v3 00/21] Adding partial support for 128-bit riscv target Frédéric Pétrot
                   ` (4 preceding siblings ...)
  2021-10-19  9:47 ` [PATCH v3 05/21] target/riscv: separation of bitwise logic and aritmetic helpers Frédéric Pétrot
@ 2021-10-19  9:47 ` Frédéric Pétrot
  2021-10-20 14:44   ` Richard Henderson
  2021-10-19  9:47 ` [PATCH v3 07/21] target/riscv: setup everything so that riscv128-softmmu compiles Frédéric Pétrot
                   ` (14 subsequent siblings)
  20 siblings, 1 reply; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-19  9:47 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: bin.meng, richard.henderson, alistair.francis, fabien.portas,
	palmer, Frédéric Pétrot, philmd

The upper 64 bits of the 128-bit registers now have a place inside
the cpu state structure, and are created as globals for future use.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
 target/riscv/cpu.h       | 1 +
 target/riscv/translate.c | 5 ++++-
 2 files changed, 5 insertions(+), 1 deletion(-)
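
Note on the layout: a minimal sketch of how a 128-bit register value can
be assembled from two parallel arrays, one for the low 64 bits and one for
the high 64 bits, with register 0 hard-wired to zero. The struct and the
accessors are hypothetical stand-ins; the real accessors are introduced
later in the series.

```c
#include <stdint.h>

/* Hypothetical reduced CPU state: low and high halves kept side by side. */
typedef struct {
    uint64_t gpr[32];  /* low 64 bits of each register */
    uint64_t gprh[32]; /* high 64 bits of each register */
} ExState;

typedef struct {
    uint64_t lo;
    uint64_t hi;
} ExReg128;

/* Reads of x0 always yield zero, whatever the arrays contain. */
static ExReg128 ex_read_reg(const ExState *s, int n)
{
    ExReg128 r = { 0, 0 };

    if (n != 0) {
        r.lo = s->gpr[n];
        r.hi = s->gprh[n];
    }
    return r;
}

/* Writes to x0 are discarded. */
static void ex_write_reg(ExState *s, int n, ExReg128 v)
{
    if (n != 0) {
        s->gpr[n] = v.lo;
        s->gprh[n] = v.hi;
    }
}
```

Keeping the high halves in a separate array leaves the existing gpr[]
layout untouched for the 32-bit and 64-bit targets.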

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index c24bc9a039..c8b98f1b70 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -109,6 +109,7 @@ FIELD(VTYPE, VILL, sizeof(target_ulong) * 8 - 1, 1)
 
 struct CPURISCVState {
     target_ulong gpr[32];
+    target_ulong gprh[32]; /* 64 top bits of the 128-bit registers */
     uint64_t fpr[32]; /* assume both F and D extensions */
 
     /* vector coprocessor state. */
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 3c2e9fb790..b64fe8470d 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -32,7 +32,7 @@
 #include "instmap.h"
 
 /* global register indices */
-static TCGv cpu_gpr[32], cpu_pc, cpu_vl;
+static TCGv cpu_gpr[32], cpu_gprh[32], cpu_pc, cpu_vl;
 static TCGv_i64 cpu_fpr[32]; /* assume F and D extensions */
 static TCGv load_res;
 static TCGv load_val;
@@ -755,10 +755,13 @@ void riscv_translate_init(void)
      * unless you specifically block reads/writes to reg 0.
      */
     cpu_gpr[0] = NULL;
+    cpu_gprh[0] = NULL;
 
     for (i = 1; i < 32; i++) {
         cpu_gpr[i] = tcg_global_mem_new(cpu_env,
             offsetof(CPURISCVState, gpr[i]), riscv_int_regnames[i]);
+        cpu_gprh[i] = tcg_global_mem_new(cpu_env,
+            offsetof(CPURISCVState, gprh[i]), riscv_int_regnames[i]);
     }
 
     for (i = 0; i < 32; i++) {
-- 
2.33.0




* [PATCH v3 07/21] target/riscv: setup everything so that riscv128-softmmu compiles
  2021-10-19  9:47 [PATCH v3 00/21] Adding partial support for 128-bit riscv target Frédéric Pétrot
                   ` (5 preceding siblings ...)
  2021-10-19  9:47 ` [PATCH v3 06/21] target/riscv: array for the 64 upper bits of 128-bit registers Frédéric Pétrot
@ 2021-10-19  9:47 ` Frédéric Pétrot
  2021-10-20 14:57   ` Richard Henderson
  2021-10-19  9:47 ` [PATCH v3 08/21] target/riscv: adding accessors to the registers upper part Frédéric Pétrot
                   ` (13 subsequent siblings)
  20 siblings, 1 reply; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-19  9:47 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: bin.meng, richard.henderson, alistair.francis, fabien.portas,
	palmer, Frédéric Pétrot, philmd

This patch is kind of a mess because several files have to be slightly
modified to allow for a new target. Most of these modifications deal
with changing what was a binary choice into a ternary one.  Although we
managed to avoid testing for TARGET_RISCV128 explicitly, it is
implicitly there in '#else' statements.
Most of the added infrastructure files are close to being copies of the
64-bit version.
Once this patch is applied, adding riscv128-softmmu to --target-list
produces a (not so useful yet) executable.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
 configs/devices/riscv128-softmmu/default.mak | 17 +++++++
 configs/targets/riscv128-softmmu.mak         |  6 +++
 include/disas/dis-asm.h                      |  1 +
 include/hw/riscv/sifive_cpu.h                |  3 ++
 target/riscv/cpu-param.h                     |  5 ++
 target/riscv/cpu.h                           |  3 ++
 disas/riscv.c                                |  5 ++
 target/riscv/cpu.c                           | 23 +++++++++-
 target/riscv/gdbstub.c                       |  3 ++
 target/riscv/insn_trans/trans_rvd.c.inc      | 12 ++---
 target/riscv/insn_trans/trans_rvf.c.inc      |  6 +--
 gdb-xml/riscv-128bit-cpu.xml                 | 48 ++++++++++++++++++++
 gdb-xml/riscv-128bit-virtual.xml             | 12 +++++
 target/riscv/Kconfig                         |  3 ++
 14 files changed, 137 insertions(+), 10 deletions(-)
 create mode 100644 configs/devices/riscv128-softmmu/default.mak
 create mode 100644 configs/targets/riscv128-softmmu.mak
 create mode 100644 gdb-xml/riscv-128bit-cpu.xml
 create mode 100644 gdb-xml/riscv-128bit-virtual.xml
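
Note on the pattern: the sketch below shows the binary-to-ternary change
in isolation. The 32-bit and 64-bit targets are tested by name and the
128-bit target is whatever remains in the '#else' branch. The EX_-prefixed
macros and the helper function are hypothetical; they only mimic the shape
of the TARGET_RISCV32 / TARGET_RISCV64 tests in the diff.

```c
/* Build with -DEX_TARGET_RISCV32 or -DEX_TARGET_RISCV64 to pick a branch. */
#if defined(EX_TARGET_RISCV32)
# define EX_BASE_CPU "rv32"
#elif defined(EX_TARGET_RISCV64)
# define EX_BASE_CPU "rv64"
#else
/* Neither macro is defined: this branch is implicitly the 128-bit target. */
# define EX_BASE_CPU "rv128"
#endif

/* Name of the base CPU model selected at compile time. */
static const char *ex_base_cpu(void)
{
    return EX_BASE_CPU;
}
```

The drawback of this shape is that any future fourth target would silently
fall into the '#else' branch as well.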

diff --git a/configs/devices/riscv128-softmmu/default.mak b/configs/devices/riscv128-softmmu/default.mak
new file mode 100644
index 0000000000..e838f35785
--- /dev/null
+++ b/configs/devices/riscv128-softmmu/default.mak
@@ -0,0 +1,17 @@
+# Default configuration for riscv128-softmmu
+
+# Uncomment the following lines to disable these optional devices:
+#
+#CONFIG_PCI_DEVICES=n
+# 'n' does not seem to be an accepted value for these two parameters
+CONFIG_SEMIHOSTING=y
+CONFIG_ARM_COMPATIBLE_SEMIHOSTING=y
+
+# Boards:
+#
+CONFIG_SPIKE=n
+CONFIG_SIFIVE_E=n
+CONFIG_SIFIVE_U=n
+CONFIG_RISCV_VIRT=y
+CONFIG_MICROCHIP_PFSOC=n
+CONFIG_SHAKTI_C=n
diff --git a/configs/targets/riscv128-softmmu.mak b/configs/targets/riscv128-softmmu.mak
new file mode 100644
index 0000000000..7e5976bbf3
--- /dev/null
+++ b/configs/targets/riscv128-softmmu.mak
@@ -0,0 +1,6 @@
+# For now a raw copy of the riscv64 version, as changing TARGET_ARCH to riscv64 might trigger too much stuff
+TARGET_ARCH=riscv128
+TARGET_BASE_ARCH=riscv
+TARGET_SUPPORTS_MTTCG=y
+TARGET_XML_FILES=gdb-xml/riscv-128bit-cpu.xml gdb-xml/riscv-32bit-fpu.xml gdb-xml/riscv-64bit-fpu.xml gdb-xml/riscv-128bit-virtual.xml
+TARGET_NEED_FDT=y
diff --git a/include/disas/dis-asm.h b/include/disas/dis-asm.h
index 524f29196d..d9c725adae 100644
--- a/include/disas/dis-asm.h
+++ b/include/disas/dis-asm.h
@@ -460,6 +460,7 @@ int print_insn_little_nios2     (bfd_vma, disassemble_info*);
 int print_insn_xtensa           (bfd_vma, disassemble_info*);
 int print_insn_riscv32          (bfd_vma, disassemble_info*);
 int print_insn_riscv64          (bfd_vma, disassemble_info*);
+int print_insn_riscv128         (bfd_vma, disassemble_info*);
 int print_insn_rx(bfd_vma, disassemble_info *);
 int print_insn_hexagon(bfd_vma, disassemble_info *);
 
diff --git a/include/hw/riscv/sifive_cpu.h b/include/hw/riscv/sifive_cpu.h
index 136799633a..64078feba8 100644
--- a/include/hw/riscv/sifive_cpu.h
+++ b/include/hw/riscv/sifive_cpu.h
@@ -26,6 +26,9 @@
 #elif defined(TARGET_RISCV64)
 #define SIFIVE_E_CPU TYPE_RISCV_CPU_SIFIVE_E51
 #define SIFIVE_U_CPU TYPE_RISCV_CPU_SIFIVE_U54
+#else
+#define SIFIVE_E_CPU TYPE_RISCV_CPU_SIFIVE_E51
+#define SIFIVE_U_CPU TYPE_RISCV_CPU_SIFIVE_U54
 #endif
 
 #endif /* HW_SIFIVE_CPU_H */
diff --git a/target/riscv/cpu-param.h b/target/riscv/cpu-param.h
index 80eb615f93..c10459b56f 100644
--- a/target/riscv/cpu-param.h
+++ b/target/riscv/cpu-param.h
@@ -16,6 +16,11 @@
 # define TARGET_LONG_BITS 32
 # define TARGET_PHYS_ADDR_SPACE_BITS 34 /* 22-bit PPN */
 # define TARGET_VIRT_ADDR_SPACE_BITS 32 /* sv32 */
+#else
+/* 64-bit target, since QEMU isn't built to have TARGET_LONG_BITS over 64 */
+# define TARGET_LONG_BITS 64
+# define TARGET_PHYS_ADDR_SPACE_BITS 56 /* 44-bit PPN */
+# define TARGET_VIRT_ADDR_SPACE_BITS 48 /* sv48 */
 #endif
 #define TARGET_PAGE_BITS 12 /* 4 KiB Pages */
 /*
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index c8b98f1b70..5d21128865 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -38,6 +38,7 @@
 #define TYPE_RISCV_CPU_ANY              RISCV_CPU_TYPE_NAME("any")
 #define TYPE_RISCV_CPU_BASE32           RISCV_CPU_TYPE_NAME("rv32")
 #define TYPE_RISCV_CPU_BASE64           RISCV_CPU_TYPE_NAME("rv64")
+#define TYPE_RISCV_CPU_BASE128          RISCV_CPU_TYPE_NAME("rv128")
 #define TYPE_RISCV_CPU_IBEX             RISCV_CPU_TYPE_NAME("lowrisc-ibex")
 #define TYPE_RISCV_CPU_SHAKTI_C         RISCV_CPU_TYPE_NAME("shakti-c")
 #define TYPE_RISCV_CPU_SIFIVE_E31       RISCV_CPU_TYPE_NAME("sifive-e31")
@@ -50,6 +51,8 @@
 # define TYPE_RISCV_CPU_BASE            TYPE_RISCV_CPU_BASE32
 #elif defined(TARGET_RISCV64)
 # define TYPE_RISCV_CPU_BASE            TYPE_RISCV_CPU_BASE64
+#else
+# define TYPE_RISCV_CPU_BASE            TYPE_RISCV_CPU_BASE128
 #endif
 
 #define RV(x) ((target_ulong)1 << (x - 'A'))
diff --git a/disas/riscv.c b/disas/riscv.c
index 793ad14c27..03c8dc9961 100644
--- a/disas/riscv.c
+++ b/disas/riscv.c
@@ -3090,3 +3090,8 @@ int print_insn_riscv64(bfd_vma memaddr, struct disassemble_info *info)
 {
     return print_insn_riscv(memaddr, info, rv64);
 }
+
+int print_insn_riscv128(bfd_vma memaddr, struct disassemble_info *info)
+{
+    return print_insn_riscv(memaddr, info, rv128);
+}
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index b81b880900..d5a87f57e9 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -143,6 +143,8 @@ static void riscv_any_cpu_init(Object *obj)
     set_misa(env, MXL_RV32, RVI | RVM | RVA | RVF | RVD | RVC | RVU);
 #elif defined(TARGET_RISCV64)
     set_misa(env, MXL_RV64, RVI | RVM | RVA | RVF | RVD | RVC | RVU);
+#else
+    set_misa(env, MXL_RV128, RVI | RVM | RVA | RVF | RVD | RVC | RVU);
 #endif
     set_priv_version(env, PRIV_VERSION_1_11_0);
 }
@@ -169,7 +171,7 @@ static void rv64_sifive_e_cpu_init(Object *obj)
     set_priv_version(env, PRIV_VERSION_1_10_0);
     qdev_prop_set_bit(DEVICE(obj), "mmu", false);
 }
-#else
+#elif defined(TARGET_RISCV32)
 static void rv32_base_cpu_init(Object *obj)
 {
     CPURISCVState *env = &RISCV_CPU(obj)->env;
@@ -209,6 +211,13 @@ static void rv32_imafcu_nommu_cpu_init(Object *obj)
     set_resetvec(env, DEFAULT_RSTVEC);
     qdev_prop_set_bit(DEVICE(obj), "mmu", false);
 }
+#else
+static void rv128_base_cpu_init(Object *obj)
+{
+    CPURISCVState *env = &RISCV_CPU(obj)->env;
+    /* We set this in the realise function */
+    set_misa(env, MXL_RV128, 0);
+}
 #endif
 
 static ObjectClass *riscv_cpu_class_by_name(const char *cpu_model)
@@ -395,6 +404,9 @@ static void riscv_cpu_disas_set_info(CPUState *s, disassemble_info *info)
     case MXL_RV64:
         info->print_insn = print_insn_riscv64;
         break;
+    case MXL_RV128:
+        info->print_insn = print_insn_riscv128;
+        break;
     default:
         g_assert_not_reached();
     }
@@ -457,6 +469,9 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
 #ifdef TARGET_RISCV64
     case MXL_RV64:
         break;
+#elif !defined(TARGET_RISCV32)
+    case MXL_RV128:
+        break;
 #endif
     case MXL_RV32:
         break;
@@ -657,6 +672,8 @@ static gchar *riscv_gdb_arch_name(CPUState *cs)
         return g_strdup("riscv:rv32");
     case MXL_RV64:
         return g_strdup("riscv:rv64");
+    case MXL_RV128:
+        return g_strdup("riscv:rv128");
     default:
         g_assert_not_reached();
     }
@@ -721,6 +738,8 @@ static void riscv_cpu_class_init(ObjectClass *c, void *data)
     cc->gdb_core_xml_file = "riscv-32bit-cpu.xml";
 #elif defined(TARGET_RISCV64)
     cc->gdb_core_xml_file = "riscv-64bit-cpu.xml";
+#else
+    cc->gdb_core_xml_file = "riscv-128bit-cpu.xml";
 #endif
     cc->gdb_stop_before_watchpoint = true;
     cc->disas_set_info = riscv_cpu_disas_set_info;
@@ -808,6 +827,8 @@ static const TypeInfo riscv_cpu_type_infos[] = {
     DEFINE_CPU(TYPE_RISCV_CPU_SIFIVE_E51,       rv64_sifive_e_cpu_init),
     DEFINE_CPU(TYPE_RISCV_CPU_SIFIVE_U54,       rv64_sifive_u_cpu_init),
     DEFINE_CPU(TYPE_RISCV_CPU_SHAKTI_C,         rv64_sifive_u_cpu_init),
+#else
+    DEFINE_CPU(TYPE_RISCV_CPU_BASE128,          rv128_base_cpu_init),
 #endif
 };
 
diff --git a/target/riscv/gdbstub.c b/target/riscv/gdbstub.c
index 23429179e2..f840a309e2 100644
--- a/target/riscv/gdbstub.c
+++ b/target/riscv/gdbstub.c
@@ -204,6 +204,9 @@ void riscv_cpu_register_gdb_regs_for_features(CPUState *cs)
 #elif defined(TARGET_RISCV64)
     gdb_register_coprocessor(cs, riscv_gdb_get_virtual, riscv_gdb_set_virtual,
                              1, "riscv-64bit-virtual.xml", 0);
+#else
+    gdb_register_coprocessor(cs, riscv_gdb_get_virtual, riscv_gdb_set_virtual,
+                             1, "riscv-128bit-virtual.xml", 0);
 #endif
 
     gdb_register_coprocessor(cs, riscv_gdb_get_csr, riscv_gdb_set_csr,
diff --git a/target/riscv/insn_trans/trans_rvd.c.inc b/target/riscv/insn_trans/trans_rvd.c.inc
index db9ae15755..41da696ec4 100644
--- a/target/riscv/insn_trans/trans_rvd.c.inc
+++ b/target/riscv/insn_trans/trans_rvd.c.inc
@@ -393,11 +393,11 @@ static bool trans_fmv_x_d(DisasContext *ctx, arg_fmv_x_d *a)
     REQUIRE_FPU;
     REQUIRE_EXT(ctx, RVD);
 
-#ifdef TARGET_RISCV64
+#ifdef TARGET_RISCV32
+    qemu_build_not_reached();
+#else
     gen_set_gpr(ctx, a->rd, cpu_fpr[a->rs1]);
     return true;
-#else
-    qemu_build_not_reached();
 #endif
 }
 
@@ -437,11 +437,11 @@ static bool trans_fmv_d_x(DisasContext *ctx, arg_fmv_d_x *a)
     REQUIRE_FPU;
     REQUIRE_EXT(ctx, RVD);
 
-#ifdef TARGET_RISCV64
+#ifdef TARGET_RISCV32
+    qemu_build_not_reached();
+#else
     tcg_gen_mov_tl(cpu_fpr[a->rd], get_gpr(ctx, a->rs1, EXT_NONE));
     mark_fs_dirty(ctx);
     return true;
-#else
-    qemu_build_not_reached();
 #endif
 }
diff --git a/target/riscv/insn_trans/trans_rvf.c.inc b/target/riscv/insn_trans/trans_rvf.c.inc
index bddbd418d9..90cc51e5d6 100644
--- a/target/riscv/insn_trans/trans_rvf.c.inc
+++ b/target/riscv/insn_trans/trans_rvf.c.inc
@@ -311,10 +311,10 @@ static bool trans_fmv_x_w(DisasContext *ctx, arg_fmv_x_w *a)
 
     TCGv dest = dest_gpr(ctx, a->rd);
 
-#if defined(TARGET_RISCV64)
-    tcg_gen_ext32s_tl(dest, cpu_fpr[a->rs1]);
-#else
+#if defined(TARGET_RISCV32)
     tcg_gen_extrl_i64_i32(dest, cpu_fpr[a->rs1]);
+#else
+    tcg_gen_ext32s_tl(dest, cpu_fpr[a->rs1]);
 #endif
 
     gen_set_gpr(ctx, a->rd, dest);
diff --git a/gdb-xml/riscv-128bit-cpu.xml b/gdb-xml/riscv-128bit-cpu.xml
new file mode 100644
index 0000000000..c98168148f
--- /dev/null
+++ b/gdb-xml/riscv-128bit-cpu.xml
@@ -0,0 +1,48 @@
+<?xml version="1.0"?>
+<!-- Copyright (C) 2018-2019 Free Software Foundation, Inc.
+
+     Copying and distribution of this file, with or without modification,
+     are permitted in any medium without royalty provided the copyright
+     notice and this notice are preserved.  -->
+
+<!-- Register numbers are hard-coded in order to maintain backward
+     compatibility with older versions of tools that didn't use xml
+     register descriptions.  -->
+
+<!DOCTYPE feature SYSTEM "gdb-target.dtd">
+<!-- FIXME : All GPRs are marked as 64-bits since gdb doesn't like 128-bit registers for now. -->
+<feature name="org.gnu.gdb.riscv.cpu">
+  <reg name="zero" bitsize="64" type="int" regnum="0"/>
+  <reg name="ra" bitsize="64" type="code_ptr"/>
+  <reg name="sp" bitsize="64" type="data_ptr"/>
+  <reg name="gp" bitsize="64" type="data_ptr"/>
+  <reg name="tp" bitsize="64" type="data_ptr"/>
+  <reg name="t0" bitsize="64" type="int"/>
+  <reg name="t1" bitsize="64" type="int"/>
+  <reg name="t2" bitsize="64" type="int"/>
+  <reg name="fp" bitsize="64" type="data_ptr"/>
+  <reg name="s1" bitsize="64" type="int"/>
+  <reg name="a0" bitsize="64" type="int"/>
+  <reg name="a1" bitsize="64" type="int"/>
+  <reg name="a2" bitsize="64" type="int"/>
+  <reg name="a3" bitsize="64" type="int"/>
+  <reg name="a4" bitsize="64" type="int"/>
+  <reg name="a5" bitsize="64" type="int"/>
+  <reg name="a6" bitsize="64" type="int"/>
+  <reg name="a7" bitsize="64" type="int"/>
+  <reg name="s2" bitsize="64" type="int"/>
+  <reg name="s3" bitsize="64" type="int"/>
+  <reg name="s4" bitsize="64" type="int"/>
+  <reg name="s5" bitsize="64" type="int"/>
+  <reg name="s6" bitsize="64" type="int"/>
+  <reg name="s7" bitsize="64" type="int"/>
+  <reg name="s8" bitsize="64" type="int"/>
+  <reg name="s9" bitsize="64" type="int"/>
+  <reg name="s10" bitsize="64" type="int"/>
+  <reg name="s11" bitsize="64" type="int"/>
+  <reg name="t3" bitsize="64" type="int"/>
+  <reg name="t4" bitsize="64" type="int"/>
+  <reg name="t5" bitsize="64" type="int"/>
+  <reg name="t6" bitsize="64" type="int"/>
+  <reg name="pc" bitsize="64" type="code_ptr"/>
+</feature>
diff --git a/gdb-xml/riscv-128bit-virtual.xml b/gdb-xml/riscv-128bit-virtual.xml
new file mode 100644
index 0000000000..db9a0ff677
--- /dev/null
+++ b/gdb-xml/riscv-128bit-virtual.xml
@@ -0,0 +1,12 @@
+<?xml version="1.0"?>
+<!-- Copyright (C) 2018-2019 Free Software Foundation, Inc.
+
+     Copying and distribution of this file, with or without modification,
+     are permitted in any medium without royalty provided the copyright
+     notice and this notice are preserved.  -->
+
+<!DOCTYPE feature SYSTEM "gdb-target.dtd">
+<!-- FIXME : priv marked as 64-bits since gdb doesn't like 128-bit registers for now. -->
+<feature name="org.gnu.gdb.riscv.virtual">
+  <reg name="priv" bitsize="64"/>
+</feature>
diff --git a/target/riscv/Kconfig b/target/riscv/Kconfig
index b9e5932f13..f9ea52a59a 100644
--- a/target/riscv/Kconfig
+++ b/target/riscv/Kconfig
@@ -3,3 +3,6 @@ config RISCV32
 
 config RISCV64
     bool
+
+config RISCV128
+    bool
-- 
2.33.0




* [PATCH v3 08/21] target/riscv: adding accessors to the registers upper part
  2021-10-19  9:47 [PATCH v3 00/21] Adding partial support for 128-bit riscv target Frédéric Pétrot
                   ` (6 preceding siblings ...)
  2021-10-19  9:47 ` [PATCH v3 07/21] target/riscv: setup everything so that riscv128-softmmu compiles Frédéric Pétrot
@ 2021-10-19  9:47 ` Frédéric Pétrot
  2021-10-20 15:09   ` Richard Henderson
  2021-10-19  9:48 ` [PATCH v3 09/21] target/riscv: moving some insns close to similar insns Frédéric Pétrot
                   ` (12 subsequent siblings)
  20 siblings, 1 reply; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-19  9:47 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: bin.meng, richard.henderson, alistair.francis, fabien.portas,
	palmer, Frédéric Pétrot, philmd

Add set and get functions to access the upper 64 bits of a register, stored
in the gprh field of the cpu state.  Access to the gprh field cannot be
restricted at compile time to the 128-bit version of the processor, because
we have no way to indicate that the misa_mxl_max field is const.
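
For illustration only (not part of the patch), here is a minimal standalone
sketch of the semantics the accessors are meant to have.  ToyCPU and the
toy_* names are hypothetical stand-ins for the CPU state and the TCG
helpers; the MXL_* values are assumed to follow the usual misa.MXL encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for QEMU's MXL encoding and CPU state. */
enum { MXL_RV32 = 1, MXL_RV64 = 2, MXL_RV128 = 3 };

typedef struct {
    uint64_t gpr[32];   /* low 64 bits of each register */
    uint64_t gprh[32];  /* high 64 bits, meaningful only for RV128 */
    int ol;             /* current operation length (MXL_*) */
} ToyCPU;

/* Read the upper half: x0 and narrower operations see zero. */
static uint64_t toy_get_gprh(const ToyCPU *cpu, int reg)
{
    assert(reg >= 0 && reg < 32);
    if (reg == 0 || cpu->ol < MXL_RV128) {
        return 0;
    }
    return cpu->gprh[reg];
}

/* Write the upper half; narrower operations sign-extend the lower half. */
static void toy_set_gprh(ToyCPU *cpu, int reg, uint64_t hi)
{
    assert(reg >= 0 && reg < 32);
    if (reg == 0) {
        return;  /* x0 is hard-wired to zero */
    }
    if (cpu->ol < MXL_RV128) {
        cpu->gprh[reg] = (cpu->gpr[reg] >> 63) ? UINT64_MAX : 0;
    } else {
        cpu->gprh[reg] = hi;
    }
}
```

With ol set to MXL_RV64, the value passed to toy_set_gprh is ignored and
the upper half becomes a copy of bit 63 of the lower half.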

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
 target/riscv/translate.c | 45 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index b64fe8470d..b6ddcf7a10 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -55,6 +55,7 @@ typedef struct DisasContext {
     /* pc_succ_insn points to the instruction following base.pc_next */
     target_ulong pc_succ_insn;
     target_ulong priv_ver;
+    RISCVMXL misa_mxl_max;
     RISCVMXL xl;
     uint32_t misa_ext;
     uint32_t opcode;
@@ -116,6 +117,13 @@ static inline int get_olen(DisasContext *ctx)
     return 16 << get_ol(ctx);
 }
 
+/* The maximum register length */
+#ifdef TARGET_RISCV32
+#define get_xl_max(ctx)    MXL_RV32
+#else
+#define get_xl_max(ctx)    ((ctx)->misa_mxl_max)
+#endif
+
 /*
  * RISC-V requires NaN-boxing of narrower width floating point values.
  * This applies when a 32-bit value is assigned to a 64-bit FP register.
@@ -220,6 +228,7 @@ static TCGv get_gpr(DisasContext *ctx, int reg_num, DisasExtend ext)
         }
         break;
     case MXL_RV64:
+    case MXL_RV128:
         break;
     default:
         g_assert_not_reached();
@@ -227,6 +236,14 @@ static TCGv get_gpr(DisasContext *ctx, int reg_num, DisasExtend ext)
     return cpu_gpr[reg_num];
 }
 
+static TCGv get_gprh(DisasContext *ctx, int reg_num)
+{
+    if (reg_num == 0 || get_ol(ctx) < MXL_RV128) {
+        return ctx->zero;
+    }
+    return cpu_gprh[reg_num];
+}
+
 static TCGv dest_gpr(DisasContext *ctx, int reg_num)
 {
     if (reg_num == 0 || get_olen(ctx) < TARGET_LONG_BITS) {
@@ -235,6 +252,14 @@ static TCGv dest_gpr(DisasContext *ctx, int reg_num)
     return cpu_gpr[reg_num];
 }
 
+static TCGv dest_gprh(DisasContext *ctx, int reg_num)
+{
+    if (reg_num == 0 || get_ol(ctx) < MXL_RV128) {
+        return temp_new(ctx);
+    }
+    return cpu_gprh[reg_num];
+}
+
 static void gen_set_gpr(DisasContext *ctx, int reg_num, TCGv t)
 {
     if (reg_num != 0) {
@@ -243,6 +268,7 @@ static void gen_set_gpr(DisasContext *ctx, int reg_num, TCGv t)
             tcg_gen_ext32s_tl(cpu_gpr[reg_num], t);
             break;
         case MXL_RV64:
+        case MXL_RV128:
             tcg_gen_mov_tl(cpu_gpr[reg_num], t);
             break;
         default:
@@ -251,6 +277,17 @@ static void gen_set_gpr(DisasContext *ctx, int reg_num, TCGv t)
     }
 }
 
+static void gen_set_gprh(DisasContext *ctx, int reg_num, TCGv t)
+{
+    if (reg_num != 0) {
+        if (get_ol(ctx) < MXL_RV128) {
+            tcg_gen_sari_tl(cpu_gprh[reg_num], cpu_gpr[reg_num], 63);
+        } else {
+            tcg_gen_mov_tl(cpu_gprh[reg_num], t);
+        }
+    }
+}
+
 static void gen_jal(DisasContext *ctx, int rd, target_ulong imm)
 {
     target_ulong next_pc;
@@ -392,6 +429,13 @@ static bool gen_logic_imm_fn(DisasContext *ctx, arg_i *a, DisasExtend ext,
 
     gen_set_gpr(ctx, a->rd, dest);
 
+    /* devilish temporary code so that the patch compiles */
+    if (get_xl_max(ctx) == MXL_RV128) {
+        (void)get_gprh(ctx, 6);
+        (void)dest_gprh(ctx, 6);
+        gen_set_gprh(ctx, 6, NULL);
+    }
+
     return true;
 }
 
@@ -655,6 +699,7 @@ static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
     ctx->lmul = FIELD_EX32(tb_flags, TB_FLAGS, LMUL);
     ctx->mlen = 1 << (ctx->sew  + 3 - ctx->lmul);
     ctx->vl_eq_vlmax = FIELD_EX32(tb_flags, TB_FLAGS, VL_EQ_VLMAX);
+    ctx->misa_mxl_max = env->misa_mxl_max;
     ctx->xl = FIELD_EX32(tb_flags, TB_FLAGS, XL);
     ctx->cs = cs;
     ctx->ntemp = 0;
-- 
2.33.0




* [PATCH v3 09/21] target/riscv: moving some insns close to similar insns
  2021-10-19  9:47 [PATCH v3 00/21] Adding partial support for 128-bit riscv target Frédéric Pétrot
                   ` (7 preceding siblings ...)
  2021-10-19  9:47 ` [PATCH v3 08/21] target/riscv: adding accessors to the registers upper part Frédéric Pétrot
@ 2021-10-19  9:48 ` Frédéric Pétrot
  2021-10-20 15:11   ` Richard Henderson
  2021-10-19  9:48 ` [PATCH v3 10/21] target/riscv: support for 128-bit loads and store Frédéric Pétrot
                   ` (11 subsequent siblings)
  20 siblings, 1 reply; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-19  9:48 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: bin.meng, richard.henderson, alistair.francis, fabien.portas,
	palmer, Frédéric Pétrot, philmd

lwu and ld are functionally close to the other loads, but were placed after
the stores in the source file.
Similarly, xor was separated from or and and by the two shift functions srl
and sra, while the immediate versions were nicely grouped together.
This patch moves the aforementioned loads after lhu, and xor just above or,
where they more logically belong.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
 target/riscv/insn_trans/trans_rvi.c.inc | 34 ++++++++++++-------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvi.c.inc b/target/riscv/insn_trans/trans_rvi.c.inc
index ed138f748e..5c2a117a70 100644
--- a/target/riscv/insn_trans/trans_rvi.c.inc
+++ b/target/riscv/insn_trans/trans_rvi.c.inc
@@ -175,6 +175,18 @@ static bool trans_lhu(DisasContext *ctx, arg_lhu *a)
     return gen_load(ctx, a, MO_TEUW);
 }
 
+static bool trans_lwu(DisasContext *ctx, arg_lwu *a)
+{
+    REQUIRE_64BIT(ctx);
+    return gen_load(ctx, a, MO_TEUL);
+}
+
+static bool trans_ld(DisasContext *ctx, arg_ld *a)
+{
+    REQUIRE_64BIT(ctx);
+    return gen_load(ctx, a, MO_TEQ);
+}
+
 static bool gen_store(DisasContext *ctx, arg_sb *a, MemOp memop)
 {
     TCGv addr = get_gpr(ctx, a->rs1, EXT_NONE);
@@ -205,18 +217,6 @@ static bool trans_sw(DisasContext *ctx, arg_sw *a)
     return gen_store(ctx, a, MO_TESL);
 }
 
-static bool trans_lwu(DisasContext *ctx, arg_lwu *a)
-{
-    REQUIRE_64BIT(ctx);
-    return gen_load(ctx, a, MO_TEUL);
-}
-
-static bool trans_ld(DisasContext *ctx, arg_ld *a)
-{
-    REQUIRE_64BIT(ctx);
-    return gen_load(ctx, a, MO_TEQ);
-}
-
 static bool trans_sd(DisasContext *ctx, arg_sd *a)
 {
     REQUIRE_64BIT(ctx);
@@ -315,11 +315,6 @@ static bool trans_sltu(DisasContext *ctx, arg_sltu *a)
     return gen_arith(ctx, a, EXT_SIGN, gen_sltu);
 }
 
-static bool trans_xor(DisasContext *ctx, arg_xor *a)
-{
-    return gen_logic(ctx, a, EXT_NONE, tcg_gen_xor_tl);
-}
-
 static bool trans_srl(DisasContext *ctx, arg_srl *a)
 {
     return gen_shift(ctx, a, EXT_ZERO, tcg_gen_shr_tl);
@@ -330,6 +325,11 @@ static bool trans_sra(DisasContext *ctx, arg_sra *a)
     return gen_shift(ctx, a, EXT_SIGN, tcg_gen_sar_tl);
 }
 
+static bool trans_xor(DisasContext *ctx, arg_xor *a)
+{
+    return gen_logic(ctx, a, EXT_NONE, tcg_gen_xor_tl);
+}
+
 static bool trans_or(DisasContext *ctx, arg_or *a)
 {
     return gen_logic(ctx, a, EXT_NONE, tcg_gen_or_tl);
-- 
2.33.0




* [PATCH v3 10/21] target/riscv: support for 128-bit loads and store
  2021-10-19  9:47 [PATCH v3 00/21] Adding partial support for 128-bit riscv target Frédéric Pétrot
                   ` (8 preceding siblings ...)
  2021-10-19  9:48 ` [PATCH v3 09/21] target/riscv: moving some insns close to similar insns Frédéric Pétrot
@ 2021-10-19  9:48 ` Frédéric Pétrot
  2021-10-20 17:31   ` Richard Henderson
  2021-10-19  9:48 ` [PATCH v3 11/21] target/riscv: support for 128-bit bitwise instructions Frédéric Pétrot
                   ` (10 subsequent siblings)
  20 siblings, 1 reply; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-19  9:48 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: bin.meng, richard.henderson, alistair.francis, fabien.portas,
	palmer, Frédéric Pétrot, philmd

The 128-bit ISA adds ldu, lq and sq. We provide here support for these
instructions. Note that although we compute a 128-bit address, we only use
the lower 64 bits to actually address memory, cowardly utilizing the
existing address translation mechanism of QEMU.
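
For illustration only (not part of the patch), the address computation and
the split of a 128-bit load into two 64-bit loads can be sketched in
standalone C as follows.  U128 and the toy_* functions are hypothetical
names, memory is modelled as a flat byte array, and little-endian layout is
assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t lo;
    uint64_t hi;
} U128;

/* 128-bit add of a sign-extended immediate, with carry between halves. */
static U128 toy_addr_add(U128 base, int64_t imm)
{
    uint64_t imml = (uint64_t)imm;
    uint64_t immh = imm < 0 ? UINT64_MAX : 0;  /* sign extension */
    U128 r;

    r.lo = base.lo + imml;
    r.hi = base.hi + immh + (r.lo < base.lo);  /* carry out of the low half */
    return r;
}

/* Little-endian 64-bit load from a flat byte array. */
static uint64_t toy_load_d(const uint8_t *mem, size_t size, uint64_t addr)
{
    uint64_t v = 0;

    assert(size >= 8 && addr <= size - 8);
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | mem[addr + i];
    }
    return v;
}

/*
 * 128-bit load as two 64-bit loads.  The upper half of the address is
 * computed but ignored: only addr.lo is used to index memory.
 */
static U128 toy_load_q(const uint8_t *mem, size_t size, U128 addr)
{
    U128 next = toy_addr_add(addr, 8);
    U128 val;

    val.lo = toy_load_d(mem, size, addr.lo);
    val.hi = toy_load_d(mem, size, next.lo);
    return val;
}
```

A store would mirror this, writing the low half at addr.lo and the high
half eight bytes further.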

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
 target/riscv/insn16.decode              |  32 +++++-
 target/riscv/insn32.decode              |   4 +
 target/riscv/translate.c                |   7 --
 target/riscv/insn_trans/trans_rvi.c.inc | 146 ++++++++++++++++++++++--
 4 files changed, 171 insertions(+), 18 deletions(-)

diff --git a/target/riscv/insn16.decode b/target/riscv/insn16.decode
index 2e9212663c..151fc6e567 100644
--- a/target/riscv/insn16.decode
+++ b/target/riscv/insn16.decode
@@ -39,6 +39,10 @@
 %imm_addi16sp  12:s1 3:2 5:1 2:1 6:1 !function=ex_shift_4
 %imm_lui       12:s1 2:5             !function=ex_shift_12
 
+# Added for 128 bit support
+%uimm_cl_q    5:2 10:3               !function=ex_shift_3
+%uimm_6bit_lq 2:3 12:1 5:2           !function=ex_shift_3
+%uimm_6bit_sq 7:3 10:3               !function=ex_shift_3
 
 # Argument sets imported from insn32.decode:
 &empty                  !extern
@@ -54,16 +58,20 @@
 # Formats 16:
 @cr        ....  ..... .....  .. &r      rs2=%rs2_5       rs1=%rd     %rd
 @ci        ... . ..... .....  .. &i      imm=%imm_ci      rs1=%rd     %rd
+@cl_q      ... . .....  ..... .. &i      imm=%uimm_6bit_lq rs1=2 %rd
 @cl_d      ... ... ... .. ... .. &i      imm=%uimm_cl_d   rs1=%rs1_3  rd=%rs2_3
 @cl_w      ... ... ... .. ... .. &i      imm=%uimm_cl_w   rs1=%rs1_3  rd=%rs2_3
 @cs_2      ... ... ... .. ... .. &r      rs2=%rs2_3       rs1=%rs1_3  rd=%rs1_3
+@cs_q      ... ... ... .. ... .. &s      imm=%uimm_cl_q   rs1=%rs1_3  rs2=%rs2_3
 @cs_d      ... ... ... .. ... .. &s      imm=%uimm_cl_d   rs1=%rs1_3  rs2=%rs2_3
 @cs_w      ... ... ... .. ... .. &s      imm=%uimm_cl_w   rs1=%rs1_3  rs2=%rs2_3
 @cj        ...    ........... .. &j      imm=%imm_cj
 @cb_z      ... ... ... .. ... .. &b      imm=%imm_cb      rs1=%rs1_3  rs2=0
 
+@c_lqsp    ... . .....  ..... .. &i      imm=%uimm_6bit_lq rs1=2 %rd
 @c_ldsp    ... . .....  ..... .. &i      imm=%uimm_6bit_ld rs1=2 %rd
 @c_lwsp    ... . .....  ..... .. &i      imm=%uimm_6bit_lw rs1=2 %rd
+@c_sqsp    ... . .....  ..... .. &s      imm=%uimm_6bit_sq rs1=2 rs2=%rs2_5
 @c_sdsp    ... . .....  ..... .. &s      imm=%uimm_6bit_sd rs1=2 rs2=%rs2_5
 @c_swsp    ... . .....  ..... .. &s      imm=%uimm_6bit_sw rs1=2 rs2=%rs2_5
 @c_li      ... . .....  ..... .. &i      imm=%imm_ci rs1=0 %rd
@@ -87,9 +95,17 @@
   illegal         000  000 000 00 --- 00
   addi            000  ... ... .. ... 00 @c_addi4spn
 }
-fld               001  ... ... .. ... 00 @cl_d
+{
+  fld             001  ... ... .. ... 00 @cl_d
+  # *** RV128C specific Standard Extension (Quadrant 0) ***
+  lq              001  ... ... .. ... 00 @cl_q
+}
 lw                010  ... ... .. ... 00 @cl_w
-fsd               101  ... ... .. ... 00 @cs_d
+{
+  fsd             101  ... ... .. ... 00 @cs_d
+  # *** RV128C specific Standard Extension (Quadrant 0) ***
+  sq              101  ... ... .. ... 00 @cs_q
+}
 sw                110  ... ... .. ... 00 @cs_w
 
 # *** RV32C and RV64C specific Standard Extension (Quadrant 0) ***
@@ -132,7 +148,11 @@ addw              100 1 11 ... 01 ... 01 @cs_2
 
 # *** RV32/64C Standard Extension (Quadrant 2) ***
 slli              000 .  .....  ..... 10 @c_shift2
-fld               001 .  .....  ..... 10 @c_ldsp
+{
+  fld             001 .  .....  ..... 10 @c_ldsp
+  # *** RV128C specific Standard Extension (Quadrant 2) ***
+  lq              001  ... ... .. ... 10 @c_lqsp
+}
 {
   illegal         010 -  00000  ----- 10 # c.lwsp, RES rd=0
   lw              010 .  .....  ..... 10 @c_lwsp
@@ -147,7 +167,11 @@ fld               001 .  .....  ..... 10 @c_ldsp
   jalr            100 1  .....  00000 10 @c_jalr rd=1  # C.JALR
   add             100 1  .....  ..... 10 @cr
 }
-fsd               101   ......  ..... 10 @c_sdsp
+{
+  fsd             101   ......  ..... 10 @c_sdsp
+  # *** RV128C specific Standard Extension (Quadrant 2) ***
+  sq              101  ... ... .. ... 10 @c_sqsp
+}
 sw                110 .  .....  ..... 10 @c_swsp
 
 # *** RV32C and RV64C specific Standard Extension (Quadrant 2) ***
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 2f251dac1b..1e7ddecc22 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -163,6 +163,10 @@ sllw     0000000 .....  ..... 001 ..... 0111011 @r
 srlw     0000000 .....  ..... 101 ..... 0111011 @r
 sraw     0100000 .....  ..... 101 ..... 0111011 @r
 
+# *** RV128I Base Instruction Set (in addition to RV64I) ***
+ldu      ............   ..... 111 ..... 0000011 @i
+lq       ............   ..... 010 ..... 0001111 @i
+sq       ............   ..... 100 ..... 0100011 @s
 # *** RV32M Standard Extension ***
 mul      0000001 .....  ..... 000 ..... 0110011 @r
 mulh     0000001 .....  ..... 001 ..... 0110011 @r
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index b6ddcf7a10..e8f08f921e 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -429,13 +429,6 @@ static bool gen_logic_imm_fn(DisasContext *ctx, arg_i *a, DisasExtend ext,
 
     gen_set_gpr(ctx, a->rd, dest);
 
-    /* devilish temporary code so that the patch compiles */
-    if (get_xl_max(ctx) == MXL_RV128) {
-        (void)get_gprh(ctx, 6);
-        (void)dest_gprh(ctx, 6);
-        gen_set_gprh(ctx, 6, NULL);
-    }
-
     return true;
 }
 
diff --git a/target/riscv/insn_trans/trans_rvi.c.inc b/target/riscv/insn_trans/trans_rvi.c.inc
index 5c2a117a70..92f41f3a86 100644
--- a/target/riscv/insn_trans/trans_rvi.c.inc
+++ b/target/riscv/insn_trans/trans_rvi.c.inc
@@ -134,7 +134,15 @@ static bool trans_bgeu(DisasContext *ctx, arg_bgeu *a)
     return gen_branch(ctx, a, TCG_COND_GEU);
 }
 
-static bool gen_load(DisasContext *ctx, arg_lb *a, MemOp memop)
+static void gen_addi2_i128(TCGv retl, TCGv reth,
+                           TCGv srcl, TCGv srch, target_long imm)
+{
+    TCGv imml  = tcg_constant_tl(imm),
+         immh  = tcg_constant_tl(-(imm < 0));
+    tcg_gen_add2_tl(retl, reth, srcl, srch, imml, immh);
+}
+
+static bool gen_load_tl(DisasContext *ctx, arg_lb *a, MemOp memop)
 {
     TCGv dest = dest_gpr(ctx, a->rd);
     TCGv addr = get_gpr(ctx, a->rs1, EXT_NONE);
@@ -150,6 +158,63 @@ static bool gen_load(DisasContext *ctx, arg_lb *a, MemOp memop)
     return true;
 }
 
+/*
+ * TODO: we should assert that src1h == 0, as we do not change the
+ *       address translation mechanism
+ */
+static bool gen_load_i128(DisasContext *ctx, arg_lb *a, MemOp memop)
+{
+    TCGv src1l = get_gpr(ctx, a->rs1, EXT_NONE);
+    TCGv src1h = get_gprh(ctx, a->rs1);
+    TCGv destl = dest_gpr(ctx, a->rd);
+    TCGv desth = dest_gprh(ctx, a->rd);
+    TCGv addrl = tcg_temp_new();
+    TCGv addrh = tcg_temp_new();
+    TCGv imml = tcg_temp_new();
+    TCGv immh = tcg_constant_tl(-(a->imm < 0));
+
+    /* Build a 128-bit address */
+    if (a->imm != 0) {
+        tcg_gen_movi_tl(imml, a->imm);
+        tcg_gen_add2_tl(addrl, addrh, src1l, src1h, imml, immh);
+    } else {
+        tcg_gen_mov_tl(addrl, src1l);
+        tcg_gen_mov_tl(addrh, src1h);
+    }
+
+    if (memop != (MemOp)MO_TEO) {
+        tcg_gen_qemu_ld_tl(destl, addrl, ctx->mem_idx, memop);
+        if (memop & MO_SIGN) {
+            tcg_gen_sari_tl(desth, destl, 63);
+        } else {
+            tcg_gen_movi_tl(desth, 0);
+        }
+    } else {
+        tcg_gen_qemu_ld_tl(memop & MO_BSWAP ? desth : destl, addrl,
+                           ctx->mem_idx, MO_TEQ);
+        gen_addi2_i128(addrl, addrh, addrl, addrh, 8);
+        tcg_gen_qemu_ld_tl(memop & MO_BSWAP ? destl : desth, addrl,
+                           ctx->mem_idx, MO_TEQ);
+    }
+
+    gen_set_gpr(ctx, a->rd, destl);
+    gen_set_gprh(ctx, a->rd, desth);
+
+    tcg_temp_free(addrl);
+    tcg_temp_free(addrh);
+    tcg_temp_free(imml);
+    return true;
+}
+
+static bool gen_load(DisasContext *ctx, arg_lb *a, MemOp memop)
+{
+    if (get_xl(ctx) == MXL_RV128) {
+        return gen_load_i128(ctx, a, memop);
+    } else {
+        return gen_load_tl(ctx, a, memop);
+    }
+}
+
 static bool trans_lb(DisasContext *ctx, arg_lb *a)
 {
     return gen_load(ctx, a, MO_SB);
@@ -165,6 +230,18 @@ static bool trans_lw(DisasContext *ctx, arg_lw *a)
     return gen_load(ctx, a, MO_TESL);
 }
 
+static bool trans_ld(DisasContext *ctx, arg_ld *a)
+{
+    REQUIRE_64_OR_128BIT(ctx);
+    return gen_load(ctx, a, MO_TESQ);
+}
+
+static bool trans_lq(DisasContext *ctx, arg_lq *a)
+{
+    REQUIRE_128BIT(ctx);
+    return gen_load(ctx, a, MO_TEO);
+}
+
 static bool trans_lbu(DisasContext *ctx, arg_lbu *a)
 {
     return gen_load(ctx, a, MO_UB);
@@ -177,17 +254,17 @@ static bool trans_lhu(DisasContext *ctx, arg_lhu *a)
 
 static bool trans_lwu(DisasContext *ctx, arg_lwu *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     return gen_load(ctx, a, MO_TEUL);
 }
 
-static bool trans_ld(DisasContext *ctx, arg_ld *a)
+static bool trans_ldu(DisasContext *ctx, arg_ldu *a)
 {
-    REQUIRE_64BIT(ctx);
-    return gen_load(ctx, a, MO_TEQ);
+    REQUIRE_128BIT(ctx);
+    return gen_load(ctx, a, MO_TEUQ);
 }
 
-static bool gen_store(DisasContext *ctx, arg_sb *a, MemOp memop)
+static bool gen_store_tl(DisasContext *ctx, arg_sb *a, MemOp memop)
 {
     TCGv addr = get_gpr(ctx, a->rs1, EXT_NONE);
     TCGv data = get_gpr(ctx, a->rs2, EXT_NONE);
@@ -202,6 +279,55 @@ static bool gen_store(DisasContext *ctx, arg_sb *a, MemOp memop)
     return true;
 }
 
+/*
+ * TODO: we should assert that src1h == 0, as we do not change the
+ *       address translation mechanism
+ */
+static bool gen_store_i128(DisasContext *ctx, arg_sb *a, MemOp memop)
+{
+    TCGv src1l = get_gpr(ctx, a->rs1, EXT_NONE);
+    TCGv src1h = get_gprh(ctx, a->rs1);
+    TCGv src2l = get_gpr(ctx, a->rs2, EXT_NONE);
+    TCGv src2h = get_gprh(ctx, a->rs2);
+    TCGv addrl = tcg_temp_new();
+    TCGv addrh = tcg_temp_new();
+    TCGv imml = tcg_temp_new();
+    TCGv immh = tcg_constant_tl(-(a->imm < 0));
+
+    /* Build a 128-bit address */
+    if (a->imm != 0) {
+        tcg_gen_movi_tl(imml, a->imm);
+        tcg_gen_add2_tl(addrl, addrh, src1l, src1h, imml, immh);
+    } else {
+        tcg_gen_mov_tl(addrl, src1l);
+        tcg_gen_mov_tl(addrh, src1h);
+    }
+
+    if (memop != (MemOp)MO_TEO) {
+        tcg_gen_qemu_st_tl(src2l, addrl, ctx->mem_idx, memop);
+    } else {
+        tcg_gen_qemu_st_tl(memop & MO_BSWAP ? src2h : src2l, addrl,
+            ctx->mem_idx, MO_TEQ);
+        gen_addi2_i128(addrl, addrh, addrl, addrh, 8);
+        tcg_gen_qemu_st_tl(memop & MO_BSWAP ? src2l : src2h, addrl,
+            ctx->mem_idx, MO_TEQ);
+    }
+
+    tcg_temp_free(addrl);
+    tcg_temp_free(addrh);
+    tcg_temp_free(imml);
+    return true;
+}
+
+static bool gen_store(DisasContext *ctx, arg_sb *a, MemOp memop)
+{
+    if (get_xl(ctx) == MXL_RV128) {
+        return gen_store_i128(ctx, a, memop);
+    } else {
+        return gen_store_tl(ctx, a, memop);
+    }
+}
+
 static bool trans_sb(DisasContext *ctx, arg_sb *a)
 {
     return gen_store(ctx, a, MO_SB);
@@ -219,10 +345,16 @@ static bool trans_sw(DisasContext *ctx, arg_sw *a)
 
 static bool trans_sd(DisasContext *ctx, arg_sd *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     return gen_store(ctx, a, MO_TEQ);
 }
 
+static bool trans_sq(DisasContext *ctx, arg_sq *a)
+{
+    REQUIRE_128BIT(ctx);
+    return gen_store(ctx, a, MO_TEO);
+}
+
 static bool trans_addi(DisasContext *ctx, arg_addi *a)
 {
     return gen_arith_imm_fn(ctx, a, EXT_NONE, tcg_gen_addi_tl);
-- 
2.33.0




* [PATCH v3 11/21] target/riscv: support for 128-bit bitwise instructions
  2021-10-19  9:47 [PATCH v3 00/21] Adding partial support for 128-bit riscv target Frédéric Pétrot
                   ` (9 preceding siblings ...)
  2021-10-19  9:48 ` [PATCH v3 10/21] target/riscv: support for 128-bit loads and store Frédéric Pétrot
@ 2021-10-19  9:48 ` Frédéric Pétrot
  2021-10-20 17:47   ` Richard Henderson
  2021-10-19  9:48 ` [PATCH v3 12/21] target/riscv: support for 128-bit U-type instructions Frédéric Pétrot
                   ` (9 subsequent siblings)
  20 siblings, 1 reply; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-19  9:48 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: bin.meng, richard.henderson, alistair.francis, fabien.portas,
	palmer, Frédéric Pétrot, philmd

The 128-bit bitwise instructions do not need any function prototype change,
as the functions can be applied independently to the lower and upper parts
of the registers.
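
For illustration only (not part of the patch), a standalone C sketch of the
idea: the same 64-bit function is applied to each half, and for the
immediate forms the upper half sees the sign extension of the immediate.
U128, logic_fn and the toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t lo;
    uint64_t hi;
} U128;

/* A 64-bit bitwise operation, standing in for tcg_gen_{and,or,xor}_tl. */
typedef uint64_t (*logic_fn)(uint64_t, uint64_t);

static uint64_t toy_and(uint64_t a, uint64_t b) { return a & b; }
static uint64_t toy_xor(uint64_t a, uint64_t b) { return a ^ b; }

/* Register-register form: no carry, so the halves are independent. */
static U128 toy_logic(U128 a, U128 b, logic_fn fn)
{
    assert(fn != NULL);
    U128 r = { fn(a.lo, b.lo), fn(a.hi, b.hi) };
    return r;
}

/* Immediate form: the upper half uses the sign-extended immediate. */
static U128 toy_logic_imm(U128 a, int64_t imm, logic_fn fn)
{
    U128 b = { (uint64_t)imm, imm < 0 ? UINT64_MAX : 0 };
    return toy_logic(a, b, fn);
}
```

For example, andi with -1 leaves both halves unchanged, while andi with a
positive immediate clears the upper half.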

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
 target/riscv/translate.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index e8f08f921e..71982f6284 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -429,6 +429,17 @@ static bool gen_logic_imm_fn(DisasContext *ctx, arg_i *a, DisasExtend ext,
 
     gen_set_gpr(ctx, a->rd, dest);
 
+    if (get_xl_max(ctx) == MXL_RV128) {
+        if (get_ol(ctx) == MXL_RV128) {
+            uint64_t immh = -(a->imm < 0);
+            src1 = get_gprh(ctx, a->rs1);
+            dest = dest_gprh(ctx, a->rd);
+
+            func(dest, src1, immh);
+        }
+        gen_set_gprh(ctx, a->rd, dest);
+    }
+
     return true;
 }
 
@@ -443,6 +454,17 @@ static bool gen_logic(DisasContext *ctx, arg_r *a, DisasExtend ext,
 
     gen_set_gpr(ctx, a->rd, dest);
 
+    if (get_xl_max(ctx) == MXL_RV128) {
+        if (get_ol(ctx) == MXL_RV128) {
+            dest = dest_gprh(ctx, a->rd);
+            src1 = get_gprh(ctx, a->rs1);
+            src2 = get_gprh(ctx, a->rs2);
+
+            func(dest, src1, src2);
+        }
+        gen_set_gprh(ctx, a->rd, dest);
+    }
+
     return true;
 }
 
-- 
2.33.0




* [PATCH v3 12/21] target/riscv: support for 128-bit U-type instructions
  2021-10-19  9:47 [PATCH v3 00/21] Adding partial support for 128-bit riscv target Frédéric Pétrot
                   ` (10 preceding siblings ...)
  2021-10-19  9:48 ` [PATCH v3 11/21] target/riscv: support for 128-bit bitwise instructions Frédéric Pétrot
@ 2021-10-19  9:48 ` Frédéric Pétrot
  2021-10-20 17:59   ` Richard Henderson
  2021-10-19  9:48 ` [PATCH v3 13/21] target/riscv: support for 128-bit shift instructions Frédéric Pétrot
                   ` (8 subsequent siblings)
  20 siblings, 1 reply; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-19  9:48 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: bin.meng, richard.henderson, alistair.francis, fabien.portas,
	palmer, Frédéric Pétrot, philmd

Add the 128-bit versions of lui and auipc.
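
For illustration only (not part of the patch), a standalone C sketch of the
intended results.  U128 and the toy_* names are hypothetical; the immediate
is assumed to arrive already shifted left by 12, and the pc is assumed to be
64 bits wide and zero-extended.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t lo;
    uint64_t hi;
} U128;

/* lui: the decoded immediate is sign-extended into the upper half. */
static U128 toy_lui(int64_t imm)
{
    assert((imm & 0xfff) == 0);  /* assumed: already shifted left by 12 */
    U128 r = { (uint64_t)imm, imm < 0 ? UINT64_MAX : 0 };
    return r;
}

/* auipc: zero-extended 64-bit pc plus sign-extended immediate. */
static U128 toy_auipc(uint64_t pc, int64_t imm)
{
    U128 r = toy_lui(imm);
    uint64_t lo = pc + r.lo;

    r.hi += (lo < pc);  /* carry out of the low half */
    r.lo = lo;
    return r;
}
```

Note that a negative immediate added to a small pc wraps both halves back
to zero, which is why the carry has to be propagated.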

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
 target/riscv/insn_trans/trans_rvi.c.inc | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvi.c.inc b/target/riscv/insn_trans/trans_rvi.c.inc
index 92f41f3a86..b5e292a2aa 100644
--- a/target/riscv/insn_trans/trans_rvi.c.inc
+++ b/target/riscv/insn_trans/trans_rvi.c.inc
@@ -26,14 +26,17 @@ static bool trans_illegal(DisasContext *ctx, arg_empty *a)
 
 static bool trans_c64_illegal(DisasContext *ctx, arg_empty *a)
 {
-     REQUIRE_64BIT(ctx);
-     return trans_illegal(ctx, a);
+    REQUIRE_64_OR_128BIT(ctx);
+    return trans_illegal(ctx, a);
 }
 
 static bool trans_lui(DisasContext *ctx, arg_lui *a)
 {
     if (a->rd != 0) {
         tcg_gen_movi_tl(cpu_gpr[a->rd], a->imm);
+        if (get_xl_max(ctx) == MXL_RV128) {
+            tcg_gen_movi_tl(cpu_gprh[a->rd], -(a->imm < 0));
+        }
     }
     return true;
 }
@@ -41,7 +44,19 @@ static bool trans_lui(DisasContext *ctx, arg_lui *a)
 static bool trans_auipc(DisasContext *ctx, arg_auipc *a)
 {
     if (a->rd != 0) {
+        if (get_xl_max(ctx) == MXL_RV128) {
+            /* TODO : when pc is 128 bits, use all its bits */
+            TCGv pc = tcg_constant_tl(ctx->base.pc_next),
+                 imml = tcg_constant_tl(a->imm),
+                 immh = tcg_constant_tl(-(a->imm < 0)),
+                 zero = tcg_constant_tl(0);
+            tcg_gen_add2_tl(cpu_gpr[a->rd], cpu_gprh[a->rd],
+                            pc, zero,
+                            imml, immh);
+            return true;
+        }
         tcg_gen_movi_tl(cpu_gpr[a->rd], a->imm + ctx->base.pc_next);
+        return true;
     }
     return true;
 }
-- 
2.33.0




* [PATCH v3 13/21] target/riscv: support for 128-bit shift instructions
  2021-10-19  9:47 [PATCH v3 00/21] Adding partial support for 128-bit riscv target Frédéric Pétrot
                   ` (11 preceding siblings ...)
  2021-10-19  9:48 ` [PATCH v3 12/21] target/riscv: support for 128-bit U-type instructions Frédéric Pétrot
@ 2021-10-19  9:48 ` Frédéric Pétrot
  2021-10-20 19:06   ` Richard Henderson
  2021-10-19  9:48 ` [PATCH v3 14/21] target/riscv: support for 128-bit arithmetic instructions Frédéric Pétrot
                   ` (7 subsequent siblings)
  20 siblings, 1 reply; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-19  9:48 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: bin.meng, richard.henderson, alistair.francis, fabien.portas,
	palmer, Frédéric Pétrot, philmd

Handle shifts for operation lengths of 32, 64 and 128 bits on RV128,
following the general framework for handling the various olens proposed by
Richard.
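
For illustration only (not part of the patch), a standalone C sketch of a
left shift by immediate for two of the cases: a full 128-bit shift built
from 64-bit halves, and a 64-bit operation on RV128 whose result is
sign-extended to 128 bits.  U128 and the toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t lo;
    uint64_t hi;
} U128;

/* Operation length 128: shift across both halves. */
static U128 toy_slli128(U128 a, unsigned shamt)
{
    U128 r;

    assert(shamt < 128);
    if (shamt == 0) {
        r = a;  /* avoids an undefined 64-bit shift by 64 below */
    } else if (shamt < 64) {
        r.lo = a.lo << shamt;
        r.hi = (a.hi << shamt) | (a.lo >> (64 - shamt));
    } else {
        r.lo = 0;
        r.hi = a.lo << (shamt - 64);
    }
    return r;
}

/* Operation length 64 on RV128: shift the low half, then sign-extend. */
static U128 toy_slli64(U128 a, unsigned shamt)
{
    U128 r;

    assert(shamt < 64);
    r.lo = a.lo << shamt;
    r.hi = (r.lo >> 63) ? UINT64_MAX : 0;
    return r;
}
```

The right shifts follow the same three-way split on the shift amount, with
the arithmetic variant filling from the sign bit of the upper half.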

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
 target/riscv/insn32.decode              |  10 +
 target/riscv/translate.c                |  96 ++++++++--
 target/riscv/insn_trans/trans_rvb.c.inc |  22 +--
 target/riscv/insn_trans/trans_rvi.c.inc | 238 ++++++++++++++++++++++--
 4 files changed, 321 insertions(+), 45 deletions(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 1e7ddecc22..c642f6d09d 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -22,6 +22,7 @@
 %rs1       15:5
 %rd        7:5
 %sh5       20:5
+%sh6       20:6
 
 %sh7    20:7
 %csr    20:12
@@ -92,6 +93,9 @@
 # Formats 64:
 @sh5     .......  ..... .....  ... ..... ....... &shift  shamt=%sh5      %rs1 %rd
 
+# Formats 128:
+@sh6       ...... ...... ..... ... ..... ....... &shift shamt=%sh6 %rs1 %rd
+
 # *** Privileged Instructions ***
 ecall       000000000000     00000 000 00000 1110011
 ebreak      000000000001     00000 000 00000 1110011
@@ -167,6 +171,12 @@ sraw     0100000 .....  ..... 101 ..... 0111011 @r
 ldu      ............   ..... 111 ..... 0000011 @i
 lq       ............   ..... 010 ..... 0001111 @i
 sq       ............   ..... 100 ..... 0100011 @s
+sllid    000000 ......  ..... 001 ..... 1011011 @sh6
+srlid    000000 ......  ..... 101 ..... 1011011 @sh6
+sraid    010000 ......  ..... 101 ..... 1011011 @sh6
+slld     0000000 ..... .....  001 ..... 1111011 @r
+srld     0000000 ..... .....  101 ..... 1111011 @r
+srad     0100000 ..... .....  101 ..... 1111011 @r
 # *** RV32M Standard Extension ***
 mul      0000001 .....  ..... 000 ..... 0110011 @r
 mulh     0000001 .....  ..... 001 ..... 0110011 @r
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 71982f6284..67a82a0855 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -419,6 +419,22 @@ static int ex_rvc_shifti(DisasContext *ctx, int imm)
 /* Include the auto-generated decoder for 32 bit insn */
 #include "decode-insn32.c.inc"
 
+/*
+ *  xlm  xl   ol   tl   func   remark
+ * ----+----+----+----+------+-------------------
+ *  32   32   32   32   f_tl
+ *  64   64   64   64   f_tl
+ *  64   64   32   64   f_32  sign extends to 64
+ *  64   32   32   64   f_32  sign extends to 64
+ * 128  128  128   64   f_128
+ * 128  128   64   64   f_tl  sign extends to 128
+ * 128  128   32   64   f_32  sign extends to 128
+ * 128   64   64   64   f_tl  sign extends to 128
+ * 128   64   32   64   f_32  sign extends to 128
+ * 128   32   32   64   f_32  sign extends to 128
+ * ----+----+----+----+------+-------------------
+ */
+
 static bool gen_logic_imm_fn(DisasContext *ctx, arg_i *a, DisasExtend ext,
                              void (*func)(TCGv, TCGv, target_long))
 {
@@ -523,7 +539,8 @@ static bool gen_arith_per_ol(DisasContext *ctx, arg_r *a, DisasExtend ext,
 }
 
 static bool gen_shift_imm_fn(DisasContext *ctx, arg_shift *a, DisasExtend ext,
-                             void (*func)(TCGv, TCGv, target_long))
+                             void (*func)(TCGv, TCGv, target_long),
+                             void (*f128)(TCGv, TCGv, TCGv, TCGv, target_long))
 {
     TCGv dest, src1;
     int max_len = get_olen(ctx);
@@ -532,29 +549,52 @@ static bool gen_shift_imm_fn(DisasContext *ctx, arg_shift *a, DisasExtend ext,
         return false;
     }
 
-    dest = dest_gpr(ctx, a->rd);
-    src1 = get_gpr(ctx, a->rs1, ext);
+    if (get_xl_max(ctx) < MXL_RV128) {
+        dest = dest_gpr(ctx, a->rd);
+        src1 = get_gpr(ctx, a->rs1, ext);
 
-    func(dest, src1, a->shamt);
+        func(dest, src1, a->shamt);
 
-    gen_set_gpr(ctx, a->rd, dest);
+        gen_set_gpr(ctx, a->rd, dest);
+    } else {
+        TCGv src1l = get_gpr(ctx, a->rs1, ext),
+             src1h = get_gprh(ctx, a->rs1),
+             destl = tcg_temp_new(),
+             desth = tcg_temp_new();
+
+        if (max_len < 128) {
+            func(destl, src1l, a->shamt);
+            gen_set_gpr(ctx, a->rd, destl);
+            gen_set_gprh(ctx, a->rd, desth);
+        } else {
+            assert(f128 != NULL);
+            f128(destl, desth, src1l, src1h, a->shamt);
+            gen_set_gpr(ctx, a->rd, destl);
+            gen_set_gprh(ctx, a->rd, desth);
+        }
+
+        tcg_temp_free(destl);
+        tcg_temp_free(desth);
+    }
     return true;
 }
 
 static bool gen_shift_imm_fn_per_ol(DisasContext *ctx, arg_shift *a,
                                     DisasExtend ext,
                                     void (*f_tl)(TCGv, TCGv, target_long),
-                                    void (*f_32)(TCGv, TCGv, target_long))
+                                    void (*f_32)(TCGv, TCGv, target_long),
+                                    void (*f_128)(TCGv, TCGv, TCGv, TCGv,
+                                                  target_long))
 {
     int olen = get_olen(ctx);
     if (olen != TARGET_LONG_BITS) {
         if (olen == 32) {
             f_tl = f_32;
-        } else {
+        } else if (olen != 128) {
             g_assert_not_reached();
         }
     }
-    return gen_shift_imm_fn(ctx, a, ext, f_tl);
+    return gen_shift_imm_fn(ctx, a, ext, f_tl, f_128);
 }
 
 static bool gen_shift_imm_tl(DisasContext *ctx, arg_shift *a, DisasExtend ext,
@@ -578,34 +618,58 @@ static bool gen_shift_imm_tl(DisasContext *ctx, arg_shift *a, DisasExtend ext,
 }
 
 static bool gen_shift(DisasContext *ctx, arg_r *a, DisasExtend ext,
-                      void (*func)(TCGv, TCGv, TCGv))
+                      void (*func)(TCGv, TCGv, TCGv),
+                      void (*f128)(TCGv, TCGv, TCGv, TCGv, TCGv))
 {
-    TCGv dest = dest_gpr(ctx, a->rd);
-    TCGv src1 = get_gpr(ctx, a->rs1, ext);
     TCGv src2 = get_gpr(ctx, a->rs2, EXT_NONE);
     TCGv ext2 = tcg_temp_new();
 
     tcg_gen_andi_tl(ext2, src2, get_olen(ctx) - 1);
-    func(dest, src1, ext2);
 
-    gen_set_gpr(ctx, a->rd, dest);
+    if (get_xl_max(ctx) < MXL_RV128) {
+        TCGv dest = dest_gpr(ctx, a->rd);
+        TCGv src1 = get_gpr(ctx, a->rs1, ext);
+        func(dest, src1, ext2);
+
+        gen_set_gpr(ctx, a->rd, dest);
+    } else {
+        TCGv src1l = get_gpr(ctx, a->rs1, ext),
+             src1h = get_gprh(ctx, a->rs1),
+             destl = tcg_temp_new(),
+             desth = tcg_temp_new();
+
+        if (get_olen(ctx) < 128) {
+            func(destl, src1l, ext2);
+            gen_set_gpr(ctx, a->rd, destl);
+            gen_set_gprh(ctx, a->rd, desth);
+        } else {
+            assert(f128 != NULL);
+            f128(destl, desth, src1l, src1h, ext2);
+            gen_set_gpr(ctx, a->rd, destl);
+            gen_set_gprh(ctx, a->rd, desth);
+        }
+
+        tcg_temp_free(destl);
+        tcg_temp_free(desth);
+    }
     tcg_temp_free(ext2);
     return true;
 }
 
 static bool gen_shift_per_ol(DisasContext *ctx, arg_r *a, DisasExtend ext,
                              void (*f_tl)(TCGv, TCGv, TCGv),
-                             void (*f_32)(TCGv, TCGv, TCGv))
+                             void (*f_32)(TCGv, TCGv, TCGv),
+                             void (*f_128)(TCGv, TCGv, TCGv, TCGv, TCGv))
 {
     int olen = get_olen(ctx);
     if (olen != TARGET_LONG_BITS) {
         if (olen == 32) {
             f_tl = f_32;
-        } else {
+        } else if (olen != 128) {
             g_assert_not_reached();
         }
     }
-    return gen_shift(ctx, a, ext, f_tl);
+    return gen_shift(ctx, a, ext, f_tl, f_128);
 }
 
 static bool gen_unary(DisasContext *ctx, arg_r2 *a, DisasExtend ext,
diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
index 28f911f95d..cae97ed842 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -156,7 +156,7 @@ static void gen_bset(TCGv ret, TCGv arg1, TCGv shamt)
 static bool trans_bset(DisasContext *ctx, arg_bset *a)
 {
     REQUIRE_ZBS(ctx);
-    return gen_shift(ctx, a, EXT_NONE, gen_bset);
+    return gen_shift(ctx, a, EXT_NONE, gen_bset, NULL);
 }
 
 static bool trans_bseti(DisasContext *ctx, arg_bseti *a)
@@ -178,7 +178,7 @@ static void gen_bclr(TCGv ret, TCGv arg1, TCGv shamt)
 static bool trans_bclr(DisasContext *ctx, arg_bclr *a)
 {
     REQUIRE_ZBS(ctx);
-    return gen_shift(ctx, a, EXT_NONE, gen_bclr);
+    return gen_shift(ctx, a, EXT_NONE, gen_bclr, NULL);
 }
 
 static bool trans_bclri(DisasContext *ctx, arg_bclri *a)
@@ -200,7 +200,7 @@ static void gen_binv(TCGv ret, TCGv arg1, TCGv shamt)
 static bool trans_binv(DisasContext *ctx, arg_binv *a)
 {
     REQUIRE_ZBS(ctx);
-    return gen_shift(ctx, a, EXT_NONE, gen_binv);
+    return gen_shift(ctx, a, EXT_NONE, gen_binv, NULL);
 }
 
 static bool trans_binvi(DisasContext *ctx, arg_binvi *a)
@@ -218,7 +218,7 @@ static void gen_bext(TCGv ret, TCGv arg1, TCGv shamt)
 static bool trans_bext(DisasContext *ctx, arg_bext *a)
 {
     REQUIRE_ZBS(ctx);
-    return gen_shift(ctx, a, EXT_NONE, gen_bext);
+    return gen_shift(ctx, a, EXT_NONE, gen_bext, NULL);
 }
 
 static bool trans_bexti(DisasContext *ctx, arg_bexti *a)
@@ -248,7 +248,7 @@ static void gen_rorw(TCGv ret, TCGv arg1, TCGv arg2)
 static bool trans_ror(DisasContext *ctx, arg_ror *a)
 {
     REQUIRE_ZBB(ctx);
-    return gen_shift_per_ol(ctx, a, EXT_NONE, tcg_gen_rotr_tl, gen_rorw);
+    return gen_shift_per_ol(ctx, a, EXT_NONE, tcg_gen_rotr_tl, gen_rorw, NULL);
 }
 
 static void gen_roriw(TCGv ret, TCGv arg1, target_long shamt)
@@ -266,7 +266,7 @@ static bool trans_rori(DisasContext *ctx, arg_rori *a)
 {
     REQUIRE_ZBB(ctx);
     return gen_shift_imm_fn_per_ol(ctx, a, EXT_NONE,
-                                   tcg_gen_rotri_tl, gen_roriw);
+                                   tcg_gen_rotri_tl, gen_roriw, NULL);
 }
 
 static void gen_rolw(TCGv ret, TCGv arg1, TCGv arg2)
@@ -290,7 +290,7 @@ static void gen_rolw(TCGv ret, TCGv arg1, TCGv arg2)
 static bool trans_rol(DisasContext *ctx, arg_rol *a)
 {
     REQUIRE_ZBB(ctx);
-    return gen_shift_per_ol(ctx, a, EXT_NONE, tcg_gen_rotl_tl, gen_rolw);
+    return gen_shift_per_ol(ctx, a, EXT_NONE, tcg_gen_rotl_tl, gen_rolw, NULL);
 }
 
 static void gen_rev8_32(TCGv ret, TCGv src1)
@@ -402,7 +402,7 @@ static bool trans_rorw(DisasContext *ctx, arg_rorw *a)
     REQUIRE_64BIT(ctx);
     REQUIRE_ZBB(ctx);
     ctx->ol = MXL_RV32;
-    return gen_shift(ctx, a, EXT_NONE, gen_rorw);
+    return gen_shift(ctx, a, EXT_NONE, gen_rorw, NULL);
 }
 
 static bool trans_roriw(DisasContext *ctx, arg_roriw *a)
@@ -410,7 +410,7 @@ static bool trans_roriw(DisasContext *ctx, arg_roriw *a)
     REQUIRE_64BIT(ctx);
     REQUIRE_ZBB(ctx);
     ctx->ol = MXL_RV32;
-    return gen_shift_imm_fn(ctx, a, EXT_NONE, gen_roriw);
+    return gen_shift_imm_fn(ctx, a, EXT_NONE, gen_roriw, NULL);
 }
 
 static bool trans_rolw(DisasContext *ctx, arg_rolw *a)
@@ -418,7 +418,7 @@ static bool trans_rolw(DisasContext *ctx, arg_rolw *a)
     REQUIRE_64BIT(ctx);
     REQUIRE_ZBB(ctx);
     ctx->ol = MXL_RV32;
-    return gen_shift(ctx, a, EXT_NONE, gen_rolw);
+    return gen_shift(ctx, a, EXT_NONE, gen_rolw, NULL);
 }
 
 #define GEN_SHADD_UW(SHAMT)                                       \
@@ -475,7 +475,7 @@ static bool trans_slli_uw(DisasContext *ctx, arg_slli_uw *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_ZBA(ctx);
-    return gen_shift_imm_fn(ctx, a, EXT_NONE, gen_slli_uw);
+    return gen_shift_imm_fn(ctx, a, EXT_NONE, gen_slli_uw, NULL);
 }
 
 static bool trans_clmul(DisasContext *ctx, arg_clmul *a)
diff --git a/target/riscv/insn_trans/trans_rvi.c.inc b/target/riscv/insn_trans/trans_rvi.c.inc
index b5e292a2aa..6e2c89cd5e 100644
--- a/target/riscv/insn_trans/trans_rvi.c.inc
+++ b/target/riscv/insn_trans/trans_rvi.c.inc
@@ -410,9 +410,22 @@ static bool trans_andi(DisasContext *ctx, arg_andi *a)
     return gen_logic_imm_fn(ctx, a, EXT_NONE, tcg_gen_andi_tl);
 }
 
+static void gen_slli_i128(TCGv retl, TCGv reth,
+                          TCGv src1l, TCGv src1h,
+                          target_long shamt)
+{
+    if (shamt >= 64) {
+        tcg_gen_shli_tl(reth, src1l, shamt - 64);
+        tcg_gen_movi_tl(retl, 0);
+    } else {
+        tcg_gen_extract2_tl(reth, src1l, src1h, 64 - shamt);
+        tcg_gen_shli_tl(retl, src1l, shamt);
+    }
+}
+
 static bool trans_slli(DisasContext *ctx, arg_slli *a)
 {
-    return gen_shift_imm_fn(ctx, a, EXT_NONE, tcg_gen_shli_tl);
+    return gen_shift_imm_fn(ctx, a, EXT_NONE, tcg_gen_shli_tl, gen_slli_i128);
 }
 
 static void gen_srliw(TCGv dst, TCGv src, target_long shamt)
@@ -420,10 +433,23 @@ static void gen_srliw(TCGv dst, TCGv src, target_long shamt)
     tcg_gen_extract_tl(dst, src, shamt, 32 - shamt);
 }
 
+static void gen_srli_i128(TCGv retl, TCGv reth,
+                          TCGv src1l, TCGv src1h,
+                          target_long shamt)
+{
+    if (shamt >= 64) {
+        tcg_gen_shri_tl(retl, src1h, shamt - 64);
+        tcg_gen_movi_tl(reth, 0);
+    } else {
+        tcg_gen_extract2_tl(retl, src1l, src1h, shamt);
+        tcg_gen_shri_tl(reth, src1h, shamt);
+    }
+}
+
 static bool trans_srli(DisasContext *ctx, arg_srli *a)
 {
     return gen_shift_imm_fn_per_ol(ctx, a, EXT_NONE,
-                                   tcg_gen_shri_tl, gen_srliw);
+                                   tcg_gen_shri_tl, gen_srliw, gen_srli_i128);
 }
 
 static void gen_sraiw(TCGv dst, TCGv src, target_long shamt)
@@ -431,10 +457,23 @@ static void gen_sraiw(TCGv dst, TCGv src, target_long shamt)
     tcg_gen_sextract_tl(dst, src, shamt, 32 - shamt);
 }
 
+static void gen_srai_i128(TCGv retl, TCGv reth,
+                          TCGv src1l, TCGv src1h,
+                          target_long shamt)
+{
+    if (shamt >= 64) {
+        tcg_gen_sari_tl(retl, src1h, shamt - 64);
+        tcg_gen_sari_tl(reth, src1h, 63);
+    } else {
+        tcg_gen_extract2_tl(retl, src1l, src1h, shamt);
+        tcg_gen_sari_tl(reth, src1h, shamt);
+    }
+}
+
 static bool trans_srai(DisasContext *ctx, arg_srai *a)
 {
     return gen_shift_imm_fn_per_ol(ctx, a, EXT_NONE,
-                                   tcg_gen_sari_tl, gen_sraiw);
+                                   tcg_gen_sari_tl, gen_sraiw, gen_srai_i128);
 }
 
 static bool trans_add(DisasContext *ctx, arg_add *a)
@@ -447,9 +486,75 @@ static bool trans_sub(DisasContext *ctx, arg_sub *a)
     return gen_arith(ctx, a, EXT_NONE, tcg_gen_sub_tl);
 }
 
+enum M128_DIR {
+    M128_LEFT,
+    M128_RIGHT,
+    M128_RIGHT_ARITH
+};
+/* Counts >= 64 (unsigned) yield 0, or the sign fill if M128_RIGHT_ARITH */
+static void gen_shift_mod128(TCGv ret, TCGv arg1, TCGv arg2, enum M128_DIR dir)
+{
+    TCGv tmp1 = tcg_temp_new(),
+         tmp2 = tcg_temp_new(),
+         sgn = tcg_temp_new(),
+         cnst_zero = tcg_constant_tl(0);
+
+    tcg_gen_setcondi_tl(TCG_COND_GEU, tmp1, arg2, 64);
+
+    tcg_gen_andi_tl(tmp2, arg2, 0x3f);
+    switch (dir) {
+    case M128_LEFT:
+        tcg_gen_shl_tl(tmp2, arg1, tmp2);
+        break;
+    case M128_RIGHT:
+        tcg_gen_shr_tl(tmp2, arg1, tmp2);
+        break;
+    case M128_RIGHT_ARITH:
+        tcg_gen_sar_tl(tmp2, arg1, tmp2);
+        break;
+    }
+
+    if (dir == M128_RIGHT_ARITH) {
+        tcg_gen_sari_tl(sgn, arg1, 63);
+        tcg_gen_movcond_tl(TCG_COND_NE, ret, tmp1, cnst_zero, sgn, tmp2);
+    } else {
+        tcg_gen_movcond_tl(TCG_COND_NE, ret, tmp1, cnst_zero, cnst_zero, tmp2);
+    }
+
+    tcg_temp_free(tmp1);
+    tcg_temp_free(tmp2);
+    tcg_temp_free(sgn);
+    return;
+}
+
+static void gen_sll_i128(TCGv destl, TCGv desth,
+                         TCGv src1l, TCGv src1h, TCGv shamt)
+{
+    TCGv tmp = tcg_temp_new();
+    /*
+     * From Hacker's Delight 2.17:
+     * y1 = x1 << n | x0 u>> (64 - n) | x0 << (n - 64)
+     */
+    gen_shift_mod128(desth, src1h, shamt, M128_LEFT);
+
+    tcg_gen_movi_tl(tmp, 64);
+    tcg_gen_sub_tl(tmp, tmp, shamt);
+    gen_shift_mod128(tmp, src1l, tmp, M128_RIGHT);
+    tcg_gen_or_tl(desth, desth, tmp);
+
+    tcg_gen_subi_tl(tmp, shamt, 64);
+    gen_shift_mod128(tmp, src1l, tmp, M128_LEFT);
+    tcg_gen_or_tl(desth, desth, tmp);
+
+    /* From Hacker's Delight 2.17: y0 = x0 << n */
+    gen_shift_mod128(destl, src1l, shamt, M128_LEFT);
+
+    tcg_temp_free(tmp);
+}
+
 static bool trans_sll(DisasContext *ctx, arg_sll *a)
 {
-    return gen_shift(ctx, a, EXT_NONE, tcg_gen_shl_tl);
+    return gen_shift(ctx, a, EXT_NONE, tcg_gen_shl_tl, gen_sll_i128);
 }
 
 static bool trans_slt(DisasContext *ctx, arg_slt *a)
@@ -462,14 +567,67 @@ static bool trans_sltu(DisasContext *ctx, arg_sltu *a)
     return gen_arith(ctx, a, EXT_SIGN, gen_sltu);
 }
 
+static void gen_srl_i128(TCGv destl, TCGv desth,
+                         TCGv src1l, TCGv src1h, TCGv shamt)
+{
+    TCGv tmp = tcg_temp_new();
+    /*
+     * From Hacker's Delight 2.17:
+     * y0 = x0 u>> n | x1 << (64 - n) | x1 u>> (n - 64)
+     */
+    gen_shift_mod128(destl, src1l, shamt, M128_RIGHT);
+
+    tcg_gen_movi_tl(tmp, 64);
+    tcg_gen_sub_tl(tmp, tmp, shamt);
+    gen_shift_mod128(tmp, src1h, tmp, M128_LEFT);
+    tcg_gen_or_tl(destl, destl, tmp);
+
+    tcg_gen_subi_tl(tmp, shamt, 64);
+    gen_shift_mod128(tmp, src1h, tmp, M128_RIGHT);
+    tcg_gen_or_tl(destl, destl, tmp);
+
+    /* From Hacker's Delight 2.17: y1 = x1 u>> n */
+    gen_shift_mod128(desth, src1h, shamt, M128_RIGHT);
+
+    tcg_temp_free(tmp);
+}
+
 static bool trans_srl(DisasContext *ctx, arg_srl *a)
 {
-    return gen_shift(ctx, a, EXT_ZERO, tcg_gen_shr_tl);
+    return gen_shift(ctx, a, EXT_ZERO, tcg_gen_shr_tl, gen_srl_i128);
+}
+
+static void gen_sra_i128(TCGv destl, TCGv desth,
+                         TCGv src1l, TCGv src1h, TCGv shamt)
+{
+    TCGv tmp1 = tcg_temp_new(),
+         tmp2 = tcg_temp_new(),
+         const64 = tcg_constant_tl(64);
+
+    /* Compute y0 value if n < 64: x0 u>> n | x1 << (64 - n) */
+    gen_shift_mod128(tmp1, src1l, shamt, M128_RIGHT);
+    tcg_gen_movi_tl(tmp2, 64);
+    tcg_gen_sub_tl(tmp2, tmp2, shamt);
+    gen_shift_mod128(tmp2, src1h, tmp2, M128_LEFT);
+    tcg_gen_or_tl(tmp1, tmp1, tmp2);
+
+    /* Compute y0 value if n >= 64: x1 s>> (n - 64) */
+    tcg_gen_subi_tl(tmp2, shamt, 64);
+    gen_shift_mod128(tmp2, src1h, tmp2, M128_RIGHT_ARITH);
+
+    /* Conditionally move one value or the other */
+    tcg_gen_movcond_tl(TCG_COND_LT, destl, shamt, const64, tmp1, tmp2);
+
+    /* y1 = x1 s>> n */
+    gen_shift_mod128(desth, src1h, shamt, M128_RIGHT_ARITH);
+
+    tcg_temp_free(tmp1);
+    tcg_temp_free(tmp2);
 }
 
 static bool trans_sra(DisasContext *ctx, arg_sra *a)
 {
-    return gen_shift(ctx, a, EXT_SIGN, tcg_gen_sar_tl);
+    return gen_shift(ctx, a, EXT_SIGN, tcg_gen_sar_tl, gen_sra_i128);
 }
 
 static bool trans_xor(DisasContext *ctx, arg_xor *a)
@@ -496,25 +654,47 @@ static bool trans_addiw(DisasContext *ctx, arg_addiw *a)
 
 static bool trans_slliw(DisasContext *ctx, arg_slliw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     ctx->ol = MXL_RV32;
-    return gen_shift_imm_fn(ctx, a, EXT_NONE, tcg_gen_shli_tl);
+    return gen_shift_imm_fn(ctx, a, EXT_NONE, tcg_gen_shli_tl, NULL);
 }
 
 static bool trans_srliw(DisasContext *ctx, arg_srliw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     ctx->ol = MXL_RV32;
-    return gen_shift_imm_fn(ctx, a, EXT_NONE, gen_srliw);
+    return gen_shift_imm_fn(ctx, a, EXT_NONE, gen_srliw, NULL);
 }
 
 static bool trans_sraiw(DisasContext *ctx, arg_sraiw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     ctx->ol = MXL_RV32;
-    return gen_shift_imm_fn(ctx, a, EXT_NONE, gen_sraiw);
+    return gen_shift_imm_fn(ctx, a, EXT_NONE, gen_sraiw, NULL);
 }
 
+static bool trans_sllid(DisasContext *ctx, arg_sllid *a)
+{
+    REQUIRE_128BIT(ctx);
+    ctx->ol = MXL_RV64;
+    return gen_shift_imm_fn(ctx, a, EXT_NONE, tcg_gen_shli_tl, NULL);
+}
+
+static bool trans_srlid(DisasContext *ctx, arg_srlid *a)
+{
+    REQUIRE_128BIT(ctx);
+    ctx->ol = MXL_RV64;
+    return gen_shift_imm_fn(ctx, a, EXT_NONE, tcg_gen_shri_tl, NULL);
+}
+
+static bool trans_sraid(DisasContext *ctx, arg_sraid *a)
+{
+    REQUIRE_128BIT(ctx);
+    ctx->ol = MXL_RV64;
+    return gen_shift_imm_fn(ctx, a, EXT_NONE, tcg_gen_sari_tl, NULL);
+}
+
+
 static bool trans_addw(DisasContext *ctx, arg_addw *a)
 {
     REQUIRE_64BIT(ctx);
@@ -531,25 +711,47 @@ static bool trans_subw(DisasContext *ctx, arg_subw *a)
 
 static bool trans_sllw(DisasContext *ctx, arg_sllw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     ctx->ol = MXL_RV32;
-    return gen_shift(ctx, a, EXT_NONE, tcg_gen_shl_tl);
+    return gen_shift(ctx, a, EXT_NONE, tcg_gen_shl_tl, NULL);
 }
 
 static bool trans_srlw(DisasContext *ctx, arg_srlw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     ctx->ol = MXL_RV32;
-    return gen_shift(ctx, a, EXT_ZERO, tcg_gen_shr_tl);
+    return gen_shift(ctx, a, EXT_ZERO, tcg_gen_shr_tl, NULL);
 }
 
 static bool trans_sraw(DisasContext *ctx, arg_sraw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     ctx->ol = MXL_RV32;
-    return gen_shift(ctx, a, EXT_SIGN, tcg_gen_sar_tl);
+    return gen_shift(ctx, a, EXT_SIGN, tcg_gen_sar_tl, NULL);
 }
 
+static bool trans_slld(DisasContext *ctx, arg_slld *a)
+{
+    REQUIRE_128BIT(ctx);
+    ctx->ol = MXL_RV64;
+    return gen_shift(ctx, a, EXT_NONE, tcg_gen_shl_tl, NULL);
+}
+
+static bool trans_srld(DisasContext *ctx, arg_srld *a)
+{
+    REQUIRE_128BIT(ctx);
+    ctx->ol = MXL_RV64;
+    return gen_shift(ctx, a, EXT_ZERO, tcg_gen_shr_tl, NULL);
+}
+
+static bool trans_srad(DisasContext *ctx, arg_srad *a)
+{
+    REQUIRE_128BIT(ctx);
+    ctx->ol = MXL_RV64;
+    return gen_shift(ctx, a, EXT_SIGN, tcg_gen_sar_tl, NULL);
+}
+
+
 static bool trans_fence(DisasContext *ctx, arg_fence *a)
 {
     /* FENCE is a full memory barrier. */
-- 
2.33.0
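
Note for readers: the variable shifts in the patch above follow the Hacker's
Delight 2.17 formulas quoted in its comments. The block below is a minimal
host-side C sketch of that arithmetic, not part of the patch and not TCG code.
The names u128_model, shl_mod128, shr_mod128, sll128_model and srl128_model are
hypothetical and exist only for this illustration. It assumes, as
gen_shift_mod128 does, that any count which is >= 64 when compared as unsigned
contributes 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit value split into two 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_model;

/* 64-bit left shift where any count >= 64 (unsigned compare) yields 0. */
static uint64_t shl_mod128(uint64_t x, uint64_t n)
{
    return n >= 64 ? 0 : x << n;
}

/* 64-bit logical right shift with the same count rule. */
static uint64_t shr_mod128(uint64_t x, uint64_t n)
{
    return n >= 64 ? 0 : x >> n;
}

/* y1 = x1 << n | x0 u>> (64 - n) | x0 << (n - 64);  y0 = x0 << n */
static u128_model sll128_model(uint64_t lo, uint64_t hi, uint64_t n)
{
    u128_model y;

    assert(n < 128);
    /* 64 - n and n - 64 wrap around when out of range, giving a 0 term. */
    y.hi = shl_mod128(hi, n) | shr_mod128(lo, 64 - n) | shl_mod128(lo, n - 64);
    y.lo = shl_mod128(lo, n);
    return y;
}

/* y0 = x0 u>> n | x1 << (64 - n) | x1 u>> (n - 64);  y1 = x1 u>> n */
static u128_model srl128_model(uint64_t lo, uint64_t hi, uint64_t n)
{
    u128_model y;

    assert(n < 128);
    y.lo = shr_mod128(lo, n) | shl_mod128(hi, 64 - n) | shr_mod128(hi, n - 64);
    y.hi = shr_mod128(hi, n);
    return y;
}
```

With this model, sll128_model(1, 0, 64) moves the low bit into the high half
(hi == 1, lo == 0), and srl128_model(0, 1, 64) moves it back. For each shift
amount at most one of the two cross-half terms is non-zero, except at n == 64
where both equal the other half unshifted, so OR-ing them is harmless.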




* [PATCH v3 14/21] target/riscv: support for 128-bit arithmetic instructions
  2021-10-19  9:47 [PATCH v3 00/21] Adding partial support for 128-bit riscv target Frédéric Pétrot
                   ` (12 preceding siblings ...)
  2021-10-19  9:48 ` [PATCH v3 13/21] target/riscv: support for 128-bit shift instructions Frédéric Pétrot
@ 2021-10-19  9:48 ` Frédéric Pétrot
  2021-10-20 20:15   ` Richard Henderson
  2021-10-19  9:48 ` [PATCH v3 15/21] target/riscv: support for 128-bit M extension Frédéric Pétrot
                   ` (6 subsequent siblings)
  20 siblings, 1 reply; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-19  9:48 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: bin.meng, richard.henderson, alistair.francis, fabien.portas,
	palmer, Frédéric Pétrot, philmd

Add the 128-bit adds and subs in their various sizes, together with the
128-bit comparisons needed by conditional branches.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
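
Note for readers: gen_setcond_i128 in this patch derives the signed and
unsigned "less than" results from the sign bit of a 128-bit subtraction
(Hacker's Delight 2.12). The block below is a minimal host-side C sketch of
those two predicates, not part of the patch and not TCG code. The names
sub128_hi, lt128_model, ltu128_model and lt128_selfcheck are hypothetical and
exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* High 64 bits of the 128-bit difference a - b, with borrow from the low halves. */
static uint64_t sub128_hi(uint64_t al, uint64_t ah, uint64_t bl, uint64_t bh)
{
    return ah - bh - (al < bl);
}

/* Signed a < b: sign bit of (a - b) ^ ((a ^ b) & ((a - b) ^ a)). */
static int lt128_model(uint64_t al, uint64_t ah, uint64_t bl, uint64_t bh)
{
    uint64_t rh = sub128_hi(al, ah, bl, bh);
    uint64_t t = rh ^ ((rh ^ ah) & (ah ^ bh));

    return (int)(t >> 63);
}

/* Unsigned a < b: sign bit of (~a & b) | (~(a ^ b) & (a - b)). */
static int ltu128_model(uint64_t al, uint64_t ah, uint64_t bl, uint64_t bh)
{
    uint64_t rh = sub128_hi(al, ah, bl, bh);
    uint64_t t = (~(ah ^ bh) & rh) | (bh & ~ah);

    return (int)(t >> 63);
}

/* Small usage example: -1 versus 0 differs between the two orderings. */
static void lt128_selfcheck(void)
{
    assert(lt128_model(UINT64_MAX, UINT64_MAX, 0, 0) == 1);
    assert(ltu128_model(UINT64_MAX, UINT64_MAX, 0, 0) == 0);
}
```

Only the high halves enter the final bitwise expression because the sign bit of
a 128-bit value lives in its upper 64 bits; the low halves matter only through
the borrow. GE and GEU are then the LT and LTU results XORed with 1, as in the
patch.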
 target/riscv/insn32.decode              |   3 +
 target/riscv/translate.c                | 105 ++++++++++++---
 target/riscv/insn_trans/trans_rvb.c.inc |  20 +--
 target/riscv/insn_trans/trans_rvi.c.inc | 169 ++++++++++++++++++++++--
 target/riscv/insn_trans/trans_rvm.c.inc |  26 ++--
 5 files changed, 266 insertions(+), 57 deletions(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index c642f6d09d..3556bf49cc 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -171,9 +171,12 @@ sraw     0100000 .....  ..... 101 ..... 0111011 @r
 ldu      ............   ..... 111 ..... 0000011 @i
 lq       ............   ..... 010 ..... 0001111 @i
 sq       ............   ..... 100 ..... 0100011 @s
+addid    ............  .....  000 ..... 1011011 @i
 sllid    000000 ......  ..... 001 ..... 1011011 @sh6
 srlid    000000 ......  ..... 101 ..... 1011011 @sh6
 sraid    010000 ......  ..... 101 ..... 1011011 @sh6
+addd     0000000 ..... .....  000 ..... 1111011 @r
+subd     0100000 ..... .....  000 ..... 1111011 @r
 slld     0000000 ..... .....  001 ..... 1111011 @r
 srld     0000000 ..... .....  101 ..... 1111011 @r
 srad     0100000 ..... .....  101 ..... 1111011 @r
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 67a82a0855..332a5d0384 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -485,57 +485,122 @@ static bool gen_logic(DisasContext *ctx, arg_r *a, DisasExtend ext,
 }
 
 static bool gen_arith_imm_fn(DisasContext *ctx, arg_i *a, DisasExtend ext,
-                             void (*func)(TCGv, TCGv, target_long))
+                             void (*func)(TCGv, TCGv, target_long),
+                             void (*f128)(TCGv, TCGv, TCGv, TCGv, target_long))
 {
-    TCGv dest = dest_gpr(ctx, a->rd);
-    TCGv src1 = get_gpr(ctx, a->rs1, ext);
+    if (get_xl_max(ctx) < MXL_RV128) {
+        TCGv dest = dest_gpr(ctx, a->rd);
+        TCGv src1 = get_gpr(ctx, a->rs1, ext);
 
-    func(dest, src1, a->imm);
+        func(dest, src1, a->imm);
 
-    gen_set_gpr(ctx, a->rd, dest);
+        gen_set_gpr(ctx, a->rd, dest);
+    } else {
+        TCGv src1l = get_gpr(ctx, a->rs1, ext),
+             src1h = get_gprh(ctx, a->rs1),
+             destl = tcg_temp_new(),
+             desth = tcg_temp_new();
+
+        if (get_ol(ctx) < MXL_RV128) {
+            func(destl, src1l, a->imm);
+            gen_set_gpr(ctx, a->rd, destl);
+            gen_set_gprh(ctx, a->rd, desth);
+        } else {
+            assert(f128 != NULL);
+            f128(destl, desth, src1l, src1h, a->imm);
+            gen_set_gpr(ctx, a->rd, destl);
+            gen_set_gprh(ctx, a->rd, desth);
+        }
+
+        tcg_temp_free(destl);
+        tcg_temp_free(desth);
+    }
     return true;
 }
 
 static bool gen_arith_imm_tl(DisasContext *ctx, arg_i *a, DisasExtend ext,
-                             void (*func)(TCGv, TCGv, TCGv))
+                             void (*func)(TCGv, TCGv, TCGv),
+                             void (*f128)(TCGv, TCGv, TCGv, TCGv, TCGv, TCGv))
 {
-    TCGv dest = dest_gpr(ctx, a->rd);
-    TCGv src1 = get_gpr(ctx, a->rs1, ext);
-    TCGv src2 = tcg_constant_tl(a->imm);
 
-    func(dest, src1, src2);
+    if (get_xl_max(ctx) < MXL_RV128) {
+        TCGv dest = dest_gpr(ctx, a->rd);
+        TCGv src1 = get_gpr(ctx, a->rs1, ext);
+        TCGv src2 = tcg_constant_tl(a->imm);
 
-    gen_set_gpr(ctx, a->rd, dest);
+        func(dest, src1, src2);
+
+        gen_set_gpr(ctx, a->rd, dest);
+    } else {
+        TCGv src1l = get_gpr(ctx, a->rs1, ext),
+             src1h = get_gprh(ctx, a->rs1),
+             src2l = tcg_constant_tl(a->imm),
+             src2h = tcg_constant_tl(-(a->imm < 0)),
+             destl = tcg_temp_new(),
+             desth = tcg_temp_new();
+
+        assert(f128 != NULL);
+        f128(destl, desth, src1l, src1h, src2l, src2h);
+        gen_set_gpr(ctx, a->rd, destl);
+        gen_set_gprh(ctx, a->rd, desth);
+        tcg_temp_free(destl);
+        tcg_temp_free(desth);
+    }
     return true;
 }
 
 static bool gen_arith(DisasContext *ctx, arg_r *a, DisasExtend ext,
-                      void (*func)(TCGv, TCGv, TCGv))
+                      void (*func)(TCGv, TCGv, TCGv),
+                      void (*f128)(TCGv, TCGv, TCGv, TCGv, TCGv, TCGv))
 {
-    TCGv dest = dest_gpr(ctx, a->rd);
-    TCGv src1 = get_gpr(ctx, a->rs1, ext);
-    TCGv src2 = get_gpr(ctx, a->rs2, ext);
+    if (get_xl_max(ctx) < MXL_RV128) {
+        TCGv dest = dest_gpr(ctx, a->rd);
+        TCGv src1 = get_gpr(ctx, a->rs1, ext);
+        TCGv src2 = get_gpr(ctx, a->rs2, ext);
 
-    func(dest, src1, src2);
+        func(dest, src1, src2);
 
-    gen_set_gpr(ctx, a->rd, dest);
+        gen_set_gpr(ctx, a->rd, dest);
+    } else {
+        TCGv src1l = get_gpr(ctx, a->rs1, ext),
+             src1h = get_gprh(ctx, a->rs1),
+             src2l = get_gpr(ctx, a->rs2, ext),
+             src2h = get_gprh(ctx, a->rs2),
+             destl = tcg_temp_new(),
+             desth = tcg_temp_new();
+
+        if (get_ol(ctx) < MXL_RV128) {
+            func(destl, src1l, src2l);
+            gen_set_gpr(ctx, a->rd, destl);
+            gen_set_gprh(ctx, a->rd, desth);
+        } else {
+            assert(f128 != NULL);
+            f128(destl, desth, src1l, src1h, src2l, src2h);
+            gen_set_gpr(ctx, a->rd, destl);
+            gen_set_gprh(ctx, a->rd, desth);
+        }
+
+        tcg_temp_free(destl);
+        tcg_temp_free(desth);
+    }
     return true;
 }
 
 static bool gen_arith_per_ol(DisasContext *ctx, arg_r *a, DisasExtend ext,
                              void (*f_tl)(TCGv, TCGv, TCGv),
-                             void (*f_32)(TCGv, TCGv, TCGv))
+                             void (*f_32)(TCGv, TCGv, TCGv),
+                             void (*f_128)(TCGv, TCGv, TCGv, TCGv, TCGv, TCGv))
 {
     int olen = get_olen(ctx);
 
     if (olen != TARGET_LONG_BITS) {
         if (olen == 32) {
             f_tl = f_32;
-        } else {
+        } else if (olen != 128) {
             g_assert_not_reached();
         }
     }
-    return gen_arith(ctx, a, ext, f_tl);
+    return gen_arith(ctx, a, ext, f_tl, f_128);
 }
 
 static bool gen_shift_imm_fn(DisasContext *ctx, arg_shift *a, DisasExtend ext,
diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
index cae97ed842..764c0b7122 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -104,25 +104,25 @@ static bool trans_xnor(DisasContext *ctx, arg_xnor *a)
 static bool trans_min(DisasContext *ctx, arg_min *a)
 {
     REQUIRE_ZBB(ctx);
-    return gen_arith(ctx, a, EXT_SIGN, tcg_gen_smin_tl);
+    return gen_arith(ctx, a, EXT_SIGN, tcg_gen_smin_tl, NULL);
 }
 
 static bool trans_max(DisasContext *ctx, arg_max *a)
 {
     REQUIRE_ZBB(ctx);
-    return gen_arith(ctx, a, EXT_SIGN, tcg_gen_smax_tl);
+    return gen_arith(ctx, a, EXT_SIGN, tcg_gen_smax_tl, NULL);
 }
 
 static bool trans_minu(DisasContext *ctx, arg_minu *a)
 {
     REQUIRE_ZBB(ctx);
-    return gen_arith(ctx, a, EXT_SIGN, tcg_gen_umin_tl);
+    return gen_arith(ctx, a, EXT_SIGN, tcg_gen_umin_tl, NULL);
 }
 
 static bool trans_maxu(DisasContext *ctx, arg_maxu *a)
 {
     REQUIRE_ZBB(ctx);
-    return gen_arith(ctx, a, EXT_SIGN, tcg_gen_umax_tl);
+    return gen_arith(ctx, a, EXT_SIGN, tcg_gen_umax_tl, NULL);
 }
 
 static bool trans_sext_b(DisasContext *ctx, arg_sext_b *a)
@@ -354,7 +354,7 @@ GEN_SHADD(3)
 static bool trans_sh##SHAMT##add(DisasContext *ctx, arg_sh##SHAMT##add *a) \
 {                                                                          \
     REQUIRE_ZBA(ctx);                                                      \
-    return gen_arith(ctx, a, EXT_NONE, gen_sh##SHAMT##add);                \
+    return gen_arith(ctx, a, EXT_NONE, gen_sh##SHAMT##add, NULL);          \
 }
 
 GEN_TRANS_SHADD(1)
@@ -444,7 +444,7 @@ static bool trans_sh##SHAMT##add_uw(DisasContext *ctx,        \
 {                                                             \
     REQUIRE_64BIT(ctx);                                       \
     REQUIRE_ZBA(ctx);                                         \
-    return gen_arith(ctx, a, EXT_NONE, gen_sh##SHAMT##add_uw);  \
+    return gen_arith(ctx, a, EXT_NONE, gen_sh##SHAMT##add_uw, NULL); \
 }
 
 GEN_TRANS_SHADD_UW(1)
@@ -463,7 +463,7 @@ static bool trans_add_uw(DisasContext *ctx, arg_add_uw *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_ZBA(ctx);
-    return gen_arith(ctx, a, EXT_NONE, gen_add_uw);
+    return gen_arith(ctx, a, EXT_NONE, gen_add_uw, NULL);
 }
 
 static void gen_slli_uw(TCGv dest, TCGv src, target_long shamt)
@@ -481,7 +481,7 @@ static bool trans_slli_uw(DisasContext *ctx, arg_slli_uw *a)
 static bool trans_clmul(DisasContext *ctx, arg_clmul *a)
 {
     REQUIRE_ZBC(ctx);
-    return gen_arith(ctx, a, EXT_NONE, gen_helper_clmul);
+    return gen_arith(ctx, a, EXT_NONE, gen_helper_clmul, NULL);
 }
 
 static void gen_clmulh(TCGv dst, TCGv src1, TCGv src2)
@@ -493,11 +493,11 @@ static void gen_clmulh(TCGv dst, TCGv src1, TCGv src2)
 static bool trans_clmulh(DisasContext *ctx, arg_clmulr *a)
 {
     REQUIRE_ZBC(ctx);
-    return gen_arith(ctx, a, EXT_NONE, gen_clmulh);
+    return gen_arith(ctx, a, EXT_NONE, gen_clmulh, NULL);
 }
 
 static bool trans_clmulr(DisasContext *ctx, arg_clmulh *a)
 {
     REQUIRE_ZBC(ctx);
-    return gen_arith(ctx, a, EXT_NONE, gen_helper_clmulr);
+    return gen_arith(ctx, a, EXT_NONE, gen_helper_clmulr, NULL);
 }
diff --git a/target/riscv/insn_trans/trans_rvi.c.inc b/target/riscv/insn_trans/trans_rvi.c.inc
index 6e2c89cd5e..6497338842 100644
--- a/target/riscv/insn_trans/trans_rvi.c.inc
+++ b/target/riscv/insn_trans/trans_rvi.c.inc
@@ -97,13 +97,121 @@ static bool trans_jalr(DisasContext *ctx, arg_jalr *a)
     return true;
 }
 
+/*
+ * Comparison predicates using bitwise logic taken from Hacker's Delight, 2.12
+ * Only the sign bit is of interest, so rl serves only for the subtraction.
+ */
+static bool gen_setcond_i128(TCGv rl, TCGv rh,
+                             TCGv al, TCGv ah,
+                             TCGv bl, TCGv bh,
+                             TCGCond cond)
+{
+    switch (cond) {
+    case TCG_COND_EQ:
+        tcg_gen_setcond_tl(TCG_COND_EQ, rl, al, bl);
+        tcg_gen_setcond_tl(TCG_COND_EQ, rh, ah, bh);
+        tcg_gen_and_tl(rl, rl, rh);
+        break;
+
+    case TCG_COND_NE:
+        tcg_gen_setcond_tl(TCG_COND_NE, rl, al, bl);
+        tcg_gen_setcond_tl(TCG_COND_NE, rh, ah, bh);
+        tcg_gen_or_tl(rl, rl, rh);
+        break;
+
+    case TCG_COND_LT:
+    {
+        TCGv tmp1 = tcg_temp_new(),
+             tmp2 = tcg_temp_new();
+
+        tcg_gen_sub2_tl(rl, rh, al, ah, bl, bh);
+        tcg_gen_xor_tl(tmp1, rh, ah);
+        tcg_gen_xor_tl(tmp2, ah, bh);
+        tcg_gen_and_tl(tmp1, tmp1, tmp2);
+        tcg_gen_xor_tl(tmp1, rh, tmp1);
+        tcg_gen_shri_tl(rl, tmp1, 63);
+
+        tcg_temp_free(tmp1);
+        tcg_temp_free(tmp2);
+        break;
+    }
+
+    case TCG_COND_GE:
+        /* Invert result of TCG_COND_LT */
+        gen_setcond_i128(rl, rh, al, ah, bl, bh, TCG_COND_LT);
+        tcg_gen_xori_tl(rl, rl, 1);
+        break;
+
+    case TCG_COND_LTU:
+    {
+        TCGv tmp1 = tcg_temp_new(),
+             tmp2 = tcg_temp_new();
+
+        tcg_gen_sub2_tl(rl, rh, al, ah, bl, bh);
+        tcg_gen_eqv_tl(tmp1, ah, bh);
+        tcg_gen_and_tl(tmp1, tmp1, rh);
+        tcg_gen_andc_tl(tmp2, bh, ah);
+        tcg_gen_or_tl(tmp1, tmp1, tmp2);
+        tcg_gen_shri_tl(rl, tmp1, 63);
+
+        tcg_temp_free(tmp1);
+        tcg_temp_free(tmp2);
+        break;
+    }
+
+    case TCG_COND_GEU:
+        /* Invert result of TCG_COND_LTU */
+        gen_setcond_i128(rl, rh, al, ah, bl, bh, TCG_COND_LTU);
+        tcg_gen_xori_tl(rl, rl, 1);
+        break;
+
+    default:
+        return false;
+    }
+    tcg_gen_movi_tl(rh, 0);
+    return true;
+}
+
 static bool gen_branch(DisasContext *ctx, arg_b *a, TCGCond cond)
 {
     TCGLabel *l = gen_new_label();
     TCGv src1 = get_gpr(ctx, a->rs1, EXT_SIGN);
     TCGv src2 = get_gpr(ctx, a->rs2, EXT_SIGN);
 
-    tcg_gen_brcond_tl(cond, src1, src2, l);
+    if (get_xl(ctx) == MXL_RV128) {
+        TCGv src1h = get_gprh(ctx, a->rs1),
+             src2h = get_gprh(ctx, a->rs2),
+             tmpl = tcg_temp_new(),
+             tmph = tcg_temp_new();
+
+        /*
+         * Do not use gen_setcond_i128 for EQ and NE as these conditions are
+         * often met and can be more efficiently implemented.
+         */
+        if (cond == TCG_COND_EQ || cond == TCG_COND_NE) {
+            /*
+             * bnez and beqz are used quite often too, so let's optimize them,
+             * although QEMU's TCG optimizer handles these cases nicely.
+             */
+            if (a->rs2 == 0) {
+                tcg_gen_or_tl(tmpl, src1, src1h);
+                tcg_gen_brcondi_tl(cond, tmpl, 0, l);
+            } else {
+                tcg_gen_xor_tl(tmpl, src1, src2);
+                tcg_gen_xor_tl(tmph, src1h, src2h);
+                tcg_gen_or_tl(tmpl, tmpl, tmph);
+                tcg_gen_brcondi_tl(cond, tmpl, 0, l);
+            }
+        } else {
+            gen_setcond_i128(tmpl, tmph, src1, src1h, src2, src2h, cond);
+            tcg_gen_brcondi_tl(TCG_COND_NE, tmpl, 0, l);
+        }
+
+        tcg_temp_free(tmph);
+        tcg_temp_free(tmpl);
+    } else {
+        tcg_gen_brcond_tl(cond, src1, src2, l);
+    }
     gen_goto_tb(ctx, 1, ctx->pc_succ_insn);
 
     gen_set_label(l); /* branch taken */
@@ -370,9 +478,30 @@ static bool trans_sq(DisasContext *ctx, arg_sq *a)
     return gen_store(ctx, a, MO_TEO);
 }
 
+static bool trans_addd(DisasContext *ctx, arg_addd *a)
+{
+    REQUIRE_128BIT(ctx);
+    ctx->ol = MXL_RV64;
+    return gen_arith(ctx, a, EXT_NONE, tcg_gen_add_tl, NULL);
+}
+
+static bool trans_addid(DisasContext *ctx, arg_addid *a)
+{
+    REQUIRE_128BIT(ctx);
+    ctx->ol = MXL_RV64;
+    return gen_arith_imm_fn(ctx, a, EXT_NONE, tcg_gen_addi_tl, NULL);
+}
+
+static bool trans_subd(DisasContext *ctx, arg_subd *a)
+{
+    REQUIRE_128BIT(ctx);
+    ctx->ol = MXL_RV64;
+    return gen_arith(ctx, a, EXT_NONE, tcg_gen_sub_tl, NULL);
+}
+
 static bool trans_addi(DisasContext *ctx, arg_addi *a)
 {
-    return gen_arith_imm_fn(ctx, a, EXT_NONE, tcg_gen_addi_tl);
+    return gen_arith_imm_fn(ctx, a, EXT_NONE, tcg_gen_addi_tl, gen_addi2_i128);
 }
 
 static void gen_slt(TCGv ret, TCGv s1, TCGv s2)
@@ -380,19 +509,31 @@ static void gen_slt(TCGv ret, TCGv s1, TCGv s2)
     tcg_gen_setcond_tl(TCG_COND_LT, ret, s1, s2);
 }
 
+static void gen_slt_i128(TCGv retl, TCGv reth,
+                         TCGv s1l, TCGv s1h, TCGv s2l, TCGv s2h)
+{
+    gen_setcond_i128(retl, reth, s1l, s1h, s2l, s2h, TCG_COND_LT);
+}
+
 static void gen_sltu(TCGv ret, TCGv s1, TCGv s2)
 {
     tcg_gen_setcond_tl(TCG_COND_LTU, ret, s1, s2);
 }
 
+static void gen_sltu_i128(TCGv retl, TCGv reth,
+                          TCGv s1l, TCGv s1h, TCGv s2l, TCGv s2h)
+{
+    gen_setcond_i128(retl, reth, s1l, s1h, s2l, s2h, TCG_COND_LTU);
+}
+
 static bool trans_slti(DisasContext *ctx, arg_slti *a)
 {
-    return gen_arith_imm_tl(ctx, a, EXT_SIGN, gen_slt);
+    return gen_arith_imm_tl(ctx, a, EXT_SIGN, gen_slt, gen_slt_i128);
 }
 
 static bool trans_sltiu(DisasContext *ctx, arg_sltiu *a)
 {
-    return gen_arith_imm_tl(ctx, a, EXT_SIGN, gen_sltu);
+    return gen_arith_imm_tl(ctx, a, EXT_SIGN, gen_sltu, gen_sltu_i128);
 }
 
 static bool trans_xori(DisasContext *ctx, arg_xori *a)
@@ -478,12 +619,12 @@ static bool trans_srai(DisasContext *ctx, arg_srai *a)
 
 static bool trans_add(DisasContext *ctx, arg_add *a)
 {
-    return gen_arith(ctx, a, EXT_NONE, tcg_gen_add_tl);
+    return gen_arith(ctx, a, EXT_NONE, tcg_gen_add_tl, tcg_gen_add2_tl);
 }
 
 static bool trans_sub(DisasContext *ctx, arg_sub *a)
 {
-    return gen_arith(ctx, a, EXT_NONE, tcg_gen_sub_tl);
+    return gen_arith(ctx, a, EXT_NONE, tcg_gen_sub_tl, tcg_gen_sub2_tl);
 }
 
 enum M128_DIR {
@@ -559,12 +700,12 @@ static bool trans_sll(DisasContext *ctx, arg_sll *a)
 
 static bool trans_slt(DisasContext *ctx, arg_slt *a)
 {
-    return gen_arith(ctx, a, EXT_SIGN, gen_slt);
+    return gen_arith(ctx, a, EXT_SIGN, gen_slt, gen_slt_i128);
 }
 
 static bool trans_sltu(DisasContext *ctx, arg_sltu *a)
 {
-    return gen_arith(ctx, a, EXT_SIGN, gen_sltu);
+    return gen_arith(ctx, a, EXT_SIGN, gen_sltu, gen_sltu_i128);
 }
 
 static void gen_srl_i128(TCGv destl, TCGv desth,
@@ -647,9 +788,9 @@ static bool trans_and(DisasContext *ctx, arg_and *a)
 
 static bool trans_addiw(DisasContext *ctx, arg_addiw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     ctx->ol = MXL_RV32;
-    return gen_arith_imm_fn(ctx, a, EXT_NONE, tcg_gen_addi_tl);
+    return gen_arith_imm_fn(ctx, a, EXT_NONE, tcg_gen_addi_tl, NULL);
 }
 
 static bool trans_slliw(DisasContext *ctx, arg_slliw *a)
@@ -697,16 +838,16 @@ static bool trans_sraid(DisasContext *ctx, arg_sraid *a)
 
 static bool trans_addw(DisasContext *ctx, arg_addw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     ctx->ol = MXL_RV32;
-    return gen_arith(ctx, a, EXT_NONE, tcg_gen_add_tl);
+    return gen_arith(ctx, a, EXT_NONE, tcg_gen_add_tl, NULL);
 }
 
 static bool trans_subw(DisasContext *ctx, arg_subw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     ctx->ol = MXL_RV32;
-    return gen_arith(ctx, a, EXT_NONE, tcg_gen_sub_tl);
+    return gen_arith(ctx, a, EXT_NONE, tcg_gen_sub_tl, NULL);
 }
 
 static bool trans_sllw(DisasContext *ctx, arg_sllw *a)
diff --git a/target/riscv/insn_trans/trans_rvm.c.inc b/target/riscv/insn_trans/trans_rvm.c.inc
index 2af0e5c139..efe25dfc11 100644
--- a/target/riscv/insn_trans/trans_rvm.c.inc
+++ b/target/riscv/insn_trans/trans_rvm.c.inc
@@ -22,7 +22,7 @@
 static bool trans_mul(DisasContext *ctx, arg_mul *a)
 {
     REQUIRE_EXT(ctx, RVM);
-    return gen_arith(ctx, a, EXT_NONE, tcg_gen_mul_tl);
+    return gen_arith(ctx, a, EXT_NONE, tcg_gen_mul_tl, NULL);
 }
 
 static void gen_mulh(TCGv ret, TCGv s1, TCGv s2)
@@ -42,7 +42,7 @@ static void gen_mulh_w(TCGv ret, TCGv s1, TCGv s2)
 static bool trans_mulh(DisasContext *ctx, arg_mulh *a)
 {
     REQUIRE_EXT(ctx, RVM);
-    return gen_arith_per_ol(ctx, a, EXT_SIGN, gen_mulh, gen_mulh_w);
+    return gen_arith_per_ol(ctx, a, EXT_SIGN, gen_mulh, gen_mulh_w, NULL);
 }
 
 static void gen_mulhsu(TCGv ret, TCGv arg1, TCGv arg2)
@@ -76,7 +76,7 @@ static void gen_mulhsu_w(TCGv ret, TCGv arg1, TCGv arg2)
 static bool trans_mulhsu(DisasContext *ctx, arg_mulhsu *a)
 {
     REQUIRE_EXT(ctx, RVM);
-    return gen_arith_per_ol(ctx, a, EXT_NONE, gen_mulhsu, gen_mulhsu_w);
+    return gen_arith_per_ol(ctx, a, EXT_NONE, gen_mulhsu, gen_mulhsu_w, NULL);
 }
 
 static void gen_mulhu(TCGv ret, TCGv s1, TCGv s2)
@@ -91,7 +91,7 @@ static bool trans_mulhu(DisasContext *ctx, arg_mulhu *a)
 {
     REQUIRE_EXT(ctx, RVM);
     /* gen_mulh_w works for either sign as input. */
-    return gen_arith_per_ol(ctx, a, EXT_ZERO, gen_mulhu, gen_mulh_w);
+    return gen_arith_per_ol(ctx, a, EXT_ZERO, gen_mulhu, gen_mulh_w, NULL);
 }
 
 static void gen_div(TCGv ret, TCGv source1, TCGv source2)
@@ -130,7 +130,7 @@ static void gen_div(TCGv ret, TCGv source1, TCGv source2)
 static bool trans_div(DisasContext *ctx, arg_div *a)
 {
     REQUIRE_EXT(ctx, RVM);
-    return gen_arith(ctx, a, EXT_SIGN, gen_div);
+    return gen_arith(ctx, a, EXT_SIGN, gen_div, NULL);
 }
 
 static void gen_divu(TCGv ret, TCGv source1, TCGv source2)
@@ -158,7 +158,7 @@ static void gen_divu(TCGv ret, TCGv source1, TCGv source2)
 static bool trans_divu(DisasContext *ctx, arg_divu *a)
 {
     REQUIRE_EXT(ctx, RVM);
-    return gen_arith(ctx, a, EXT_ZERO, gen_divu);
+    return gen_arith(ctx, a, EXT_ZERO, gen_divu, NULL);
 }
 
 static void gen_rem(TCGv ret, TCGv source1, TCGv source2)
@@ -199,7 +199,7 @@ static void gen_rem(TCGv ret, TCGv source1, TCGv source2)
 static bool trans_rem(DisasContext *ctx, arg_rem *a)
 {
     REQUIRE_EXT(ctx, RVM);
-    return gen_arith(ctx, a, EXT_SIGN, gen_rem);
+    return gen_arith(ctx, a, EXT_SIGN, gen_rem, NULL);
 }
 
 static void gen_remu(TCGv ret, TCGv source1, TCGv source2)
@@ -227,7 +227,7 @@ static void gen_remu(TCGv ret, TCGv source1, TCGv source2)
 static bool trans_remu(DisasContext *ctx, arg_remu *a)
 {
     REQUIRE_EXT(ctx, RVM);
-    return gen_arith(ctx, a, EXT_ZERO, gen_remu);
+    return gen_arith(ctx, a, EXT_ZERO, gen_remu, NULL);
 }
 
 static bool trans_mulw(DisasContext *ctx, arg_mulw *a)
@@ -235,7 +235,7 @@ static bool trans_mulw(DisasContext *ctx, arg_mulw *a)
     REQUIRE_64BIT(ctx);
     REQUIRE_EXT(ctx, RVM);
     ctx->ol = MXL_RV32;
-    return gen_arith(ctx, a, EXT_NONE, tcg_gen_mul_tl);
+    return gen_arith(ctx, a, EXT_NONE, tcg_gen_mul_tl, NULL);
 }
 
 static bool trans_divw(DisasContext *ctx, arg_divw *a)
@@ -243,7 +243,7 @@ static bool trans_divw(DisasContext *ctx, arg_divw *a)
     REQUIRE_64BIT(ctx);
     REQUIRE_EXT(ctx, RVM);
     ctx->ol = MXL_RV32;
-    return gen_arith(ctx, a, EXT_SIGN, gen_div);
+    return gen_arith(ctx, a, EXT_SIGN, gen_div, NULL);
 }
 
 static bool trans_divuw(DisasContext *ctx, arg_divuw *a)
@@ -251,7 +251,7 @@ static bool trans_divuw(DisasContext *ctx, arg_divuw *a)
     REQUIRE_64BIT(ctx);
     REQUIRE_EXT(ctx, RVM);
     ctx->ol = MXL_RV32;
-    return gen_arith(ctx, a, EXT_ZERO, gen_divu);
+    return gen_arith(ctx, a, EXT_ZERO, gen_divu, NULL);
 }
 
 static bool trans_remw(DisasContext *ctx, arg_remw *a)
@@ -259,7 +259,7 @@ static bool trans_remw(DisasContext *ctx, arg_remw *a)
     REQUIRE_64BIT(ctx);
     REQUIRE_EXT(ctx, RVM);
     ctx->ol = MXL_RV32;
-    return gen_arith(ctx, a, EXT_SIGN, gen_rem);
+    return gen_arith(ctx, a, EXT_SIGN, gen_rem, NULL);
 }
 
 static bool trans_remuw(DisasContext *ctx, arg_remuw *a)
@@ -267,5 +267,5 @@ static bool trans_remuw(DisasContext *ctx, arg_remuw *a)
     REQUIRE_64BIT(ctx);
     REQUIRE_EXT(ctx, RVM);
     ctx->ol = MXL_RV32;
-    return gen_arith(ctx, a, EXT_ZERO, gen_remu);
+    return gen_arith(ctx, a, EXT_ZERO, gen_remu, NULL);
 }
-- 
2.33.0
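
The comparison predicates used by gen_setcond_i128 and the EQ/NE shortcut in
gen_branch can be illustrated in plain C. This is only a sketch on two 64-bit
limbs, not the TCG code from the patch; the type u128_limbs and the function
names are hypothetical and introduced here for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-limb value: lo holds bits 0..63, hi holds bits 64..127. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_limbs;

/* High word of a - b, including the borrow out of the low word. */
static uint64_t sub_hi(u128_limbs a, u128_limbs b)
{
    return a.hi - b.hi - (a.lo < b.lo);
}

/* Signed a < b: sign bit of (a - b) ^ ((a ^ b) & ((a - b) ^ a)). */
static bool lt_s128(u128_limbs a, u128_limbs b)
{
    uint64_t rh = sub_hi(a, b);
    uint64_t t = rh ^ ((a.hi ^ b.hi) & (rh ^ a.hi));
    return t >> 63;
}

/* Unsigned a < b: sign bit of (~a & b) | (~(a ^ b) & (a - b)). */
static bool lt_u128(u128_limbs a, u128_limbs b)
{
    uint64_t rh = sub_hi(a, b);
    uint64_t t = (~a.hi & b.hi) | (~(a.hi ^ b.hi) & rh);
    return t >> 63;
}

/* Equality with a single test: OR together the XOR of each limb pair. */
static bool eq_128(u128_limbs a, u128_limbs b)
{
    return ((a.lo ^ b.lo) | (a.hi ^ b.hi)) == 0;
}
```

Only the high word of the difference is inspected; the low words contribute
through the borrow alone, which is why the result low word is a mere
by-product of the subtraction. GE and GEU follow by inverting lt_s128 and
lt_u128.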




* [PATCH v3 15/21] target/riscv: support for 128-bit M extension
  2021-10-19  9:47 [PATCH v3 00/21] Adding partial support for 128-bit riscv target Frédéric Pétrot
                   ` (13 preceding siblings ...)
  2021-10-19  9:48 ` [PATCH v3 14/21] target/riscv: support for 128-bit arithmetic instructions Frédéric Pétrot
@ 2021-10-19  9:48 ` Frédéric Pétrot
  2021-10-20 20:58   ` Richard Henderson
  2021-10-19  9:48 ` [PATCH v3 16/21] target/riscv: adding high part of some csrs Frédéric Pétrot
                   ` (5 subsequent siblings)
  20 siblings, 1 reply; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-19  9:48 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: bin.meng, richard.henderson, alistair.francis, fabien.portas,
	palmer, Frédéric Pétrot, philmd

Given the complexity of implementing 128-bit division and remainder, we
call helpers to produce their behavior. Because returning a 128-bit value
from a helper is an issue, we add two more TCG globals through which the
helpers hand back their result; each helper is called from a wrapper that
is itself passed to gen_arith.
The sub-128-bit insns are now handled through the existing generation
functions.
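
The special cases the division and remainder helpers must cover can be
sketched in plain C as follows. This is an illustration, not the helper code
of this patch: it assumes a compiler providing __int128 (GCC and Clang do),
and the sketch_* names are hypothetical. The results for division by zero and
for signed overflow follow the RISC-V M extension rules; the sketch guards
the remainder overflow case explicitly because -2**127 % -1 is undefined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the compiler provides __int128 (a GCC/Clang extension). */
typedef __int128 s128;
typedef unsigned __int128 u128;

/* -2**127, built without overflowing a signed intermediate. */
#define S128_MIN (-((s128)1 << 126) * 2)

/* Unsigned division: x / 0 yields all ones. */
static u128 sketch_divu(u128 u, u128 v)
{
    return v == 0 ? ~(u128)0 : u / v;
}

/* Unsigned remainder: x % 0 yields the dividend. */
static u128 sketch_remu(u128 u, u128 v)
{
    return v == 0 ? u : u % v;
}

/* Signed division: x / 0 yields -1, overflow yields the dividend. */
static s128 sketch_divs(s128 u, s128 v)
{
    if (v == 0) {
        return -1;
    }
    if (u == S128_MIN && v == -1) {
        return u;
    }
    return u / v;
}

/* Signed remainder: x % 0 yields the dividend, overflow yields 0. */
static s128 sketch_rems(s128 u, s128 v)
{
    if (v == 0) {
        return u;
    }
    if (u == S128_MIN && v == -1) {
        return 0;
    }
    return u % v;
}
```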

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
 target/riscv/cpu.h                      |   1 +
 target/riscv/helper.h                   |   6 +
 target/riscv/insn32.decode              |   7 +
 target/riscv/m128_helper.c              | 109 ++++++++++
 target/riscv/translate.c                |   7 +-
 target/riscv/insn_trans/trans_rvm.c.inc | 263 ++++++++++++++++++++++--
 target/riscv/meson.build                |   1 +
 7 files changed, 380 insertions(+), 14 deletions(-)
 create mode 100644 target/riscv/m128_helper.c

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 5d21128865..8b96ccb37a 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -113,6 +113,7 @@ FIELD(VTYPE, VILL, sizeof(target_ulong) * 8 - 1, 1)
 struct CPURISCVState {
     target_ulong gpr[32];
     target_ulong gprh[32]; /* 64 top bits of the 128-bit registers */
+    target_ulong hlpr[2];  /* scratch registers for 128-bit div/rem helpers */
     uint64_t fpr[32]; /* assume both F and D extensions */
 
     /* vector coprocessor state. */
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index c7a5376227..67f5d23692 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1147,3 +1147,9 @@ DEF_HELPER_6(vcompress_vm_b, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vcompress_vm_h, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vcompress_vm_w, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vcompress_vm_d, void, ptr, ptr, ptr, ptr, env, i32)
+
+/* 128-bit integer division and remainder */
+DEF_HELPER_5(divu_i128, void, env, i64, i64, i64, i64)
+DEF_HELPER_5(divs_i128, void, env, i64, i64, i64, i64)
+DEF_HELPER_5(remu_i128, void, env, i64, i64, i64, i64)
+DEF_HELPER_5(rems_i128, void, env, i64, i64, i64, i64)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 3556bf49cc..876e5f7f5b 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -197,6 +197,13 @@ divuw    0000001 .....  ..... 101 ..... 0111011 @r
 remw     0000001 .....  ..... 110 ..... 0111011 @r
 remuw    0000001 .....  ..... 111 ..... 0111011 @r
 
+# *** RV128M Standard Extension (in addition to RV64M) ***
+muld     0000001 .....  ..... 000 ..... 1111011 @r
+divd     0000001 .....  ..... 100 ..... 1111011 @r
+divud    0000001 .....  ..... 101 ..... 1111011 @r
+remd     0000001 .....  ..... 110 ..... 1111011 @r
+remud    0000001 .....  ..... 111 ..... 1111011 @r
+
 # *** RV32A Standard Extension ***
 lr_w       00010 . . 00000 ..... 010 ..... 0101111 @atom_ld
 sc_w       00011 . . ..... ..... 010 ..... 0101111 @atom_st
diff --git a/target/riscv/m128_helper.c b/target/riscv/m128_helper.c
new file mode 100644
index 0000000000..694ca5da9b
--- /dev/null
+++ b/target/riscv/m128_helper.c
@@ -0,0 +1,109 @@
+/*
+ * RISC-V Emulation Helpers for QEMU.
+ *
+ * Copyright (c) 2016-2017 Sagar Karandikar, sagark@eecs.berkeley.edu
+ * Copyright (c) 2017-2018 SiFive, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "cpu.h"
+#include "qemu/main-loop.h"
+#include "exec/exec-all.h"
+#include "exec/helper-proto.h"
+
+void HELPER(divu_i128)(CPURISCVState *env,
+                       uint64_t ul, uint64_t uh,
+                       uint64_t vl, uint64_t vh)
+{
+    uint64_t ql, qh;
+    Int128 q;
+
+    if (vl == 0 && vh == 0) { /* Handle special behavior on div by zero */
+        ql = ~0x0;
+        qh = ~0x0;
+    } else {
+        q = int128_divu(int128_make128(ul, uh), int128_make128(vl, vh));
+        ql = int128_getlo(q);
+        qh = int128_gethi(q);
+    }
+
+    env->hlpr[0] = ql;
+    env->hlpr[1] = qh;
+}
+
+void HELPER(remu_i128)(CPURISCVState *env,
+                       uint64_t ul, uint64_t uh,
+                       uint64_t vl, uint64_t vh)
+{
+    uint64_t rl, rh;
+    Int128 r;
+
+    if (vl == 0 && vh == 0) {
+        rl = ul;
+        rh = uh;
+    } else {
+        r = int128_remu(int128_make128(ul, uh), int128_make128(vl, vh));
+        rl = int128_getlo(r);
+        rh = int128_gethi(r);
+    }
+
+    env->hlpr[0] = rl;
+    env->hlpr[1] = rh;
+}
+
+void HELPER(divs_i128)(CPURISCVState *env,
+                       uint64_t ul, uint64_t uh,
+                       uint64_t vl, uint64_t vh)
+{
+    uint64_t qh, ql;
+    Int128 q;
+
+    if (vl == 0 && vh == 0) { /* Div by zero check */
+        ql = ~0x0;
+        qh = ~0x0;
+    } else if (uh == 0x8000000000000000 && ul == 0 &&
+               vh == ~0x0 && vl == ~0x0) {
+        /* Signed div overflow check (-2**127 / -1) */
+        ql = ul;
+        qh = uh;
+    } else {
+        q = int128_divs(int128_make128(ul, uh), int128_make128(vl, vh));
+        ql = int128_getlo(q);
+        qh = int128_gethi(q);
+    }
+
+    env->hlpr[0] = ql;
+    env->hlpr[1] = qh;
+}
+
+void HELPER(rems_i128)(CPURISCVState *env,
+                       uint64_t ul, uint64_t uh,
+                       uint64_t vl, uint64_t vh)
+{
+    uint64_t rh, rl;
+    Int128 r;
+
+    if (vl == 0 && vh == 0) {
+        rl = ul;
+        rh = uh;
+    } else {
+        r = int128_rems(int128_make128(ul, uh), int128_make128(vl, vh));
+        rl = int128_getlo(r);
+        rh = int128_gethi(r);
+    }
+
+    env->hlpr[0] = rl;
+    env->hlpr[1] = rh;
+}
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 332a5d0384..2d76832d56 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -32,7 +32,7 @@
 #include "instmap.h"
 
 /* global register indices */
-static TCGv cpu_gpr[32], cpu_gprh[32], cpu_pc, cpu_vl;
+static TCGv cpu_gpr[32], cpu_gprh[32], cpu_hlpr[2], cpu_pc, cpu_vl;
 static TCGv_i64 cpu_fpr[32]; /* assume F and D extensions */
 static TCGv load_res;
 static TCGv load_val;
@@ -953,6 +953,11 @@ void riscv_translate_init(void)
             offsetof(CPURISCVState, gprh[i]), riscv_int_regnames[i]);
     }
 
+    cpu_hlpr[0] = tcg_global_mem_new(cpu_env,
+        offsetof(CPURISCVState, hlpr[0]), "helper_reg0");
+    cpu_hlpr[1] = tcg_global_mem_new(cpu_env,
+        offsetof(CPURISCVState, hlpr[1]), "helper_reg1");
+
     for (i = 0; i < 32; i++) {
         cpu_fpr[i] = tcg_global_mem_new_i64(cpu_env,
             offsetof(CPURISCVState, fpr[i]), riscv_fpr_regnames[i]);
diff --git a/target/riscv/insn_trans/trans_rvm.c.inc b/target/riscv/insn_trans/trans_rvm.c.inc
index efe25dfc11..ea355ce333 100644
--- a/target/riscv/insn_trans/trans_rvm.c.inc
+++ b/target/riscv/insn_trans/trans_rvm.c.inc
@@ -18,11 +18,106 @@
  * this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+static void gen_mulu2_i128(TCGv rll, TCGv rlh, TCGv rhl, TCGv rhh,
+                           TCGv al, TCGv ah, TCGv bl, TCGv bh)
+{
+    TCGv tmpl = tcg_temp_new(),
+         tmph = tcg_temp_new(),
+         cnst_zero = tcg_constant_tl(0);
+
+    tcg_gen_mulu2_tl(rll, rlh, al, bl);
+
+    tcg_gen_mulu2_tl(tmpl, tmph, al, bh);
+    tcg_gen_add2_tl(rlh, rhl, rlh, cnst_zero, tmpl, tmph);
+    tcg_gen_mulu2_tl(tmpl, tmph, ah, bl);
+    tcg_gen_add2_tl(rlh, tmph, rlh, rhl, tmpl, tmph);
+    /* Overflow detection into rhh */
+    tcg_gen_setcond_tl(TCG_COND_LTU, rhh, tmph, rhl);
+
+    tcg_gen_mov_tl(rhl, tmph);
+
+    tcg_gen_mulu2_tl(tmpl, tmph, ah, bh);
+    tcg_gen_add2_tl(rhl, rhh, rhl, rhh, tmpl, tmph);
+
+    tcg_temp_free(tmpl);
+    tcg_temp_free(tmph);
+}
+
+static void gen_mul_i128(TCGv rll, TCGv rlh,
+                         TCGv rs1l, TCGv rs1h, TCGv rs2l, TCGv rs2h)
+{
+    TCGv rhl = tcg_temp_new(),
+         rhh = tcg_temp_new();
+
+    gen_mulu2_i128(rll, rlh, rhl, rhh, rs1l, rs1h, rs2l, rs2h);
+
+    tcg_temp_free(rhl);
+    tcg_temp_free(rhh);
+}
+
 
 static bool trans_mul(DisasContext *ctx, arg_mul *a)
 {
     REQUIRE_EXT(ctx, RVM);
-    return gen_arith(ctx, a, EXT_NONE, tcg_gen_mul_tl, NULL);
+    return gen_arith(ctx, a, EXT_NONE, tcg_gen_mul_tl, gen_mul_i128);
+}
+
+static void gen_mulh_i128(TCGv rhl, TCGv rhh,
+                          TCGv rs1l, TCGv rs1h, TCGv rs2l, TCGv rs2h)
+{
+    TCGv rll = tcg_temp_new(),
+         rlh = tcg_temp_new(),
+         rlln = tcg_temp_new(),
+         rlhn = tcg_temp_new(),
+         rhln = tcg_temp_new(),
+         rhhn = tcg_temp_new(),
+         sgnres = tcg_temp_new(),
+         tmp = tcg_temp_new(),
+         cnst_one = tcg_constant_tl(1),
+         cnst_zero = tcg_constant_tl(0);
+
+    /* Extract sign of result (=> sgn(a) xor sgn(b)) */
+    tcg_gen_setcondi_tl(TCG_COND_LT, sgnres, rs1h, 0);
+    tcg_gen_setcondi_tl(TCG_COND_LT, tmp, rs2h, 0);
+    tcg_gen_xor_tl(sgnres, sgnres, tmp);
+
+    /* Take absolute value of operands */
+    tcg_gen_sari_tl(rhl, rs1h, 63);
+    tcg_gen_add2_tl(rlln, rlhn, rs1l, rs1h, rhl, rhl);
+    tcg_gen_xor_tl(rlln, rlln, rhl);
+    tcg_gen_xor_tl(rlhn, rlhn, rhl);
+
+    tcg_gen_sari_tl(rhl, rs2h, 63);
+    tcg_gen_add2_tl(rhln, rhhn, rs2l, rs2h, rhl, rhl);
+    tcg_gen_xor_tl(rhln, rhln, rhl);
+    tcg_gen_xor_tl(rhhn, rhhn, rhl);
+
+    /* Unsigned multiplication */
+    gen_mulu2_i128(rll, rlh, rhl, rhh, rlln, rlhn, rhln, rhhn);
+
+    /* Negation of result (two's complement : ~res + 1) */
+    tcg_gen_not_tl(rlln, rll);
+    tcg_gen_not_tl(rlhn, rlh);
+    tcg_gen_not_tl(rhln, rhl);
+    tcg_gen_not_tl(rhhn, rhh);
+
+    tcg_gen_add2_tl(rlln, tmp, rlln, cnst_zero, cnst_one, cnst_zero);
+    tcg_gen_add2_tl(rlhn, tmp, rlhn, cnst_zero, tmp, cnst_zero);
+    tcg_gen_add2_tl(rhln, tmp, rhln, cnst_zero, tmp, cnst_zero);
+    tcg_gen_add2_tl(rhhn, tmp, rhhn, cnst_zero, tmp, cnst_zero);
+
+    /* Move conditionally result or -result depending on result sign */
+    tcg_gen_movcond_tl(TCG_COND_NE, rhl, sgnres, cnst_zero, rhln, rhl);
+    tcg_gen_movcond_tl(TCG_COND_NE, rhh, sgnres, cnst_zero, rhhn, rhh);
+
+    tcg_temp_free(rll);
+    tcg_temp_free(rlh);
+    tcg_temp_free(rlln);
+    tcg_temp_free(rlhn);
+    tcg_temp_free(rhln);
+    tcg_temp_free(rhhn);
+    tcg_temp_free(sgnres);
+    tcg_temp_free(tmp);
 }
 
 static void gen_mulh(TCGv ret, TCGv s1, TCGv s2)
@@ -42,7 +137,59 @@ static void gen_mulh_w(TCGv ret, TCGv s1, TCGv s2)
 static bool trans_mulh(DisasContext *ctx, arg_mulh *a)
 {
     REQUIRE_EXT(ctx, RVM);
-    return gen_arith_per_ol(ctx, a, EXT_SIGN, gen_mulh, gen_mulh_w, NULL);
+    return gen_arith_per_ol(ctx, a, EXT_SIGN, gen_mulh, gen_mulh_w,
+                            gen_mulh_i128);
+}
+
+static void gen_mulhsu_i128(TCGv rhl, TCGv rhh,
+                            TCGv rs1l, TCGv rs1h, TCGv rs2l, TCGv rs2h)
+{
+    TCGv rll = tcg_temp_new(),
+         rlh = tcg_temp_new(),
+         rlln = tcg_temp_new(),
+         rlhn = tcg_temp_new(),
+         rhln = tcg_temp_new(),
+         rhhn = tcg_temp_new(),
+         sgnres = tcg_temp_new(),
+         tmp = tcg_temp_new(),
+         cnst_one = tcg_constant_tl(1),
+         cnst_zero = tcg_constant_tl(0);
+
+    /* Extract sign of result (=> sgn(a)) */
+    tcg_gen_setcondi_tl(TCG_COND_LT, sgnres, rs1h, 0);
+
+    /* Take absolute value of rs1 */
+    tcg_gen_sari_tl(rhl, rs1h, 63);
+    tcg_gen_add2_tl(rlln, rlhn, rs1l, rs1h, rhl, rhl);
+    tcg_gen_xor_tl(rlln, rlln, rhl);
+    tcg_gen_xor_tl(rlhn, rlhn, rhl);
+
+    /* Unsigned multiplication */
+    gen_mulu2_i128(rll, rlh, rhl, rhh, rlln, rlhn, rs2l, rs2h);
+
+    /* Negation of result (two's complement : ~res + 1) */
+    tcg_gen_not_tl(rlln, rll);
+    tcg_gen_not_tl(rlhn, rlh);
+    tcg_gen_not_tl(rhln, rhl);
+    tcg_gen_not_tl(rhhn, rhh);
+
+    tcg_gen_add2_tl(rlln, tmp, rlln, cnst_zero, cnst_one, cnst_zero);
+    tcg_gen_add2_tl(rlhn, tmp, rlhn, cnst_zero, tmp, cnst_zero);
+    tcg_gen_add2_tl(rhln, tmp, rhln, cnst_zero, tmp, cnst_zero);
+    tcg_gen_add2_tl(rhhn, tmp, rhhn, cnst_zero, tmp, cnst_zero);
+
+    /* Move conditionally result or -result depending on result sign */
+    tcg_gen_movcond_tl(TCG_COND_NE, rhl, sgnres, cnst_zero, rhln, rhl);
+    tcg_gen_movcond_tl(TCG_COND_NE, rhh, sgnres, cnst_zero, rhhn, rhh);
+
+    tcg_temp_free(rll);
+    tcg_temp_free(rlh);
+    tcg_temp_free(rlln);
+    tcg_temp_free(rlhn);
+    tcg_temp_free(rhln);
+    tcg_temp_free(rhhn);
+    tcg_temp_free(sgnres);
+    tcg_temp_free(tmp);
 }
 
 static void gen_mulhsu(TCGv ret, TCGv arg1, TCGv arg2)
@@ -76,7 +223,20 @@ static void gen_mulhsu_w(TCGv ret, TCGv arg1, TCGv arg2)
 static bool trans_mulhsu(DisasContext *ctx, arg_mulhsu *a)
 {
     REQUIRE_EXT(ctx, RVM);
-    return gen_arith_per_ol(ctx, a, EXT_NONE, gen_mulhsu, gen_mulhsu_w, NULL);
+    return gen_arith_per_ol(ctx, a, EXT_NONE, gen_mulhsu, gen_mulhsu_w,
+                            gen_mulhsu_i128);
+}
+
+static void gen_mulhu_i128(TCGv rhl, TCGv rhh,
+                           TCGv rs1l, TCGv rs1h, TCGv rs2l, TCGv rs2h)
+{
+    TCGv rll = tcg_temp_new(),
+         rlh = tcg_temp_new();
+
+    gen_mulu2_i128(rll, rlh, rhl, rhh, rs1l, rs1h, rs2l, rs2h);
+
+    tcg_temp_free(rll);
+    tcg_temp_free(rlh);
 }
 
 static void gen_mulhu(TCGv ret, TCGv s1, TCGv s2)
@@ -91,7 +251,17 @@ static bool trans_mulhu(DisasContext *ctx, arg_mulhu *a)
 {
     REQUIRE_EXT(ctx, RVM);
     /* gen_mulh_w works for either sign as input. */
-    return gen_arith_per_ol(ctx, a, EXT_ZERO, gen_mulhu, gen_mulh_w, NULL);
+    return gen_arith_per_ol(ctx, a, EXT_ZERO, gen_mulhu, gen_mulh_w,
+                            gen_mulhu_i128);
+}
+
+static void gen_div_i128(TCGv rdl, TCGv rdh,
+                         TCGv rs1l, TCGv rs1h, TCGv rs2l, TCGv rs2h)
+{
+    gen_helper_divs_i128(cpu_env, (TCGv_i64)rs1l, (TCGv_i64)rs1h,
+                                  (TCGv_i64)rs2l, (TCGv_i64)rs2h);
+    tcg_gen_mov_tl(rdl, cpu_hlpr[0]);
+    tcg_gen_mov_tl(rdh, cpu_hlpr[1]);
 }
 
 static void gen_div(TCGv ret, TCGv source1, TCGv source2)
@@ -130,7 +300,16 @@ static void gen_div(TCGv ret, TCGv source1, TCGv source2)
 static bool trans_div(DisasContext *ctx, arg_div *a)
 {
     REQUIRE_EXT(ctx, RVM);
-    return gen_arith(ctx, a, EXT_SIGN, gen_div, NULL);
+    return gen_arith(ctx, a, EXT_SIGN, gen_div, gen_div_i128);
+}
+
+static void gen_divu_i128(TCGv rdl, TCGv rdh,
+                          TCGv rs1l, TCGv rs1h, TCGv rs2l, TCGv rs2h)
+{
+    gen_helper_divu_i128(cpu_env, (TCGv_i64)rs1l, (TCGv_i64)rs1h,
+                                  (TCGv_i64)rs2l, (TCGv_i64)rs2h);
+    tcg_gen_mov_tl(rdl, cpu_hlpr[0]);
+    tcg_gen_mov_tl(rdh, cpu_hlpr[1]);
 }
 
 static void gen_divu(TCGv ret, TCGv source1, TCGv source2)
@@ -158,7 +337,16 @@ static void gen_divu(TCGv ret, TCGv source1, TCGv source2)
 static bool trans_divu(DisasContext *ctx, arg_divu *a)
 {
     REQUIRE_EXT(ctx, RVM);
-    return gen_arith(ctx, a, EXT_ZERO, gen_divu, NULL);
+    return gen_arith(ctx, a, EXT_ZERO, gen_divu, gen_divu_i128);
+}
+
+static void gen_rem_i128(TCGv rdl, TCGv rdh,
+                         TCGv rs1l, TCGv rs1h, TCGv rs2l, TCGv rs2h)
+{
+    gen_helper_rems_i128(cpu_env, (TCGv_i64)rs1l, (TCGv_i64)rs1h,
+                                  (TCGv_i64)rs2l, (TCGv_i64)rs2h);
+    tcg_gen_mov_tl(rdl, cpu_hlpr[0]);
+    tcg_gen_mov_tl(rdh, cpu_hlpr[1]);
 }
 
 static void gen_rem(TCGv ret, TCGv source1, TCGv source2)
@@ -199,7 +387,16 @@ static void gen_rem(TCGv ret, TCGv source1, TCGv source2)
 static bool trans_rem(DisasContext *ctx, arg_rem *a)
 {
     REQUIRE_EXT(ctx, RVM);
-    return gen_arith(ctx, a, EXT_SIGN, gen_rem, NULL);
+    return gen_arith(ctx, a, EXT_SIGN, gen_rem, gen_rem_i128);
+}
+
+static void gen_remu_i128(TCGv rdl, TCGv rdh,
+                          TCGv rs1l, TCGv rs1h, TCGv rs2l, TCGv rs2h)
+{
+    gen_helper_remu_i128(cpu_env, (TCGv_i64)rs1l, (TCGv_i64)rs1h,
+                                  (TCGv_i64)rs2l, (TCGv_i64)rs2h);
+    tcg_gen_mov_tl(rdl, cpu_hlpr[0]);
+    tcg_gen_mov_tl(rdh, cpu_hlpr[1]);
 }
 
 static void gen_remu(TCGv ret, TCGv source1, TCGv source2)
@@ -227,12 +424,12 @@ static void gen_remu(TCGv ret, TCGv source1, TCGv source2)
 static bool trans_remu(DisasContext *ctx, arg_remu *a)
 {
     REQUIRE_EXT(ctx, RVM);
-    return gen_arith(ctx, a, EXT_ZERO, gen_remu, NULL);
+    return gen_arith(ctx, a, EXT_ZERO, gen_remu, gen_remu_i128);
 }
 
 static bool trans_mulw(DisasContext *ctx, arg_mulw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     REQUIRE_EXT(ctx, RVM);
     ctx->ol = MXL_RV32;
     return gen_arith(ctx, a, EXT_NONE, tcg_gen_mul_tl, NULL);
@@ -240,7 +437,7 @@ static bool trans_mulw(DisasContext *ctx, arg_mulw *a)
 
 static bool trans_divw(DisasContext *ctx, arg_divw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     REQUIRE_EXT(ctx, RVM);
     ctx->ol = MXL_RV32;
     return gen_arith(ctx, a, EXT_SIGN, gen_div, NULL);
@@ -248,7 +445,7 @@ static bool trans_divw(DisasContext *ctx, arg_divw *a)
 
 static bool trans_divuw(DisasContext *ctx, arg_divuw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     REQUIRE_EXT(ctx, RVM);
     ctx->ol = MXL_RV32;
     return gen_arith(ctx, a, EXT_ZERO, gen_divu, NULL);
@@ -256,7 +453,7 @@ static bool trans_divuw(DisasContext *ctx, arg_divuw *a)
 
 static bool trans_remw(DisasContext *ctx, arg_remw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     REQUIRE_EXT(ctx, RVM);
     ctx->ol = MXL_RV32;
     return gen_arith(ctx, a, EXT_SIGN, gen_rem, NULL);
@@ -264,8 +461,48 @@ static bool trans_remw(DisasContext *ctx, arg_remw *a)
 
 static bool trans_remuw(DisasContext *ctx, arg_remuw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     REQUIRE_EXT(ctx, RVM);
     ctx->ol = MXL_RV32;
     return gen_arith(ctx, a, EXT_ZERO, gen_remu, NULL);
 }
+
+static bool trans_muld(DisasContext *ctx, arg_muld *a)
+{
+    REQUIRE_128BIT(ctx);
+    REQUIRE_EXT(ctx, RVM);
+    ctx->ol = MXL_RV64;
+    return gen_arith(ctx, a, EXT_SIGN, tcg_gen_mul_tl, NULL);
+}
+
+static bool trans_divd(DisasContext *ctx, arg_divd *a)
+{
+    REQUIRE_128BIT(ctx);
+    REQUIRE_EXT(ctx, RVM);
+    ctx->ol = MXL_RV64;
+    return gen_arith(ctx, a, EXT_SIGN, gen_div, NULL);
+}
+
+static bool trans_divud(DisasContext *ctx, arg_divud *a)
+{
+    REQUIRE_128BIT(ctx);
+    REQUIRE_EXT(ctx, RVM);
+    ctx->ol = MXL_RV64;
+    return gen_arith(ctx, a, EXT_ZERO, gen_divu, NULL);
+}
+
+static bool trans_remd(DisasContext *ctx, arg_remd *a)
+{
+    REQUIRE_128BIT(ctx);
+    REQUIRE_EXT(ctx, RVM);
+    ctx->ol = MXL_RV64;
+    return gen_arith(ctx, a, EXT_SIGN, gen_rem, NULL);
+}
+
+static bool trans_remud(DisasContext *ctx, arg_remud *a)
+{
+    REQUIRE_128BIT(ctx);
+    REQUIRE_EXT(ctx, RVM);
+    ctx->ol = MXL_RV64;
+    return gen_arith(ctx, a, EXT_ZERO, gen_remu, NULL);
+}
diff --git a/target/riscv/meson.build b/target/riscv/meson.build
index d5e0bc93ea..a32158da93 100644
--- a/target/riscv/meson.build
+++ b/target/riscv/meson.build
@@ -18,6 +18,7 @@ riscv_ss.add(files(
   'vector_helper.c',
   'bitmanip_helper.c',
   'translate.c',
+  'm128_helper.c'
 ))
 
 riscv_softmmu_ss = ss.source_set()
-- 
2.33.0
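
The limb-wise product that gen_mulu2_i128 builds out of 64x64->128 multiplies
can be illustrated in plain C. This is a sketch of the schoolbook scheme, not
a transcription of the TCG sequence in the patch; it assumes a compiler
providing unsigned __int128, and sketch_mulu_128x128 is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128; /* assumption: GCC/Clang extension */

/* r[0] is the least significant 64-bit limb of the 256-bit product. */
static void sketch_mulu_128x128(uint64_t r[4],
                                uint64_t al, uint64_t ah,
                                uint64_t bl, uint64_t bh)
{
    u128 p0 = (u128)al * bl;   /* weight 2**0   */
    u128 p1 = (u128)al * bh;   /* weight 2**64  */
    u128 p2 = (u128)ah * bl;   /* weight 2**64  */
    u128 p3 = (u128)ah * bh;   /* weight 2**128 */
    u128 acc;

    r[0] = (uint64_t)p0;

    /* Column 1: high half of p0 plus the low halves of both cross terms. */
    acc = (p0 >> 64) + (uint64_t)p1 + (uint64_t)p2;
    r[1] = (uint64_t)acc;

    /* Column 2: carry, high halves of the cross terms, low half of p3. */
    acc = (acc >> 64) + (p1 >> 64) + (p2 >> 64) + (uint64_t)p3;
    r[2] = (uint64_t)acc;

    /* Column 3: carry plus the high half of p3; this cannot overflow. */
    r[3] = (uint64_t)((acc >> 64) + (p3 >> 64));
}
```

In these terms, mul keeps r[0] and r[1], while mulhu keeps r[2] and r[3].
The signed variants in the patch reduce to this unsigned product by
multiplying absolute values and negating the result when the signs call for
it.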




* [PATCH v3 16/21] target/riscv: adding high part of some csrs
  2021-10-19  9:47 [PATCH v3 00/21] Adding partial support for 128-bit riscv target Frédéric Pétrot
                   ` (14 preceding siblings ...)
  2021-10-19  9:48 ` [PATCH v3 15/21] target/riscv: support for 128-bit M extension Frédéric Pétrot
@ 2021-10-19  9:48 ` Frédéric Pétrot
  2021-10-20 21:38   ` Richard Henderson
  2021-10-20 23:03   ` Richard Henderson
  2021-10-19  9:48 ` [PATCH v3 17/21] target/riscv: helper functions to wrap calls to 128-bit csr insns Frédéric Pétrot
                   ` (4 subsequent siblings)
  20 siblings, 2 replies; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-19  9:48 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: bin.meng, richard.henderson, alistair.francis, fabien.portas,
	palmer, Frédéric Pétrot, philmd

Add the upper 64 bits of a minimal set of CSRs.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
 target/riscv/cpu.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 8b96ccb37a..27ec4fec63 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -192,6 +192,13 @@ struct CPURISCVState {
     target_ulong hgatp;
     uint64_t htimedelta;
 
+    /* Upper 64 bits of 128-bit CSRs */
+    uint64_t mtvech;
+    uint64_t mscratchh;
+    uint64_t mepch;
+    uint64_t satph;
+    uint64_t mstatush;
+
     /* Virtual CSRs */
     /*
      * For RV32 this is 32-bit vsstatus and 32-bit vsstatush.
-- 
2.33.0




* [PATCH v3 17/21] target/riscv: helper functions to wrap calls to 128-bit csr insns
  2021-10-19  9:47 [PATCH v3 00/21] Adding partial support for 128-bit riscv target Frédéric Pétrot
                   ` (15 preceding siblings ...)
  2021-10-19  9:48 ` [PATCH v3 16/21] target/riscv: adding high part of some csrs Frédéric Pétrot
@ 2021-10-19  9:48 ` Frédéric Pétrot
  2021-10-20 21:47   ` Richard Henderson
  2021-10-19  9:48 ` [PATCH v3 18/21] target/riscv: modification of the trans_csrxx for 128-bit support Frédéric Pétrot
                   ` (3 subsequent siblings)
  20 siblings, 1 reply; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-19  9:48 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: bin.meng, richard.henderson, alistair.francis, fabien.portas,
	palmer, Frédéric Pétrot, philmd

Given the side effects they have, the csr instructions are realized as
helpers. We extend this existing infrastructure to 128-bit csrs.
Returning 128-bit values from a helper is a slight issue: to that end we
reuse the globals we added to support the div/rem insns.
These helpers all call a single function that is currently a stub.
The trans_csrxx functions supporting 128-bit are yet to be implemented.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
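Note, not part of the patch: a minimal standalone sketch of the return
convention described in the commit message, assuming a reduced CPU state.
The names cpu_sketch, helper_csrr_sketch and set_gpr_pair are hypothetical
and only mirror the roles of env->hlpr[], helper_csrr_i128 and
gen_set_gpr/gen_set_gprh; the actual code is in the diff below.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for CPURISCVState. */
typedef struct {
    uint64_t gpr[32];    /* low 64 bits of each register */
    uint64_t gprh[32];   /* high 64 bits of each register */
    uint64_t hlpr[2];    /* scratch pair: [0] = low half, [1] = high half */
    uint64_t csr_lo;     /* one 128-bit csr, split in two halves */
    uint64_t csr_hi;
} cpu_sketch;

/* Helper side: the return type is void, the result goes to the scratch pair. */
static void helper_csrr_sketch(cpu_sketch *env)
{
    env->hlpr[0] = env->csr_lo;
    env->hlpr[1] = env->csr_hi;
}

/* Translator side: copy the scratch pair into rd; x0 stays hardwired to 0. */
static void set_gpr_pair(cpu_sketch *env, int rd)
{
    assert(rd >= 0 && rd < 32);
    if (rd != 0) {
        env->gpr[rd] = env->hlpr[0];
        env->gprh[rd] = env->hlpr[1];
    }
}
```

Calling helper_csrr_sketch() and then set_gpr_pair() plays the role of one
csrr: the helper never returns the value directly, it only fills the pair.
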
 target/riscv/cpu.h       |  4 ++++
 target/riscv/helper.h    |  3 +++
 target/riscv/csr.c       |  7 +++++++
 target/riscv/op_helper.c | 44 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 58 insertions(+)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 27ec4fec63..eb4f63fcbf 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -470,6 +470,10 @@ typedef RISCVException (*riscv_csr_op_fn)(CPURISCVState *env, int csrno,
                                           target_ulong new_value,
                                           target_ulong write_mask);
 
+RISCVException riscv_csrrw_i128(CPURISCVState *env, int csrno,
+                                Int128 *ret_value,
+                                Int128 new_value, Int128 write_mask);
+
 typedef struct {
     const char *name;
     riscv_csr_predicate_fn predicate;
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 67f5d23692..e27bdb9075 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -66,6 +66,9 @@ DEF_HELPER_FLAGS_2(clmulr, TCG_CALL_NO_RWG_SE, tl, tl, tl)
 DEF_HELPER_2(csrr, tl, env, int)
 DEF_HELPER_3(csrw, void, env, int, tl)
 DEF_HELPER_4(csrrw, tl, env, int, tl, tl)
+DEF_HELPER_2(csrr_i128, void, env, int)
+DEF_HELPER_4(csrw_i128, void, env, int, tl, tl)
+DEF_HELPER_6(csrrw_i128, void, env, int, tl, tl, tl, tl)
 #ifndef CONFIG_USER_ONLY
 DEF_HELPER_2(sret, tl, env, tl)
 DEF_HELPER_2(mret, tl, env, tl)
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 69e4d65fcd..b802ee0dbc 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -1516,6 +1516,13 @@ RISCVException riscv_csrrw(CPURISCVState *env, int csrno,
     return RISCV_EXCP_NONE;
 }
 
+RISCVException riscv_csrrw_i128(CPURISCVState *env, int csrno,
+                               Int128 *ret_value,
+                               Int128 new_value, Int128 write_mask)
+{
+    return RISCV_EXCP_ILLEGAL_INST;
+}
+
 /*
  * Debugger support.  If not in user mode, set env->debugger before the
  * riscv_csrrw call and clear it after the call.
diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c
index ee7c24efe7..753eb35000 100644
--- a/target/riscv/op_helper.c
+++ b/target/riscv/op_helper.c
@@ -69,6 +69,50 @@ target_ulong helper_csrrw(CPURISCVState *env, int csr,
     return val;
 }
 
+void helper_csrr_i128(CPURISCVState *env, int csr)
+{
+    Int128 rv = int128_zero();
+    RISCVException ret = riscv_csrrw_i128(env, csr, &rv,
+                                          int128_zero(),
+                                          int128_zero());
+
+    if (ret != RISCV_EXCP_NONE) {
+        riscv_raise_exception(env, ret, GETPC());
+    }
+
+    env->hlpr[0] = int128_getlo(rv);
+    env->hlpr[1] = int128_gethi(rv);
+}
+
+void helper_csrw_i128(CPURISCVState *env, int csr,
+                      target_ulong srcl, target_ulong srch)
+{
+    RISCVException ret = riscv_csrrw_i128(env, csr, NULL,
+                                          int128_make128(srcl, srch),
+                                          UINT128_MAX);
+
+    if (ret != RISCV_EXCP_NONE) {
+        riscv_raise_exception(env, ret, GETPC());
+    }
+}
+
+void helper_csrrw_i128(CPURISCVState *env, int csr,
+                       target_ulong srcl, target_ulong srch,
+                       target_ulong maskl, target_ulong maskh)
+{
+    Int128 rv = int128_zero();
+    RISCVException ret = riscv_csrrw_i128(env, csr, &rv,
+                                          int128_make128(srcl, srch),
+                                          int128_make128(maskl, maskh));
+
+    if (ret != RISCV_EXCP_NONE) {
+        riscv_raise_exception(env, ret, GETPC());
+    }
+
+    env->hlpr[0] = int128_getlo(rv);
+    env->hlpr[1] = int128_gethi(rv);
+}
+
 #ifndef CONFIG_USER_ONLY
 
 target_ulong helper_sret(CPURISCVState *env, target_ulong cpu_pc_deb)
-- 
2.33.0




* [PATCH v3 18/21] target/riscv: modification of the trans_csrxx for 128-bit support
  2021-10-19  9:47 [PATCH v3 00/21] Adding partial support for 128-bit riscv target Frédéric Pétrot
                   ` (16 preceding siblings ...)
  2021-10-19  9:48 ` [PATCH v3 17/21] target/riscv: helper functions to wrap calls to 128-bit csr insns Frédéric Pétrot
@ 2021-10-19  9:48 ` Frédéric Pétrot
  2021-10-20 21:53   ` Richard Henderson
  2021-10-19  9:48 ` [PATCH v3 19/21] target/riscv: actual functions to realize crs 128-bit insns Frédéric Pétrot
                   ` (2 subsequent siblings)
  20 siblings, 1 reply; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-19  9:48 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: bin.meng, richard.henderson, alistair.francis, fabien.portas,
	palmer, Frédéric Pétrot, philmd

As opposed to the gen_arith and gen_shift generation helpers, the csr insns
do not have a common prototype, so the choice between generating 32/64-bit
and 128-bit helper calls is made in the trans_csrxx functions.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
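Note, not part of the patch: a standalone sketch of how the three csr insn
flavours reduce to a (new_value, write_mask) pair for a single read/write
helper, as the trans_csrxx functions below do with TCG values. The types
u128, csr_kind and csr_operands and both functions are hypothetical names
used for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit value: two 64-bit halves. */
typedef struct { uint64_t lo, hi; } u128;

typedef enum { CSRRW, CSRRS, CSRRC } csr_kind;

typedef struct {
    u128 new_value;
    u128 write_mask;
} csr_operands;

/* rs1 is the 128-bit source value (an immediate has a zero high half). */
static csr_operands csr_select_operands(csr_kind kind, u128 rs1)
{
    const u128 ones = { UINT64_MAX, UINT64_MAX };
    const u128 zero = { 0, 0 };
    csr_operands ops;

    switch (kind) {
    case CSRRW:                 /* write rs1 to every bit */
        ops.new_value = rs1;
        ops.write_mask = ones;
        break;
    case CSRRS:                 /* set the bits selected by rs1 */
        ops.new_value = ones;
        ops.write_mask = rs1;
        break;
    case CSRRC:                 /* clear the bits selected by rs1 */
    default:
        ops.new_value = zero;
        ops.write_mask = rs1;
        break;
    }
    return ops;
}

/* Masked update: bits outside the mask keep their old value. */
static u128 csr_apply(u128 old, csr_operands ops)
{
    u128 res;

    res.lo = (old.lo & ~ops.write_mask.lo) |
             (ops.new_value.lo & ops.write_mask.lo);
    res.hi = (old.hi & ~ops.write_mask.hi) |
             (ops.new_value.hi & ops.write_mask.hi);
    return res;
}
```

The rd == 0 and rs1 == 0 special cases are not modelled here: in the patch
they select the write-only and read-only helpers before this step.
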
 target/riscv/insn_trans/trans_rvi.c.inc | 201 ++++++++++++++++++------
 1 file changed, 156 insertions(+), 45 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvi.c.inc b/target/riscv/insn_trans/trans_rvi.c.inc
index 6497338842..e08fa482c4 100644
--- a/target/riscv/insn_trans/trans_rvi.c.inc
+++ b/target/riscv/insn_trans/trans_rvi.c.inc
@@ -962,20 +962,74 @@ static bool do_csrrw(DisasContext *ctx, int rd, int rc, TCGv src, TCGv mask)
     return do_csr_post(ctx);
 }
 
+static bool do_csrr_i128(DisasContext *ctx, int rd, int rc)
+{
+    TCGv_i32 csr = tcg_constant_i32(rc);
+
+    if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
+        gen_io_start();
+    }
+    gen_helper_csrr_i128(cpu_env, csr);
+    gen_set_gpr(ctx, rd, cpu_hlpr[0]);
+    gen_set_gprh(ctx, rd, cpu_hlpr[1]);
+    return do_csr_post(ctx);
+}
+
+static bool do_csrw_i128(DisasContext *ctx, int rc, TCGv srcl, TCGv srch)
+{
+    TCGv_i32 csr = tcg_constant_i32(rc);
+
+    if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
+        gen_io_start();
+    }
+    gen_helper_csrw_i128(cpu_env, csr, srcl, srch);
+    return do_csr_post(ctx);
+}
+
+static bool do_csrrw_i128(DisasContext *ctx, int rd, int rc,
+                          TCGv srcl, TCGv srch, TCGv maskl, TCGv maskh)
+{
+    TCGv_i32 csr = tcg_constant_i32(rc);
+
+    if (tb_cflags(ctx->base.tb) & CF_USE_ICOUNT) {
+        gen_io_start();
+    }
+    gen_helper_csrrw_i128(cpu_env, csr, srcl, srch, maskl, maskh);
+    gen_set_gpr(ctx, rd, cpu_hlpr[0]);
+    gen_set_gprh(ctx, rd, cpu_hlpr[1]);
+    return do_csr_post(ctx);
+}
+
 static bool trans_csrrw(DisasContext *ctx, arg_csrrw *a)
 {
-    TCGv src = get_gpr(ctx, a->rs1, EXT_NONE);
-
-    /*
-     * If rd == 0, the insn shall not read the csr, nor cause any of the
-     * side effects that might occur on a csr read.
-     */
-    if (a->rd == 0) {
-        return do_csrw(ctx, a->csr, src);
+    if (get_xl(ctx) < MXL_RV128) {
+        TCGv src = get_gpr(ctx, a->rs1, EXT_NONE);
+
+        /*
+         * If rd == 0, the insn shall not read the csr, nor cause any of the
+         * side effects that might occur on a csr read.
+         */
+        if (a->rd == 0) {
+            return do_csrw(ctx, a->csr, src);
+        }
+
+        TCGv mask = tcg_constant_tl(-1);
+        return do_csrrw(ctx, a->rd, a->csr, src, mask);
+    } else {
+        TCGv srcl = get_gpr(ctx, a->rs1, EXT_NONE);
+        TCGv srch = get_gprh(ctx, a->rs1);
+
+        /*
+         * If rd == 0, the insn shall not read the csr, nor cause any of the
+         * side effects that might occur on a csr read.
+         */
+        if (a->rd == 0) {
+            return do_csrw_i128(ctx, a->csr, srcl, srch);
+        }
+
+        TCGv mask = tcg_constant_tl(-1);
+        return do_csrrw_i128(ctx, a->rd, a->csr, srcl, srch, mask, mask);
     }
-
-    TCGv mask = tcg_constant_tl(-1);
-    return do_csrrw(ctx, a->rd, a->csr, src, mask);
 }
 
 static bool trans_csrrs(DisasContext *ctx, arg_csrrs *a)
@@ -987,13 +1041,24 @@ static bool trans_csrrs(DisasContext *ctx, arg_csrrs *a)
      * a zero value, the instruction will still attempt to write the
      * unmodified value back to the csr and will cause side effects.
      */
-    if (a->rs1 == 0) {
-        return do_csrr(ctx, a->rd, a->csr);
+    if (get_xl(ctx) < MXL_RV128) {
+        if (a->rs1 == 0) {
+            return do_csrr(ctx, a->rd, a->csr);
+        }
+
+        TCGv ones = tcg_constant_tl(-1);
+        TCGv mask = get_gpr(ctx, a->rs1, EXT_ZERO);
+        return do_csrrw(ctx, a->rd, a->csr, ones, mask);
+    } else {
+        if (a->rs1 == 0) {
+            return do_csrr_i128(ctx, a->rd, a->csr);
+        }
+
+        TCGv ones = tcg_constant_tl(-1);
+        TCGv maskl = get_gpr(ctx, a->rs1, EXT_ZERO);
+        TCGv maskh = get_gprh(ctx, a->rs1);
+        return do_csrrw_i128(ctx, a->rd, a->csr, ones, ones, maskl, maskh);
     }
-
-    TCGv ones = tcg_constant_tl(-1);
-    TCGv mask = get_gpr(ctx, a->rs1, EXT_ZERO);
-    return do_csrrw(ctx, a->rd, a->csr, ones, mask);
 }
 
 static bool trans_csrrc(DisasContext *ctx, arg_csrrc *a)
@@ -1005,28 +1070,54 @@ static bool trans_csrrc(DisasContext *ctx, arg_csrrc *a)
      * a zero value, the instruction will still attempt to write the
      * unmodified value back to the csr and will cause side effects.
      */
-    if (a->rs1 == 0) {
-        return do_csrr(ctx, a->rd, a->csr);
+    if (get_xl(ctx) < MXL_RV128) {
+        if (a->rs1 == 0) {
+            return do_csrr(ctx, a->rd, a->csr);
+        }
+
+        TCGv mask = get_gpr(ctx, a->rs1, EXT_ZERO);
+        return do_csrrw(ctx, a->rd, a->csr, ctx->zero, mask);
+    } else {
+        if (a->rs1 == 0) {
+            return do_csrr_i128(ctx, a->rd, a->csr);
+        }
+
+        TCGv maskl = get_gpr(ctx, a->rs1, EXT_ZERO);
+        TCGv maskh = get_gprh(ctx, a->rs1);
+        return do_csrrw_i128(ctx, a->rd, a->csr,
+                             ctx->zero, ctx->zero, maskl, maskh);
     }
-
-    TCGv mask = get_gpr(ctx, a->rs1, EXT_ZERO);
-    return do_csrrw(ctx, a->rd, a->csr, ctx->zero, mask);
 }
 
 static bool trans_csrrwi(DisasContext *ctx, arg_csrrwi *a)
 {
-    TCGv src = tcg_constant_tl(a->rs1);
-
-    /*
-     * If rd == 0, the insn shall not read the csr, nor cause any of the
-     * side effects that might occur on a csr read.
-     */
-    if (a->rd == 0) {
-        return do_csrw(ctx, a->csr, src);
+    if (get_xl(ctx) < MXL_RV128) {
+        TCGv src = tcg_constant_tl(a->rs1);
+
+        /*
+         * If rd == 0, the insn shall not read the csr, nor cause any of the
+         * side effects that might occur on a csr read.
+         */
+        if (a->rd == 0) {
+            return do_csrw(ctx, a->csr, src);
+        }
+
+        TCGv mask = tcg_constant_tl(-1);
+        return do_csrrw(ctx, a->rd, a->csr, src, mask);
+    } else {
+        TCGv src = tcg_constant_tl(a->rs1);
+
+        /*
+         * If rd == 0, the insn shall not read the csr, nor cause any of the
+         * side effects that might occur on a csr read.
+         */
+        if (a->rd == 0) {
+            return do_csrw_i128(ctx, a->csr, src, ctx->zero);
+        }
+
+        TCGv mask = tcg_constant_tl(-1);
+        return do_csrrw_i128(ctx, a->rd, a->csr, src, ctx->zero, mask, mask);
     }
-
-    TCGv mask = tcg_constant_tl(-1);
-    return do_csrrw(ctx, a->rd, a->csr, src, mask);
 }
 
 static bool trans_csrrsi(DisasContext *ctx, arg_csrrsi *a)
@@ -1038,16 +1129,26 @@ static bool trans_csrrsi(DisasContext *ctx, arg_csrrsi *a)
      * a zero value, the instruction will still attempt to write the
      * unmodified value back to the csr and will cause side effects.
      */
-    if (a->rs1 == 0) {
-        return do_csrr(ctx, a->rd, a->csr);
+    if (get_xl(ctx) < MXL_RV128) {
+        if (a->rs1 == 0) {
+            return do_csrr(ctx, a->rd, a->csr);
+        }
+
+        TCGv ones = tcg_constant_tl(-1);
+        TCGv mask = tcg_constant_tl(a->rs1);
+        return do_csrrw(ctx, a->rd, a->csr, ones, mask);
+    } else {
+        if (a->rs1 == 0) {
+            return do_csrr_i128(ctx, a->rd, a->csr);
+        }
+
+        TCGv ones = tcg_constant_tl(-1);
+        TCGv mask = tcg_constant_tl(a->rs1);
+        return do_csrrw_i128(ctx, a->rd, a->csr, ones, ones, mask, ctx->zero);
     }
-
-    TCGv ones = tcg_constant_tl(-1);
-    TCGv mask = tcg_constant_tl(a->rs1);
-    return do_csrrw(ctx, a->rd, a->csr, ones, mask);
 }
 
-static bool trans_csrrci(DisasContext *ctx, arg_csrrci *a)
+static bool trans_csrrci(DisasContext *ctx, arg_csrrci * a)
 {
     /*
      * If rs1 == 0, the insn shall not write to the csr at all, nor
@@ -1056,10 +1157,20 @@ static bool trans_csrrci(DisasContext *ctx, arg_csrrci *a)
      * a zero value, the instruction will still attempt to write the
      * unmodified value back to the csr and will cause side effects.
      */
-    if (a->rs1 == 0) {
-        return do_csrr(ctx, a->rd, a->csr);
+    if (get_xl(ctx) < MXL_RV128) {
+        if (a->rs1 == 0) {
+            return do_csrr(ctx, a->rd, a->csr);
+        }
+
+        TCGv mask = tcg_constant_tl(a->rs1);
+        return do_csrrw(ctx, a->rd, a->csr, ctx->zero, mask);
+    } else {
+        if (a->rs1 == 0) {
+            return do_csrr_i128(ctx, a->rd, a->csr);
+        }
+
+        TCGv mask = tcg_constant_tl(a->rs1);
+        return do_csrrw_i128(ctx, a->rd, a->csr,
+                             ctx->zero, ctx->zero, mask, ctx->zero);
     }
-
-    TCGv mask = tcg_constant_tl(a->rs1);
-    return do_csrrw(ctx, a->rd, a->csr, ctx->zero, mask);
 }
-- 
2.33.0




* [PATCH v3 19/21] target/riscv: actual functions to realize crs 128-bit insns
  2021-10-19  9:47 [PATCH v3 00/21] Adding partial support for 128-bit riscv target Frédéric Pétrot
                   ` (17 preceding siblings ...)
  2021-10-19  9:48 ` [PATCH v3 18/21] target/riscv: modification of the trans_csrxx for 128-bit support Frédéric Pétrot
@ 2021-10-19  9:48 ` Frédéric Pétrot
  2021-10-20 22:18   ` Richard Henderson
  2021-10-19  9:48 ` [PATCH v3 20/21] target/riscv: adding 128-bit access functions for some csrs Frédéric Pétrot
  2021-10-19  9:48 ` [PATCH v3 21/21] target/riscv: support for 128-bit satp Frédéric Pétrot
  20 siblings, 1 reply; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-19  9:48 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: bin.meng, richard.henderson, alistair.francis, fabien.portas,
	palmer, Frédéric Pétrot, philmd

The csrs are accessed through function pointers: we set up the table
for the 128-bit accesses, turn the stub into a function that performs
the actual access, and implement basic accesses to read-only csrs.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
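Note, not part of the patch: a standalone sketch of a table-driven 128-bit
csr access with a fallback to 64-bit storage when no 128-bit accessor is
registered. All names (ops128, csrrw128, legacy64, the EXCP_* values and the
four-entry table) are hypothetical; privilege and predicate checks are left
out, and the fallback here touches a plain array instead of calling
riscv_csrrw.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint64_t lo, hi; } u128;

enum { EXCP_NONE = 0, EXCP_ILLEGAL = 2, TABLE_SIZE = 4 };

typedef int (*read128_fn)(int csrno, u128 *val);
typedef int (*write128_fn)(int csrno, u128 val);

typedef struct {
    read128_fn read128;
    write128_fn write128;
} csr_ops128;

static u128 scratch128;               /* storage of the writable 128-bit csr */
static uint64_t legacy64[TABLE_SIZE]; /* storage of the 64-bit only csrs */

static int read_zero128(int csrno, u128 *val)
{
    (void)csrno;
    val->lo = 0;
    val->hi = 0;
    return EXCP_NONE;
}

static int read_scratch128(int csrno, u128 *val)
{
    (void)csrno;
    *val = scratch128;
    return EXCP_NONE;
}

static int write_scratch128(int csrno, u128 val)
{
    (void)csrno;
    scratch128 = val;
    return EXCP_NONE;
}

/* Entries 0 and 3 are left empty on purpose: they take the fallback path. */
static const csr_ops128 ops128[TABLE_SIZE] = {
    [1] = { read_zero128, NULL },
    [2] = { read_scratch128, write_scratch128 },
};

static int csrrw128(int csrno, u128 *ret, u128 new_value, u128 mask)
{
    u128 old, merged;
    int err;

    if (csrno < 0 || csrno >= TABLE_SIZE) {
        return EXCP_ILLEGAL;
    }
    if (!ops128[csrno].read128) {
        /* No 128-bit accessor: use the low halves, zero-extend the result. */
        old.lo = legacy64[csrno];
        old.hi = 0;
        legacy64[csrno] = (old.lo & ~mask.lo) | (new_value.lo & mask.lo);
    } else {
        err = ops128[csrno].read128(csrno, &old);
        if (err != EXCP_NONE) {
            return err;
        }
        /* Write only if some mask bit is set and a write accessor exists. */
        if ((mask.lo | mask.hi) != 0 && ops128[csrno].write128) {
            merged.lo = (old.lo & ~mask.lo) | (new_value.lo & mask.lo);
            merged.hi = (old.hi & ~mask.hi) | (new_value.hi & mask.hi);
            err = ops128[csrno].write128(csrno, merged);
            if (err != EXCP_NONE) {
                return err;
            }
        }
    }
    if (ret) {
        *ret = old;
    }
    return EXCP_NONE;
}
```

As in the patch, the old value is returned and a write to an entry without
a write accessor is silently dropped.
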
 target/riscv/cpu.h |  16 +++++
 target/riscv/csr.c | 152 ++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 165 insertions(+), 3 deletions(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index eb4f63fcbf..253e87cd92 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -474,6 +474,15 @@ RISCVException riscv_csrrw_i128(CPURISCVState *env, int csrno,
                                 Int128 *ret_value,
                                 Int128 new_value, Int128 write_mask);
 
+typedef RISCVException (*riscv_csr_read128_fn)(CPURISCVState *env, int csrno,
+                                               Int128 *ret_value);
+typedef RISCVException (*riscv_csr_write128_fn)(CPURISCVState *env, int csrno,
+                                             Int128 new_value);
+typedef RISCVException (*riscv_csr_op128_fn)(CPURISCVState *env, int csrno,
+                                             Int128 *ret_value,
+                                             Int128 new_value,
+                                             Int128 write_mask);
+
 typedef struct {
     const char *name;
     riscv_csr_predicate_fn predicate;
@@ -482,6 +491,12 @@ typedef struct {
     riscv_csr_op_fn op;
 } riscv_csr_operations;
 
+typedef struct {
+    riscv_csr_read128_fn read128;
+    riscv_csr_write128_fn write128;
+    riscv_csr_op128_fn op128;
+} riscv_csr_operations128;
+
 /* CSR function table constants */
 enum {
     CSR_TABLE_SIZE = 0x1000
@@ -489,6 +504,7 @@ enum {
 
 /* CSR function table */
 extern riscv_csr_operations csr_ops[CSR_TABLE_SIZE];
+extern riscv_csr_operations128 csr_ops_128[CSR_TABLE_SIZE];
 
 void riscv_get_csr_ops(int csrno, riscv_csr_operations *ops);
 void riscv_set_csr_ops(int csrno, riscv_csr_operations *ops);
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index b802ee0dbc..3aac19e277 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -462,6 +462,13 @@ static const char valid_vm_1_10_64[16] = {
 };
 
 /* Machine Information Registers */
+static RISCVException read_zero_i128(CPURISCVState *env, int csrno,
+                                    Int128 *val)
+{
+    *val = int128_zero();
+    return RISCV_EXCP_NONE;
+}
+
 static RISCVException read_zero(CPURISCVState *env, int csrno,
                                 target_ulong *val)
 {
@@ -469,6 +476,13 @@ static RISCVException read_zero(CPURISCVState *env, int csrno,
     return RISCV_EXCP_NONE;
 }
 
+static RISCVException read_mhartid_i128(CPURISCVState *env, int csrno,
+                                       Int128 *val)
+{
+    *val = int128_make64(env->mhartid);
+    return RISCV_EXCP_NONE;
+}
+
 static RISCVException read_mhartid(CPURISCVState *env, int csrno,
                                    target_ulong *val)
 {
@@ -569,6 +583,13 @@ static RISCVException write_mstatush(CPURISCVState *env, int csrno,
     return RISCV_EXCP_NONE;
 }
 
+static RISCVException read_misa_i128(CPURISCVState *env, int csrno,
+                                    Int128 *val)
+{
+    *val = int128_make128(env->misa_ext, (uint64_t)MXL_RV128 << 62);
+    return RISCV_EXCP_NONE;
+}
+
 static RISCVException read_misa(CPURISCVState *env, int csrno,
                                 target_ulong *val)
 {
@@ -1516,11 +1537,118 @@ RISCVException riscv_csrrw(CPURISCVState *env, int csrno,
     return RISCV_EXCP_NONE;
 }
 
+static inline RISCVException riscv_csrrw_check_i128(CPURISCVState *env,
+                                                    int csrno,
+                                                    Int128 write_mask,
+                                                    RISCVCPU *cpu)
+{
+    /* check privileges and return RISCV_EXCP_ILLEGAL_INST if check fails */
+#if !defined(CONFIG_USER_ONLY)
+    int effective_priv = env->priv;
+    int read_only = get_field(csrno, 0xc00) == 3;
+
+    if (riscv_has_ext(env, RVH) &&
+        env->priv == PRV_S &&
+        !riscv_cpu_virt_enabled(env)) {
+        /*
+         * We are in S mode without virtualisation, therefore we are in HS Mode.
+         * Add 1 to the effective privilege level to allow us to access the
+         * Hypervisor CSRs.
+         */
+        effective_priv++;
+    }
+
+    if ((int128_nz(write_mask) && read_only) ||
+        (!env->debugger && (effective_priv < get_field(csrno, 0x300)))) {
+        return RISCV_EXCP_ILLEGAL_INST;
+    }
+#endif
+
+    /* ensure the CSR extension is enabled. */
+    if (!cpu->cfg.ext_icsr) {
+        return RISCV_EXCP_ILLEGAL_INST;
+    }
+
+    /* check predicate */
+    if (!csr_ops[csrno].predicate) {
+        return RISCV_EXCP_ILLEGAL_INST;
+    }
+    RISCVException ret = csr_ops[csrno].predicate(env, csrno);
+    if (ret != RISCV_EXCP_NONE) {
+        return ret;
+    }
+
+    return RISCV_EXCP_NONE;
+}
+
 RISCVException riscv_csrrw_i128(CPURISCVState *env, int csrno,
-                               Int128 *ret_value,
-                               Int128 new_value, Int128 write_mask)
+                                Int128 *ret_value,
+                                Int128 new_value, Int128 write_mask)
 {
-    return RISCV_EXCP_ILLEGAL_INST;
+    RISCVException ret;
+    Int128 old_value;
+
+    RISCVCPU *cpu = env_archcpu(env);
+
+    if (!csr_ops_128[csrno].read128 && !csr_ops_128[csrno].op128) {
+        /*
+         * FIXME: Fall back to 64-bit version for now, if the 128-bit
+         * alternative isn't defined.
+         * Note, some CSRs don't extend to MXLEN, for those,
+         * this fallback is correctly handling the read/write.
+         */
+        target_ulong ret_64;
+        ret = riscv_csrrw(env, csrno, &ret_64,
+                          int128_getlo(new_value),
+                          int128_getlo(write_mask));
+
+        if (ret_value) {
+            *ret_value = int128_make64(ret_64);
+        }
+
+        return ret;
+    }
+
+    RISCVException check_status =
+        riscv_csrrw_check_i128(env, csrno, write_mask, cpu);
+    if (check_status != RISCV_EXCP_NONE) {
+        return check_status;
+    }
+
+    /* execute combined read/write operation if it exists */
+    if (csr_ops_128[csrno].op128) {
+        return csr_ops_128[csrno].op128(env, csrno, ret_value,
+                                        new_value, write_mask);
+    }
+
+    /* if no accessor exists then return failure */
+    if (!csr_ops_128[csrno].read128) {
+        return RISCV_EXCP_ILLEGAL_INST;
+    }
+    /* read old value */
+    ret = csr_ops_128[csrno].read128(env, csrno, &old_value);
+    if (ret != RISCV_EXCP_NONE) {
+        return ret;
+    }
+
+    /* write value if writable and write mask set, otherwise drop writes */
+    if (int128_nz(write_mask)) {
+        new_value = int128_or(int128_and(old_value, int128_not(write_mask)),
+                              int128_and(new_value, write_mask));
+        if (csr_ops_128[csrno].write128) {
+            ret = csr_ops_128[csrno].write128(env, csrno, new_value);
+            if (ret != RISCV_EXCP_NONE) {
+                return ret;
+            }
+        }
+    }
+
+    /* return old value */
+    if (ret_value) {
+        *ret_value = old_value;
+    }
+
+    return RISCV_EXCP_NONE;
 }
 
 /*
@@ -1544,6 +1672,24 @@ RISCVException riscv_csrrw_debug(CPURISCVState *env, int csrno,
 }
 
 /* Control and Status Register function table */
+riscv_csr_operations128 csr_ops_128[CSR_TABLE_SIZE] = {
+#if !defined(CONFIG_USER_ONLY)
+    [CSR_MVENDORID]  = { read_zero_i128    },
+    [CSR_MARCHID]    = { read_zero_i128    },
+    [CSR_MIMPID]     = { read_zero_i128    },
+    [CSR_MHARTID]    = { read_mhartid_i128 },
+
+    [CSR_MSTATUS]    = { read_zero_i128    },
+    [CSR_MISA]       = { read_misa_i128    },
+    [CSR_MTVEC]      = { read_zero_i128    },
+
+    [CSR_MSCRATCH]   = { read_zero_i128    },
+    [CSR_MEPC]       = { read_zero_i128    },
+
+    [CSR_SATP]       = { read_zero_i128    },
+#endif
+};
+
 riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
     /* User Floating-Point CSRs */
     [CSR_FFLAGS]   = { "fflags",   fs,     read_fflags,  write_fflags },
-- 
2.33.0




* [PATCH v3 20/21] target/riscv: adding 128-bit access functions for some csrs
  2021-10-19  9:47 [PATCH v3 00/21] Adding partial support for 128-bit riscv target Frédéric Pétrot
                   ` (18 preceding siblings ...)
  2021-10-19  9:48 ` [PATCH v3 19/21] target/riscv: actual functions to realize crs 128-bit insns Frédéric Pétrot
@ 2021-10-19  9:48 ` Frédéric Pétrot
  2021-10-20 23:18   ` Richard Henderson
  2021-10-19  9:48 ` [PATCH v3 21/21] target/riscv: support for 128-bit satp Frédéric Pétrot
  20 siblings, 1 reply; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-19  9:48 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: bin.meng, richard.henderson, alistair.francis, fabien.portas,
	palmer, Frédéric Pétrot, philmd

Access to mstatus, mtvec, mscratch and mepc is implemented.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
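Note, not part of the patch: a standalone sketch of two details of the
accessors below, namely where the SD bit lands in a 128-bit mstatus (bit 63
of the high half, i.e. bit 127) and the mode check on an mtvec write. The
names u128, add_sd_rv128 and write_mtvec_sketch are hypothetical; FS_MASK
and XS_MASK are assumed to match the mstatus.FS (bits 14:13) and mstatus.XS
(bits 16:15) fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct { uint64_t lo, hi; } u128;

#define FS_MASK  UINT64_C(0x0000000000006000)  /* mstatus.FS, bits 14:13 */
#define XS_MASK  UINT64_C(0x0000000000018000)  /* mstatus.XS, bits 16:15 */
#define SD_HIGH  UINT64_C(0x8000000000000000)  /* bit 63 of the high half */

/* Set SD when FS or XS reads as dirty (both bits of the field set). */
static u128 add_sd_rv128(u128 mstatus)
{
    bool dirty = (mstatus.lo & FS_MASK) == FS_MASK ||
                 (mstatus.lo & XS_MASK) == XS_MASK;

    if (dirty) {
        mstatus.hi |= SD_HIGH;
    }
    return mstatus;
}

/*
 * Bits [1:0] of mtvec encode the mode: 0 = direct, 1 = vectored.
 * Values >= 2 are reserved, and such a write is ignored.
 * Returns true when the value was stored.
 */
static bool write_mtvec_sketch(u128 *mtvec, u128 val)
{
    if ((val.lo & 3) >= 2) {
        return false;
    }
    *mtvec = val;
    return true;
}
```

Only the low half is inspected in both functions, since all the fields
involved sit in the low 64 bits; the high half is carried along unchanged.
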
 target/riscv/cpu_bits.h |   1 +
 target/riscv/csr.c      | 111 ++++++++++++++++++++++++++++++++++++++--
 2 files changed, 108 insertions(+), 4 deletions(-)

diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index e248c6bf6d..e4750afc78 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -360,6 +360,7 @@
 
 #define MSTATUS32_SD        0x80000000
 #define MSTATUS64_SD        0x8000000000000000ULL
+#define MSTATUSH128_SD      0x8000000000000000ULL
 
 #define MISA32_MXL          0xC0000000
 #define MISA64_MXL          0xC000000000000000ULL
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 3aac19e277..877cd2d63a 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -509,6 +509,61 @@ static uint64_t add_status_sd(RISCVMXL xl, uint64_t status)
     return status;
 }
 
+static RISCVException read_mstatus_i128(CPURISCVState *env, int csrno,
+                                   Int128 *val)
+{
+    *val = int128_make128(env->mstatus, env->mstatush);
+    return RISCV_EXCP_NONE;
+}
+
+static RISCVException write_mstatus_i128(CPURISCVState *env, int csrno,
+                                        Int128 val)
+{
+    Int128 mstatus = int128_make128(env->mstatus, env->mstatush);
+    Int128 mask = int128_zero();
+    int dirty;
+
+    /* flush tlb on mstatus fields that affect VM */
+    if (int128_getlo(int128_xor(mstatus, val))
+            & (MSTATUS_MXR | MSTATUS_MPP | MSTATUS_MPV |
+                           MSTATUS_MPRV | MSTATUS_SUM)) {
+        tlb_flush(env_cpu(env));
+    }
+    mask = int128_make64(MSTATUS_SIE | MSTATUS_SPIE |
+                         MSTATUS_MIE | MSTATUS_MPIE |
+                         MSTATUS_SPP | MSTATUS_FS | MSTATUS_MPRV | MSTATUS_SUM |
+                         MSTATUS_MPP | MSTATUS_MXR | MSTATUS_TVM | MSTATUS_TSR |
+                         MSTATUS_TW);
+
+    if (!riscv_cpu_is_32bit(env)) {
+        /*
+         * RV32: MPV and GVA are not in mstatus. The current plan is to
+         * add them to mstatush. For now, we just don't support it.
+         */
+        mask = int128_or(mask, int128_make64(MSTATUS_MPV | MSTATUS_GVA));
+    }
+
+    mstatus = int128_or(int128_and(mstatus, int128_not(mask)),
+                        int128_and(val, mask));
+
+    dirty = ((int128_getlo(mstatus) & MSTATUS_FS) == MSTATUS_FS) |
+            ((int128_getlo(mstatus) & MSTATUS_XS) == MSTATUS_XS);
+    if (dirty) {
+        if (riscv_cpu_is_32bit(env)) {
+            mstatus = int128_make64(int128_getlo(mstatus) | MSTATUS32_SD);
+        } else if (riscv_cpu_is_64bit(env)) {
+            mstatus = int128_make64(int128_getlo(mstatus) | MSTATUS64_SD);
+        } else {
+            mstatus = int128_or(mstatus, int128_make128(0, MSTATUSH128_SD));
+        }
+    }
+
+    env->mstatus = int128_getlo(mstatus);
+    env->mstatush = int128_gethi(mstatus);
+
+    return RISCV_EXCP_NONE;
+}
+
 static RISCVException read_mstatus(CPURISCVState *env, int csrno,
                                    target_ulong *val)
 {
@@ -713,6 +768,26 @@ static RISCVException write_mie(CPURISCVState *env, int csrno,
     return RISCV_EXCP_NONE;
 }
 
+static RISCVException read_mtvec_i128(CPURISCVState *env, int csrno,
+                                     Int128 *val)
+{
+    *val = int128_make128(env->mtvec, env->mtvech);
+    return RISCV_EXCP_NONE;
+}
+
+static RISCVException write_mtvec_i128(CPURISCVState *env, int csrno,
+                                      Int128 val)
+{
+    /* bits [1:0] encode mode; 0 = direct, 1 = vectored, >= 2 reserved */
+    if ((int128_getlo(val) & 3) < 2) {
+        env->mtvec = int128_getlo(val);
+        env->mtvech = int128_gethi(val);
+    } else {
+        qemu_log_mask(LOG_UNIMP, "CSR_MTVEC: reserved mode not supported\n");
+    }
+    return RISCV_EXCP_NONE;
+}
+
 static RISCVException read_mtvec(CPURISCVState *env, int csrno,
                                  target_ulong *val)
 {
@@ -747,6 +822,19 @@ static RISCVException write_mcounteren(CPURISCVState *env, int csrno,
 }
 
 /* Machine Trap Handling */
+static RISCVException read_mscratch_i128(CPURISCVState *env, int csrno,
+                                        Int128 *val)  {
+    *val = int128_make128(env->mscratch, env->mscratchh);
+    return RISCV_EXCP_NONE;
+}
+
+static RISCVException write_mscratch_i128(CPURISCVState *env, int csrno,
+                                         Int128 val) {
+    env->mscratch = int128_getlo(val);
+    env->mscratchh = int128_gethi(val);
+    return RISCV_EXCP_NONE;
+}
+
 static RISCVException read_mscratch(CPURISCVState *env, int csrno,
                                     target_ulong *val)
 {
@@ -761,6 +849,21 @@ static RISCVException write_mscratch(CPURISCVState *env, int csrno,
     return RISCV_EXCP_NONE;
 }
 
+static RISCVException read_mepc_i128(CPURISCVState *env, int csrno,
+                                    Int128 *val)
+{
+    *val = int128_make128(env->mepc, env->mepch);
+    return RISCV_EXCP_NONE;
+}
+
+static RISCVException write_mepc_i128(CPURISCVState *env, int csrno,
+                                     Int128 val)
+{
+    env->mepc = int128_getlo(val);
+    env->mepch = int128_gethi(val);
+    return RISCV_EXCP_NONE;
+}
+
 static RISCVException read_mepc(CPURISCVState *env, int csrno,
                                      target_ulong *val)
 {
@@ -1679,12 +1782,12 @@ riscv_csr_operations128 csr_ops_128[CSR_TABLE_SIZE] = {
     [CSR_MIMPID]     = { read_zero_i128    },
     [CSR_MHARTID]    = { read_mhartid_i128 },
 
-    [CSR_MSTATUS]    = { read_zero_i128    },
+    [CSR_MSTATUS]    = { read_mstatus_i128,  write_mstatus_i128  },
     [CSR_MISA]       = { read_misa_i128    },
-    [CSR_MTVEC]      = { read_zero_i128    },
+    [CSR_MTVEC]      = { read_mtvec_i128,    write_mtvec_i128    },
 
-    [CSR_MSCRATCH]   = { read_zero_i128    },
-    [CSR_MEPC]       = { read_zero_i128    },
+    [CSR_MSCRATCH]   = { read_mscratch_i128, write_mscratch_i128 },
+    [CSR_MEPC]       = { read_mepc_i128,     write_mepc_i128     },
 
     [CSR_SATP]       = { read_zero_i128    },
 #endif
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH v3 21/21] target/riscv: support for 128-bit satp
  2021-10-19  9:47 [PATCH v3 00/21] Adding partial support for 128-bit riscv target Frédéric Pétrot
                   ` (19 preceding siblings ...)
  2021-10-19  9:48 ` [PATCH v3 20/21] target/riscv: adding 128-bit access functions for some csrs Frédéric Pétrot
@ 2021-10-19  9:48 ` Frédéric Pétrot
  2021-10-20 23:09   ` Richard Henderson
  20 siblings, 1 reply; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-19  9:48 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: bin.meng, richard.henderson, alistair.francis, fabien.portas,
	palmer, Frédéric Pétrot, philmd

Support for a 128-bit satp. This is a bit more involved than necessary
because we took the opportunity to increase the page size to 16kB, and
change the page table geometry, which makes the page walk a bit more
parametrizable (variables instead of defines).
Note that this is in any case a necessary step for merging the 32-bit and
64-bit riscv versions into a single executable.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
 target/riscv/cpu-param.h  |   9 +++-
 target/riscv/cpu_bits.h   |  10 ++++
 target/riscv/cpu_helper.c |  54 ++++++++++++++------
 target/riscv/csr.c        | 105 ++++++++++++++++++++++++++++++++------
 4 files changed, 144 insertions(+), 34 deletions(-)
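
For readers following the geometry change, here is a minimal sketch of how
the parametrized walk derives the address width and per-level indices. The
levels/ptidxbits/pgshift values are the ones set for VM_1_10_SV44 and
VM_1_10_SV54 in this patch; the PtGeometry struct and the helper names are
hypothetical and only serve the illustration. With 16 KiB pages and 16-byte
PTEs a table holds 1024 entries, which is where the 10 index bits per level
come from.

```c
#include <stdint.h>

/* Hypothetical container for the per-mode values set in switch (vm). */
typedef struct {
    int levels;     /* number of page-table levels */
    int ptidxbits;  /* index bits consumed per level */
    int pgshift;    /* log2 of the page size */
} PtGeometry;

/* Values taken from the VM_1_10_SV44 / VM_1_10_SV54 cases of the patch. */
static const PtGeometry geom_sv44 = { 3, 10, 14 };
static const PtGeometry geom_sv54 = { 4, 10, 14 };

/* Virtual address width, ignoring the hypervisor "widened" bits. */
static int va_bits(const PtGeometry *g)
{
    return g->pgshift + g->levels * g->ptidxbits;
}

/* Page-table index used at walk step i (i == 0 is the root level). */
static uint64_t vpn_index(const PtGeometry *g, uint64_t addr, int i)
{
    int ptshift = (g->levels - 1 - i) * g->ptidxbits;

    return (addr >> (g->pgshift + ptshift)) & ((1ULL << g->ptidxbits) - 1);
}

/* Offset inside the page, i.e. addr & pgoff_mask in the patch. */
static uint64_t page_offset(const PtGeometry *g, uint64_t addr)
{
    return addr & ((1ULL << g->pgshift) - 1);
}
```

Keeping pgshift next to levels and ptidxbits means a single walk loop can
serve both the 4 KiB and the 16 KiB layouts.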

diff --git a/target/riscv/cpu-param.h b/target/riscv/cpu-param.h
index c10459b56f..78f0916403 100644
--- a/target/riscv/cpu-param.h
+++ b/target/riscv/cpu-param.h
@@ -19,10 +19,15 @@
 #else
 /* 64-bit target, since QEMU isn't built to have TARGET_LONG_BITS over 64 */
 # define TARGET_LONG_BITS 64
-# define TARGET_PHYS_ADDR_SPACE_BITS 56 /* 44-bit PPN */
-# define TARGET_VIRT_ADDR_SPACE_BITS 48 /* sv48 */
+# define TARGET_PHYS_ADDR_SPACE_BITS 64 /* 54-bit PPN */
+# define TARGET_VIRT_ADDR_SPACE_BITS 44 /* sv44 */
 #endif
+
+#if defined(TARGET_RISCV32) || defined(TARGET_RISCV64)
 #define TARGET_PAGE_BITS 12 /* 4 KiB Pages */
+#else
+#define TARGET_PAGE_BITS 14 /* 16 KiB pages for RV128 */
+#endif
 /*
  * The current MMU Modes are:
  *  - U mode 0b000
diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index e4750afc78..b04b103e31 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -430,6 +430,11 @@ typedef enum {
 #define SATP64_ASID         0x0FFFF00000000000ULL
 #define SATP64_PPN          0x00000FFFFFFFFFFFULL
 
+/* RV128 satp CSR field masks (H/L for high/low dword) */
+#define SATP128_HMODE       0xFF00000000000000ULL
+#define SATP128_HASID       0x00FFFFFFFF000000ULL
+#define SATP128_LPPN        0x0003FFFFFFFFFFFFULL
+
 /* VM modes (mstatus.vm) privileged ISA 1.9.1 */
 #define VM_1_09_MBARE       0
 #define VM_1_09_MBB         1
@@ -445,6 +450,9 @@ typedef enum {
 #define VM_1_10_SV48        9
 #define VM_1_10_SV57        10
 #define VM_1_10_SV64        11
+#define VM_1_10_SV44        12
+#define VM_1_10_SV54        13
+
 
 /* Page table entry (PTE) fields */
 #define PTE_V               0x001 /* Valid */
@@ -462,6 +470,8 @@ typedef enum {
 
 /* Leaf page shift amount */
 #define PGSHIFT             12
+/* For now, pages in RV128 are 16 KiB. */
+#define PGSHIFT128          14
 
 /* Default Reset Vector adress */
 #define DEFAULT_RSTVEC      0x1000
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 0d1132f39d..d4b1e328ae 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -469,7 +469,7 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
     *prot = 0;
 
     hwaddr base;
-    int levels, ptidxbits, ptesize, vm, sum, mxr, widened;
+    int levels, ptidxbits, ptesize, vm, sum, mxr, widened, pgshift;
 
     if (first_stage == true) {
         mxr = get_field(env->mstatus, MSTATUS_MXR);
@@ -482,17 +482,25 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
             if (riscv_cpu_mxl(env) == MXL_RV32) {
                 base = (hwaddr)get_field(env->vsatp, SATP32_PPN) << PGSHIFT;
                 vm = get_field(env->vsatp, SATP32_MODE);
-            } else {
+            } else if (riscv_cpu_mxl(env) == MXL_RV64) {
                 base = (hwaddr)get_field(env->vsatp, SATP64_PPN) << PGSHIFT;
                 vm = get_field(env->vsatp, SATP64_MODE);
+            } else {
+                /* TODO : Hypervisor extension not supported yet in RV128. */
+                g_assert_not_reached();
             }
         } else {
             if (riscv_cpu_mxl(env) == MXL_RV32) {
                 base = (hwaddr)get_field(env->satp, SATP32_PPN) << PGSHIFT;
                 vm = get_field(env->satp, SATP32_MODE);
-            } else {
+            } else if (riscv_cpu_mxl(env) == MXL_RV64) {
                 base = (hwaddr)get_field(env->satp, SATP64_PPN) << PGSHIFT;
                 vm = get_field(env->satp, SATP64_MODE);
+            } else if (riscv_cpu_mxl(env) == MXL_RV128) {
+                base = (hwaddr)get_field(env->satp, SATP128_LPPN) << PGSHIFT128;
+                vm = get_field(env->satph, SATP128_HMODE);
+            } else {
+                g_assert_not_reached();
             }
         }
         widened = 0;
@@ -500,9 +508,15 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
         if (riscv_cpu_mxl(env) == MXL_RV32) {
             base = (hwaddr)get_field(env->hgatp, SATP32_PPN) << PGSHIFT;
             vm = get_field(env->hgatp, SATP32_MODE);
-        } else {
+        } else if (riscv_cpu_mxl(env) == MXL_RV64) {
             base = (hwaddr)get_field(env->hgatp, SATP64_PPN) << PGSHIFT;
             vm = get_field(env->hgatp, SATP64_MODE);
+        } else {
+            /*
+             * TODO : Hypervisor extension not supported yet in RV128,
+             * so there shouldn't be any two-stage address lookups.
+             */
+            g_assert_not_reached();
         }
         widened = 2;
     }
@@ -510,13 +524,17 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
     sum = get_field(env->mstatus, MSTATUS_SUM) || use_background || is_debug;
     switch (vm) {
     case VM_1_10_SV32:
-      levels = 2; ptidxbits = 10; ptesize = 4; break;
+      levels = 2; ptidxbits = 10; ptesize = 4; pgshift = 12; break;
     case VM_1_10_SV39:
-      levels = 3; ptidxbits = 9; ptesize = 8; break;
+      levels = 3; ptidxbits = 9; ptesize = 8; pgshift = 12; break;
     case VM_1_10_SV48:
-      levels = 4; ptidxbits = 9; ptesize = 8; break;
+      levels = 4; ptidxbits = 9; ptesize = 8; pgshift = 12; break;
     case VM_1_10_SV57:
-      levels = 5; ptidxbits = 9; ptesize = 8; break;
+      levels = 5; ptidxbits = 9; ptesize = 8; pgshift = 12; break;
+    case VM_1_10_SV44:
+      levels = 3; ptidxbits = 10; ptesize = 16; pgshift = 14; break;
+    case VM_1_10_SV54:
+      levels = 4; ptidxbits = 10; ptesize = 16;  pgshift = 14; break;
     case VM_1_10_MBARE:
         *physical = addr;
         *prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
@@ -526,7 +544,7 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
     }
 
     CPUState *cs = env_cpu(env);
-    int va_bits = PGSHIFT + levels * ptidxbits + widened;
+    int va_bits = pgshift + levels * ptidxbits + widened;
     target_ulong mask, masked_msbs;
 
     if (TARGET_LONG_BITS > (va_bits - 1)) {
@@ -541,6 +559,7 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
     }
 
     int ptshift = (levels - 1) * ptidxbits;
+    uint64_t pgoff_mask = (1ULL << pgshift) - 1;
     int i;
 
 #if !TCG_OVERSIZED_GUEST
@@ -549,10 +568,10 @@ restart:
     for (i = 0; i < levels; i++, ptshift -= ptidxbits) {
         target_ulong idx;
         if (i == 0) {
-            idx = (addr >> (PGSHIFT + ptshift)) &
+            idx = (addr >> (pgshift + ptshift)) &
                            ((1 << (ptidxbits + widened)) - 1);
         } else {
-            idx = (addr >> (PGSHIFT + ptshift)) &
+            idx = (addr >> (pgshift + ptshift)) &
                            ((1 << ptidxbits) - 1);
         }
 
@@ -560,6 +579,7 @@ restart:
         hwaddr pte_addr;
 
         if (two_stage && first_stage) {
+            /* TODO : Two-stage translation for RV128 */
             int vbase_prot;
             hwaddr vbase;
 
@@ -593,6 +613,10 @@ restart:
         if (riscv_cpu_mxl(env) == MXL_RV32) {
             pte = address_space_ldl(cs->as, pte_addr, attrs, &res);
         } else {
+            /*
+             * For RV128, load only lower 64 bits as only those
+             * are used for now
+             */
             pte = address_space_ldq(cs->as, pte_addr, attrs, &res);
         }
 
@@ -607,7 +631,7 @@ restart:
             return TRANSLATE_FAIL;
         } else if (!(pte & (PTE_R | PTE_W | PTE_X))) {
             /* Inner PTE, continue walking */
-            base = ppn << PGSHIFT;
+            base = ppn << pgshift;
         } else if ((pte & (PTE_R | PTE_W | PTE_X)) == PTE_W) {
             /* Reserved leaf PTE flags: PTE_W */
             return TRANSLATE_FAIL;
@@ -679,9 +703,9 @@ restart:
 
             /* for superpage mappings, make a fake leaf PTE for the TLB's
                benefit. */
-            target_ulong vpn = addr >> PGSHIFT;
-            *physical = ((ppn | (vpn & ((1L << ptshift) - 1))) << PGSHIFT) |
-                        (addr & ~TARGET_PAGE_MASK);
+            target_ulong vpn = addr >> pgshift;
+            *physical = ((ppn | (vpn & ((1L << ptshift) - 1))) << pgshift) |
+                        (addr & pgoff_mask);
 
             /* set permissions on the TLB entry */
             if ((pte & PTE_R) || ((pte & PTE_X) && mxr)) {
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 877cd2d63a..028adab6a8 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -461,6 +461,12 @@ static const char valid_vm_1_10_64[16] = {
     [VM_1_10_SV57] = 1
 };
 
+static const bool valid_vm_1_10_128[16] = {
+    [VM_1_10_MBARE] = 1,
+    [VM_1_10_SV44] = 1,
+    [VM_1_10_SV54] = 1
+};
+
 /* Machine Information Registers */
 static RISCVException read_zero_i128(CPURISCVState *env, int csrno,
                                     Int128 *val)
@@ -535,29 +541,27 @@ static RISCVException write_mstatus_i128(CPURISCVState *env, int csrno,
                          MSTATUS_MPP | MSTATUS_MXR | MSTATUS_TVM | MSTATUS_TSR |
                          MSTATUS_TW);
 
-    if (!riscv_cpu_is_32bit(env)) {
-        /*
-         * RV32: MPV and GVA are not in mstatus. The current plan is to
-         * add them to mstatush. For now, we just don't support it.
-         */
-        mask = int128_or(mask, int128_make64(MSTATUS_MPV | MSTATUS_GVA));
-    }
+    mask = int128_or(mask, int128_make64(MSTATUS_MPV | MSTATUS_GVA));
 
     mstatus = int128_or(int128_and(mstatus, int128_not(mask)),
                         int128_and(val, mask));
 
     dirty = ((int128_getlo(mstatus) & MSTATUS_FS) == MSTATUS_FS) |
             ((int128_getlo(mstatus) & MSTATUS_XS) == MSTATUS_XS);
+
+    /* Cannot use add_status_sd here, let's do it the old way */
     if (dirty) {
-        if (riscv_cpu_is_32bit(env)) {
-            mstatus = int128_make64(int128_getlo(mstatus) | MSTATUS32_SD);
-        } else if (riscv_cpu_is_64bit(env)) {
-            mstatus = int128_make64(int128_getlo(mstatus) | MSTATUS64_SD);
-        } else {
-            mstatus = int128_or(mstatus, int128_make128(0, MSTATUSH128_SD));
-        }
+        mstatus = int128_or(mstatus, int128_make128(0, MSTATUSH128_SD));
     }
 
+    /* SXL and UXL fields are for now read only, at xl_max */
+    mstatus = int128_make128(
+                     set_field(int128_getlo(mstatus), MSTATUS64_SXL, MXL_RV128),
+                     int128_gethi(mstatus));
+    mstatus = int128_make128(
+                     set_field(int128_getlo(mstatus), MSTATUS64_UXL, MXL_RV128),
+                     int128_gethi(mstatus));
+
     env->mstatus = int128_getlo(mstatus);
     env->mstatush = int128_gethi(mstatus);
 
@@ -575,8 +579,12 @@ static int validate_vm(CPURISCVState *env, target_ulong vm)
 {
     if (riscv_cpu_mxl(env) == MXL_RV32) {
         return valid_vm_1_10_32[vm & 0xf];
-    } else {
+    } else if (riscv_cpu_mxl(env) == MXL_RV64) {
         return valid_vm_1_10_64[vm & 0xf];
+    } else if (riscv_cpu_mxl(env) == MXL_RV128) {
+        return valid_vm_1_10_128[vm & 0xf];
+    } else {
+        return 0;
     }
 }
 
@@ -1114,6 +1122,69 @@ static RISCVException rmw_sip(CPURISCVState *env, int csrno,
 }
 
 /* Supervisor Protection and Translation */
+static RISCVException read_satp_i128(CPURISCVState *env, int csrno,
+                                    Int128 *val)
+{
+    if (!riscv_feature(env, RISCV_FEATURE_MMU)) {
+        *val = int128_zero();
+        return RISCV_EXCP_NONE;
+    }
+
+    if (env->priv == PRV_S && get_field(env->mstatus, MSTATUS_TVM)) {
+        return RISCV_EXCP_ILLEGAL_INST;
+    } else {
+        *val = int128_make128(env->satp, env->satph);
+    }
+
+    return RISCV_EXCP_NONE;
+}
+
+static RISCVException write_satp_i128(CPURISCVState *env, int csrno,
+                                     Int128 val)
+{
+    uint32_t asid;
+    bool vm_ok;
+    Int128 mask;
+
+    if (!riscv_feature(env, RISCV_FEATURE_MMU)) {
+        return RISCV_EXCP_NONE;
+    }
+
+    if (riscv_cpu_mxl(env) == MXL_RV32) {
+        vm_ok = validate_vm(env, get_field(int128_getlo(val), SATP32_MODE));
+        mask = int128_make64((int128_getlo(val) ^ env->satp)
+                           & (SATP32_MODE | SATP32_ASID | SATP32_PPN));
+        asid = (int128_getlo(val) ^ env->satp) & SATP32_ASID;
+    } else if (riscv_cpu_mxl(env) == MXL_RV64) {
+        vm_ok = validate_vm(env, get_field(int128_getlo(val), SATP64_MODE));
+        mask = int128_make64((int128_getlo(val) ^ env->satp)
+                           & (SATP64_MODE | SATP64_ASID | SATP64_PPN));
+        asid = (int128_getlo(val) ^ env->satp) & SATP64_ASID;
+    } else if (riscv_cpu_mxl(env) == MXL_RV128) {
+        vm_ok = validate_vm(env, get_field(int128_gethi(val), SATP128_HMODE));
+        mask = int128_and(
+                   int128_xor(val, int128_make128(env->satp, env->satph)),
+                   int128_make128(SATP128_LPPN, SATP128_HMODE | SATP128_HASID));
+        asid = (int128_gethi(val) ^ env->satph) & SATP128_HASID;
+    } else {
+        g_assert_not_reached();
+    }
+
+
+    if (vm_ok && int128_nz(mask)) {
+        if (env->priv == PRV_S && get_field(env->mstatus, MSTATUS_TVM)) {
+            return RISCV_EXCP_ILLEGAL_INST;
+        } else {
+            if (asid) {
+                tlb_flush(env_cpu(env));
+            }
+            env->satp = int128_getlo(val);
+            env->satph = int128_gethi(val);
+        }
+    }
+    return RISCV_EXCP_NONE;
+}
+
 static RISCVException read_satp(CPURISCVState *env, int csrno,
                                 target_ulong *val)
 {
@@ -1648,7 +1719,7 @@ static inline RISCVException riscv_csrrw_check_i128(CPURISCVState *env,
     /* check privileges and return -1 if check fails */
 #if !defined(CONFIG_USER_ONLY)
     int effective_priv = env->priv;
-    int read_only = get_field(csrno, 0xc00) == 3;
+    int read_only = get_field(csrno, 0xC00) == 3;
 
     if (riscv_has_ext(env, RVH) &&
         env->priv == PRV_S &&
@@ -1789,7 +1860,7 @@ riscv_csr_operations128 csr_ops_128[CSR_TABLE_SIZE] = {
     [CSR_MSCRATCH]   = { read_mscratch_i128, write_mscratch_i128 },
     [CSR_MEPC]       = { read_mepc_i128,     write_mepc_i128     },
 
-    [CSR_SATP]       = { read_zero_i128    },
+    [CSR_SATP]       = { read_satp_i128,     write_satp_i128     },
 #endif
 };
 
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 02/21] memory: add a few defines for octo (128-bit) values
  2021-10-19  9:47 ` [PATCH v3 02/21] memory: add a few defines for octo (128-bit) values Frédéric Pétrot
@ 2021-10-19 18:00   ` Richard Henderson
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Henderson @ 2021-10-19 18:00 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

On 10/19/21 2:47 AM, Frédéric Pétrot wrote:
> Introducing unsigned quad, signed quad, and octo access types
> to handle load and store by 128-bit processors.
> 
> Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
> ---
>   include/exec/memop.h | 8 +++++++-
>   1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/include/exec/memop.h b/include/exec/memop.h
> index c554bb0ee8..476ea6cdae 100644
> --- a/include/exec/memop.h
> +++ b/include/exec/memop.h
> @@ -85,10 +85,13 @@ typedef enum MemOp {
>       MO_UB    = MO_8,
>       MO_UW    = MO_16,
>       MO_UL    = MO_32,
> +    MO_UQ    = MO_64,
>       MO_SB    = MO_SIGN | MO_8,
>       MO_SW    = MO_SIGN | MO_16,
>       MO_SL    = MO_SIGN | MO_32,
> -    MO_UQ     = MO_64,
> +    MO_SQ    = MO_SIGN | MO_64,
> +    MO_Q     = MO_64,
> +    MO_O     = MO_128,

There's no point in removing MO_Q in one patch and adding it back in the next.  And I 
guess we might as well name MO_O to MO_UO now.

> @@ -105,9 +108,12 @@ typedef enum MemOp {
>   #ifdef NEED_CPU_H
>       MO_TEUW  = MO_TE | MO_UW,
>       MO_TEUL  = MO_TE | MO_UL,
> +    MO_TEUQ  = MO_TE | MO_UQ,
>       MO_TESW  = MO_TE | MO_SW,
>       MO_TESL  = MO_TE | MO_SL,
>       MO_TEQ   = MO_TE | MO_UQ,
> +    MO_TESQ  = MO_TE | MO_SQ,

These should have been renamed at the same time as MO_Q.  Though it seems you are missing 
a rename of these throughout target/?  Surely this patch does not build as-is.
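
A minimal sketch of the naming scheme being asked for, as a simplified
stand-alone model rather than the real include/exec/memop.h. The numeric
values (log2 size in the low bits, one sign bit above them) and the helper
functions are assumptions made for the illustration; only the MO_* names
come from the discussion.

```c
/* Simplified model of the MemOp size/sign encoding; values are assumed. */
enum {
    MO_8    = 0,
    MO_16   = 1,
    MO_32   = 2,
    MO_64   = 3,
    MO_128  = 4,
    MO_SIZE = 0x07,   /* mask for the log2 size field */
    MO_SIGN = 0x08,   /* sign-extend the loaded value */

    /* Unsigned accesses: one U-prefixed name per size. */
    MO_UB = MO_8,
    MO_UW = MO_16,
    MO_UL = MO_32,
    MO_UQ = MO_64,
    MO_UO = MO_128,

    /* Signed accesses. */
    MO_SB = MO_SIGN | MO_8,
    MO_SW = MO_SIGN | MO_16,
    MO_SL = MO_SIGN | MO_32,
    MO_SQ = MO_SIGN | MO_64,
};

/* Hypothetical helpers to decode the fields. */
static int memop_size_bytes(int op)
{
    return 1 << (op & MO_SIZE);
}

static int memop_is_signed(int op)
{
    return (op & MO_SIGN) != 0;
}
```

With a single consistent spelling per size there is no MO_Q/MO_UQ pair to
keep in sync across target/ and tcg/.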


r~


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 03/21] Int128.h: addition of a few 128-bit operations
  2021-10-19  9:47 ` [PATCH v3 03/21] Int128.h: addition of a few 128-bit operations Frédéric Pétrot
@ 2021-10-19 18:15   ` Richard Henderson
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Henderson @ 2021-10-19 18:15 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

On 10/19/21 2:47 AM, Frédéric Pétrot wrote:
> +static inline void divrem128(uint64_t ul, uint64_t uh,
> +                             uint64_t vl, uint64_t vh,
> +                             uint64_t *ql, uint64_t *qh,
> +                             uint64_t *rl, uint64_t *rh)

I think we should move all of the division implementation out of the header; this is 
really much too large to inline.

I think util/int128.c would be a reasonable place.

That said, why are you splitting the Int128 apart to pass as pieces here?  Seems like 
passing the Int128 and doing the split inside would make more sense.
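
As a sketch of what "doing the split inside" could look like: the caller
hands over whole 128-bit values and the limb decomposition stays private to
the division code. U128 is a hypothetical stand-in for Int128 on hosts
without a native 128-bit type, and the function names are illustrative, not
existing QEMU API.

```c
#include <stdint.h>

/* Hypothetical stand-in for Int128 as a pair of 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} U128;

/* Split a 128-bit value into four 32-bit limbs, least significant first. */
static void u128_to_limbs(U128 x, uint32_t limb[4])
{
    limb[0] = (uint32_t)x.lo;
    limb[1] = (uint32_t)(x.lo >> 32);
    limb[2] = (uint32_t)x.hi;
    limb[3] = (uint32_t)(x.hi >> 32);
}

/* Reassemble a 128-bit value from its four limbs. */
static U128 limbs_to_u128(const uint32_t limb[4])
{
    U128 r;

    r.lo = (uint64_t)limb[0] | ((uint64_t)limb[1] << 32);
    r.hi = (uint64_t)limb[2] | ((uint64_t)limb[3] << 32);
    return r;
}
```

A divide routine built on these would take two U128 operands and return
quotient and remainder as U128, instead of eight separate uint64_t
parameters.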

> +        /* never happens, but makes gcc shy */
> +        n = 0;

Then g_assert_not_reached(), or change the previous if to an assert.

Hmm, it's not "never happens" so much as "divide by zero".
Please update the comment accordingly.

> +        if (r != NULL) {
> +            r[0] = k;
> +        }

r is a local array; useless check for null.

> +        s = clz32(v[n - 1]); /* 0 <= s <= 32 */
> +        if (s != 0) {
> +            for (i = n - 1; i > 0; i--) {
> +                vn[i] = ((v[i] << s) | (v[i - 1] >> (32 - s)));
> +            }
> +            vn[0] = v[0] << s;
> +
> +            un[m] = u[m - 1] >> (32 - s);
> +            for (i = m - 1; i > 0; i--) {
> +                un[i] = (u[i] << s) | (u[i - 1] >> (32 - s));
> +            }
> +            un[0] = u[0] << s;

Why are you shifting the 128-bit value in 4 parts, rather than letting int128_lshift do 
the job?
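
For comparison, a minimal sketch of a whole-value 128-bit left shift on a
lo/hi pair, which is roughly what letting a helper such as int128_lshift do
the work amounts to. U128 and u128_shl are hypothetical names for this
example; the point is only that the cross-limb carries are handled in one
place.

```c
#include <stdint.h>

/* Hypothetical stand-in for Int128 as a pair of 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} U128;

/* Shift left by s bits, 0 <= s < 128. */
static U128 u128_shl(U128 x, unsigned s)
{
    U128 r;

    if (s == 0) {
        /* Handled separately: shifting a uint64_t by 64 is undefined. */
        r = x;
    } else if (s < 64) {
        r.hi = (x.hi << s) | (x.lo >> (64 - s));
        r.lo = x.lo << s;
    } else {
        r.hi = x.lo << (s - 64);
        r.lo = 0;
    }
    return r;
}
```

Note that the bits shifted out of the top are dropped here; the normalization
step quoted above keeps them in an extra limb, so a real replacement would
still have to capture that overflow separately.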


r~


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 04/21] target/riscv: additional macros to check instruction support
  2021-10-19  9:47 ` [PATCH v3 04/21] target/riscv: additional macros to check instruction support Frédéric Pétrot
@ 2021-10-20 14:08   ` Richard Henderson
  2021-10-21 16:22     ` Frédéric Pétrot
  0 siblings, 1 reply; 49+ messages in thread
From: Richard Henderson @ 2021-10-20 14:08 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

On 10/19/21 2:47 AM, Frédéric Pétrot wrote:
> Given that the 128-bit version of the riscv spec adds new instructions, and
> that some instructions that were previously only available in 64-bit mode
> are now available for both 64-bit and 128-bit, we added new macros to check
> for the processor mode during translation.
> 
> Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
> Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
> ---
>   target/riscv/translate.c | 18 ++++++++++++++++++
>   1 file changed, 18 insertions(+)
> 
> diff --git a/target/riscv/translate.c b/target/riscv/translate.c
> index 35245aafa7..121fcd71fe 100644
> --- a/target/riscv/translate.c
> +++ b/target/riscv/translate.c
> @@ -350,6 +350,24 @@ EX_SH(12)
>       }                              \
>   } while (0)
>   
> +#define REQUIRE_128BIT(ctx) do {   \
> +    if (get_xl(ctx) < MXL_RV128) { \
> +        return false;              \
> +    }                              \
> +} while (0)
> +
> +#define REQUIRE_32_OR_64BIT(ctx) do { \
> +    if (get_xl(ctx) == MXL_RV128) {   \
> +        return false;                 \
> +    }                                 \
> +} while (0)
> +
> +#define REQUIRE_64_OR_128BIT(ctx) do { \
> +    if (get_xl(ctx) == MXL_RV32) {     \
> +        return false;                  \
> +    }                                  \
> +} while (0)

So... you've left REQUIRE_64BIT accepting RV128, so that means that your current 
REQUIRE_64_OR_128BIT is redundant.  Is that intentional?

It does seem like all places that accept RV128 should accept RV64, but perhaps that's just 
your "limited" caveat in the cover letter.

You don't use REQUIRE_32_OR_64BIT at all.  Remove it?
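
A small stand-alone model of the redundancy pointed out above. The MXL
encodings are the misa.MXL values; the predicate functions are hypothetical
rewrites of the macros as plain functions, and require_64bit assumes the
existing REQUIRE_64BIT rejects only RV32, as the review describes.

```c
#include <stdbool.h>

/* misa.MXL encodings. */
typedef enum {
    MXL_RV32  = 1,
    MXL_RV64  = 2,
    MXL_RV128 = 3,
} RISCVMXL;

/* Assumed behaviour of the existing REQUIRE_64BIT: reject only RV32. */
static bool require_64bit(RISCVMXL xl)
{
    return !(xl < MXL_RV64);
}

/* The new REQUIRE_64_OR_128BIT from this patch. */
static bool require_64_or_128bit(RISCVMXL xl)
{
    return !(xl == MXL_RV32);
}

/* The new REQUIRE_128BIT from this patch. */
static bool require_128bit(RISCVMXL xl)
{
    return !(xl < MXL_RV128);
}
```

Under that assumption the first two predicates agree on every defined MXL
value, so only one of them is needed.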


r~


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 05/21] target/riscv: separation of bitwise logic and aritmetic helpers
  2021-10-19  9:47 ` [PATCH v3 05/21] target/riscv: separation of bitwise logic and aritmetic helpers Frédéric Pétrot
@ 2021-10-20 14:14   ` Richard Henderson
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Henderson @ 2021-10-20 14:14 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

On 10/19/21 2:47 AM, Frédéric Pétrot wrote:
> Introduction of a gen_logic function for bitwise logic to implement
> instructions in which no propagation of information occurs between bits and
> use of this function on the bitwise instructions.
> 
> Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
> Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
> ---
>   target/riscv/translate.c                | 27 +++++++++++++++++++++++++
>   target/riscv/insn_trans/trans_rvb.c.inc |  6 +++---
>   target/riscv/insn_trans/trans_rvi.c.inc | 12 +++++------
>   3 files changed, 36 insertions(+), 9 deletions(-)
> 
> diff --git a/target/riscv/translate.c b/target/riscv/translate.c
> index 121fcd71fe..3c2e9fb790 100644
> --- a/target/riscv/translate.c
> +++ b/target/riscv/translate.c
> @@ -382,6 +382,33 @@ static int ex_rvc_shifti(DisasContext *ctx, int imm)
>   /* Include the auto-generated decoder for 32 bit insn */
>   #include "decode-insn32.c.inc"
>   
> +static bool gen_logic_imm_fn(DisasContext *ctx, arg_i *a, DisasExtend ext,
> +                             void (*func)(TCGv, TCGv, target_long))
> +{
> +    TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv src1 = get_gpr(ctx, a->rs1, ext);
> +
> +    func(dest, src1, a->imm);
> +
> +    gen_set_gpr(ctx, a->rd, dest);
> +
> +    return true;
> +}
> +
> +static bool gen_logic(DisasContext *ctx, arg_r *a, DisasExtend ext,
> +                      void (*func)(TCGv, TCGv, TCGv))
> +{

I think you should drop the ext argument, which is (by nature of the operations) 
universally EXT_NONE for all callers.  Otherwise,

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
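
One way to read "by nature of the operations": for and/or/xor, each result
bit depends only on the same bit of the inputs, so the low bits come out the
same however the upper bits were extended. The sketch below checks that for
32-bit values held in 64-bit registers; all function names are made up for
the illustration.

```c
#include <stdint.h>

/* Sign- and zero-extend a 32-bit value into a 64-bit register. */
static uint64_t sext32(uint32_t x)
{
    return (uint64_t)(int64_t)(int32_t)x;
}

static uint64_t zext32(uint32_t x)
{
    return (uint64_t)x;
}

/* Returns 1 if the low 32 bits of and/or/xor match under both extensions. */
static int logic_ignores_extension(uint32_t a, uint32_t b)
{
    uint64_t sa = sext32(a), sb = sext32(b);
    uint64_t za = zext32(a), zb = zext32(b);

    return (uint32_t)(sa & sb) == (uint32_t)(za & zb)
        && (uint32_t)(sa | sb) == (uint32_t)(za | zb)
        && (uint32_t)(sa ^ sb) == (uint32_t)(za ^ zb);
}

/* Contrast: a right shift pulls upper bits down, so extension matters. */
static int shift_depends_on_extension(uint32_t a)
{
    return (uint32_t)(sext32(a) >> 1) != (uint32_t)(zext32(a) >> 1);
}
```

This is why a helper dedicated to bitwise logic can always fetch its sources
with EXT_NONE, whereas shifts and comparisons cannot.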

r~


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 06/21] target/riscv: array for the 64 upper bits of 128-bit registers
  2021-10-19  9:47 ` [PATCH v3 06/21] target/riscv: array for the 64 upper bits of 128-bit registers Frédéric Pétrot
@ 2021-10-20 14:44   ` Richard Henderson
  2021-10-22  6:06     ` Frédéric Pétrot
  0 siblings, 1 reply; 49+ messages in thread
From: Richard Henderson @ 2021-10-20 14:44 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

On 10/19/21 2:47 AM, Frédéric Pétrot wrote:
> The upper 64 bits of the 128-bit registers now have a place inside
> the cpu state structure, and are created as globals for future use.
> 
> Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
> Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
> ---
>   target/riscv/cpu.h       | 1 +
>   target/riscv/translate.c | 5 ++++-
>   2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index c24bc9a039..c8b98f1b70 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -109,6 +109,7 @@ FIELD(VTYPE, VILL, sizeof(target_ulong) * 8 - 1, 1)
>   
>   struct CPURISCVState {
>       target_ulong gpr[32];
> +    target_ulong gprh[32]; /* 64 top bits of the 128-bit registers */

At first I was going to suggest that the 128-bit value be represented in host byte order, 
but then I thought that would just get in the way until any such host operations are 
apparent.

You are missing an update to machine.c for migration (and probably more importantly, 
loadvm/savevm for debugging).  I think you'll want to put these into a separate 
subsection, controlled by misa_mxl_max == RV128.

>       for (i = 1; i < 32; i++) {
>           cpu_gpr[i] = tcg_global_mem_new(cpu_env,
>               offsetof(CPURISCVState, gpr[i]), riscv_int_regnames[i]);
> +        cpu_gprh[i] = tcg_global_mem_new(cpu_env,
> +            offsetof(CPURISCVState, gprh[i]), riscv_int_regnames[i]);

This will just be confusing in the tcg dumps -- let's not name the two temps identically.

Honestly, I'm not 100% thrilled about the / that appears in the current name; I think it 
would be easiest to do

   g_string_printf("x%d", i)
and
   g_string_printf("x%dh", i)
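
A minimal sketch of that naming scheme, using plain snprintf into static
buffers so the example stands alone (the snippet above is shorthand; GLib's
g_string_printf takes a GString as its first argument). The array names and
sizes are assumptions for the illustration.

```c
#include <stdio.h>

#define NUM_GPRS 32
#define NAME_LEN 8   /* enough for "x31h" plus the terminator */

/* Names must outlive the TCG globals that point at them. */
static char gpr_names[NUM_GPRS][NAME_LEN];
static char gprh_names[NUM_GPRS][NAME_LEN];

/* Build "xN" for the low half and "xNh" for the high half of each GPR. */
static void init_gpr_names(void)
{
    for (int i = 0; i < NUM_GPRS; i++) {
        snprintf(gpr_names[i], NAME_LEN, "x%d", i);
        snprintf(gprh_names[i], NAME_LEN, "x%dh", i);
    }
}
```

Each cpu_gpr[i] and cpu_gprh[i] global would then be created with its own
distinct name, which keeps the two halves apart in the tcg dumps.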


r~


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 07/21] target/riscv: setup everything so that riscv128-softmmu compiles
  2021-10-19  9:47 ` [PATCH v3 07/21] target/riscv: setup everything so that riscv128-softmmu compiles Frédéric Pétrot
@ 2021-10-20 14:57   ` Richard Henderson
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Henderson @ 2021-10-20 14:57 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

On 10/19/21 2:47 AM, Frédéric Pétrot wrote:
> This patch is kind of a mess because several files have to be slightly
> modified to allow for a new target. Most of these modifications have to deal
> with changing what was a binary choice into a ternary one.  Although we did
> our best to avoid testing for TARGET_RISCV128 (which we did), it is
> implicitly there in '#else' statements.
> Most added infrastructure files are not far from being copies of the 64-bit
> version.
> Once this patch is applied, adding riscv128-softmmu to --target-list produces
> a (not so useful yet) executable.
> 
> Signed-off-by: Frédéric Pétrot<frederic.petrot@univ-grenoble-alpes.fr>
> Co-authored-by: Fabien Portas<fabien.portas@grenoble-inp.org>
> ---
>   configs/devices/riscv128-softmmu/default.mak | 17 +++++++
>   configs/targets/riscv128-softmmu.mak         |  6 +++
>   include/disas/dis-asm.h                      |  1 +
>   include/hw/riscv/sifive_cpu.h                |  3 ++
>   target/riscv/cpu-param.h                     |  5 ++
>   target/riscv/cpu.h                           |  3 ++
>   disas/riscv.c                                |  5 ++
>   target/riscv/cpu.c                           | 23 +++++++++-
>   target/riscv/gdbstub.c                       |  3 ++
>   target/riscv/insn_trans/trans_rvd.c.inc      | 12 ++---
>   target/riscv/insn_trans/trans_rvf.c.inc      |  6 +--
>   gdb-xml/riscv-128bit-cpu.xml                 | 48 ++++++++++++++++++++
>   gdb-xml/riscv-128bit-virtual.xml             | 12 +++++
>   target/riscv/Kconfig                         |  3 ++
>   14 files changed, 137 insertions(+), 10 deletions(-)
>   create mode 100644 configs/devices/riscv128-softmmu/default.mak
>   create mode 100644 configs/targets/riscv128-softmmu.mak
>   create mode 100644 gdb-xml/riscv-128bit-cpu.xml
>   create mode 100644 gdb-xml/riscv-128bit-virtual.xml

So... do we really want to go down this route, with a separate binary?  It seems like we 
could reasonably support rv128 in the qemu-system-riscv64 binary with -cpu rv128.

> +++ b/gdb-xml/riscv-128bit-cpu.xml
> @@ -0,0 +1,48 @@
> +<?xml version="1.0"?>
> +<!-- Copyright (C) 2018-2019 Free Software Foundation, Inc.
> +
> +     Copying and distribution of this file, with or without modification,
> +     are permitted in any medium without royalty provided the copyright
> +     notice and this notice are preserved.  -->
> +
> +<!-- Register numbers are hard-coded in order to maintain backward
> +     compatibility with older versions of tools that didn't use xml
> +     register descriptions.  -->
> +
> +<!DOCTYPE feature SYSTEM "gdb-target.dtd">
> +<!-- FIXME : All GPRs are marked as 64-bits since gdb doesn't like 128-bit registers for now. -->

If the widths are not correct, we can just as easily skip it for now.


r~


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 01/21] memory: change define name for consistency
  2021-10-19  9:47 ` [PATCH v3 01/21] memory: change define name for consistency Frédéric Pétrot
@ 2021-10-20 15:07   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 49+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-10-20 15:07 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: richard.henderson, bin.meng, alistair.francis, palmer, fabien.portas

On 10/19/21 11:47, Frédéric Pétrot wrote:
> Changed MO_Q into MO_UQ so as to avoid confusion, as suggested by
> Philippe Mathieu-Daudé.
> 
> Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
> ---
>  include/exec/memop.h                       |  8 ++++----
>  target/arm/translate-a32.h                 |  4 ++--
>  target/arm/translate-a64.c                 |  8 ++++----
>  target/arm/translate-neon.c                |  6 +++---
>  target/arm/translate-sve.c                 |  2 +-
>  target/arm/translate-vfp.c                 |  8 ++++----
>  target/arm/translate.c                     |  2 +-
>  target/ppc/translate.c                     | 24 +++++++++++-----------
>  target/sparc/translate.c                   |  4 ++--
>  target/ppc/translate/fixedpoint-impl.c.inc | 20 +++++++++---------
>  target/ppc/translate/fp-impl.c.inc         |  4 ++--
>  target/ppc/translate/vsx-impl.c.inc        |  4 ++--
>  tcg/aarch64/tcg-target.c.inc               |  2 +-
>  tcg/arm/tcg-target.c.inc                   | 10 ++++-----
>  tcg/i386/tcg-target.c.inc                  |  4 ++--
>  tcg/mips/tcg-target.c.inc                  |  4 ++--
>  tcg/ppc/tcg-target.c.inc                   |  8 ++++----
>  tcg/riscv/tcg-target.c.inc                 |  6 +++---
>  tcg/s390x/tcg-target.c.inc                 | 10 ++++-----
>  19 files changed, 69 insertions(+), 69 deletions(-)
> 
> diff --git a/include/exec/memop.h b/include/exec/memop.h
> index 04264ffd6b..c554bb0ee8 100644
> --- a/include/exec/memop.h
> +++ b/include/exec/memop.h
> @@ -88,26 +88,26 @@ typedef enum MemOp {
>      MO_SB    = MO_SIGN | MO_8,
>      MO_SW    = MO_SIGN | MO_16,
>      MO_SL    = MO_SIGN | MO_32,
> -    MO_Q     = MO_64,
> +    MO_UQ     = MO_64,
> diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
> index 633b8a37ba..e31f454695 100644
> --- a/tcg/arm/tcg-target.c.inc
> +++ b/tcg/arm/tcg-target.c.inc
> @@ -1443,13 +1443,13 @@ static void * const qemu_ld_helpers[MO_SSIZE + 1] = {
>  #ifdef HOST_WORDS_BIGENDIAN
>      [MO_UW] = helper_be_lduw_mmu,
>      [MO_UL] = helper_be_ldul_mmu,
> -    [MO_Q]  = helper_be_ldq_mmu,
> +    [MO_UQ]  = helper_be_ldq_mmu,
>      [MO_SW] = helper_be_ldsw_mmu,
>      [MO_SL] = helper_be_ldul_mmu,
>  #else
>      [MO_UW] = helper_le_lduw_mmu,
>      [MO_UL] = helper_le_ldul_mmu,
> -    [MO_Q]  = helper_le_ldq_mmu,
> +    [MO_UQ]  = helper_le_ldq_mmu,
>      [MO_SW] = helper_le_ldsw_mmu,
>      [MO_SL] = helper_le_ldul_mmu,
>  #endif

> --- a/tcg/ppc/tcg-target.c.inc
> +++ b/tcg/ppc/tcg-target.c.inc
> @@ -1935,24 +1935,24 @@ static const uint32_t qemu_ldx_opc[(MO_SSIZE + MO_BSWAP) + 1] = {
>      [MO_UB] = LBZX,
>      [MO_UW] = LHZX,
>      [MO_UL] = LWZX,
> -    [MO_Q]  = LDX,
> +    [MO_UQ]  = LDX,
>      [MO_SW] = LHAX,
>      [MO_SL] = LWAX,
>      [MO_BSWAP | MO_UB] = LBZX,
>      [MO_BSWAP | MO_UW] = LHBRX,
>      [MO_BSWAP | MO_UL] = LWBRX,
> -    [MO_BSWAP | MO_Q]  = LDBRX,
> +    [MO_BSWAP | MO_UQ]  = LDBRX,
>  };
>  
>  static const uint32_t qemu_stx_opc[(MO_SIZE + MO_BSWAP) + 1] = {
>      [MO_UB] = STBX,
>      [MO_UW] = STHX,
>      [MO_UL] = STWX,
> -    [MO_Q]  = STDX,
> +    [MO_UQ]  = STDX,
>      [MO_BSWAP | MO_UB] = STBX,
>      [MO_BSWAP | MO_UW] = STHBRX,
>      [MO_BSWAP | MO_UL] = STWBRX,
> -    [MO_BSWAP | MO_Q]  = STDBRX,
> +    [MO_BSWAP | MO_UQ]  = STDBRX,
>  };

> diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
> index 9b13a46fb4..b621694321 100644
> --- a/tcg/riscv/tcg-target.c.inc
> +++ b/tcg/riscv/tcg-target.c.inc
> @@ -862,7 +862,7 @@ static void * const qemu_ld_helpers[MO_SSIZE + 1] = {
>  #if TCG_TARGET_REG_BITS == 64
>      [MO_SL] = helper_be_ldsl_mmu,
>  #endif
> -    [MO_Q]  = helper_be_ldq_mmu,
> +    [MO_UQ]  = helper_be_ldq_mmu,
>  #else
>      [MO_UW] = helper_le_lduw_mmu,
>      [MO_SW] = helper_le_ldsw_mmu,
> @@ -870,7 +870,7 @@ static void * const qemu_ld_helpers[MO_SSIZE + 1] = {
>  #if TCG_TARGET_REG_BITS == 64
>      [MO_SL] = helper_le_ldsl_mmu,
>  #endif
> -    [MO_Q]  = helper_le_ldq_mmu,
> +    [MO_UQ]  = helper_le_ldq_mmu,
>  #endif
>  };

Some '=' are now mis-indented.

Otherwise:
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

Also this subject would be more appropriate:
"exec/memop: Rename MO_Q definition as MO_UQ"

Regards,

Phil.


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 08/21] target/riscv: adding accessors to the registers upper part
  2021-10-19  9:47 ` [PATCH v3 08/21] target/riscv: adding accessors to the registers upper part Frédéric Pétrot
@ 2021-10-20 15:09   ` Richard Henderson
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Henderson @ 2021-10-20 15:09 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

On 10/19/21 2:47 AM, Frédéric Pétrot wrote:
> +static TCGv get_gprh(DisasContext *ctx, int reg_num)
> +{
> +    if (reg_num == 0 || get_ol(ctx) < MXL_RV128) {
> +        return ctx->zero;
> +    }

So... why return anything for OL < 128?
Seems like that should be a bug.

> +static void gen_set_gprh(DisasContext *ctx, int reg_num, TCGv t)
> +{
> +    if (reg_num != 0) {
> +        if (get_ol(ctx) < MXL_RV128) {
> +            tcg_gen_sari_tl(cpu_gprh[reg_num], cpu_gpr[reg_num], 63);
> +        } else {
> +            tcg_gen_mov_tl(cpu_gprh[reg_num], t);
> +        }
> +    }

Hmm... this implies that you must set the low part first, which could be easy to misuse.
It is probably better to create a combined gen_set_gpr128 that takes both halves at once.
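
To illustrate the idea, here is a minimal sketch in plain C rather than TCG. The RegFile type and the set_gpr128/set_gpr64 names are hypothetical, and the narrow setter is assumed to sign-extend the low half into the high half, so callers never have to order two separate writes:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the register file: low and high 64-bit halves. */
typedef struct {
    uint64_t gpr[32];
    uint64_t gprh[32];
} RegFile;

/* Write both halves in one call; x0 stays hardwired to zero. */
static void set_gpr128(RegFile *rf, int reg, uint64_t lo, uint64_t hi)
{
    assert(reg >= 0 && reg < 32);
    if (reg != 0) {
        rf->gpr[reg] = lo;
        rf->gprh[reg] = hi;
    }
}

/* Narrow write: the high half becomes the sign-extension of the low half. */
static void set_gpr64(RegFile *rf, int reg, uint64_t lo)
{
    uint64_t sign = 0 - (lo >> 63);   /* all ones if bit 63 is set, else 0 */
    set_gpr128(rf, reg, lo, sign);
}
```

With this shape, a 32-bit or 64-bit operation calls only set_gpr64, and a 128-bit operation calls only set_gpr128, so there is no ordering constraint between the two halves.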

> +    /* devilish temporary code so that the patch compiles */
> +    if (get_xl_max(ctx) == MXL_RV128) {
> +        (void)get_gprh(ctx, 6);
> +        (void)dest_gprh(ctx, 6);
> +        gen_set_gprh(ctx, 6, NULL);
> +    }

I don't think it would be confusing to squash this patch into the next one, which adds the 
actual uses.


r~


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 09/21] target/riscv: moving some insns close to similar insns
  2021-10-19  9:48 ` [PATCH v3 09/21] target/riscv: moving some insns close to similar insns Frédéric Pétrot
@ 2021-10-20 15:11   ` Richard Henderson
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Henderson @ 2021-10-20 15:11 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

On 10/19/21 2:48 AM, Frédéric Pétrot wrote:
> lwu and ld are functionally close to the other loads, but were after the
> stores in the source file.
> Similarly, xor was away from or and and by two arithmetic functions, while
> the immediate versions were nicely put together.
> This patch moves the aforementioned loads after lhu, and xor above or,
> where they more logically belong.
> 
> Signed-off-by: Frédéric Pétrot<frederic.petrot@univ-grenoble-alpes.fr>
> Co-authored-by: Fabien Portas<fabien.portas@grenoble-inp.org>
> ---
>   target/riscv/insn_trans/trans_rvi.c.inc | 34 ++++++++++++-------------
>   1 file changed, 17 insertions(+), 17 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 10/21] target/riscv: support for 128-bit loads and store
  2021-10-19  9:48 ` [PATCH v3 10/21] target/riscv: support for 128-bit loads and store Frédéric Pétrot
@ 2021-10-20 17:31   ` Richard Henderson
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Henderson @ 2021-10-20 17:31 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

On 10/19/21 2:48 AM, Frédéric Pétrot wrote:
> +# Added for 128 bit support
> +%uimm_cl_q    5:2 10:3               !function=ex_shift_3
> +%uimm_6bit_lq 2:3 12:1 5:2           !function=ex_shift_3
> +%uimm_6bit_sq 7:3 10:3               !function=ex_shift_3
>   

These are incorrect.  LQ and LQSP are scaled by shift 4, not 3.  And the immediate bits 
are differently swizzled from LD and LW.
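
For reference, a sketch of the C.LQ offset decode in plain C, based on my reading of the RVC encoding table (offset[5:4|8] in bits 12:10, offset[7:6] in bits 6:5). The function names are made up, and the bit layout should be double-checked against the spec before turning it into decodetree fields:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed C.LQ layout (to be verified against the RVC tables):
 *   insn[12:11] -> uimm[5:4]
 *   insn[10]    -> uimm[8]
 *   insn[6:5]   -> uimm[7:6]
 * The low four offset bits are always zero, i.e. the offset is scaled by 16.
 */
static unsigned c_lq_uimm(uint16_t insn)
{
    unsigned u5_4 = (insn >> 11) & 0x3;
    unsigned u8   = (insn >> 10) & 0x1;
    unsigned u7_6 = (insn >> 5) & 0x3;

    return (u5_4 << 4) | (u7_6 << 6) | (u8 << 8);
}

/* Usage example: every decoded offset is a multiple of 16. */
static void c_lq_uimm_check(void)
{
    assert(c_lq_uimm(0x0800) == 16);    /* only insn[11] set */
    assert(c_lq_uimm(0x1c60) % 16 == 0);
}
```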


> -fld               001  ... ... .. ... 00 @cl_d
> +{
> +  fld             001  ... ... .. ... 00 @cl_d
> +  # *** RV128C specific Standard Extension (Quadrant 0) ***
> +  lq              001  ... ... .. ... 00 @cl_q
> +}

You need to move lq first, so that it overrides fld when RV128 is enabled.  Otherwise you 
have to invent some c_fld_not_rv32 pattern with the proper XLEN predicate inside.

Likewise for all of the other groups.

> +/*
> + * TODO: we should assert that src1h == 0, as we do not change the
> + *       address translation mechanism
> + */
> +static bool gen_load_i128(DisasContext *ctx, arg_lb *a, MemOp memop)
> +{
> +    TCGv src1l = get_gpr(ctx, a->rs1, EXT_NONE);
> +    TCGv src1h = get_gprh(ctx, a->rs1);
> +    TCGv destl = dest_gpr(ctx, a->rd);
> +    TCGv desth = dest_gprh(ctx, a->rd);
> +    TCGv addrl = tcg_temp_new();
> +    TCGv addrh = tcg_temp_new();
> +    TCGv imml = tcg_temp_new();
> +    TCGv immh = tcg_constant_tl(-(a->imm < 0));
> +
> +    /* Build a 128-bit address */
> +    if (a->imm != 0) {
> +        tcg_gen_movi_tl(imml, a->imm);
> +        tcg_gen_add2_tl(addrl, addrh, src1l, src1h, imml, immh);
> +    } else {
> +        tcg_gen_mov_tl(addrl, src1l);
> +        tcg_gen_mov_tl(addrh, src1h);
> +    }

Hmm.. I thought I remembered some clause by which the top N bits of the address could be 
ignored, but I can't find it now.

In any case, even if it should be done eventually, I don't think it's worthwhile to 
compute addrh at all right now.

> +    if (memop != (MemOp)MO_TEO) {

Why the cast?  MO_TEO is a MemOp enumerator.

> +        tcg_gen_qemu_ld_tl(memop & MO_BSWAP ? desth : destl, addrl,
> +                           ctx->mem_idx, MO_TEQ);
> +        gen_addi2_i128(addrl, addrh, addrl, addrh, 8);
> +        tcg_gen_qemu_ld_tl(memop & MO_BSWAP ? destl : desth, addrl,
> +                           ctx->mem_idx, MO_TEQ);

In addition... we need an atomic load here for aligned 128-bit addresses (unaligned 
addresses are allowed to be non-atomic).

We don't currently have such an operation in TCG, though we need one (the Power8 LQ 
instruction is also only atomic when aligned).

We should either add this right away (shouldn't be too hard), or change the default to 
thread=single for -cpu rv128.  We should disable thread=multi if !HAVE_ATOMIC128, because 
we will be constantly trapping with EXCP_ATOMIC.

Similarly for store, of course.


r~


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 11/21] target/riscv: support for 128-bit bitwise instructions
  2021-10-19  9:48 ` [PATCH v3 11/21] target/riscv: support for 128-bit bitwise instructions Frédéric Pétrot
@ 2021-10-20 17:47   ` Richard Henderson
  2021-10-20 19:18     ` Frédéric Pétrot
  0 siblings, 1 reply; 49+ messages in thread
From: Richard Henderson @ 2021-10-20 17:47 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

On 10/19/21 2:48 AM, Frédéric Pétrot wrote:
> The 128-bit bitwise instructions do not need any function prototype change
> as the functions can be applied independently on the lower and upper part of
> the registers.
> 
> Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
> Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
> ---
>   target/riscv/translate.c | 22 ++++++++++++++++++++++
>   1 file changed, 22 insertions(+)
> 
> diff --git a/target/riscv/translate.c b/target/riscv/translate.c
> index e8f08f921e..71982f6284 100644
> --- a/target/riscv/translate.c
> +++ b/target/riscv/translate.c
> @@ -429,6 +429,17 @@ static bool gen_logic_imm_fn(DisasContext *ctx, arg_i *a, DisasExtend ext,
>   
>       gen_set_gpr(ctx, a->rd, dest);
>   
> +    if (get_xl_max(ctx) == MXL_RV128) {
> +        if (get_ol(ctx) ==  MXL_RV128) {
> +            uint64_t immh = -(a->imm < 0);
> +            src1 = get_gprh(ctx, a->rs1);
> +            dest = dest_gprh(ctx, a->rd);
> +
> +            func(dest, src1, immh);
> +        }
> +        gen_set_gprh(ctx, a->rd, dest);
> +    }

If ol < RV128, you're storing the low dest into the gprh, which is wrong.  It should be 
the sign-extension of the low part.  But that should happen for all writes.

Earlier, I suggested gen_set_gpr128 instead of gen_set_gprh.
I think this should be written

     if (get_xl(ctx) == MXL_RV128) {
         TCGv src1h = get_gprh(ctx, a->rs1);
         TCGv desth = dest_gprh(ctx, a->rd);

         func(desth, src1h, -(a->imm < 0));
         gen_set_gpr128(ctx, a->rd, dest, desth);
     } else {
         gen_set_gpr(ctx, a->rd, dest);
     }

Here gen_set_gpr handles the sign-extension to 128 bits.
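
As a rough model of why the bitwise ops need no new prototype, the same 64-bit function can be applied to each half, with the immediate's sign-extension standing in for its upper 64 bits. This is plain C, not TCG; U128, logic_fn and logic_imm128 are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t lo;
    uint64_t hi;
} U128;

typedef uint64_t (*logic_fn)(uint64_t, uint64_t);

static uint64_t op_and(uint64_t a, uint64_t b) { return a & b; }
static uint64_t op_or(uint64_t a, uint64_t b)  { return a | b; }
static uint64_t op_xor(uint64_t a, uint64_t b) { return a ^ b; }

/* Apply a 64-bit bitwise op to both halves of a 128-bit value. */
static U128 logic_imm128(logic_fn func, U128 src, int64_t imm)
{
    U128 r;

    assert(func != NULL);
    r.lo = func(src.lo, (uint64_t)imm);
    /* Upper half of the sign-extended immediate: all ones or all zeroes. */
    r.hi = func(src.hi, imm < 0 ? UINT64_MAX : 0);
    return r;
}
```

For example, an andi with a positive immediate clears the whole high half, while an xori with -1 inverts all 128 bits.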


> @@ -443,6 +454,17 @@ static bool gen_logic(DisasContext *ctx, arg_r *a, DisasExtend ext,
>   
>       gen_set_gpr(ctx, a->rd, dest);
>   
> +    if (get_xl_max(ctx) == MXL_RV128) {
> +        if (get_ol(ctx) ==  MXL_RV128) {
> +            dest = dest_gprh(ctx, a->rd);
> +            src1 = get_gprh(ctx, a->rs1);
> +            src2 = get_gprh(ctx, a->rs2);
> +
> +            func(dest, src1, src2);
> +        }
> +        gen_set_gprh(ctx, a->rd, dest);
> +    }

Similarly.


r~


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 12/21] target/riscv: support for 128-bit U-type instructions
  2021-10-19  9:48 ` [PATCH v3 12/21] target/riscv: support for 128-bit U-type instructions Frédéric Pétrot
@ 2021-10-20 17:59   ` Richard Henderson
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Henderson @ 2021-10-20 17:59 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

On 10/19/21 2:48 AM, Frédéric Pétrot wrote:
> Adding the 128-bit version of lui and auipc.
> 
> Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
> Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
> ---
>   target/riscv/insn_trans/trans_rvi.c.inc | 19 +++++++++++++++++--
>   1 file changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/target/riscv/insn_trans/trans_rvi.c.inc b/target/riscv/insn_trans/trans_rvi.c.inc
> index 92f41f3a86..b5e292a2aa 100644
> --- a/target/riscv/insn_trans/trans_rvi.c.inc
> +++ b/target/riscv/insn_trans/trans_rvi.c.inc
> @@ -26,14 +26,17 @@ static bool trans_illegal(DisasContext *ctx, arg_empty *a)
>   
>   static bool trans_c64_illegal(DisasContext *ctx, arg_empty *a)
>   {
> -     REQUIRE_64BIT(ctx);
> -     return trans_illegal(ctx, a);
> +    REQUIRE_64_OR_128BIT(ctx);
> +    return trans_illegal(ctx, a);
>   }
>   
>   static bool trans_lui(DisasContext *ctx, arg_lui *a)
>   {
>       if (a->rd != 0) {
>           tcg_gen_movi_tl(cpu_gpr[a->rd], a->imm);
> +        if (get_xl_max(ctx) == MXL_RV128) {
> +            tcg_gen_movi_tl(cpu_gprh[a->rd], -(a->imm < 0));
> +        }
>       }
>       return true;
>   }
> @@ -41,7 +44,19 @@ static bool trans_lui(DisasContext *ctx, arg_lui *a)
>   static bool trans_auipc(DisasContext *ctx, arg_auipc *a)
>   {
>       if (a->rd != 0) {
> +        if (get_xl_max(ctx) == MXL_RV128) {
> +            /* TODO : when pc is 128 bits, use all its bits */
> +            TCGv pc = tcg_constant_tl(ctx->base.pc_next),
> +                 imml = tcg_constant_tl(a->imm),
> +                 immh = tcg_constant_tl(-(a->imm < 0)),
> +                 zero = tcg_constant_tl(0);
> +            tcg_gen_add2_tl(cpu_gpr[a->rd], cpu_gprh[a->rd],
> +                            pc, zero,
> +                            imml, immh);

A runtime computation of constant + constant is pointless.

I think you should refactor these into a gen_set_gpri, and hide the sign-extension into 
gprh there.
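
Since pc and imm are both known at translation time, the sum can be folded in C before any TCG op is emitted. A rough sketch that folds the same add2 the patch emits, assuming the upper half of pc is zero as in the patch; U128 and fold_auipc128 are hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t lo;
    uint64_t hi;
} U128;

/* Compute (0:pc) + sign_extend_128(imm) without any runtime op. */
static U128 fold_auipc128(uint64_t pc, int64_t imm)
{
    U128 r;
    uint64_t immh = imm < 0 ? UINT64_MAX : 0;   /* sign-extension of imm */

    r.lo = pc + (uint64_t)imm;
    r.hi = immh + (r.lo < pc);                  /* add carry out of the low half */
    return r;
}

/* Usage example: a negative immediate larger than pc gives a negative result. */
static void fold_auipc128_check(void)
{
    U128 r = fold_auipc128(0x1000, -0x2000);

    assert(r.lo == 0xFFFFFFFFFFFFF000ULL);
    assert(r.hi == UINT64_MAX);
}
```

The two resulting constants could then be written with plain movi operations, or passed to whatever immediate setter ends up hiding the high half.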


r~


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 13/21] target/riscv: support for 128-bit shift instructions
  2021-10-19  9:48 ` [PATCH v3 13/21] target/riscv: support for 128-bit shift instructions Frédéric Pétrot
@ 2021-10-20 19:06   ` Richard Henderson
  2021-10-24 22:49     ` Frédéric Pétrot
  0 siblings, 1 reply; 49+ messages in thread
From: Richard Henderson @ 2021-10-20 19:06 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

On 10/19/21 2:48 AM, Frédéric Pétrot wrote:
> +    } else {
> +        TCGv src1l = get_gpr(ctx, a->rs1, ext),
> +             src1h = get_gprh(ctx, a->rs1),
> +             destl = tcg_temp_new(),
> +             desth = tcg_temp_new();

Don't declare several variables in one statement like this, with commas, a shared type
and continuation indent.  I know there are several instances.

> +        if (max_len < 128) {
> +            func(destl, src1l, a->shamt);
> +            gen_set_gpr(ctx, a->rd, destl);
> +            gen_set_gprh(ctx, a->rd, desth);

You haven't initialized desth.  Again, this is where gen_set_gpr and gen_set_gpr128 would
be clearer than this.

>       int olen = get_olen(ctx);
>       if (olen != TARGET_LONG_BITS) {
>           if (olen == 32) {
>               f_tl = f_32;
> -        } else {
> +        } else if (olen != 128) {
>               g_assert_not_reached();
>           }
>       }
> -    return gen_shift_imm_fn(ctx, a, ext, f_tl);
> +    return gen_shift_imm_fn(ctx, a, ext, f_tl, f_128);

Surely it would be cleaner to split out f_128 at this point, and not pass along f_128 to 
gen_shift_imm_fn?

>   static bool gen_shift(DisasContext *ctx, arg_r *a, DisasExtend ext,
> -                      void (*func)(TCGv, TCGv, TCGv))
> +                      void (*func)(TCGv, TCGv, TCGv),
> +                      void (*f128)(TCGv, TCGv, TCGv, TCGv, TCGv))
>   {
> -    TCGv dest = dest_gpr(ctx, a->rd);
> -    TCGv src1 = get_gpr(ctx, a->rs1, ext);
>       TCGv src2 = get_gpr(ctx, a->rs2, EXT_NONE);
>       TCGv ext2 = tcg_temp_new();
>   
>       tcg_gen_andi_tl(ext2, src2, get_olen(ctx) - 1);
> -    func(dest, src1, ext2);
>   
> -    gen_set_gpr(ctx, a->rd, dest);
> +    if (get_xl_max(ctx) < MXL_RV128) {
> +        TCGv dest = dest_gpr(ctx, a->rd);
> +        TCGv src1 = get_gpr(ctx, a->rs1, ext);
> +        func(dest, src1, ext2);
> +
> +        gen_set_gpr(ctx, a->rd, dest);
> +    } else {
> +        TCGv src1l = get_gpr(ctx, a->rs1, ext),
> +             src1h = get_gprh(ctx, a->rs1),
> +             destl = tcg_temp_new(),
> +             desth = tcg_temp_new();

Should be dest_gpr*.

> +
> +        if (get_olen(ctx) < 128) {
> +            func(destl, src1l, ext2);
> +            gen_set_gpr(ctx, a->rd, destl);
> +            gen_set_gprh(ctx, a->rd, desth);
> +        } else {
> +            assert(f128 != NULL);

I think you don't want to assert, but just return false.  This will make all of the Zb 
instructions come out undefined for rv128, which is probably what you want.  You'd want to 
do that earlier, before all the get_gpr* above.

> @@ -447,9 +486,75 @@ static bool trans_sub(DisasContext *ctx, arg_sub *a)
>       return gen_arith(ctx, a, EXT_NONE, tcg_gen_sub_tl);
>   }
>   
> +enum M128_DIR {
> +    M128_LEFT,
> +    M128_RIGHT,
> +    M128_RIGHT_ARITH
> +};

Why "M"?

> +         cnst_zero = tcg_constant_tl(0);

This is ctx->zero.

There are lots of instances throughout your patch set,
though this is the first time I noticed.

> +    tcg_gen_setcondi_tl(TCG_COND_GEU, tmp1, arg2, 64);

You should fold this test into the movcond.

> +        tcg_gen_movi_tl(tmp, 64);
> +        tcg_gen_sub_tl(tmp, tmp, shamt);

tcg_gen_subfi_tl.

The indentation is off in gen_sll_i128.

Hmm.  3 * (and + shift + cmp + cmov) + 2 * (sub + or) = 16 ops.
Not horrible...

Let's see.

     ls = sh & 63;        1
     rs = -sh & 63;       3
     hs = sh & 64;        4

     ll = s1l << ls;      5
     h0 = s1h << ls;      6
     lr = s1l >> rs;      7
     h1 = h0 | lr;        8

     dl = hs ? 0 : ll;    10
     dh = hs ? ll : h1;   12

That seems right, and would be 4 ops smaller.
Would need testing of course.
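
The sequence above can be modelled in plain C to test it before writing the TCG ops. One case worth checking is a shift amount that is a multiple of 64: there -sh & 63 is 0, so s1l >> rs would pass s1l through unchanged instead of contributing nothing to the high half. The sketch below sidesteps that by splitting the right shift in two, which is my own variation and costs an extra op; U128 and sll128 are hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t lo;
    uint64_t hi;
} U128;

/* 128-bit logical shift left built from 64-bit operations. */
static U128 sll128(uint64_t s1l, uint64_t s1h, unsigned sh)
{
    U128 r;
    unsigned ls, hs;
    uint64_t ll, h0, lr, h1;

    assert(sh < 128);
    ls = sh & 63;               /* shift within a 64-bit word */
    hs = sh & 64;               /* nonzero when shifting by a whole word */

    ll = s1l << ls;
    h0 = s1h << ls;
    /*
     * Bits moving from the low word into the high word.  Shifting by 1 and
     * then by 63 - ls equals s1l >> (64 - ls) for ls in 1..63 and yields 0
     * for ls == 0, without ever shifting by 64.
     */
    lr = (s1l >> 1) >> (63 - ls);
    h1 = h0 | lr;

    r.lo = hs ? 0 : ll;
    r.hi = hs ? ll : h1;
    return r;
}
```

Shifting by 0 must return the input unchanged, and shifting by 64 must move the low word into the high word; both make good first test vectors.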


r~


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 11/21] target/riscv: support for 128-bit bitwise instructions
  2021-10-20 17:47   ` Richard Henderson
@ 2021-10-20 19:18     ` Frédéric Pétrot
  0 siblings, 0 replies; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-20 19:18 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

On 20/10/2021 at 19:47, Richard Henderson wrote:
> On 10/19/21 2:48 AM, Frédéric Pétrot wrote:
>> The 128-bit bitwise instructions do not need any function prototype change
>> as the functions can be applied independently on the lower and upper part of
>> the registers.
>>
>> Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
>> Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
>> ---
>>   target/riscv/translate.c | 22 ++++++++++++++++++++++
>>   1 file changed, 22 insertions(+)
>>
>> diff --git a/target/riscv/translate.c b/target/riscv/translate.c
>> index e8f08f921e..71982f6284 100644
>> --- a/target/riscv/translate.c
>> +++ b/target/riscv/translate.c
>> @@ -429,6 +429,17 @@ static bool gen_logic_imm_fn(DisasContext *ctx, arg_i *a,
>> DisasExtend ext,
>>         gen_set_gpr(ctx, a->rd, dest);
>>   +    if (get_xl_max(ctx) == MXL_RV128) {
>> +        if (get_ol(ctx) ==  MXL_RV128) {
>> +            uint64_t immh = -(a->imm < 0);
>> +            src1 = get_gprh(ctx, a->rs1);
>> +            dest = dest_gprh(ctx, a->rd);
>> +
>> +            func(dest, src1, immh);
>> +        }
>> +        gen_set_gprh(ctx, a->rd, dest);
>> +    }
> 
> If ol < RV128, you're storing the low dest into the gprh, which is wrong.  It
> should be the sign-extension of the low part.  But that should happen for all
> writes.

  Thanks for your feedback (on the other parts too), which I'll apply.

  In this specific case, gen_set_gprh checks whether the operation is 128-bit;
  when it is not, I propagate the sign of the low part instead of using dest
  (the spec says that the sign should propagate to misa_xl_max, irrespective
  of xl).
  This implicitly forces the order in which the functions must be called, as
  you noticed. Introducing a higher-level function as you suggest would indeed
  make things more readable, and this can probably be applied in most places.

Frédéric
-- 
+---------------------------------------------------------------------------+
| Frédéric Pétrot, Pr. Grenoble INP-Ensimag/TIMA,   Ensimag deputy director |
| Mob/Pho: +33 6 74 57 99 65/+33 4 76 57 48 70      Ad augusta  per angusta |
| http://tima.univ-grenoble-alpes.fr frederic.petrot@univ-grenoble-alpes.fr |
+---------------------------------------------------------------------------+


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 14/21] target/riscv: support for 128-bit arithmetic instructions
  2021-10-19  9:48 ` [PATCH v3 14/21] target/riscv: support for 128-bit arithmetic instructions Frédéric Pétrot
@ 2021-10-20 20:15   ` Richard Henderson
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Henderson @ 2021-10-20 20:15 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

On 10/19/21 2:48 AM, Frédéric Pétrot wrote:
> +static bool gen_setcond_i128(TCGv rl, TCGv rh,
> +                             TCGv al, TCGv ah,
> +                             TCGv bl, TCGv bh,
> +                             TCGCond cond)
> +{
> +    switch (cond) {
> +    case TCG_COND_EQ:
> +        tcg_gen_setcond_tl(TCG_COND_EQ, rl, al, bl);
> +        tcg_gen_setcond_tl(TCG_COND_EQ, rh, ah, bh);
> +        tcg_gen_and_tl(rl, rl, rh);
> +        break;
> +
> +    case TCG_COND_NE:
> +        tcg_gen_setcond_tl(TCG_COND_NE, rl, al, bl);
> +        tcg_gen_setcond_tl(TCG_COND_NE, rh, ah, bh);
> +        tcg_gen_or_tl(rl, rl, rh);
> +        break;

But of course setcond is more expensive than logic.
This would be better as xor + xor + or + setcond.
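
In plain C terms the cheaper form looks like the sketch below: two xors and an or collapse the 128-bit difference into one word, so a single comparison against zero is enough. The function names are made up for the illustration:

```c
#include <assert.h>
#include <stdint.h>

/* 128-bit equality: the values match iff no bit differs in either half. */
static int eq128(uint64_t al, uint64_t ah, uint64_t bl, uint64_t bh)
{
    uint64_t diff = (al ^ bl) | (ah ^ bh);   /* xor + xor + or */

    return diff == 0;                        /* one setcond */
}

/* 128-bit inequality: same reduction, opposite test. */
static int ne128(uint64_t al, uint64_t ah, uint64_t bl, uint64_t bh)
{
    return ((al ^ bl) | (ah ^ bh)) != 0;
}

/* Usage example. */
static void eq128_check(void)
{
    assert(eq128(1, 2, 1, 2));
    assert(ne128(1, 2, 1, 3));
}
```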


> +    case TCG_COND_LTU:
> +    {
> +        TCGv tmp1 = tcg_temp_new(),
> +             tmp2 = tcg_temp_new();
> +
> +        tcg_gen_sub2_tl(rl, rh, al, ah, bl, bh);
> +        tcg_gen_eqv_tl(tmp1, ah, bh);
> +        tcg_gen_and_tl(tmp1, tmp1, rh);
> +        tcg_gen_andc_tl(tmp2, bh, ah);
> +        tcg_gen_or_tl(tmp1, tmp1, tmp2);
> +        tcg_gen_shri_tl(rl, tmp1, 63);

Hmm.  Seems like it would work.
I make that 7 operations.

Or GEU in 6 operations:

     /* borrow in to second word */
     setcond_tl(TCG_COND_LTU, t1, al, bl);
     /* seed third word with 1, which will be result */
     sub2_tl(t1, t2, ah, one, t1, zero);
     sub2_tl(t1, rl, t1, t2, bh, zero);
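
The borrow chain can be checked in plain C by spelling out each sub2 as a two-word subtraction. This is only a model of the sequence above (geu128 is a made-up name); it returns 1 when a >= b as unsigned 128-bit values:

```c
#include <assert.h>
#include <stdint.h>

static uint64_t geu128(uint64_t al, uint64_t ah, uint64_t bl, uint64_t bh)
{
    /* Borrow out of the low-word subtraction al - bl. */
    uint64_t borrow = al < bl;

    /* (t2:t1) = (1:ah) - (0:borrow); the seeded 1 will become the result. */
    uint64_t t1 = ah - borrow;
    uint64_t t2 = 1 - (ah < borrow);

    /* (rl:t1) = (t2:t1) - (0:bh); only the upper word is of interest. */
    uint64_t rl = t2 - (t1 < bh);

    return rl;
}

/* Usage example. */
static void geu128_check(void)
{
    assert(geu128(0, 0, 0, 0) == 1);   /* equal values compare as >= */
    assert(geu128(0, 0, 1, 0) == 0);   /* 0 < 1 */
}
```

The seeded 1 survives only if neither subtraction borrows out of the second word, which is exactly the a >= b condition.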

> +        } else {
> +            gen_setcond_i128(tmpl, tmph, src1, src1h, src2, src2h, cond);
> +            tcg_gen_brcondi_tl(TCG_COND_NE, tmpl, 0, l);
> +        }

There are two setcond cases that invert their result; you should fold that in here and 
invert the branch condition.  As long as you're special casing 0, you might as well 
special case TCG_COND_LT/GE and test the sign of the high word.


r~


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 15/21] target/riscv: support for 128-bit M extension
  2021-10-19  9:48 ` [PATCH v3 15/21] target/riscv: support for 128-bit M extension Frédéric Pétrot
@ 2021-10-20 20:58   ` Richard Henderson
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Henderson @ 2021-10-20 20:58 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

On 10/19/21 2:48 AM, Frédéric Pétrot wrote:
>   struct CPURISCVState {
>       target_ulong gpr[32];
>       target_ulong gprh[32]; /* 64 top bits of the 128-bit registers */
> +    target_ulong hlpr[2];  /* scratch registers for 128-bit div/rem helpers */

We have something similar for s390x, but we make use of the helper return value to return 
one part of the result and only store the other part of the result in env->retxl.

> +    cpu_hlpr[0] = tcg_global_mem_new(cpu_env,
> +        offsetof(CPURISCVState, hlpr[0]), "helper_reg0");
> +    cpu_hlpr[1] = tcg_global_mem_new(cpu_env,
> +        offsetof(CPURISCVState, hlpr[1]), "helper_reg1");

You very much do not want to make these global temps.

This requires the helpers to indicate that they clobber temps, which will flush all cached 
register state across the helper.  Just perform the load of the result explicitly after 
the helper.

> +static void gen_mulu2_i128(TCGv rll, TCGv rlh, TCGv rhl, TCGv rhh,
> +                           TCGv al, TCGv ah, TCGv bl, TCGv bh)
> +{
> +    TCGv tmpl = tcg_temp_new(),
> +         tmph = tcg_temp_new(),
> +         cnst_zero = tcg_constant_tl(0);
> +
> +    tcg_gen_mulu2_tl(rll, rlh, al, bl);
> +
> +    tcg_gen_mulu2_tl(tmpl, tmph, al, bh);
> +    tcg_gen_add2_tl(rlh, rhl, rlh, cnst_zero, tmpl, tmph);
> +    tcg_gen_mulu2_tl(tmpl, tmph, ah, bl);
> +    tcg_gen_add2_tl(rlh, tmph, rlh, rhl, tmpl, tmph);
> +    /* Overflow detection into rhh */
> +    tcg_gen_setcond_tl(TCG_COND_LTU, rhh, tmph, rhl);
> +
> +    tcg_gen_mov_tl(rhl, tmph);
> +
> +    tcg_gen_mulu2_tl(tmpl, tmph, ah, bh);
> +    tcg_gen_add2_tl(rhl, rhh, rhl, rhh, tmpl, tmph);

It might be clearer to number these 0-3 rather than permute [lh].

I think you don't need to return all 4 words of results; just have gen_mulhu_i128 with 6 
parameters, since there's no RV128 instruction that returns the entire result.

> +static void gen_mul_i128(TCGv rll, TCGv rlh,
> +                         TCGv rs1l, TCGv rs1h, TCGv rs2l, TCGv rs2h)
> +{
> +    TCGv rhl = tcg_temp_new(),
> +         rhh = tcg_temp_new();
> +
> +    gen_mulu2_i128(rll, rlh, rhl, rhh, rs1l, rs1h, rs2l, rs2h);
> +
> +    tcg_temp_free(rhl);
> +    tcg_temp_free(rhh);
> +}

This is much simpler than gen_mulu2_i128.

> +static void gen_mulh_i128(TCGv rhl, TCGv rhh,
> +                          TCGv rs1l, TCGv rs1h, TCGv rs2l, TCGv rs2h)
> +{
> +    TCGv rll = tcg_temp_new(),
> +         rlh = tcg_temp_new(),
> +         rlln = tcg_temp_new(),
> +         rlhn = tcg_temp_new(),
> +         rhln = tcg_temp_new(),
> +         rhhn = tcg_temp_new(),
> +         sgnres = tcg_temp_new(),
> +         tmp = tcg_temp_new(),
> +         cnst_one = tcg_constant_tl(1),
> +         cnst_zero = tcg_constant_tl(0);
> +
> +    /* Extract sign of result (=> sgn(a) xor sgn(b)) */
> +    tcg_gen_setcondi_tl(TCG_COND_LT, sgnres, rs1h, 0);
> +    tcg_gen_setcondi_tl(TCG_COND_LT, tmp, rs2h, 0);
> +    tcg_gen_xor_tl(sgnres, sgnres, tmp);
> +
> +    /* Take absolute value of operands */
> +    tcg_gen_sari_tl(rhl, rs1h, 63);
> +    tcg_gen_add2_tl(rlln, rlhn, rs1l, rs1h, rhl, rhl);
> +    tcg_gen_xor_tl(rlln, rlln, rhl);
> +    tcg_gen_xor_tl(rlhn, rlhn, rhl);
> +
> +    tcg_gen_sari_tl(rhl, rs2h, 63);
> +    tcg_gen_add2_tl(rhln, rhhn, rs2l, rs2h, rhl, rhl);
> +    tcg_gen_xor_tl(rhln, rhln, rhl);
> +    tcg_gen_xor_tl(rhhn, rhhn, rhl);
> +
> +    /* Unsigned multiplication */
> +    gen_mulu2_i128(rll, rlh, rhl, rhh, rlln, rlhn, rhln, rhhn);
> +
> +    /* Negation of result (two's complement : ~res + 1) */
> +    tcg_gen_not_tl(rlln, rll);
> +    tcg_gen_not_tl(rlhn, rlh);
> +    tcg_gen_not_tl(rhln, rhl);
> +    tcg_gen_not_tl(rhhn, rhh);
> +
> +    tcg_gen_add2_tl(rlln, tmp, rlln, cnst_zero, cnst_one, cnst_zero);
> +    tcg_gen_add2_tl(rlhn, tmp, rlhn, cnst_zero, tmp, cnst_zero);
> +    tcg_gen_add2_tl(rhln, tmp, rhln, cnst_zero, tmp, cnst_zero);
> +    tcg_gen_add2_tl(rhhn, tmp, rhhn, cnst_zero, tmp, cnst_zero);
> +
> +    /* Move conditionally result or -result depending on result sign */
> +    tcg_gen_movcond_tl(TCG_COND_NE, rhl, sgnres, cnst_zero, rhln, rhl);
> +    tcg_gen_movcond_tl(TCG_COND_NE, rhh, sgnres, cnst_zero, rhhn, rhh);
> +
> +    tcg_temp_free(rll);
> +    tcg_temp_free(rlh);
> +    tcg_temp_free(rlln);
> +    tcg_temp_free(rlhn);
> +    tcg_temp_free(rhln);
> +    tcg_temp_free(rhhn);
> +    tcg_temp_free(sgnres);
> +    tcg_temp_free(tmp);
>  }

You don't need to compute abs or conditional negation.

See tcg_gen_muls2_i32, which adjusts for negative inputs: it simply subtracts one input
from the high part when the other input is negative.
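
The identity is easy to check one word size down. The sketch below is plain C on 64-bit operands, not the TCG code: it derives the signed high product from an unsigned one by subtracting each operand from the high word when the other is negative. mulhu64 and mulh64 are names made up for the illustration:

```c
#include <assert.h>
#include <stdint.h>

/* High 64 bits of the unsigned 64x64 product, built from 32-bit pieces. */
static uint64_t mulhu64(uint64_t a, uint64_t b)
{
    uint64_t a0 = (uint32_t)a, a1 = a >> 32;
    uint64_t b0 = (uint32_t)b, b1 = b >> 32;
    uint64_t p00 = a0 * b0;
    uint64_t p01 = a0 * b1;
    uint64_t p10 = a1 * b0;
    uint64_t p11 = a1 * b1;
    /* Sum of the partial products landing in bits 32..63, with carries. */
    uint64_t mid = (p00 >> 32) + (uint32_t)p01 + (uint32_t)p10;

    return p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
}

/* Signed high product: unsigned high product adjusted for negative inputs. */
static int64_t mulh64(int64_t a, int64_t b)
{
    uint64_t hi = mulhu64((uint64_t)a, (uint64_t)b);

    if (a < 0) {
        hi -= (uint64_t)b;
    }
    if (b < 0) {
        hi -= (uint64_t)a;
    }
    return (int64_t)hi;
}

/* Usage example. */
static void mulh64_check(void)
{
    assert(mulh64(-1, -1) == 0);    /* (-1) * (-1) = 1, high word 0 */
    assert(mulh64(-1, 1) == -1);    /* -1 sign-extends into the high word */
}
```

For mulhsu only the adjustment for the signed operand applies, since the other operand is never negative.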

> +static void gen_mulhsu_i128(TCGv rhl, TCGv rhh,
> +                            TCGv rs1l, TCGv rs1h, TCGv rs2l, TCGv rs2h)

Similarly, but of course only one operand may be negative.

> +static void gen_div_i128(TCGv rdl, TCGv rdh,
> +                         TCGv rs1l, TCGv rs1h, TCGv rs2l, TCGv rs2h)
> +{
> +    gen_helper_divs_i128(cpu_env, (TCGv_i64)rs1l, (TCGv_i64)rs1h,
> +                                  (TCGv_i64)rs2l, (TCGv_i64)rs2h);

Do not cast, just make the arguments target_long always.

Anyway, per above, this becomes

     gen_helper_divs_i128(rdl, cpu_env, rs1l, rs1h, rs2l, rs2h);
     tcg_gen_ld_tl(rdh, cpu_env, offsetof(CPURISCVState, retxh));
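
The calling convention can be modelled in plain C as follows. The arithmetic is a stand-in (a 128-bit addition rather than a division) to keep the sketch short, Env is a hypothetical stand-in for CPURISCVState, and helper_add128 is a made-up name; only the retxh field name comes from the snippet above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the CPU state; only the field used here. */
typedef struct {
    uint64_t retxh;     /* high half of the last 128-bit helper result */
} Env;

/* Return the low half directly; leave the high half in env->retxh. */
static uint64_t helper_add128(Env *env, uint64_t al, uint64_t ah,
                              uint64_t bl, uint64_t bh)
{
    uint64_t lo = al + bl;

    assert(env != NULL);
    env->retxh = ah + bh + (lo < al);   /* propagate the carry */
    return lo;
}
```

The caller takes the low half from the return value and then reads the high half back from the state with an ordinary load, so no global temp is involved.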


r~


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 16/21] target/riscv: adding high part of some csrs
  2021-10-19  9:48 ` [PATCH v3 16/21] target/riscv: adding high part of some csrs Frédéric Pétrot
@ 2021-10-20 21:38   ` Richard Henderson
  2021-10-20 23:03   ` Richard Henderson
  1 sibling, 0 replies; 49+ messages in thread
From: Richard Henderson @ 2021-10-20 21:38 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

On 10/19/21 2:48 AM, Frédéric Pétrot wrote:
> +    /* Upper 64-bits of 128-bit CSRs */
> +    uint64_t mtvech;
> +    uint64_t mscratchh;
> +    uint64_t mepch;
> +    uint64_t satph;
> +    uint64_t mstatush;

These need adding to the same machine.c subsection as the gprs.
Otherwise,

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 17/21] target/riscv: helper functions to wrap calls to 128-bit csr insns
  2021-10-19  9:48 ` [PATCH v3 17/21] target/riscv: helper functions to wrap calls to 128-bit csr insns Frédéric Pétrot
@ 2021-10-20 21:47   ` Richard Henderson
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Henderson @ 2021-10-20 21:47 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

On 10/19/21 2:48 AM, Frédéric Pétrot wrote:
> Given the side effects they have, the csr instructions are realized as
> helpers. We extend this existing infrastructure for 128-bit sized csr.
> We have a slight issue with returning 128-bit values: we use the globals
> we added to support div/rem insns to that end.
> These helpers all call a unique function that is currently a stub.
> The trans_csrxx functions supporting 128-bit are yet to be implemented.
> 
> Signed-off-by: Frédéric Pétrot<frederic.petrot@univ-grenoble-alpes.fr>
> Co-authored-by: Fabien Portas<fabien.portas@grenoble-inp.org>
> ---
>   target/riscv/cpu.h       |  4 ++++
>   target/riscv/helper.h    |  3 +++
>   target/riscv/csr.c       |  7 +++++++
>   target/riscv/op_helper.c | 44 ++++++++++++++++++++++++++++++++++++++++
>   4 files changed, 58 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v3 18/21] target/riscv: modification of the trans_csrxx for 128-bit support
  2021-10-19  9:48 ` [PATCH v3 18/21] target/riscv: modification of the trans_csrxx for 128-bit support Frédéric Pétrot
@ 2021-10-20 21:53   ` Richard Henderson
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Henderson @ 2021-10-20 21:53 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

On 10/19/21 2:48 AM, Frédéric Pétrot wrote:
> As opposed to the gen_arith and gen_shift generation helpers, the csr insns
> do not have a common prototype, so the choice to generate 32/64 or 128-bit
> helper calls is done in the trans_csrxx functions.
> 
> Signed-off-by: Frédéric Pétrot<frederic.petrot@univ-grenoble-alpes.fr>
> Co-authored-by: Fabien Portas<fabien.portas@grenoble-inp.org>
> ---
>   target/riscv/insn_trans/trans_rvi.c.inc | 201 ++++++++++++++++++------
>   1 file changed, 156 insertions(+), 45 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~



* Re: [PATCH v3 19/21] target/riscv: actual functions to realize crs 128-bit insns
  2021-10-19  9:48 ` [PATCH v3 19/21] target/riscv: actual functions to realize crs 128-bit insns Frédéric Pétrot
@ 2021-10-20 22:18   ` Richard Henderson
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Henderson @ 2021-10-20 22:18 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

On 10/19/21 2:48 AM, Frédéric Pétrot wrote:
> The csrs are accessed through function pointers: we set up the table
> for the 128-bit accesses, make the stub a function that does what it
> should, and implement basic accesses on read-only csrs.
> 
> Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
> Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
> ---
>   target/riscv/cpu.h |  16 +++++
>   target/riscv/csr.c | 152 ++++++++++++++++++++++++++++++++++++++++++++-
>   2 files changed, 165 insertions(+), 3 deletions(-)
> 
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index eb4f63fcbf..253e87cd92 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -474,6 +474,15 @@ RISCVException riscv_csrrw_i128(CPURISCVState *env, int csrno,
>                                   Int128 *ret_value,
>                                   Int128 new_value, Int128 write_mask);
>   
> +typedef RISCVException (*riscv_csr_read128_fn)(CPURISCVState *env, int csrno,
> +                                               Int128 *ret_value);
> +typedef RISCVException (*riscv_csr_write128_fn)(CPURISCVState *env, int csrno,
> +                                             Int128 new_value);
> +typedef RISCVException (*riscv_csr_op128_fn)(CPURISCVState *env, int csrno,
> +                                             Int128 *ret_value,
> +                                             Int128 new_value,
> +                                             Int128 write_mask);

Do we really want all 3, or just the single rmw function?
Although I guess it's clearest to match the existing code...

> +
>   typedef struct {
>       const char *name;
>       riscv_csr_predicate_fn predicate;
> @@ -482,6 +491,12 @@ typedef struct {
>       riscv_csr_op_fn op;
>   } riscv_csr_operations;
>   
> +typedef struct {
> +    riscv_csr_read128_fn read128;
> +    riscv_csr_write128_fn write128;
> +    riscv_csr_op128_fn op128;
> +} riscv_csr_operations128;

Eh.  I think better to extend the one riscv_csr_operations structure.

> +static inline RISCVException riscv_csrrw_check_i128(CPURISCVState *env,
> +                                                    int csrno,
> +                                                    Int128 write_mask,
> +                                                    RISCVCPU *cpu)

Change "Int128 write_mask" to "bool write" and you can share this entire function with 
riscv_csrrw.

Indeed, you could split them like so:

riscv_csrrw(...)
{
     ret = csrrw_check(...);
     if (ret != RISCV_EXCP_NONE) {
         return ret;
     }
     return csrrw_do64(...);
}

riscv_csrrw_128(...)
{
     ret = csrrw_check(...);
     if (ret != RISCV_EXCP_NONE) {
         return ret;
     }
     if (csr128) {
         return csrrw_do128(...);
     }
     ret = csrrw_do64(..., old64, ...);
     if (ret == RISCV_EXCP_NONE) {
         *old_val = int128_make64(old64);
     }
     return ret;
}
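
A minimal, self-contained sketch of the split outlined above, for illustration only. Every name here (Excp, I128, ToyCsr, toy_csrrw, toy_csrrw_128) is a hypothetical stand-in and not QEMU's API; the point is just that one check taking a "bool write" can serve both widths, and that the 128-bit entry point can fall back to the 64-bit path and zero-extend the old value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the QEMU types, for illustration only. */
typedef enum { EXCP_NONE = 0, EXCP_ILLEGAL_INST = 2 } Excp;
typedef struct { uint64_t lo, hi; } I128;

typedef struct {
    bool present;    /* the csr exists */
    bool read_only;  /* writes must trap */
    bool has_128;    /* a dedicated 128-bit accessor is available */
    uint64_t lo, hi; /* csr contents */
} ToyCsr;

/* Shared check: only needs to know whether a write is requested. */
static Excp csrrw_check(const ToyCsr *csr, bool write)
{
    if (!csr->present || (write && csr->read_only)) {
        return EXCP_ILLEGAL_INST;
    }
    return EXCP_NONE;
}

/* 64-bit read-modify-write on the low half. */
static Excp csrrw_do64(ToyCsr *csr, uint64_t *old, uint64_t val, uint64_t mask)
{
    *old = csr->lo;
    csr->lo = (csr->lo & ~mask) | (val & mask);
    return EXCP_NONE;
}

/* 128-bit read-modify-write on both halves. */
static Excp csrrw_do128(ToyCsr *csr, I128 *old, I128 val, I128 mask)
{
    old->lo = csr->lo;
    old->hi = csr->hi;
    csr->lo = (csr->lo & ~mask.lo) | (val.lo & mask.lo);
    csr->hi = (csr->hi & ~mask.hi) | (val.hi & mask.hi);
    return EXCP_NONE;
}

static Excp toy_csrrw(ToyCsr *csr, uint64_t *old, uint64_t val, uint64_t mask)
{
    Excp ret = csrrw_check(csr, mask != 0);
    if (ret != EXCP_NONE) {
        return ret;
    }
    return csrrw_do64(csr, old, val, mask);
}

static Excp toy_csrrw_128(ToyCsr *csr, I128 *old, I128 val, I128 mask)
{
    uint64_t old64;
    Excp ret = csrrw_check(csr, (mask.lo | mask.hi) != 0);
    if (ret != EXCP_NONE) {
        return ret;
    }
    if (csr->has_128) {
        return csrrw_do128(csr, old, val, mask);
    }
    /* No 128-bit accessor: use the 64-bit path, zero-extend the result. */
    ret = csrrw_do64(csr, &old64, val.lo, mask.lo);
    if (ret == EXCP_NONE) {
        old->lo = old64;
        old->hi = 0;
    }
    return ret;
}
```

With this shape, a csr that has no 128-bit accessor needs no extra code at all; only the few truly 128-bit csrs would set has_128.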

> +    RISCVException ret = csr_ops[csrno].predicate(env, csrno);
> +    if (ret != RISCV_EXCP_NONE) {
> +        return ret;
> +    }
> +
> +    return RISCV_EXCP_NONE;

BTW, just

     return csr_ops[csrno].predicate(env, csrno);


r~



* Re: [PATCH v3 16/21] target/riscv: adding high part of some csrs
  2021-10-19  9:48 ` [PATCH v3 16/21] target/riscv: adding high part of some csrs Frédéric Pétrot
  2021-10-20 21:38   ` Richard Henderson
@ 2021-10-20 23:03   ` Richard Henderson
  1 sibling, 0 replies; 49+ messages in thread
From: Richard Henderson @ 2021-10-20 23:03 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

On 10/19/21 2:48 AM, Frédéric Pétrot wrote:
> Adding the high part of a minimal set of csr.
> 
> Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
> Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
> ---
>   target/riscv/cpu.h | 7 +++++++
>   1 file changed, 7 insertions(+)
> 
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index 8b96ccb37a..27ec4fec63 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -192,6 +192,13 @@ struct CPURISCVState {
>       target_ulong hgatp;
>       uint64_t htimedelta;
>   
> +    /* Upper 64-bits of 128-bit CSRs */
> +    uint64_t mtvech;
> +    uint64_t mscratchh;
> +    uint64_t mepch;
> +    uint64_t satph;
> +    uint64_t mstatush;

There's nothing defined for mstatush (except SD), so we might as well leave it out until 
there is.  The only thing required there is that we put SD in the correct place when we 
compute it from lower bits on read.

mepch and mtvech do not need extending until we extend pc.

I don't see a definition of how satph extends, and since you're not changing the rv64 
virtual memory routines nothing will examine it anyway.  Let's drop that.

Which leaves mscratchh and maybe sscratchh as the only "real" 128-bit csrs.
Which suggests that the support that you do add in the next patch does not need to be 
quite as complicated.  E.g. drop the op128 hook.


r~



* Re: [PATCH v3 21/21] target/riscv: support for 128-bit satp
  2021-10-19  9:48 ` [PATCH v3 21/21] target/riscv: support for 128-bit satp Frédéric Pétrot
@ 2021-10-20 23:09   ` Richard Henderson
  2021-10-21 11:12     ` Frédéric Pétrot
  0 siblings, 1 reply; 49+ messages in thread
From: Richard Henderson @ 2021-10-20 23:09 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

On 10/19/21 2:48 AM, Frédéric Pétrot wrote:
> Support for a 128-bit satp. This is a bit more involved than necessary
> because we took the opportunity to increase the page size to 16kB, and
> change the page table geometry, which makes the page walk a bit more
> parametrizable (variables instead of defines).
> Note that this is anyway a necessary step for the merging of the 32-bit and
> 64-bit riscv versions in a single executable.
> 
> Signed-off-by: Frédéric Pétrot<frederic.petrot@univ-grenoble-alpes.fr>
> Co-authored-by: Fabien Portas<fabien.portas@grenoble-inp.org>
> ---
>   target/riscv/cpu-param.h  |   9 +++-
>   target/riscv/cpu_bits.h   |  10 ++++
>   target/riscv/cpu_helper.c |  54 ++++++++++++++------
>   target/riscv/csr.c        | 105 ++++++++++++++++++++++++++++++++------
>   4 files changed, 144 insertions(+), 34 deletions(-)

Is there a spec for this?  I don't see anything in the 2021-10-06 draft...


r~



* Re: [PATCH v3 20/21] target/riscv: adding 128-bit access functions for some csrs
  2021-10-19  9:48 ` [PATCH v3 20/21] target/riscv: adding 128-bit access functions for some csrs Frédéric Pétrot
@ 2021-10-20 23:18   ` Richard Henderson
  0 siblings, 0 replies; 49+ messages in thread
From: Richard Henderson @ 2021-10-20 23:18 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

On 10/19/21 2:48 AM, Frédéric Pétrot wrote:
> +static RISCVException read_mstatus_i128(CPURISCVState *env, int csrno,
> +                                   Int128 *val)
> +{
> +    *val = int128_make128(env->mstatus, env->mstatush);
> +    return RISCV_EXCP_NONE;
> +}

Needs updating from split SD bit.  I suggest

     uint64_t val64;
     read_mstatus(env, CSR_MSTATUS, &val64);
     *val = int128_make128(val64 & ~MSTATUS64_SD, val64 & MSTATUS64_SD);
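
As a standalone illustration of that split (a sketch, not QEMU code): SD is the top bit of the register, so in a 64-bit view it sits at bit 63, and in a 128-bit view it sits at bit 63 of the high half. The sketch assumes the intent is that the low half carries every bit except SD. TOY_MSTATUS64_SD, ToyInt128 and split_mstatus_sd are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical stand-in: SD as seen in a 64-bit mstatus (bit 63). */
#define TOY_MSTATUS64_SD 0x8000000000000000ULL

/* Hypothetical stand-in for a 128-bit value as two 64-bit halves. */
typedef struct { uint64_t lo, hi; } ToyInt128;

/*
 * Move SD from bit 63 of the 64-bit value to bit 63 of the high half
 * (bit 127 overall); all other bits stay in the low half.
 */
static ToyInt128 split_mstatus_sd(uint64_t val64)
{
    ToyInt128 r;
    r.lo = val64 & ~TOY_MSTATUS64_SD;
    r.hi = val64 & TOY_MSTATUS64_SD;
    return r;
}
```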

> +static RISCVException write_mstatus_i128(CPURISCVState *env, int csrno,
> +                                        Int128 val)
> +{
...
> +    dirty = ((int128_getlo(mstatus) & MSTATUS_FS) == MSTATUS_FS) |
> +            ((int128_getlo(mstatus) & MSTATUS_XS) == MSTATUS_XS);
> +    if (dirty) {
> +        if (riscv_cpu_is_32bit(env)) {
> +            mstatus = int128_make64(int128_getlo(mstatus) | MSTATUS32_SD);
> +        } else if (riscv_cpu_is_64bit(env)) {
> +            mstatus = int128_make64(int128_getlo(mstatus) | MSTATUS64_SD);
> +        } else {
> +            mstatus = int128_or(mstatus, int128_make128(0, MSTATUSH128_SD));
> +        }
> +    }

Needs updating for change to SD.
Now you can defer everything to the 64-bit write_mstatus.


r~



* Re: [PATCH v3 21/21] target/riscv: support for 128-bit satp
  2021-10-20 23:09   ` Richard Henderson
@ 2021-10-21 11:12     ` Frédéric Pétrot
  0 siblings, 0 replies; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-21 11:12 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

Le 21/10/2021 à 01:09, Richard Henderson a écrit :
> On 10/19/21 2:48 AM, Frédéric Pétrot wrote:
>> Support for a 128-bit satp. This is a bit more involved than necessary
>> because we took the opportunity to increase the page size to 16kB, and
>> change the page table geometry, which makes the page walk a bit more
>> parametrizable (variables instead of defines).
>> Note that this is anyway a necessary step for the merging of the 32-bit and
>> 64-bit riscv versions in a single executable.
>>
>> Signed-off-by: Frédéric Pétrot<frederic.petrot@univ-grenoble-alpes.fr>
>> Co-authored-by: Fabien Portas<fabien.portas@grenoble-inp.org>
>> ---
>>   target/riscv/cpu-param.h  |   9 +++-
>>   target/riscv/cpu_bits.h   |  10 ++++
>>   target/riscv/cpu_helper.c |  54 ++++++++++++++------
>>   target/riscv/csr.c        | 105 ++++++++++++++++++++++++++++++++------
>>   4 files changed, 144 insertions(+), 34 deletions(-)
> 
> Is there a spec for this?  I don't see anything in the 2021-10-06 draft...

  Indeed, nothing on that matter is close to being standardized, so we are
  clearly out of bounds; I should probably not have added this to the series.
  FWIW, we wrote a small specification of the schemes we implemented.
  (https://github.com/fpetrot/128-test/blob/main/kernel/vm_spec_short.pdf).

  Frédéric
> 
> 
> r~

-- 
+---------------------------------------------------------------------------+
| Frédéric Pétrot, Pr. Grenoble INP-Ensimag/TIMA,   Ensimag deputy director |
| Mob/Pho: +33 6 74 57 99 65/+33 4 76 57 48 70      Ad augusta  per angusta |
| http://tima.univ-grenoble-alpes.fr frederic.petrot@univ-grenoble-alpes.fr |
+---------------------------------------------------------------------------+



* Re: [PATCH v3 04/21] target/riscv: additional macros to check instruction support
  2021-10-20 14:08   ` Richard Henderson
@ 2021-10-21 16:22     ` Frédéric Pétrot
  0 siblings, 0 replies; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-21 16:22 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

Le 20/10/2021 à 16:08, Richard Henderson a écrit :
> On 10/19/21 2:47 AM, Frédéric Pétrot wrote:
>> +
>> +#define REQUIRE_64_OR_128BIT(ctx) do { \
>> +    if (get_xl(ctx) == MXL_RV32) {     \
>> +        return false;                  \
>> +    }                                  \
>> +} while (0)
> 
> So... you've left REQUIRE_64BIT accepting RV128, so that means that your current
> REQUIRE_64_OR_128BIT is redundant.  Is that intentional?
> 
> It does seem like all places that accept RV128 should accept RV64, but perhaps
> that's just your "limited" caveat in the cover letter.

  My bad, indeed there is no instruction that is RV64-only. "Limited"
  referred to the minimal support of the privileged spec.

> You don't use REQUIRE_32_OR_64BIT at all.  Remove it?

  It's a bug: some compressed insns are RV32/RV64 only (this is linked to
  the other bug in the order in which the insns stand in the insn16.decode
  file that you pointed out).
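
  A small self-contained sketch of how such a guard would reject RV128 in a
  translation function. The macro body is an assumption modelled on the
  REQUIRE_64_OR_128BIT macro quoted above, and ToyCtx, toy_get_xl and
  trans_toy_compressed are hypothetical stand-ins, not the actual patch.

```c
#include <stdbool.h>

/* Stand-in XLEN encoding (1 = RV32, 2 = RV64, 3 = RV128). */
typedef enum { TOY_MXL_RV32 = 1, TOY_MXL_RV64 = 2, TOY_MXL_RV128 = 3 } ToyMXL;

/* Hypothetical, reduced translation context. */
typedef struct { ToyMXL xl; } ToyCtx;

static inline ToyMXL toy_get_xl(const ToyCtx *ctx)
{
    return ctx->xl;
}

/* Assumed shape of the guard: bail out of the trans function on RV128. */
#define TOY_REQUIRE_32_OR_64BIT(ctx) do {       \
    if (toy_get_xl(ctx) == TOY_MXL_RV128) {     \
        return false;                           \
    }                                           \
} while (0)

/* Hypothetical translator for an insn that exists on RV32/RV64 only. */
static bool trans_toy_compressed(ToyCtx *ctx)
{
    TOY_REQUIRE_32_OR_64BIT(ctx);
    /* ... code generation would go here ... */
    return true;
}
```

  Returning false makes the decoder treat the encoding as not matching, so
  the same bit pattern can be claimed by a different insn on RV128.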

  Frédéric
> 
> 
> r~




* Re: [PATCH v3 06/21] target/riscv: array for the 64 upper bits of 128-bit registers
  2021-10-20 14:44   ` Richard Henderson
@ 2021-10-22  6:06     ` Frédéric Pétrot
  0 siblings, 0 replies; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-22  6:06 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

Le 20/10/2021 à 16:44, Richard Henderson a écrit :
> On 10/19/21 2:47 AM, Frédéric Pétrot wrote:
>> The upper 64 bits of the 128-bit registers now have a place inside
>> the cpu state structure, and are created as globals for future use.
>>
>> Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
>> Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
>> ---
>>   target/riscv/translate.c | 5 ++++-
>>   2 files changed, 5 insertions(+), 1 deletion(-)
>>       for (i = 1; i < 32; i++) {
>>           cpu_gpr[i] = tcg_global_mem_new(cpu_env,
>>               offsetof(CPURISCVState, gpr[i]), riscv_int_regnames[i]);
>> +        cpu_gprh[i] = tcg_global_mem_new(cpu_env,
>> +            offsetof(CPURISCVState, gprh[i]), riscv_int_regnames[i]);
> 
> This will just be confusing in the tcg dumps -- let's not name the two temps
> identically.

  Agreed.

> Honestly, I'm not 100% thrilled about the / that appears in the current name; I
> think it would be easiest to do
> 
>   g_string_printf("x%d", i)
> and
>   g_string_printf("x%dh", i)

  The registers' software names are used by gcc -S and the default objdump -d
  output, and also by disas/riscv.c, so dropping them might be a bit rough.
  For now I'll just add an h to the existing names, and suggest we wait and
  see if anyone cares.
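
  A minimal sketch of that naming choice, assuming the upper-half name is the
  existing name with an h appended (so "x1/ra" becomes "x1/rah").
  upper_half_name is a hypothetical helper using plain snprintf, not
  something taken from the series.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build the name of the upper-half temp by appending
 * 'h' to the existing register name. Returns 0 on success, -1 if the
 * result does not fit in buf.
 */
static int upper_half_name(char *buf, size_t size, const char *base)
{
    int n = snprintf(buf, size, "%sh", base);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```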

  Frédéric



* Re: [PATCH v3 13/21] target/riscv: support for 128-bit shift instructions
  2021-10-20 19:06   ` Richard Henderson
@ 2021-10-24 22:49     ` Frédéric Pétrot
  0 siblings, 0 replies; 49+ messages in thread
From: Frédéric Pétrot @ 2021-10-24 22:49 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-riscv
  Cc: philmd, bin.meng, alistair.francis, palmer, fabien.portas

Le 20/10/2021 à 21:06, Richard Henderson a écrit :
> On 10/19/21 2:48 AM, Frédéric Pétrot wrote:
> 
> Hmm.  3 * (and + shift + cmp + cmov) + 2 * (sub + or) = 16 ops.
> Not horrible...
> 
> Let's see.
> 
>     ls = sh & 63;        1
>     rs = -sh & 63;       3
>     hs = sh & 64;        4
> 
>     ll = s1l << ls;      5
>     h0 = s1h << ls;      6
>     lr = s1l >> rs;      7
>     h1 = h0 | lr;        8
> 
>     dl = hs ? 0 : ll;    10
>     dh = hs ? ll : h1;   12
> 
> That seems right, and would be 4 ops smaller.
> Would need testing of course.

  Nice!
  The case where sh is 0 is special (rs is then 0 as well, so lr would be
  s1l rather than 0), so we need an additional cmov, but this is still
  3 ops better.
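
  For reference, a plain C model of the sequence discussed above, including
  the extra select for the zero-shift case. This is a sketch on host
  integers, not TCG code; ToyU128 and toy_shl128 are hypothetical names, and
  sh is assumed to be already reduced to 0..127.

```c
#include <stdint.h>

/* Hypothetical 128-bit value as two 64-bit halves. */
typedef struct { uint64_t lo, hi; } ToyU128;

/* Shift the 128-bit value (s1h:s1l) left by sh, with 0 <= sh <= 127. */
static ToyU128 toy_shl128(uint64_t s1l, uint64_t s1h, unsigned sh)
{
    unsigned ls = sh & 63;      /* shift amount within a 64-bit half */
    unsigned rs = -sh & 63;     /* complementary right shift */
    unsigned hs = sh & 64;      /* non-zero when shifting by 64 or more */

    uint64_t ll = s1l << ls;
    uint64_t h0 = s1h << ls;
    uint64_t lr = s1l >> rs;
    uint64_t h1;
    ToyU128 d;

    /*
     * Extra select: when ls is 0, rs is 0 too and lr would equal s1l
     * instead of 0. This covers sh == 0 (and, harmlessly, sh == 64,
     * where h1 is not used).
     */
    lr = ls ? lr : 0;
    h1 = h0 | lr;

    d.lo = hs ? 0 : ll;
    d.hi = hs ? ll : h1;
    return d;
}
```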

  Frédéric
> 
> 
> r~





Thread overview: 49+ messages
2021-10-19  9:47 [PATCH v3 00/21] Adding partial support for 128-bit riscv target Frédéric Pétrot
2021-10-19  9:47 ` [PATCH v3 01/21] memory: change define name for consistency Frédéric Pétrot
2021-10-20 15:07   ` Philippe Mathieu-Daudé
2021-10-19  9:47 ` [PATCH v3 02/21] memory: add a few defines for octo (128-bit) values Frédéric Pétrot
2021-10-19 18:00   ` Richard Henderson
2021-10-19  9:47 ` [PATCH v3 03/21] Int128.h: addition of a few 128-bit operations Frédéric Pétrot
2021-10-19 18:15   ` Richard Henderson
2021-10-19  9:47 ` [PATCH v3 04/21] target/riscv: additional macros to check instruction support Frédéric Pétrot
2021-10-20 14:08   ` Richard Henderson
2021-10-21 16:22     ` Frédéric Pétrot
2021-10-19  9:47 ` [PATCH v3 05/21] target/riscv: separation of bitwise logic and aritmetic helpers Frédéric Pétrot
2021-10-20 14:14   ` Richard Henderson
2021-10-19  9:47 ` [PATCH v3 06/21] target/riscv: array for the 64 upper bits of 128-bit registers Frédéric Pétrot
2021-10-20 14:44   ` Richard Henderson
2021-10-22  6:06     ` Frédéric Pétrot
2021-10-19  9:47 ` [PATCH v3 07/21] target/riscv: setup everything so that riscv128-softmmu compiles Frédéric Pétrot
2021-10-20 14:57   ` Richard Henderson
2021-10-19  9:47 ` [PATCH v3 08/21] target/riscv: adding accessors to the registers upper part Frédéric Pétrot
2021-10-20 15:09   ` Richard Henderson
2021-10-19  9:48 ` [PATCH v3 09/21] target/riscv: moving some insns close to similar insns Frédéric Pétrot
2021-10-20 15:11   ` Richard Henderson
2021-10-19  9:48 ` [PATCH v3 10/21] target/riscv: support for 128-bit loads and store Frédéric Pétrot
2021-10-20 17:31   ` Richard Henderson
2021-10-19  9:48 ` [PATCH v3 11/21] target/riscv: support for 128-bit bitwise instructions Frédéric Pétrot
2021-10-20 17:47   ` Richard Henderson
2021-10-20 19:18     ` Frédéric Pétrot
2021-10-19  9:48 ` [PATCH v3 12/21] target/riscv: support for 128-bit U-type instructions Frédéric Pétrot
2021-10-20 17:59   ` Richard Henderson
2021-10-19  9:48 ` [PATCH v3 13/21] target/riscv: support for 128-bit shift instructions Frédéric Pétrot
2021-10-20 19:06   ` Richard Henderson
2021-10-24 22:49     ` Frédéric Pétrot
2021-10-19  9:48 ` [PATCH v3 14/21] target/riscv: support for 128-bit arithmetic instructions Frédéric Pétrot
2021-10-20 20:15   ` Richard Henderson
2021-10-19  9:48 ` [PATCH v3 15/21] target/riscv: support for 128-bit M extension Frédéric Pétrot
2021-10-20 20:58   ` Richard Henderson
2021-10-19  9:48 ` [PATCH v3 16/21] target/riscv: adding high part of some csrs Frédéric Pétrot
2021-10-20 21:38   ` Richard Henderson
2021-10-20 23:03   ` Richard Henderson
2021-10-19  9:48 ` [PATCH v3 17/21] target/riscv: helper functions to wrap calls to 128-bit csr insns Frédéric Pétrot
2021-10-20 21:47   ` Richard Henderson
2021-10-19  9:48 ` [PATCH v3 18/21] target/riscv: modification of the trans_csrxx for 128-bit support Frédéric Pétrot
2021-10-20 21:53   ` Richard Henderson
2021-10-19  9:48 ` [PATCH v3 19/21] target/riscv: actual functions to realize crs 128-bit insns Frédéric Pétrot
2021-10-20 22:18   ` Richard Henderson
2021-10-19  9:48 ` [PATCH v3 20/21] target/riscv: adding 128-bit access functions for some csrs Frédéric Pétrot
2021-10-20 23:18   ` Richard Henderson
2021-10-19  9:48 ` [PATCH v3 21/21] target/riscv: support for 128-bit satp Frédéric Pétrot
2021-10-20 23:09   ` Richard Henderson
2021-10-21 11:12     ` Frédéric Pétrot
