qemu-devel.nongnu.org archive mirror
* [PATCH 1/8] target/riscv: Settings for 128-bit extension support
@ 2021-08-30 17:16 Frédéric Pétrot
  2021-08-30 17:16 ` [PATCH 2/8] target/riscv: 128-bit registers creation and access Frédéric Pétrot
                   ` (7 more replies)
  0 siblings, 8 replies; 22+ messages in thread
From: Frédéric Pétrot @ 2021-08-30 17:16 UTC
  To: qemu-devel, qemu-riscv
  Cc: Alex Bennée, Bin Meng, Alistair Francis, Fabien Portas,
	Palmer Dabbelt, Frédéric Pétrot,
	Philippe Mathieu-Daudé

Starting 128-bit extension support requires a few modifications to the
existing sources, because 32-bit is currently detected by checking that
the CPU is not 64-bit, and vice versa.
We now consider the three possible xlen values, so that the existing
targets still compile correctly and the build framework can also handle
the riscv128-softmmu target.
This includes the gdb configuration files, which are plain copies of the
64-bit ones since gdb does not yet support 128-bit CPUs.
To handle the three xlen values, we add a misah field that represents the
upper 64 bits of the misa register.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
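Illustration only, not part of the patch: a minimal standalone sketch of
why a single-bit test stops working once a third xlen value exists. The
MXL_* names and the single 64-bit word are assumptions made for the
example; the patch itself keeps the RV128 value in misah and uses
MXLEN_MASK, RV32, RV64 and RV128.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the patch's macros, fixed to a 64-bit word. */
#define CSR_BITS   64
#define MXL_SHIFT  (CSR_BITS - 2)
#define MXL_MASK   ((uint64_t)3 << MXL_SHIFT)
#define MXL_RV32   ((uint64_t)1 << MXL_SHIFT)
#define MXL_RV64   ((uint64_t)2 << MXL_SHIFT)
#define MXL_RV128  ((uint64_t)3 << MXL_SHIFT)

/* Old style: tests one bit only, so it also fires for the RV128 encoding. */
static bool is_32bit_bit_test(uint64_t csr)
{
    return (csr & MXL_RV32) == MXL_RV32;
}

/* New style: compares the whole two-bit field. */
static bool is_32bit_masked(uint64_t csr)
{
    return (csr & MXL_MASK) == MXL_RV32;
}

/* Decode the two-bit field into a register width in bits (0 if unset). */
static unsigned mxl_to_xlen(uint64_t csr)
{
    unsigned mxl = (unsigned)((csr & MXL_MASK) >> MXL_SHIFT);

    assert(mxl <= 3);
    return mxl ? 16u << mxl : 0;
}
```

With these definitions, is_32bit_bit_test(MXL_RV128) returns true while
is_32bit_masked(MXL_RV128) returns false, which is the distinction the
masked comparisons in this patch rely on.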
 configs/devices/riscv128-softmmu/default.mak | 16 ++++++
 configs/targets/riscv128-softmmu.mak         |  5 ++
 gdb-xml/riscv-128bit-cpu.xml                 | 48 ++++++++++++++++++
 gdb-xml/riscv-128bit-virtual.xml             | 12 +++++
 include/hw/riscv/sifive_cpu.h                |  4 ++
 target/riscv/Kconfig                         |  3 ++
 target/riscv/arch_dump.c                     |  3 +-
 target/riscv/cpu-param.h                     |  3 +-
 target/riscv/cpu.c                           | 51 +++++++++++++++++---
 target/riscv/cpu.h                           | 19 ++++++++
 target/riscv/gdbstub.c                       |  3 ++
 target/riscv/insn_trans/trans_rvd.c.inc      | 10 ++--
 target/riscv/insn_trans/trans_rvf.c.inc      |  2 +-
 target/riscv/translate.c                     | 45 ++++++++++++++++-
 14 files changed, 209 insertions(+), 15 deletions(-)
 create mode 100644 configs/devices/riscv128-softmmu/default.mak
 create mode 100644 configs/targets/riscv128-softmmu.mak
 create mode 100644 gdb-xml/riscv-128bit-cpu.xml
 create mode 100644 gdb-xml/riscv-128bit-virtual.xml

diff --git a/configs/devices/riscv128-softmmu/default.mak b/configs/devices/riscv128-softmmu/default.mak
new file mode 100644
index 0000000000..31439dbcfe
--- /dev/null
+++ b/configs/devices/riscv128-softmmu/default.mak
@@ -0,0 +1,16 @@
+# Default configuration for riscv128-softmmu
+
+# Uncomment the following lines to disable these optional devices:
+#
+#CONFIG_PCI_DEVICES=n
+CONFIG_SEMIHOSTING=y
+CONFIG_ARM_COMPATIBLE_SEMIHOSTING=y
+
+# Boards:
+#
+CONFIG_SPIKE=n
+CONFIG_SIFIVE_E=n
+CONFIG_SIFIVE_U=n
+CONFIG_RISCV_VIRT=y
+CONFIG_MICROCHIP_PFSOC=n
+CONFIG_SHAKTI_C=n
diff --git a/configs/targets/riscv128-softmmu.mak b/configs/targets/riscv128-softmmu.mak
new file mode 100644
index 0000000000..e300c43c8e
--- /dev/null
+++ b/configs/targets/riscv128-softmmu.mak
@@ -0,0 +1,5 @@
+TARGET_ARCH=riscv128
+TARGET_BASE_ARCH=riscv
+TARGET_SUPPORTS_MTTCG=y
+TARGET_XML_FILES= gdb-xml/riscv-128bit-cpu.xml gdb-xml/riscv-32bit-fpu.xml gdb-xml/riscv-64bit-fpu.xml gdb-xml/riscv-128bit-virtual.xml
+TARGET_NEED_FDT=y
diff --git a/gdb-xml/riscv-128bit-cpu.xml b/gdb-xml/riscv-128bit-cpu.xml
new file mode 100644
index 0000000000..c98168148f
--- /dev/null
+++ b/gdb-xml/riscv-128bit-cpu.xml
@@ -0,0 +1,48 @@
+<?xml version="1.0"?>
+<!-- Copyright (C) 2018-2019 Free Software Foundation, Inc.
+
+     Copying and distribution of this file, with or without modification,
+     are permitted in any medium without royalty provided the copyright
+     notice and this notice are preserved.  -->
+
+<!-- Register numbers are hard-coded in order to maintain backward
+     compatibility with older versions of tools that didn't use xml
+     register descriptions.  -->
+
+<!DOCTYPE feature SYSTEM "gdb-target.dtd">
+<!-- FIXME : All GPRs are marked as 64-bits since gdb doesn't like 128-bit registers for now. -->
+<feature name="org.gnu.gdb.riscv.cpu">
+  <reg name="zero" bitsize="64" type="int" regnum="0"/>
+  <reg name="ra" bitsize="64" type="code_ptr"/>
+  <reg name="sp" bitsize="64" type="data_ptr"/>
+  <reg name="gp" bitsize="64" type="data_ptr"/>
+  <reg name="tp" bitsize="64" type="data_ptr"/>
+  <reg name="t0" bitsize="64" type="int"/>
+  <reg name="t1" bitsize="64" type="int"/>
+  <reg name="t2" bitsize="64" type="int"/>
+  <reg name="fp" bitsize="64" type="data_ptr"/>
+  <reg name="s1" bitsize="64" type="int"/>
+  <reg name="a0" bitsize="64" type="int"/>
+  <reg name="a1" bitsize="64" type="int"/>
+  <reg name="a2" bitsize="64" type="int"/>
+  <reg name="a3" bitsize="64" type="int"/>
+  <reg name="a4" bitsize="64" type="int"/>
+  <reg name="a5" bitsize="64" type="int"/>
+  <reg name="a6" bitsize="64" type="int"/>
+  <reg name="a7" bitsize="64" type="int"/>
+  <reg name="s2" bitsize="64" type="int"/>
+  <reg name="s3" bitsize="64" type="int"/>
+  <reg name="s4" bitsize="64" type="int"/>
+  <reg name="s5" bitsize="64" type="int"/>
+  <reg name="s6" bitsize="64" type="int"/>
+  <reg name="s7" bitsize="64" type="int"/>
+  <reg name="s8" bitsize="64" type="int"/>
+  <reg name="s9" bitsize="64" type="int"/>
+  <reg name="s10" bitsize="64" type="int"/>
+  <reg name="s11" bitsize="64" type="int"/>
+  <reg name="t3" bitsize="64" type="int"/>
+  <reg name="t4" bitsize="64" type="int"/>
+  <reg name="t5" bitsize="64" type="int"/>
+  <reg name="t6" bitsize="64" type="int"/>
+  <reg name="pc" bitsize="64" type="code_ptr"/>
+</feature>
diff --git a/gdb-xml/riscv-128bit-virtual.xml b/gdb-xml/riscv-128bit-virtual.xml
new file mode 100644
index 0000000000..db9a0ff677
--- /dev/null
+++ b/gdb-xml/riscv-128bit-virtual.xml
@@ -0,0 +1,12 @@
+<?xml version="1.0"?>
+<!-- Copyright (C) 2018-2019 Free Software Foundation, Inc.
+
+     Copying and distribution of this file, with or without modification,
+     are permitted in any medium without royalty provided the copyright
+     notice and this notice are preserved.  -->
+
+<!DOCTYPE feature SYSTEM "gdb-target.dtd">
+<!-- FIXME : priv marked as 64-bits since gdb doesn't like 128-bit registers for now. -->
+<feature name="org.gnu.gdb.riscv.virtual">
+  <reg name="priv" bitsize="64"/>
+</feature>
diff --git a/include/hw/riscv/sifive_cpu.h b/include/hw/riscv/sifive_cpu.h
index 136799633a..2fd441664f 100644
--- a/include/hw/riscv/sifive_cpu.h
+++ b/include/hw/riscv/sifive_cpu.h
@@ -26,6 +26,10 @@
 #elif defined(TARGET_RISCV64)
 #define SIFIVE_E_CPU TYPE_RISCV_CPU_SIFIVE_E51
 #define SIFIVE_U_CPU TYPE_RISCV_CPU_SIFIVE_U54
+#elif defined(TARGET_RISCV128)
+/* 128-bit uses 64-bit CPU for now, since no cpu implements RV128 */
+#define SIFIVE_E_CPU TYPE_RISCV_CPU_SIFIVE_E51
+#define SIFIVE_U_CPU TYPE_RISCV_CPU_SIFIVE_U54
 #endif
 
 #endif /* HW_SIFIVE_CPU_H */
diff --git a/target/riscv/Kconfig b/target/riscv/Kconfig
index b9e5932f13..f9ea52a59a 100644
--- a/target/riscv/Kconfig
+++ b/target/riscv/Kconfig
@@ -3,3 +3,6 @@ config RISCV32
 
 config RISCV64
     bool
+
+config RISCV128
+    bool
diff --git a/target/riscv/arch_dump.c b/target/riscv/arch_dump.c
index 709f621d82..f756ed2988 100644
--- a/target/riscv/arch_dump.c
+++ b/target/riscv/arch_dump.c
@@ -176,7 +176,8 @@ int cpu_get_dump_info(ArchDumpInfo *info,
 
     info->d_machine = EM_RISCV;
 
-#if defined(TARGET_RISCV64)
+#if defined(TARGET_RISCV64) || defined(TARGET_RISCV128)
+    /* FIXME : No 128-bit ELF class exists (for now), use 64-bit one. */
     info->d_class = ELFCLASS64;
 #else
     info->d_class = ELFCLASS32;
diff --git a/target/riscv/cpu-param.h b/target/riscv/cpu-param.h
index 80eb615f93..e6d0651f60 100644
--- a/target/riscv/cpu-param.h
+++ b/target/riscv/cpu-param.h
@@ -8,7 +8,8 @@
 #ifndef RISCV_CPU_PARAM_H
 #define RISCV_CPU_PARAM_H 1
 
-#if defined(TARGET_RISCV64)
+/* 64-bit target, since QEMU isn't built to have TARGET_LONG_BITS over 64 */
+#if defined(TARGET_RISCV64) || defined(TARGET_RISCV128)
 # define TARGET_LONG_BITS 64
 # define TARGET_PHYS_ADDR_SPACE_BITS 56 /* 44-bit PPN */
 # define TARGET_VIRT_ADDR_SPACE_BITS 48 /* sv48 */
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 991a6bb760..1f15026e9c 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -110,18 +110,38 @@ const char *riscv_cpu_get_trap_name(target_ulong cause, bool async)
 
 bool riscv_cpu_is_32bit(CPURISCVState *env)
 {
-    if (env->misa & RV64) {
-        return false;
-    }
+    return (env->misa & MXLEN_MASK) == RV32;
+}
 
-    return true;
+bool riscv_cpu_is_64bit(CPURISCVState *env)
+{
+    return (env->misa & MXLEN_MASK) == RV64;
 }
 
+#if defined(TARGET_RISCV128)
+bool riscv_cpu_is_128bit(CPURISCVState *env)
+{
+    return (env->misah & MXLEN_MASK) == RV128;
+}
+#else
+bool __attribute__((const)) riscv_cpu_is_128bit(CPURISCVState *env)
+{
+    return false;
+}
+#endif
+
 static void set_misa(CPURISCVState *env, target_ulong misa)
 {
     env->misa_mask = env->misa = misa;
 }
 
+#if defined(TARGET_RISCV128)
+static void set_misah(CPURISCVState *env, target_ulong misah)
+{
+    env->misah_mask = env->misah = misah;
+}
+#endif
+
 static void set_priv_version(CPURISCVState *env, int priv_ver)
 {
     env->priv_ver = priv_ver;
@@ -156,11 +176,22 @@ static void riscv_any_cpu_init(Object *obj)
     set_misa(env, RV32 | RVI | RVM | RVA | RVF | RVD | RVC | RVU);
 #elif defined(TARGET_RISCV64)
     set_misa(env, RV64 | RVI | RVM | RVA | RVF | RVD | RVC | RVU);
+#elif defined(TARGET_RISCV128)
+    set_misa(env, RVI | RVM | RVA | RVF | RVD | RVC | RVU);
+    set_misah(env, RV128);
 #endif
     set_priv_version(env, PRIV_VERSION_1_11_0);
 }
 
-#if defined(TARGET_RISCV64)
+#if defined(TARGET_RISCV128)
+static void rv128_base_cpu_init(Object *obj)
+{
+    CPURISCVState *env = &RISCV_CPU(obj)->env;
+    /* We set this in the realise function */
+    set_misa(env, 0);
+    set_misah(env, RV128);
+}
+#elif defined(TARGET_RISCV64)
 static void rv64_base_cpu_init(Object *obj)
 {
     CPURISCVState *env = &RISCV_CPU(obj)->env;
@@ -440,7 +471,11 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
     set_resetvec(env, cpu->cfg.resetvec);
 
     /* If only XLEN is set for misa, then set misa from properties */
-    if (env->misa == RV32 || env->misa == RV64) {
+    if (env->misa == RV32 || env->misa == RV64
+#if defined(TARGET_RISCV128)
+            || (env->misah == RV128 && env->misa == 0)
+#endif
+            ) {
         /* Do some ISA extension error checking */
         if (cpu->cfg.ext_i && cpu->cfg.ext_e) {
             error_setg(errp,
@@ -674,6 +709,8 @@ static void riscv_cpu_class_init(ObjectClass *c, void *data)
     cc->gdb_core_xml_file = "riscv-32bit-cpu.xml";
 #elif defined(TARGET_RISCV64)
     cc->gdb_core_xml_file = "riscv-64bit-cpu.xml";
+#elif defined(TARGET_RISCV128)
+    cc->gdb_core_xml_file = "riscv-128bit-cpu.xml";
 #endif
     cc->gdb_stop_before_watchpoint = true;
     cc->disas_set_info = riscv_cpu_disas_set_info;
@@ -761,6 +798,8 @@ static const TypeInfo riscv_cpu_type_infos[] = {
     DEFINE_CPU(TYPE_RISCV_CPU_SIFIVE_E51,       rv64_sifive_e_cpu_init),
     DEFINE_CPU(TYPE_RISCV_CPU_SIFIVE_U54,       rv64_sifive_u_cpu_init),
     DEFINE_CPU(TYPE_RISCV_CPU_SHAKTI_C,         rv64_sifive_u_cpu_init),
+#elif defined(TARGET_RISCV128)
+    DEFINE_CPU(TYPE_RISCV_CPU_BASE128,          rv128_base_cpu_init),
 #endif
 };
 
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index bf1c899c00..d1a73276fb 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -37,6 +37,7 @@
 #define TYPE_RISCV_CPU_ANY              RISCV_CPU_TYPE_NAME("any")
 #define TYPE_RISCV_CPU_BASE32           RISCV_CPU_TYPE_NAME("rv32")
 #define TYPE_RISCV_CPU_BASE64           RISCV_CPU_TYPE_NAME("rv64")
+#define TYPE_RISCV_CPU_BASE128          RISCV_CPU_TYPE_NAME("rv128")
 #define TYPE_RISCV_CPU_IBEX             RISCV_CPU_TYPE_NAME("lowrisc-ibex")
 #define TYPE_RISCV_CPU_SHAKTI_C         RISCV_CPU_TYPE_NAME("shakti-c")
 #define TYPE_RISCV_CPU_SIFIVE_E31       RISCV_CPU_TYPE_NAME("sifive-e31")
@@ -49,10 +50,16 @@
 # define TYPE_RISCV_CPU_BASE            TYPE_RISCV_CPU_BASE32
 #elif defined(TARGET_RISCV64)
 # define TYPE_RISCV_CPU_BASE            TYPE_RISCV_CPU_BASE64
+#elif defined(TARGET_RISCV128)
+# define TYPE_RISCV_CPU_BASE            TYPE_RISCV_CPU_BASE128
 #endif
 
+/* Mask for the MXLEN flag in the misa CSR */
+#define MXLEN_MASK ((target_ulong)3 << (TARGET_LONG_BITS - 2))
 #define RV32 ((target_ulong)1 << (TARGET_LONG_BITS - 2))
 #define RV64 ((target_ulong)2 << (TARGET_LONG_BITS - 2))
+/* To be used on misah, the upper part of misa */
+#define RV128 ((target_ulong)3 << (TARGET_LONG_BITS - 2))
 
 #define RV(x) ((target_ulong)1 << (x - 'A'))
 
@@ -187,6 +194,12 @@ struct CPURISCVState {
     target_ulong hgatp;
     uint64_t htimedelta;
 
+#if defined(TARGET_RISCV128)
+    /* Upper 64-bits of 128-bit CSRs */
+    uint64_t misah;
+    uint64_t misah_mask;
+#endif
+
     /* Virtual CSRs */
     /*
      * For RV32 this is 32-bit vsstatus and 32-bit vsstatush.
@@ -396,6 +409,12 @@ FIELD(TB_FLAGS, VILL, 8, 1)
 FIELD(TB_FLAGS, HLSX, 9, 1)
 
 bool riscv_cpu_is_32bit(CPURISCVState *env);
+bool riscv_cpu_is_64bit(CPURISCVState *env);
+#if defined(TARGET_RISCV128)
+bool riscv_cpu_is_128bit(CPURISCVState *env);
+#else
+bool riscv_cpu_is_128bit(CPURISCVState *env) __attribute__ ((const));
+#endif
 
 /*
  * A simplification for VLMAX
diff --git a/target/riscv/gdbstub.c b/target/riscv/gdbstub.c
index a7a9c0b1fe..9f75d23b16 100644
--- a/target/riscv/gdbstub.c
+++ b/target/riscv/gdbstub.c
@@ -204,6 +204,9 @@ void riscv_cpu_register_gdb_regs_for_features(CPUState *cs)
 #elif defined(TARGET_RISCV64)
     gdb_register_coprocessor(cs, riscv_gdb_get_virtual, riscv_gdb_set_virtual,
                              1, "riscv-64bit-virtual.xml", 0);
+#elif defined(TARGET_RISCV128)
+    gdb_register_coprocessor(cs, riscv_gdb_get_virtual, riscv_gdb_set_virtual,
+                             1, "riscv-128bit-virtual.xml", 0);
 #endif
 
     gdb_register_coprocessor(cs, riscv_gdb_get_csr, riscv_gdb_set_csr,
diff --git a/target/riscv/insn_trans/trans_rvd.c.inc b/target/riscv/insn_trans/trans_rvd.c.inc
index 7e45538ae0..4d430960b9 100644
--- a/target/riscv/insn_trans/trans_rvd.c.inc
+++ b/target/riscv/insn_trans/trans_rvd.c.inc
@@ -22,6 +22,7 @@ static bool trans_fld(DisasContext *ctx, arg_fld *a)
 {
     REQUIRE_FPU;
     REQUIRE_EXT(ctx, RVD);
+    REQUIRE_32_OR_64BIT(ctx);
     TCGv t0 = tcg_temp_new();
     gen_get_gpr(t0, a->rs1);
     tcg_gen_addi_tl(t0, t0, a->imm);
@@ -37,6 +38,7 @@ static bool trans_fsd(DisasContext *ctx, arg_fsd *a)
 {
     REQUIRE_FPU;
     REQUIRE_EXT(ctx, RVD);
+    REQUIRE_32_OR_64BIT(ctx);
     TCGv t0 = tcg_temp_new();
     gen_get_gpr(t0, a->rs1);
     tcg_gen_addi_tl(t0, t0, a->imm);
@@ -388,11 +390,11 @@ static bool trans_fcvt_lu_d(DisasContext *ctx, arg_fcvt_lu_d *a)
 
 static bool trans_fmv_x_d(DisasContext *ctx, arg_fmv_x_d *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     REQUIRE_FPU;
     REQUIRE_EXT(ctx, RVD);
 
-#ifdef TARGET_RISCV64
+#if defined(TARGET_RISCV64) || defined(TARGET_RISCV128)
     gen_set_gpr(a->rd, cpu_fpr[a->rs1]);
     return true;
 #else
@@ -434,11 +436,11 @@ static bool trans_fcvt_d_lu(DisasContext *ctx, arg_fcvt_d_lu *a)
 
 static bool trans_fmv_d_x(DisasContext *ctx, arg_fmv_d_x *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     REQUIRE_FPU;
     REQUIRE_EXT(ctx, RVD);
 
-#ifdef TARGET_RISCV64
+#if defined(TARGET_RISCV64) || defined(TARGET_RISCV128)
     TCGv t0 = tcg_temp_new();
     gen_get_gpr(t0, a->rs1);
 
diff --git a/target/riscv/insn_trans/trans_rvf.c.inc b/target/riscv/insn_trans/trans_rvf.c.inc
index db1c0c9974..47efa7284d 100644
--- a/target/riscv/insn_trans/trans_rvf.c.inc
+++ b/target/riscv/insn_trans/trans_rvf.c.inc
@@ -303,7 +303,7 @@ static bool trans_fmv_x_w(DisasContext *ctx, arg_fmv_x_w *a)
 
     TCGv t0 = tcg_temp_new();
 
-#if defined(TARGET_RISCV64)
+#if defined(TARGET_RISCV64) || defined(TARGET_RISCV128)
     tcg_gen_ext32s_tl(t0, cpu_fpr[a->rs1]);
 #else
     tcg_gen_extrl_i64_i32(t0, cpu_fpr[a->rs1]);
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 6983be5723..713b14da8b 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -47,7 +47,9 @@ typedef struct DisasContext {
     bool virt_enabled;
     uint32_t opcode;
     uint32_t mstatus_fs;
+    /* Type of csrs should be MXLEN, that might be dynamically settable */
     target_ulong misa;
+    uint64_t misah;
     uint32_t mem_idx;
     /* Remember the rounding mode encoded in the previous fp instruction,
        which we have already installed into env->fp_status.  Or -1 for
@@ -74,13 +76,30 @@ static inline bool has_ext(DisasContext *ctx, uint32_t ext)
 
 #ifdef TARGET_RISCV32
 # define is_32bit(ctx)  true
+# define is_64bit(ctx)  false
+# define is_128bit(ctx) false
 #elif defined(CONFIG_USER_ONLY)
 # define is_32bit(ctx)  false
+# define is_64bit(ctx)  true
+# define is_128bit(ctx) false
 #else
 static inline bool is_32bit(DisasContext *ctx)
 {
-    return (ctx->misa & RV32) == RV32;
+    return (ctx->misa & MXLEN_MASK) == RV32;
 }
+
+static inline bool is_64bit(DisasContext *ctx)
+{
+    return (ctx->misa & MXLEN_MASK) == RV64;
+}
+#if defined(TARGET_RISCV128)
+static inline bool is_128bit(DisasContext *ctx)
+{
+    return (ctx->misah & MXLEN_MASK) == RV128;
+}
+#else
+# define is_128bit(ctx) false
+#endif
 #endif
 
 /*
@@ -418,11 +437,30 @@ EX_SH(12)
 } while (0)
 
 #define REQUIRE_64BIT(ctx) do { \
-    if (is_32bit(ctx)) {        \
+    if (!is_64bit(ctx)) {       \
         return false;           \
     }                           \
 } while (0)
 
+#define REQUIRE_128BIT(ctx) do { \
+    if (!is_128bit(ctx)) {         \
+        return false;            \
+    }                            \
+} while (0)
+
+#define REQUIRE_32_OR_64BIT(ctx) do {  \
+    if (is_128bit(ctx)) {               \
+        return false;                  \
+    }                                  \
+} while (0)
+
+#define REQUIRE_64_OR_128BIT(ctx) do { \
+    if (is_32bit(ctx)) {               \
+        return false;                  \
+    }                                  \
+} while (0)
+
+
 static int ex_rvc_register(DisasContext *ctx, int reg)
 {
     return 8 + reg;
@@ -938,6 +976,9 @@ static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
     ctx->virt_enabled = false;
 #endif
     ctx->misa = env->misa;
+#if defined(TARGET_RISCV128)
+    ctx->misah = env->misah;
+#endif
     ctx->frm = -1;  /* unknown rounding mode */
     ctx->ext_ifencei = cpu->cfg.ext_ifencei;
     ctx->vlen = cpu->cfg.vlen;
-- 
2.33.0




* [PATCH 2/8] target/riscv: 128-bit registers creation and access
  2021-08-30 17:16 [PATCH 1/8] target/riscv: Settings for 128-bit extension support Frédéric Pétrot
@ 2021-08-30 17:16 ` Frédéric Pétrot
  2021-08-30 21:34   ` Philippe Mathieu-Daudé
  2021-08-30 17:16 ` [PATCH 3/8] target/riscv: Addition of 128-bit ldu, lq and sq instructions Frédéric Pétrot
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread
From: Frédéric Pétrot @ 2021-08-30 17:16 UTC
  To: qemu-devel, qemu-riscv
  Cc: Bin Meng, Alistair Francis, Frédéric Pétrot,
	Palmer Dabbelt, Fabien Portas

Add the upper 64 bits of the 128-bit registers, along with their setter
and getter, and create the corresponding TCG globals.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
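Illustration only, not part of the patch: a plain-C model of the split
register file, assuming a low array and a high array that mirror gpr[]
and gprh[], with register 0 reading as zero and ignoring writes. The
RegFile128 and U128 types and the regfile_* helpers are hypothetical; the
patch does this at the TCG level through gen_get_gprh and gen_set_gprh.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_GPRS 32

/* Hypothetical model: low and high halves kept in separate arrays. */
typedef struct {
    uint64_t lo[NUM_GPRS];  /* plays the role of gpr[]  */
    uint64_t hi[NUM_GPRS];  /* plays the role of gprh[] */
} RegFile128;

typedef struct {
    uint64_t lo;
    uint64_t hi;
} U128;

/* Read a register; x0 always reads as zero in both halves. */
static U128 regfile_get(const RegFile128 *rf, int reg)
{
    U128 v = { 0, 0 };

    assert(reg >= 0 && reg < NUM_GPRS);
    if (reg != 0) {
        v.lo = rf->lo[reg];
        v.hi = rf->hi[reg];
    }
    return v;
}

/* Write a register; writes to x0 are silently dropped. */
static void regfile_set(RegFile128 *rf, int reg, U128 v)
{
    assert(reg >= 0 && reg < NUM_GPRS);
    if (reg != 0) {
        rf->lo[reg] = v.lo;
        rf->hi[reg] = v.hi;
    }
}
```

Keeping the halves in two arrays, rather than one array of pairs, leaves
the existing 64-bit accessors untouched, which matches the approach taken
in the diff below.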
 slirp                    |  2 +-
 target/riscv/cpu.h       |  3 +++
 target/riscv/translate.c | 30 ++++++++++++++++++++++++++++++
 3 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/slirp b/slirp
index a88d9ace23..8f43a99191 160000
--- a/slirp
+++ b/slirp
@@ -1 +1 @@
-Subproject commit a88d9ace234a24ce1c17189642ef9104799425e0
+Subproject commit 8f43a99191afb47ca3f3c6972f6306209f367ece
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index d1a73276fb..6528b4540e 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -120,6 +120,9 @@ FIELD(VTYPE, VILL, sizeof(target_ulong) * 8 - 1, 1)
 
 struct CPURISCVState {
     target_ulong gpr[32];
+#if defined(TARGET_RISCV128)
+    target_ulong gprh[32]; /* 64 top bits of the 128-bit registers */
+#endif
     uint64_t fpr[32]; /* assume both F and D extensions */
 
     /* vector coprocessor state. */
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 713b14da8b..be9c64f3e4 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -33,6 +33,9 @@
 
 /* global register indices */
 static TCGv cpu_gpr[32], cpu_pc, cpu_vl;
+#if defined(TARGET_RISCV128)
+static TCGv cpu_gprh[32];
+#endif
 static TCGv_i64 cpu_fpr[32]; /* assume F and D extensions */
 static TCGv load_res;
 static TCGv load_val;
@@ -211,6 +214,17 @@ static inline void gen_get_gpr(TCGv t, int reg_num)
     }
 }
 
+#if defined(TARGET_RISCV128)
+static inline void gen_get_gprh(TCGv t, int reg_num)
+{
+    if (reg_num == 0) {
+        tcg_gen_movi_tl(t, 0);
+    } else {
+        tcg_gen_mov_tl(t, cpu_gprh[reg_num]);
+    }
+}
+#endif
+
 /* Wrapper for setting reg values - need to check of reg is zero since
  * cpu_gpr[0] is not actually allocated. this is more for safety purposes,
  * since we usually avoid calling the OP_TYPE_gen function if we see a write to
@@ -223,6 +237,15 @@ static inline void gen_set_gpr(int reg_num_dst, TCGv t)
     }
 }
 
+#if defined(TARGET_RISCV128)
+static inline void gen_set_gprh(int reg_num_dst, TCGv t)
+{
+    if (reg_num_dst != 0) {
+        tcg_gen_mov_tl(cpu_gprh[reg_num_dst], t);
+    }
+}
+#endif
+
 static void gen_mulhsu(TCGv ret, TCGv arg1, TCGv arg2)
 {
     TCGv rl = tcg_temp_new();
@@ -1074,10 +1097,17 @@ void riscv_translate_init(void)
     /* Use the gen_set_gpr and gen_get_gpr helper functions when accessing */
     /* registers, unless you specifically block reads/writes to reg 0 */
     cpu_gpr[0] = NULL;
+#if defined(TARGET_RISCV128)
+    cpu_gprh[0] = NULL;
+#endif
 
     for (i = 1; i < 32; i++) {
         cpu_gpr[i] = tcg_global_mem_new(cpu_env,
             offsetof(CPURISCVState, gpr[i]), riscv_int_regnames[i]);
+#if defined(TARGET_RISCV128)
+        cpu_gprh[i] = tcg_global_mem_new(cpu_env,
+            offsetof(CPURISCVState, gprh[i]), riscv_int_regnames[i]);
+#endif
     }
 
     for (i = 0; i < 32; i++) {
-- 
2.33.0




* [PATCH 3/8] target/riscv: Addition of 128-bit ldu, lq and sq instructions
  2021-08-30 17:16 [PATCH 1/8] target/riscv: Settings for 128-bit extension support Frédéric Pétrot
  2021-08-30 17:16 ` [PATCH 2/8] target/riscv: 128-bit registers creation and access Frédéric Pétrot
@ 2021-08-30 17:16 ` Frédéric Pétrot
  2021-08-30 21:35   ` Philippe Mathieu-Daudé
                     ` (2 more replies)
  2021-08-30 17:16 ` [PATCH 4/8] target/riscv: 128-bit arithmetic and logic instructions Frédéric Pétrot
                   ` (5 subsequent siblings)
  7 siblings, 3 replies; 22+ messages in thread
From: Frédéric Pétrot @ 2021-08-30 17:16 UTC
  To: qemu-devel, qemu-riscv
  Cc: Bin Meng, Richard Henderson, Palmer Dabbelt, Fabien Portas,
	Alistair Francis, Frédéric Pétrot

Add the load and store instructions of the 128-bit extension (ldu, lq
and sq).
These instructions take 128-bit addresses but explicitly assume that the
upper 64 bits of the address registers are zero, so they can use the
existing address translation mechanism.
Identifying 128-bit memory accesses and 64-bit signedness is handled
outside of MemOp:
MemOp reserves 2 bits for the size and a contiguous third bit for the
sign, so we cannot simply use the value 4 to indicate a size of 16 bytes.
Additionally, MO_TEQ | MO_SIGN seems to be a sentinel value, and using it
triggers a QEMU assertion failure.
Changing the existing MemOp encoding has an impact on QEMU that we
cannot fully evaluate, so we pass this information in a separate
parameter and leave MemOp as it is for now.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
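Illustration only, not part of the patch: a plain-C sketch of the
arithmetic that gen_load_i128 emits as TCG ops, assuming a little-endian
guest. It covers the 128-bit address computation (two 64-bit additions
with a carry), the sign extension of a 64-bit value into a 128-bit pair,
and a 128-bit load split into two 64-bit loads. All names here (U128,
add128, effective_address, load128_le) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t lo;
    uint64_t hi;
} U128;

/* Sign-extend a 64-bit value into a (lo, hi) pair. */
static U128 sext64_to_128(uint64_t v)
{
    U128 r = { v, (v >> 63) ? UINT64_MAX : 0 };
    return r;
}

/* 128-bit addition built from two 64-bit additions and a carry. */
static U128 add128(U128 a, U128 b)
{
    U128 r;

    r.lo = a.lo + b.lo;
    r.hi = a.hi + b.hi + (r.lo < a.lo);  /* carry out of the low half */
    return r;
}

/* Effective address: base register plus sign-extended 12-bit immediate. */
static U128 effective_address(U128 base, int32_t imm12)
{
    assert(imm12 >= -2048 && imm12 <= 2047);
    return add128(base, sext64_to_128((uint64_t)(int64_t)imm12));
}

/* Assemble a little-endian 64-bit value byte by byte. */
static uint64_t load64_le(const uint8_t *p)
{
    uint64_t v = 0;

    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* 128-bit load as two 64-bit loads: low half at addr, high at addr + 8. */
static U128 load128_le(const uint8_t *mem, uint64_t addr)
{
    U128 r = { load64_le(mem + addr), load64_le(mem + addr + 8) };
    return r;
}
```

Since the patch assumes the upper 64 bits of the address are zero, only
the low half of the computed address is handed to the existing 64-bit
memory access path.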
 include/tcg/tcg-op.h                    |   1 +
 target/riscv/insn16.decode              |  33 ++++-
 target/riscv/insn32.decode              |   5 +
 target/riscv/insn_trans/trans_rvi.c.inc | 188 +++++++++++++++++++++---
 tcg/tcg-op.c                            |   6 +
 5 files changed, 207 insertions(+), 26 deletions(-)

diff --git a/include/tcg/tcg-op.h b/include/tcg/tcg-op.h
index 2a654f350c..e2560784cb 100644
--- a/include/tcg/tcg-op.h
+++ b/include/tcg/tcg-op.h
@@ -726,6 +726,7 @@ static inline void tcg_gen_neg_i64(TCGv_i64 ret, TCGv_i64 arg)
 
 void tcg_gen_extu_i32_i64(TCGv_i64 ret, TCGv_i32 arg);
 void tcg_gen_ext_i32_i64(TCGv_i64 ret, TCGv_i32 arg);
+void tcg_gen_ext_i64_i128(TCGv_i64 lo, TCGv_i64 hi, TCGv_i64 arg);
 void tcg_gen_concat_i32_i64(TCGv_i64 dest, TCGv_i32 low, TCGv_i32 high);
 void tcg_gen_extrl_i64_i32(TCGv_i32 ret, TCGv_i64 arg);
 void tcg_gen_extrh_i64_i32(TCGv_i32 ret, TCGv_i64 arg);
diff --git a/target/riscv/insn16.decode b/target/riscv/insn16.decode
index 2e9212663c..165a6ed3bd 100644
--- a/target/riscv/insn16.decode
+++ b/target/riscv/insn16.decode
@@ -39,6 +39,11 @@
 %imm_addi16sp  12:s1 3:2 5:1 2:1 6:1 !function=ex_shift_4
 %imm_lui       12:s1 2:5             !function=ex_shift_12
 
+# Added for 128 bit support
+%uimm_cl_q    5:2 10:3               !function=ex_shift_3
+%uimm_6bit_lq 2:3 12:1 5:2           !function=ex_shift_3
+%uimm_6bit_sq 7:3 10:3               !function=ex_shift_3
+
 
 # Argument sets imported from insn32.decode:
 &empty                  !extern
@@ -54,16 +59,20 @@
 # Formats 16:
 @cr        ....  ..... .....  .. &r      rs2=%rs2_5       rs1=%rd     %rd
 @ci        ... . ..... .....  .. &i      imm=%imm_ci      rs1=%rd     %rd
+@cl_q      ... . .....  ..... .. &i      imm=%uimm_6bit_lq rs1=2 %rd
 @cl_d      ... ... ... .. ... .. &i      imm=%uimm_cl_d   rs1=%rs1_3  rd=%rs2_3
 @cl_w      ... ... ... .. ... .. &i      imm=%uimm_cl_w   rs1=%rs1_3  rd=%rs2_3
 @cs_2      ... ... ... .. ... .. &r      rs2=%rs2_3       rs1=%rs1_3  rd=%rs1_3
+@cs_q      ... ... ... .. ... .. &s      imm=%uimm_cl_q   rs1=%rs1_3  rs2=%rs2_3
 @cs_d      ... ... ... .. ... .. &s      imm=%uimm_cl_d   rs1=%rs1_3  rs2=%rs2_3
 @cs_w      ... ... ... .. ... .. &s      imm=%uimm_cl_w   rs1=%rs1_3  rs2=%rs2_3
 @cj        ...    ........... .. &j      imm=%imm_cj
 @cb_z      ... ... ... .. ... .. &b      imm=%imm_cb      rs1=%rs1_3  rs2=0
 
+@c_lqsp    ... . .....  ..... .. &i      imm=%uimm_6bit_lq rs1=2 %rd
 @c_ldsp    ... . .....  ..... .. &i      imm=%uimm_6bit_ld rs1=2 %rd
 @c_lwsp    ... . .....  ..... .. &i      imm=%uimm_6bit_lw rs1=2 %rd
+@c_sqsp    ... . .....  ..... .. &s      imm=%uimm_6bit_sq rs1=2 rs2=%rs2_5
 @c_sdsp    ... . .....  ..... .. &s      imm=%uimm_6bit_sd rs1=2 rs2=%rs2_5
 @c_swsp    ... . .....  ..... .. &s      imm=%uimm_6bit_sw rs1=2 rs2=%rs2_5
 @c_li      ... . .....  ..... .. &i      imm=%imm_ci rs1=0 %rd
@@ -87,9 +96,17 @@
   illegal         000  000 000 00 --- 00
   addi            000  ... ... .. ... 00 @c_addi4spn
 }
-fld               001  ... ... .. ... 00 @cl_d
+{
+  fld             001  ... ... .. ... 00 @cl_d
+  # *** RV128C specific Standard Extension (Quadrant 0) ***
+  lq              001  ... ... .. ... 00 @cl_q
+}
 lw                010  ... ... .. ... 00 @cl_w
-fsd               101  ... ... .. ... 00 @cs_d
+{
+  fsd             101  ... ... .. ... 00 @cs_d
+  # *** RV128C specific Standard Extension (Quadrant 0) ***
+  sq              101  ... ... .. ... 00 @cs_q
+}
 sw                110  ... ... .. ... 00 @cs_w
 
 # *** RV32C and RV64C specific Standard Extension (Quadrant 0) ***
@@ -132,7 +149,11 @@ addw              100 1 11 ... 01 ... 01 @cs_2
 
 # *** RV32/64C Standard Extension (Quadrant 2) ***
 slli              000 .  .....  ..... 10 @c_shift2
-fld               001 .  .....  ..... 10 @c_ldsp
+{
+  fld             001 .  .....  ..... 10 @c_ldsp
+  # *** RV128C specific Standard Extension (Quadrant 2) ***
+  lq              001  ... ... .. ... 10 @c_lqsp
+}
 {
   illegal         010 -  00000  ----- 10 # c.lwsp, RES rd=0
   lw              010 .  .....  ..... 10 @c_lwsp
@@ -147,7 +168,11 @@ fld               001 .  .....  ..... 10 @c_ldsp
   jalr            100 1  .....  00000 10 @c_jalr rd=1  # C.JALR
   add             100 1  .....  ..... 10 @cr
 }
-fsd               101   ......  ..... 10 @c_sdsp
+{
+  fsd             101   ......  ..... 10 @c_sdsp
+  # *** RV128C specific Standard Extension (Quadrant 2) ***
+  sq              101  ... ... .. ... 10 @c_sqsp
+}
 sw                110 .  .....  ..... 10 @c_swsp
 
 # *** RV32C and RV64C specific Standard Extension (Quadrant 2) ***
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index f09f8d5faf..225669e277 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -162,6 +162,11 @@ sllw     0000000 .....  ..... 001 ..... 0111011 @r
 srlw     0000000 .....  ..... 101 ..... 0111011 @r
 sraw     0100000 .....  ..... 101 ..... 0111011 @r
 
+# *** RV128I Base Instruction Set (in addition to RV64I) ***
+ldu      ............   ..... 111 ..... 0000011 @i
+lq       ............   ..... 010 ..... 0001111 @i
+sq       ............   ..... 100 ..... 0100011 @s
+
 # *** RV32M Standard Extension ***
 mul      0000001 .....  ..... 000 ..... 0110011 @r
 mulh     0000001 .....  ..... 001 ..... 0110011 @r
diff --git a/target/riscv/insn_trans/trans_rvi.c.inc b/target/riscv/insn_trans/trans_rvi.c.inc
index 6e736c9d0d..772330a766 100644
--- a/target/riscv/insn_trans/trans_rvi.c.inc
+++ b/target/riscv/insn_trans/trans_rvi.c.inc
@@ -141,7 +141,7 @@ static bool trans_bgeu(DisasContext *ctx, arg_bgeu *a)
     return gen_branch(ctx, a, TCG_COND_GEU);
 }
 
-static bool gen_load(DisasContext *ctx, arg_lb *a, MemOp memop)
+static bool gen_load_tl(DisasContext *ctx, arg_lb *a, MemOp memop)
 {
     TCGv t0 = tcg_temp_new();
     TCGv t1 = tcg_temp_new();
@@ -155,32 +155,133 @@ static bool gen_load(DisasContext *ctx, arg_lb *a, MemOp memop)
     return true;
 }
 
+#if defined(TARGET_RISCV128)
+/*
+ * Accessing signed 64-bit or 128-bit values should be part of MemOp in
+ * include/exec/memop.h.
+ * Unfortunately, this requires changing the defines there, as MO_SIGN is 4
+ * and values 0 to 3 are the usual type sizes.
+ * Note that an assert is triggered when MemOp is MO_SIGN|MO_TEQ, this value
+ * being some kind of sentinel.
+ * Changing MemOp values is too involved given our understanding, so we
+ * use our own way to deal locally with zero- or sign-extended
+ * 64-bit values, and 128-bit values.
+ * Doing this implies adding a preprocessor conditional in all memory access
+ * functions to avoid penalizing 32- and 64-bit accesses.
+ */
+typedef enum XMemOp {
+    XMO_NOP   = 0, /* MemOp rules! */
+    XMO_TEUQ  = 1, /* Zero-extended 64-bit access */
+    XMO_TET   = 2, /* 128-bit (T integer format) access */
+    XMO_SIGN  = 4, /* Sign-extended 64-bit access */
+    XMO_TESQ  = XMO_TEUQ | XMO_SIGN,
+} XMemOp;
+
+static bool gen_load_i128(DisasContext *ctx, arg_lb *a, MemOp memop, XMemOp xm)
+{
+    if (is_128bit(ctx)) {
+        TCGv rs1l = tcg_temp_new();
+        TCGv rs1h = tcg_temp_new();
+        TCGv rdl = tcg_temp_new();
+        TCGv rdh = tcg_temp_new();
+        TCGv imml = tcg_temp_new();
+        TCGv immh = tcg_const_tl((a->imm & 0b100000000000)
+                                   ? 0xffffffffffffffff : 0);
+
+        gen_get_gpr(rs1l, a->rs1);
+        gen_get_gprh(rs1h, a->rs1);
+
+        /* Build a 128-bit address */
+        tcg_gen_movi_tl(imml, a->imm);
+        tcg_gen_add2_tl(rs1l, rs1h, rs1l, rs1h, imml, immh);
+        /* TODO: should assert that rs1h == 0 for now */
+
+        if (xm != XMO_TET) {
+            tcg_gen_qemu_ld_tl(rdl, rs1l, ctx->mem_idx, memop);
+            if ((memop & MO_SIGN) || (xm & XMO_SIGN)) {
+                tcg_gen_ext_i64_i128(rdl, rdh, rdl);
+            } else {
+                tcg_gen_movi_tl(rdh, 0);
+            }
+        } else {
+            tcg_gen_qemu_ld_tl(memop & MO_BSWAP ? rdh : rdl, rs1l,
+                               ctx->mem_idx, MO_TEQ);
+            tcg_gen_movi_tl(imml, 8);
+            tcg_gen_movi_tl(immh, 0);
+            tcg_gen_add2_tl(rs1l, rs1h, rs1l, rs1h, imml, immh);
+            /* TODO: should assert that rs1h == 0 for now */
+            tcg_gen_qemu_ld_tl(memop & MO_BSWAP ? rdl : rdh, rs1l,
+                               ctx->mem_idx, MO_TEQ);
+        }
+
+        gen_set_gpr(a->rd, rdl);
+        gen_set_gprh(a->rd, rdh);
+
+        tcg_temp_free(rs1l);
+        tcg_temp_free(rs1h);
+        tcg_temp_free(rdl);
+        tcg_temp_free(rdh);
+        tcg_temp_free(imml);
+        tcg_temp_free(immh);
+        return true;
+    }
+    return gen_load_tl(ctx, a, memop);
+}
+#define gen_load(ctx, a, memop, xmemop) gen_load_i128(ctx, a, memop, xmemop)
+#else
+#define gen_load(ctx, a, memop, xmemop) gen_load_tl(ctx, a, memop)
+#endif
+
 static bool trans_lb(DisasContext *ctx, arg_lb *a)
 {
-    return gen_load(ctx, a, MO_SB);
+    return gen_load(ctx, a, MO_SB, XMO_NOP);
 }
 
 static bool trans_lh(DisasContext *ctx, arg_lh *a)
 {
-    return gen_load(ctx, a, MO_TESW);
+    return gen_load(ctx, a, MO_TESW, XMO_NOP);
 }
 
 static bool trans_lw(DisasContext *ctx, arg_lw *a)
 {
-    return gen_load(ctx, a, MO_TESL);
+    return gen_load(ctx, a, MO_TESL, XMO_NOP);
+}
+
+static bool trans_ld(DisasContext *ctx, arg_ld *a)
+{
+    REQUIRE_64_OR_128BIT(ctx);
+    return gen_load(ctx, a, MO_TEQ, XMO_TESQ);
+}
+
+static bool trans_lq(DisasContext *ctx, arg_lq *a)
+{
+    REQUIRE_128BIT(ctx);
+    return gen_load(ctx, a, MO_TEQ, XMO_TET);
 }
 
 static bool trans_lbu(DisasContext *ctx, arg_lbu *a)
 {
-    return gen_load(ctx, a, MO_UB);
+    return gen_load(ctx, a, MO_UB, XMO_NOP);
 }
 
 static bool trans_lhu(DisasContext *ctx, arg_lhu *a)
 {
-    return gen_load(ctx, a, MO_TEUW);
+    return gen_load(ctx, a, MO_TEUW, XMO_NOP);
+}
+
+static bool trans_lwu(DisasContext *ctx, arg_lwu *a)
+{
+    REQUIRE_64_OR_128BIT(ctx);
+    return gen_load(ctx, a, MO_TEUL, XMO_NOP);
 }
 
-static bool gen_store(DisasContext *ctx, arg_sb *a, MemOp memop)
+static bool trans_ldu(DisasContext *ctx, arg_ldu* a)
+{
+    REQUIRE_128BIT(ctx);
+    return gen_load(ctx, a, MO_TEQ, XMO_TEUQ);
+}
+
+static bool gen_store_tl(DisasContext *ctx, arg_sb *a, MemOp memop)
 {
     TCGv t0 = tcg_temp_new();
     TCGv dat = tcg_temp_new();
@@ -194,38 +295,81 @@ static bool gen_store(DisasContext *ctx, arg_sb *a, MemOp memop)
     return true;
 }
 
+#if defined(TARGET_RISCV128)
+static bool gen_store_i128(DisasContext *ctx, arg_sb *a, MemOp memop, XMemOp xm)
+{
+    if (is_128bit(ctx)) {
+        TCGv rs1l = tcg_temp_new();
+        TCGv rs1h = tcg_temp_new();
+        TCGv rs2l = tcg_temp_new();
+        TCGv rs2h = tcg_temp_new();
+        TCGv imml = tcg_temp_new();
+        TCGv immh = tcg_const_tl((a->imm & 0b100000000000)
+                                  ? 0xffffffffffffffff
+                                  : 0);
+
+        gen_get_gpr(rs1l, a->rs1);
+        gen_get_gprh(rs1h, a->rs1);
+        gen_get_gpr(rs2l, a->rs2);
+        gen_get_gprh(rs2h, a->rs2);
+        /* Build a 128-bit address */
+        tcg_gen_movi_tl(imml, a->imm);
+        tcg_gen_add2_tl(rs1l, rs1h, rs1l, rs1h, imml, immh);
+        /* TODO: should assert that rs1h == 0 for now */
+
+        if (xm != XMO_TET) {
+            tcg_gen_qemu_st_tl(rs2l, rs1l, ctx->mem_idx, memop);
+        } else {
+            tcg_gen_qemu_st_tl(memop & MO_BSWAP ? rs2h : rs2l, rs1l,
+                               ctx->mem_idx, MO_TEQ);
+            tcg_gen_movi_tl(imml, 8);
+            tcg_gen_movi_tl(immh, 0);
+            tcg_gen_add2_tl(rs1l, rs1h, rs1l, rs1h, imml, immh);
+            /* TODO: should assert that rs1h == 0 for now */
+            tcg_gen_qemu_st_tl(memop & MO_BSWAP ? rs2l : rs2h, rs1l,
+                               ctx->mem_idx, MO_TEQ);
+        }
+
+        tcg_temp_free(rs1l);
+        tcg_temp_free(rs1h);
+        tcg_temp_free(rs2l);
+        tcg_temp_free(rs2h);
+        tcg_temp_free(imml);
+        tcg_temp_free(immh);
+        return true;
+    }
+    return gen_store_tl(ctx, a, memop);
+}
+#define gen_store(ctx, a, memop, xmemop) gen_store_i128(ctx, a, memop, xmemop)
+#else
+#define gen_store(ctx, a, memop, xmemop) gen_store_tl(ctx, a, memop)
+#endif
 
 static bool trans_sb(DisasContext *ctx, arg_sb *a)
 {
-    return gen_store(ctx, a, MO_SB);
+    return gen_store(ctx, a, MO_SB, XMO_NOP);
 }
 
 static bool trans_sh(DisasContext *ctx, arg_sh *a)
 {
-    return gen_store(ctx, a, MO_TESW);
+    return gen_store(ctx, a, MO_TESW, XMO_NOP);
 }
 
 static bool trans_sw(DisasContext *ctx, arg_sw *a)
 {
-    return gen_store(ctx, a, MO_TESL);
+    return gen_store(ctx, a, MO_TESL, XMO_NOP);
 }
 
-static bool trans_lwu(DisasContext *ctx, arg_lwu *a)
-{
-    REQUIRE_64BIT(ctx);
-    return gen_load(ctx, a, MO_TEUL);
-}
-
-static bool trans_ld(DisasContext *ctx, arg_ld *a)
+static bool trans_sd(DisasContext *ctx, arg_sd *a)
 {
-    REQUIRE_64BIT(ctx);
-    return gen_load(ctx, a, MO_TEQ);
+    REQUIRE_64_OR_128BIT(ctx);
+    return gen_store(ctx, a, MO_TEQ, XMO_NOP);
 }
 
-static bool trans_sd(DisasContext *ctx, arg_sd *a)
+static bool trans_sq(DisasContext *ctx, arg_sq *a)
 {
-    REQUIRE_64BIT(ctx);
-    return gen_store(ctx, a, MO_TEQ);
+    REQUIRE_128BIT(ctx);
+    return gen_store(ctx, a, MO_TEQ, XMO_TET);
 }
 
 static bool trans_addi(DisasContext *ctx, arg_addi *a)
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index c754396575..c1e9ba8309 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -2655,6 +2655,12 @@ void tcg_gen_ext_i32_i64(TCGv_i64 ret, TCGv_i32 arg)
     }
 }
 
+void tcg_gen_ext_i64_i128(TCGv_i64 lo, TCGv_i64 hi, TCGv_i64 arg)
+{
+    tcg_gen_mov_i64(lo, arg);
+    tcg_gen_sari_i64(hi, arg, 63);
+}
+
 void tcg_gen_concat_i32_i64(TCGv_i64 dest, TCGv_i32 low, TCGv_i32 high)
 {
     TCGv_i64 tmp;
-- 
2.33.0




* [PATCH 4/8] target/riscv: 128-bit arithmetic and logic instructions
  2021-08-30 17:16 [PATCH 1/8] target/riscv: Settings for 128-bit extension support Frédéric Pétrot
  2021-08-30 17:16 ` [PATCH 2/8] target/riscv: 128-bit registers creation and access Frédéric Pétrot
  2021-08-30 17:16 ` [PATCH 3/8] target/riscv: Addition of 128-bit ldu, lq and sq instructions Frédéric Pétrot
@ 2021-08-30 17:16 ` Frédéric Pétrot
  2021-08-30 21:38   ` Philippe Mathieu-Daudé
  2021-08-31  3:30   ` Richard Henderson
  2021-08-30 17:16 ` [PATCH 5/8] target/riscv: 128-bit multiply and divide Frédéric Pétrot
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 22+ messages in thread
From: Frédéric Pétrot @ 2021-08-30 17:16 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Bin Meng, Alistair Francis, Frédéric Pétrot,
	Palmer Dabbelt, Fabien Portas

Add support for the 128-bit arithmetic and logic instructions.
All (I) instructions now act on 128-bit registers, a few new ones are
added to handle values held on 64 bits within the 128-bit registers, and
the ones that handle 32-bit values must also be modified for proper sign
extension.
Most algorithms are taken from Hacker's Delight.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
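For reference, the double-length left shift of section 2.17 of Hacker's
Delight, which the shift translations in this patch follow, can be sketched
in plain C as below. This is only an illustration under stated assumptions:
shl64, shr64, dshl128 and dshl128_check are hypothetical names that do not
exist in QEMU, and the reference check assumes a host compiler providing
unsigned __int128 (GCC or Clang).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 64-bit left shift that yields 0 outside [0, 63]. */
static uint64_t shl64(uint64_t x, int n)
{
    return (n < 0 || n > 63) ? 0 : x << n;
}

/* Hypothetical helper: 64-bit logical right shift, 0 outside [0, 63]. */
static uint64_t shr64(uint64_t x, int n)
{
    return (n < 0 || n > 63) ? 0 : x >> n;
}

/*
 * y1:y0 = x1:x0 << n for 0 <= n <= 127:
 *   y1 = x1 << n | x0 u>> (64 - n) | x0 << (n - 64)
 *   y0 = x0 << n
 */
static void dshl128(uint64_t x1, uint64_t x0, int n,
                    uint64_t *y1, uint64_t *y0)
{
    *y1 = shl64(x1, n) | shr64(x0, 64 - n) | shl64(x0, n - 64);
    *y0 = shl64(x0, n);
}

/* Compare against the compiler's own 128-bit shift for every amount. */
void dshl128_check(void)
{
    const uint64_t hi = 0x0123456789abcdefULL;
    const uint64_t lo = 0xfedcba9876543210ULL;
    const unsigned __int128 x = ((unsigned __int128)hi << 64) | lo;

    for (int n = 0; n < 128; n++) {
        uint64_t y1, y0;
        unsigned __int128 ref = x << n;

        dshl128(hi, lo, n, &y1, &y0);
        assert(y1 == (uint64_t)(ref >> 64));
        assert(y0 == (uint64_t)ref);
    }
}
```

The two helpers return 0 for out-of-range amounts, which is the same role
gen_shift_mod128 plays in the register-shift translations: at most two of
the three terms of y1 contribute for any given n.
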
 target/riscv/insn32.decode              |  13 +
 target/riscv/insn_trans/trans_rvi.c.inc | 955 +++++++++++++++++++++++-
 target/riscv/translate.c                |  25 +
 3 files changed, 976 insertions(+), 17 deletions(-)
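
The 128-bit comparisons are built from a double-word subtraction followed by
a sign-bit test on the high half, using the comparison predicates given in
Hacker's Delight. A minimal C sketch of the signed and unsigned "less than"
cases follows; lt128, ltu128 and cmp128_check are hypothetical names used
only for illustration, not functions from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Signed x < y, with x and y given as (high, low) 64-bit halves. */
static bool lt128(uint64_t xh, uint64_t xl, uint64_t yh, uint64_t yl)
{
    /* High half of x - y, including the borrow out of the low half */
    uint64_t dh = xh - yh - (xl < yl);
    /* Sign bit of (x - y) ^ ((x ^ y) & ((x - y) ^ x)) */
    uint64_t t = dh ^ ((xh ^ yh) & (dh ^ xh));

    return t >> 63;
}

/* Unsigned x < y, same operand layout. */
static bool ltu128(uint64_t xh, uint64_t xl, uint64_t yh, uint64_t yl)
{
    uint64_t dh = xh - yh - (xl < yl);
    /* Sign bit of (~x & y) | (~(x ^ y) & (x - y)) */
    uint64_t t = (~xh & yh) | (~(xh ^ yh) & dh);

    return t >> 63;
}

/* A few spot checks on values that straddle the 64-bit boundary. */
void cmp128_check(void)
{
    const uint64_t ones = UINT64_MAX;

    assert(lt128(ones, ones, 0, 0));      /* -1 < 0 (signed) */
    assert(!lt128(0, 0, ones, ones));     /* !(0 < -1) (signed) */
    assert(!ltu128(ones, ones, 0, 0));    /* !(2^128 - 1 < 0) (unsigned) */
    assert(ltu128(0, ones, 1, 0));        /* 2^64 - 1 < 2^64 (unsigned) */
    assert(!lt128(3, 4, 3, 4));           /* equal values */
}
```

Only the high half of the difference is inspected, so the low half is needed
solely for its borrow. GE and GEU are then the logical negations of these
two results.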

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 225669e277..2fe7e1dd36 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -22,6 +22,7 @@
 %rs1       15:5
 %rd        7:5
 %sh5       20:5
+%sh6       20:6
 
 %sh7    20:7
 %csr    20:12
@@ -91,6 +92,9 @@
 # Formats 64:
 @sh5     .......  ..... .....  ... ..... ....... &shift  shamt=%sh5      %rs1 %rd
 
+# Formats 128:
+@sh6       ...... ...... ..... ... ..... ....... &shift shamt=%sh6 %rs1 %rd
+
 # *** Privileged Instructions ***
 ecall       000000000000     00000 000 00000 1110011
 ebreak      000000000001     00000 000 00000 1110011
@@ -166,6 +170,15 @@ sraw     0100000 .....  ..... 101 ..... 0111011 @r
 ldu      ............   ..... 111 ..... 0000011 @i
 lq       ............   ..... 010 ..... 0001111 @i
 sq       ............   ..... 100 ..... 0100011 @s
+addid    ............  .....  000 ..... 1011011 @i
+sllid    000000 ......  ..... 001 ..... 1011011 @sh6
+srlid    000000 ......  ..... 101 ..... 1011011 @sh6
+sraid    010000 ......  ..... 101 ..... 1011011 @sh6
+addd     0000000 ..... .....  000 ..... 1111011 @r
+subd     0100000 ..... .....  000 ..... 1111011 @r
+slld     0000000 ..... .....  001 ..... 1111011 @r
+srld     0000000 ..... .....  101 ..... 1111011 @r
+srad     0100000 ..... .....  101 ..... 1111011 @r
 
 # *** RV32M Standard Extension ***
 mul      0000001 .....  ..... 000 ..... 0110011 @r
diff --git a/target/riscv/insn_trans/trans_rvi.c.inc b/target/riscv/insn_trans/trans_rvi.c.inc
index 772330a766..0401ba3d69 100644
--- a/target/riscv/insn_trans/trans_rvi.c.inc
+++ b/target/riscv/insn_trans/trans_rvi.c.inc
@@ -26,14 +26,20 @@ static bool trans_illegal(DisasContext *ctx, arg_empty *a)
 
 static bool trans_c64_illegal(DisasContext *ctx, arg_empty *a)
 {
-     REQUIRE_64BIT(ctx);
-     return trans_illegal(ctx, a);
+    REQUIRE_64_OR_128BIT(ctx);
+    return trans_illegal(ctx, a);
 }
 
 static bool trans_lui(DisasContext *ctx, arg_lui *a)
 {
     if (a->rd != 0) {
         tcg_gen_movi_tl(cpu_gpr[a->rd], a->imm);
+#if defined(TARGET_RISCV128)
+        if (is_128bit(ctx)) {
+            tcg_gen_ext_i64_i128(cpu_gpr[a->rd], cpu_gprh[a->rd],
+                                 cpu_gpr[a->rd]);
+        }
+#endif
     }
     return true;
 }
@@ -41,7 +47,25 @@ static bool trans_lui(DisasContext *ctx, arg_lui *a)
 static bool trans_auipc(DisasContext *ctx, arg_auipc *a)
 {
     if (a->rd != 0) {
+#if defined(TARGET_RISCV128)
+        if (is_128bit(ctx)) {
+            /* TODO : when pc is 128 bits, use all its bits */
+            TCGv pc = tcg_const_tl(ctx->base.pc_next),
+                 imm = tcg_const_tl(a->imm),
+                 immh = tcg_const_tl(a->imm < 0 /* imm is pre-shifted */
+                         ? 0xffffffffffffffff : 0),
+                 cnst_zero = tcg_const_tl(0);
+            tcg_gen_add2_tl(cpu_gpr[a->rd], cpu_gprh[a->rd], pc, cnst_zero,
+                            imm, immh);
+            tcg_temp_free(pc);
+            tcg_temp_free(imm);
+            tcg_temp_free(immh);
+            tcg_temp_free(cnst_zero);
+            return true;
+        }
+#endif
         tcg_gen_movi_tl(cpu_gpr[a->rd], a->imm + ctx->base.pc_next);
+        return true;
     }
     return true;
 }
@@ -84,6 +108,94 @@ static bool trans_jalr(DisasContext *ctx, arg_jalr *a)
     return true;
 }
 
+#if defined(TARGET_RISCV128)
+static bool gen_setcond_128(TCGv rl, TCGv rh,
+                            TCGv al, TCGv ah,
+                            TCGv bl, TCGv bh,
+                            TCGCond cond) {
+    tcg_gen_sub2_tl(rl, rh, al, ah, bl, bh);
+    switch (cond) {
+    case TCG_COND_EQ:
+    {
+        TCGv tmp1 = tcg_temp_new(),
+             tmp2 = tcg_temp_new();
+        tcg_gen_setcondi_tl(TCG_COND_EQ, tmp1, rl, 0);
+        tcg_gen_setcondi_tl(TCG_COND_EQ, tmp2, rh, 0);
+        tcg_gen_and_tl(rl, tmp1, tmp2);
+
+        tcg_temp_free(tmp1);
+        tcg_temp_free(tmp2);
+        break;
+    }
+
+    case TCG_COND_NE:
+    {
+        TCGv tmp1 = tcg_temp_new(),
+             tmp2 = tcg_temp_new();
+        tcg_gen_setcondi_tl(TCG_COND_NE, tmp1, rl, 0);
+        tcg_gen_setcondi_tl(TCG_COND_NE, tmp2, rh, 0);
+        tcg_gen_or_tl(rl, tmp1, tmp2);
+        tcg_gen_movi_tl(rh, 0);
+
+        tcg_temp_free(tmp1);
+        tcg_temp_free(tmp2);
+        break;
+    }
+
+    case TCG_COND_LT:
+    {
+        TCGv tmp1 = tcg_temp_new(),
+             tmp2 = tcg_temp_new();
+
+        tcg_gen_xor_tl(tmp1, rh, ah);
+        tcg_gen_xor_tl(tmp2, ah, bh);
+        tcg_gen_and_tl(tmp1, tmp1, tmp2);
+        tcg_gen_xor_tl(tmp1, rh, tmp1);
+        tcg_gen_setcondi_tl(TCG_COND_LT, rl, tmp1, 0); /* Check sign bit */
+
+        tcg_temp_free(tmp1);
+        tcg_temp_free(tmp2);
+        break;
+    }
+
+    case TCG_COND_GE:
+        /* We invert the result of TCG_COND_LT */
+        gen_setcond_128(rl, rh, al, ah, bl, bh, TCG_COND_LT);
+        tcg_gen_setcondi_tl(TCG_COND_EQ, rl, rl, 0);
+        break;
+
+    case TCG_COND_LTU:
+    {
+        TCGv tmp1 = tcg_temp_new(),
+             tmp2 = tcg_temp_new();
+
+        tcg_gen_eqv_tl(tmp1, ah, bh);
+        tcg_gen_and_tl(tmp1, tmp1, rh);
+        tcg_gen_not_tl(tmp2, ah);
+        tcg_gen_and_tl(tmp2, tmp2, bh);
+        tcg_gen_or_tl(tmp1, tmp1, tmp2);
+
+        tcg_gen_setcondi_tl(TCG_COND_LT, rl, tmp1, 0); /* Check sign bit */
+
+        tcg_temp_free(tmp1);
+        tcg_temp_free(tmp2);
+        break;
+    }
+
+    case TCG_COND_GEU:
+        /* We invert the result of TCG_COND_LTU */
+        gen_setcond_128(rl, rh, al, ah, bl, bh, TCG_COND_LTU);
+        tcg_gen_setcondi_tl(TCG_COND_EQ, rl, rl, 0);
+        break;
+
+    default:
+        return false;
+    }
+    tcg_gen_movi_tl(rh, 0);
+    return true;
+}
+#endif
+
 static bool gen_branch(DisasContext *ctx, arg_b *a, TCGCond cond)
 {
     TCGLabel *l = gen_new_label();
@@ -93,7 +205,28 @@ static bool gen_branch(DisasContext *ctx, arg_b *a, TCGCond cond)
     gen_get_gpr(source1, a->rs1);
     gen_get_gpr(source2, a->rs2);
 
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx)) {
+        TCGv source1h, source2h, tmph, tmpl;
+        source1h = tcg_temp_new();
+        source2h = tcg_temp_new();
+        tmph = tcg_temp_new();
+        tmpl = tcg_temp_new();
+        gen_get_gprh(source1h, a->rs1);
+        gen_get_gprh(source2h, a->rs2);
+
+        gen_setcond_128(tmpl, tmph, source1, source1h, source2, source2h, cond);
+        tcg_gen_brcondi_tl(TCG_COND_NE, tmpl, 0, l);
+        tcg_temp_free(source1h);
+        tcg_temp_free(source2h);
+        tcg_temp_free(tmph);
+        tcg_temp_free(tmpl);
+    } else {
+        tcg_gen_brcond_tl(cond, source1, source2, l);
+    }
+#else
     tcg_gen_brcond_tl(cond, source1, source2, l);
+#endif
     gen_goto_tb(ctx, 1, ctx->pc_succ_insn);
     gen_set_label(l); /* branch taken */
 
@@ -166,7 +299,7 @@ static bool gen_load_tl(DisasContext *ctx, arg_lb *a, MemOp memop)
  * Changing MemOp values is too involved given our understanding, we
  * therefore use our own way to deal locally with zero or sign extended
  * 64-bit values, and 128-bit values.
- * Doing this implies adding a preprocessor conditional in all memory access
+ * Doing this implies adding a preprocessor conditional for all memory access
  * functions to avoid penalizing 32 and 64-bit accesses.
  */
 typedef enum XMemOp {
@@ -275,7 +408,7 @@ static bool trans_lwu(DisasContext *ctx, arg_lwu *a)
     return gen_load(ctx, a, MO_TEUL, XMO_NOP);
 }
 
-static bool trans_ldu(DisasContext *ctx, arg_ldu* a)
+static bool trans_ldu(DisasContext *ctx, arg_ldu *a)
 {
     REQUIRE_128BIT(ctx);
     return gen_load(ctx, a, MO_TEQ, XMO_TEUQ);
@@ -372,8 +505,98 @@ static bool trans_sq(DisasContext *ctx, arg_sq *a)
     return gen_store(ctx, a, MO_TEQ, XMO_TET);
 }
 
+static bool trans_addd(DisasContext *ctx, arg_addd *a)
+{
+    REQUIRE_128BIT(ctx);
+#if defined(TARGET_RISCV128)
+    TCGv src1 = tcg_temp_new(),
+         src2 = tcg_temp_new();
+
+    gen_get_gpr(src1, a->rs1);
+    gen_get_gpr(src2, a->rs2);
+
+    tcg_gen_add_tl(src1, src1, src2);
+    tcg_gen_ext_i64_i128(src1, src2, src1);
+
+    gen_set_gpr(a->rd, src1);
+    gen_set_gprh(a->rd, src2);
+
+    tcg_temp_free(src1);
+    tcg_temp_free(src2);
+#endif
+    return true;
+}
+
+static bool trans_addid(DisasContext *ctx, arg_addid *a)
+{
+    REQUIRE_128BIT(ctx);
+#if defined(TARGET_RISCV128)
+    TCGv src1 = tcg_temp_new(),
+         resh = tcg_temp_new();
+
+    gen_get_gpr(src1, a->rs1);
+
+    tcg_gen_addi_tl(src1, src1, a->imm);
+    tcg_gen_ext_i64_i128(src1, resh, src1);
+
+    gen_set_gpr(a->rd, src1);
+    gen_set_gprh(a->rd, resh);
+
+    tcg_temp_free(src1);
+    tcg_temp_free(resh);
+#endif
+    return true;
+}
+
+static bool trans_subd(DisasContext *ctx, arg_subd *a)
+{
+    REQUIRE_128BIT(ctx);
+#if defined(TARGET_RISCV128)
+    TCGv src1 = tcg_temp_new(),
+         src2 = tcg_temp_new();
+
+    gen_get_gpr(src1, a->rs1);
+    gen_get_gpr(src2, a->rs2);
+
+    tcg_gen_sub_tl(src1, src1, src2);
+    tcg_gen_ext_i64_i128(src1, src2, src1);
+
+    gen_set_gpr(a->rd, src1);
+    gen_set_gprh(a->rd, src2);
+
+    tcg_temp_free(src1);
+    tcg_temp_free(src2);
+#endif
+    return true;
+}
+
 static bool trans_addi(DisasContext *ctx, arg_addi *a)
 {
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx)) {
+        TCGv rs1vl = tcg_temp_new(),
+             rs1vh = tcg_temp_new(),
+             imml  = tcg_const_tl(a->imm),
+             immh  = tcg_const_tl((a->imm & 0b100000000000)
+                                   ? 0xffffffffffffffff
+                                   : 0);
+
+        gen_get_gpr(rs1vl, a->rs1);
+        gen_get_gprh(rs1vh, a->rs1);
+
+        tcg_gen_add2_tl(rs1vl, rs1vh, rs1vl, rs1vh, imml, immh);
+
+        gen_set_gpr(a->rd, rs1vl);
+        gen_set_gprh(a->rd, rs1vh);
+
+        tcg_temp_free(rs1vl);
+        tcg_temp_free(rs1vh);
+        tcg_temp_free(imml);
+        tcg_temp_free(immh);
+
+        return true;
+    }
+#endif
     return gen_arith_imm_fn(ctx, a, &tcg_gen_addi_tl);
 }
 
@@ -390,11 +613,69 @@ static void gen_sltu(TCGv ret, TCGv s1, TCGv s2)
 
 static bool trans_slti(DisasContext *ctx, arg_slti *a)
 {
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx)) {
+        TCGv r1h  = tcg_temp_new(),
+             r1l  = tcg_temp_new(),
+             immh = tcg_const_tl((a->imm & 0b100000000000)
+                                  ? 0xffffffffffffffff
+                                  : 0),
+             imml = tcg_const_tl(a->imm),
+             resh = tcg_temp_new(),
+             resl = tcg_temp_new();
+
+        gen_get_gpr(r1l, a->rs1);
+        gen_get_gprh(r1h, a->rs1);
+
+        gen_setcond_128(resl, resh, r1l, r1h, imml, immh, TCG_COND_LT);
+
+        gen_set_gpr(a->rd, resl);
+        gen_set_gprh(a->rd, resh);
+
+        tcg_temp_free(r1h);
+        tcg_temp_free(r1l);
+        tcg_temp_free(immh);
+        tcg_temp_free(imml);
+        tcg_temp_free(resh);
+        tcg_temp_free(resl);
+
+        return true;
+    }
+#endif
     return gen_arith_imm_tl(ctx, a, &gen_slt);
 }
 
 static bool trans_sltiu(DisasContext *ctx, arg_sltiu *a)
 {
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx)) {
+        TCGv resh = tcg_temp_new(),
+             resl = tcg_temp_new(),
+             r1h = tcg_temp_new(),
+             r1l = tcg_temp_new(),
+             immh = tcg_const_tl((a->imm & 0b100000000000)
+                                  ? 0xffffffffffffffff
+                                  : 0),
+             imml = tcg_const_tl(a->imm);
+
+        gen_get_gprh(r1h, a->rs1);
+        gen_get_gpr(r1l, a->rs1);
+
+        gen_setcond_128(resl, resh, r1l, r1h, imml, immh, TCG_COND_LTU);
+
+        gen_set_gpr(a->rd, resl);
+        gen_set_gprh(a->rd, resh);
+
+        tcg_temp_free(resl);
+        tcg_temp_free(resh);
+        tcg_temp_free(r1h);
+        tcg_temp_free(r1l);
+        tcg_temp_free(immh);
+        tcg_temp_free(imml);
+
+        return true;
+    }
+#endif
     return gen_arith_imm_tl(ctx, a, &gen_sltu);
 }
 
@@ -412,41 +693,406 @@ static bool trans_andi(DisasContext *ctx, arg_andi *a)
 }
 static bool trans_slli(DisasContext *ctx, arg_slli *a)
 {
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx)) {
+        if (a->shamt >= 128) {
+            return false;
+        }
+
+        if (a->rd != 0 && a->shamt != 0) {
+            TCGv rs = tcg_temp_new(),
+                 rsh = tcg_temp_new();
+            TCGv res = tcg_temp_new(),
+                 resh = tcg_temp_new(),
+                 tmp = tcg_temp_new();
+            gen_get_gpr(rs, a->rs1);
+            gen_get_gprh(rsh, a->rs1);
+
+            /*
+             * Computation of double-length left shift,
+             * adapted for immediates from section 2.17 of Hacker's Delight
+             */
+            if (a->shamt >= 64) {
+                tcg_gen_movi_tl(resh, 0);
+            } else {
+                tcg_gen_shli_tl(resh, rsh, a->shamt);
+            }
+
+            if (64 - a->shamt < 0) {
+                tcg_gen_movi_tl(tmp, 0);
+            } else {
+                tcg_gen_shri_tl(tmp, rs, 64 - a->shamt);
+            }
+            tcg_gen_or_tl(resh, resh, tmp);
+            if (a->shamt - 64 < 0) {
+                tcg_gen_movi_tl(tmp, 0);
+            } else {
+                tcg_gen_shli_tl(tmp, rs, a->shamt - 64);
+            }
+            tcg_gen_or_tl(resh, resh, tmp);
+
+            if (a->shamt >= 64) {
+                tcg_gen_movi_tl(res, 0);
+            } else {
+                tcg_gen_shli_tl(res, rs, a->shamt);
+            }
+
+            gen_set_gpr(a->rd, res);
+            gen_set_gprh(a->rd, resh);
+
+            tcg_temp_free(rs);
+            tcg_temp_free(rsh);
+            tcg_temp_free(res);
+            tcg_temp_free(resh);
+            tcg_temp_free(tmp);
+        } /* NOP otherwise */
+        return true;
+    }
+#endif
     return gen_shifti(ctx, a, tcg_gen_shl_tl);
 }
 
 static bool trans_srli(DisasContext *ctx, arg_srli *a)
 {
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx)) {
+        if (a->shamt >= 128) {
+            return false;
+        }
+
+        if (a->rd != 0 && a->shamt != 0) {
+            TCGv rs = tcg_temp_new(),
+                 rsh = tcg_temp_new(),
+                 res = tcg_temp_new(),
+                 resh = tcg_temp_new(),
+                 tmp = tcg_temp_new();
+            gen_get_gpr(rs, a->rs1);
+            gen_get_gprh(rsh, a->rs1);
+
+            /*
+             * Computation of double-length right logical shift,
+             * adapted for immediates from section 2.17 of Hacker's Delight
+             */
+            if (a->shamt >= 64) {
+                tcg_gen_movi_tl(res, 0);
+            } else {
+                tcg_gen_shri_tl(res, rs, a->shamt);
+            }
+            if (64 - a->shamt < 0) {
+                tcg_gen_movi_tl(tmp, 0);
+            } else {
+                tcg_gen_shli_tl(tmp, rsh, 64 - a->shamt);
+            }
+            tcg_gen_or_tl(res, res, tmp);
+            if (a->shamt - 64 < 0) {
+                tcg_gen_movi_tl(tmp, 0);
+            } else {
+                tcg_gen_shri_tl(tmp, rsh, a->shamt - 64);
+            }
+            tcg_gen_or_tl(res, res, tmp);
+
+            if (a->shamt >= 64) {
+                tcg_gen_movi_tl(resh, 0);
+            } else {
+                tcg_gen_shri_tl(resh, rsh, a->shamt);
+            }
+
+            gen_set_gpr(a->rd, res);
+            gen_set_gprh(a->rd, resh);
+
+            tcg_temp_free(rs);
+            tcg_temp_free(rsh);
+            tcg_temp_free(res);
+            tcg_temp_free(resh);
+            tcg_temp_free(tmp);
+        } /* NOP otherwise */
+        return true;
+    }
+#endif
     return gen_shifti(ctx, a, tcg_gen_shr_tl);
 }
 
 static bool trans_srai(DisasContext *ctx, arg_srai *a)
 {
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx)) {
+        if (a->shamt >= 128) {
+            return false;
+        }
+
+        if (a->rd != 0 && a->shamt != 0) {
+            TCGv rs = tcg_temp_new(),
+                 rsh = tcg_temp_new(),
+                 res = tcg_temp_new(),
+                 resh = tcg_temp_new(),
+                 tmp = tcg_temp_new();
+            gen_get_gpr(rs, a->rs1);
+            gen_get_gprh(rsh, a->rs1);
+
+            /*
+             * Computation of double-length right arithmetic shift,
+             * adapted for immediates from section 2.17 of Hacker's Delight
+             */
+            if (a->shamt < 64) {
+                tcg_gen_shri_tl(res, rs, a->shamt);
+                tcg_gen_shli_tl(tmp, rsh, 64 - a->shamt);
+                tcg_gen_or_tl(res, res, tmp);
+            } else {
+                tcg_gen_sari_tl(res, rsh, a->shamt - 64);
+            }
+
+            /* Arithmetic shift of upper bits by shamt */
+            if (a->shamt == 127) {
+                tcg_gen_sari_tl(resh, rsh, 63);
+                tcg_gen_sari_tl(resh, resh, 63);
+                tcg_gen_sari_tl(resh, resh, 1);
+            } else if (a->shamt >= 64) {
+                tcg_gen_sari_tl(resh, rsh, 63);
+                tcg_gen_sari_tl(resh, resh, a->shamt - 63);
+            } else {
+                tcg_gen_sari_tl(resh, rsh, a->shamt);
+            }
+
+            gen_set_gpr(a->rd, res);
+            gen_set_gprh(a->rd, resh);
+
+            tcg_temp_free(rs);
+            tcg_temp_free(rsh);
+            tcg_temp_free(res);
+            tcg_temp_free(resh);
+            tcg_temp_free(tmp);
+        } /* NOP otherwise */
+        return true;
+    }
+#endif
     return gen_shifti(ctx, a, tcg_gen_sar_tl);
 }
 
 static bool trans_add(DisasContext *ctx, arg_add *a)
 {
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx)) {
+        TCGv rs1vl = tcg_temp_new(),
+             rs1vh = tcg_temp_new(),
+             rs2vl = tcg_temp_new(),
+             rs2vh = tcg_temp_new(),
+             resl  = tcg_temp_new(),
+             resh  = tcg_temp_new();
+
+        gen_get_gpr(rs1vl, a->rs1);
+        gen_get_gprh(rs1vh, a->rs1);
+        gen_get_gpr(rs2vl, a->rs2);
+        gen_get_gprh(rs2vh, a->rs2);
+
+        tcg_gen_add2_tl(resl, resh, rs1vl, rs1vh, rs2vl, rs2vh);
+
+        gen_set_gpr(a->rd, resl);
+        gen_set_gprh(a->rd, resh);
+
+        tcg_temp_free(rs1vl);
+        tcg_temp_free(rs1vh);
+        tcg_temp_free(rs2vl);
+        tcg_temp_free(rs2vh);
+        tcg_temp_free(resl);
+        tcg_temp_free(resh);
+
+        return true;
+    }
+#endif
     return gen_arith(ctx, a, &tcg_gen_add_tl);
 }
 
 static bool trans_sub(DisasContext *ctx, arg_sub *a)
 {
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx)) {
+        TCGv rs1vl = tcg_temp_new(),
+             rs1vh = tcg_temp_new(),
+             rs2vl = tcg_temp_new(),
+             rs2vh = tcg_temp_new(),
+             resl  = tcg_temp_new(),
+             resh  = tcg_temp_new();
+
+        gen_get_gpr(rs1vl, a->rs1);
+        gen_get_gprh(rs1vh, a->rs1);
+        gen_get_gpr(rs2vl, a->rs2);
+        gen_get_gprh(rs2vh, a->rs2);
+
+        tcg_gen_sub2_tl(resl, resh, rs1vl, rs1vh, rs2vl, rs2vh);
+
+        gen_set_gpr(a->rd, resl);
+        gen_set_gprh(a->rd, resh);
+
+        tcg_temp_free(rs1vl);
+        tcg_temp_free(rs1vh);
+        tcg_temp_free(rs2vl);
+        tcg_temp_free(rs2vh);
+        tcg_temp_free(resl);
+        tcg_temp_free(resh);
+
+        return true;
+    }
+#endif
     return gen_arith(ctx, a, &tcg_gen_sub_tl);
 }
 
+#if defined(TARGET_RISCV128)
+enum M128_DIR { M128_LEFT, M128_RIGHT, M128_RIGHT_ARITH };
+static void gen_shift_mod128(TCGv ret, TCGv arg1, TCGv arg2, enum M128_DIR dir)
+{
+    TCGv tmp1 = tcg_temp_new(),
+         tmp2 = tcg_temp_new(),
+         cnst_zero = tcg_const_tl(0),
+         sgn = tcg_temp_new();
+
+    tcg_gen_setcondi_tl(TCG_COND_GE, tmp1, arg2, 64);
+    tcg_gen_setcondi_tl(TCG_COND_LT, tmp2, arg2, 0);
+    tcg_gen_or_tl(tmp1, tmp1, tmp2);
+
+    tcg_gen_andi_tl(tmp2, arg2, 0x3f);
+    switch (dir) {
+    case M128_LEFT:
+        tcg_gen_shl_tl(tmp2, arg1, tmp2);
+        break;
+    case M128_RIGHT:
+        tcg_gen_shr_tl(tmp2, arg1, tmp2);
+        break;
+    case M128_RIGHT_ARITH:
+        tcg_gen_sar_tl(tmp2, arg1, tmp2);
+        break;
+    }
+
+    if (dir == M128_RIGHT_ARITH) {
+        tcg_gen_sari_tl(sgn, arg1, 63);
+        tcg_gen_movcond_tl(TCG_COND_NE, ret, tmp1, cnst_zero, sgn, tmp2);
+    } else {
+        tcg_gen_movcond_tl(TCG_COND_NE, ret, tmp1, cnst_zero, cnst_zero, tmp2);
+    }
+
+    tcg_temp_free(tmp1);
+    tcg_temp_free(tmp2);
+    tcg_temp_free(cnst_zero);
+    tcg_temp_free(sgn);
+    return;
+}
+#endif
+
 static bool trans_sll(DisasContext *ctx, arg_sll *a)
 {
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx)) {
+        TCGv src1l = tcg_temp_new(),
+             src1h = tcg_temp_new(),
+             src2l = tcg_temp_new(),
+             tmp   = tcg_temp_new(),
+             resl  = tcg_temp_new(),
+             resh  = tcg_temp_new();
+
+        gen_get_gpr(src1l, a->rs1);
+        gen_get_gprh(src1h, a->rs1);
+        gen_get_gpr(src2l, a->rs2);
+
+        tcg_gen_andi_tl(src2l, src2l, 0x7f); /* Use 7 lower bits for shift */
+
+        /*
+         * From Hacker's Delight 2.17:
+         *  y1 = x1 << n | x0 u>> (64 - n) | x0 << (n - 64)
+         */
+        gen_shift_mod128(resh, src1h, src2l, M128_LEFT);
+
+        tcg_gen_movi_tl(tmp, 64);
+        tcg_gen_sub_tl(tmp, tmp, src2l);
+        gen_shift_mod128(tmp, src1l, tmp, M128_RIGHT);
+        tcg_gen_or_tl(resh, resh, tmp);
+
+        tcg_gen_subi_tl(tmp, src2l, 64);
+        gen_shift_mod128(tmp, src1l, tmp, M128_LEFT);
+        tcg_gen_or_tl(resh, resh, tmp);
+
+        /* From Hacker's Delight 2.17: y0 = x0 << n */
+        gen_shift_mod128(resl, src1l, src2l, M128_LEFT);
+
+        gen_set_gpr(a->rd, resl);
+        gen_set_gprh(a->rd, resh);
+
+        tcg_temp_free(src1l);
+        tcg_temp_free(src1h);
+        tcg_temp_free(src2l);
+        tcg_temp_free(tmp);
+        tcg_temp_free(resl);
+        tcg_temp_free(resh);
+
+        return true;
+    }
+#endif
     return gen_shift(ctx, a, &tcg_gen_shl_tl);
 }
 
 static bool trans_slt(DisasContext *ctx, arg_slt *a)
 {
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx)) {
+        TCGv r1h  = tcg_temp_new(),
+             r1l  = tcg_temp_new(),
+             r2h  = tcg_temp_new(),
+             r2l  = tcg_temp_new(),
+             tmph = tcg_temp_new(),
+             tmpl = tcg_temp_new();
+
+        gen_get_gprh(r1h, a->rs1);
+        gen_get_gpr(r1l, a->rs1);
+        gen_get_gprh(r2h, a->rs2);
+        gen_get_gpr(r2l, a->rs2);
+
+        gen_setcond_128(tmpl, tmph, r1l, r1h, r2l, r2h, TCG_COND_LT);
+
+        gen_set_gpr(a->rd, tmpl);
+        gen_set_gprh(a->rd, tmph);
+
+        tcg_temp_free(r1h);
+        tcg_temp_free(r1l);
+        tcg_temp_free(r2h);
+        tcg_temp_free(r2l);
+        tcg_temp_free(tmph);
+        tcg_temp_free(tmpl);
+
+        return true;
+    }
+#endif
     return gen_arith(ctx, a, &gen_slt);
 }
 
 static bool trans_sltu(DisasContext *ctx, arg_sltu *a)
 {
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx)) {
+        TCGv resh = tcg_temp_new(),
+             resl = tcg_temp_new(),
+             r1h = tcg_temp_new(),
+             r1l = tcg_temp_new(),
+             r2h = tcg_temp_new(),
+             r2l = tcg_temp_new();
+
+        gen_get_gprh(r1h, a->rs1);
+        gen_get_gpr(r1l, a->rs1);
+        gen_get_gprh(r2h, a->rs2);
+        gen_get_gpr(r2l, a->rs2);
+
+        gen_setcond_128(resl, resh, r1l, r1h, r2l, r2h, TCG_COND_LTU);
+
+        gen_set_gpr(a->rd, resl);
+        gen_set_gprh(a->rd, resh);
+
+        tcg_temp_free(resl);
+        tcg_temp_free(resh);
+        tcg_temp_free(r1h);
+        tcg_temp_free(r1l);
+        tcg_temp_free(r2h);
+        tcg_temp_free(r2l);
+
+        return true;
+    }
+#endif
     return gen_arith(ctx, a, &gen_sltu);
 }
 
@@ -457,11 +1103,106 @@ static bool trans_xor(DisasContext *ctx, arg_xor *a)
 
 static bool trans_srl(DisasContext *ctx, arg_srl *a)
 {
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx)) {
+        TCGv src1l = tcg_temp_new(),
+             src1h = tcg_temp_new(),
+             src2l = tcg_temp_new(),
+             tmp   = tcg_temp_new(),
+             resl  = tcg_temp_new(),
+             resh  = tcg_temp_new();
+
+        gen_get_gpr(src1l, a->rs1);
+        gen_get_gprh(src1h, a->rs1);
+        gen_get_gpr(src2l, a->rs2);
+
+        tcg_gen_andi_tl(src2l, src2l, 0x7f); /* Use 7 lower bits for shift */
+
+        /*
+         * From Hacker's Delight 2.17:
+         * y0 = x0 u>> n | x1 << (64 - n) | x1 u>> (n - 64)
+         */
+        gen_shift_mod128(resl, src1l, src2l, M128_RIGHT);
+
+        tcg_gen_movi_tl(tmp, 64);
+        tcg_gen_sub_tl(tmp, tmp, src2l);
+        gen_shift_mod128(tmp, src1h, tmp, M128_LEFT);
+        tcg_gen_or_tl(resl, resl, tmp);
+
+        tcg_gen_subi_tl(tmp, src2l, 64);
+        gen_shift_mod128(tmp, src1h, tmp, M128_RIGHT);
+        tcg_gen_or_tl(resl, resl, tmp);
+
+        /* From Hacker's Delight 2.17 : y1 = x1 u>> n */
+        gen_shift_mod128(resh, src1h, src2l, M128_RIGHT);
+
+        gen_set_gpr(a->rd, resl);
+        gen_set_gprh(a->rd, resh);
+
+        tcg_temp_free(src1l);
+        tcg_temp_free(src1h);
+        tcg_temp_free(src2l);
+        tcg_temp_free(tmp);
+        tcg_temp_free(resl);
+        tcg_temp_free(resh);
+
+        return true;
+    }
+#endif
     return gen_shift(ctx, a, &tcg_gen_shr_tl);
 }
 
 static bool trans_sra(DisasContext *ctx, arg_sra *a)
 {
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx)) {
+        TCGv src1l   = tcg_temp_new(),
+             src1h   = tcg_temp_new(),
+             src2l   = tcg_temp_new(),
+             tmp1    = tcg_temp_new(),
+             tmp2    = tcg_temp_new(),
+             const64 = tcg_const_tl(64),
+             resl    = tcg_temp_new(),
+             resh    = tcg_temp_new();
+
+        gen_get_gpr(src1l, a->rs1);
+        gen_get_gprh(src1h, a->rs1);
+        gen_get_gpr(src2l, a->rs2);
+
+        tcg_gen_andi_tl(src2l, src2l, 0x7f); /* Use 7 lower bits for shift */
+
+        /* Compute y0 value if n < 64: x0 u>> n | x1 << (64 - n) */
+        gen_shift_mod128(tmp1, src1l, src2l, M128_RIGHT);
+        tcg_gen_movi_tl(tmp2, 64);
+        tcg_gen_sub_tl(tmp2, tmp2, src2l);
+        gen_shift_mod128(tmp2, src1h, tmp2, M128_LEFT);
+        tcg_gen_or_tl(tmp1, tmp1, tmp2);
+
+        /* Compute y0 value if n >= 64: x1 s>> (n - 64) */
+        tcg_gen_subi_tl(tmp2, src2l, 64);
+        gen_shift_mod128(tmp2, src1h, tmp2, M128_RIGHT_ARITH);
+
+        /* Conditionally move one value or the other */
+        tcg_gen_movcond_tl(TCG_COND_LT, resl, src2l, const64, tmp1, tmp2);
+
+        /* y1 = x1 s>> n */
+        gen_shift_mod128(resh, src1h, src2l, M128_RIGHT_ARITH);
+
+        gen_set_gpr(a->rd, resl);
+        gen_set_gprh(a->rd, resh);
+
+        tcg_temp_free(src1l);
+        tcg_temp_free(src1h);
+        tcg_temp_free(src2l);
+        tcg_temp_free(tmp1);
+        tcg_temp_free(tmp2);
+        tcg_temp_free(const64);
+        tcg_temp_free(resl);
+        tcg_temp_free(resh);
+
+        return true;
+    }
+#endif
     return gen_shift(ctx, a, &tcg_gen_sar_tl);
 }
 
@@ -477,24 +1218,95 @@ static bool trans_and(DisasContext *ctx, arg_and *a)
 
 static bool trans_addiw(DisasContext *ctx, arg_addiw *a)
 {
-    REQUIRE_64BIT(ctx);
-    return gen_arith_imm_tl(ctx, a, &gen_addw);
+    REQUIRE_64_OR_128BIT(ctx);
+    const bool rv = gen_arith_imm_tl(ctx, a, &gen_addw);
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx) && a->rd != 0) {
+        tcg_gen_ext_i64_i128(cpu_gpr[a->rd], cpu_gprh[a->rd], cpu_gpr[a->rd]);
+    }
+#endif
+    return rv;
 }
 
 static bool trans_slliw(DisasContext *ctx, arg_slliw *a)
 {
-    REQUIRE_64BIT(ctx);
-    return gen_shiftiw(ctx, a, tcg_gen_shl_tl);
+    REQUIRE_64_OR_128BIT(ctx);
+    const bool rv = gen_shiftiw(ctx, a, tcg_gen_shl_tl);
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx) && a->rd != 0) {
+        tcg_gen_ext_i64_i128(cpu_gpr[a->rd], cpu_gprh[a->rd], cpu_gpr[a->rd]);
+    }
+#endif
+    return rv;
+}
+
+static bool trans_sllid(DisasContext *ctx, arg_sllid *a)
+{
+    REQUIRE_128BIT(ctx);
+#if defined(TARGET_RISCV128)
+    TCGv source1 = tcg_temp_new();
+    gen_get_gpr(source1, a->rs1);
+
+    tcg_gen_shli_tl(source1, source1, a->shamt);
+    if (a->rd != 0) {
+        tcg_gen_ext_i64_i128(source1, cpu_gprh[a->rd], source1);
+    }
+    gen_set_gpr(a->rd, source1);
+
+    tcg_temp_free(source1);
+#endif
+    return true;
+}
+
+static bool trans_srlid(DisasContext *ctx, arg_srlid *a)
+{
+    REQUIRE_128BIT(ctx);
+#if defined(TARGET_RISCV128)
+    TCGv source1 = tcg_temp_new();
+    gen_get_gpr(source1, a->rs1);
+
+    tcg_gen_shri_tl(source1, source1, a->shamt);
+    if (a->rd != 0) {
+        tcg_gen_ext_i64_i128(source1, cpu_gprh[a->rd], source1);
+    }
+    gen_set_gpr(a->rd, source1);
+
+    tcg_temp_free(source1);
+#endif
+    return true;
+}
+
+static bool trans_sraid(DisasContext *ctx, arg_sraid *a)
+{
+    REQUIRE_128BIT(ctx);
+#if defined(TARGET_RISCV128)
+    TCGv source1 = tcg_temp_new();
+    gen_get_gpr(source1, a->rs1);
+
+    tcg_gen_sari_tl(source1, source1, a->shamt);
+    if (a->rd != 0) {
+        tcg_gen_ext_i64_i128(source1, cpu_gprh[a->rd], source1);
+    }
+    gen_set_gpr(a->rd, source1);
+
+    tcg_temp_free(source1);
+#endif
+    return true;
 }
 
 static bool trans_srliw(DisasContext *ctx, arg_srliw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     TCGv t = tcg_temp_new();
     gen_get_gpr(t, a->rs1);
     tcg_gen_extract_tl(t, t, a->shamt, 32 - a->shamt);
     /* sign-extend for W instructions */
     tcg_gen_ext32s_tl(t, t);
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx) && a->rd != 0) {
+        tcg_gen_ext_i64_i128(t, cpu_gprh[a->rd], t);
+    }
+#endif
     gen_set_gpr(a->rd, t);
     tcg_temp_free(t);
     return true;
@@ -502,10 +1314,15 @@ static bool trans_srliw(DisasContext *ctx, arg_srliw *a)
 
 static bool trans_sraiw(DisasContext *ctx, arg_sraiw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     TCGv t = tcg_temp_new();
     gen_get_gpr(t, a->rs1);
     tcg_gen_sextract_tl(t, t, a->shamt, 32 - a->shamt);
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx) && a->rd != 0) {
+        tcg_gen_ext_i64_i128(t, cpu_gprh[a->rd], t);
+    }
+#endif
     gen_set_gpr(a->rd, t);
     tcg_temp_free(t);
     return true;
@@ -513,19 +1330,31 @@ static bool trans_sraiw(DisasContext *ctx, arg_sraiw *a)
 
 static bool trans_addw(DisasContext *ctx, arg_addw *a)
 {
-    REQUIRE_64BIT(ctx);
-    return gen_arith(ctx, a, &gen_addw);
+    REQUIRE_64_OR_128BIT(ctx);
+    const bool rv = gen_arith(ctx, a, &gen_addw);
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx) && a->rd != 0) {
+        tcg_gen_ext_i64_i128(cpu_gpr[a->rd], cpu_gprh[a->rd], cpu_gpr[a->rd]);
+    }
+#endif
+    return rv;
 }
 
 static bool trans_subw(DisasContext *ctx, arg_subw *a)
 {
-    REQUIRE_64BIT(ctx);
-    return gen_arith(ctx, a, &gen_subw);
+    REQUIRE_64_OR_128BIT(ctx);
+    const bool rv = gen_arith(ctx, a, &gen_subw);
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx) && a->rd != 0) {
+        tcg_gen_ext_i64_i128(cpu_gpr[a->rd], cpu_gprh[a->rd], cpu_gpr[a->rd]);
+    }
+#endif
+    return rv;
 }
 
 static bool trans_sllw(DisasContext *ctx, arg_sllw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     TCGv source1 = tcg_temp_new();
     TCGv source2 = tcg_temp_new();
 
@@ -536,6 +1365,14 @@ static bool trans_sllw(DisasContext *ctx, arg_sllw *a)
     tcg_gen_shl_tl(source1, source1, source2);
 
     tcg_gen_ext32s_tl(source1, source1);
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx) && a->rd != 0) {
+        TCGv resh = tcg_temp_new();
+        tcg_gen_ext_i64_i128(source1, resh, source1);
+        gen_set_gprh(a->rd, resh);
+        tcg_temp_free(resh);
+    }
+#endif
     gen_set_gpr(a->rd, source1);
     tcg_temp_free(source1);
     tcg_temp_free(source2);
@@ -544,7 +1381,7 @@ static bool trans_sllw(DisasContext *ctx, arg_sllw *a)
 
 static bool trans_srlw(DisasContext *ctx, arg_srlw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     TCGv source1 = tcg_temp_new();
     TCGv source2 = tcg_temp_new();
 
@@ -557,6 +1394,14 @@ static bool trans_srlw(DisasContext *ctx, arg_srlw *a)
     tcg_gen_shr_tl(source1, source1, source2);
 
     tcg_gen_ext32s_tl(source1, source1);
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx) && a->rd != 0) {
+        TCGv resh = tcg_temp_new();
+        tcg_gen_ext_i64_i128(source1, resh, source1);
+        gen_set_gprh(a->rd, resh);
+        tcg_temp_free(resh);
+    }
+#endif
     gen_set_gpr(a->rd, source1);
     tcg_temp_free(source1);
     tcg_temp_free(source2);
@@ -565,7 +1410,7 @@ static bool trans_srlw(DisasContext *ctx, arg_srlw *a)
 
 static bool trans_sraw(DisasContext *ctx, arg_sraw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     TCGv source1 = tcg_temp_new();
     TCGv source2 = tcg_temp_new();
 
@@ -580,6 +1425,15 @@ static bool trans_sraw(DisasContext *ctx, arg_sraw *a)
     tcg_gen_andi_tl(source2, source2, 0x1F);
     tcg_gen_sar_tl(source1, source1, source2);
 
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx) && a->rd != 0) {
+        TCGv resh = tcg_temp_new();
+        tcg_gen_ext_i64_i128(source1, resh, source1);
+        gen_set_gprh(a->rd, resh);
+        tcg_temp_free(resh);
+    }
+#endif
+
     gen_set_gpr(a->rd, source1);
     tcg_temp_free(source1);
     tcg_temp_free(source2);
@@ -587,6 +1441,73 @@ static bool trans_sraw(DisasContext *ctx, arg_sraw *a)
     return true;
 }
 
+/* Translation functions for 64-bit operations specific to RV128 */
+static bool trans_slld(DisasContext *ctx, arg_slld *a)
+{
+    REQUIRE_128BIT(ctx);
+#if defined(TARGET_RISCV128)
+    TCGv src1 = tcg_temp_new(),
+         src2 = tcg_temp_new();
+
+    gen_get_gpr(src1, a->rs1);
+    gen_get_gpr(src2, a->rs2);
+    tcg_gen_andi_tl(src2, src2, 0x3f); /* Use 6 lower bits for shift */
+    tcg_gen_shl_tl(src1, src1, src2);
+    tcg_gen_ext_i64_i128(src1, src2, src1);
+
+    gen_set_gpr(a->rd, src1);
+    gen_set_gprh(a->rd, src2);
+
+    tcg_temp_free(src1);
+    tcg_temp_free(src2);
+#endif
+    return true;
+}
+
+static bool trans_srld(DisasContext *ctx, arg_srld *a)
+{
+    REQUIRE_128BIT(ctx);
+#if defined(TARGET_RISCV128)
+    TCGv src1 = tcg_temp_new(),
+         src2 = tcg_temp_new();
+
+    gen_get_gpr(src1, a->rs1);
+    gen_get_gpr(src2, a->rs2);
+    tcg_gen_andi_tl(src2, src2, 0x3f); /* Use 6 lower bits for shift */
+    tcg_gen_shr_tl(src1, src1, src2);
+    tcg_gen_ext_i64_i128(src1, src2, src1);
+
+    gen_set_gpr(a->rd, src1);
+    gen_set_gprh(a->rd, src2);
+
+    tcg_temp_free(src1);
+    tcg_temp_free(src2);
+#endif
+    return true;
+}
+
+static bool trans_srad(DisasContext *ctx, arg_srad *a)
+{
+    REQUIRE_128BIT(ctx);
+#if defined(TARGET_RISCV128)
+    TCGv src1 = tcg_temp_new(),
+         src2 = tcg_temp_new();
+
+    gen_get_gpr(src1, a->rs1);
+    gen_get_gpr(src2, a->rs2);
+    tcg_gen_andi_tl(src2, src2, 0x3f); /* Use 6 lower bits for shift */
+    tcg_gen_sar_tl(src1, src1, src2);
+    tcg_gen_ext_i64_i128(src1, src2, src1);
+
+    gen_set_gpr(a->rd, src1);
+    gen_set_gprh(a->rd, src2);
+
+    tcg_temp_free(src1);
+    tcg_temp_free(src2);
+#endif
+    return true;
+}
+
 static bool trans_fence(DisasContext *ctx, arg_fence *a)
 {
     /* FENCE is a full memory barrier. */
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index be9c64f3e4..7d447bd225 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -509,6 +509,19 @@ static bool gen_arith_imm_fn(DisasContext *ctx, arg_i *a,
     (*func)(source1, source1, a->imm);
 
     gen_set_gpr(a->rd, source1);
+
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx)) {
+        uint64_t immh = (a->imm & 0b100000000000) ? 0xffffffffffffffff : 0;
+
+        gen_get_gprh(source1, a->rs1);
+
+        (*func)(source1, source1, immh);
+
+        gen_set_gprh(a->rd, source1);
+    }
+#endif
+
     tcg_temp_free(source1);
     return true;
 }
@@ -827,6 +840,18 @@ static bool gen_arith(DisasContext *ctx, arg_r *a,
     (*func)(source1, source1, source2);
 
     gen_set_gpr(a->rd, source1);
+
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx)) {
+        gen_get_gprh(source1, a->rs1);
+        gen_get_gprh(source2, a->rs2);
+
+        (*func)(source1, source1, source2);
+
+        gen_set_gprh(a->rd, source1);
+    }
+#endif
+
     tcg_temp_free(source1);
     tcg_temp_free(source2);
     return true;
-- 
2.33.0




* [PATCH 5/8] target/riscv: 128-bit multiply and divide
  2021-08-30 17:16 [PATCH 1/8] target/riscv: Settings for 128-bit extension support Frédéric Pétrot
                   ` (2 preceding siblings ...)
  2021-08-30 17:16 ` [PATCH 4/8] target/riscv: 128-bit arithmetic and logic instructions Frédéric Pétrot
@ 2021-08-30 17:16 ` Frédéric Pétrot
  2021-08-30 17:16 ` [PATCH 6/8] target/riscv: Support of compiler's 128-bit integer types Frédéric Pétrot
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 22+ messages in thread
From: Frédéric Pétrot @ 2021-08-30 17:16 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Bin Meng, Alistair Francis, Frédéric Pétrot,
	Palmer Dabbelt, Fabien Portas

Add support for the M (multiply/divide) extension on 128-bit registers.
Division and remainder are implemented as helpers, using a simple
implementation of Knuth's Algorithm D.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
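Note for readers (illustration only, not part of the patch): when the
divisor fits in a single 32-bit digit, Algorithm D reduces to schoolbook
short division, which is the n == 1 case of divmod128 in m128_helper.c.
A minimal standalone sketch, assuming limbs are stored least significant
first as in the helper; the name short_div_128 is hypothetical:

```c
#include <stdint.h>

/*
 * Hypothetical illustration, not taken from the patch.
 * Divide the 128-bit value u[3]:u[2]:u[1]:u[0] (32-bit limbs, least
 * significant first) by a nonzero 32-bit divisor v.
 * Writes the quotient limbs to q and returns the remainder.
 */
static uint32_t short_div_128(uint32_t q[4], const uint32_t u[4], uint32_t v)
{
    uint64_t k = 0; /* running remainder, always < v */

    for (int j = 3; j >= 0; j--) {
        /* k < v <= 2^32 - 1, so (k << 32) | u[j] fits in 64 bits */
        uint64_t cur = (k << 32) | u[j];

        q[j] = (uint32_t)(cur / v);
        k = cur % v;
    }
    return (uint32_t)k;
}
```

Each step divides a two-digit value (previous remainder, next limb) by v,
so the quotient digit always fits in 32 bits.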
 target/riscv/helper.h                   |   8 +
 target/riscv/insn32.decode              |   7 +
 target/riscv/insn_trans/trans_rvm.c.inc | 456 +++++++++++++++++++++++-
 target/riscv/m128_helper.c              | 301 ++++++++++++++++
 target/riscv/meson.build                |   1 +
 5 files changed, 759 insertions(+), 14 deletions(-)
 create mode 100644 target/riscv/m128_helper.c
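
Note for readers (illustration only, not part of the patch):
gen_mulu2_128 in trans_rvm.c.inc builds the 256-bit product of two
128-bit operands from four 64x64->128 partial products. The same
schoolbook scheme can be sketched in plain C. The names mulu64 and
mulu2_128_sketch are hypothetical, and unsigned __int128 is a GCC/Clang
extension assumed to be available on the host:

```c
#include <stdint.h>

/* Hypothetical helper: 64x64 -> 128 unsigned product, split in two limbs. */
static void mulu64(uint64_t *lo, uint64_t *hi, uint64_t a, uint64_t b)
{
    unsigned __int128 p = (unsigned __int128)a * b;

    *lo = (uint64_t)p;
    *hi = (uint64_t)(p >> 64);
}

/*
 * r[0..3] = (ah:al) * (bh:bl), 64-bit limbs, least significant first.
 * Partial products are summed column by column; the 128-bit accumulator
 * holds the carry into the next column.
 */
static void mulu2_128_sketch(uint64_t r[4], uint64_t al, uint64_t ah,
                             uint64_t bl, uint64_t bh)
{
    uint64_t ll_lo, ll_hi, lh_lo, lh_hi, hl_lo, hl_hi, hh_lo, hh_hi;
    unsigned __int128 acc;

    mulu64(&ll_lo, &ll_hi, al, bl);
    mulu64(&lh_lo, &lh_hi, al, bh);
    mulu64(&hl_lo, &hl_hi, ah, bl);
    mulu64(&hh_lo, &hh_hi, ah, bh);

    r[0] = ll_lo;

    acc = (unsigned __int128)ll_hi + lh_lo + hl_lo;
    r[1] = (uint64_t)acc;

    acc = (acc >> 64) + lh_hi + hl_hi + hh_lo;
    r[2] = (uint64_t)acc;

    acc = (acc >> 64) + hh_hi;
    r[3] = (uint64_t)acc;
}
```

In these terms, mul stores r[0] and r[1] into rd, while mulhu stores
r[2] and r[3].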

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 415e37bc37..f3aed608dc 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -1149,3 +1149,11 @@ DEF_HELPER_6(vcompress_vm_b, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vcompress_vm_h, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vcompress_vm_w, void, ptr, ptr, ptr, ptr, env, i32)
 DEF_HELPER_6(vcompress_vm_d, void, ptr, ptr, ptr, ptr, env, i32)
+
+#ifdef TARGET_RISCV128
+/* 128-bit integer division and remainder */
+DEF_HELPER_6(idivu128, void, env, i64, i64, i64, i64, i64)
+DEF_HELPER_6(idivs128, void, env, i64, i64, i64, i64, i64)
+DEF_HELPER_6(iremu128, void, env, i64, i64, i64, i64, i64)
+DEF_HELPER_6(irems128, void, env, i64, i64, i64, i64, i64)
+#endif
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 2fe7e1dd36..9085d15a7a 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -197,6 +197,13 @@ divuw    0000001 .....  ..... 101 ..... 0111011 @r
 remw     0000001 .....  ..... 110 ..... 0111011 @r
 remuw    0000001 .....  ..... 111 ..... 0111011 @r
 
+# *** RV128M Standard Extension (in addition to RV64M) ***
+muld     0000001 .....  ..... 000 ..... 1111011 @r
+divd     0000001 .....  ..... 100 ..... 1111011 @r
+divud    0000001 .....  ..... 101 ..... 1111011 @r
+remd     0000001 .....  ..... 110 ..... 1111011 @r
+remud    0000001 .....  ..... 111 ..... 1111011 @r
+
 # *** RV32A Standard Extension ***
 lr_w       00010 . . 00000 ..... 010 ..... 0101111 @atom_ld
 sc_w       00011 . . ..... ..... 010 ..... 0101111 @atom_st
diff --git a/target/riscv/insn_trans/trans_rvm.c.inc b/target/riscv/insn_trans/trans_rvm.c.inc
index 10ecc456fc..ed3bf43ea8 100644
--- a/target/riscv/insn_trans/trans_rvm.c.inc
+++ b/target/riscv/insn_trans/trans_rvm.c.inc
@@ -18,16 +18,157 @@
  * this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#if defined(TARGET_RISCV128)
+static void gen_mulu2_128(TCGv rll, TCGv rlh, TCGv rhl, TCGv rhh,
+                           TCGv al, TCGv ah, TCGv bl, TCGv bh)
+{
+    TCGv tmpl = tcg_temp_new(),
+         tmph = tcg_temp_new(),
+         cnst_zero = tcg_const_tl(0);
+
+    tcg_gen_mulu2_tl(rll, rlh, al, bl);
+
+    tcg_gen_mulu2_tl(tmpl, tmph, al, bh);
+    tcg_gen_add2_tl(rlh, rhl, rlh, cnst_zero, tmpl, tmph);
+    tcg_gen_mulu2_tl(tmpl, tmph, ah, bl);
+    tcg_gen_add2_tl(rlh, tmph, rlh, rhl, tmpl, tmph);
+    /* Overflow detection into rhh */
+    tcg_gen_setcond_tl(TCG_COND_LTU, rhh, tmph, rhl);
+
+    tcg_gen_mov_tl(rhl, tmph);
+
+    tcg_gen_mulu2_tl(tmpl, tmph, ah, bh);
+    tcg_gen_add2_tl(rhl, rhh, rhl, rhh, tmpl, tmph);
+
+    tcg_temp_free(tmpl);
+    tcg_temp_free(tmph);
+    tcg_temp_free(cnst_zero);
+}
+#endif
 
 static bool trans_mul(DisasContext *ctx, arg_mul *a)
 {
     REQUIRE_EXT(ctx, RVM);
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx)) {
+        TCGv rs1h = tcg_temp_new(),
+             rs1l = tcg_temp_new(),
+             rs2h = tcg_temp_new(),
+             rs2l = tcg_temp_new(),
+             rll = tcg_temp_new(),
+             rlh = tcg_temp_new(),
+             rhl = tcg_temp_new(),
+             rhh = tcg_temp_new();
+
+        gen_get_gpr(rs1l, a->rs1);
+        gen_get_gprh(rs1h, a->rs1);
+        gen_get_gpr(rs2l, a->rs2);
+        gen_get_gprh(rs2h, a->rs2);
+
+        gen_mulu2_128(rll, rlh, rhl, rhh, rs1l, rs1h, rs2l, rs2h);
+
+        gen_set_gpr(a->rd, rll);
+        gen_set_gprh(a->rd, rlh);
+
+        tcg_temp_free(rs1h);
+        tcg_temp_free(rs1l);
+        tcg_temp_free(rs2l);
+        tcg_temp_free(rs2h);
+        tcg_temp_free(rll);
+        tcg_temp_free(rlh);
+        tcg_temp_free(rhl);
+        tcg_temp_free(rhh);
+
+        return true;
+    }
+#endif
     return gen_arith(ctx, a, &tcg_gen_mul_tl);
 }
 
 static bool trans_mulh(DisasContext *ctx, arg_mulh *a)
 {
     REQUIRE_EXT(ctx, RVM);
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx)) {
+        TCGv rs1h = tcg_temp_new(),
+             rs1l = tcg_temp_new(),
+             rs2h = tcg_temp_new(),
+             rs2l = tcg_temp_new(),
+             rll = tcg_temp_new(),
+             rlh = tcg_temp_new(),
+             rhl = tcg_temp_new(),
+             rhh = tcg_temp_new(),
+             rlln = tcg_temp_new(),
+             rlhn = tcg_temp_new(),
+             rhln = tcg_temp_new(),
+             rhhn = tcg_temp_new(),
+             sgnres = tcg_temp_new(),
+             tmp = tcg_temp_new(),
+             cnst_one = tcg_const_tl(1),
+             cnst_zero = tcg_const_tl(0);
+
+        gen_get_gpr(rs1l, a->rs1);
+        gen_get_gprh(rs1h, a->rs1);
+        gen_get_gpr(rs2l, a->rs2);
+        gen_get_gprh(rs2h, a->rs2);
+
+        /* Extract sign of result (=> sgn(a) xor sgn(b)) */
+        tcg_gen_setcondi_tl(TCG_COND_LT, sgnres, rs1h, 0);
+        tcg_gen_setcondi_tl(TCG_COND_LT, tmp, rs2h, 0);
+        tcg_gen_xor_tl(sgnres, sgnres, tmp);
+
+        /* Take absolute value of operands */
+        tcg_gen_sari_tl(rhl, rs1h, 63);
+        tcg_gen_add2_tl(rs1l, rs1h, rs1l, rs1h, rhl, rhl);
+        tcg_gen_xor_tl(rs1l, rs1l, rhl);
+        tcg_gen_xor_tl(rs1h, rs1h, rhl);
+
+        tcg_gen_sari_tl(rhl, rs2h, 63);
+        tcg_gen_add2_tl(rs2l, rs2h, rs2l, rs2h, rhl, rhl);
+        tcg_gen_xor_tl(rs2l, rs2l, rhl);
+        tcg_gen_xor_tl(rs2h, rs2h, rhl);
+
+        /* Unsigned multiplication */
+        gen_mulu2_128(rll, rlh, rhl, rhh, rs1l, rs1h, rs2l, rs2h);
+
+        /* Negation of result (two's complement : ~res + 1) */
+        tcg_gen_not_tl(rlln, rll);
+        tcg_gen_not_tl(rlhn, rlh);
+        tcg_gen_not_tl(rhln, rhl);
+        tcg_gen_not_tl(rhhn, rhh);
+
+        tcg_gen_add2_tl(rlln, tmp, rlln, cnst_zero, cnst_one, cnst_zero);
+        tcg_gen_add2_tl(rlhn, tmp, rlhn, cnst_zero, tmp, cnst_zero);
+        tcg_gen_add2_tl(rhln, tmp, rhln, cnst_zero, tmp, cnst_zero);
+        tcg_gen_add2_tl(rhhn, tmp, rhhn, cnst_zero, tmp, cnst_zero);
+
+        /* Move conditionally result or -result depending on result sign */
+        tcg_gen_movcond_tl(TCG_COND_NE, rhl, sgnres, cnst_zero, rhln, rhl);
+        tcg_gen_movcond_tl(TCG_COND_NE, rhh, sgnres, cnst_zero, rhhn, rhh);
+
+        gen_set_gpr(a->rd, rhl);
+        gen_set_gprh(a->rd, rhh);
+
+        tcg_temp_free(rs1h);
+        tcg_temp_free(rs1l);
+        tcg_temp_free(rs2l);
+        tcg_temp_free(rs2h);
+        tcg_temp_free(rll);
+        tcg_temp_free(rlh);
+        tcg_temp_free(rhl);
+        tcg_temp_free(rhh);
+        tcg_temp_free(rlln);
+        tcg_temp_free(rlhn);
+        tcg_temp_free(rhln);
+        tcg_temp_free(rhhn);
+        tcg_temp_free(sgnres);
+        tcg_temp_free(tmp);
+        tcg_temp_free(cnst_one);
+        tcg_temp_free(cnst_zero);
+
+        return true;
+    }
+#endif
     TCGv source1 = tcg_temp_new();
     TCGv source2 = tcg_temp_new();
     gen_get_gpr(source1, a->rs1);
@@ -44,12 +185,119 @@ static bool trans_mulh(DisasContext *ctx, arg_mulh *a)
 static bool trans_mulhsu(DisasContext *ctx, arg_mulhsu *a)
 {
     REQUIRE_EXT(ctx, RVM);
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx)) {
+        TCGv rs1h = tcg_temp_new(),
+             rs1l = tcg_temp_new(),
+             rs2h = tcg_temp_new(),
+             rs2l = tcg_temp_new(),
+             rll = tcg_temp_new(),
+             rlh = tcg_temp_new(),
+             rhl = tcg_temp_new(),
+             rhh = tcg_temp_new(),
+             rlln = tcg_temp_new(),
+             rlhn = tcg_temp_new(),
+             rhln = tcg_temp_new(),
+             rhhn = tcg_temp_new(),
+             sgnres = tcg_temp_new(),
+             tmp = tcg_temp_new(),
+             cnst_one = tcg_const_tl(1),
+             cnst_zero = tcg_const_tl(0);
+
+        gen_get_gpr(rs1l, a->rs1);
+        gen_get_gprh(rs1h, a->rs1);
+        gen_get_gpr(rs2l, a->rs2);
+        gen_get_gprh(rs2h, a->rs2);
+
+        /* Extract sign of result (=> sgn(a)) */
+        tcg_gen_setcondi_tl(TCG_COND_LT, sgnres, rs1h, 0);
+
+        /* Take absolute value of rs1 */
+        tcg_gen_sari_tl(rhl, rs1h, 63);
+        tcg_gen_add2_tl(rs1l, rs1h, rs1l, rs1h, rhl, rhl);
+        tcg_gen_xor_tl(rs1l, rs1l, rhl);
+        tcg_gen_xor_tl(rs1h, rs1h, rhl);
+
+        /* Unsigned multiplication */
+        gen_mulu2_128(rll, rlh, rhl, rhh, rs1l, rs1h, rs2l, rs2h);
+
+        /* Negation of result (two's complement : ~res + 1) */
+        tcg_gen_not_tl(rlln, rll);
+        tcg_gen_not_tl(rlhn, rlh);
+        tcg_gen_not_tl(rhln, rhl);
+        tcg_gen_not_tl(rhhn, rhh);
+
+        tcg_gen_add2_tl(rlln, tmp, rlln, cnst_zero, cnst_one, cnst_zero);
+        tcg_gen_add2_tl(rlhn, tmp, rlhn, cnst_zero, tmp, cnst_zero);
+        tcg_gen_add2_tl(rhln, tmp, rhln, cnst_zero, tmp, cnst_zero);
+        tcg_gen_add2_tl(rhhn, tmp, rhhn, cnst_zero, tmp, cnst_zero);
+
+        /* Move conditionally result or -result depending on result sign */
+        tcg_gen_movcond_tl(TCG_COND_NE, rhl, sgnres, cnst_zero, rhln, rhl);
+        tcg_gen_movcond_tl(TCG_COND_NE, rhh, sgnres, cnst_zero, rhhn, rhh);
+
+        gen_set_gpr(a->rd, rhl);
+        gen_set_gprh(a->rd, rhh);
+
+        tcg_temp_free(rs1h);
+        tcg_temp_free(rs1l);
+        tcg_temp_free(rs2l);
+        tcg_temp_free(rs2h);
+        tcg_temp_free(rll);
+        tcg_temp_free(rlh);
+        tcg_temp_free(rhl);
+        tcg_temp_free(rhh);
+        tcg_temp_free(rlln);
+        tcg_temp_free(rlhn);
+        tcg_temp_free(rhln);
+        tcg_temp_free(rhhn);
+        tcg_temp_free(sgnres);
+        tcg_temp_free(tmp);
+        tcg_temp_free(cnst_one);
+        tcg_temp_free(cnst_zero);
+
+        return true;
+    }
+#endif
     return gen_arith(ctx, a, &gen_mulhsu);
 }
 
 static bool trans_mulhu(DisasContext *ctx, arg_mulhu *a)
 {
     REQUIRE_EXT(ctx, RVM);
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx)) {
+        TCGv rs1h = tcg_temp_new(),
+             rs1l = tcg_temp_new(),
+             rs2h = tcg_temp_new(),
+             rs2l = tcg_temp_new(),
+             rll = tcg_temp_new(),
+             rlh = tcg_temp_new(),
+             rhl = tcg_temp_new(),
+             rhh = tcg_temp_new();
+
+        gen_get_gpr(rs1l, a->rs1);
+        gen_get_gprh(rs1h, a->rs1);
+        gen_get_gpr(rs2l, a->rs2);
+        gen_get_gprh(rs2h, a->rs2);
+
+        gen_mulu2_128(rll, rlh, rhl, rhh, rs1l, rs1h, rs2l, rs2h);
+
+        gen_set_gpr(a->rd, rhl);
+        gen_set_gprh(a->rd, rhh);
+
+        tcg_temp_free(rs1h);
+        tcg_temp_free(rs1l);
+        tcg_temp_free(rs2l);
+        tcg_temp_free(rs2h);
+        tcg_temp_free(rll);
+        tcg_temp_free(rlh);
+        tcg_temp_free(rhl);
+        tcg_temp_free(rhh);
+
+        return true;
+    }
+#endif
     TCGv source1 = tcg_temp_new();
     TCGv source2 = tcg_temp_new();
     gen_get_gpr(source1, a->rs1);
@@ -66,63 +314,243 @@ static bool trans_mulhu(DisasContext *ctx, arg_mulhu *a)
 static bool trans_div(DisasContext *ctx, arg_div *a)
 {
     REQUIRE_EXT(ctx, RVM);
-    return gen_arith(ctx, a, &gen_div);
+    if (!is_128bit(ctx)) {
+        return gen_arith(ctx, a, &gen_div);
+    }
+
+#ifdef TARGET_RISCV128
+    TCGv ul = tcg_temp_new(),
+         uh = tcg_temp_new(),
+         vl = tcg_temp_new(),
+         vh = tcg_temp_new(),
+         rd = tcg_temp_new();
+
+    tcg_gen_movi_tl(rd, a->rd);
+
+    gen_get_gpr(ul, a->rs1);
+    gen_get_gprh(uh, a->rs1);
+    gen_get_gpr(vl, a->rs2);
+    gen_get_gprh(vh, a->rs2);
+
+    gen_helper_idivs128(cpu_env, rd, ul, uh, vl, vh);
+#endif
+    return true;
 }
 
 static bool trans_divu(DisasContext *ctx, arg_divu *a)
 {
     REQUIRE_EXT(ctx, RVM);
-    return gen_arith(ctx, a, &gen_divu);
+    if (!is_128bit(ctx)) {
+        return gen_arith(ctx, a, &gen_divu);
+    }
+
+#ifdef TARGET_RISCV128
+    TCGv ul = tcg_temp_new(),
+         uh = tcg_temp_new(),
+         vl = tcg_temp_new(),
+         vh = tcg_temp_new(),
+         rd = tcg_temp_new();
+
+    tcg_gen_movi_tl(rd, a->rd);
+
+    gen_get_gpr(ul, a->rs1);
+    gen_get_gprh(uh, a->rs1);
+    gen_get_gpr(vl, a->rs2);
+    gen_get_gprh(vh, a->rs2);
+
+    gen_helper_idivu128(cpu_env, rd, ul, uh, vl, vh);
+#endif
+    return true;
 }
 
 static bool trans_rem(DisasContext *ctx, arg_rem *a)
 {
     REQUIRE_EXT(ctx, RVM);
-    return gen_arith(ctx, a, &gen_rem);
+    if (!is_128bit(ctx)) {
+        return gen_arith(ctx, a, &gen_rem);
+    }
+
+#ifdef TARGET_RISCV128
+    TCGv ul = tcg_temp_new(),
+         uh = tcg_temp_new(),
+         vl = tcg_temp_new(),
+         vh = tcg_temp_new(),
+         rd = tcg_temp_new();
+
+    tcg_gen_movi_tl(rd, a->rd);
+
+    gen_get_gpr(ul, a->rs1);
+    gen_get_gprh(uh, a->rs1);
+    gen_get_gpr(vl, a->rs2);
+    gen_get_gprh(vh, a->rs2);
+
+    gen_helper_irems128(cpu_env, rd, ul, uh, vl, vh);
+#endif
+    return true;
 }
 
 static bool trans_remu(DisasContext *ctx, arg_remu *a)
 {
     REQUIRE_EXT(ctx, RVM);
-    return gen_arith(ctx, a, &gen_remu);
+    if (!is_128bit(ctx)) {
+        return gen_arith(ctx, a, &gen_remu);
+    }
+
+#ifdef TARGET_RISCV128
+    TCGv ul = tcg_temp_new(),
+         uh = tcg_temp_new(),
+         vl = tcg_temp_new(),
+         vh = tcg_temp_new(),
+         rd = tcg_temp_new();
+
+    tcg_gen_movi_tl(rd, a->rd);
+
+    gen_get_gpr(ul, a->rs1);
+    gen_get_gprh(uh, a->rs1);
+    gen_get_gpr(vl, a->rs2);
+    gen_get_gprh(vh, a->rs2);
+
+    gen_helper_iremu128(cpu_env, rd, ul, uh, vl, vh);
+#endif
+    return true;
 }
 
 static bool trans_mulw(DisasContext *ctx, arg_mulw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     REQUIRE_EXT(ctx, RVM);
 
-    return gen_arith(ctx, a, &gen_mulw);
+    bool rv = gen_arith(ctx, a, &gen_mulw);
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx) && a->rd != 0) {
+        tcg_gen_ext_i64_i128(cpu_gpr[a->rd], cpu_gprh[a->rd], cpu_gpr[a->rd]);
+    }
+#endif
+    return rv;
 }
 
 static bool trans_divw(DisasContext *ctx, arg_divw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     REQUIRE_EXT(ctx, RVM);
 
-    return gen_arith_div_w(ctx, a, &gen_div);
+    bool rv = gen_arith_div_w(ctx, a, &gen_div);
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx) && a->rd != 0) {
+        tcg_gen_ext_i64_i128(cpu_gpr[a->rd], cpu_gprh[a->rd], cpu_gpr[a->rd]);
+    }
+#endif
+    return rv;
 }
 
 static bool trans_divuw(DisasContext *ctx, arg_divuw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     REQUIRE_EXT(ctx, RVM);
 
-    return gen_arith_div_uw(ctx, a, &gen_divu);
+    bool rv = gen_arith_div_uw(ctx, a, &gen_divu);
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx) && a->rd != 0) {
+        tcg_gen_ext_i64_i128(cpu_gpr[a->rd], cpu_gprh[a->rd], cpu_gpr[a->rd]);
+    }
+#endif
+    return rv;
 }
 
 static bool trans_remw(DisasContext *ctx, arg_remw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
     REQUIRE_EXT(ctx, RVM);
 
-    return gen_arith_div_w(ctx, a, &gen_rem);
+    bool rv = gen_arith_div_w(ctx, a, &gen_rem);
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx) && a->rd != 0) {
+        tcg_gen_ext_i64_i128(cpu_gpr[a->rd], cpu_gprh[a->rd], cpu_gpr[a->rd]);
+    }
+#endif
+    return rv;
 }
 
 static bool trans_remuw(DisasContext *ctx, arg_remuw *a)
 {
-    REQUIRE_64BIT(ctx);
+    REQUIRE_64_OR_128BIT(ctx);
+    REQUIRE_EXT(ctx, RVM);
+
+    bool rv = gen_arith_div_uw(ctx, a, &gen_remu);
+#if defined(TARGET_RISCV128)
+    if (is_128bit(ctx) && a->rd != 0) {
+        tcg_gen_ext_i64_i128(cpu_gpr[a->rd], cpu_gprh[a->rd], cpu_gpr[a->rd]);
+    }
+#endif
+    return rv;
+}
+
+static bool trans_muld(DisasContext *ctx, arg_muld *a)
+{
+    REQUIRE_EXT(ctx, RVM);
+    REQUIRE_128BIT(ctx);
+
+    bool rv = gen_arith(ctx, a, &tcg_gen_mul_tl);
+#if defined(TARGET_RISCV128)
+    if (a->rd != 0) {
+        tcg_gen_ext_i64_i128(cpu_gpr[a->rd], cpu_gprh[a->rd], cpu_gpr[a->rd]);
+    }
+#endif
+    return rv;
+}
+
+static bool trans_divd(DisasContext *ctx, arg_divd *a)
+{
+    REQUIRE_EXT(ctx, RVM);
+    REQUIRE_128BIT(ctx);
+
+    bool rv = gen_arith(ctx, a, &gen_div);
+#if defined(TARGET_RISCV128)
+    if (a->rd != 0) {
+        tcg_gen_ext_i64_i128(cpu_gpr[a->rd], cpu_gprh[a->rd], cpu_gpr[a->rd]);
+    }
+#endif
+    return rv;
+}
+
+static bool trans_divud(DisasContext *ctx, arg_divud *a)
+{
+    REQUIRE_EXT(ctx, RVM);
+    REQUIRE_128BIT(ctx);
+
+    bool rv = gen_arith(ctx, a, &gen_divu);
+#if defined(TARGET_RISCV128)
+    if (a->rd != 0) {
+        tcg_gen_ext_i64_i128(cpu_gpr[a->rd], cpu_gprh[a->rd], cpu_gpr[a->rd]);
+    }
+#endif
+    return rv;
+}
+
+static bool trans_remd(DisasContext *ctx, arg_remd *a)
+{
+    REQUIRE_EXT(ctx, RVM);
+    REQUIRE_128BIT(ctx);
+
+    bool rv = gen_arith(ctx, a, &gen_rem);
+#if defined(TARGET_RISCV128)
+    if (a->rd != 0) {
+        tcg_gen_ext_i64_i128(cpu_gpr[a->rd], cpu_gprh[a->rd], cpu_gpr[a->rd]);
+    }
+#endif
+    return rv;
+}
+
+static bool trans_remud(DisasContext *ctx, arg_remud *a)
+{
     REQUIRE_EXT(ctx, RVM);
+    REQUIRE_128BIT(ctx);
 
-    return gen_arith_div_uw(ctx, a, &gen_remu);
+    bool rv = gen_arith(ctx, a, &gen_remu);
+#if defined(TARGET_RISCV128)
+    if (a->rd != 0) {
+        tcg_gen_ext_i64_i128(cpu_gpr[a->rd], cpu_gprh[a->rd], cpu_gpr[a->rd]);
+    }
+#endif
+    return rv;
 }
diff --git a/target/riscv/m128_helper.c b/target/riscv/m128_helper.c
new file mode 100644
index 0000000000..973632b005
--- /dev/null
+++ b/target/riscv/m128_helper.c
@@ -0,0 +1,301 @@
+/*
+ * RISC-V Emulation Helpers for QEMU.
+ *
+ * Copyright (c) 2016-2017 Sagar Karandikar, sagark@eecs.berkeley.edu
+ * Copyright (c) 2017-2018 SiFive, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "cpu.h"
+#include "qemu/main-loop.h"
+#include "exec/exec-all.h"
+#include "exec/helper-proto.h"
+
+#ifdef TARGET_RISCV128
+/* TODO : This can be optimized by a lot */
+static void divmod128(uint64_t ul, uint64_t uh,
+            uint64_t vl, uint64_t vh,
+            uint64_t *ql, uint64_t *qh,
+            uint64_t *rl, uint64_t *rh)
+{
+    const uint64_t b = ((uint64_t) 1) << 32;
+    const int m = 4;
+    uint64_t qhat, rhat, p;
+    int n, s, i;
+    int64_t j, t, k;
+
+    /* Build arrays of 32-bit words for u and v */
+    uint32_t *u = alloca(4 * sizeof(uint32_t));
+    u[0] = ul & 0xffffffff;
+    u[1] = (ul >> 32) & 0xffffffff;
+    u[2] = uh & 0xffffffff;
+    u[3] = (uh >> 32) & 0xffffffff;
+
+    uint32_t *v = alloca(4 * sizeof(uint32_t));
+    v[0] = vl & 0xffffffff;
+    v[1] = (vl >> 32) & 0xffffffff;
+    v[2] = vh & 0xffffffff;
+    v[3] = (vh >> 32) & 0xffffffff;
+
+    uint32_t *q = alloca(4 * sizeof(uint32_t));
+    uint32_t *r = alloca(4 * sizeof(uint32_t));
+    uint32_t *un = alloca(5 * sizeof(uint32_t));
+    uint32_t *vn = alloca(4 * sizeof(uint32_t));
+
+    memset(q, 0, 4 * sizeof(uint32_t));
+    memset(r, 0, 4 * sizeof(uint32_t));
+    memset(un, 0, 5 * sizeof(uint32_t));
+    memset(vn, 0, 4 * sizeof(uint32_t));
+
+    if (v[3] != 0) {
+        n = 4;
+    } else if (v[2]) {
+        n = 3;
+    } else if (v[1]) {
+        n = 2;
+    } else if (v[0]) {
+        n = 1;
+    } else {
+        /* never happens, but keeps gcc quiet */
+        return;
+    }
+
+    if (n == 1) {
+        /* Take care of the case of a single-digit divisor here */
+        k = 0;
+        for (j = m - 1; j >= 0; j--) {
+            q[j] = (k * b + u[j]) / v[0];
+            k = (k * b + u[j]) - q[j] * v[0];
+        }
+        if (r != NULL) {
+            r[0] = k;
+        }
+    } else {
+        s = clz32(v[n - 1]); /* 0 <= s <= 31 as v[n - 1] != 0 */
+        if (s != 0) {
+            for (i = n - 1; i > 0; i--) {
+                vn[i] = ((v[i] << s) | (v[i - 1] >> (32 - s)));
+            }
+            vn[0] = v[0] << s;
+
+            un[m] = u[m - 1] >> (32 - s);
+            for (i = m - 1; i > 0; i--) {
+                un[i] = (u[i] << s) | (u[i - 1] >> (32 - s));
+            }
+            un[0] = u[0] << s;
+        } else {
+            for (i = 0; i < n; i++) {
+                vn[i] = v[i];
+            }
+
+            for (i = 0; i < m; i++) {
+                un[i] = u[i];
+            }
+            un[m] = 0;
+        }
+
+        /* Step D2 : loop on j */
+        for (j = m - n; j >= 0; j--) { /* Main loop */
+            /* Step D3 : Compute estimate qhat of q[j] */
+            qhat = (un[j + n] * b + un[j + n - 1]) / vn[n - 1];
+            /* Optimized mod vn[n -1 ] */
+            rhat = (un[j + n] * b + un[j + n - 1]) - qhat * vn[n - 1];
+
+            while (true) {
+                if (qhat == b
+                    || qhat * vn[n - 2] > b * rhat + un[j + n - 2]) {
+                    qhat = qhat - 1;
+                    rhat = rhat + vn[n - 1];
+                    if (rhat < b) {
+                        continue;
+                    }
+                }
+                break;
+            }
+
+            /* Step D4 : Multiply and subtract */
+            k = 0;
+            for (i = 0; i < n; i++) {
+                p = qhat * vn[i];
+                t = un[i + j] - k - (p & 0xffffffff);
+                un[i + j] = t;
+                k = (p >> 32) - (t >> 32);
+            }
+            t = un[j + n] - k;
+            un[j + n] = t;
+
+            /* Step D5 */
+            q[j] = qhat;         /* Store quotient digit */
+            /* Step D6 */
+            if (t < 0) {         /* If we subtracted too much, add back */
+                q[j] = q[j] - 1;
+                k = 0;
+                for (i = 0; i < n; i++) {
+                    t = un[i + j] + vn[i] + k;
+                    un[i + j] = t;
+                    k = t >> 32;
+                }
+                un[j + n] = un[j + n] + k;
+            }
+        } /* D7 Loop */
+
+        /* Step D8 : Unnormalize */
+        if (rl && rh) {
+            if (s != 0) {
+                for (i = 0; i < n; i++) {
+                    r[i] = (un[i] >> s) | (un[i + 1] << (32 - s));
+                }
+            } else {
+                for (i = 0; i < n; i++) {
+                    r[i] = un[i];
+                }
+            }
+        }
+    }
+
+    if (ql && qh) {
+        *ql = q[0] | ((uint64_t)q[1] << 32);
+        *qh = q[2] | ((uint64_t)q[3] << 32);
+    }
+
+    if (rl && rh) {
+        *rl = r[0] | ((uint64_t)r[1] << 32);
+        *rh = r[2] | ((uint64_t)r[3] << 32);
+    }
+}
+
+void HELPER(idivu128)(CPURISCVState *env, uint64_t rd,
+                        uint64_t ul, uint64_t uh,
+                        uint64_t vl, uint64_t vh)
+{
+    uint64_t ql, qh;
+    if (vl == 0 && vh == 0) { /* Handle special behavior on div by zero */
+        ql = 0xffffffffffffffff;
+        qh = ql;
+    } else {
+        /* Soft quad division */
+        divmod128(ul, uh, vl, vh, &ql, &qh, NULL, NULL);
+    }
+
+    if (rd != 0) {
+        env->gpr[rd] = ql;
+        env->gprh[rd] = qh;
+    }
+    return;
+}
+
+void HELPER(iremu128)(CPURISCVState *env, uint64_t rd,
+                      uint64_t ul, uint64_t uh,
+                      uint64_t vl, uint64_t vh)
+{
+    uint64_t rl, rh;
+    if (vl == 0 && vh == 0) {
+        rl = ul;
+        rh = uh;
+    } else {
+        /* Soft quad division */
+        divmod128(ul, uh, vl, vh, NULL, NULL, &rl, &rh);
+    }
+
+    if (rd != 0) {
+        env->gpr[rd] = rl;
+        env->gprh[rd] = rh;
+    }
+    return;
+}
+
+static void neg128(uint64_t *valh, uint64_t *vall)
+{
+    uint64_t oneh = ~(*valh), onel = ~(*vall);
+    *vall = onel + 1;
+    /* Carry into upper 64 bits */
+    *valh = (*vall < onel) ? oneh + 1 : oneh;
+}
+
+void HELPER(idivs128)(CPURISCVState *env, uint64_t rd,
+                      uint64_t ul, uint64_t uh,
+                      uint64_t vl, uint64_t vh)
+{
+    uint64_t qh, ql;
+    if (vl == 0 && vh == 0) { /* Div by zero check */
+        ql = 0xffffffffffffffff;
+        qh = 0xffffffffffffffff;
+    } else if (uh == 0x8000000000000000 && ul == 0 &&
+               vh == 0xffffffffffffffff && vl == 0xffffffffffffffff) {
+        /* Signed div overflow check (-2**127 / -1) */
+        ql = ul;
+        qh = uh;
+    } else {
+        /* Use unsigned divmod to build signed quotient */
+        bool sgnu = (uh & 0x8000000000000000),
+             sgnv = (vh & 0x8000000000000000);
+
+        if (sgnu) {
+            neg128(&uh, &ul);
+        }
+
+        if (sgnv) {
+            neg128(&vh, &vl);
+        }
+
+        divmod128(ul, uh, vl, vh, &ql, &qh, NULL, NULL);
+
+        if (sgnu != sgnv) {
+            neg128(&qh, &ql);
+        }
+    }
+
+    if (rd != 0) {
+        env->gpr[rd] = ql;
+        env->gprh[rd] = qh;
+    }
+    return;
+}
+
+void HELPER(irems128)(CPURISCVState *env, uint64_t rd,
+                      uint64_t ul, uint64_t uh,
+                      uint64_t vl, uint64_t vh)
+{
+    uint64_t rh, rl;
+    if (vl == 0 && vh == 0) {
+        rl = ul;
+        rh = uh;
+    } else {
+        /* Use unsigned divmod to build signed remainder */
+        bool sgnu = (uh & 0x8000000000000000),
+             sgnv = (vh & 0x8000000000000000);
+
+        if (sgnu) {
+            neg128(&uh, &ul);
+        }
+
+        if (sgnv) {
+            neg128(&vh, &vl);
+        }
+
+        divmod128(ul, uh, vl, vh, NULL, NULL, &rl, &rh);
+
+        if (sgnu) {
+            neg128(&rh, &rl);
+        }
+    }
+
+    if (rd != 0) {
+        env->gpr[rd] = rl;
+        env->gprh[rd] = rh;
+    }
+    return;
+}
+#endif
diff --git a/target/riscv/meson.build b/target/riscv/meson.build
index d5e0bc93ea..3a25dd723b 100644
--- a/target/riscv/meson.build
+++ b/target/riscv/meson.build
@@ -16,6 +16,7 @@ riscv_ss.add(files(
   'gdbstub.c',
   'op_helper.c',
   'vector_helper.c',
+  'm128_helper.c',
   'bitmanip_helper.c',
   'translate.c',
 ))
-- 
2.33.0




* [PATCH 6/8] target/riscv: Support of compiler's 128-bit integer types
  2021-08-30 17:16 [PATCH 1/8] target/riscv: Settings for 128-bit extension support Frédéric Pétrot
                   ` (3 preceding siblings ...)
  2021-08-30 17:16 ` [PATCH 5/8] target/riscv: 128-bit multiply and divide Frédéric Pétrot
@ 2021-08-30 17:16 ` Frédéric Pétrot
  2021-08-31  3:38   ` Richard Henderson
  2021-08-30 17:16 ` [PATCH 7/8] target/riscv: 128-bit support for some csrs Frédéric Pétrot
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread
From: Frédéric Pétrot @ 2021-08-30 17:16 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Bin Meng, Alistair Francis, Frédéric Pétrot,
	Palmer Dabbelt, Fabien Portas

The 128-bit mult and div helpers may now use the compiler's support
for 128-bit integers, if it exists.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
 target/riscv/cpu.h         | 13 +++++++++++
 target/riscv/m128_helper.c | 48 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 6528b4540e..4321b03b94 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -60,6 +60,19 @@
 #define RV64 ((target_ulong)2 << (TARGET_LONG_BITS - 2))
 /* To be used on misah, the upper part of misa */
 #define RV128 ((target_ulong)3 << (TARGET_LONG_BITS - 2))
+/*
+ * Defined to force the use of tcg 128-bit arithmetic
+ * if the compiler does not have a 128-bit built-in type
+ */
+#define SOFT_128BIT
+/*
+ * If available and not explicitly disabled,
+ * use compiler's 128-bit integers.
+ */
+#if defined(__SIZEOF_INT128__) && !defined(SOFT_128BIT)
+#define HARD_128BIT
+#endif
+
 
 #define RV(x) ((target_ulong)1 << (x - 'A'))
 
diff --git a/target/riscv/m128_helper.c b/target/riscv/m128_helper.c
index 973632b005..bf50525ec0 100644
--- a/target/riscv/m128_helper.c
+++ b/target/riscv/m128_helper.c
@@ -24,6 +24,7 @@
 #include "exec/helper-proto.h"
 
 #ifdef TARGET_RISCV128
+#ifndef HARD_128BIT
 /* TODO : This can be optimized by a lot */
 static void divmod128(uint64_t ul, uint64_t uh,
             uint64_t vl, uint64_t vh,
@@ -175,6 +176,7 @@ static void divmod128(uint64_t ul, uint64_t uh,
         *rh = r[2] | ((uint64_t)r[3] << 32);
     }
 }
+#endif
 
 void HELPER(idivu128)(CPURISCVState *env, uint64_t rd,
                         uint64_t ul, uint64_t uh,
@@ -185,8 +187,19 @@ void HELPER(idivu128)(CPURISCVState *env, uint64_t rd,
         ql = 0xffffffffffffffff;
         qh = ql;
     } else {
+#ifdef HARD_128BIT
+        /* If available, use builtin 128-bit type */
+        __uint128_t u = (((__uint128_t) uh) << 64) | ul,
+                    v = (((__uint128_t) vh) << 64) | vl,
+                    r;
+
+        r = u / v;
+        ql = r & 0xffffffffffffffff;
+        qh = (r >> 64) & 0xffffffffffffffff;
+#else
         /* Soft quad division */
         divmod128(ul, uh, vl, vh, &ql, &qh, NULL, NULL);
+#endif
     }
 
     if (rd != 0) {
@@ -205,8 +218,19 @@ void HELPER(iremu128)(CPURISCVState *env, uint64_t rd,
         rl = ul;
         rh = uh;
     } else {
+#ifdef HARD_128BIT
+        /* If available, use builtin 128-bit type */
+        __uint128_t u = (((__uint128_t) uh) << 64) | ul,
+                    v = (((__uint128_t) vh) << 64) | vl,
+                    r;
+
+        r = u % v;
+        rl = r & 0xffffffffffffffff;
+        rh = (r >> 64) & 0xffffffffffffffff;
+#else
         /* Soft quad division */
         divmod128(ul, uh, vl, vh, NULL, NULL, &rl, &rh);
+#endif
     }
 
     if (rd != 0) {
@@ -216,6 +240,7 @@ void HELPER(iremu128)(CPURISCVState *env, uint64_t rd,
     return;
 }
 
+#ifndef HARD_128BIT
 static void neg128(uint64_t *valh, uint64_t *vall)
 {
     uint64_t oneh = ~(*valh), onel = ~(*vall);
@@ -223,6 +248,7 @@ static void neg128(uint64_t *valh, uint64_t *vall)
     /* Carry into upper 64 bits */
     *valh = (*vall < onel) ? oneh + 1 : oneh;
 }
+#endif
 
 void HELPER(idivs128)(CPURISCVState *env, uint64_t rd,
                       uint64_t ul, uint64_t uh,
@@ -238,6 +264,16 @@ void HELPER(idivs128)(CPURISCVState *env, uint64_t rd,
         ql = ul;
         qh = uh;
     } else {
+#ifdef HARD_128BIT
+        /* Use gcc's builtin 128 bit type */
+        __int128_t u = (__int128_t) ((((__uint128_t) uh) << 64) | ul),
+                   v = (__int128_t) ((((__uint128_t) vh) << 64) | vl);
+
+        __int128_t r = u / v;
+
+        ql = r & 0xffffffffffffffff;
+        qh = (r >> 64) & 0xffffffffffffffff;
+#else
         /* Use unsigned divmod to build signed quotient */
         bool sgnu = (uh & 0x8000000000000000),
              sgnv = (vh & 0x8000000000000000);
@@ -255,6 +291,7 @@ void HELPER(idivs128)(CPURISCVState *env, uint64_t rd,
         if (sgnu != sgnv) {
             neg128(&qh, &ql);
         }
+#endif
     }
 
     if (rd != 0) {
@@ -273,6 +310,16 @@ void HELPER(irems128)(CPURISCVState *env, uint64_t rd,
         rl = ul;
         rh = uh;
     } else {
+#ifdef HARD_128BIT
+        /* Use gcc's builtin 128 bit type */
+        __int128_t u = (__int128_t) ((((__uint128_t) uh) << 64) | ul),
+                   v = (__int128_t) ((((__uint128_t) vh) << 64) | vl);
+
+        __int128_t r = u % v;
+
+        rl = r & 0xffffffffffffffff;
+        rh = (r >> 64) & 0xffffffffffffffff;
+#else
         /* Use unsigned divmod to build signed remainder */
         bool sgnu = (uh & 0x8000000000000000),
              sgnv = (vh & 0x8000000000000000);
@@ -290,6 +337,7 @@ void HELPER(irems128)(CPURISCVState *env, uint64_t rd,
         if (sgnu) {
             neg128(&rh, &rl);
         }
+#endif
     }
 
     if (rd != 0) {
-- 
2.33.0




* [PATCH 7/8] target/riscv: 128-bit support for some csrs
  2021-08-30 17:16 [PATCH 1/8] target/riscv: Settings for 128-bit extension support Frédéric Pétrot
                   ` (4 preceding siblings ...)
  2021-08-30 17:16 ` [PATCH 6/8] target/riscv: Support of compiler's 128-bit integer types Frédéric Pétrot
@ 2021-08-30 17:16 ` Frédéric Pétrot
  2021-08-31  3:43   ` Richard Henderson
  2021-08-30 17:16 ` [PATCH 8/8] target/riscv: Support for 128-bit satp Frédéric Pétrot
  2021-08-31  3:13 ` [PATCH 1/8] target/riscv: Settings for 128-bit extension support Alistair Francis
  7 siblings, 1 reply; 22+ messages in thread
From: Frédéric Pétrot @ 2021-08-30 17:16 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Bin Meng, Alistair Francis, Frédéric Pétrot,
	Palmer Dabbelt, Fabien Portas

Add 128-bit support for a minimal subset of the CSRs, so that it is
possible to boot, and to jump to and return from interrupts/exceptions,
using the csrrw instruction.
The (partially handled) 128-bit CSRs are the following:
csr_mhartid, csr_mstatus, csr_misa, csr_mtvec, csr_mscratch and csr_mepc.
When no 128-bit version of a CSR function is implemented, we fall back on
the 64-bit version, assuming the relevant information stands in the lower
double-word.
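
The fallback described above can be pictured with the following minimal
sketch. It is not the code of this patch: u128_pair, toy_state, toy_csrrw64
and toy_csrrw128_fallback are hypothetical names standing in for the UINT128
type and the CSR accessors, and a single 64-bit field stands in for a CSR
that has no 128-bit handler. Only the lower halves of the new value and of
the write mask reach the 64-bit accessor, and the value read back is
zero-extended to 128 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit value held as two 64-bit halves */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_pair;

/* Hypothetical CPU state holding one CSR that only exists in 64 bits */
typedef struct {
    uint64_t csr64;
} toy_state;

/* 64-bit masked read-modify-write; returns the value before the write */
static uint64_t toy_csrrw64(toy_state *s, uint64_t new_value,
                            uint64_t write_mask)
{
    uint64_t old = s->csr64;

    s->csr64 = (old & ~write_mask) | (new_value & write_mask);
    return old;
}

/*
 * 128-bit entry point without a dedicated handler: forward the lower
 * halves to the 64-bit accessor and zero-extend the old value.
 */
static u128_pair toy_csrrw128_fallback(toy_state *s, u128_pair new_value,
                                       u128_pair write_mask)
{
    u128_pair ret;

    assert(s != NULL);
    ret.lo = toy_csrrw64(s, new_value.lo, write_mask.lo);
    ret.hi = 0;   /* upper double-word is assumed to carry no information */
    return ret;
}
```

In this sketch any upper-half bits of the new value are silently dropped,
which matches the stated assumption that the relevant information stands in
the lower double-word.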

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
 target/riscv/cpu.h                      |  52 +++--
 target/riscv/cpu_bits.h                 |   2 +
 target/riscv/csr.c                      | 264 ++++++++++++++++++++++++
 target/riscv/helper.h                   |   7 +
 target/riscv/insn_trans/trans_rvi.c.inc |  76 +++++++
 target/riscv/op_helper.c                |  60 ++++++
 target/riscv/translate.c                |   4 +
 target/riscv/utils_128.h                | 173 ++++++++++++++++
 8 files changed, 625 insertions(+), 13 deletions(-)
 create mode 100644 target/riscv/utils_128.h

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 4321b03b94..0d18055e08 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -26,6 +26,7 @@
 #include "fpu/softfloat-types.h"
 #include "qom/object.h"
 
+#include "utils_128.h"
 #define TCG_GUEST_DEFAULT_MO 0
 
 #define TYPE_RISCV_CPU "riscv-cpu"
@@ -60,19 +61,6 @@
 #define RV64 ((target_ulong)2 << (TARGET_LONG_BITS - 2))
 /* To be used on misah, the upper part of misa */
 #define RV128 ((target_ulong)3 << (TARGET_LONG_BITS - 2))
-/*
- * Defined to force the use of tcg 128-bit arithmetic
- * if the compiler does not have a 128-bit built-in type
- */
-#define SOFT_128BIT
-/*
- * If available and not explicitly disabled,
- * use compiler's 128-bit integers.
- */
-#if defined(__SIZEOF_INT128__) && !defined(SOFT_128BIT)
-#define HARD_128BIT
-#endif
-
 
 #define RV(x) ((target_ulong)1 << (x - 'A'))
 
@@ -214,6 +202,11 @@ struct CPURISCVState {
     /* Upper 64-bits of 128-bit CSRs */
     uint64_t misah;
     uint64_t misah_mask;
+    uint64_t mtvech;
+    uint64_t mscratchh;
+    uint64_t mepch;
+    uint64_t satph;
+    uint64_t mstatush;
 #endif
 
     /* Virtual CSRs */
@@ -491,9 +484,20 @@ static inline void cpu_get_tb_cpu_state(CPURISCVState *env, target_ulong *pc,
     *pflags = flags;
 }
 
+#if defined(TARGET_RISCV128)
+RISCVException riscv_csrrw_check(CPURISCVState *env,
+                                 int csrno,
+                                 const UINT128 *write_mask,
+                                 RISCVCPU *cpu);
+#endif
 RISCVException riscv_csrrw(CPURISCVState *env, int csrno,
                            target_ulong *ret_value,
                            target_ulong new_value, target_ulong write_mask);
+#if defined(TARGET_RISCV128)
+RISCVException riscv_csrrw_128(CPURISCVState *env, int csrno,
+                                UINT128 *ret_value,
+                                UINT128 new_value, UINT128 write_mask);
+#endif
 RISCVException riscv_csrrw_debug(CPURISCVState *env, int csrno,
                                  target_ulong *ret_value,
                                  target_ulong new_value,
@@ -523,6 +527,17 @@ typedef RISCVException (*riscv_csr_op_fn)(CPURISCVState *env, int csrno,
                                           target_ulong new_value,
                                           target_ulong write_mask);
 
+#if defined(TARGET_RISCV128)
+typedef RISCVException (*riscv_csr_read128_fn)(CPURISCVState *env, int csrno,
+                                               UINT128 *ret_value);
+typedef RISCVException (*riscv_csr_write128_fn)(CPURISCVState *env, int csrno,
+                                             UINT128 new_value);
+typedef RISCVException (*riscv_csr_op128_fn)(CPURISCVState *env, int csrno,
+                                             UINT128 *ret_value,
+                                             UINT128 new_value,
+                                             UINT128 write_mask);
+#endif
+
 typedef struct {
     const char *name;
     riscv_csr_predicate_fn predicate;
@@ -531,6 +546,14 @@ typedef struct {
     riscv_csr_op_fn op;
 } riscv_csr_operations;
 
+#if defined(TARGET_RISCV128)
+typedef struct {
+    riscv_csr_read128_fn read128;
+    riscv_csr_write128_fn write128;
+    riscv_csr_op128_fn op128;
+} riscv_csr_operations128;
+#endif
+
 /* CSR function table constants */
 enum {
     CSR_TABLE_SIZE = 0x1000
@@ -538,6 +561,9 @@ enum {
 
 /* CSR function table */
 extern riscv_csr_operations csr_ops[CSR_TABLE_SIZE];
+#if defined(TARGET_RISCV128)
+extern riscv_csr_operations128 csr_ops_128[CSR_TABLE_SIZE];
+#endif
 
 void riscv_get_csr_ops(int csrno, riscv_csr_operations *ops);
 void riscv_set_csr_ops(int csrno, riscv_csr_operations *ops);
diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index 7330ff5a19..901f0e890a 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -361,6 +361,8 @@
 #define MSTATUS32_SD        0x80000000
 #define MSTATUS64_SD        0x8000000000000000ULL
 
+#define MSTATUSH128_SD      0x8000000000000000ULL
+
 #define MISA32_MXL          0xC0000000
 #define MISA64_MXL          0xC000000000000000ULL
 
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 9a4ed18ac5..c3471a1365 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -462,6 +462,14 @@ static const char valid_vm_1_10_64[16] = {
 };
 
 /* Machine Information Registers */
+#if defined(TARGET_RISCV128)
+static RISCVException read_zero_128(CPURISCVState *env, int csrno,
+                                    UINT128 *val)
+{
+    *val = u128_zero();
+    return RISCV_EXCP_NONE;
+}
+#endif
 static RISCVException read_zero(CPURISCVState *env, int csrno,
                                 target_ulong *val)
 {
@@ -469,6 +477,14 @@ static RISCVException read_zero(CPURISCVState *env, int csrno,
     return RISCV_EXCP_NONE;
 }
 
+#if defined(TARGET_RISCV128)
+static RISCVException read_mhartid_128(CPURISCVState *env, int csrno,
+                                       UINT128 *val)
+{
+    *val = u128_from64(env->mhartid);
+    return RISCV_EXCP_NONE;
+}
+#endif
 static RISCVException read_mhartid(CPURISCVState *env, int csrno,
                                    target_ulong *val)
 {
@@ -477,6 +493,61 @@ static RISCVException read_mhartid(CPURISCVState *env, int csrno,
 }
 
 /* Machine Trap Setup */
+#if defined(TARGET_RISCV128)
+static RISCVException read_mstatus_128(CPURISCVState *env, int csrno,
+                                   UINT128 *val)
+{
+    *val = u128_from_pair(env->mstatus, env->mstatush);
+    return RISCV_EXCP_NONE;
+}
+
+static RISCVException write_mstatus_128(CPURISCVState *env, int csrno,
+                                        UINT128 val)
+{
+    UINT128 mstatus = u128_from_pair(env->mstatus, env->mstatush);
+    UINT128 mask = u128_zero();
+    int dirty;
+
+    /* flush tlb on mstatus fields that affect VM */
+    if (u128_lo64(u128_xor(mstatus, val))
+            & (MSTATUS_MXR | MSTATUS_MPP | MSTATUS_MPV |
+                           MSTATUS_MPRV | MSTATUS_SUM)) {
+        tlb_flush(env_cpu(env));
+    }
+    mask = u128_from64(MSTATUS_SIE | MSTATUS_SPIE | MSTATUS_MIE | MSTATUS_MPIE |
+                       MSTATUS_SPP | MSTATUS_FS | MSTATUS_MPRV | MSTATUS_SUM |
+                       MSTATUS_MPP | MSTATUS_MXR | MSTATUS_TVM | MSTATUS_TSR |
+                       MSTATUS_TW);
+
+    if (!riscv_cpu_is_32bit(env)) {
+        /*
+         * RV32: MPV and GVA are not in mstatus. The current plan is to
+         * add them to mstatush. For now, we just don't support it.
+         */
+        mask = u128_or(mask, u128_from64(MSTATUS_MPV | MSTATUS_GVA));
+    }
+
+    mstatus = u128_or(u128_and(mstatus, u128_not(mask)), u128_and(val, mask));
+
+    dirty = ((u128_get_lo64(&mstatus) & MSTATUS_FS) == MSTATUS_FS) |
+            ((u128_get_lo64(&mstatus) & MSTATUS_XS) == MSTATUS_XS);
+    if (dirty) {
+        if (riscv_cpu_is_32bit(env)) {
+            mstatus = u128_from64(u128_get_lo64(&mstatus) | MSTATUS32_SD);
+        } else if (riscv_cpu_is_64bit(env)) {
+            mstatus = u128_from64(u128_get_lo64(&mstatus) | MSTATUS64_SD);
+        } else {
+            mstatus = u128_or(mstatus, u128_from_pair(0, MSTATUSH128_SD));
+        }
+    }
+
+    env->mstatus = u128_get_lo64(&mstatus);
+    env->mstatush = u128_get_hi64(&mstatus);
+
+    return RISCV_EXCP_NONE;
+}
+#endif
+
 static RISCVException read_mstatus(CPURISCVState *env, int csrno,
                                    target_ulong *val)
 {
@@ -554,6 +625,15 @@ static RISCVException write_mstatush(CPURISCVState *env, int csrno,
     return RISCV_EXCP_NONE;
 }
 
+#if defined(TARGET_RISCV128)
+static RISCVException read_misa_128(CPURISCVState *env, int csrno,
+                                    UINT128 *val)
+{
+    *val = u128_from_pair(env->misa, env->misah);
+    return RISCV_EXCP_NONE;
+}
+#endif
+
 static RISCVException read_misa(CPURISCVState *env, int csrno,
                                 target_ulong *val)
 {
@@ -663,6 +743,27 @@ static RISCVException write_mie(CPURISCVState *env, int csrno,
     return RISCV_EXCP_NONE;
 }
 
+#if defined(TARGET_RISCV128)
+static RISCVException read_mtvec_128(CPURISCVState *env, int csrno,
+                                     UINT128 *val)
+{
+    *val = u128_from_pair(env->mtvec, env->mtvech);
+    return RISCV_EXCP_NONE;
+}
+
+static RISCVException write_mtvec_128(CPURISCVState *env, int csrno,
+                                      UINT128 val)
+{
+    /* bits [1:0] encode mode; 0 = direct, 1 = vectored, >= 2 reserved */
+    if ((u128_get_lo64(&val) & 3) < 2) {
+        env->mtvec = u128_get_lo64(&val);
+        env->mtvech = u128_get_hi64(&val);
+    } else {
+        qemu_log_mask(LOG_UNIMP, "CSR_MTVEC: reserved mode not supported\n");
+    }
+    return RISCV_EXCP_NONE;
+}
+#endif
 static RISCVException read_mtvec(CPURISCVState *env, int csrno,
                                  target_ulong *val)
 {
@@ -697,6 +798,20 @@ static RISCVException write_mcounteren(CPURISCVState *env, int csrno,
 }
 
 /* Machine Trap Handling */
+#if defined(TARGET_RISCV128)
+static RISCVException read_mscratch_128(CPURISCVState *env, int csrno,
+                                        UINT128 *val)  {
+    *val = u128_from_pair(env->mscratch, env->mscratchh);
+    return RISCV_EXCP_NONE;
+}
+
+static RISCVException write_mscratch_128(CPURISCVState *env, int csrno,
+                                         UINT128 val) {
+    env->mscratch = u128_get_lo64(&val);
+    env->mscratchh = u128_get_hi64(&val);
+    return RISCV_EXCP_NONE;
+}
+#endif
 static RISCVException read_mscratch(CPURISCVState *env, int csrno,
                                     target_ulong *val)
 {
@@ -711,6 +826,23 @@ static RISCVException write_mscratch(CPURISCVState *env, int csrno,
     return RISCV_EXCP_NONE;
 }
 
+#if defined(TARGET_RISCV128)
+static RISCVException read_mepc_128(CPURISCVState *env, int csrno,
+                                    UINT128 *val)
+{
+    *val = u128_from_pair(env->mepc, env->mepch);
+    return RISCV_EXCP_NONE;
+}
+
+static RISCVException write_mepc_128(CPURISCVState *env, int csrno,
+                                     UINT128 val)
+{
+    env->mepc = u128_get_lo64(&val);
+    env->mepch = u128_get_hi64(&val);
+    return RISCV_EXCP_NONE;
+}
+#endif
+
 static RISCVException read_mepc(CPURISCVState *env, int csrno,
                                      target_ulong *val)
 {
@@ -1493,6 +1625,120 @@ RISCVException riscv_csrrw(CPURISCVState *env, int csrno,
     return RISCV_EXCP_NONE;
 }
 
+#if defined(TARGET_RISCV128)
+static inline RISCVException riscv_csrrw_128_check(CPURISCVState *env,
+                                 int csrno,
+                                 const UINT128 *write_mask,
+                                 RISCVCPU *cpu) {
+    /* check privileges, return RISCV_EXCP_ILLEGAL_INST on failure */
+#if !defined(CONFIG_USER_ONLY)
+    int effective_priv = env->priv;
+    int read_only = get_field(csrno, 0xC00) == 3;
+
+    if (riscv_has_ext(env, RVH) &&
+        env->priv == PRV_S &&
+        !riscv_cpu_virt_enabled(env)) {
+        /*
+         * We are in S mode without virtualisation, therefore we are in HS Mode.
+         * Add 1 to the effective privilege level to allow us to access the
+         * Hypervisor CSRs.
+         */
+        effective_priv++;
+    }
+
+    if ((u128_is_nonzero(write_mask) && read_only) ||
+        (!env->debugger && (effective_priv < get_field(csrno, 0x300)))) {
+        return RISCV_EXCP_ILLEGAL_INST;
+    }
+#endif
+
+    /* ensure the CSR extension is enabled. */
+    if (!cpu->cfg.ext_icsr) {
+        return RISCV_EXCP_ILLEGAL_INST;
+    }
+
+    /* check predicate */
+    if (!csr_ops[csrno].predicate) {
+        return RISCV_EXCP_ILLEGAL_INST;
+    }
+    RISCVException ret = csr_ops[csrno].predicate(env, csrno);
+    if (ret != RISCV_EXCP_NONE) {
+        return ret;
+    }
+
+    return RISCV_EXCP_NONE;
+}
+
+RISCVException riscv_csrrw_128(CPURISCVState *env, int csrno,
+                               UINT128 *ret_value,
+                               UINT128 new_value, UINT128 write_mask) {
+    RISCVException ret;
+    UINT128 old_value;
+
+    RISCVCPU *cpu = env_archcpu(env);
+
+    if (!csr_ops_128[csrno].read128 && !csr_ops_128[csrno].op128) {
+        /*
+         * FIXME: Fall back to 64-bit version for now, if the 128-bit
+         * alternative isn't defined.
+         * Note, some CSRs don't extend to MXLEN, for those,
+         * this fallback is correctly handling the read/write.
+         */
+        target_ulong ret_64;
+        ret = riscv_csrrw(env, csrno, &ret_64,
+                          u128_get_lo64(&new_value),
+                          u128_get_lo64(&write_mask));
+
+        if (ret_value) {
+            *ret_value = u128_from64(ret_64);
+        }
+
+        return ret;
+    }
+
+    RISCVException check_status =
+        riscv_csrrw_128_check(env, csrno, &write_mask, cpu);
+    if (check_status != RISCV_EXCP_NONE) {
+        return check_status;
+    }
+
+    /* execute combined read/write operation if it exists */
+    if (csr_ops_128[csrno].op128) {
+        return csr_ops_128[csrno].op128(env, csrno, ret_value,
+                                        new_value, write_mask);
+    }
+
+    /* if no accessor exists then return failure */
+    if (!csr_ops_128[csrno].read128) {
+        return RISCV_EXCP_ILLEGAL_INST;
+    }
+    /* read old value */
+    ret = csr_ops_128[csrno].read128(env, csrno, &old_value);
+    if (ret != RISCV_EXCP_NONE) {
+        return ret;
+    }
+
+    /* write value if writable and write mask set, otherwise drop writes */
+    if (u128_is_nonzero(&write_mask)) {
+        new_value = u128_or(u128_and(old_value, u128_not(write_mask)),
+                            u128_and(new_value, write_mask));
+        if (csr_ops_128[csrno].write128) {
+            ret = csr_ops_128[csrno].write128(env, csrno, new_value);
+            if (ret != RISCV_EXCP_NONE) {
+                return ret;
+            }
+        }
+    }
+
+    /* return old value */
+    if (ret_value) {
+        *ret_value = old_value;
+    }
+
+    return RISCV_EXCP_NONE;
+}
+#endif
+
 /*
  * Debugger support.  If not in user mode, set env->debugger before the
  * riscv_csrrw call and clear it after the call.
@@ -1514,6 +1760,24 @@ RISCVException riscv_csrrw_debug(CPURISCVState *env, int csrno,
 }
 
 /* Control and Status Register function table */
+#if defined(TARGET_RISCV128)
+riscv_csr_operations128 csr_ops_128[CSR_TABLE_SIZE] = {
+#if !defined(CONFIG_USER_ONLY)
+    [CSR_MVENDORID]  = { read_zero_128    },
+    [CSR_MARCHID]    = { read_zero_128    },
+    [CSR_MIMPID]     = { read_zero_128    },
+    [CSR_MHARTID]    = { read_mhartid_128 },
+
+    [CSR_MSTATUS]    = { read_mstatus_128,  write_mstatus_128 },
+    [CSR_MISA]       = { read_misa_128    },
+    [CSR_MTVEC]      = { read_mtvec_128,    write_mtvec_128   },
+
+    [CSR_MSCRATCH]   = { read_mscratch_128, write_mscratch_128},
+    [CSR_MEPC]       = { read_mepc_128,     write_mepc_128    },
+#endif
+};
+#endif
+
 riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
     /* User Floating-Point CSRs */
     [CSR_FFLAGS]   = { "fflags",   fs,     read_fflags,  write_fflags },
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index f3aed608dc..e3eb1dfe59 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -68,6 +68,13 @@ DEF_HELPER_FLAGS_2(gorcw, TCG_CALL_NO_RWG_SE, tl, tl, tl)
 DEF_HELPER_3(csrrw, tl, env, tl, tl)
 DEF_HELPER_4(csrrs, tl, env, tl, tl, tl)
 DEF_HELPER_4(csrrc, tl, env, tl, tl, tl)
+
+#ifdef TARGET_RISCV128
+DEF_HELPER_5(csrrw_128, void, env, tl, tl, tl, tl)
+DEF_HELPER_5(csrrs_128, void, env, tl, tl, tl, tl)
+DEF_HELPER_5(csrrc_128, void, env, tl, tl, tl, tl)
+#endif
+
 #ifndef CONFIG_USER_ONLY
 DEF_HELPER_2(sret, tl, env, tl)
 DEF_HELPER_2(mret, tl, env, tl)
diff --git a/target/riscv/insn_trans/trans_rvi.c.inc b/target/riscv/insn_trans/trans_rvi.c.inc
index 0401ba3d69..2c8041ba15 100644
--- a/target/riscv/insn_trans/trans_rvi.c.inc
+++ b/target/riscv/insn_trans/trans_rvi.c.inc
@@ -1543,6 +1543,29 @@ static bool trans_fence_i(DisasContext *ctx, arg_fence_i *a)
     gen_io_start();\
 } while (0)
 
+#if defined(TARGET_RISCV128)
+#define RISCV_OP_CSR128_PRE do { \
+    source1_lo = tcg_temp_new(); \
+    source1_hi = tcg_temp_new(); \
+    csr_store = tcg_temp_new();  \
+    rd = tcg_const_tl(a->rd); \
+    gen_get_gpr(source1_lo, a->rs1); \
+    gen_get_gprh(source1_hi, a->rs1); \
+    tcg_gen_movi_tl(cpu_pc, ctx->base.pc_next); \
+    tcg_gen_movi_tl(csr_store, a->csr); \
+    gen_io_start(); \
+} while (0)
+
+#define RISCV_OP_CSRI128_PRE do { \
+    source1_lo = tcg_const_tl(a->rs1); \
+    source1_hi = tcg_const_tl(0); \
+    csr_store = tcg_const_tl(a->csr); \
+    rd = tcg_const_tl(a->rd); \
+    tcg_gen_movi_tl(cpu_pc, ctx->base.pc_next); \
+    gen_io_start(); \
+} while (0)
+#endif
+
 #define RISCV_OP_CSR_POST do {\
     gen_set_gpr(a->rd, dest); \
     tcg_gen_movi_tl(cpu_pc, ctx->pc_succ_insn); \
@@ -1554,57 +1577,110 @@ static bool trans_fence_i(DisasContext *ctx, arg_fence_i *a)
     tcg_temp_free(rs1_pass); \
 } while (0)
 
+#if defined(TARGET_RISCV128)
+#define RISCV_OP_CSR128_POST do {\
+    tcg_gen_movi_tl(cpu_pc, ctx->pc_succ_insn); \
+    exit_tb(ctx); \
+    ctx->base.is_jmp = DISAS_NORETURN; \
+    tcg_temp_free(source1_hi); \
+    tcg_temp_free(source1_lo); \
+    tcg_temp_free(csr_store); \
+    tcg_temp_free(rd); \
+} while (0)
+#endif
 
 static bool trans_csrrw(DisasContext *ctx, arg_csrrw *a)
 {
+#if !defined(TARGET_RISCV128)
     TCGv source1, csr_store, dest, rs1_pass;
     RISCV_OP_CSR_PRE;
     gen_helper_csrrw(dest, cpu_env, source1, csr_store);
     RISCV_OP_CSR_POST;
+#else
+    TCGv csr_store, source1_lo, source1_hi, rd;
+    RISCV_OP_CSR128_PRE;
+    gen_helper_csrrw_128(cpu_env, rd, csr_store, source1_lo, source1_hi);
+    RISCV_OP_CSR128_POST;
+#endif
     return true;
 }
 
 static bool trans_csrrs(DisasContext *ctx, arg_csrrs *a)
 {
+#if !defined(TARGET_RISCV128)
     TCGv source1, csr_store, dest, rs1_pass;
     RISCV_OP_CSR_PRE;
     gen_helper_csrrs(dest, cpu_env, source1, csr_store, rs1_pass);
     RISCV_OP_CSR_POST;
+#else
+    TCGv csr_store, source1_lo, source1_hi, rd;
+    RISCV_OP_CSR128_PRE;
+    gen_helper_csrrs_128(cpu_env, rd, csr_store, source1_lo, source1_hi);
+    RISCV_OP_CSR128_POST;
+#endif
     return true;
 }
 
 static bool trans_csrrc(DisasContext *ctx, arg_csrrc *a)
 {
+#if !defined(TARGET_RISCV128)
     TCGv source1, csr_store, dest, rs1_pass;
     RISCV_OP_CSR_PRE;
     gen_helper_csrrc(dest, cpu_env, source1, csr_store, rs1_pass);
     RISCV_OP_CSR_POST;
+#else
+    TCGv csr_store, source1_lo, source1_hi, rd;
+    RISCV_OP_CSR128_PRE;
+    gen_helper_csrrc_128(cpu_env, rd, csr_store, source1_lo, source1_hi);
+    RISCV_OP_CSR128_POST;
+#endif
     return true;
 }
 
 static bool trans_csrrwi(DisasContext *ctx, arg_csrrwi *a)
 {
+#if !defined(TARGET_RISCV128)
     TCGv source1, csr_store, dest, rs1_pass;
     RISCV_OP_CSR_PRE;
     gen_helper_csrrw(dest, cpu_env, rs1_pass, csr_store);
     RISCV_OP_CSR_POST;
+#else
+    TCGv csr_store, source1_lo, source1_hi, rd;
+    RISCV_OP_CSRI128_PRE;
+    gen_helper_csrrw_128(cpu_env, rd, csr_store, source1_lo, source1_hi);
+    RISCV_OP_CSR128_POST;
+#endif
     return true;
 }
 
 static bool trans_csrrsi(DisasContext *ctx, arg_csrrsi *a)
 {
+#if !defined(TARGET_RISCV128)
     TCGv source1, csr_store, dest, rs1_pass;
     RISCV_OP_CSR_PRE;
     gen_helper_csrrs(dest, cpu_env, rs1_pass, csr_store, rs1_pass);
     RISCV_OP_CSR_POST;
+#else
+    TCGv csr_store, source1_lo, source1_hi, rd;
+    RISCV_OP_CSRI128_PRE;
+    gen_helper_csrrs_128(cpu_env, rd, csr_store, source1_lo, source1_hi);
+    RISCV_OP_CSR128_POST;
+#endif
     return true;
 }
 
 static bool trans_csrrci(DisasContext *ctx, arg_csrrci *a)
 {
+#if !defined(TARGET_RISCV128)
     TCGv source1, csr_store, dest, rs1_pass;
     RISCV_OP_CSR_PRE;
     gen_helper_csrrc(dest, cpu_env, rs1_pass, csr_store, rs1_pass);
     RISCV_OP_CSR_POST;
+#else
+    TCGv csr_store, source1_lo, source1_hi, rd;
+    RISCV_OP_CSRI128_PRE;
+    gen_helper_csrrc_128(cpu_env, rd, csr_store, source1_lo, source1_hi);
+    RISCV_OP_CSR128_POST;
+#endif
     return true;
 }
diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c
index 3c48e739ac..de3f4a2a61 100644
--- a/target/riscv/op_helper.c
+++ b/target/riscv/op_helper.c
@@ -23,6 +23,8 @@
 #include "exec/exec-all.h"
 #include "exec/helper-proto.h"
 
+#include "utils_128.h"
+
 /* Exceptions processing helpers */
 void QEMU_NORETURN riscv_raise_exception(CPURISCVState *env,
                                           uint32_t exception, uintptr_t pc)
@@ -73,6 +75,64 @@ target_ulong helper_csrrc(CPURISCVState *env, target_ulong src,
     return val;
 }
 
+#if defined(TARGET_RISCV128)
+void HELPER(csrrw_128)(CPURISCVState *env, target_ulong rd, target_ulong csrno,
+                       target_ulong src_l, target_ulong src_h)
+{
+    UINT128 ret_value = u128_zero();
+    RISCVException ret = riscv_csrrw_128(env, csrno, &ret_value,
+                                         u128_from_pair(src_l, src_h),
+                                         u128_maxval());
+
+    if (ret != RISCV_EXCP_NONE) {
+        riscv_raise_exception(env, ret, GETPC());
+    }
+
+    if (rd != 0) {
+        env->gpr[rd] = u128_get_lo64(&ret_value);
+        if (riscv_cpu_is_128bit(env)) {
+            env->gprh[rd] = u128_get_hi64(&ret_value);
+        }
+    }
+}
+
+void HELPER(csrrs_128)(CPURISCVState *env, target_ulong rd, target_ulong csrno,
+                       target_ulong src_l, target_ulong src_h)
+{
+    UINT128 ret_value = u128_zero();
+    RISCVException ret = riscv_csrrw_128(env, csrno, &ret_value,
+                                         u128_maxval(),
+                                         u128_from_pair(src_l, src_h));
+
+    if (ret != RISCV_EXCP_NONE) {
+        riscv_raise_exception(env, ret, GETPC());
+    }
+
+    if (rd != 0) {
+        env->gpr[rd] = u128_get_lo64(&ret_value);
+        env->gprh[rd] = u128_get_hi64(&ret_value);
+    }
+}
+
+void HELPER(csrrc_128)(CPURISCVState *env, target_ulong rd, target_ulong csrno,
+                       target_ulong src_l, target_ulong src_h)
+{
+    UINT128 ret_value = u128_zero();
+    RISCVException ret = riscv_csrrw_128(env, csrno, &ret_value,
+                                         u128_zero(),
+                                         u128_from_pair(src_l, src_h));
+
+    if (ret != RISCV_EXCP_NONE) {
+        riscv_raise_exception(env, ret, GETPC());
+    }
+
+    if (rd != 0) {
+        env->gpr[rd] = u128_get_lo64(&ret_value);
+        env->gprh[rd] = u128_get_hi64(&ret_value);
+    }
+}
+#endif
+
 #ifndef CONFIG_USER_ONLY
 
 target_ulong helper_sret(CPURISCVState *env, target_ulong cpu_pc_deb)
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 7d447bd225..5d0da1ce39 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -31,6 +31,10 @@
 
 #include "instmap.h"
 
+#if defined(TARGET_RISCV128)
+#include "utils_128.h"
+#endif
+
 /* global register indices */
 static TCGv cpu_gpr[32], cpu_pc, cpu_vl;
 #if defined(TARGET_RISCV128)
diff --git a/target/riscv/utils_128.h b/target/riscv/utils_128.h
new file mode 100644
index 0000000000..b149597fa1
--- /dev/null
+++ b/target/riscv/utils_128.h
@@ -0,0 +1,173 @@
+#ifndef QEMU_128_UTILS_128_H
+#define QEMU_128_UTILS_128_H
+
+#include <stdint.h>
+#include <stdbool.h>
+
+#include "qemu/osdep.h"
+
+/*
+ * Defined to force the use of "software" 128-bit arithmetic
+ * (instead of the compiler's built-in type)
+ */
+#define SOFT_128BIT
+/*
+ * If available and not explicitly disabled,
+ * use compiler's 128-bit integers.
+ */
+#if defined(__SIZEOF_INT128__) && !defined(SOFT_128BIT)
+#define HARD_128BIT
+#endif
+
+/*
+ * Define a UINT128 type that will be either a built-in type
+ * or a struct packing a pair of 64-bit values
+ * (lo holds bits 63..0, hi holds bits 127..64).
+ */
+#if defined(HARD_128BIT)
+#define UINT128 __uint128_t
+#define INT128 __int128_t
+#else
+typedef struct { uint64_t lo; uint64_t hi; } type_uint128;
+#define UINT128 type_uint128
+#endif
+
+/* Assignment operator for UINT128 */
+static inline void u128_assign(UINT128 *target, const UINT128 *val)
+{
+#if defined(HARD_128BIT)
+    *target = *val;
+#else
+    target->lo = val->lo;
+    target->hi = val->hi;
+#endif
+}
+
+static inline bool u128_is_nonzero(const UINT128 *val)
+{
+#if defined(HARD_128BIT)
+    return (*val) != 0;
+#else
+    return val->hi != 0 || val->lo != 0;
+#endif
+}
+
+static inline UINT128 u128_from_pair(uint64_t lo, uint64_t hi)
+{
+#if defined(HARD_128BIT)
+    return (((UINT128) hi) << 64) | lo;
+#else
+    return (UINT128) {lo, hi};
+#endif
+}
+
+/* Zero-extends a 64-bit value to a 128-bit one */
+static inline UINT128 u128_from64(uint64_t val)
+{
+    return u128_from_pair(val, 0);
+}
+
+static inline uint64_t u128_get_lo64(const UINT128 *val)
+{
+#if defined(HARD_128BIT)
+    return (*val) & 0xffffffffffffffff;
+#else
+    return val->lo;
+#endif
+}
+
+static inline uint64_t u128_get_hi64(const UINT128 *val)
+{
+#if defined(HARD_128BIT)
+    return ((*val) >> 64) & 0xffffffffffffffff;
+#else
+    return val->hi;
+#endif
+}
+
+/* Equivalents to u128_get_[lo/hi]64, but taking struct on stack */
+static inline uint64_t u128_lo64(const UINT128 val)
+{
+#if defined(HARD_128BIT)
+    return val & 0xffffffffffffffff;
+#else
+    return val.lo;
+#endif
+}
+
+static inline uint64_t u128_hi64(const UINT128 val)
+{
+#if defined(HARD_128BIT)
+    return (val >> 64) & 0xffffffffffffffff;
+#else
+    return val.hi;
+#endif
+}
+
+/* Bitwise logic operations needed to access csrs */
+static inline UINT128 u128_or(UINT128 a, UINT128 b)
+{
+#if defined(HARD_128BIT)
+    return a | b;
+#else
+    return (UINT128) {a.lo | b.lo, a.hi | b.hi};
+#endif
+}
+
+static inline UINT128 u128_and(UINT128 a, UINT128 b)
+{
+#if defined(HARD_128BIT)
+    return a & b;
+#else
+    return (UINT128) {a.lo & b.lo, a.hi & b.hi};
+#endif
+}
+
+static inline UINT128 u128_xor(UINT128 a, UINT128 b)
+{
+#if defined(HARD_128BIT)
+    return a ^ b;
+#else
+    return (UINT128) {a.lo ^ b.lo, a.hi ^ b.hi};
+#endif
+}
+
+static inline UINT128 u128_not(UINT128 a)
+{
+#if defined(HARD_128BIT)
+    return ~a;
+#else
+    return (UINT128) {~a.lo, ~a.hi};
+#endif
+}
+
+/* Static constants, should be easily inlined by compiler */
+static inline UINT128 u128_zero(void)
+{
+#if defined(HARD_128BIT)
+        return 0;
+#else
+        return (UINT128) {0, 0};
+#endif
+}
+
+static inline UINT128 u128_one(void)
+{
+#if defined(HARD_128BIT)
+    return 1;
+#else
+    return (UINT128) {1, 0};
+#endif
+}
+
+static inline UINT128 u128_maxval(void)
+{
+#if defined(HARD_128BIT)
+    UINT128 val = 0xffffffffffffffff;
+    return (val << 64) | 0xffffffffffffffff;
+#else
+    return (UINT128) {0xffffffffffffffff, 0xffffffffffffffff};
+#endif
+}
+
+#endif
-- 
2.33.0
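
A note on the three helpers added to op_helper.c in this patch: they differ
only in the (value, mask) pair they hand to riscv_csrrw_128(). Assuming that
function applies the usual read-modify-write rule
new = (old & ~mask) | (value & mask), which is not shown in full in this
excerpt, the three pairs reduce to plain write, bit set and bit clear. The
sketch below is a standalone illustration only; u128_pair, pair_make, csr_rmw
and the csr_* wrappers are hypothetical names and not part of the series.

```c
#include <stdint.h>

/* Hypothetical stand-in for the struct-based UINT128 of utils_128.h. */
typedef struct { uint64_t lo; uint64_t hi; } u128_pair;

static u128_pair pair_make(uint64_t lo, uint64_t hi)
{
    u128_pair r = { lo, hi };
    return r;
}

/*
 * Assumed read-modify-write rule: bits selected by mask are taken from
 * value, every other bit keeps its old contents.
 */
static u128_pair csr_rmw(u128_pair old, u128_pair value, u128_pair mask)
{
    u128_pair r;
    r.lo = (old.lo & ~mask.lo) | (value.lo & mask.lo);
    r.hi = (old.hi & ~mask.hi) | (value.hi & mask.hi);
    return r;
}

/* csrrw: value = src, mask = all ones, so the CSR is overwritten. */
static u128_pair csr_write(u128_pair old, u128_pair src)
{
    return csr_rmw(old, src, pair_make(UINT64_MAX, UINT64_MAX));
}

/* csrrs: value = all ones, mask = src, so the bits set in src are set. */
static u128_pair csr_set(u128_pair old, u128_pair src)
{
    return csr_rmw(old, pair_make(UINT64_MAX, UINT64_MAX), src);
}

/* csrrc: value = zero, mask = src, so the bits set in src are cleared. */
static u128_pair csr_clear(u128_pair old, u128_pair src)
{
    return csr_rmw(old, pair_make(0, 0), src);
}
```

Because each half is processed independently, the same rule works unchanged
whether UINT128 is the struct or the compiler's built-in type.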




* [PATCH 8/8] target/riscv: Support for 128-bit satp
  2021-08-30 17:16 [PATCH 1/8] target/riscv: Settings for 128-bit extension support Frédéric Pétrot
                   ` (5 preceding siblings ...)
  2021-08-30 17:16 ` [PATCH 7/8] target/riscv: 128-bit support for some csrs Frédéric Pétrot
@ 2021-08-30 17:16 ` Frédéric Pétrot
  2021-08-31  3:13 ` [PATCH 1/8] target/riscv: Settings for 128-bit extension support Alistair Francis
  7 siblings, 0 replies; 22+ messages in thread
From: Frédéric Pétrot @ 2021-08-30 17:16 UTC (permalink / raw)
  To: qemu-devel, qemu-riscv
  Cc: Bin Meng, Alistair Francis, Frédéric Pétrot,
	Palmer Dabbelt, Fabien Portas

Addition of a 128-bit satp to support memory translation.
We propose two new virtual memory schemes for targets with 128-bit addresses.
These schemes, sv44 and sv54, are natural extensions of the sv39 and sv48
schemes, but with 16 KiB pages and 16 KiB page tables.
The theoretical physically addressable space is 68 bits, truncated by the
implementation to 64, as it assumes the upper 64 bits of the address are
zeroed for compatibility with the rest of the translation process.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
---
 target/riscv/cpu-param.h  | 11 ++++-
 target/riscv/cpu_bits.h   | 10 +++++
 target/riscv/cpu_helper.c | 56 ++++++++++++++++++-------
 target/riscv/csr.c        | 87 ++++++++++++++++++++++++++++++++++++---
 4 files changed, 142 insertions(+), 22 deletions(-)
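
For readers checking the numbers in the description, the names sv44 and sv54
follow from the parameters used in the get_physical_address() hunk below:
virtual address width = pgshift + levels * ptidxbits, and one table occupies
ptesize << ptidxbits bytes. The following is a minimal standalone sketch of
that arithmetic; struct vm_scheme and the helper names are hypothetical, and
the "widened" root index of two-stage translation is left out.

```c
#include <stdint.h>

/* Hypothetical description of one translation scheme. */
struct vm_scheme {
    int levels;     /* number of page-table levels */
    int ptidxbits;  /* virtual address bits consumed per level */
    int ptesize;    /* bytes per page-table entry */
    int pgshift;    /* log2 of the page size */
};

/* Values copied from the switch (vm) in get_physical_address(). */
static const struct vm_scheme sv39 = { 3, 9, 8, 12 };
static const struct vm_scheme sv44 = { 3, 10, 16, 14 };
static const struct vm_scheme sv54 = { 4, 10, 16, 14 };

/* Width of the translated virtual address. */
static int va_bits(const struct vm_scheme *s)
{
    return s->pgshift + s->levels * s->ptidxbits;
}

/* Size in bytes of one page table. */
static uint64_t table_bytes(const struct vm_scheme *s)
{
    return (uint64_t)s->ptesize << s->ptidxbits;
}

/* Table index used at walk step i, where i = 0 is the root table. */
static uint64_t vpn_index(const struct vm_scheme *s, uint64_t addr, int i)
{
    int ptshift = (s->levels - 1 - i) * s->ptidxbits;
    return (addr >> (s->pgshift + ptshift)) & ((1ULL << s->ptidxbits) - 1);
}
```

With these values a table of 1024 entries of 16 bytes fills exactly one
16 KiB page, the same one-table-per-page property that sv39 has with 512
entries of 8 bytes in a 4 KiB page.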

diff --git a/target/riscv/cpu-param.h b/target/riscv/cpu-param.h
index e6d0651f60..bcdd1b0a68 100644
--- a/target/riscv/cpu-param.h
+++ b/target/riscv/cpu-param.h
@@ -9,7 +9,11 @@
 #define RISCV_CPU_PARAM_H 1
 
 /* 64-bit target, since QEMU isn't built to have TARGET_LONG_BITS over 64 */
-#if defined(TARGET_RISCV64) || defined(TARGET_RISCV128)
+#if defined(TARGET_RISCV128)
+# define TARGET_LONG_BITS 64
+# define TARGET_PHYS_ADDR_SPACE_BITS 64 /* 54-bit PPN */
+# define TARGET_VIRT_ADDR_SPACE_BITS 44 /* sv44 */
+#elif defined(TARGET_RISCV64)
 # define TARGET_LONG_BITS 64
 # define TARGET_PHYS_ADDR_SPACE_BITS 56 /* 44-bit PPN */
 # define TARGET_VIRT_ADDR_SPACE_BITS 48 /* sv48 */
@@ -18,7 +22,12 @@
 # define TARGET_PHYS_ADDR_SPACE_BITS 34 /* 22-bit PPN */
 # define TARGET_VIRT_ADDR_SPACE_BITS 32 /* sv32 */
 #endif
+
+#if defined(TARGET_RISCV128)
+#define TARGET_PAGE_BITS 14 /* Let us choose 16 KiB pages for RV128 */
+#else
 #define TARGET_PAGE_BITS 12 /* 4 KiB Pages */
+#endif
 /*
  * The current MMU Modes are:
  *  - U mode 0b000
diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index 901f0e890a..3f2b3c3b34 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -429,6 +429,11 @@
 #define SATP64_ASID         0x0FFFF00000000000ULL
 #define SATP64_PPN          0x00000FFFFFFFFFFFULL
 
+/* RV128 satp CSR field masks (H/L for high/low dword) */
+#define SATP128_HMODE       0xFF00000000000000ULL
+#define SATP128_HASID       0x00FFFFFFFF000000ULL
+#define SATP128_LPPN        0x0003FFFFFFFFFFFFULL
+
 /* VM modes (mstatus.vm) privileged ISA 1.9.1 */
 #define VM_1_09_MBARE       0
 #define VM_1_09_MBB         1
@@ -445,6 +450,9 @@
 #define VM_1_10_SV57        10
 #define VM_1_10_SV64        11
 
+#define VM_1_10_SV44        12
+#define VM_1_10_SV54        13
+
 /* Page table entry (PTE) fields */
 #define PTE_V               0x001 /* Valid */
 #define PTE_R               0x002 /* Read */
@@ -461,6 +469,8 @@
 
 /* Leaf page shift amount */
 #define PGSHIFT             12
+/* For now, pages in RV128 are 16 KiB. */
+#define PGSHIFT128          14
 
 /* Default Reset Vector adress */
 #define DEFAULT_RSTVEC      0x1000
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 968cb8046f..a24f02796c 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -395,7 +395,7 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
     *prot = 0;
 
     hwaddr base;
-    int levels, ptidxbits, ptesize, vm, sum, mxr, widened;
+    int levels, ptidxbits, ptesize, vm, sum, mxr, widened, pgshift;
 
     if (first_stage == true) {
         mxr = get_field(env->mstatus, MSTATUS_MXR);
@@ -408,17 +408,27 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
             if (riscv_cpu_is_32bit(env)) {
                 base = (hwaddr)get_field(env->vsatp, SATP32_PPN) << PGSHIFT;
                 vm = get_field(env->vsatp, SATP32_MODE);
-            } else {
+            } else if (riscv_cpu_is_64bit(env)) {
                 base = (hwaddr)get_field(env->vsatp, SATP64_PPN) << PGSHIFT;
                 vm = get_field(env->vsatp, SATP64_MODE);
+            } else {
+                /* TODO : Hypervisor extension not supported yet in RV128. */
+                g_assert_not_reached();
             }
         } else {
             if (riscv_cpu_is_32bit(env)) {
                 base = (hwaddr)get_field(env->satp, SATP32_PPN) << PGSHIFT;
                 vm = get_field(env->satp, SATP32_MODE);
-            } else {
+            } else if (riscv_cpu_is_64bit(env)) {
                 base = (hwaddr)get_field(env->satp, SATP64_PPN) << PGSHIFT;
                 vm = get_field(env->satp, SATP64_MODE);
+            } else {
+#if defined(TARGET_RISCV128)
+                base = (hwaddr)get_field(env->satp, SATP128_LPPN) << PGSHIFT128;
+                vm = get_field(env->satph, SATP128_HMODE);
+#else
+                g_assert_not_reached();
+#endif
             }
         }
         widened = 0;
@@ -426,9 +436,15 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
         if (riscv_cpu_is_32bit(env)) {
             base = (hwaddr)get_field(env->hgatp, SATP32_PPN) << PGSHIFT;
             vm = get_field(env->hgatp, SATP32_MODE);
-        } else {
+        } else if (riscv_cpu_is_64bit(env)) {
             base = (hwaddr)get_field(env->hgatp, SATP64_PPN) << PGSHIFT;
             vm = get_field(env->hgatp, SATP64_MODE);
+        } else {
+            /*
+             * TODO : Hypervisor extension not supported yet in RV128,
+             * so there shouldn't be any two-stage address lookups.
+             */
+            g_assert_not_reached();
         }
         widened = 2;
     }
@@ -436,13 +452,17 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
     sum = get_field(env->mstatus, MSTATUS_SUM) || use_background || is_debug;
     switch (vm) {
     case VM_1_10_SV32:
-      levels = 2; ptidxbits = 10; ptesize = 4; break;
+      levels = 2; ptidxbits = 10; ptesize = 4; pgshift = 12; break;
     case VM_1_10_SV39:
-      levels = 3; ptidxbits = 9; ptesize = 8; break;
+      levels = 3; ptidxbits = 9; ptesize = 8; pgshift = 12; break;
     case VM_1_10_SV48:
-      levels = 4; ptidxbits = 9; ptesize = 8; break;
+      levels = 4; ptidxbits = 9; ptesize = 8; pgshift = 12; break;
     case VM_1_10_SV57:
-      levels = 5; ptidxbits = 9; ptesize = 8; break;
+      levels = 5; ptidxbits = 9; ptesize = 8; pgshift = 12; break;
+    case VM_1_10_SV44:
+      levels = 3; ptidxbits = 10; ptesize = 16; pgshift = 14; break;
+    case VM_1_10_SV54:
+      levels = 4; ptidxbits = 10; ptesize = 16;  pgshift = 14; break;
     case VM_1_10_MBARE:
         *physical = addr;
         *prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
@@ -452,7 +472,7 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
     }
 
     CPUState *cs = env_cpu(env);
-    int va_bits = PGSHIFT + levels * ptidxbits + widened;
+    int va_bits = pgshift + levels * ptidxbits + widened;
     target_ulong mask, masked_msbs;
 
     if (TARGET_LONG_BITS > (va_bits - 1)) {
@@ -467,6 +487,7 @@ static int get_physical_address(CPURISCVState *env, hwaddr *physical,
     }
 
     int ptshift = (levels - 1) * ptidxbits;
+    uint64_t pgoff_mask = (1ULL << pgshift) - 1;
     int i;
 
 #if !TCG_OVERSIZED_GUEST
@@ -475,10 +496,10 @@ restart:
     for (i = 0; i < levels; i++, ptshift -= ptidxbits) {
         target_ulong idx;
         if (i == 0) {
-            idx = (addr >> (PGSHIFT + ptshift)) &
+            idx = (addr >> (pgshift + ptshift)) &
                            ((1 << (ptidxbits + widened)) - 1);
         } else {
-            idx = (addr >> (PGSHIFT + ptshift)) &
+            idx = (addr >> (pgshift + ptshift)) &
                            ((1 << ptidxbits) - 1);
         }
 
@@ -486,6 +507,7 @@ restart:
         hwaddr pte_addr;
 
         if (two_stage && first_stage) {
+            /* TODO : Two-stage translation for RV128 */
             int vbase_prot;
             hwaddr vbase;
 
@@ -519,6 +541,10 @@ restart:
         if (riscv_cpu_is_32bit(env)) {
             pte = address_space_ldl(cs->as, pte_addr, attrs, &res);
         } else {
+            /*
+             * For RV128, load only lower 64 bits as only those
+             * are used for now
+             */
             pte = address_space_ldq(cs->as, pte_addr, attrs, &res);
         }
 
@@ -533,7 +559,7 @@ restart:
             return TRANSLATE_FAIL;
         } else if (!(pte & (PTE_R | PTE_W | PTE_X))) {
             /* Inner PTE, continue walking */
-            base = ppn << PGSHIFT;
+            base = ppn << pgshift;
         } else if ((pte & (PTE_R | PTE_W | PTE_X)) == PTE_W) {
             /* Reserved leaf PTE flags: PTE_W */
             return TRANSLATE_FAIL;
@@ -605,9 +631,9 @@ restart:
 
             /* for superpage mappings, make a fake leaf PTE for the TLB's
                benefit. */
-            target_ulong vpn = addr >> PGSHIFT;
-            *physical = ((ppn | (vpn & ((1L << ptshift) - 1))) << PGSHIFT) |
-                        (addr & ~TARGET_PAGE_MASK);
+            target_ulong vpn = addr >> pgshift;
+            *physical = ((ppn | (vpn & ((1L << ptshift) - 1))) << pgshift) |
+                        (addr & pgoff_mask);
 
             /* set permissions on the TLB entry */
             if ((pte & PTE_R) || ((pte & PTE_X) && mxr)) {
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index c3471a1365..6b57900457 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -461,6 +461,13 @@ static const char valid_vm_1_10_64[16] = {
     [VM_1_10_SV57] = 1
 };
 
+static const bool valid_vm_1_10_128[256] = {
+    [VM_1_10_MBARE] = 1,
+    [VM_1_10_SV44] = 1,
+    [VM_1_10_SV54] = 1
+};
+
+
 /* Machine Information Registers */
 #if defined(TARGET_RISCV128)
 static RISCVException read_zero_128(CPURISCVState *env, int csrno,
@@ -558,10 +565,13 @@ static RISCVException read_mstatus(CPURISCVState *env, int csrno,
 static int validate_vm(CPURISCVState *env, target_ulong vm)
 {
     if (riscv_cpu_is_32bit(env)) {
-        return valid_vm_1_10_32[vm & 0xf];
-    } else {
+        return valid_vm_1_10_32[vm & 1];
+    } else if (riscv_cpu_is_64bit(env)) {
         return valid_vm_1_10_64[vm & 0xf];
+    } else if (riscv_cpu_is_128bit(env)) {
+        return valid_vm_1_10_128[vm & 0xff];
     }
+    return 0;
 }
 
 static RISCVException write_mstatus(CPURISCVState *env, int csrno,
@@ -1093,6 +1103,69 @@ static RISCVException rmw_sip(CPURISCVState *env, int csrno,
 }
 
 /* Supervisor Protection and Translation */
+#if defined(TARGET_RISCV128)
+static RISCVException read_satp_128(CPURISCVState *env, int csrno,
+                                    UINT128 *val)
+{
+    if (!riscv_feature(env, RISCV_FEATURE_MMU)) {
+        *val = u128_zero();
+        return RISCV_EXCP_NONE;
+    }
+
+    if (env->priv == PRV_S && get_field(env->mstatus, MSTATUS_TVM)) {
+        return RISCV_EXCP_ILLEGAL_INST;
+    } else {
+        *val = u128_from_pair(env->satp, env->satph);
+    }
+
+    return RISCV_EXCP_NONE;
+}
+
+static RISCVException write_satp_128(CPURISCVState *env, int csrno,
+                                     UINT128 val)
+{
+    uint64_t asid;
+    bool vm_ok;
+    UINT128 mask;
+
+    if (!riscv_feature(env, RISCV_FEATURE_MMU)) {
+        return RISCV_EXCP_NONE;
+    }
+
+    if (riscv_cpu_is_32bit(env)) {
+        vm_ok = validate_vm(env, get_field(u128_get_lo64(&val), SATP32_MODE));
+        mask = u128_from64((u128_get_lo64(&val) ^ env->satp)
+                           & (SATP32_MODE | SATP32_ASID | SATP32_PPN));
+        asid = (u128_get_lo64(&val) ^ env->satp) & SATP32_ASID;
+    } else if (riscv_cpu_is_64bit(env)) {
+        vm_ok = validate_vm(env, get_field(u128_get_lo64(&val), SATP64_MODE));
+        mask = u128_from64((u128_get_lo64(&val) ^ env->satp)
+                           & (SATP64_MODE | SATP64_ASID | SATP64_PPN));
+        asid = (u128_get_lo64(&val) ^ env->satp) & SATP64_ASID;
+    } else {
+        vm_ok = validate_vm(env, get_field(u128_get_hi64(&val), SATP128_HMODE));
+        mask = u128_and(
+                   u128_xor(val, u128_from_pair(env->satp, env->satph)),
+                   u128_from_pair(SATP128_LPPN, SATP128_HMODE | SATP128_HASID));
+        asid = (u128_get_hi64(&val) ^ env->satph) & SATP128_HASID;
+    }
+
+
+    if (vm_ok && u128_is_nonzero(&mask)) {
+        if (env->priv == PRV_S && get_field(env->mstatus, MSTATUS_TVM)) {
+            return RISCV_EXCP_ILLEGAL_INST;
+        } else {
+            if (asid) {
+                tlb_flush(env_cpu(env));
+            }
+            env->satp = u128_get_lo64(&val);
+            env->satph = u128_get_hi64(&val);
+        }
+    }
+    return RISCV_EXCP_NONE;
+}
+#endif
+
 static RISCVException read_satp(CPURISCVState *env, int csrno,
                                 target_ulong *val)
 {
@@ -1768,12 +1841,14 @@ riscv_csr_operations128 csr_ops_128[CSR_TABLE_SIZE] = {
     [CSR_MIMPID]     = { read_zero_128    },
     [CSR_MHARTID]    = { read_mhartid_128 },
 
-    [CSR_MSTATUS]    = { read_mstatus_128,  write_mstatus_128 },
+    [CSR_MSTATUS]    = { read_mstatus_128,  write_mstatus_128  },
     [CSR_MISA]       = { read_misa_128    },
-    [CSR_MTVEC]      = { read_mtvec_128,    write_mtvec_128   },
+    [CSR_MTVEC]      = { read_mtvec_128,    write_mtvec_128    },
+
+    [CSR_MSCRATCH]   = { read_mscratch_128, write_mscratch_128 },
+    [CSR_MEPC]       = { read_mepc_128,     write_mepc_128     },
 
-    [CSR_MSCRATCH]   = { read_mscratch_128, write_mscratch_128},
-    [CSR_MEPC]       = { read_mepc_128,     write_mepc_128    },
+    [CSR_SATP]       = { read_satp_128,     write_satp_128     },
 #endif
 };
 #endif
-- 
2.33.0




* Re: [PATCH 2/8] target/riscv: 128-bit registers creation and access
  2021-08-30 17:16 ` [PATCH 2/8] target/riscv: 128-bit registers creation and access Frédéric Pétrot
@ 2021-08-30 21:34   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 22+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-08-30 21:34 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Bin Meng, Alistair Francis, Fabien Portas

On 8/30/21 7:16 PM, Frédéric Pétrot wrote:
> Addition of the upper 64 bits of the 128-bit registers, along with
> the setter and getter for them and creation of the corresponding
> global tcg values.
> 
> Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
> Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
> ---
>  slirp                    |  2 +-
>  target/riscv/cpu.h       |  3 +++
>  target/riscv/translate.c | 30 ++++++++++++++++++++++++++++++
>  3 files changed, 34 insertions(+), 1 deletion(-)
> 
> diff --git a/slirp b/slirp
> index a88d9ace23..8f43a99191 160000
> --- a/slirp
> +++ b/slirp
> @@ -1 +1 @@
> -Subproject commit a88d9ace234a24ce1c17189642ef9104799425e0
> +Subproject commit 8f43a99191afb47ca3f3c6972f6306209f367ece

Unrelated change...



* Re: [PATCH 3/8] target/riscv: Addition of 128-bit ldu, lq and sq instructions
  2021-08-30 17:16 ` [PATCH 3/8] target/riscv: Addition of 128-bit ldu, lq and sq instructions Frédéric Pétrot
@ 2021-08-30 21:35   ` Philippe Mathieu-Daudé
  2021-08-31  2:24   ` Richard Henderson
  2021-08-31  2:30   ` Richard Henderson
  2 siblings, 0 replies; 22+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-08-30 21:35 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Alistair Francis, Bin Meng, Richard Henderson,
	Fabien Portas

On 8/30/21 7:16 PM, Frédéric Pétrot wrote:
> Addition of the load(s) and store instructions of the 128-bit extension.
> These instructions have addresses on 128-bit but explicitly assume that the
> upper 64-bit of the address registers is null, and therefore can use the
> existing address translation mechanism.
> 128-bit memory access identification and 64-bit signedness is handled a bit
> off-the-record:
> MemOp reserves 2 bits for size and a contiguous 3rd bit for the sign, so we
> cannot simply take value 4 to indicate a size of 16 bytes.
> Additionally, MO_TEQ | MO_SIGN seems to be a sentinel value, leading to a
> QEMU assertion violation.
> Modifying the existing state in QEMU has a great impact that we are not
> capable of fully evaluating, so we choose to pass this information into
> another parameter and let memop as it is for now.
> 
> Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
> Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
> ---
>  include/tcg/tcg-op.h                    |   1 +
>  tcg/tcg-op.c                            |   6 +

Please split this into 2 patches: first the generic TCG changes,

>  target/riscv/insn16.decode              |  33 ++++-
>  target/riscv/insn32.decode              |   5 +
>  target/riscv/insn_trans/trans_rvi.c.inc | 188 +++++++++++++++++++++---

second, the RISC-V-specific implementation.

>  5 files changed, 207 insertions(+), 26 deletions(-)
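
To make the encoding constraint in the quoted commit message concrete: if the
access size is stored as log2(bytes) in two bits and the sign flag sits in
the bit directly above, then the natural code for 16 bytes, log2(16) = 4, is
the same bit pattern as "signed, 1 byte". The sketch below mirrors that
layout with hypothetical names (SK_SIZE_MASK, SK_SIGN, sk_bytes,
sk_is_signed); it is not the MemOp definition itself.

```c
#include <stdint.h>

enum {
    SK_SIZE_MASK = 3,  /* two bits: log2 of the access size in bytes */
    SK_SIGN      = 4,  /* next bit up: sign-extend the loaded value */
};

/* Access size in bytes encoded by op. */
static unsigned sk_bytes(unsigned op)
{
    return 1u << (op & SK_SIZE_MASK);
}

/* Nonzero when op requests sign extension. */
static int sk_is_signed(unsigned op)
{
    return (op & SK_SIGN) != 0;
}
```

The largest size the two bits can express is 1 << 3 = 8 bytes, so a 16-byte
access needs either a wider size field or, as this patch does, a separate
parameter.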



* Re: [PATCH 4/8] target/riscv: 128-bit arithmetic and logic instructions
  2021-08-30 17:16 ` [PATCH 4/8] target/riscv: 128-bit arithmetic and logic instructions Frédéric Pétrot
@ 2021-08-30 21:38   ` Philippe Mathieu-Daudé
  2021-08-30 21:40     ` Philippe Mathieu-Daudé
  2021-08-31  3:32     ` Richard Henderson
  2021-08-31  3:30   ` Richard Henderson
  1 sibling, 2 replies; 22+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-08-30 21:38 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Bin Meng, Alistair Francis, Fabien Portas

On 8/30/21 7:16 PM, Frédéric Pétrot wrote:
> Adding the support for the 128-bit arithmetic and logic instructions.
> Remember that all (i) instructions are now acting on 128-bit registers, that
> a few others are added to cope with values that are held on 64 bits within
> the 128-bit registers, and that the ones that cope with values on 32-bit
> must also be modified for proper sign extension.
> Most algorithms taken from Hacker's Delight.
> 
> Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
> Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
> ---
>  target/riscv/insn32.decode              |  13 +
>  target/riscv/insn_trans/trans_rvi.c.inc | 955 +++++++++++++++++++++++-
>  target/riscv/translate.c                |  25 +
>  3 files changed, 976 insertions(+), 17 deletions(-)

> diff --git a/target/riscv/insn_trans/trans_rvi.c.inc b/target/riscv/insn_trans/trans_rvi.c.inc
> index 772330a766..0401ba3d69 100644
> --- a/target/riscv/insn_trans/trans_rvi.c.inc
> +++ b/target/riscv/insn_trans/trans_rvi.c.inc
> @@ -26,14 +26,20 @@ static bool trans_illegal(DisasContext *ctx, arg_empty *a)
>  
>  static bool trans_c64_illegal(DisasContext *ctx, arg_empty *a)
>  {
> -     REQUIRE_64BIT(ctx);
> -     return trans_illegal(ctx, a);
> +    REQUIRE_64_OR_128BIT(ctx);
> +    return trans_illegal(ctx, a);
>  }
>  
>  static bool trans_lui(DisasContext *ctx, arg_lui *a)
>  {
>      if (a->rd != 0) {
>          tcg_gen_movi_tl(cpu_gpr[a->rd], a->imm);
> +#if defined(TARGET_RISCV128)
> +        if (is_128bit(ctx)) {

Maybe this could eventually allow the compiler to elide the
code and avoid superfluous #ifdef'ry:

           if (TARGET_LONG_BITS >= 128) {

> +            tcg_gen_ext_i64_i128(cpu_gpr[a->rd], cpu_gprh[a->rd],
> +                                 cpu_gpr[a->rd]);
> +        }
> +#endif
>      }
>      return true;
>  }



* Re: [PATCH 4/8] target/riscv: 128-bit arithmetic and logic instructions
  2021-08-30 21:38   ` Philippe Mathieu-Daudé
@ 2021-08-30 21:40     ` Philippe Mathieu-Daudé
  2021-08-31 15:57       ` Frédéric Pétrot
  2021-08-31  3:32     ` Richard Henderson
  1 sibling, 1 reply; 22+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-08-30 21:40 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: Alistair Francis, Bin Meng, Palmer Dabbelt, Fabien Portas

On 8/30/21 11:38 PM, Philippe Mathieu-Daudé wrote:
> On 8/30/21 7:16 PM, Frédéric Pétrot wrote:
>> Adding the support for the 128-bit arithmetic and logic instructions.
>> Remember that all (i) instructions are now acting on 128-bit registers, that
>> a few others are added to cope with values that are held on 64 bits within
>> the 128-bit registers, and that the ones that cope with values on 32-bit
>> must also be modified for proper sign extension.
>> Most algorithms taken from Hackers' delight.
>>
>> Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
>> Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
>> ---
>>  target/riscv/insn32.decode              |  13 +
>>  target/riscv/insn_trans/trans_rvi.c.inc | 955 +++++++++++++++++++++++-
>>  target/riscv/translate.c                |  25 +
>>  3 files changed, 976 insertions(+), 17 deletions(-)
> 
>> diff --git a/target/riscv/insn_trans/trans_rvi.c.inc b/target/riscv/insn_trans/trans_rvi.c.inc
>> index 772330a766..0401ba3d69 100644
>> --- a/target/riscv/insn_trans/trans_rvi.c.inc
>> +++ b/target/riscv/insn_trans/trans_rvi.c.inc
>> @@ -26,14 +26,20 @@ static bool trans_illegal(DisasContext *ctx, arg_empty *a)
>>  
>>  static bool trans_c64_illegal(DisasContext *ctx, arg_empty *a)
>>  {
>> -     REQUIRE_64BIT(ctx);
>> -     return trans_illegal(ctx, a);
>> +    REQUIRE_64_OR_128BIT(ctx);
>> +    return trans_illegal(ctx, a);
>>  }
>>  
>>  static bool trans_lui(DisasContext *ctx, arg_lui *a)
>>  {
>>      if (a->rd != 0) {
>>          tcg_gen_movi_tl(cpu_gpr[a->rd], a->imm);
>> +#if defined(TARGET_RISCV128)
>> +        if (is_128bit(ctx)) {
> 
> Maybe this could allow the compiler eventually elide the
> code and avoid superfluous #ifdef'ry:
> 
>            if (TARGET_LONG_BITS >= 128) {

Actually:

             if (TARGET_LONG_BITS >= 128 && is_128bit(ctx)) {

> 
>> +            tcg_gen_ext_i64_i128(cpu_gpr[a->rd], cpu_gprh[a->rd],
>> +                                 cpu_gpr[a->rd]);
>> +        }
>> +#endif
>>      }
>>      return true;
>>  }
> 



* Re: [PATCH 3/8] target/riscv: Addition of 128-bit ldu, lq and sq instructions
  2021-08-30 17:16 ` [PATCH 3/8] target/riscv: Addition of 128-bit ldu, lq and sq instructions Frédéric Pétrot
  2021-08-30 21:35   ` Philippe Mathieu-Daudé
@ 2021-08-31  2:24   ` Richard Henderson
  2021-08-31 16:00     ` Frédéric Pétrot
  2021-08-31  2:30   ` Richard Henderson
  2 siblings, 1 reply; 22+ messages in thread
From: Richard Henderson @ 2021-08-31  2:24 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: Alistair Francis, Bin Meng, Palmer Dabbelt, Fabien Portas

On 8/30/21 10:16 AM, Frédéric Pétrot wrote:
> +#if defined(TARGET_RISCV128)
> +/*
> + * Accessing signed 64-bit or 128-bit values should be part of MemOp in
> + * include/exec/memop.h
> + * Unfortunately, this requires to change the defines there, as MO_SIGN is 4,
> + * and values 0 to 3 are usual types sizes.
> + * Note that an assert is triggered when MemOp is MO_SIGN|MO_TEQ, this value
> + * being some kind of sentinel.

https://lore.kernel.org/qemu-devel/20210818191920.390759-24-richard.henderson@linaro.org/


r~



* Re: [PATCH 3/8] target/riscv: Addition of 128-bit ldu, lq and sq instructions
  2021-08-30 17:16 ` [PATCH 3/8] target/riscv: Addition of 128-bit ldu, lq and sq instructions Frédéric Pétrot
  2021-08-30 21:35   ` Philippe Mathieu-Daudé
  2021-08-31  2:24   ` Richard Henderson
@ 2021-08-31  2:30   ` Richard Henderson
  2 siblings, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2021-08-31  2:30 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: Alistair Francis, Bin Meng, Palmer Dabbelt, Fabien Portas

On 8/30/21 10:16 AM, Frédéric Pétrot wrote:
> +void tcg_gen_ext_i64_i128(TCGv_i64 lo, TCGv_i64 hi, TCGv_i64 arg)
> +{
> +    tcg_gen_mov_i64(lo, arg);
> +    tcg_gen_sari_i64(hi, arg, 63);
> +}

No, don't add this until we add TCGv_i128.
Just use sari as needed in target/riscv when dealing with TCGv_i64.
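
For reference, a minimal host-side sketch in plain C of what the sari-by-63 idiom computes, assuming a 128-bit value is held as two 64-bit halves. The u128_pair type and the sext_64_to_128() name are made up for illustration and are not QEMU API:

```c
#include <stdint.h>

/* A 128-bit value modelled as two 64-bit halves (hypothetical type). */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_pair;

/*
 * Sign-extend a 64-bit value to 128 bits.  The high half is the sign bit
 * replicated 64 times, which is what an arithmetic right shift by 63
 * produces.  Unsigned arithmetic is used here to keep the C portable.
 */
static u128_pair sext_64_to_128(uint64_t arg)
{
    u128_pair r;

    r.lo = arg;
    r.hi = (uint64_t)0 - (arg >> 63);   /* 0 or 0xffffffffffffffff */
    return r;
}
```

In the translator the same effect would come from one sari on the high TCGv_i64 wherever it is needed, with no new generic helper.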


r~



* Re: [PATCH 1/8] target/riscv: Settings for 128-bit extension support
  2021-08-30 17:16 [PATCH 1/8] target/riscv: Settings for 128-bit extension support Frédéric Pétrot
                   ` (6 preceding siblings ...)
  2021-08-30 17:16 ` [PATCH 8/8] target/riscv: Support for 128-bit satp Frédéric Pétrot
@ 2021-08-31  3:13 ` Alistair Francis
  2021-08-31 16:20   ` Frédéric Pétrot
  7 siblings, 1 reply; 22+ messages in thread
From: Alistair Francis @ 2021-08-31  3:13 UTC (permalink / raw)
  To: Frédéric Pétrot
  Cc: open list:RISC-V, Philippe Mathieu-Daudé,
	Bin Meng, qemu-devel@nongnu.org Developers, Alistair Francis,
	Fabien Portas, Palmer Dabbelt, Alex Bennée

On Tue, Aug 31, 2021 at 5:26 AM Frédéric Pétrot
<frederic.petrot@univ-grenoble-alpes.fr> wrote:
>
> Starting 128-bit extension support implies a few modifications in the
> existing sources because checking for 32-bit is done by checking that
> it is not 64-bit and vice-versa.
> We now consider the 3 possible xlen values so as to allow correct
> compilation for both existing targets while setting the compilation
> framework so that it can also handle the riscv128-softmmu target.
> This includes gdb configuration files, that are just the bare copy of the
> 64-bit ones as gdb does not honor, yet, 128-bit CPUs.
> To consider the 3 xlen values, we had to add a misah field, representing the
> upper 64 bits of the misa register.
>
> Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
> Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
> ---
>  configs/devices/riscv128-softmmu/default.mak | 16 ++++++
>  configs/targets/riscv128-softmmu.mak         |  5 ++
>  gdb-xml/riscv-128bit-cpu.xml                 | 48 ++++++++++++++++++
>  gdb-xml/riscv-128bit-virtual.xml             | 12 +++++
>  include/hw/riscv/sifive_cpu.h                |  4 ++
>  target/riscv/Kconfig                         |  3 ++
>  target/riscv/arch_dump.c                     |  3 +-
>  target/riscv/cpu-param.h                     |  3 +-
>  target/riscv/cpu.c                           | 51 +++++++++++++++++---
>  target/riscv/cpu.h                           | 19 ++++++++
>  target/riscv/gdbstub.c                       |  3 ++
>  target/riscv/insn_trans/trans_rvd.c.inc      | 10 ++--
>  target/riscv/insn_trans/trans_rvf.c.inc      |  2 +-
>  target/riscv/translate.c                     | 45 ++++++++++++++++-
>  14 files changed, 209 insertions(+), 15 deletions(-)
>  create mode 100644 configs/devices/riscv128-softmmu/default.mak
>  create mode 100644 configs/targets/riscv128-softmmu.mak
>  create mode 100644 gdb-xml/riscv-128bit-cpu.xml
>  create mode 100644 gdb-xml/riscv-128bit-virtual.xml

Hey!

Thanks for the patches!

Overall this patch looks good.

It would greatly help reviewing, and speed up merging, if you could
split it up more. A lot of these changes can probably be separate
patches (for example, a patch to add misah). I know it can sometimes
seem a little silly, but it greatly helps with reviewing when patches
are small and self-contained.

>
> diff --git a/configs/devices/riscv128-softmmu/default.mak b/configs/devices/riscv128-softmmu/default.mak
> new file mode 100644
> index 0000000000..31439dbcfe
> --- /dev/null
> +++ b/configs/devices/riscv128-softmmu/default.mak
> @@ -0,0 +1,16 @@
> +# Default configuration for riscv128-softmmu
> +
> +# Uncomment the following lines to disable these optional devices:
> +#
> +#CONFIG_PCI_DEVICES=n
> +CONFIG_SEMIHOSTING=y
> +CONFIG_ARM_COMPATIBLE_SEMIHOSTING=y
> +
> +# Boards:
> +#
> +CONFIG_SPIKE=n
> +CONFIG_SIFIVE_E=n
> +CONFIG_SIFIVE_U=n
> +CONFIG_RISCV_VIRT=y
> +CONFIG_MICROCHIP_PFSOC=n
> +CONFIG_SHAKTI_C=n
> diff --git a/configs/targets/riscv128-softmmu.mak b/configs/targets/riscv128-softmmu.mak
> new file mode 100644
> index 0000000000..e300c43c8e
> --- /dev/null
> +++ b/configs/targets/riscv128-softmmu.mak
> @@ -0,0 +1,5 @@
> +TARGET_ARCH=riscv128
> +TARGET_BASE_ARCH=riscv
> +TARGET_SUPPORTS_MTTCG=y
> +TARGET_XML_FILES= gdb-xml/riscv-128bit-cpu.xml gdb-xml/riscv-32bit-fpu.xml gdb-xml/riscv-64bit-fpu.xml gdb-xml/riscv-128bit-virtual.xml
> +TARGET_NEED_FDT=y
> diff --git a/gdb-xml/riscv-128bit-cpu.xml b/gdb-xml/riscv-128bit-cpu.xml
> new file mode 100644
> index 0000000000..c98168148f
> --- /dev/null
> +++ b/gdb-xml/riscv-128bit-cpu.xml
> @@ -0,0 +1,48 @@
> +<?xml version="1.0"?>
> +<!-- Copyright (C) 2018-2019 Free Software Foundation, Inc.
> +
> +     Copying and distribution of this file, with or without modification,
> +     are permitted in any medium without royalty provided the copyright
> +     notice and this notice are preserved.  -->
> +
> +<!-- Register numbers are hard-coded in order to maintain backward
> +     compatibility with older versions of tools that didn't use xml
> +     register descriptions.  -->
> +
> +<!DOCTYPE feature SYSTEM "gdb-target.dtd">
> +<!-- FIXME : All GPRs are marked as 64-bits since gdb doesn't like 128-bit registers for now. -->
> +<feature name="org.gnu.gdb.riscv.cpu">
> +  <reg name="zero" bitsize="64" type="int" regnum="0"/>
> +  <reg name="ra" bitsize="64" type="code_ptr"/>
> +  <reg name="sp" bitsize="64" type="data_ptr"/>
> +  <reg name="gp" bitsize="64" type="data_ptr"/>
> +  <reg name="tp" bitsize="64" type="data_ptr"/>
> +  <reg name="t0" bitsize="64" type="int"/>
> +  <reg name="t1" bitsize="64" type="int"/>
> +  <reg name="t2" bitsize="64" type="int"/>
> +  <reg name="fp" bitsize="64" type="data_ptr"/>
> +  <reg name="s1" bitsize="64" type="int"/>
> +  <reg name="a0" bitsize="64" type="int"/>
> +  <reg name="a1" bitsize="64" type="int"/>
> +  <reg name="a2" bitsize="64" type="int"/>
> +  <reg name="a3" bitsize="64" type="int"/>
> +  <reg name="a4" bitsize="64" type="int"/>
> +  <reg name="a5" bitsize="64" type="int"/>
> +  <reg name="a6" bitsize="64" type="int"/>
> +  <reg name="a7" bitsize="64" type="int"/>
> +  <reg name="s2" bitsize="64" type="int"/>
> +  <reg name="s3" bitsize="64" type="int"/>
> +  <reg name="s4" bitsize="64" type="int"/>
> +  <reg name="s5" bitsize="64" type="int"/>
> +  <reg name="s6" bitsize="64" type="int"/>
> +  <reg name="s7" bitsize="64" type="int"/>
> +  <reg name="s8" bitsize="64" type="int"/>
> +  <reg name="s9" bitsize="64" type="int"/>
> +  <reg name="s10" bitsize="64" type="int"/>
> +  <reg name="s11" bitsize="64" type="int"/>
> +  <reg name="t3" bitsize="64" type="int"/>
> +  <reg name="t4" bitsize="64" type="int"/>
> +  <reg name="t5" bitsize="64" type="int"/>
> +  <reg name="t6" bitsize="64" type="int"/>
> +  <reg name="pc" bitsize="64" type="code_ptr"/>
> +</feature>
> diff --git a/gdb-xml/riscv-128bit-virtual.xml b/gdb-xml/riscv-128bit-virtual.xml
> new file mode 100644
> index 0000000000..db9a0ff677
> --- /dev/null
> +++ b/gdb-xml/riscv-128bit-virtual.xml
> @@ -0,0 +1,12 @@
> +<?xml version="1.0"?>
> +<!-- Copyright (C) 2018-2019 Free Software Foundation, Inc.
> +
> +     Copying and distribution of this file, with or without modification,
> +     are permitted in any medium without royalty provided the copyright
> +     notice and this notice are preserved.  -->
> +
> +<!DOCTYPE feature SYSTEM "gdb-target.dtd">
> +<!-- FIXME : priv marked as 64-bits since gdb doesn't like 128-bit registers for now. -->
> +<feature name="org.gnu.gdb.riscv.virtual">
> +  <reg name="priv" bitsize="64"/>
> +</feature>
> diff --git a/include/hw/riscv/sifive_cpu.h b/include/hw/riscv/sifive_cpu.h
> index 136799633a..2fd441664f 100644
> --- a/include/hw/riscv/sifive_cpu.h
> +++ b/include/hw/riscv/sifive_cpu.h
> @@ -26,6 +26,10 @@
>  #elif defined(TARGET_RISCV64)
>  #define SIFIVE_E_CPU TYPE_RISCV_CPU_SIFIVE_E51
>  #define SIFIVE_U_CPU TYPE_RISCV_CPU_SIFIVE_U54
> +#elif defined(TARGET_RISCV128)
> +/* 128-bit uses 64-bit CPU for now, since no cpu implements RV128 */
> +#define SIFIVE_E_CPU TYPE_RISCV_CPU_SIFIVE_E51
> +#define SIFIVE_U_CPU TYPE_RISCV_CPU_SIFIVE_U54
>  #endif
>
>  #endif /* HW_SIFIVE_CPU_H */
> diff --git a/target/riscv/Kconfig b/target/riscv/Kconfig
> index b9e5932f13..f9ea52a59a 100644
> --- a/target/riscv/Kconfig
> +++ b/target/riscv/Kconfig
> @@ -3,3 +3,6 @@ config RISCV32
>
>  config RISCV64
>      bool
> +
> +config RISCV128
> +    bool
> diff --git a/target/riscv/arch_dump.c b/target/riscv/arch_dump.c
> index 709f621d82..f756ed2988 100644
> --- a/target/riscv/arch_dump.c
> +++ b/target/riscv/arch_dump.c
> @@ -176,7 +176,8 @@ int cpu_get_dump_info(ArchDumpInfo *info,
>
>      info->d_machine = EM_RISCV;
>
> -#if defined(TARGET_RISCV64)
> +#if defined(TARGET_RISCV64) || defined(TARGET_RISCV128)
> +    /* FIXME : No 128-bit ELF class exists (for now), use 64-bit one. */
>      info->d_class = ELFCLASS64;
>  #else
>      info->d_class = ELFCLASS32;
> diff --git a/target/riscv/cpu-param.h b/target/riscv/cpu-param.h
> index 80eb615f93..e6d0651f60 100644
> --- a/target/riscv/cpu-param.h
> +++ b/target/riscv/cpu-param.h
> @@ -8,7 +8,8 @@
>  #ifndef RISCV_CPU_PARAM_H
>  #define RISCV_CPU_PARAM_H 1
>
> -#if defined(TARGET_RISCV64)
> +/* 64-bit target, since QEMU isn't built to have TARGET_LONG_BITS over 64 */
> +#if defined(TARGET_RISCV64) || defined(TARGET_RISCV128)
>  # define TARGET_LONG_BITS 64
>  # define TARGET_PHYS_ADDR_SPACE_BITS 56 /* 44-bit PPN */
>  # define TARGET_VIRT_ADDR_SPACE_BITS 48 /* sv48 */
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 991a6bb760..1f15026e9c 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -110,18 +110,38 @@ const char *riscv_cpu_get_trap_name(target_ulong cause, bool async)
>
>  bool riscv_cpu_is_32bit(CPURISCVState *env)
>  {
> -    if (env->misa & RV64) {
> -        return false;
> -    }
> +    return (env->misa & MXLEN_MASK) == RV32;
> +}
>
> -    return true;
> +bool riscv_cpu_is_64bit(CPURISCVState *env)
> +{
> +    return (env->misa & MXLEN_MASK) == RV64;
>  }
>
> +#if defined(TARGET_RISCV128)

Don't add any TARGET_* defines.

We are trying to move to a point where the 64-bit RISC-V softmmu can
run 32-bit CPUs. Ideally we want the same with 128-bit. You don't have
to get that working, but don't add any compile time conditionals.

That applies to all code, not just this patch. Unless a TARGET_*
compile-time conditional already exists, please don't add one.

Alistair



* Re: [PATCH 4/8] target/riscv: 128-bit arithmetic and logic instructions
  2021-08-30 17:16 ` [PATCH 4/8] target/riscv: 128-bit arithmetic and logic instructions Frédéric Pétrot
  2021-08-30 21:38   ` Philippe Mathieu-Daudé
@ 2021-08-31  3:30   ` Richard Henderson
  1 sibling, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2021-08-31  3:30 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Bin Meng, Alistair Francis, Fabien Portas

On 8/30/21 10:16 AM, Frédéric Pétrot wrote:
> Adding the support for the 128-bit arithmetic and logic instructions.
> Remember that all (i) instructions are now acting on 128-bit registers, that
> a few others are added to cope with values that are held on 64 bits within
> the 128-bit registers, and that the ones that cope with values on 32-bit
> must also be modified for proper sign extension.
> Most algorithms taken from Hackers' delight.
> 
> Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
> Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
> ---
>   target/riscv/insn32.decode              |  13 +
>   target/riscv/insn_trans/trans_rvi.c.inc | 955 +++++++++++++++++++++++-
>   target/riscv/translate.c                |  25 +
>   3 files changed, 976 insertions(+), 17 deletions(-)
> 
> diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
> index 225669e277..2fe7e1dd36 100644
> --- a/target/riscv/insn32.decode
> +++ b/target/riscv/insn32.decode
> @@ -22,6 +22,7 @@
>   %rs1       15:5
>   %rd        7:5
>   %sh5       20:5
> +%sh6       20:6
>   
>   %sh7    20:7
>   %csr    20:12
> @@ -91,6 +92,9 @@
>   # Formats 64:
>   @sh5     .......  ..... .....  ... ..... ....... &shift  shamt=%sh5      %rs1 %rd
>   
> +# Formats 128:
> +@sh6       ...... ...... ..... ... ..... ....... &shift shamt=%sh6 %rs1 %rd
> +
>   # *** Privileged Instructions ***
>   ecall       000000000000     00000 000 00000 1110011
>   ebreak      000000000001     00000 000 00000 1110011
> @@ -166,6 +170,15 @@ sraw     0100000 .....  ..... 101 ..... 0111011 @r
>   ldu      ............   ..... 111 ..... 0000011 @i
>   lq       ............   ..... 010 ..... 0001111 @i
>   sq       ............   ..... 100 ..... 0100011 @s
> +addid    ............  .....  000 ..... 1011011 @i
> +sllid    000000 ......  ..... 001 ..... 1011011 @sh6
> +srlid    000000 ......  ..... 101 ..... 1011011 @sh6
> +sraid    010000 ......  ..... 101 ..... 1011011 @sh6
> +addd     0000000 ..... .....  000 ..... 1111011 @r
> +subd     0100000 ..... .....  000 ..... 1111011 @r
> +slld     0000000 ..... .....  001 ..... 1111011 @r
> +srld     0000000 ..... .....  101 ..... 1111011 @r
> +srad     0100000 ..... .....  101 ..... 1111011 @r
>   
>   # *** RV32M Standard Extension ***
>   mul      0000001 .....  ..... 000 ..... 0110011 @r
> diff --git a/target/riscv/insn_trans/trans_rvi.c.inc b/target/riscv/insn_trans/trans_rvi.c.inc
> index 772330a766..0401ba3d69 100644
> --- a/target/riscv/insn_trans/trans_rvi.c.inc
> +++ b/target/riscv/insn_trans/trans_rvi.c.inc
> @@ -26,14 +26,20 @@ static bool trans_illegal(DisasContext *ctx, arg_empty *a)
>   
>   static bool trans_c64_illegal(DisasContext *ctx, arg_empty *a)
>   {
> -     REQUIRE_64BIT(ctx);
> -     return trans_illegal(ctx, a);
> +    REQUIRE_64_OR_128BIT(ctx);
> +    return trans_illegal(ctx, a);
>   }
>   
>   static bool trans_lui(DisasContext *ctx, arg_lui *a)
>   {
>       if (a->rd != 0) {
>           tcg_gen_movi_tl(cpu_gpr[a->rd], a->imm);
> +#if defined(TARGET_RISCV128)
> +        if (is_128bit(ctx)) {
> +            tcg_gen_ext_i64_i128(cpu_gpr[a->rd], cpu_gprh[a->rd],
> +                                 cpu_gpr[a->rd]);
> +        }
> +#endif
>       }
>       return true;
>   }

I think it is a mistake to introduce all of these ifdefs.

If 128-bit is not enabled, then is_128bit should evaluate to false, and all should be 
well.  As for cpu_gprh[], that should be hidden behind some function/macro so that if 
128-bit is not enabled we get qemu_build_not_reached().

Finally, you can compute this immediate value directly.  Don't leave it to the optimizer 
to provide the extension.
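
As an illustration of computing the extension at translation time: the immediate is a known host integer, so its high half is all ones when it is negative and zero otherwise. A small sketch in plain C, with a hypothetical helper name (imm_to_128() is not part of QEMU):

```c
#include <stdint.h>

/*
 * Split a sign-extended immediate into the two 64-bit halves of a
 * 128-bit register value.  Both halves are plain constants, so a
 * translator could emit two move-immediate ops and nothing else.
 */
static void imm_to_128(int64_t imm, uint64_t *lo, uint64_t *hi)
{
    *lo = (uint64_t)imm;
    *hi = (uint64_t)-(imm < 0);   /* 0 or 0xffffffffffffffff */
}
```

The same -(imm < 0) expression is the one suggested for the auipc hunk.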

>   static bool trans_auipc(DisasContext *ctx, arg_auipc *a)
>   {
>       if (a->rd != 0) {
> +#if defined(TARGET_RISCV128)
> +        if (is_128bit(ctx)) {
> +            /* TODO : when pc is 128 bits, use all its bits */
> +            TCGv pc = tcg_const_tl(ctx->base.pc_next),
> +                 imm = tcg_const_tl(a->imm),
> +                 immh = tcg_const_tl((a->imm & 0x80000)
> +                         ? 0xffffffffffffffff : 0),

No need to test bits here: -(a->imm < 0) will do fine.

> +                 cnst_zero = tcg_const_tl(0);
> +            tcg_gen_add2_tl(cpu_gpr[a->rd], cpu_gprh[a->rd], pc, cnst_zero,
> +                            imm, immh);
> +            tcg_temp_free(pc);
> +            tcg_temp_free(imm);
> +            tcg_temp_free(immh);
> +            tcg_temp_free(cnst_zero);
> +            return true;

tcg_constant_tl, not tcg_const_tl and no tcg_temp_free.

> +    case TCG_COND_LT:
> +    {
> +        TCGv tmp1 = tcg_temp_new(),
> +             tmp2 = tcg_temp_new();
> +
> +        tcg_gen_xor_tl(tmp1, rh, ah);
> +        tcg_gen_xor_tl(tmp2, ah, bh);
> +        tcg_gen_and_tl(tmp1, tmp1, tmp2);
> +        tcg_gen_xor_tl(tmp1, rh, tmp1);
> +        tcg_gen_setcondi_tl(TCG_COND_LT, rl, tmp1, 0); /* Check sign bit */
> +
> +        tcg_temp_free(tmp1);
> +        tcg_temp_free(tmp2);
> +        break;
> +    }

Incorrect, as you're not examining the low parts at all.
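
To make the point concrete, here is a plain C model of a signed 128-bit less-than on two 64-bit halves. It is only a sketch (lt_128() is a hypothetical name, and the int64_t casts assume the usual two's complement conversion): the high halves decide as signed values, and only when they are equal do the low halves decide, as unsigned values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Signed comparison of (ah:al) < (bh:bl), each operand being 128 bits. */
static bool lt_128(uint64_t al, uint64_t ah, uint64_t bl, uint64_t bh)
{
    if (ah != bh) {
        /* High halves differ: they carry the sign, compare as signed. */
        return (int64_t)ah < (int64_t)bh;
    }
    /* High halves equal: the low halves decide, compared as unsigned. */
    return al < bl;
}
```

A case such as 0 < 1 has equal high parts, so any scheme that looks only at the high halves cannot get it right.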


> +
> +    case TCG_COND_GE:
> +        /* We invert the result of TCG_COND_LT */
> +        gen_setcond_128(rl, rh, al, ah, bl, bh, TCG_COND_LT);
> +        tcg_gen_setcondi_tl(TCG_COND_EQ, rl, rl, 0);

Inversion of a boolean is better as xor with 1.

> +    case TCG_COND_LTU:
> +    {
> +        TCGv tmp1 = tcg_temp_new(),
> +             tmp2 = tcg_temp_new();
> +
> +        tcg_gen_eqv_tl(tmp1, ah, bh);
> +        tcg_gen_and_tl(tmp1, tmp1, rh);
> +        tcg_gen_not_tl(tmp2, ah);
> +        tcg_gen_and_tl(tmp2, tmp2, bh);
> +        tcg_gen_or_tl(tmp1, tmp1, tmp2);
> +
> +        tcg_gen_setcondi_tl(TCG_COND_LT, rl, tmp1, 0); /* Check sign bit */
> +
> +        tcg_temp_free(tmp1);
> +        tcg_temp_free(tmp2);
> +        break;
> +    }

Again, missing comparison of low parts.
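
The unsigned case can be modelled the same way. A sketch in plain C, with a hypothetical ltu_128() name, where both halves are compared as unsigned values:

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned comparison of (ah:al) < (bh:bl), each operand being 128 bits. */
static bool ltu_128(uint64_t al, uint64_t ah, uint64_t bl, uint64_t bh)
{
    /* The low halves only matter when the high halves are equal. */
    return ah < bh || (ah == bh && al < bl);
}
```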

> @@ -93,7 +205,28 @@ static bool gen_branch(DisasContext *ctx, arg_b *a, TCGCond cond)
>       gen_get_gpr(source1, a->rs1);
>       gen_get_gpr(source2, a->rs2);
>   
> +#if defined(TARGET_RISCV128)
> +    if (is_128bit(ctx)) {
> +        TCGv source1h, source2h, tmph, tmpl;
> +        source1h = tcg_temp_new();
> +        source2h = tcg_temp_new();
> +        tmph = tcg_temp_new();
> +        tmpl = tcg_temp_new();
> +        gen_get_gprh(source1h, a->rs1);
> +        gen_get_gprh(source2h, a->rs2);
> +
> +        gen_setcond_128(tmpl, tmph, source1, source1h, source2, source2h, cond);
> +        tcg_gen_brcondi_tl(TCG_COND_NE, tmpl, 0, l);
> +        tcg_temp_free(source1h);
> +        tcg_temp_free(source2h);
> +        tcg_temp_free(tmph);
> +        tcg_temp_free(tmpl);

setcond feeding brcond results in too many comparisons, usually.

In this instance it may be just as easy to generate multiple branches, in general.  But 
EQ/NE should not be

	setcond	t1, al, bl, eq
	setcond t2, ah, bh, eq
	and	t3, t1, t2
	brcond	t3, 0, ne

but

	xor	t1, al, bl
	xor	t2, ah, bh
	or	t3, t1, t2
	brcond	t3, 0, eq
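
The xor/or form above can be checked against a plain C model. This is a sketch only, and eq_128() is a hypothetical name: two 128-bit values are equal exactly when the OR of the per-half XORs is zero, so one final test replaces two compares and an AND.

```c
#include <stdbool.h>
#include <stdint.h>

/* Equality of (ah:al) and (bh:bl) using a single final test against zero. */
static bool eq_128(uint64_t al, uint64_t ah, uint64_t bl, uint64_t bh)
{
    uint64_t t1 = al ^ bl;   /* nonzero if the low halves differ */
    uint64_t t2 = ah ^ bh;   /* nonzero if the high halves differ */

    return (t1 | t2) == 0;
}
```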

>   static bool trans_srli(DisasContext *ctx, arg_srli *a)
>   {
> +#if defined(TARGET_RISCV128)
> +    if (is_128bit(ctx)) {
> +        if (a->shamt >= 128) {
> +            return false;
> +        }
> +
> +        if (a->rd != 0 && a->shamt != 0) {
> +            TCGv rs = tcg_temp_new(),
> +                 rsh = tcg_temp_new(),
> +                 res = tcg_temp_new(),
> +                 resh = tcg_temp_new(),
> +                 tmp = tcg_temp_new();
> +            gen_get_gpr(rs, a->rs1);
> +            gen_get_gprh(rsh, a->rs1);
> +
> +            /*
> +             * Computation of double-length right logical shift,
> +             * adapted for immediates from section 2.17 of Hacker's Delight
> +             */
> +            if (a->shamt >= 64) {
> +                tcg_gen_movi_tl(res, 0);
> +            } else {
> +                tcg_gen_shri_tl(res, rs, a->shamt);
> +            }
> +            if (64 - a->shamt < 0) {
> +                tcg_gen_movi_tl(tmp, 0);
> +            } else {
> +                tcg_gen_shli_tl(tmp, rsh, 64 - a->shamt);
> +            }
> +            tcg_gen_or_tl(res, res, tmp);
> +            if (a->shamt - 64 < 0) {
> +                tcg_gen_movi_tl(tmp, 0);
> +            } else {
> +                tcg_gen_shri_tl(tmp, rsh, a->shamt - 64);
> +            }
> +            tcg_gen_or_tl(res, res, tmp);
> +
> +            if (a->shamt >= 64) {
> +                tcg_gen_movi_tl(resh, 0);
> +            } else {
> +                tcg_gen_shri_tl(resh, rsh, a->shamt);
> +            }

We have tcg_gen_extract2_i64 for the purpose of double-word immediate shifts.  You should 
be doing

     if (shamt >= 64) {
         tcg_gen_shri_tl(resl, srch, shamt - 64);
         tcg_gen_movi_tl(resh, 0);
     } else {
         tcg_gen_extract2_tl(resl, srcl, srch, shamt);
         tcg_gen_shri_tl(resh, srch, shamt);
     }

>   static bool trans_srai(DisasContext *ctx, arg_srai *a)

     if (shamt >= 64) {
         tcg_gen_sari_tl(resl, srch, shamt - 64);
         tcg_gen_sari_tl(resh, srch, 63);
     } else {
         tcg_gen_extract2_tl(resl, srcl, srch, shamt);
         tcg_gen_sari_tl(resh, srch, shamt);
     }

>  static bool trans_slli(DisasContext *ctx, arg_slli *a)

     if (shamt >= 64) {
         tcg_gen_shli_tl(resh, srcl, shamt - 64);
         tcg_gen_movi_tl(resl, 0);
     } else {
         tcg_gen_extract2_tl(resh, srcl, srch, 64 - shamt);
         tcg_gen_shli_tl(resl, srcl, shamt);
     }

Cf. tcg/tcg-op.c, tcg_gen_shifti_i64, which is doing the same thing for i32.
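
The three immediate-shift patterns above can be modelled on the host in plain C. This is a sketch under stated assumptions, not QEMU code: extract2() is a local model that returns 64 bits of the hi:lo pair starting at bit ofs (0 to 64), sar64() is a portable arithmetic shift, and every function name here is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a double-word extract: 64 bits of hi:lo starting at bit ofs. */
static uint64_t extract2(uint64_t lo, uint64_t hi, unsigned ofs)
{
    assert(ofs <= 64);
    if (ofs == 0) {
        return lo;
    }
    if (ofs == 64) {
        return hi;
    }
    return (lo >> ofs) | (hi << (64 - ofs));
}

/* Portable 64-bit arithmetic right shift, n in 0..63. */
static uint64_t sar64(uint64_t x, unsigned n)
{
    assert(n < 64);
    uint64_t fill = (x >> 63) ? ~(UINT64_MAX >> n) : 0;
    return (x >> n) | fill;
}

static void srli_128(uint64_t *resl, uint64_t *resh,
                     uint64_t srcl, uint64_t srch, unsigned shamt)
{
    assert(shamt < 128);
    if (shamt >= 64) {
        *resl = srch >> (shamt - 64);
        *resh = 0;
    } else {
        *resl = extract2(srcl, srch, shamt);
        *resh = srch >> shamt;
    }
}

static void srai_128(uint64_t *resl, uint64_t *resh,
                     uint64_t srcl, uint64_t srch, unsigned shamt)
{
    assert(shamt < 128);
    if (shamt >= 64) {
        *resl = sar64(srch, shamt - 64);
        *resh = sar64(srch, 63);          /* all sign bits */
    } else {
        *resl = extract2(srcl, srch, shamt);
        *resh = sar64(srch, shamt);
    }
}

static void slli_128(uint64_t *resl, uint64_t *resh,
                     uint64_t srcl, uint64_t srch, unsigned shamt)
{
    assert(shamt < 128);
    if (shamt >= 64) {
        *resh = srcl << (shamt - 64);
        *resl = 0;
    } else {
        *resh = extract2(srcl, srch, 64 - shamt);
        *resl = srcl << shamt;
    }
}
```

Because shamt is an immediate, the branch is resolved at translation time and each case emits at most two ops.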

> +#if defined(TARGET_RISCV128)
> +enum M128_DIR { M128_LEFT, M128_RIGHT, M128_RIGHT_ARITH };
> +static void gen_shift_mod128(TCGv ret, TCGv arg1, TCGv arg2, enum M128_DIR dir)
> +{
> +    TCGv tmp1 = tcg_temp_new(),
> +         tmp2 = tcg_temp_new(),
> +         cnst_zero = tcg_const_tl(0),
> +         sgn = tcg_temp_new();
> +
> +    tcg_gen_setcondi_tl(TCG_COND_GE, tmp1, arg2, 64);
> +    tcg_gen_setcondi_tl(TCG_COND_LT, tmp2, arg2, 0);
> +    tcg_gen_or_tl(tmp1, tmp1, tmp2);

What in the world are you doing with signed comparisons?

> +    tcg_gen_andi_tl(tmp2, arg2, 0x3f);

You should have one test with 0x3f and one with 0x40.
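
One reading of the remark above, sketched in plain C for a variable left shift taken modulo 128: bits 0 to 5 of the amount (mask 0x3f) give the shift within a 64-bit half, and bit 6 (mask 0x40) says whether the low half moves into the high half. No signed comparison is involved. The sll_128() name and the pair-of-halves layout are assumptions made for illustration.

```c
#include <stdint.h>

/* 128-bit logical left shift of (srch:srcl) by shamt, taken modulo 128. */
static void sll_128(uint64_t *resl, uint64_t *resh,
                    uint64_t srcl, uint64_t srch, uint64_t shamt)
{
    unsigned n = shamt & 0x3f;   /* shift within a 64-bit half */
    uint64_t l, h;

    if (n == 0) {
        l = srcl;
        h = srch;
    } else {
        l = srcl << n;
        h = (srch << n) | (srcl >> (64 - n));
    }

    if (shamt & 0x40) {          /* 64 or more: low half moves up */
        h = l;
        l = 0;
    }

    *resl = l;
    *resh = h;
}
```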



r~



* Re: [PATCH 4/8] target/riscv: 128-bit arithmetic and logic instructions
  2021-08-30 21:38   ` Philippe Mathieu-Daudé
  2021-08-30 21:40     ` Philippe Mathieu-Daudé
@ 2021-08-31  3:32     ` Richard Henderson
  1 sibling, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2021-08-31  3:32 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé,
	Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: Alistair Francis, Bin Meng, Palmer Dabbelt, Fabien Portas

On 8/30/21 2:38 PM, Philippe Mathieu-Daudé wrote:
>> +#if defined(TARGET_RISCV128)
>> +        if (is_128bit(ctx)) {
> 
> Maybe this could allow the compiler eventually elide the
> code and avoid superfluous #ifdef'ry:
> 
>             if (TARGET_LONG_BITS >= 128) {

TCG does not support TARGET_LONG_BITS != {32,64}.
This will not work.

But is_128bit() should be sufficient in each opcode, because that itself should evaluate 
to false if unsupported.


r~



* Re: [PATCH 6/8] target/riscv: Support of compiler's 128-bit integer types
  2021-08-30 17:16 ` [PATCH 6/8] target/riscv: Support of compiler's 128-bit integer types Frédéric Pétrot
@ 2021-08-31  3:38   ` Richard Henderson
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2021-08-31  3:38 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Bin Meng, Alistair Francis, Fabien Portas

On 8/30/21 10:16 AM, Frédéric Pétrot wrote:
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index 6528b4540e..4321b03b94 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -60,6 +60,19 @@
>   #define RV64 ((target_ulong)2 << (TARGET_LONG_BITS - 2))
>   /* To be used on misah, the upper part of misa */
>   #define RV128 ((target_ulong)3 << (TARGET_LONG_BITS - 2))
> +/*
> + * Defined to force the use of tcg 128-bit arithmetic
> + * if the compiler does not have a 128-bit built-in type
> + */
> +#define SOFT_128BIT
> +/*
> + * If available and not explicitly disabled,
> + * use compiler's 128-bit integers.
> + */
> +#if defined(__SIZEOF_INT128__) && !defined(SOFT_128BIT)
> +#define HARD_128BIT
> +#endif

This doesn't belong here.  CONFIG_INT128 is more correct.


r~



* Re: [PATCH 7/8] target/riscv: 128-bit support for some csrs
  2021-08-30 17:16 ` [PATCH 7/8] target/riscv: 128-bit support for some csrs Frédéric Pétrot
@ 2021-08-31  3:43   ` Richard Henderson
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2021-08-31  3:43 UTC (permalink / raw)
  To: Frédéric Pétrot, qemu-devel, qemu-riscv
  Cc: Palmer Dabbelt, Bin Meng, Alistair Francis, Fabien Portas

On 8/30/21 10:16 AM, Frédéric Pétrot wrote:
>   target/riscv/utils_128.h                | 173 ++++++++++++++++

You should extend include/qemu/int128.h as needed, rather than this.

> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index 4321b03b94..0d18055e08 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -26,6 +26,7 @@
>   #include "fpu/softfloat-types.h"
>   #include "qom/object.h"
>   
> +#include "utils_128.h"

And anyway, it's certainly not needed in cpu.h.

> @@ -60,19 +61,6 @@
>   #define RV64 ((target_ulong)2 << (TARGET_LONG_BITS - 2))
>   /* To be used on misah, the upper part of misa */
>   #define RV128 ((target_ulong)3 << (TARGET_LONG_BITS - 2))
> -/*
> - * Defined to force the use of tcg 128-bit arithmetic
> - * if the compiler does not have a 128-bit built-in type
> - */
> -#define SOFT_128BIT
> -/*
> - * If available and not explicitly disabled,
> - * use compiler's 128-bit integers.
> - */
> -#if defined(__SIZEOF_INT128__) && !defined(SOFT_128BIT)
> -#define HARD_128BIT
> -#endif

You shouldn't add these in an earlier patch and then remove them here.  Of course, I
don't think they're needed at all.


r~



* Re: [PATCH 4/8] target/riscv: 128-bit arithmetic and logic instructions
  2021-08-30 21:40     ` Philippe Mathieu-Daudé
@ 2021-08-31 15:57       ` Frédéric Pétrot
  0 siblings, 0 replies; 22+ messages in thread
From: Frédéric Pétrot @ 2021-08-31 15:57 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, qemu-devel, qemu-riscv
  Cc: Alistair Francis, Bin Meng, Palmer Dabbelt, Fabien Portas

Hello Philippe,

Le 30/08/2021 à 23:40, Philippe Mathieu-Daudé a écrit :
> On 8/30/21 11:38 PM, Philippe Mathieu-Daudé wrote:
>> On 8/30/21 7:16 PM, Frédéric Pétrot wrote:
>>> Adding the support for the 128-bit arithmetic and logic instructions.
>>> Remember that all (i) instructions are now acting on 128-bit registers, that
>>> a few others are added to cope with values that are held on 64 bits within
>>> the 128-bit registers, and that the ones that cope with values on 32-bit
>>> must also be modified for proper sign extension.
>>> Most algorithms taken from Hackers' delight.
>>>
>>> Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
>>> Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
>>> ---
>>>  target/riscv/insn32.decode              |  13 +
>>>  target/riscv/insn_trans/trans_rvi.c.inc | 955 +++++++++++++++++++++++-
>>>  target/riscv/translate.c                |  25 +
>>>  3 files changed, 976 insertions(+), 17 deletions(-)
>>
>>> diff --git a/target/riscv/insn_trans/trans_rvi.c.inc b/target/riscv/insn_trans/trans_rvi.c.inc
>>> index 772330a766..0401ba3d69 100644
>>> --- a/target/riscv/insn_trans/trans_rvi.c.inc
>>> +++ b/target/riscv/insn_trans/trans_rvi.c.inc
>>> @@ -26,14 +26,20 @@ static bool trans_illegal(DisasContext *ctx, arg_empty *a)
>>>  
>>>  static bool trans_c64_illegal(DisasContext *ctx, arg_empty *a)
>>>  {
>>> -     REQUIRE_64BIT(ctx);
>>> -     return trans_illegal(ctx, a);
>>> +    REQUIRE_64_OR_128BIT(ctx);
>>> +    return trans_illegal(ctx, a);
>>>  }
>>>  
>>>  static bool trans_lui(DisasContext *ctx, arg_lui *a)
>>>  {
>>>      if (a->rd != 0) {
>>>          tcg_gen_movi_tl(cpu_gpr[a->rd], a->imm);
>>> +#if defined(TARGET_RISCV128)
>>> +        if (is_128bit(ctx)) {
>>
>> Maybe this could allow the compiler eventually elide the
>> code and avoid superfluous #ifdef'ry:
>>
>>            if (TARGET_LONG_BITS >= 128) {
> 
> Actually:
> 
>              if (TARGET_LONG_BITS >= 128 && is_128bit(ctx)) {

  We may have taken a wrong path then, because we have kept
  TARGET_LONG_BITS == 64 for the 128-bit case (as we use the tcg_xxx_tl of the
  64 version to generate our micro-ops, which I admit might be a mistake).

  Frédéric

> 
>>
>>> +            tcg_gen_ext_i64_i128(cpu_gpr[a->rd], cpu_gprh[a->rd],
>>> +                                 cpu_gpr[a->rd]);
>>> +        }
>>> +#endif
>>>      }
>>>      return true;
>>>  }
>>

-- 
+---------------------------------------------------------------------------+
| Frédéric Pétrot, Pr. Grenoble INP-Ensimag/TIMA,   Ensimag deputy director |
| Mob/Pho: +33 6 74 57 99 65/+33 4 76 57 48 70      Ad augusta  per angusta |
| http://tima.univ-grenoble-alpes.fr frederic.petrot@univ-grenoble-alpes.fr |
+---------------------------------------------------------------------------+



* Re: [PATCH 3/8] target/riscv: Addition of 128-bit ldu, lq and sq instructions
  2021-08-31  2:24   ` Richard Henderson
@ 2021-08-31 16:00     ` Frédéric Pétrot
  0 siblings, 0 replies; 22+ messages in thread
From: Frédéric Pétrot @ 2021-08-31 16:00 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel, qemu-riscv
  Cc: Alistair Francis, Bin Meng, Palmer Dabbelt, Fabien Portas

Hello Richard,

Le 31/08/2021 à 04:24, Richard Henderson a écrit :
> On 8/30/21 10:16 AM, Frédéric Pétrot wrote:
>> +#if defined(TARGET_RISCV128)
>> +/*
>> + * Accessing signed 64-bit or 128-bit values should be part of MemOp in
>> + * include/exec/memop.h
>> + * Unfortunately, this requires changing the defines there, as MO_SIGN is 4,
>> + * and values 0 to 3 are the usual type sizes.
>> + * Note that an assert is triggered when MemOp is MO_SIGN|MO_TEQ, this value
>> + * being some kind of sentinel.
> 
> https://lore.kernel.org/qemu-devel/20210818191920.390759-24-richard.henderson@linaro.org/

  Thanks for the pointer,
  Frédéric
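
  The encoding problem described in the quoted comment can be sketched as
  follows, assuming the layout it states (sizes are stored as log2 of the byte
  count in values 0 to 3, and MO_SIGN is 4). The SK_ names are local to this
  sketch and only mirror that description; they are not the definitions from
  include/exec/memop.h.

```c
#include <assert.h>
#include <stdbool.h>

enum {
    SK_MO_8    = 0,
    SK_MO_16   = 1,
    SK_MO_32   = 2,
    SK_MO_64   = 3,
    SK_MO_SIZE = 3,   /* mask covering the two size bits */
    SK_MO_SIGN = 4,   /* next bit up: sign-extension flag */
};

/* log2 of a power-of-two access size given in bytes. */
static unsigned sk_size_log2(unsigned bytes)
{
    unsigned n = 0;

    assert(bytes != 0 && (bytes & (bytes - 1)) == 0);
    while (bytes > 1) {
        bytes >>= 1;
        n++;
    }
    return n;
}

/* True when the size can be stored in the two-bit size field. */
static bool sk_size_fits(unsigned bytes)
{
    return sk_size_log2(bytes) <= SK_MO_SIZE;
}

/* True when the encoded size would overlap the sign flag. */
static bool sk_size_collides_with_sign(unsigned bytes)
{
    return (sk_size_log2(bytes) & SK_MO_SIGN) != 0;
}
```

  An 8-byte access encodes as 3 and fits, whereas a 16-byte access would
  encode as 4, which is exactly the value of the sign flag.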
> 
> 
> 
> r~

-- 
+---------------------------------------------------------------------------+
| Frédéric Pétrot, Pr. Grenoble INP-Ensimag/TIMA,   Ensimag deputy director |
| Mob/Pho: +33 6 74 57 99 65/+33 4 76 57 48 70      Ad augusta  per angusta |
| http://tima.univ-grenoble-alpes.fr frederic.petrot@univ-grenoble-alpes.fr |
+---------------------------------------------------------------------------+



* Re: [PATCH 1/8] target/riscv: Settings for 128-bit extension support
  2021-08-31  3:13 ` [PATCH 1/8] target/riscv: Settings for 128-bit extension support Alistair Francis
@ 2021-08-31 16:20   ` Frédéric Pétrot
  0 siblings, 0 replies; 22+ messages in thread
From: Frédéric Pétrot @ 2021-08-31 16:20 UTC (permalink / raw)
  To: Alistair Francis
  Cc: open list:RISC-V, Philippe Mathieu-Daudé,
	Bin Meng, qemu-devel@nongnu.org Developers, Alistair Francis,
	Fabien Portas, Palmer Dabbelt, Alex Bennée

Hello Alistair,

Le 31/08/2021 à 05:13, Alistair Francis a écrit :
> On Tue, Aug 31, 2021 at 5:26 AM Frédéric Pétrot
> <frederic.petrot@univ-grenoble-alpes.fr> wrote:
>>
>> Starting 128-bit extension support implies a few modifications in the
>> existing sources because checking for 32-bit is done by checking that
>> it is not 64-bit and vice-versa.
>> We now consider the 3 possible xlen values so as to allow correct
>> compilation for both existing targets while setting the compilation
>> framework so that it can also handle the riscv128-softmmu target.
>> This includes gdb configuration files, that are just the bare copy of the
>> 64-bit ones as gdb does not honor, yet, 128-bit CPUs.
>> To consider the 3 xlen values, we had to add a misah field, representing the
>> upper 64 bits of the misa register.
>>
>> Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
>> Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
>> ---
>>  configs/devices/riscv128-softmmu/default.mak | 16 ++++++
>>  configs/targets/riscv128-softmmu.mak         |  5 ++
>>  gdb-xml/riscv-128bit-cpu.xml                 | 48 ++++++++++++++++++
>>  gdb-xml/riscv-128bit-virtual.xml             | 12 +++++
>>  include/hw/riscv/sifive_cpu.h                |  4 ++
>>  target/riscv/Kconfig                         |  3 ++
>>  target/riscv/arch_dump.c                     |  3 +-
>>  target/riscv/cpu-param.h                     |  3 +-
>>  target/riscv/cpu.c                           | 51 +++++++++++++++++---
>>  target/riscv/cpu.h                           | 19 ++++++++
>>  target/riscv/gdbstub.c                       |  3 ++
>>  target/riscv/insn_trans/trans_rvd.c.inc      | 10 ++--
>>  target/riscv/insn_trans/trans_rvf.c.inc      |  2 +-
>>  target/riscv/translate.c                     | 45 ++++++++++++++++-
>>  14 files changed, 209 insertions(+), 15 deletions(-)
>>  create mode 100644 configs/devices/riscv128-softmmu/default.mak
>>  create mode 100644 configs/targets/riscv128-softmmu.mak
>>  create mode 100644 gdb-xml/riscv-128bit-cpu.xml
>>  create mode 100644 gdb-xml/riscv-128bit-virtual.xml
> 
> Hey!
> 
> Thanks for the patches!
> 
> Overall this patch looks good.

  Thanks for cheering!

> It would greatly help reviewing and the speed at which this can be
> merged if you can split it up more. A lot of these changes probably
> can be separate patches (for example a patch to add misah). I know it
> can sometimes seem a little silly, but it greatly helps with reviewing
> when patches are small and self contained.

  Ok, got it.
>>
>> diff --git a/configs/devices/riscv128-softmmu/default.mak b/configs/devices/riscv128-softmmu/default.mak
>> new file mode 100644
>> index 0000000000..31439dbcfe
>> --- /dev/null
>> +++ b/configs/devices/riscv128-softmmu/default.mak
>> @@ -0,0 +1,16 @@
>> +# Default configuration for riscv128-softmmu
>> +
>> +# Uncomment the following lines to disable these optional devices:
>> +#
>> +#CONFIG_PCI_DEVICES=n
>> +CONFIG_SEMIHOSTING=y
>> +CONFIG_ARM_COMPATIBLE_SEMIHOSTING=y
>> +
>> +# Boards:
>> +#
>> +CONFIG_SPIKE=n
>> +CONFIG_SIFIVE_E=n
>> +CONFIG_SIFIVE_U=n
>> +CONFIG_RISCV_VIRT=y
>> +CONFIG_MICROCHIP_PFSOC=n
>> +CONFIG_SHAKTI_C=n
>> diff --git a/configs/targets/riscv128-softmmu.mak b/configs/targets/riscv128-softmmu.mak
>> new file mode 100644
>> index 0000000000..e300c43c8e
>> --- /dev/null
>> +++ b/configs/targets/riscv128-softmmu.mak
>> @@ -0,0 +1,5 @@
>> +TARGET_ARCH=riscv128
>> +TARGET_BASE_ARCH=riscv
>> +TARGET_SUPPORTS_MTTCG=y
>> +TARGET_XML_FILES= gdb-xml/riscv-128bit-cpu.xml gdb-xml/riscv-32bit-fpu.xml gdb-xml/riscv-64bit-fpu.xml gdb-xml/riscv-128bit-virtual.xml
>> +TARGET_NEED_FDT=y
>> diff --git a/gdb-xml/riscv-128bit-cpu.xml b/gdb-xml/riscv-128bit-cpu.xml
>> new file mode 100644
>> index 0000000000..c98168148f
>> --- /dev/null
>> +++ b/gdb-xml/riscv-128bit-cpu.xml
>> @@ -0,0 +1,48 @@
>> +<?xml version="1.0"?>
>> +<!-- Copyright (C) 2018-2019 Free Software Foundation, Inc.
>> +
>> +     Copying and distribution of this file, with or without modification,
>> +     are permitted in any medium without royalty provided the copyright
>> +     notice and this notice are preserved.  -->
>> +
>> +<!-- Register numbers are hard-coded in order to maintain backward
>> +     compatibility with older versions of tools that didn't use xml
>> +     register descriptions.  -->
>> +
>> +<!DOCTYPE feature SYSTEM "gdb-target.dtd">
>> +<!-- FIXME : All GPRs are marked as 64-bits since gdb doesn't like 128-bit registers for now. -->
>> +<feature name="org.gnu.gdb.riscv.cpu">
>> +  <reg name="zero" bitsize="64" type="int" regnum="0"/>
>> +  <reg name="ra" bitsize="64" type="code_ptr"/>
>> +  <reg name="sp" bitsize="64" type="data_ptr"/>
>> +  <reg name="gp" bitsize="64" type="data_ptr"/>
>> +  <reg name="tp" bitsize="64" type="data_ptr"/>
>> +  <reg name="t0" bitsize="64" type="int"/>
>> +  <reg name="t1" bitsize="64" type="int"/>
>> +  <reg name="t2" bitsize="64" type="int"/>
>> +  <reg name="fp" bitsize="64" type="data_ptr"/>
>> +  <reg name="s1" bitsize="64" type="int"/>
>> +  <reg name="a0" bitsize="64" type="int"/>
>> +  <reg name="a1" bitsize="64" type="int"/>
>> +  <reg name="a2" bitsize="64" type="int"/>
>> +  <reg name="a3" bitsize="64" type="int"/>
>> +  <reg name="a4" bitsize="64" type="int"/>
>> +  <reg name="a5" bitsize="64" type="int"/>
>> +  <reg name="a6" bitsize="64" type="int"/>
>> +  <reg name="a7" bitsize="64" type="int"/>
>> +  <reg name="s2" bitsize="64" type="int"/>
>> +  <reg name="s3" bitsize="64" type="int"/>
>> +  <reg name="s4" bitsize="64" type="int"/>
>> +  <reg name="s5" bitsize="64" type="int"/>
>> +  <reg name="s6" bitsize="64" type="int"/>
>> +  <reg name="s7" bitsize="64" type="int"/>
>> +  <reg name="s8" bitsize="64" type="int"/>
>> +  <reg name="s9" bitsize="64" type="int"/>
>> +  <reg name="s10" bitsize="64" type="int"/>
>> +  <reg name="s11" bitsize="64" type="int"/>
>> +  <reg name="t3" bitsize="64" type="int"/>
>> +  <reg name="t4" bitsize="64" type="int"/>
>> +  <reg name="t5" bitsize="64" type="int"/>
>> +  <reg name="t6" bitsize="64" type="int"/>
>> +  <reg name="pc" bitsize="64" type="code_ptr"/>
>> +</feature>
>> diff --git a/gdb-xml/riscv-128bit-virtual.xml b/gdb-xml/riscv-128bit-virtual.xml
>> new file mode 100644
>> index 0000000000..db9a0ff677
>> --- /dev/null
>> +++ b/gdb-xml/riscv-128bit-virtual.xml
>> @@ -0,0 +1,12 @@
>> +<?xml version="1.0"?>
>> +<!-- Copyright (C) 2018-2019 Free Software Foundation, Inc.
>> +
>> +     Copying and distribution of this file, with or without modification,
>> +     are permitted in any medium without royalty provided the copyright
>> +     notice and this notice are preserved.  -->
>> +
>> +<!DOCTYPE feature SYSTEM "gdb-target.dtd">
>> +<!-- FIXME : priv marked as 64-bits since gdb doesn't like 128-bit registers for now. -->
>> +<feature name="org.gnu.gdb.riscv.virtual">
>> +  <reg name="priv" bitsize="64"/>
>> +</feature>
>> diff --git a/include/hw/riscv/sifive_cpu.h b/include/hw/riscv/sifive_cpu.h
>> index 136799633a..2fd441664f 100644
>> --- a/include/hw/riscv/sifive_cpu.h
>> +++ b/include/hw/riscv/sifive_cpu.h
>> @@ -26,6 +26,10 @@
>>  #elif defined(TARGET_RISCV64)
>>  #define SIFIVE_E_CPU TYPE_RISCV_CPU_SIFIVE_E51
>>  #define SIFIVE_U_CPU TYPE_RISCV_CPU_SIFIVE_U54
>> +#elif defined(TARGET_RISCV128)
>> +/* 128-bit uses 64-bit CPU for now, since no cpu implements RV128 */
>> +#define SIFIVE_E_CPU TYPE_RISCV_CPU_SIFIVE_E51
>> +#define SIFIVE_U_CPU TYPE_RISCV_CPU_SIFIVE_U54
>>  #endif
>>
>>  #endif /* HW_SIFIVE_CPU_H */
>> diff --git a/target/riscv/Kconfig b/target/riscv/Kconfig
>> index b9e5932f13..f9ea52a59a 100644
>> --- a/target/riscv/Kconfig
>> +++ b/target/riscv/Kconfig
>> @@ -3,3 +3,6 @@ config RISCV32
>>
>>  config RISCV64
>>      bool
>> +
>> +config RISCV128
>> +    bool
>> diff --git a/target/riscv/arch_dump.c b/target/riscv/arch_dump.c
>> index 709f621d82..f756ed2988 100644
>> --- a/target/riscv/arch_dump.c
>> +++ b/target/riscv/arch_dump.c
>> @@ -176,7 +176,8 @@ int cpu_get_dump_info(ArchDumpInfo *info,
>>
>>      info->d_machine = EM_RISCV;
>>
>> -#if defined(TARGET_RISCV64)
>> +#if defined(TARGET_RISCV64) || defined(TARGET_RISCV128)
>> +    /* FIXME : No 128-bit ELF class exists (for now), use 64-bit one. */
>>      info->d_class = ELFCLASS64;
>>  #else
>>      info->d_class = ELFCLASS32;
>> diff --git a/target/riscv/cpu-param.h b/target/riscv/cpu-param.h
>> index 80eb615f93..e6d0651f60 100644
>> --- a/target/riscv/cpu-param.h
>> +++ b/target/riscv/cpu-param.h
>> @@ -8,7 +8,8 @@
>>  #ifndef RISCV_CPU_PARAM_H
>>  #define RISCV_CPU_PARAM_H 1
>>
>> -#if defined(TARGET_RISCV64)
>> +/* 64-bit target, since QEMU isn't built to have TARGET_LONG_BITS over 64 */
>> +#if defined(TARGET_RISCV64) || defined(TARGET_RISCV128)
>>  # define TARGET_LONG_BITS 64
>>  # define TARGET_PHYS_ADDR_SPACE_BITS 56 /* 44-bit PPN */
>>  # define TARGET_VIRT_ADDR_SPACE_BITS 48 /* sv48 */
>> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
>> index 991a6bb760..1f15026e9c 100644
>> --- a/target/riscv/cpu.c
>> +++ b/target/riscv/cpu.c
>> @@ -110,18 +110,38 @@ const char *riscv_cpu_get_trap_name(target_ulong cause, bool async)
>>
>>  bool riscv_cpu_is_32bit(CPURISCVState *env)
>>  {
>> -    if (env->misa & RV64) {
>> -        return false;
>> -    }
>> +    return (env->misa & MXLEN_MASK) == RV32;
>> +}
>>
>> -    return true;
>> +bool riscv_cpu_is_64bit(CPURISCVState *env)
>> +{
>> +    return (env->misa & MXLEN_MASK) == RV64;
>>  }
>>
>> +#if defined(TARGET_RISCV128)
> 
> Don't add any TARGET_* defines.
> 
> We are trying to move to a point where the 64-bit RISC-V softmmu can
> run 32-bit CPUs. Ideally we want the same with 128-bit. You don't have
> to get that working, but don't add any compile time conditionals.
> 
> That applies to all code, not just this patch. Unless there is already
> a conditional TARGET_* compile please don't add one.

  Duly noted,
  Frédéric
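
  A minimal sketch of the run-time approach, assuming misa is modelled as a
  128-bit value split into a low half (misa) and a high half (misah) as in
  this patch, with the MXL field in the top two bits as the privileged
  specification defines it (1 = 32, 2 = 64, 3 = 128). SkEnv and the sk_
  helpers are hypothetical, and the bit position used here is not necessarily
  the one the series settles on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of CPURISCVState. */
typedef struct {
    uint64_t misa;   /* bits 63:0 of misa */
    uint64_t misah;  /* bits 127:64 of misa */
} SkEnv;

/* Assumption: MXL sits in bits 127:126, i.e. bits 63:62 of misah. */
static unsigned sk_mxl(const SkEnv *env)
{
    return (unsigned)(env->misah >> 62);
}

/* Register width in bits: MXL 1, 2, 3 map to 32, 64, 128. */
static unsigned sk_xlen(const SkEnv *env)
{
    unsigned mxl = sk_mxl(env);

    assert(mxl >= 1 && mxl <= 3);
    return 16u << mxl;
}

/* Each width is tested positively, never as "not the other one". */
static bool sk_is_32bit(const SkEnv *env)  { return sk_mxl(env) == 1; }
static bool sk_is_64bit(const SkEnv *env)  { return sk_mxl(env) == 2; }
static bool sk_is_128bit(const SkEnv *env) { return sk_mxl(env) == 3; }
```

  Because every predicate reads the CPU state, the same binary can tell the
  three widths apart without any TARGET_* conditional.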
> 
> Alistair
> 

-- 
+---------------------------------------------------------------------------+
| Frédéric Pétrot, Pr. Grenoble INP-Ensimag/TIMA,   Ensimag deputy director |
| Mob/Pho: +33 6 74 57 99 65/+33 4 76 57 48 70      Ad augusta  per angusta |
| http://tima.univ-grenoble-alpes.fr frederic.petrot@univ-grenoble-alpes.fr |
+---------------------------------------------------------------------------+



end of thread, other threads:[~2021-08-31 16:22 UTC | newest]

Thread overview: 22+ messages
2021-08-30 17:16 [PATCH 1/8] target/riscv: Settings for 128-bit extension support Frédéric Pétrot
2021-08-30 17:16 ` [PATCH 2/8] target/riscv: 128-bit registers creation and access Frédéric Pétrot
2021-08-30 21:34   ` Philippe Mathieu-Daudé
2021-08-30 17:16 ` [PATCH 3/8] target/riscv: Addition of 128-bit ldu, lq and sq instructions Frédéric Pétrot
2021-08-30 21:35   ` Philippe Mathieu-Daudé
2021-08-31  2:24   ` Richard Henderson
2021-08-31 16:00     ` Frédéric Pétrot
2021-08-31  2:30   ` Richard Henderson
2021-08-30 17:16 ` [PATCH 4/8] target/riscv: 128-bit arithmetic and logic instructions Frédéric Pétrot
2021-08-30 21:38   ` Philippe Mathieu-Daudé
2021-08-30 21:40     ` Philippe Mathieu-Daudé
2021-08-31 15:57       ` Frédéric Pétrot
2021-08-31  3:32     ` Richard Henderson
2021-08-31  3:30   ` Richard Henderson
2021-08-30 17:16 ` [PATCH 5/8] target/riscv: 128-bit multiply and divide Frédéric Pétrot
2021-08-30 17:16 ` [PATCH 6/8] target/riscv: Support of compiler's 128-bit integer types Frédéric Pétrot
2021-08-31  3:38   ` Richard Henderson
2021-08-30 17:16 ` [PATCH 7/8] target/riscv: 128-bit support for some csrs Frédéric Pétrot
2021-08-31  3:43   ` Richard Henderson
2021-08-30 17:16 ` [PATCH 8/8] target/riscv: Support for 128-bit satp Frédéric Pétrot
2021-08-31  3:13 ` [PATCH 1/8] target/riscv: Settings for 128-bit extension support Alistair Francis
2021-08-31 16:20   ` Frédéric Pétrot
