* [PATCH v6 0/6] support subsets of Float-Point in Integer Registers extensions
@ 2022-02-11  4:39 ` Weiwei Li
  0 siblings, 0 replies; 22+ messages in thread
From: Weiwei Li @ 2022-02-11  4:39 UTC (permalink / raw)
  To: richard.henderson, palmer, alistair.francis, bin.meng,
	qemu-riscv, qemu-devel
  Cc: wangjunqiang, Weiwei Li, lazyparser, ardxwe

This patchset implements the RISC-V Floating-Point in Integer Registers extensions (version 1.0): Zfinx, Zdinx, Zhinx and Zhinxmin.

Specification:
https://github.com/riscv/riscv-zfinx/blob/main/zfinx-1.0.0.pdf

The port is available here:
https://github.com/plctlab/plct-qemu/tree/plct-zfinx-upstream-v6

To test this implementation, pass the CPU properties 'zfinx=true,zdinx=true,zhinx=true,zhinxmin=true' together with 'g=false,f=false,d=false,Zfh=false,Zfhmin=false'.
The implementation passes the GCC tests; CI results are available at https://ci.rvperf.org/job/plct-qemu-zfinx-upstream/.
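
The property combination above follows from two rules that patch 1/6 adds to riscv_cpu_realize(): each of zdinx, zhinx and zhinxmin implies zfinx, and zfinx is rejected when F, D, Zfh or Zfhmin is enabled. A minimal standalone sketch of those rules follows. The struct fp_cfg and the function fp_cfg_validate() are hypothetical names used only for illustration; they are not QEMU APIs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant RISCVCPUConfig fields. */
struct fp_cfg {
    bool ext_f, ext_d, ext_zfh, ext_zfhmin;
    bool ext_zfinx, ext_zdinx, ext_zhinx, ext_zhinxmin;
};

/*
 * Returns NULL when the combination is accepted, or an error string when
 * it is rejected. May set cfg->ext_zfinx as a side effect.
 */
static const char *fp_cfg_validate(struct fp_cfg *cfg)
{
    /* Zdinx, Zhinx and Zhinxmin each imply Zfinx. */
    if (cfg->ext_zdinx || cfg->ext_zhinx || cfg->ext_zhinxmin) {
        cfg->ext_zfinx = true;
    }

    /* Zfinx cannot be combined with extensions that use the F registers. */
    if (cfg->ext_zfinx &&
        (cfg->ext_f || cfg->ext_d || cfg->ext_zfh || cfg->ext_zfhmin)) {
        return "'Zfinx' cannot be supported together with "
               "'F', 'D', 'Zfh', 'Zfhmin'";
    }
    return NULL;
}
```

With this model, enabling only zdinx is accepted and turns zfinx on, while enabling zhinx together with f is rejected. That is why the test command line has to switch g, f, d, Zfh and Zfhmin off explicitly.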

v6:
* rename flags Z*inx to z*inx
* rebase on apply-to-riscv.next

v5:
* put definition of ftemp and nftemp together, add comments for them
* separate the declaration of variable i from the loop

v4:
* combine register pair check for rv32 zdinx
* clear mstatus.FS when RVF is disabled by write_misa

v3:
* delete unused reset for mstatus.FS
* use positive test for RVF instead of negative test for ZFINX
* replace get_ol with get_xl
* use tcg_gen_concat_tl_i64 to unify tcg_gen_concat_i32_i64 and tcg_gen_deposit_i64

v2:
* hardwire mstatus.FS to zero when enable zfinx
* do register-pair check at the beginning of translation
* optimize partial implementation as suggested
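
Several changelog items concern mstatus.FS: it is hardwired to zero when zfinx is enabled (v2) and cleared when RVF is disabled through write_misa (v4). The sketch below models that behaviour outside QEMU. The SKETCH_* constants and both function names are assumptions made for illustration; the bit positions follow the RISC-V privileged specification (FS in bits 14:13, SIE in bit 1), and SIE merely stands in for the other writable mstatus bits.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MSTATUS_SIE 0x00000002ULL /* stand-in for other writable bits */
#define SKETCH_MSTATUS_FS  0x00006000ULL /* mstatus.FS, bits 14:13 */

/* Model of an mstatus write: FS is writable only when F is present. */
static uint64_t sketch_write_mstatus(uint64_t old, uint64_t val, bool has_f)
{
    uint64_t mask = SKETCH_MSTATUS_SIE;

    if (has_f) {
        mask |= SKETCH_MSTATUS_FS;
    }
    return (old & ~mask) | (val & mask);
}

/* Model of the misa write path: dropping F also clears mstatus.FS. */
static uint64_t sketch_clear_fs_on_misa_write(uint64_t mstatus,
                                              bool new_misa_has_f)
{
    if (!new_misa_has_f) {
        mstatus &= ~SKETCH_MSTATUS_FS;
    }
    return mstatus;
}
```

Because F and zfinx are mutually exclusive, testing for F (the "positive test for RVF" from v3) is enough to keep FS at zero on a zfinx CPU.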

Weiwei Li (6):
  target/riscv: add cfg properties for zfinx, zdinx and zhinx{min}
  target/riscv: hardwire mstatus.FS to zero when enable zfinx
  target/riscv: add support for zfinx
  target/riscv: add support for zdinx
  target/riscv: add support for zhinx/zhinxmin
  target/riscv: expose zfinx, zdinx, zhinx{min} properties

 target/riscv/cpu.c                        |  17 ++
 target/riscv/cpu.h                        |   4 +
 target/riscv/cpu_helper.c                 |   6 +-
 target/riscv/csr.c                        |  25 +-
 target/riscv/fpu_helper.c                 | 178 ++++++------
 target/riscv/helper.h                     |   4 +-
 target/riscv/insn_trans/trans_rvd.c.inc   | 285 ++++++++++++++-----
 target/riscv/insn_trans/trans_rvf.c.inc   | 314 +++++++++++++-------
 target/riscv/insn_trans/trans_rvzfh.c.inc | 332 +++++++++++++++-------
 target/riscv/internals.h                  |  32 ++-
 target/riscv/translate.c                  | 149 +++++++++-
 11 files changed, 974 insertions(+), 372 deletions(-)

-- 
2.17.1



* [PATCH v6 1/6] target/riscv: add cfg properties for zfinx, zdinx and zhinx{min}
  2022-02-11  4:39 ` Weiwei Li
@ 2022-02-11  4:39   ` Weiwei Li
  -1 siblings, 0 replies; 22+ messages in thread
From: Weiwei Li @ 2022-02-11  4:39 UTC (permalink / raw)
  To: richard.henderson, palmer, alistair.francis, bin.meng,
	qemu-riscv, qemu-devel
  Cc: wangjunqiang, Weiwei Li, lazyparser, ardxwe

Co-authored-by: ardxwe <ardxwe@gmail.com>
Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
---
 target/riscv/cpu.c | 12 ++++++++++++
 target/riscv/cpu.h |  4 ++++
 2 files changed, 16 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index b0a40b83e7..55371b1aa5 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -587,6 +587,11 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
             cpu->cfg.ext_d = true;
         }
 
+        if (cpu->cfg.ext_zdinx || cpu->cfg.ext_zhinx ||
+            cpu->cfg.ext_zhinxmin) {
+            cpu->cfg.ext_zfinx = true;
+        }
+
         /* Set the ISA extensions, checks should have happened above */
         if (cpu->cfg.ext_i) {
             ext |= RVI;
@@ -665,6 +670,13 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
         if (cpu->cfg.ext_j) {
             ext |= RVJ;
         }
+        if (cpu->cfg.ext_zfinx && ((ext & (RVF | RVD)) || cpu->cfg.ext_zfh ||
+                                   cpu->cfg.ext_zfhmin)) {
+            error_setg(errp,
+                    "'Zfinx' cannot be supported together with 'F', 'D', 'Zfh',"
+                    " 'Zfhmin'");
+            return;
+        }
 
         set_misa(env, env->misa_mxl, ext);
     }
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 8183fb86d5..9ba05042ed 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -362,8 +362,12 @@ struct RISCVCPUConfig {
     bool ext_svinval;
     bool ext_svnapot;
     bool ext_svpbmt;
+    bool ext_zdinx;
     bool ext_zfh;
     bool ext_zfhmin;
+    bool ext_zfinx;
+    bool ext_zhinx;
+    bool ext_zhinxmin;
     bool ext_zve32f;
     bool ext_zve64f;
 
-- 
2.17.1



* [PATCH v6 2/6] target/riscv: hardwire mstatus.FS to zero when enable zfinx
  2022-02-11  4:39 ` Weiwei Li
@ 2022-02-11  4:39   ` Weiwei Li
  -1 siblings, 0 replies; 22+ messages in thread
From: Weiwei Li @ 2022-02-11  4:39 UTC (permalink / raw)
  To: richard.henderson, palmer, alistair.francis, bin.meng,
	qemu-riscv, qemu-devel
  Cc: wangjunqiang, Weiwei Li, lazyparser, ardxwe

Co-authored-by: ardxwe <ardxwe@gmail.com>
Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
---
 target/riscv/cpu_helper.c |  6 +++++-
 target/riscv/csr.c        | 25 ++++++++++++++++++++-----
 target/riscv/translate.c  |  4 ++++
 3 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 746335bfd6..1c60fb2e80 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -466,9 +466,13 @@ bool riscv_cpu_vector_enabled(CPURISCVState *env)
 
 void riscv_cpu_swap_hypervisor_regs(CPURISCVState *env)
 {
-    uint64_t mstatus_mask = MSTATUS_MXR | MSTATUS_SUM | MSTATUS_FS |
+    uint64_t mstatus_mask = MSTATUS_MXR | MSTATUS_SUM |
                             MSTATUS_SPP | MSTATUS_SPIE | MSTATUS_SIE |
                             MSTATUS64_UXL | MSTATUS_VS;
+
+    if (riscv_has_ext(env, RVF)) {
+        mstatus_mask |= MSTATUS_FS;
+    }
     bool current_virt = riscv_cpu_virt_enabled(env);
 
     g_assert(riscv_has_ext(env, RVH));
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index 387088a86c..93bba1ca1c 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -38,7 +38,8 @@ void riscv_set_csr_ops(int csrno, riscv_csr_operations *ops)
 static RISCVException fs(CPURISCVState *env, int csrno)
 {
 #if !defined(CONFIG_USER_ONLY)
-    if (!env->debugger && !riscv_cpu_fp_enabled(env)) {
+    if (!env->debugger && !riscv_cpu_fp_enabled(env) &&
+        !RISCV_CPU(env_cpu(env))->cfg.ext_zfinx) {
         return RISCV_EXCP_ILLEGAL_INST;
     }
 #endif
@@ -301,7 +302,9 @@ static RISCVException write_fflags(CPURISCVState *env, int csrno,
                                    target_ulong val)
 {
 #if !defined(CONFIG_USER_ONLY)
-    env->mstatus |= MSTATUS_FS;
+    if (riscv_has_ext(env, RVF)) {
+        env->mstatus |= MSTATUS_FS;
+    }
 #endif
     riscv_cpu_set_fflags(env, val & (FSR_AEXC >> FSR_AEXC_SHIFT));
     return RISCV_EXCP_NONE;
@@ -318,7 +321,9 @@ static RISCVException write_frm(CPURISCVState *env, int csrno,
                                 target_ulong val)
 {
 #if !defined(CONFIG_USER_ONLY)
-    env->mstatus |= MSTATUS_FS;
+    if (riscv_has_ext(env, RVF)) {
+        env->mstatus |= MSTATUS_FS;
+    }
 #endif
     env->frm = val & (FSR_RD >> FSR_RD_SHIFT);
     return RISCV_EXCP_NONE;
@@ -336,7 +341,9 @@ static RISCVException write_fcsr(CPURISCVState *env, int csrno,
                                  target_ulong val)
 {
 #if !defined(CONFIG_USER_ONLY)
-    env->mstatus |= MSTATUS_FS;
+    if (riscv_has_ext(env, RVF)) {
+        env->mstatus |= MSTATUS_FS;
+    }
 #endif
     env->frm = (val & FSR_RD) >> FSR_RD_SHIFT;
     riscv_cpu_set_fflags(env, (val & FSR_AEXC) >> FSR_AEXC_SHIFT);
@@ -652,10 +659,14 @@ static RISCVException write_mstatus(CPURISCVState *env, int csrno,
         tlb_flush(env_cpu(env));
     }
     mask = MSTATUS_SIE | MSTATUS_SPIE | MSTATUS_MIE | MSTATUS_MPIE |
-        MSTATUS_SPP | MSTATUS_FS | MSTATUS_MPRV | MSTATUS_SUM |
+        MSTATUS_SPP | MSTATUS_MPRV | MSTATUS_SUM |
         MSTATUS_MPP | MSTATUS_MXR | MSTATUS_TVM | MSTATUS_TSR |
         MSTATUS_TW | MSTATUS_VS;
 
+    if (riscv_has_ext(env, RVF)) {
+        mask |= MSTATUS_FS;
+    }
+
     if (xl != MXL_RV32 || env->debugger) {
         /*
          * RV32: MPV and GVA are not in mstatus. The current plan is to
@@ -787,6 +798,10 @@ static RISCVException write_misa(CPURISCVState *env, int csrno,
         return RISCV_EXCP_NONE;
     }
 
+    if (!(val & RVF)) {
+        env->mstatus &= ~MSTATUS_FS;
+    }
+
     /* flush translation cache */
     tb_flush(env_cpu(env));
     env->misa_ext = val;
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 84dbfa6340..c7232de326 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -426,6 +426,10 @@ static void mark_fs_dirty(DisasContext *ctx)
 {
     TCGv tmp;
 
+    if (!has_ext(ctx, RVF)) {
+        return;
+    }
+
     if (ctx->mstatus_fs != MSTATUS_FS) {
         /* Remember the state change for the rest of the TB. */
         ctx->mstatus_fs = MSTATUS_FS;
-- 
2.17.1



* [PATCH v6 3/6] target/riscv: add support for zfinx
  2022-02-11  4:39 ` Weiwei Li
@ 2022-02-11  4:39   ` Weiwei Li
  -1 siblings, 0 replies; 22+ messages in thread
From: Weiwei Li @ 2022-02-11  4:39 UTC (permalink / raw)
  To: richard.henderson, palmer, alistair.francis, bin.meng,
	qemu-riscv, qemu-devel
  Cc: wangjunqiang, Weiwei Li, lazyparser, ardxwe

  - update the extension check to REQUIRE_ZFINX_OR_F
  - update single-precision floating-point register read/write
  - disable the nanbox_s check when zfinx is enabled
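
For illustration, here is a standalone sketch of what disabling the NaN-box check could look like. The internals.h hunk is not quoted in this part of the thread, so everything below is an assumption: the sketch_* names are hypothetical, the zfinx flag is passed as a plain bool rather than read from env, and sign-extending the result under zfinx is an assumption about the intended behaviour on 64-bit targets.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_F32_DEFAULT_NAN 0x7fc00000u          /* canonical quiet NaN */
#define SKETCH_BOX_MASK        0xffffffff00000000ULL

/* Read a single-precision operand from a 64-bit register value. */
static uint32_t sketch_check_nanbox_s(bool zfinx, uint64_t f)
{
    if (zfinx) {
        /* Integer registers are not NaN-boxed: use the low 32 bits. */
        return (uint32_t)f;
    }
    /* With F, an improperly boxed value is treated as the default NaN. */
    return ((f & SKETCH_BOX_MASK) == SKETCH_BOX_MASK)
           ? (uint32_t)f : SKETCH_F32_DEFAULT_NAN;
}

/* Widen a single-precision result to a 64-bit register value. */
static uint64_t sketch_nanbox_s(bool zfinx, uint32_t f)
{
    if (zfinx) {
        /* Assumption: sign-extend, as for other 32-bit integer results. */
        return (uint64_t)(int64_t)(int32_t)f;
    }
    return (uint64_t)f | SKETCH_BOX_MASK;
}
```

In this model 1.0f (0x3f800000) read from a register with zero upper bits stays 1.0f under zfinx, whereas the F path would turn it into the default NaN.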

Co-authored-by: ardxwe <ardxwe@gmail.com>
Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/riscv/fpu_helper.c               |  89 +++----
 target/riscv/helper.h                   |   2 +-
 target/riscv/insn_trans/trans_rvf.c.inc | 314 ++++++++++++++++--------
 target/riscv/internals.h                |  16 +-
 target/riscv/translate.c                |  93 ++++++-
 5 files changed, 369 insertions(+), 145 deletions(-)

diff --git a/target/riscv/fpu_helper.c b/target/riscv/fpu_helper.c
index 4a5982d594..63ca703459 100644
--- a/target/riscv/fpu_helper.c
+++ b/target/riscv/fpu_helper.c
@@ -98,10 +98,11 @@ static uint64_t do_fmadd_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2,
 static uint64_t do_fmadd_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2,
                            uint64_t rs3, int flags)
 {
-    float32 frs1 = check_nanbox_s(rs1);
-    float32 frs2 = check_nanbox_s(rs2);
-    float32 frs3 = check_nanbox_s(rs3);
-    return nanbox_s(float32_muladd(frs1, frs2, frs3, flags, &env->fp_status));
+    float32 frs1 = check_nanbox_s(env, rs1);
+    float32 frs2 = check_nanbox_s(env, rs2);
+    float32 frs3 = check_nanbox_s(env, rs3);
+    return nanbox_s(env, float32_muladd(frs1, frs2, frs3, flags,
+                                        &env->fp_status));
 }
 
 uint64_t helper_fmadd_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
@@ -183,124 +184,124 @@ uint64_t helper_fnmadd_h(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
 
 uint64_t helper_fadd_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float32 frs1 = check_nanbox_s(rs1);
-    float32 frs2 = check_nanbox_s(rs2);
-    return nanbox_s(float32_add(frs1, frs2, &env->fp_status));
+    float32 frs1 = check_nanbox_s(env, rs1);
+    float32 frs2 = check_nanbox_s(env, rs2);
+    return nanbox_s(env, float32_add(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fsub_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float32 frs1 = check_nanbox_s(rs1);
-    float32 frs2 = check_nanbox_s(rs2);
-    return nanbox_s(float32_sub(frs1, frs2, &env->fp_status));
+    float32 frs1 = check_nanbox_s(env, rs1);
+    float32 frs2 = check_nanbox_s(env, rs2);
+    return nanbox_s(env, float32_sub(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fmul_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float32 frs1 = check_nanbox_s(rs1);
-    float32 frs2 = check_nanbox_s(rs2);
-    return nanbox_s(float32_mul(frs1, frs2, &env->fp_status));
+    float32 frs1 = check_nanbox_s(env, rs1);
+    float32 frs2 = check_nanbox_s(env, rs2);
+    return nanbox_s(env, float32_mul(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fdiv_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float32 frs1 = check_nanbox_s(rs1);
-    float32 frs2 = check_nanbox_s(rs2);
-    return nanbox_s(float32_div(frs1, frs2, &env->fp_status));
+    float32 frs1 = check_nanbox_s(env, rs1);
+    float32 frs2 = check_nanbox_s(env, rs2);
+    return nanbox_s(env, float32_div(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fmin_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float32 frs1 = check_nanbox_s(rs1);
-    float32 frs2 = check_nanbox_s(rs2);
-    return nanbox_s(env->priv_ver < PRIV_VERSION_1_11_0 ?
+    float32 frs1 = check_nanbox_s(env, rs1);
+    float32 frs2 = check_nanbox_s(env, rs2);
+    return nanbox_s(env, env->priv_ver < PRIV_VERSION_1_11_0 ?
                     float32_minnum(frs1, frs2, &env->fp_status) :
                     float32_minimum_number(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fmax_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float32 frs1 = check_nanbox_s(rs1);
-    float32 frs2 = check_nanbox_s(rs2);
-    return nanbox_s(env->priv_ver < PRIV_VERSION_1_11_0 ?
+    float32 frs1 = check_nanbox_s(env, rs1);
+    float32 frs2 = check_nanbox_s(env, rs2);
+    return nanbox_s(env, env->priv_ver < PRIV_VERSION_1_11_0 ?
                     float32_maxnum(frs1, frs2, &env->fp_status) :
                     float32_maximum_number(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fsqrt_s(CPURISCVState *env, uint64_t rs1)
 {
-    float32 frs1 = check_nanbox_s(rs1);
-    return nanbox_s(float32_sqrt(frs1, &env->fp_status));
+    float32 frs1 = check_nanbox_s(env, rs1);
+    return nanbox_s(env, float32_sqrt(frs1, &env->fp_status));
 }
 
 target_ulong helper_fle_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float32 frs1 = check_nanbox_s(rs1);
-    float32 frs2 = check_nanbox_s(rs2);
+    float32 frs1 = check_nanbox_s(env, rs1);
+    float32 frs2 = check_nanbox_s(env, rs2);
     return float32_le(frs1, frs2, &env->fp_status);
 }
 
 target_ulong helper_flt_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float32 frs1 = check_nanbox_s(rs1);
-    float32 frs2 = check_nanbox_s(rs2);
+    float32 frs1 = check_nanbox_s(env, rs1);
+    float32 frs2 = check_nanbox_s(env, rs2);
     return float32_lt(frs1, frs2, &env->fp_status);
 }
 
 target_ulong helper_feq_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float32 frs1 = check_nanbox_s(rs1);
-    float32 frs2 = check_nanbox_s(rs2);
+    float32 frs1 = check_nanbox_s(env, rs1);
+    float32 frs2 = check_nanbox_s(env, rs2);
     return float32_eq_quiet(frs1, frs2, &env->fp_status);
 }
 
 target_ulong helper_fcvt_w_s(CPURISCVState *env, uint64_t rs1)
 {
-    float32 frs1 = check_nanbox_s(rs1);
+    float32 frs1 = check_nanbox_s(env, rs1);
     return float32_to_int32(frs1, &env->fp_status);
 }
 
 target_ulong helper_fcvt_wu_s(CPURISCVState *env, uint64_t rs1)
 {
-    float32 frs1 = check_nanbox_s(rs1);
+    float32 frs1 = check_nanbox_s(env, rs1);
     return (int32_t)float32_to_uint32(frs1, &env->fp_status);
 }
 
 target_ulong helper_fcvt_l_s(CPURISCVState *env, uint64_t rs1)
 {
-    float32 frs1 = check_nanbox_s(rs1);
+    float32 frs1 = check_nanbox_s(env, rs1);
     return float32_to_int64(frs1, &env->fp_status);
 }
 
 target_ulong helper_fcvt_lu_s(CPURISCVState *env, uint64_t rs1)
 {
-    float32 frs1 = check_nanbox_s(rs1);
+    float32 frs1 = check_nanbox_s(env, rs1);
     return float32_to_uint64(frs1, &env->fp_status);
 }
 
 uint64_t helper_fcvt_s_w(CPURISCVState *env, target_ulong rs1)
 {
-    return nanbox_s(int32_to_float32((int32_t)rs1, &env->fp_status));
+    return nanbox_s(env, int32_to_float32((int32_t)rs1, &env->fp_status));
 }
 
 uint64_t helper_fcvt_s_wu(CPURISCVState *env, target_ulong rs1)
 {
-    return nanbox_s(uint32_to_float32((uint32_t)rs1, &env->fp_status));
+    return nanbox_s(env, uint32_to_float32((uint32_t)rs1, &env->fp_status));
 }
 
 uint64_t helper_fcvt_s_l(CPURISCVState *env, target_ulong rs1)
 {
-    return nanbox_s(int64_to_float32(rs1, &env->fp_status));
+    return nanbox_s(env, int64_to_float32(rs1, &env->fp_status));
 }
 
 uint64_t helper_fcvt_s_lu(CPURISCVState *env, target_ulong rs1)
 {
-    return nanbox_s(uint64_to_float32(rs1, &env->fp_status));
+    return nanbox_s(env, uint64_to_float32(rs1, &env->fp_status));
 }
 
-target_ulong helper_fclass_s(uint64_t rs1)
+target_ulong helper_fclass_s(CPURISCVState *env, uint64_t rs1)
 {
-    float32 frs1 = check_nanbox_s(rs1);
+    float32 frs1 = check_nanbox_s(env, rs1);
     return fclass_s(frs1);
 }
 
@@ -340,12 +341,12 @@ uint64_t helper_fmax_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
 
 uint64_t helper_fcvt_s_d(CPURISCVState *env, uint64_t rs1)
 {
-    return nanbox_s(float64_to_float32(rs1, &env->fp_status));
+    return nanbox_s(env, float64_to_float32(rs1, &env->fp_status));
 }
 
 uint64_t helper_fcvt_d_s(CPURISCVState *env, uint64_t rs1)
 {
-    float32 frs1 = check_nanbox_s(rs1);
+    float32 frs1 = check_nanbox_s(env, rs1);
     return float32_to_float64(frs1, &env->fp_status);
 }
 
@@ -539,14 +540,14 @@ uint64_t helper_fcvt_h_lu(CPURISCVState *env, target_ulong rs1)
 
 uint64_t helper_fcvt_h_s(CPURISCVState *env, uint64_t rs1)
 {
-    float32 frs1 = check_nanbox_s(rs1);
+    float32 frs1 = check_nanbox_s(env, rs1);
     return nanbox_h(float32_to_float16(frs1, true, &env->fp_status));
 }
 
 uint64_t helper_fcvt_s_h(CPURISCVState *env, uint64_t rs1)
 {
     float16 frs1 = check_nanbox_h(rs1);
-    return nanbox_s(float16_to_float32(frs1, true, &env->fp_status));
+    return nanbox_s(env, float16_to_float32(frs1, true, &env->fp_status));
 }
 
 uint64_t helper_fcvt_h_d(CPURISCVState *env, uint64_t rs1)
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 72cc2582f4..89195aad9d 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -38,7 +38,7 @@ DEF_HELPER_FLAGS_2(fcvt_s_w, TCG_CALL_NO_RWG, i64, env, tl)
 DEF_HELPER_FLAGS_2(fcvt_s_wu, TCG_CALL_NO_RWG, i64, env, tl)
 DEF_HELPER_FLAGS_2(fcvt_s_l, TCG_CALL_NO_RWG, i64, env, tl)
 DEF_HELPER_FLAGS_2(fcvt_s_lu, TCG_CALL_NO_RWG, i64, env, tl)
-DEF_HELPER_FLAGS_1(fclass_s, TCG_CALL_NO_RWG_SE, tl, i64)
+DEF_HELPER_FLAGS_2(fclass_s, TCG_CALL_NO_RWG_SE, tl, env, i64)
 
 /* Floating Point - Double Precision */
 DEF_HELPER_FLAGS_3(fadd_d, TCG_CALL_NO_RWG, i64, env, i64, i64)
diff --git a/target/riscv/insn_trans/trans_rvf.c.inc b/target/riscv/insn_trans/trans_rvf.c.inc
index 0aac87f7db..a1d3eb52ad 100644
--- a/target/riscv/insn_trans/trans_rvf.c.inc
+++ b/target/riscv/insn_trans/trans_rvf.c.inc
@@ -20,7 +20,14 @@
 
 #define REQUIRE_FPU do {\
     if (ctx->mstatus_fs == 0) \
-        return false;                       \
+        if (!ctx->cfg_ptr->ext_zfinx) \
+            return false; \
+} while (0)
+
+#define REQUIRE_ZFINX_OR_F(ctx) do {\
+    if (!ctx->cfg_ptr->ext_zfinx) { \
+        REQUIRE_EXT(ctx, RVF); \
+    } \
 } while (0)
 
 static bool trans_flw(DisasContext *ctx, arg_flw *a)
@@ -55,10 +62,16 @@ static bool trans_fsw(DisasContext *ctx, arg_fsw *a)
 static bool trans_fmadd_s(DisasContext *ctx, arg_fmadd_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
+
     gen_set_rm(ctx, a->rm);
-    gen_helper_fmadd_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fmadd_s(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -66,10 +79,16 @@ static bool trans_fmadd_s(DisasContext *ctx, arg_fmadd_s *a)
 static bool trans_fmsub_s(DisasContext *ctx, arg_fmsub_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
+
     gen_set_rm(ctx, a->rm);
-    gen_helper_fmsub_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fmsub_s(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -77,10 +96,16 @@ static bool trans_fmsub_s(DisasContext *ctx, arg_fmsub_s *a)
 static bool trans_fnmsub_s(DisasContext *ctx, arg_fnmsub_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
+
     gen_set_rm(ctx, a->rm);
-    gen_helper_fnmsub_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fnmsub_s(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -88,10 +113,16 @@ static bool trans_fnmsub_s(DisasContext *ctx, arg_fnmsub_s *a)
 static bool trans_fnmadd_s(DisasContext *ctx, arg_fnmadd_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
+
     gen_set_rm(ctx, a->rm);
-    gen_helper_fnmadd_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fnmadd_s(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -99,11 +130,15 @@ static bool trans_fnmadd_s(DisasContext *ctx, arg_fnmadd_s *a)
 static bool trans_fadd_s(DisasContext *ctx, arg_fadd_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fadd_s(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_fadd_s(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -111,11 +146,15 @@ static bool trans_fadd_s(DisasContext *ctx, arg_fadd_s *a)
 static bool trans_fsub_s(DisasContext *ctx, arg_fsub_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fsub_s(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_fsub_s(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -123,11 +162,15 @@ static bool trans_fsub_s(DisasContext *ctx, arg_fsub_s *a)
 static bool trans_fmul_s(DisasContext *ctx, arg_fmul_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fmul_s(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_fmul_s(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -135,11 +178,15 @@ static bool trans_fmul_s(DisasContext *ctx, arg_fmul_s *a)
 static bool trans_fdiv_s(DisasContext *ctx, arg_fdiv_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fdiv_s(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_fdiv_s(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -147,10 +194,14 @@ static bool trans_fdiv_s(DisasContext *ctx, arg_fdiv_s *a)
 static bool trans_fsqrt_s(DisasContext *ctx, arg_fsqrt_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fsqrt_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fsqrt_s(dest, cpu_env, src1);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -158,22 +209,37 @@ static bool trans_fsqrt_s(DisasContext *ctx, arg_fsqrt_s *a)
 static bool trans_fsgnj_s(DisasContext *ctx, arg_fsgnj_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     if (a->rs1 == a->rs2) { /* FMOV */
-        gen_check_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rs1]);
+        if (!ctx->cfg_ptr->ext_zfinx) {
+            gen_check_nanbox_s(dest, src1);
+        } else {
+            tcg_gen_ext32s_i64(dest, src1);
+        }
     } else { /* FSGNJ */
-        TCGv_i64 rs1 = tcg_temp_new_i64();
-        TCGv_i64 rs2 = tcg_temp_new_i64();
-
-        gen_check_nanbox_s(rs1, cpu_fpr[a->rs1]);
-        gen_check_nanbox_s(rs2, cpu_fpr[a->rs2]);
-
-        /* This formulation retains the nanboxing of rs2. */
-        tcg_gen_deposit_i64(cpu_fpr[a->rd], rs2, rs1, 0, 31);
-        tcg_temp_free_i64(rs1);
-        tcg_temp_free_i64(rs2);
+        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
+
+        if (!ctx->cfg_ptr->ext_zfinx) {
+            TCGv_i64 rs1 = tcg_temp_new_i64();
+            TCGv_i64 rs2 = tcg_temp_new_i64();
+            gen_check_nanbox_s(rs1, src1);
+            gen_check_nanbox_s(rs2, src2);
+
+            /* This formulation retains the nanboxing of rs2 in normal 'F'. */
+            tcg_gen_deposit_i64(dest, rs2, rs1, 0, 31);
+
+            tcg_temp_free_i64(rs1);
+            tcg_temp_free_i64(rs2);
+        } else {
+            tcg_gen_deposit_i64(dest, src2, src1, 0, 31);
+            tcg_gen_ext32s_i64(dest, dest);
+        }
     }
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -183,16 +249,27 @@ static bool trans_fsgnjn_s(DisasContext *ctx, arg_fsgnjn_s *a)
     TCGv_i64 rs1, rs2, mask;
 
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
-    rs1 = tcg_temp_new_i64();
-    gen_check_nanbox_s(rs1, cpu_fpr[a->rs1]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
+    rs1 = tcg_temp_new_i64();
+    if (!ctx->cfg_ptr->ext_zfinx) {
+        gen_check_nanbox_s(rs1, src1);
+    } else {
+        tcg_gen_mov_i64(rs1, src1);
+    }
     if (a->rs1 == a->rs2) { /* FNEG */
-        tcg_gen_xori_i64(cpu_fpr[a->rd], rs1, MAKE_64BIT_MASK(31, 1));
+        tcg_gen_xori_i64(dest, rs1, MAKE_64BIT_MASK(31, 1));
     } else {
+        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
         rs2 = tcg_temp_new_i64();
-        gen_check_nanbox_s(rs2, cpu_fpr[a->rs2]);
+        if (!ctx->cfg_ptr->ext_zfinx) {
+            gen_check_nanbox_s(rs2, src2);
+        } else {
+            tcg_gen_mov_i64(rs2, src2);
+        }
 
         /*
          * Replace bit 31 in rs1 with inverse in rs2.
@@ -200,13 +277,17 @@ static bool trans_fsgnjn_s(DisasContext *ctx, arg_fsgnjn_s *a)
          */
         mask = tcg_constant_i64(~MAKE_64BIT_MASK(31, 1));
         tcg_gen_nor_i64(rs2, rs2, mask);
-        tcg_gen_and_i64(rs1, mask, rs1);
-        tcg_gen_or_i64(cpu_fpr[a->rd], rs1, rs2);
+        tcg_gen_and_i64(dest, mask, rs1);
+        tcg_gen_or_i64(dest, dest, rs2);
 
         tcg_temp_free_i64(rs2);
     }
+    /* sign-extend instead of nanboxing the result if zfinx is enabled */
+    if (ctx->cfg_ptr->ext_zfinx) {
+        tcg_gen_ext32s_i64(dest, dest);
+    }
+    gen_set_fpr_hs(ctx, a->rd, dest);
     tcg_temp_free_i64(rs1);
-
     mark_fs_dirty(ctx);
     return true;
 }
@@ -216,28 +297,45 @@ static bool trans_fsgnjx_s(DisasContext *ctx, arg_fsgnjx_s *a)
     TCGv_i64 rs1, rs2;
 
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
     rs1 = tcg_temp_new_i64();
-    gen_check_nanbox_s(rs1, cpu_fpr[a->rs1]);
+
+    if (!ctx->cfg_ptr->ext_zfinx) {
+        gen_check_nanbox_s(rs1, src1);
+    } else {
+        tcg_gen_mov_i64(rs1, src1);
+    }
 
     if (a->rs1 == a->rs2) { /* FABS */
-        tcg_gen_andi_i64(cpu_fpr[a->rd], rs1, ~MAKE_64BIT_MASK(31, 1));
+        tcg_gen_andi_i64(dest, rs1, ~MAKE_64BIT_MASK(31, 1));
     } else {
+        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
         rs2 = tcg_temp_new_i64();
-        gen_check_nanbox_s(rs2, cpu_fpr[a->rs2]);
+
+        if (!ctx->cfg_ptr->ext_zfinx) {
+            gen_check_nanbox_s(rs2, src2);
+        } else {
+            tcg_gen_mov_i64(rs2, src2);
+        }
 
         /*
          * Xor bit 31 in rs1 with that in rs2.
          * This formulation retains the nanboxing of rs1.
          */
-        tcg_gen_andi_i64(rs2, rs2, MAKE_64BIT_MASK(31, 1));
-        tcg_gen_xor_i64(cpu_fpr[a->rd], rs1, rs2);
+        tcg_gen_andi_i64(dest, rs2, MAKE_64BIT_MASK(31, 1));
+        tcg_gen_xor_i64(dest, rs1, dest);
 
         tcg_temp_free_i64(rs2);
     }
+    /* sign-extend instead of nanboxing the result if zfinx is enabled */
+    if (ctx->cfg_ptr->ext_zfinx) {
+        tcg_gen_ext32s_i64(dest, dest);
+    }
     tcg_temp_free_i64(rs1);
-
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -245,10 +343,14 @@ static bool trans_fsgnjx_s(DisasContext *ctx, arg_fsgnjx_s *a)
 static bool trans_fmin_s(DisasContext *ctx, arg_fmin_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
-    gen_helper_fmin_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                      cpu_fpr[a->rs2]);
+    gen_helper_fmin_s(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -256,10 +358,14 @@ static bool trans_fmin_s(DisasContext *ctx, arg_fmin_s *a)
 static bool trans_fmax_s(DisasContext *ctx, arg_fmax_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
-    gen_helper_fmax_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                      cpu_fpr[a->rs2]);
+    gen_helper_fmax_s(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -267,12 +373,13 @@ static bool trans_fmax_s(DisasContext *ctx, arg_fmax_s *a)
 static bool trans_fcvt_w_s(DisasContext *ctx, arg_fcvt_w_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_w_s(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_w_s(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -280,12 +387,13 @@ static bool trans_fcvt_w_s(DisasContext *ctx, arg_fcvt_w_s *a)
 static bool trans_fcvt_wu_s(DisasContext *ctx, arg_fcvt_wu_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_wu_s(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_wu_s(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -294,14 +402,14 @@ static bool trans_fmv_x_w(DisasContext *ctx, arg_fmv_x_w *a)
 {
     /* NOTE: This was FMV.X.S in an earlier version of the ISA spec! */
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
-
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 #if defined(TARGET_RISCV64)
-    tcg_gen_ext32s_tl(dest, cpu_fpr[a->rs1]);
+    tcg_gen_ext32s_tl(dest, src1);
 #else
-    tcg_gen_extrl_i64_i32(dest, cpu_fpr[a->rs1]);
+    tcg_gen_extrl_i64_i32(dest, src1);
 #endif
 
     gen_set_gpr(ctx, a->rd, dest);
@@ -311,11 +419,13 @@ static bool trans_fmv_x_w(DisasContext *ctx, arg_fmv_x_w *a)
 static bool trans_feq_s(DisasContext *ctx, arg_feq_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
-    gen_helper_feq_s(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_feq_s(dest, cpu_env, src1, src2);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -323,11 +433,13 @@ static bool trans_feq_s(DisasContext *ctx, arg_feq_s *a)
 static bool trans_flt_s(DisasContext *ctx, arg_flt_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
-    gen_helper_flt_s(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_flt_s(dest, cpu_env, src1, src2);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -335,11 +447,13 @@ static bool trans_flt_s(DisasContext *ctx, arg_flt_s *a)
 static bool trans_fle_s(DisasContext *ctx, arg_fle_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
-    gen_helper_fle_s(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_fle_s(dest, cpu_env, src1, src2);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -347,11 +461,12 @@ static bool trans_fle_s(DisasContext *ctx, arg_fle_s *a)
 static bool trans_fclass_s(DisasContext *ctx, arg_fclass_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
-    gen_helper_fclass_s(dest, cpu_fpr[a->rs1]);
+    gen_helper_fclass_s(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -359,13 +474,14 @@ static bool trans_fclass_s(DisasContext *ctx, arg_fclass_s *a)
 static bool trans_fcvt_s_w(DisasContext *ctx, arg_fcvt_s_w *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv src = get_gpr(ctx, a->rs1, EXT_SIGN);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_s_w(cpu_fpr[a->rd], cpu_env, src);
-
+    gen_helper_fcvt_s_w(dest, cpu_env, src);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -373,13 +489,14 @@ static bool trans_fcvt_s_w(DisasContext *ctx, arg_fcvt_s_w *a)
 static bool trans_fcvt_s_wu(DisasContext *ctx, arg_fcvt_s_wu *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv src = get_gpr(ctx, a->rs1, EXT_ZERO);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_s_wu(cpu_fpr[a->rd], cpu_env, src);
-
+    gen_helper_fcvt_s_wu(dest, cpu_env, src);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -388,13 +505,14 @@ static bool trans_fmv_w_x(DisasContext *ctx, arg_fmv_w_x *a)
 {
     /* NOTE: This was FMV.S.X in an earlier version of the ISA spec! */
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv src = get_gpr(ctx, a->rs1, EXT_ZERO);
 
-    tcg_gen_extu_tl_i64(cpu_fpr[a->rd], src);
-    gen_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rd]);
-
+    tcg_gen_extu_tl_i64(dest, src);
+    gen_nanbox_s(dest, dest);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -403,12 +521,13 @@ static bool trans_fcvt_l_s(DisasContext *ctx, arg_fcvt_l_s *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_l_s(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_l_s(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -417,12 +536,13 @@ static bool trans_fcvt_lu_s(DisasContext *ctx, arg_fcvt_lu_s *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_lu_s(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_lu_s(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -431,13 +551,14 @@ static bool trans_fcvt_s_l(DisasContext *ctx, arg_fcvt_s_l *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv src = get_gpr(ctx, a->rs1, EXT_SIGN);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_s_l(cpu_fpr[a->rd], cpu_env, src);
-
+    gen_helper_fcvt_s_l(dest, cpu_env, src);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -446,13 +567,14 @@ static bool trans_fcvt_s_lu(DisasContext *ctx, arg_fcvt_s_lu *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv src = get_gpr(ctx, a->rs1, EXT_ZERO);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_s_lu(cpu_fpr[a->rd], cpu_env, src);
-
+    gen_helper_fcvt_s_lu(dest, cpu_env, src);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
diff --git a/target/riscv/internals.h b/target/riscv/internals.h
index 065e8162a2..6237bb3115 100644
--- a/target/riscv/internals.h
+++ b/target/riscv/internals.h
@@ -46,13 +46,23 @@ enum {
     RISCV_FRM_ROD = 8,  /* Round to Odd */
 };
 
-static inline uint64_t nanbox_s(float32 f)
+static inline uint64_t nanbox_s(CPURISCVState *env, float32 f)
 {
-    return f | MAKE_64BIT_MASK(32, 32);
+    /* the value is sign-extended instead of NaN-boxed for zfinx */
+    if (RISCV_CPU(env_cpu(env))->cfg.ext_zfinx) {
+        return (int32_t)f;
+    } else {
+        return f | MAKE_64BIT_MASK(32, 32);
+    }
 }
 
-static inline float32 check_nanbox_s(uint64_t f)
+static inline float32 check_nanbox_s(CPURISCVState *env, uint64_t f)
 {
+    /* Skip the NaN-boxing check when zfinx is enabled */
+    if (RISCV_CPU(env_cpu(env))->cfg.ext_zfinx) {
+        return (uint32_t)f;
+    }
+
     uint64_t mask = MAKE_64BIT_MASK(32, 32);
 
     if (likely((f & mask) == mask)) {
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index c7232de326..10cf37be41 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -101,6 +101,9 @@ typedef struct DisasContext {
     TCGv zero;
     /* Space for 3 operands plus 1 extra for address computation. */
     TCGv temp[4];
+    /* Space for 4 floating-point operands (1 dest and <=3 src). */
+    TCGv_i64 ftemp[4];
+    uint8_t nftemp;
     /* PointerMasking extension */
     bool pm_mask_enabled;
     bool pm_base_enabled;
@@ -380,6 +383,86 @@ static void gen_set_gpr128(DisasContext *ctx, int reg_num, TCGv rl, TCGv rh)
     }
 }
 
+static TCGv_i64 ftemp_new(DisasContext *ctx)
+{
+    assert(ctx->nftemp < ARRAY_SIZE(ctx->ftemp));
+    return ctx->ftemp[ctx->nftemp++] = tcg_temp_new_i64();
+}
+
+static TCGv_i64 get_fpr_hs(DisasContext *ctx, int reg_num)
+{
+    if (!ctx->cfg_ptr->ext_zfinx) {
+        return cpu_fpr[reg_num];
+    }
+
+    if (reg_num == 0) {
+        return tcg_constant_i64(0);
+    }
+    switch (get_xl(ctx)) {
+    case MXL_RV32:
+#ifdef TARGET_RISCV32
+    {
+        TCGv_i64 t = ftemp_new(ctx);
+        tcg_gen_ext_i32_i64(t, cpu_gpr[reg_num]);
+        return t;
+    }
+#else
+    /* fall through */
+    case MXL_RV64:
+        return cpu_gpr[reg_num];
+#endif
+    default:
+        g_assert_not_reached();
+    }
+}
+
+static TCGv_i64 dest_fpr(DisasContext *ctx, int reg_num)
+{
+    if (!ctx->cfg_ptr->ext_zfinx) {
+        return cpu_fpr[reg_num];
+    }
+
+    if (reg_num == 0) {
+        return ftemp_new(ctx);
+    }
+
+    switch (get_xl(ctx)) {
+    case MXL_RV32:
+        return ftemp_new(ctx);
+#ifdef TARGET_RISCV64
+    case MXL_RV64:
+        return cpu_gpr[reg_num];
+#endif
+    default:
+        g_assert_not_reached();
+    }
+}
+
+/* assume t is nanboxed (for normal F) or sign-extended (for zfinx) */
+static void gen_set_fpr_hs(DisasContext *ctx, int reg_num, TCGv_i64 t)
+{
+    if (!ctx->cfg_ptr->ext_zfinx) {
+        tcg_gen_mov_i64(cpu_fpr[reg_num], t);
+        return;
+    }
+    if (reg_num != 0) {
+        switch (get_xl(ctx)) {
+        case MXL_RV32:
+#ifdef TARGET_RISCV32
+            tcg_gen_extrl_i64_i32(cpu_gpr[reg_num], t);
+            break;
+#else
+        /* fall through */
+        case MXL_RV64:
+            tcg_gen_mov_i64(cpu_gpr[reg_num], t);
+            break;
+#endif
+        default:
+            g_assert_not_reached();
+        }
+    }
+}
+
 static void gen_jal(DisasContext *ctx, int rd, target_ulong imm)
 {
     target_ulong next_pc;
@@ -955,6 +1038,8 @@ static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
     ctx->cs = cs;
     ctx->ntemp = 0;
     memset(ctx->temp, 0, sizeof(ctx->temp));
+    ctx->nftemp = 0;
+    memset(ctx->ftemp, 0, sizeof(ctx->ftemp));
     ctx->pm_mask_enabled = FIELD_EX32(tb_flags, TB_FLAGS, PM_MASK_ENABLED);
     ctx->pm_base_enabled = FIELD_EX32(tb_flags, TB_FLAGS, PM_BASE_ENABLED);
     ctx->zero = tcg_constant_tl(0);
@@ -976,16 +1061,22 @@ static void riscv_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
     CPURISCVState *env = cpu->env_ptr;
     uint16_t opcode16 = translator_lduw(env, &ctx->base, ctx->base.pc_next);
+    int i;
 
     ctx->ol = ctx->xl;
     decode_opc(env, ctx, opcode16);
     ctx->base.pc_next = ctx->pc_succ_insn;
 
-    for (int i = ctx->ntemp - 1; i >= 0; --i) {
+    for (i = ctx->ntemp - 1; i >= 0; --i) {
         tcg_temp_free(ctx->temp[i]);
         ctx->temp[i] = NULL;
     }
     ctx->ntemp = 0;
+    for (i = ctx->nftemp - 1; i >= 0; --i) {
+        tcg_temp_free_i64(ctx->ftemp[i]);
+        ctx->ftemp[i] = NULL;
+    }
+    ctx->nftemp = 0;
 
     if (ctx->base.is_jmp == DISAS_NEXT) {
         target_ulong page_start;
-- 
2.17.1




* [PATCH v6 3/6] target/riscv: add support for zfinx
@ 2022-02-11  4:39   ` Weiwei Li
From: Weiwei Li @ 2022-02-11  4:39 UTC (permalink / raw)
  To: richard.henderson, palmer, alistair.francis, bin.meng,
	qemu-riscv, qemu-devel
  Cc: wangjunqiang, lazyparser, ardxwe, Weiwei Li

  - replace the REQUIRE_EXT(ctx, RVF) check with REQUIRE_ZFINX_OR_F
  - update single-precision floating-point register read/write
  - disable the nanbox_s check when zfinx is enabled

Co-authored-by: ardxwe <ardxwe@gmail.com>
Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
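The two 64-bit encodings of a single-precision value that this patch
switches between can be sketched in standalone C as below. This is only
an illustration, not code from the patch: box_f32() and unbox_f32() are
hypothetical names, and the plain bool parameter stands in for the
cfg.ext_zfinx lookup done by nanbox_s() and check_nanbox_s().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU API): widen a 32-bit float bit pattern
 * to the 64-bit register image. 'zfinx' stands in for cfg.ext_zfinx.
 */
static uint64_t box_f32(uint32_t f, bool zfinx)
{
    if (zfinx) {
        /* Zfinx: copy bit 31 into the upper 32 bits (sign extension). */
        return (uint64_t)(int64_t)(int32_t)f;
    }
    /* F: NaN-box by setting the upper 32 bits to all ones. */
    return (uint64_t)f | 0xffffffff00000000ULL;
}

/* Hypothetical helper: recover the 32-bit operand from a register image. */
static uint32_t unbox_f32(uint64_t v, bool zfinx)
{
    const uint64_t mask = 0xffffffff00000000ULL;

    if (zfinx) {
        /* Zfinx: no boxing check, just take the low 32 bits. */
        return (uint32_t)v;
    }
    if ((v & mask) == mask) {
        return (uint32_t)v;
    }
    /* F: an improperly boxed input reads as the default quiet NaN. */
    return 0x7fc00000u;
}

/* Small self-check using 1.0f (0x3f800000) and -1.0f (0xbf800000). */
static void self_check(void)
{
    assert(box_f32(0x3f800000u, false) == 0xffffffff3f800000ULL);
    assert(box_f32(0x3f800000u, true) == 0x000000003f800000ULL);
    assert(box_f32(0xbf800000u, true) == 0xffffffffbf800000ULL);
    assert(unbox_f32(0x000000003f800000ULL, true) == 0x3f800000u);
    assert(unbox_f32(0x000000003f800000ULL, false) == 0x7fc00000u);
}
```

The last two assertions show why the check has to be skipped for zfinx:
a sign-extended positive value has zero upper bits, so the F-style check
would turn every positive operand into the default NaN.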
 target/riscv/fpu_helper.c               |  89 +++----
 target/riscv/helper.h                   |   2 +-
 target/riscv/insn_trans/trans_rvf.c.inc | 314 ++++++++++++++++--------
 target/riscv/internals.h                |  16 +-
 target/riscv/translate.c                |  93 ++++++-
 5 files changed, 369 insertions(+), 145 deletions(-)

diff --git a/target/riscv/fpu_helper.c b/target/riscv/fpu_helper.c
index 4a5982d594..63ca703459 100644
--- a/target/riscv/fpu_helper.c
+++ b/target/riscv/fpu_helper.c
@@ -98,10 +98,11 @@ static uint64_t do_fmadd_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2,
 static uint64_t do_fmadd_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2,
                            uint64_t rs3, int flags)
 {
-    float32 frs1 = check_nanbox_s(rs1);
-    float32 frs2 = check_nanbox_s(rs2);
-    float32 frs3 = check_nanbox_s(rs3);
-    return nanbox_s(float32_muladd(frs1, frs2, frs3, flags, &env->fp_status));
+    float32 frs1 = check_nanbox_s(env, rs1);
+    float32 frs2 = check_nanbox_s(env, rs2);
+    float32 frs3 = check_nanbox_s(env, rs3);
+    return nanbox_s(env, float32_muladd(frs1, frs2, frs3, flags,
+                                        &env->fp_status));
 }
 
 uint64_t helper_fmadd_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
@@ -183,124 +184,124 @@ uint64_t helper_fnmadd_h(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
 
 uint64_t helper_fadd_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float32 frs1 = check_nanbox_s(rs1);
-    float32 frs2 = check_nanbox_s(rs2);
-    return nanbox_s(float32_add(frs1, frs2, &env->fp_status));
+    float32 frs1 = check_nanbox_s(env, rs1);
+    float32 frs2 = check_nanbox_s(env, rs2);
+    return nanbox_s(env, float32_add(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fsub_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float32 frs1 = check_nanbox_s(rs1);
-    float32 frs2 = check_nanbox_s(rs2);
-    return nanbox_s(float32_sub(frs1, frs2, &env->fp_status));
+    float32 frs1 = check_nanbox_s(env, rs1);
+    float32 frs2 = check_nanbox_s(env, rs2);
+    return nanbox_s(env, float32_sub(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fmul_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float32 frs1 = check_nanbox_s(rs1);
-    float32 frs2 = check_nanbox_s(rs2);
-    return nanbox_s(float32_mul(frs1, frs2, &env->fp_status));
+    float32 frs1 = check_nanbox_s(env, rs1);
+    float32 frs2 = check_nanbox_s(env, rs2);
+    return nanbox_s(env, float32_mul(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fdiv_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float32 frs1 = check_nanbox_s(rs1);
-    float32 frs2 = check_nanbox_s(rs2);
-    return nanbox_s(float32_div(frs1, frs2, &env->fp_status));
+    float32 frs1 = check_nanbox_s(env, rs1);
+    float32 frs2 = check_nanbox_s(env, rs2);
+    return nanbox_s(env, float32_div(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fmin_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float32 frs1 = check_nanbox_s(rs1);
-    float32 frs2 = check_nanbox_s(rs2);
-    return nanbox_s(env->priv_ver < PRIV_VERSION_1_11_0 ?
+    float32 frs1 = check_nanbox_s(env, rs1);
+    float32 frs2 = check_nanbox_s(env, rs2);
+    return nanbox_s(env, env->priv_ver < PRIV_VERSION_1_11_0 ?
                     float32_minnum(frs1, frs2, &env->fp_status) :
                     float32_minimum_number(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fmax_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float32 frs1 = check_nanbox_s(rs1);
-    float32 frs2 = check_nanbox_s(rs2);
-    return nanbox_s(env->priv_ver < PRIV_VERSION_1_11_0 ?
+    float32 frs1 = check_nanbox_s(env, rs1);
+    float32 frs2 = check_nanbox_s(env, rs2);
+    return nanbox_s(env, env->priv_ver < PRIV_VERSION_1_11_0 ?
                     float32_maxnum(frs1, frs2, &env->fp_status) :
                     float32_maximum_number(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fsqrt_s(CPURISCVState *env, uint64_t rs1)
 {
-    float32 frs1 = check_nanbox_s(rs1);
-    return nanbox_s(float32_sqrt(frs1, &env->fp_status));
+    float32 frs1 = check_nanbox_s(env, rs1);
+    return nanbox_s(env, float32_sqrt(frs1, &env->fp_status));
 }
 
 target_ulong helper_fle_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float32 frs1 = check_nanbox_s(rs1);
-    float32 frs2 = check_nanbox_s(rs2);
+    float32 frs1 = check_nanbox_s(env, rs1);
+    float32 frs2 = check_nanbox_s(env, rs2);
     return float32_le(frs1, frs2, &env->fp_status);
 }
 
 target_ulong helper_flt_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float32 frs1 = check_nanbox_s(rs1);
-    float32 frs2 = check_nanbox_s(rs2);
+    float32 frs1 = check_nanbox_s(env, rs1);
+    float32 frs2 = check_nanbox_s(env, rs2);
     return float32_lt(frs1, frs2, &env->fp_status);
 }
 
 target_ulong helper_feq_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float32 frs1 = check_nanbox_s(rs1);
-    float32 frs2 = check_nanbox_s(rs2);
+    float32 frs1 = check_nanbox_s(env, rs1);
+    float32 frs2 = check_nanbox_s(env, rs2);
     return float32_eq_quiet(frs1, frs2, &env->fp_status);
 }
 
 target_ulong helper_fcvt_w_s(CPURISCVState *env, uint64_t rs1)
 {
-    float32 frs1 = check_nanbox_s(rs1);
+    float32 frs1 = check_nanbox_s(env, rs1);
     return float32_to_int32(frs1, &env->fp_status);
 }
 
 target_ulong helper_fcvt_wu_s(CPURISCVState *env, uint64_t rs1)
 {
-    float32 frs1 = check_nanbox_s(rs1);
+    float32 frs1 = check_nanbox_s(env, rs1);
     return (int32_t)float32_to_uint32(frs1, &env->fp_status);
 }
 
 target_ulong helper_fcvt_l_s(CPURISCVState *env, uint64_t rs1)
 {
-    float32 frs1 = check_nanbox_s(rs1);
+    float32 frs1 = check_nanbox_s(env, rs1);
     return float32_to_int64(frs1, &env->fp_status);
 }
 
 target_ulong helper_fcvt_lu_s(CPURISCVState *env, uint64_t rs1)
 {
-    float32 frs1 = check_nanbox_s(rs1);
+    float32 frs1 = check_nanbox_s(env, rs1);
     return float32_to_uint64(frs1, &env->fp_status);
 }
 
 uint64_t helper_fcvt_s_w(CPURISCVState *env, target_ulong rs1)
 {
-    return nanbox_s(int32_to_float32((int32_t)rs1, &env->fp_status));
+    return nanbox_s(env, int32_to_float32((int32_t)rs1, &env->fp_status));
 }
 
 uint64_t helper_fcvt_s_wu(CPURISCVState *env, target_ulong rs1)
 {
-    return nanbox_s(uint32_to_float32((uint32_t)rs1, &env->fp_status));
+    return nanbox_s(env, uint32_to_float32((uint32_t)rs1, &env->fp_status));
 }
 
 uint64_t helper_fcvt_s_l(CPURISCVState *env, target_ulong rs1)
 {
-    return nanbox_s(int64_to_float32(rs1, &env->fp_status));
+    return nanbox_s(env, int64_to_float32(rs1, &env->fp_status));
 }
 
 uint64_t helper_fcvt_s_lu(CPURISCVState *env, target_ulong rs1)
 {
-    return nanbox_s(uint64_to_float32(rs1, &env->fp_status));
+    return nanbox_s(env, uint64_to_float32(rs1, &env->fp_status));
 }
 
-target_ulong helper_fclass_s(uint64_t rs1)
+target_ulong helper_fclass_s(CPURISCVState *env, uint64_t rs1)
 {
-    float32 frs1 = check_nanbox_s(rs1);
+    float32 frs1 = check_nanbox_s(env, rs1);
     return fclass_s(frs1);
 }
 
@@ -340,12 +341,12 @@ uint64_t helper_fmax_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
 
 uint64_t helper_fcvt_s_d(CPURISCVState *env, uint64_t rs1)
 {
-    return nanbox_s(float64_to_float32(rs1, &env->fp_status));
+    return nanbox_s(env, float64_to_float32(rs1, &env->fp_status));
 }
 
 uint64_t helper_fcvt_d_s(CPURISCVState *env, uint64_t rs1)
 {
-    float32 frs1 = check_nanbox_s(rs1);
+    float32 frs1 = check_nanbox_s(env, rs1);
     return float32_to_float64(frs1, &env->fp_status);
 }
 
@@ -539,14 +540,14 @@ uint64_t helper_fcvt_h_lu(CPURISCVState *env, target_ulong rs1)
 
 uint64_t helper_fcvt_h_s(CPURISCVState *env, uint64_t rs1)
 {
-    float32 frs1 = check_nanbox_s(rs1);
+    float32 frs1 = check_nanbox_s(env, rs1);
     return nanbox_h(float32_to_float16(frs1, true, &env->fp_status));
 }
 
 uint64_t helper_fcvt_s_h(CPURISCVState *env, uint64_t rs1)
 {
     float16 frs1 = check_nanbox_h(rs1);
-    return nanbox_s(float16_to_float32(frs1, true, &env->fp_status));
+    return nanbox_s(env, float16_to_float32(frs1, true, &env->fp_status));
 }
 
 uint64_t helper_fcvt_h_d(CPURISCVState *env, uint64_t rs1)
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 72cc2582f4..89195aad9d 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -38,7 +38,7 @@ DEF_HELPER_FLAGS_2(fcvt_s_w, TCG_CALL_NO_RWG, i64, env, tl)
 DEF_HELPER_FLAGS_2(fcvt_s_wu, TCG_CALL_NO_RWG, i64, env, tl)
 DEF_HELPER_FLAGS_2(fcvt_s_l, TCG_CALL_NO_RWG, i64, env, tl)
 DEF_HELPER_FLAGS_2(fcvt_s_lu, TCG_CALL_NO_RWG, i64, env, tl)
-DEF_HELPER_FLAGS_1(fclass_s, TCG_CALL_NO_RWG_SE, tl, i64)
+DEF_HELPER_FLAGS_2(fclass_s, TCG_CALL_NO_RWG_SE, tl, env, i64)
 
 /* Floating Point - Double Precision */
 DEF_HELPER_FLAGS_3(fadd_d, TCG_CALL_NO_RWG, i64, env, i64, i64)
diff --git a/target/riscv/insn_trans/trans_rvf.c.inc b/target/riscv/insn_trans/trans_rvf.c.inc
index 0aac87f7db..a1d3eb52ad 100644
--- a/target/riscv/insn_trans/trans_rvf.c.inc
+++ b/target/riscv/insn_trans/trans_rvf.c.inc
@@ -20,7 +20,14 @@
 
 #define REQUIRE_FPU do {\
     if (ctx->mstatus_fs == 0) \
-        return false;                       \
+        if (!ctx->cfg_ptr->ext_zfinx) \
+            return false; \
+} while (0)
+
+#define REQUIRE_ZFINX_OR_F(ctx) do {\
+    if (!ctx->cfg_ptr->ext_zfinx) { \
+        REQUIRE_EXT(ctx, RVF); \
+    } \
 } while (0)
 
 static bool trans_flw(DisasContext *ctx, arg_flw *a)
@@ -55,10 +62,16 @@ static bool trans_fsw(DisasContext *ctx, arg_fsw *a)
 static bool trans_fmadd_s(DisasContext *ctx, arg_fmadd_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
+
     gen_set_rm(ctx, a->rm);
-    gen_helper_fmadd_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fmadd_s(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -66,10 +79,16 @@ static bool trans_fmadd_s(DisasContext *ctx, arg_fmadd_s *a)
 static bool trans_fmsub_s(DisasContext *ctx, arg_fmsub_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
+
     gen_set_rm(ctx, a->rm);
-    gen_helper_fmsub_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fmsub_s(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -77,10 +96,16 @@ static bool trans_fmsub_s(DisasContext *ctx, arg_fmsub_s *a)
 static bool trans_fnmsub_s(DisasContext *ctx, arg_fnmsub_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
+
     gen_set_rm(ctx, a->rm);
-    gen_helper_fnmsub_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fnmsub_s(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -88,10 +113,16 @@ static bool trans_fnmsub_s(DisasContext *ctx, arg_fnmsub_s *a)
 static bool trans_fnmadd_s(DisasContext *ctx, arg_fnmadd_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
+
     gen_set_rm(ctx, a->rm);
-    gen_helper_fnmadd_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fnmadd_s(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -99,11 +130,15 @@ static bool trans_fnmadd_s(DisasContext *ctx, arg_fnmadd_s *a)
 static bool trans_fadd_s(DisasContext *ctx, arg_fadd_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fadd_s(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_fadd_s(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -111,11 +146,15 @@ static bool trans_fadd_s(DisasContext *ctx, arg_fadd_s *a)
 static bool trans_fsub_s(DisasContext *ctx, arg_fsub_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fsub_s(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_fsub_s(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -123,11 +162,15 @@ static bool trans_fsub_s(DisasContext *ctx, arg_fsub_s *a)
 static bool trans_fmul_s(DisasContext *ctx, arg_fmul_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fmul_s(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_fmul_s(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -135,11 +178,15 @@ static bool trans_fmul_s(DisasContext *ctx, arg_fmul_s *a)
 static bool trans_fdiv_s(DisasContext *ctx, arg_fdiv_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fdiv_s(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_fdiv_s(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -147,10 +194,14 @@ static bool trans_fdiv_s(DisasContext *ctx, arg_fdiv_s *a)
 static bool trans_fsqrt_s(DisasContext *ctx, arg_fsqrt_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fsqrt_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fsqrt_s(dest, cpu_env, src1);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -158,22 +209,37 @@ static bool trans_fsqrt_s(DisasContext *ctx, arg_fsqrt_s *a)
 static bool trans_fsgnj_s(DisasContext *ctx, arg_fsgnj_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     if (a->rs1 == a->rs2) { /* FMOV */
-        gen_check_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rs1]);
+        if (!ctx->cfg_ptr->ext_zfinx) {
+            gen_check_nanbox_s(dest, src1);
+        } else {
+            tcg_gen_ext32s_i64(dest, src1);
+        }
     } else { /* FSGNJ */
-        TCGv_i64 rs1 = tcg_temp_new_i64();
-        TCGv_i64 rs2 = tcg_temp_new_i64();
-
-        gen_check_nanbox_s(rs1, cpu_fpr[a->rs1]);
-        gen_check_nanbox_s(rs2, cpu_fpr[a->rs2]);
-
-        /* This formulation retains the nanboxing of rs2. */
-        tcg_gen_deposit_i64(cpu_fpr[a->rd], rs2, rs1, 0, 31);
-        tcg_temp_free_i64(rs1);
-        tcg_temp_free_i64(rs2);
+        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
+
+        if (!ctx->cfg_ptr->ext_zfinx) {
+            TCGv_i64 rs1 = tcg_temp_new_i64();
+            TCGv_i64 rs2 = tcg_temp_new_i64();
+            gen_check_nanbox_s(rs1, src1);
+            gen_check_nanbox_s(rs2, src2);
+
+            /* This formulation retains the nanboxing of rs2 in normal 'F'. */
+            tcg_gen_deposit_i64(dest, rs2, rs1, 0, 31);
+
+            tcg_temp_free_i64(rs1);
+            tcg_temp_free_i64(rs2);
+        } else {
+            tcg_gen_deposit_i64(dest, src2, src1, 0, 31);
+            tcg_gen_ext32s_i64(dest, dest);
+        }
     }
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -183,16 +249,27 @@ static bool trans_fsgnjn_s(DisasContext *ctx, arg_fsgnjn_s *a)
     TCGv_i64 rs1, rs2, mask;
 
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
-    rs1 = tcg_temp_new_i64();
-    gen_check_nanbox_s(rs1, cpu_fpr[a->rs1]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
+    rs1 = tcg_temp_new_i64();
+    if (!ctx->cfg_ptr->ext_zfinx) {
+        gen_check_nanbox_s(rs1, src1);
+    } else {
+        tcg_gen_mov_i64(rs1, src1);
+    }
     if (a->rs1 == a->rs2) { /* FNEG */
-        tcg_gen_xori_i64(cpu_fpr[a->rd], rs1, MAKE_64BIT_MASK(31, 1));
+        tcg_gen_xori_i64(dest, rs1, MAKE_64BIT_MASK(31, 1));
     } else {
+        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
         rs2 = tcg_temp_new_i64();
-        gen_check_nanbox_s(rs2, cpu_fpr[a->rs2]);
+        if (!ctx->cfg_ptr->ext_zfinx) {
+            gen_check_nanbox_s(rs2, src2);
+        } else {
+            tcg_gen_mov_i64(rs2, src2);
+        }
 
         /*
          * Replace bit 31 in rs1 with inverse in rs2.
@@ -200,13 +277,17 @@ static bool trans_fsgnjn_s(DisasContext *ctx, arg_fsgnjn_s *a)
          */
         mask = tcg_constant_i64(~MAKE_64BIT_MASK(31, 1));
         tcg_gen_nor_i64(rs2, rs2, mask);
-        tcg_gen_and_i64(rs1, mask, rs1);
-        tcg_gen_or_i64(cpu_fpr[a->rd], rs1, rs2);
+        tcg_gen_and_i64(dest, mask, rs1);
+        tcg_gen_or_i64(dest, dest, rs2);
 
         tcg_temp_free_i64(rs2);
     }
+    /* sign-extend the result instead of NaN-boxing when zfinx is enabled */
+    if (ctx->cfg_ptr->ext_zfinx) {
+        tcg_gen_ext32s_i64(dest, dest);
+    }
+    gen_set_fpr_hs(ctx, a->rd, dest);
     tcg_temp_free_i64(rs1);
-
     mark_fs_dirty(ctx);
     return true;
 }
@@ -216,28 +297,45 @@ static bool trans_fsgnjx_s(DisasContext *ctx, arg_fsgnjx_s *a)
     TCGv_i64 rs1, rs2;
 
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
     rs1 = tcg_temp_new_i64();
-    gen_check_nanbox_s(rs1, cpu_fpr[a->rs1]);
+
+    if (!ctx->cfg_ptr->ext_zfinx) {
+        gen_check_nanbox_s(rs1, src1);
+    } else {
+        tcg_gen_mov_i64(rs1, src1);
+    }
 
     if (a->rs1 == a->rs2) { /* FABS */
-        tcg_gen_andi_i64(cpu_fpr[a->rd], rs1, ~MAKE_64BIT_MASK(31, 1));
+        tcg_gen_andi_i64(dest, rs1, ~MAKE_64BIT_MASK(31, 1));
     } else {
+        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
         rs2 = tcg_temp_new_i64();
-        gen_check_nanbox_s(rs2, cpu_fpr[a->rs2]);
+
+        if (!ctx->cfg_ptr->ext_zfinx) {
+            gen_check_nanbox_s(rs2, src2);
+        } else {
+            tcg_gen_mov_i64(rs2, src2);
+        }
 
         /*
          * Xor bit 31 in rs1 with that in rs2.
          * This formulation retains the nanboxing of rs1.
          */
-        tcg_gen_andi_i64(rs2, rs2, MAKE_64BIT_MASK(31, 1));
-        tcg_gen_xor_i64(cpu_fpr[a->rd], rs1, rs2);
+        tcg_gen_andi_i64(dest, rs2, MAKE_64BIT_MASK(31, 1));
+        tcg_gen_xor_i64(dest, rs1, dest);
 
         tcg_temp_free_i64(rs2);
     }
+    /* sign-extend the result instead of NaN-boxing when zfinx is enabled */
+    if (ctx->cfg_ptr->ext_zfinx) {
+        tcg_gen_ext32s_i64(dest, dest);
+    }
     tcg_temp_free_i64(rs1);
-
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -245,10 +343,14 @@ static bool trans_fsgnjx_s(DisasContext *ctx, arg_fsgnjx_s *a)
 static bool trans_fmin_s(DisasContext *ctx, arg_fmin_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
-    gen_helper_fmin_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                      cpu_fpr[a->rs2]);
+    gen_helper_fmin_s(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -256,10 +358,14 @@ static bool trans_fmin_s(DisasContext *ctx, arg_fmin_s *a)
 static bool trans_fmax_s(DisasContext *ctx, arg_fmax_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
-    gen_helper_fmax_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                      cpu_fpr[a->rs2]);
+    gen_helper_fmax_s(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -267,12 +373,13 @@ static bool trans_fmax_s(DisasContext *ctx, arg_fmax_s *a)
 static bool trans_fcvt_w_s(DisasContext *ctx, arg_fcvt_w_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_w_s(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_w_s(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -280,12 +387,13 @@ static bool trans_fcvt_w_s(DisasContext *ctx, arg_fcvt_w_s *a)
 static bool trans_fcvt_wu_s(DisasContext *ctx, arg_fcvt_wu_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_wu_s(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_wu_s(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -294,14 +402,14 @@ static bool trans_fmv_x_w(DisasContext *ctx, arg_fmv_x_w *a)
 {
     /* NOTE: This was FMV.X.S in an earlier version of the ISA spec! */
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
-
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 #if defined(TARGET_RISCV64)
-    tcg_gen_ext32s_tl(dest, cpu_fpr[a->rs1]);
+    tcg_gen_ext32s_tl(dest, src1);
 #else
-    tcg_gen_extrl_i64_i32(dest, cpu_fpr[a->rs1]);
+    tcg_gen_extrl_i64_i32(dest, src1);
 #endif
 
     gen_set_gpr(ctx, a->rd, dest);
@@ -311,11 +419,13 @@ static bool trans_fmv_x_w(DisasContext *ctx, arg_fmv_x_w *a)
 static bool trans_feq_s(DisasContext *ctx, arg_feq_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
-    gen_helper_feq_s(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_feq_s(dest, cpu_env, src1, src2);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -323,11 +433,13 @@ static bool trans_feq_s(DisasContext *ctx, arg_feq_s *a)
 static bool trans_flt_s(DisasContext *ctx, arg_flt_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
-    gen_helper_flt_s(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_flt_s(dest, cpu_env, src1, src2);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -335,11 +447,13 @@ static bool trans_flt_s(DisasContext *ctx, arg_flt_s *a)
 static bool trans_fle_s(DisasContext *ctx, arg_fle_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
-    gen_helper_fle_s(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_fle_s(dest, cpu_env, src1, src2);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -347,11 +461,12 @@ static bool trans_fle_s(DisasContext *ctx, arg_fle_s *a)
 static bool trans_fclass_s(DisasContext *ctx, arg_fclass_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
-    gen_helper_fclass_s(dest, cpu_fpr[a->rs1]);
+    gen_helper_fclass_s(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -359,13 +474,14 @@ static bool trans_fclass_s(DisasContext *ctx, arg_fclass_s *a)
 static bool trans_fcvt_s_w(DisasContext *ctx, arg_fcvt_s_w *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv src = get_gpr(ctx, a->rs1, EXT_SIGN);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_s_w(cpu_fpr[a->rd], cpu_env, src);
-
+    gen_helper_fcvt_s_w(dest, cpu_env, src);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -373,13 +489,14 @@ static bool trans_fcvt_s_w(DisasContext *ctx, arg_fcvt_s_w *a)
 static bool trans_fcvt_s_wu(DisasContext *ctx, arg_fcvt_s_wu *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv src = get_gpr(ctx, a->rs1, EXT_ZERO);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_s_wu(cpu_fpr[a->rd], cpu_env, src);
-
+    gen_helper_fcvt_s_wu(dest, cpu_env, src);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -388,13 +505,14 @@ static bool trans_fmv_w_x(DisasContext *ctx, arg_fmv_w_x *a)
 {
     /* NOTE: This was FMV.S.X in an earlier version of the ISA spec! */
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv src = get_gpr(ctx, a->rs1, EXT_ZERO);
 
-    tcg_gen_extu_tl_i64(cpu_fpr[a->rd], src);
-    gen_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rd]);
-
+    tcg_gen_extu_tl_i64(dest, src);
+    gen_nanbox_s(dest, dest);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -403,12 +521,13 @@ static bool trans_fcvt_l_s(DisasContext *ctx, arg_fcvt_l_s *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_l_s(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_l_s(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -417,12 +536,13 @@ static bool trans_fcvt_lu_s(DisasContext *ctx, arg_fcvt_lu_s *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_lu_s(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_lu_s(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -431,13 +551,14 @@ static bool trans_fcvt_s_l(DisasContext *ctx, arg_fcvt_s_l *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv src = get_gpr(ctx, a->rs1, EXT_SIGN);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_s_l(cpu_fpr[a->rd], cpu_env, src);
-
+    gen_helper_fcvt_s_l(dest, cpu_env, src);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -446,13 +567,14 @@ static bool trans_fcvt_s_lu(DisasContext *ctx, arg_fcvt_s_lu *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVF);
+    REQUIRE_ZFINX_OR_F(ctx);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv src = get_gpr(ctx, a->rs1, EXT_ZERO);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_s_lu(cpu_fpr[a->rd], cpu_env, src);
-
+    gen_helper_fcvt_s_lu(dest, cpu_env, src);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
diff --git a/target/riscv/internals.h b/target/riscv/internals.h
index 065e8162a2..6237bb3115 100644
--- a/target/riscv/internals.h
+++ b/target/riscv/internals.h
@@ -46,13 +46,23 @@ enum {
     RISCV_FRM_ROD = 8,  /* Round to Odd */
 };
 
-static inline uint64_t nanbox_s(float32 f)
+static inline uint64_t nanbox_s(CPURISCVState *env, float32 f)
 {
-    return f | MAKE_64BIT_MASK(32, 32);
+    /* the value is sign-extended instead of NaN-boxed for zfinx */
+    if (RISCV_CPU(env_cpu(env))->cfg.ext_zfinx) {
+        return (int32_t)f;
+    } else {
+        return f | MAKE_64BIT_MASK(32, 32);
+    }
 }
 
-static inline float32 check_nanbox_s(uint64_t f)
+static inline float32 check_nanbox_s(CPURISCVState *env, uint64_t f)
 {
+    /* Skip the NaN-boxing check when zfinx is enabled */
+    if (RISCV_CPU(env_cpu(env))->cfg.ext_zfinx) {
+        return (uint32_t)f;
+    }
+
     uint64_t mask = MAKE_64BIT_MASK(32, 32);
 
     if (likely((f & mask) == mask)) {
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index c7232de326..10cf37be41 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -101,6 +101,9 @@ typedef struct DisasContext {
     TCGv zero;
     /* Space for 3 operands plus 1 extra for address computation. */
     TCGv temp[4];
+    /* Space for 4 operands (1 dest, <=3 src) for floating-point computation */
+    TCGv_i64 ftemp[4];
+    uint8_t nftemp;
     /* PointerMasking extension */
     bool pm_mask_enabled;
     bool pm_base_enabled;
@@ -380,6 +383,86 @@ static void gen_set_gpr128(DisasContext *ctx, int reg_num, TCGv rl, TCGv rh)
     }
 }
 
+static TCGv_i64 ftemp_new(DisasContext *ctx)
+{
+    assert(ctx->nftemp < ARRAY_SIZE(ctx->ftemp));
+    return ctx->ftemp[ctx->nftemp++] = tcg_temp_new_i64();
+}
+
+static TCGv_i64 get_fpr_hs(DisasContext *ctx, int reg_num)
+{
+    if (!ctx->cfg_ptr->ext_zfinx) {
+        return cpu_fpr[reg_num];
+    }
+
+    if (reg_num == 0) {
+        return tcg_constant_i64(0);
+    }
+    switch (get_xl(ctx)) {
+    case MXL_RV32:
+#ifdef TARGET_RISCV32
+    {
+        TCGv_i64 t = ftemp_new(ctx);
+        tcg_gen_ext_i32_i64(t, cpu_gpr[reg_num]);
+        return t;
+    }
+#else
+    /* fall through */
+    case MXL_RV64:
+        return cpu_gpr[reg_num];
+#endif
+    default:
+        g_assert_not_reached();
+    }
+}
+
+static TCGv_i64 dest_fpr(DisasContext *ctx, int reg_num)
+{
+    if (!ctx->cfg_ptr->ext_zfinx) {
+        return cpu_fpr[reg_num];
+    }
+
+    if (reg_num == 0) {
+        return ftemp_new(ctx);
+    }
+
+    switch (get_xl(ctx)) {
+    case MXL_RV32:
+        return ftemp_new(ctx);
+#ifdef TARGET_RISCV64
+    case MXL_RV64:
+        return cpu_gpr[reg_num];
+#endif
+    default:
+        g_assert_not_reached();
+    }
+}
+
+/* assume t is NaN-boxed (for normal F) or sign-extended (for zfinx) */
+static void gen_set_fpr_hs(DisasContext *ctx, int reg_num, TCGv_i64 t)
+{
+    if (!ctx->cfg_ptr->ext_zfinx) {
+        tcg_gen_mov_i64(cpu_fpr[reg_num], t);
+        return;
+    }
+    if (reg_num != 0) {
+        switch (get_xl(ctx)) {
+        case MXL_RV32:
+#ifdef TARGET_RISCV32
+            tcg_gen_extrl_i64_i32(cpu_gpr[reg_num], t);
+            break;
+#else
+        /* fall through */
+        case MXL_RV64:
+            tcg_gen_mov_i64(cpu_gpr[reg_num], t);
+            break;
+#endif
+        default:
+            g_assert_not_reached();
+        }
+    }
+}
+
 static void gen_jal(DisasContext *ctx, int rd, target_ulong imm)
 {
     target_ulong next_pc;
@@ -955,6 +1038,8 @@ static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
     ctx->cs = cs;
     ctx->ntemp = 0;
     memset(ctx->temp, 0, sizeof(ctx->temp));
+    ctx->nftemp = 0;
+    memset(ctx->ftemp, 0, sizeof(ctx->ftemp));
     ctx->pm_mask_enabled = FIELD_EX32(tb_flags, TB_FLAGS, PM_MASK_ENABLED);
     ctx->pm_base_enabled = FIELD_EX32(tb_flags, TB_FLAGS, PM_BASE_ENABLED);
     ctx->zero = tcg_constant_tl(0);
@@ -976,16 +1061,22 @@ static void riscv_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
     CPURISCVState *env = cpu->env_ptr;
     uint16_t opcode16 = translator_lduw(env, &ctx->base, ctx->base.pc_next);
+    int i;
 
     ctx->ol = ctx->xl;
     decode_opc(env, ctx, opcode16);
     ctx->base.pc_next = ctx->pc_succ_insn;
 
-    for (int i = ctx->ntemp - 1; i >= 0; --i) {
+    for (i = ctx->ntemp - 1; i >= 0; --i) {
         tcg_temp_free(ctx->temp[i]);
         ctx->temp[i] = NULL;
     }
     ctx->ntemp = 0;
+    for (i = ctx->nftemp - 1; i >= 0; --i) {
+        tcg_temp_free_i64(ctx->ftemp[i]);
+        ctx->ftemp[i] = NULL;
+    }
+    ctx->nftemp = 0;
 
     if (ctx->base.is_jmp == DISAS_NEXT) {
         target_ulong page_start;
-- 
2.17.1
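
Editorial note: the internals.h hunk above changes how a single-precision
value is packed into, and read back from, a 64-bit register. The sketch below
restates that rule in standalone C for illustration only. The names box_f32,
unbox_f32 and zfinx_box_demo are hypothetical, the bool parameter stands in
for the cfg.ext_zfinx lookup the patch performs through env, and the
canonical-NaN fallback follows the RISC-V F rule for improperly boxed inputs
rather than anything shown in this hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: pack a 32-bit float pattern into a 64-bit register. */
uint64_t box_f32(bool zfinx, uint32_t f)
{
    const uint64_t upper = 0xffffffff00000000ULL;

    if (zfinx) {
        /* Zfinx: sign-extend bit 31 into the upper half of the register. */
        return (f & 0x80000000u) ? ((uint64_t)f | upper) : (uint64_t)f;
    }
    /* Plain F: NaN-box by setting the upper 32 bits to all ones. */
    return (uint64_t)f | upper;
}

/* Hypothetical helper: read a 32-bit float pattern back out of a register. */
uint32_t unbox_f32(bool zfinx, uint64_t r)
{
    const uint64_t upper = 0xffffffff00000000ULL;

    if (zfinx) {
        /* Zfinx: the upper bits are ignored, no boxing check is made. */
        return (uint32_t)r;
    }
    if ((r & upper) == upper) {
        return (uint32_t)r;
    }
    /* Plain F: an improperly boxed input reads as the canonical quiet NaN. */
    return 0x7fc00000u;
}

/* Small self-check; returns 0 when every assertion holds. */
int zfinx_box_demo(void)
{
    assert(box_f32(false, 0x3f800000u) == 0xffffffff3f800000ULL);
    assert(box_f32(true, 0x3f800000u) == 0x000000003f800000ULL);
    assert(box_f32(true, 0xbf800000u) == 0xffffffffbf800000ULL);
    assert(unbox_f32(false, 0x000000003f800000ULL) == 0x7fc00000u);
    assert(unbox_f32(true, 0x000000003f800000ULL) == 0x3f800000u);
    return 0;
}
```

Calling zfinx_box_demo() from a test main exercises both modes. The point of
the sketch is that the same 64-bit value (1.0f with zeroed upper bits) is a
valid operand under Zfinx but reads as NaN under plain F.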




* [PATCH v6 4/6] target/riscv: add support for zdinx
  2022-02-11  4:39 ` Weiwei Li
@ 2022-02-11  4:39   ` Weiwei Li
From: Weiwei Li @ 2022-02-11  4:39 UTC
  To: richard.henderson, palmer, alistair.francis, bin.meng,
	qemu-riscv, qemu-devel
  Cc: wangjunqiang, Weiwei Li, lazyparser, ardxwe

  -- update extension checks to use REQUIRE_ZDINX_OR_D
  -- update double-precision floating-point register read/write
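
Editorial note: the two points above can be modelled in standalone C. This is
a rough sketch, not the code from this patch. It assumes the Zdinx RV32
layout in which the even-numbered register holds the low 32 bits and the next
register holds the high 32 bits, and that x0 as a source pair reads as zero
while x0 as a destination discards the result. The names regs_all_even,
read_pair, write_pair and zdinx_pair_demo are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Same idea as REQUIRE_EVEN: OR all register numbers together and test
 * bit 0. The OR has bit 0 set if any single register number is odd.
 */
bool regs_all_even(unsigned regs)
{
    return (regs & 0x1) == 0;
}

/* Read a 64-bit double from an even/odd pair of 32-bit registers. */
uint64_t read_pair(const uint32_t gpr[32], unsigned n)
{
    if (n == 0) {
        return 0; /* assumed: x0 as a source pair reads as all zeros */
    }
    return ((uint64_t)gpr[n + 1] << 32) | gpr[n];
}

/* Write a 64-bit double to an even/odd pair of 32-bit registers. */
void write_pair(uint32_t gpr[32], unsigned n, uint64_t v)
{
    if (n == 0) {
        return; /* assumed: a result targeting x0 is discarded */
    }
    gpr[n] = (uint32_t)v;             /* low half in the even register */
    gpr[n + 1] = (uint32_t)(v >> 32); /* high half in the odd register */
}

/* Small self-check; returns 0 when every assertion holds. */
int zdinx_pair_demo(void)
{
    uint32_t gpr[32] = { 0 };

    assert(regs_all_even(10 | 12 | 14));
    assert(!regs_all_even(10 | 11));

    write_pair(gpr, 10, 0x400921fb54442d18ULL);
    assert(gpr[10] == 0x54442d18u);
    assert(gpr[11] == 0x400921fbu);
    assert(read_pair(gpr, 10) == 0x400921fb54442d18ULL);
    return 0;
}
```

A single OR-and-test covers every operand at once, which is why the patch
passes a->rd | a->rs1 | a->rs2 (and a->rs3 for the fused ops) to one
REQUIRE_EVEN call instead of checking each register separately.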

Co-authored-by: ardxwe <ardxwe@gmail.com>
Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/riscv/insn_trans/trans_rvd.c.inc | 285 +++++++++++++++++-------
 target/riscv/translate.c                |  52 +++++
 2 files changed, 259 insertions(+), 78 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvd.c.inc b/target/riscv/insn_trans/trans_rvd.c.inc
index 091ed3a8ad..1397c1ce1c 100644
--- a/target/riscv/insn_trans/trans_rvd.c.inc
+++ b/target/riscv/insn_trans/trans_rvd.c.inc
@@ -18,6 +18,19 @@
  * this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#define REQUIRE_ZDINX_OR_D(ctx) do { \
+    if (!ctx->cfg_ptr->ext_zdinx) { \
+        REQUIRE_EXT(ctx, RVD); \
+    } \
+} while (0)
+
+#define REQUIRE_EVEN(ctx, reg) do { \
+    if (ctx->cfg_ptr->ext_zdinx && (get_xl(ctx) == MXL_RV32) && \
+        ((reg) & 0x1)) { \
+        return false; \
+    } \
+} while (0)
+
 static bool trans_fld(DisasContext *ctx, arg_fld *a)
 {
     TCGv addr;
@@ -47,10 +60,17 @@ static bool trans_fsd(DisasContext *ctx, arg_fsd *a)
 static bool trans_fmadd_d(DisasContext *ctx, arg_fmadd_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2 | a->rs3);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_d(ctx, a->rs3);
+
     gen_set_rm(ctx, a->rm);
-    gen_helper_fmadd_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fmadd_d(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -58,10 +78,17 @@ static bool trans_fmadd_d(DisasContext *ctx, arg_fmadd_d *a)
 static bool trans_fmsub_d(DisasContext *ctx, arg_fmsub_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2 | a->rs3);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_d(ctx, a->rs3);
+
     gen_set_rm(ctx, a->rm);
-    gen_helper_fmsub_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fmsub_d(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -69,10 +96,17 @@ static bool trans_fmsub_d(DisasContext *ctx, arg_fmsub_d *a)
 static bool trans_fnmsub_d(DisasContext *ctx, arg_fnmsub_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2 | a->rs3);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_d(ctx, a->rs3);
+
     gen_set_rm(ctx, a->rm);
-    gen_helper_fnmsub_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fnmsub_d(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -80,10 +114,17 @@ static bool trans_fnmsub_d(DisasContext *ctx, arg_fnmsub_d *a)
 static bool trans_fnmadd_d(DisasContext *ctx, arg_fnmadd_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2 | a->rs3);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_d(ctx, a->rs3);
+
     gen_set_rm(ctx, a->rm);
-    gen_helper_fnmadd_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fnmadd_d(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -91,12 +132,16 @@ static bool trans_fnmadd_d(DisasContext *ctx, arg_fnmadd_d *a)
 static bool trans_fadd_d(DisasContext *ctx, arg_fadd_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
 
-    gen_set_rm(ctx, a->rm);
-    gen_helper_fadd_d(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
 
+    gen_set_rm(ctx, a->rm);
+    gen_helper_fadd_d(dest, cpu_env, src1, src2);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -104,12 +149,16 @@ static bool trans_fadd_d(DisasContext *ctx, arg_fadd_d *a)
 static bool trans_fsub_d(DisasContext *ctx, arg_fsub_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
 
-    gen_set_rm(ctx, a->rm);
-    gen_helper_fsub_d(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
 
+    gen_set_rm(ctx, a->rm);
+    gen_helper_fsub_d(dest, cpu_env, src1, src2);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -117,12 +166,16 @@ static bool trans_fsub_d(DisasContext *ctx, arg_fsub_d *a)
 static bool trans_fmul_d(DisasContext *ctx, arg_fmul_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
 
-    gen_set_rm(ctx, a->rm);
-    gen_helper_fmul_d(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
 
+    gen_set_rm(ctx, a->rm);
+    gen_helper_fmul_d(dest, cpu_env, src1, src2);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -130,12 +183,16 @@ static bool trans_fmul_d(DisasContext *ctx, arg_fmul_d *a)
 static bool trans_fdiv_d(DisasContext *ctx, arg_fdiv_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
 
-    gen_set_rm(ctx, a->rm);
-    gen_helper_fdiv_d(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
 
+    gen_set_rm(ctx, a->rm);
+    gen_helper_fdiv_d(dest, cpu_env, src1, src2);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -143,23 +200,34 @@ static bool trans_fdiv_d(DisasContext *ctx, arg_fdiv_d *a)
 static bool trans_fsqrt_d(DisasContext *ctx, arg_fsqrt_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1);
 
-    gen_set_rm(ctx, a->rm);
-    gen_helper_fsqrt_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
 
+    gen_set_rm(ctx, a->rm);
+    gen_helper_fsqrt_d(dest, cpu_env, src1);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
 
 static bool trans_fsgnj_d(DisasContext *ctx, arg_fsgnj_d *a)
 {
+    REQUIRE_FPU;
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     if (a->rs1 == a->rs2) { /* FMOV */
-        tcg_gen_mov_i64(cpu_fpr[a->rd], cpu_fpr[a->rs1]);
+        dest = get_fpr_d(ctx, a->rs1);
     } else {
-        tcg_gen_deposit_i64(cpu_fpr[a->rd], cpu_fpr[a->rs2],
-                            cpu_fpr[a->rs1], 0, 63);
+        TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+        TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
+        tcg_gen_deposit_i64(dest, src2, src1, 0, 63);
     }
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -167,15 +235,22 @@ static bool trans_fsgnj_d(DisasContext *ctx, arg_fsgnj_d *a)
 static bool trans_fsgnjn_d(DisasContext *ctx, arg_fsgnjn_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+
     if (a->rs1 == a->rs2) { /* FNEG */
-        tcg_gen_xori_i64(cpu_fpr[a->rd], cpu_fpr[a->rs1], INT64_MIN);
+        tcg_gen_xori_i64(dest, src1, INT64_MIN);
     } else {
+        TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
         TCGv_i64 t0 = tcg_temp_new_i64();
-        tcg_gen_not_i64(t0, cpu_fpr[a->rs2]);
-        tcg_gen_deposit_i64(cpu_fpr[a->rd], t0, cpu_fpr[a->rs1], 0, 63);
+        tcg_gen_not_i64(t0, src2);
+        tcg_gen_deposit_i64(dest, t0, src1, 0, 63);
         tcg_temp_free_i64(t0);
     }
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -183,15 +258,22 @@ static bool trans_fsgnjn_d(DisasContext *ctx, arg_fsgnjn_d *a)
 static bool trans_fsgnjx_d(DisasContext *ctx, arg_fsgnjx_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+
     if (a->rs1 == a->rs2) { /* FABS */
-        tcg_gen_andi_i64(cpu_fpr[a->rd], cpu_fpr[a->rs1], ~INT64_MIN);
+        tcg_gen_andi_i64(dest, src1, ~INT64_MIN);
     } else {
+        TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
         TCGv_i64 t0 = tcg_temp_new_i64();
-        tcg_gen_andi_i64(t0, cpu_fpr[a->rs2], INT64_MIN);
-        tcg_gen_xor_i64(cpu_fpr[a->rd], cpu_fpr[a->rs1], t0);
+        tcg_gen_andi_i64(t0, src2, INT64_MIN);
+        tcg_gen_xor_i64(dest, src1, t0);
         tcg_temp_free_i64(t0);
     }
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -199,11 +281,15 @@ static bool trans_fsgnjx_d(DisasContext *ctx, arg_fsgnjx_d *a)
 static bool trans_fmin_d(DisasContext *ctx, arg_fmin_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
 
-    gen_helper_fmin_d(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
 
+    gen_helper_fmin_d(dest, cpu_env, src1, src2);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -211,11 +297,15 @@ static bool trans_fmin_d(DisasContext *ctx, arg_fmin_d *a)
 static bool trans_fmax_d(DisasContext *ctx, arg_fmax_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
 
-    gen_helper_fmax_d(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
 
+    gen_helper_fmax_d(dest, cpu_env, src1, src2);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -223,11 +313,15 @@ static bool trans_fmax_d(DisasContext *ctx, arg_fmax_d *a)
 static bool trans_fcvt_s_d(DisasContext *ctx, arg_fcvt_s_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rs1);
 
-    gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_s_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
 
+    gen_set_rm(ctx, a->rm);
+    gen_helper_fcvt_s_d(dest, cpu_env, src1);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -235,11 +329,15 @@ static bool trans_fcvt_s_d(DisasContext *ctx, arg_fcvt_s_d *a)
 static bool trans_fcvt_d_s(DisasContext *ctx, arg_fcvt_d_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd);
 
-    gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_d_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
+    gen_set_rm(ctx, a->rm);
+    gen_helper_fcvt_d_s(dest, cpu_env, src1);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -247,11 +345,14 @@ static bool trans_fcvt_d_s(DisasContext *ctx, arg_fcvt_d_s *a)
 static bool trans_feq_d(DisasContext *ctx, arg_feq_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rs1 | a->rs2);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
 
-    gen_helper_feq_d(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_feq_d(dest, cpu_env, src1, src2);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -259,11 +360,14 @@ static bool trans_feq_d(DisasContext *ctx, arg_feq_d *a)
 static bool trans_flt_d(DisasContext *ctx, arg_flt_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rs1 | a->rs2);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
 
-    gen_helper_flt_d(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_flt_d(dest, cpu_env, src1, src2);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -271,11 +375,14 @@ static bool trans_flt_d(DisasContext *ctx, arg_flt_d *a)
 static bool trans_fle_d(DisasContext *ctx, arg_fle_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rs1 | a->rs2);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
 
-    gen_helper_fle_d(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_fle_d(dest, cpu_env, src1, src2);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -283,11 +390,13 @@ static bool trans_fle_d(DisasContext *ctx, arg_fle_d *a)
 static bool trans_fclass_d(DisasContext *ctx, arg_fclass_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rs1);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
 
-    gen_helper_fclass_d(dest, cpu_fpr[a->rs1]);
+    gen_helper_fclass_d(dest, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -295,12 +404,14 @@ static bool trans_fclass_d(DisasContext *ctx, arg_fclass_d *a)
 static bool trans_fcvt_w_d(DisasContext *ctx, arg_fcvt_w_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rs1);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_w_d(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_w_d(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -308,12 +419,14 @@ static bool trans_fcvt_w_d(DisasContext *ctx, arg_fcvt_w_d *a)
 static bool trans_fcvt_wu_d(DisasContext *ctx, arg_fcvt_wu_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rs1);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_wu_d(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_wu_d(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -321,12 +434,15 @@ static bool trans_fcvt_wu_d(DisasContext *ctx, arg_fcvt_wu_d *a)
 static bool trans_fcvt_d_w(DisasContext *ctx, arg_fcvt_d_w *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv src = get_gpr(ctx, a->rs1, EXT_SIGN);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_d_w(cpu_fpr[a->rd], cpu_env, src);
+    gen_helper_fcvt_d_w(dest, cpu_env, src);
+    gen_set_fpr_d(ctx, a->rd, dest);
 
     mark_fs_dirty(ctx);
     return true;
@@ -335,12 +451,15 @@ static bool trans_fcvt_d_w(DisasContext *ctx, arg_fcvt_d_w *a)
 static bool trans_fcvt_d_wu(DisasContext *ctx, arg_fcvt_d_wu *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv src = get_gpr(ctx, a->rs1, EXT_ZERO);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_d_wu(cpu_fpr[a->rd], cpu_env, src);
+    gen_helper_fcvt_d_wu(dest, cpu_env, src);
+    gen_set_fpr_d(ctx, a->rd, dest);
 
     mark_fs_dirty(ctx);
     return true;
@@ -350,12 +469,14 @@ static bool trans_fcvt_l_d(DisasContext *ctx, arg_fcvt_l_d *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rs1);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_l_d(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_l_d(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -364,12 +485,14 @@ static bool trans_fcvt_lu_d(DisasContext *ctx, arg_fcvt_lu_d *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rs1);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_lu_d(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_lu_d(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -392,12 +515,15 @@ static bool trans_fcvt_d_l(DisasContext *ctx, arg_fcvt_d_l *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv src = get_gpr(ctx, a->rs1, EXT_SIGN);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_d_l(cpu_fpr[a->rd], cpu_env, src);
+    gen_helper_fcvt_d_l(dest, cpu_env, src);
+    gen_set_fpr_d(ctx, a->rd, dest);
 
     mark_fs_dirty(ctx);
     return true;
@@ -407,12 +533,15 @@ static bool trans_fcvt_d_lu(DisasContext *ctx, arg_fcvt_d_lu *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv src = get_gpr(ctx, a->rs1, EXT_ZERO);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_d_lu(cpu_fpr[a->rd], cpu_env, src);
+    gen_helper_fcvt_d_lu(dest, cpu_env, src);
+    gen_set_fpr_d(ctx, a->rd, dest);
 
     mark_fs_dirty(ctx);
     return true;
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 10cf37be41..fac998a6b5 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -416,6 +416,31 @@ static TCGv_i64 get_fpr_hs(DisasContext *ctx, int reg_num)
     }
 }
 
+static TCGv_i64 get_fpr_d(DisasContext *ctx, int reg_num)
+{
+    if (!ctx->cfg_ptr->ext_zfinx) {
+        return cpu_fpr[reg_num];
+    }
+
+    if (reg_num == 0) {
+        return tcg_constant_i64(0);
+    }
+    switch (get_xl(ctx)) {
+    case MXL_RV32:
+    {
+        TCGv_i64 t = ftemp_new(ctx);
+        tcg_gen_concat_tl_i64(t, cpu_gpr[reg_num], cpu_gpr[reg_num + 1]);
+        return t;
+    }
+#ifdef TARGET_RISCV64
+    case MXL_RV64:
+        return cpu_gpr[reg_num];
+#endif
+    default:
+        g_assert_not_reached();
+    }
+}
+
 static TCGv_i64 dest_fpr(DisasContext *ctx, int reg_num)
 {
     if (!ctx->cfg_ptr->ext_zfinx) {
@@ -463,6 +488,33 @@ static void gen_set_fpr_hs(DisasContext *ctx, int reg_num, TCGv_i64 t)
     }
 }
 
+static void gen_set_fpr_d(DisasContext *ctx, int reg_num, TCGv_i64 t)
+{
+    if (!ctx->cfg_ptr->ext_zfinx) {
+        tcg_gen_mov_i64(cpu_fpr[reg_num], t);
+        return;
+    }
+
+    if (reg_num != 0) {
+        switch (get_xl(ctx)) {
+        case MXL_RV32:
+#ifdef TARGET_RISCV32
+            tcg_gen_extr_i64_i32(cpu_gpr[reg_num], cpu_gpr[reg_num + 1], t);
+            break;
+#else
+            tcg_gen_ext32s_i64(cpu_gpr[reg_num], t);
+            tcg_gen_sari_i64(cpu_gpr[reg_num + 1], t, 32);
+            break;
+        case MXL_RV64:
+            tcg_gen_mov_i64(cpu_gpr[reg_num], t);
+            break;
+#endif
+        default:
+            g_assert_not_reached();
+        }
+    }
+}
+
 static void gen_jal(DisasContext *ctx, int rd, target_ulong imm)
 {
     target_ulong next_pc;
-- 
2.17.1
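
The RV32 branch of get_fpr_d() and gen_set_fpr_d() in the translate.c hunk above treats a double as living in an even/odd pair of 32-bit integer registers, low half in the even register. Below is a minimal host-side C sketch of that data movement, not QEMU code: gpr[], read_pair() and write_pair() are hypothetical names introduced only for illustration, and the asserts stand in for a REQUIRE_EVEN check that is assumed to have already passed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the 32 integer registers of an RV32 hart. */
static uint32_t gpr[32];

/*
 * Model of the RV32 path of get_fpr_d(): x0 reads as zero, any other
 * (even) register is combined with its odd neighbour, low half first.
 */
static uint64_t read_pair(int reg)
{
    assert((reg & 1) == 0);   /* REQUIRE_EVEN is assumed to have passed */
    if (reg == 0) {
        return 0;
    }
    return ((uint64_t)gpr[reg + 1] << 32) | gpr[reg];
}

/*
 * Model of the RV32 path of gen_set_fpr_d(): writes to x0 are dropped,
 * otherwise the 64-bit value is split across the pair.
 */
static void write_pair(int reg, uint64_t val)
{
    assert((reg & 1) == 0);
    if (reg != 0) {
        gpr[reg] = (uint32_t)val;              /* low half  -> even reg */
        gpr[reg + 1] = (uint32_t)(val >> 32);  /* high half -> odd reg  */
    }
}
```

For example, writing the bit pattern of 1.0 (0x3ff0000000000000) with write_pair(10, ...) leaves gpr[10] == 0 and gpr[11] == 0x3ff00000, and read_pair(10) returns the original 64-bit value.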




* [PATCH v6 4/6] target/riscv: add support for zdinx
@ 2022-02-11  4:39   ` Weiwei Li
  0 siblings, 0 replies; 22+ messages in thread
From: Weiwei Li @ 2022-02-11  4:39 UTC (permalink / raw)
  To: richard.henderson, palmer, alistair.francis, bin.meng,
	qemu-riscv, qemu-devel
  Cc: wangjunqiang, lazyparser, ardxwe, Weiwei Li

  -- replace the REQUIRE_EXT(ctx, RVD) check with REQUIRE_ZDINX_OR_D
  -- read/write double-precision floating-point registers through
     get_fpr_d/gen_set_fpr_d
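
As a reading aid, the two checks added by this patch can be modelled outside QEMU roughly as follows. This is only a sketch under stated assumptions: struct fp_cfg, zdinx_or_d() and regs_ok() are hypothetical names, and the real macros return false from the translator function directly instead of returning a value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flattened view of the state the two macros consult. */
struct fp_cfg {
    bool ext_d;      /* D extension present (RVD) */
    bool ext_zdinx;  /* Zdinx enabled */
    bool rv32;       /* current XLEN is 32 bits */
};

/* Model of REQUIRE_ZDINX_OR_D: accept if Zdinx is on, else require D. */
static bool zdinx_or_d(const struct fp_cfg *cfg)
{
    return cfg->ext_zdinx || cfg->ext_d;
}

/*
 * Model of REQUIRE_EVEN: the caller ORs together every register number
 * that holds a double, so bit 0 is set if any one of them is odd.
 * The check only applies to Zdinx on RV32, where doubles use pairs.
 */
static bool regs_ok(const struct fp_cfg *cfg, int regs_ored)
{
    if (cfg->ext_zdinx && cfg->rv32 && (regs_ored & 0x1)) {
        return false;
    }
    return true;
}
```

ORing the register numbers first keeps the check to a single test per instruction: with Zdinx on RV32, regs_ok(&cfg, 10 | 12 | 14) passes while regs_ok(&cfg, 10 | 11) fails, and the same odd register is accepted on RV64 or with the plain D extension.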

Co-authored-by: ardxwe <ardxwe@gmail.com>
Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/riscv/insn_trans/trans_rvd.c.inc | 285 +++++++++++++++++-------
 target/riscv/translate.c                |  52 +++++
 2 files changed, 259 insertions(+), 78 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvd.c.inc b/target/riscv/insn_trans/trans_rvd.c.inc
index 091ed3a8ad..1397c1ce1c 100644
--- a/target/riscv/insn_trans/trans_rvd.c.inc
+++ b/target/riscv/insn_trans/trans_rvd.c.inc
@@ -18,6 +18,19 @@
  * this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#define REQUIRE_ZDINX_OR_D(ctx) do { \
+    if (!ctx->cfg_ptr->ext_zdinx) { \
+        REQUIRE_EXT(ctx, RVD); \
+    } \
+} while (0)
+
+#define REQUIRE_EVEN(ctx, reg) do { \
+    if (ctx->cfg_ptr->ext_zdinx && (get_xl(ctx) == MXL_RV32) && \
+        ((reg) & 0x1)) { \
+        return false; \
+    } \
+} while (0)
+
 static bool trans_fld(DisasContext *ctx, arg_fld *a)
 {
     TCGv addr;
@@ -47,10 +60,17 @@ static bool trans_fsd(DisasContext *ctx, arg_fsd *a)
 static bool trans_fmadd_d(DisasContext *ctx, arg_fmadd_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2 | a->rs3);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_d(ctx, a->rs3);
+
     gen_set_rm(ctx, a->rm);
-    gen_helper_fmadd_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fmadd_d(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -58,10 +78,17 @@ static bool trans_fmadd_d(DisasContext *ctx, arg_fmadd_d *a)
 static bool trans_fmsub_d(DisasContext *ctx, arg_fmsub_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2 | a->rs3);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_d(ctx, a->rs3);
+
     gen_set_rm(ctx, a->rm);
-    gen_helper_fmsub_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fmsub_d(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -69,10 +96,17 @@ static bool trans_fmsub_d(DisasContext *ctx, arg_fmsub_d *a)
 static bool trans_fnmsub_d(DisasContext *ctx, arg_fnmsub_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2 | a->rs3);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_d(ctx, a->rs3);
+
     gen_set_rm(ctx, a->rm);
-    gen_helper_fnmsub_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fnmsub_d(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -80,10 +114,17 @@ static bool trans_fnmsub_d(DisasContext *ctx, arg_fnmsub_d *a)
 static bool trans_fnmadd_d(DisasContext *ctx, arg_fnmadd_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2 | a->rs3);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_d(ctx, a->rs3);
+
     gen_set_rm(ctx, a->rm);
-    gen_helper_fnmadd_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fnmadd_d(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -91,12 +132,16 @@ static bool trans_fnmadd_d(DisasContext *ctx, arg_fnmadd_d *a)
 static bool trans_fadd_d(DisasContext *ctx, arg_fadd_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
 
-    gen_set_rm(ctx, a->rm);
-    gen_helper_fadd_d(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
 
+    gen_set_rm(ctx, a->rm);
+    gen_helper_fadd_d(dest, cpu_env, src1, src2);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -104,12 +149,16 @@ static bool trans_fadd_d(DisasContext *ctx, arg_fadd_d *a)
 static bool trans_fsub_d(DisasContext *ctx, arg_fsub_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
 
-    gen_set_rm(ctx, a->rm);
-    gen_helper_fsub_d(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
 
+    gen_set_rm(ctx, a->rm);
+    gen_helper_fsub_d(dest, cpu_env, src1, src2);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -117,12 +166,16 @@ static bool trans_fsub_d(DisasContext *ctx, arg_fsub_d *a)
 static bool trans_fmul_d(DisasContext *ctx, arg_fmul_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
 
-    gen_set_rm(ctx, a->rm);
-    gen_helper_fmul_d(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
 
+    gen_set_rm(ctx, a->rm);
+    gen_helper_fmul_d(dest, cpu_env, src1, src2);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -130,12 +183,16 @@ static bool trans_fmul_d(DisasContext *ctx, arg_fmul_d *a)
 static bool trans_fdiv_d(DisasContext *ctx, arg_fdiv_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
 
-    gen_set_rm(ctx, a->rm);
-    gen_helper_fdiv_d(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
 
+    gen_set_rm(ctx, a->rm);
+    gen_helper_fdiv_d(dest, cpu_env, src1, src2);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -143,23 +200,34 @@ static bool trans_fdiv_d(DisasContext *ctx, arg_fdiv_d *a)
 static bool trans_fsqrt_d(DisasContext *ctx, arg_fsqrt_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1);
 
-    gen_set_rm(ctx, a->rm);
-    gen_helper_fsqrt_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
 
+    gen_set_rm(ctx, a->rm);
+    gen_helper_fsqrt_d(dest, cpu_env, src1);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
 
 static bool trans_fsgnj_d(DisasContext *ctx, arg_fsgnj_d *a)
 {
+    REQUIRE_FPU;
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     if (a->rs1 == a->rs2) { /* FMOV */
-        tcg_gen_mov_i64(cpu_fpr[a->rd], cpu_fpr[a->rs1]);
+        dest = get_fpr_d(ctx, a->rs1);
     } else {
-        tcg_gen_deposit_i64(cpu_fpr[a->rd], cpu_fpr[a->rs2],
-                            cpu_fpr[a->rs1], 0, 63);
+        TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+        TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
+        tcg_gen_deposit_i64(dest, src2, src1, 0, 63);
     }
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -167,15 +235,22 @@ static bool trans_fsgnj_d(DisasContext *ctx, arg_fsgnj_d *a)
 static bool trans_fsgnjn_d(DisasContext *ctx, arg_fsgnjn_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+
     if (a->rs1 == a->rs2) { /* FNEG */
-        tcg_gen_xori_i64(cpu_fpr[a->rd], cpu_fpr[a->rs1], INT64_MIN);
+        tcg_gen_xori_i64(dest, src1, INT64_MIN);
     } else {
+        TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
         TCGv_i64 t0 = tcg_temp_new_i64();
-        tcg_gen_not_i64(t0, cpu_fpr[a->rs2]);
-        tcg_gen_deposit_i64(cpu_fpr[a->rd], t0, cpu_fpr[a->rs1], 0, 63);
+        tcg_gen_not_i64(t0, src2);
+        tcg_gen_deposit_i64(dest, t0, src1, 0, 63);
         tcg_temp_free_i64(t0);
     }
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -183,15 +258,22 @@ static bool trans_fsgnjn_d(DisasContext *ctx, arg_fsgnjn_d *a)
 static bool trans_fsgnjx_d(DisasContext *ctx, arg_fsgnjx_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+
     if (a->rs1 == a->rs2) { /* FABS */
-        tcg_gen_andi_i64(cpu_fpr[a->rd], cpu_fpr[a->rs1], ~INT64_MIN);
+        tcg_gen_andi_i64(dest, src1, ~INT64_MIN);
     } else {
+        TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
         TCGv_i64 t0 = tcg_temp_new_i64();
-        tcg_gen_andi_i64(t0, cpu_fpr[a->rs2], INT64_MIN);
-        tcg_gen_xor_i64(cpu_fpr[a->rd], cpu_fpr[a->rs1], t0);
+        tcg_gen_andi_i64(t0, src2, INT64_MIN);
+        tcg_gen_xor_i64(dest, src1, t0);
         tcg_temp_free_i64(t0);
     }
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -199,11 +281,15 @@ static bool trans_fsgnjx_d(DisasContext *ctx, arg_fsgnjx_d *a)
 static bool trans_fmin_d(DisasContext *ctx, arg_fmin_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
 
-    gen_helper_fmin_d(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
 
+    gen_helper_fmin_d(dest, cpu_env, src1, src2);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -211,11 +297,15 @@ static bool trans_fmin_d(DisasContext *ctx, arg_fmin_d *a)
 static bool trans_fmax_d(DisasContext *ctx, arg_fmax_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
 
-    gen_helper_fmax_d(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
 
+    gen_helper_fmax_d(dest, cpu_env, src1, src2);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -223,11 +313,15 @@ static bool trans_fmax_d(DisasContext *ctx, arg_fmax_d *a)
 static bool trans_fcvt_s_d(DisasContext *ctx, arg_fcvt_s_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rs1);
 
-    gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_s_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
 
+    gen_set_rm(ctx, a->rm);
+    gen_helper_fcvt_s_d(dest, cpu_env, src1);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -235,11 +329,15 @@ static bool trans_fcvt_s_d(DisasContext *ctx, arg_fcvt_s_d *a)
 static bool trans_fcvt_d_s(DisasContext *ctx, arg_fcvt_d_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd);
 
-    gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_d_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
+    gen_set_rm(ctx, a->rm);
+    gen_helper_fcvt_d_s(dest, cpu_env, src1);
+    gen_set_fpr_d(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -247,11 +345,14 @@ static bool trans_fcvt_d_s(DisasContext *ctx, arg_fcvt_d_s *a)
 static bool trans_feq_d(DisasContext *ctx, arg_feq_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rs1 | a->rs2);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
 
-    gen_helper_feq_d(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_feq_d(dest, cpu_env, src1, src2);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -259,11 +360,14 @@ static bool trans_feq_d(DisasContext *ctx, arg_feq_d *a)
 static bool trans_flt_d(DisasContext *ctx, arg_flt_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rs1 | a->rs2);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
 
-    gen_helper_flt_d(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_flt_d(dest, cpu_env, src1, src2);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -271,11 +375,14 @@ static bool trans_flt_d(DisasContext *ctx, arg_flt_d *a)
 static bool trans_fle_d(DisasContext *ctx, arg_fle_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rs1 | a->rs2);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
 
-    gen_helper_fle_d(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_fle_d(dest, cpu_env, src1, src2);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -283,11 +390,13 @@ static bool trans_fle_d(DisasContext *ctx, arg_fle_d *a)
 static bool trans_fclass_d(DisasContext *ctx, arg_fclass_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rs1);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
 
-    gen_helper_fclass_d(dest, cpu_fpr[a->rs1]);
+    gen_helper_fclass_d(dest, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -295,12 +404,14 @@ static bool trans_fclass_d(DisasContext *ctx, arg_fclass_d *a)
 static bool trans_fcvt_w_d(DisasContext *ctx, arg_fcvt_w_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rs1);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_w_d(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_w_d(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -308,12 +419,14 @@ static bool trans_fcvt_w_d(DisasContext *ctx, arg_fcvt_w_d *a)
 static bool trans_fcvt_wu_d(DisasContext *ctx, arg_fcvt_wu_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rs1);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_wu_d(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_wu_d(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -321,12 +434,15 @@ static bool trans_fcvt_wu_d(DisasContext *ctx, arg_fcvt_wu_d *a)
 static bool trans_fcvt_d_w(DisasContext *ctx, arg_fcvt_d_w *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv src = get_gpr(ctx, a->rs1, EXT_SIGN);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_d_w(cpu_fpr[a->rd], cpu_env, src);
+    gen_helper_fcvt_d_w(dest, cpu_env, src);
+    gen_set_fpr_d(ctx, a->rd, dest);
 
     mark_fs_dirty(ctx);
     return true;
@@ -335,12 +451,15 @@ static bool trans_fcvt_d_w(DisasContext *ctx, arg_fcvt_d_w *a)
 static bool trans_fcvt_d_wu(DisasContext *ctx, arg_fcvt_d_wu *a)
 {
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv src = get_gpr(ctx, a->rs1, EXT_ZERO);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_d_wu(cpu_fpr[a->rd], cpu_env, src);
+    gen_helper_fcvt_d_wu(dest, cpu_env, src);
+    gen_set_fpr_d(ctx, a->rd, dest);
 
     mark_fs_dirty(ctx);
     return true;
@@ -350,12 +469,14 @@ static bool trans_fcvt_l_d(DisasContext *ctx, arg_fcvt_l_d *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rs1);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_l_d(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_l_d(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -364,12 +485,14 @@ static bool trans_fcvt_lu_d(DisasContext *ctx, arg_fcvt_lu_d *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rs1);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_lu_d(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_lu_d(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -392,12 +515,15 @@ static bool trans_fcvt_d_l(DisasContext *ctx, arg_fcvt_d_l *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv src = get_gpr(ctx, a->rs1, EXT_SIGN);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_d_l(cpu_fpr[a->rd], cpu_env, src);
+    gen_helper_fcvt_d_l(dest, cpu_env, src);
+    gen_set_fpr_d(ctx, a->rd, dest);
 
     mark_fs_dirty(ctx);
     return true;
@@ -407,12 +533,15 @@ static bool trans_fcvt_d_lu(DisasContext *ctx, arg_fcvt_d_lu *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZDINX_OR_D(ctx);
+    REQUIRE_EVEN(ctx, a->rd);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv src = get_gpr(ctx, a->rs1, EXT_ZERO);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_d_lu(cpu_fpr[a->rd], cpu_env, src);
+    gen_helper_fcvt_d_lu(dest, cpu_env, src);
+    gen_set_fpr_d(ctx, a->rd, dest);
 
     mark_fs_dirty(ctx);
     return true;
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 10cf37be41..fac998a6b5 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -416,6 +416,31 @@ static TCGv_i64 get_fpr_hs(DisasContext *ctx, int reg_num)
     }
 }
 
+static TCGv_i64 get_fpr_d(DisasContext *ctx, int reg_num)
+{
+    if (!ctx->cfg_ptr->ext_zfinx) {
+        return cpu_fpr[reg_num];
+    }
+
+    if (reg_num == 0) {
+        return tcg_constant_i64(0);
+    }
+    switch (get_xl(ctx)) {
+    case MXL_RV32:
+    {
+        TCGv_i64 t = ftemp_new(ctx);
+        tcg_gen_concat_tl_i64(t, cpu_gpr[reg_num], cpu_gpr[reg_num + 1]);
+        return t;
+    }
+#ifdef TARGET_RISCV64
+    case MXL_RV64:
+        return cpu_gpr[reg_num];
+#endif
+    default:
+        g_assert_not_reached();
+    }
+}
+
 static TCGv_i64 dest_fpr(DisasContext *ctx, int reg_num)
 {
     if (!ctx->cfg_ptr->ext_zfinx) {
@@ -463,6 +488,33 @@ static void gen_set_fpr_hs(DisasContext *ctx, int reg_num, TCGv_i64 t)
     }
 }
 
+static void gen_set_fpr_d(DisasContext *ctx, int reg_num, TCGv_i64 t)
+{
+    if (!ctx->cfg_ptr->ext_zfinx) {
+        tcg_gen_mov_i64(cpu_fpr[reg_num], t);
+        return;
+    }
+
+    if (reg_num != 0) {
+        switch (get_xl(ctx)) {
+        case MXL_RV32:
+#ifdef TARGET_RISCV32
+            tcg_gen_extr_i64_i32(cpu_gpr[reg_num], cpu_gpr[reg_num + 1], t);
+            break;
+#else
+            tcg_gen_ext32s_i64(cpu_gpr[reg_num], t);
+            tcg_gen_sari_i64(cpu_gpr[reg_num + 1], t, 32);
+            break;
+        case MXL_RV64:
+            tcg_gen_mov_i64(cpu_gpr[reg_num], t);
+            break;
+#endif
+        default:
+            g_assert_not_reached();
+        }
+    }
+}
+
 static void gen_jal(DisasContext *ctx, int rd, target_ulong imm)
 {
     target_ulong next_pc;
-- 
2.17.1
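
For readers following the RV32 paths of get_fpr_d() and gen_set_fpr_d() in
the patch above, the register-pair handling can be sketched outside TCG as
plain C. This is only an illustrative model, not QEMU code: the rv32_gprs
type and the read_pair()/write_pair() helpers are hypothetical names, and the
even-register assertion stands in for what REQUIRE_EVEN is assumed to enforce.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the RV32 integer register file (not a QEMU type). */
typedef struct {
    uint32_t x[32];
} rv32_gprs;

/*
 * Read a 64-bit double from the pair (reg, reg + 1): the low word comes
 * from x[reg] and the high word from x[reg + 1], mirroring the operand
 * order of tcg_gen_concat_tl_i64() in get_fpr_d(). x0 reads as zero.
 */
static uint64_t read_pair(const rv32_gprs *r, unsigned reg)
{
    assert(reg < 32 && (reg & 1) == 0);   /* assumed REQUIRE_EVEN rule */
    if (reg == 0) {
        return 0;
    }
    return ((uint64_t)r->x[reg + 1] << 32) | r->x[reg];
}

/*
 * Write a 64-bit double to the pair (reg, reg + 1), mirroring the
 * tcg_gen_extr_i64_i32() split in gen_set_fpr_d(). Writes to x0 are
 * discarded, so x1 is left untouched as well.
 */
static void write_pair(rv32_gprs *r, unsigned reg, uint64_t v)
{
    assert(reg < 32 && (reg & 1) == 0);   /* assumed REQUIRE_EVEN rule */
    if (reg == 0) {
        return;
    }
    r->x[reg] = (uint32_t)v;
    r->x[reg + 1] = (uint32_t)(v >> 32);
}
```

Writing the bit pattern of 1.0 (0x3ff0000000000000) to the pair starting at
x10 leaves x10 holding the low word and x11 the high word. Reading the pair
back returns the original value.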




* [PATCH v6 5/6] target/riscv: add support for zhinx/zhinxmin
  2022-02-11  4:39 ` Weiwei Li
@ 2022-02-11  4:39   ` Weiwei Li
  -1 siblings, 0 replies; 22+ messages in thread
From: Weiwei Li @ 2022-02-11  4:39 UTC (permalink / raw)
  To: richard.henderson, palmer, alistair.francis, bin.meng,
	qemu-riscv, qemu-devel
  Cc: wangjunqiang, Weiwei Li, lazyparser, ardxwe

  - add extension checks REQUIRE_ZHINX_OR_ZFH and REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN
  - update half-precision floating-point register reads/writes
  - disable the nanbox_h check when zfinx is enabled
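
The difference between the two half-precision register encodings touched by
this patch can be sketched in standalone C. This is a minimal model under
stated assumptions, not the QEMU helpers themselves: the function names are
hypothetical, and 0x7e00 is used as the canonical half-precision quiet NaN
returned for an improperly boxed value.

```c
#include <assert.h>
#include <stdint.h>

#define HALF_BOX_MASK     UINT64_C(0xffffffffffff0000)  /* bits 63..16 */
#define HALF_DEFAULT_QNAN 0x7e00u                       /* canonical qNaN */

/* Zfh style: the upper 48 bits of the 64-bit FP register are all ones. */
static uint64_t nanbox_half(uint16_t h)
{
    return HALF_BOX_MASK | h;
}

/* Zfh style read: a value that is not properly boxed reads as qNaN. */
static uint16_t check_nanbox_half(uint64_t reg)
{
    return ((reg & HALF_BOX_MASK) == HALF_BOX_MASK) ? (uint16_t)reg
                                                    : HALF_DEFAULT_QNAN;
}

/* Zhinx style: the result is sign-extended from bit 15 into the GPR. */
static uint64_t sext_half(uint16_t h)
{
    return (h & 0x8000u) ? (HALF_BOX_MASK | h) : h;
}

/* Zhinx style read: only the low 16 bits matter, no boxing check. */
static uint16_t low_half(uint64_t reg)
{
    return (uint16_t)reg;
}

/* Small self-check using 1.0 (0x3c00) and -1.0 (0xbc00). */
static void check_half_encodings(void)
{
    assert(nanbox_half(0x3c00) == UINT64_C(0xffffffffffff3c00));
    assert(sext_half(0x3c00) == UINT64_C(0x0000000000003c00));
    assert(sext_half(0xbc00) == UINT64_C(0xffffffffffffbc00));
    /* The same register contents read differently in the two modes. */
    assert(check_nanbox_half(UINT64_C(0x3c00)) == HALF_DEFAULT_QNAN);
    assert(low_half(UINT64_C(0x3c00)) == 0x3c00);
}
```

For negative values the two encodings coincide, because sign extension also
fills the upper bits with ones. They differ only for values with bit 15
clear.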

Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/riscv/fpu_helper.c                 |  89 +++---
 target/riscv/helper.h                     |   2 +-
 target/riscv/insn_trans/trans_rvzfh.c.inc | 332 +++++++++++++++-------
 target/riscv/internals.h                  |  16 +-
 4 files changed, 296 insertions(+), 143 deletions(-)

diff --git a/target/riscv/fpu_helper.c b/target/riscv/fpu_helper.c
index 63ca703459..5699c9517f 100644
--- a/target/riscv/fpu_helper.c
+++ b/target/riscv/fpu_helper.c
@@ -89,10 +89,11 @@ void helper_set_rod_rounding_mode(CPURISCVState *env)
 static uint64_t do_fmadd_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2,
                            uint64_t rs3, int flags)
 {
-    float16 frs1 = check_nanbox_h(rs1);
-    float16 frs2 = check_nanbox_h(rs2);
-    float16 frs3 = check_nanbox_h(rs3);
-    return nanbox_h(float16_muladd(frs1, frs2, frs3, flags, &env->fp_status));
+    float16 frs1 = check_nanbox_h(env, rs1);
+    float16 frs2 = check_nanbox_h(env, rs2);
+    float16 frs3 = check_nanbox_h(env, rs3);
+    return nanbox_h(env, float16_muladd(frs1, frs2, frs3, flags,
+                                        &env->fp_status));
 }
 
 static uint64_t do_fmadd_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2,
@@ -417,146 +418,146 @@ target_ulong helper_fclass_d(uint64_t frs1)
 
 uint64_t helper_fadd_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float16 frs1 = check_nanbox_h(rs1);
-    float16 frs2 = check_nanbox_h(rs2);
-    return nanbox_h(float16_add(frs1, frs2, &env->fp_status));
+    float16 frs1 = check_nanbox_h(env, rs1);
+    float16 frs2 = check_nanbox_h(env, rs2);
+    return nanbox_h(env, float16_add(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fsub_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float16 frs1 = check_nanbox_h(rs1);
-    float16 frs2 = check_nanbox_h(rs2);
-    return nanbox_h(float16_sub(frs1, frs2, &env->fp_status));
+    float16 frs1 = check_nanbox_h(env, rs1);
+    float16 frs2 = check_nanbox_h(env, rs2);
+    return nanbox_h(env, float16_sub(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fmul_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float16 frs1 = check_nanbox_h(rs1);
-    float16 frs2 = check_nanbox_h(rs2);
-    return nanbox_h(float16_mul(frs1, frs2, &env->fp_status));
+    float16 frs1 = check_nanbox_h(env, rs1);
+    float16 frs2 = check_nanbox_h(env, rs2);
+    return nanbox_h(env, float16_mul(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fdiv_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float16 frs1 = check_nanbox_h(rs1);
-    float16 frs2 = check_nanbox_h(rs2);
-    return nanbox_h(float16_div(frs1, frs2, &env->fp_status));
+    float16 frs1 = check_nanbox_h(env, rs1);
+    float16 frs2 = check_nanbox_h(env, rs2);
+    return nanbox_h(env, float16_div(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fmin_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float16 frs1 = check_nanbox_h(rs1);
-    float16 frs2 = check_nanbox_h(rs2);
-    return nanbox_h(env->priv_ver < PRIV_VERSION_1_11_0 ?
+    float16 frs1 = check_nanbox_h(env, rs1);
+    float16 frs2 = check_nanbox_h(env, rs2);
+    return nanbox_h(env, env->priv_ver < PRIV_VERSION_1_11_0 ?
                     float16_minnum(frs1, frs2, &env->fp_status) :
                     float16_minimum_number(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fmax_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float16 frs1 = check_nanbox_h(rs1);
-    float16 frs2 = check_nanbox_h(rs2);
-    return nanbox_h(env->priv_ver < PRIV_VERSION_1_11_0 ?
+    float16 frs1 = check_nanbox_h(env, rs1);
+    float16 frs2 = check_nanbox_h(env, rs2);
+    return nanbox_h(env, env->priv_ver < PRIV_VERSION_1_11_0 ?
                     float16_maxnum(frs1, frs2, &env->fp_status) :
                     float16_maximum_number(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fsqrt_h(CPURISCVState *env, uint64_t rs1)
 {
-    float16 frs1 = check_nanbox_h(rs1);
-    return nanbox_h(float16_sqrt(frs1, &env->fp_status));
+    float16 frs1 = check_nanbox_h(env, rs1);
+    return nanbox_h(env, float16_sqrt(frs1, &env->fp_status));
 }
 
 target_ulong helper_fle_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float16 frs1 = check_nanbox_h(rs1);
-    float16 frs2 = check_nanbox_h(rs2);
+    float16 frs1 = check_nanbox_h(env, rs1);
+    float16 frs2 = check_nanbox_h(env, rs2);
     return float16_le(frs1, frs2, &env->fp_status);
 }
 
 target_ulong helper_flt_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float16 frs1 = check_nanbox_h(rs1);
-    float16 frs2 = check_nanbox_h(rs2);
+    float16 frs1 = check_nanbox_h(env, rs1);
+    float16 frs2 = check_nanbox_h(env, rs2);
     return float16_lt(frs1, frs2, &env->fp_status);
 }
 
 target_ulong helper_feq_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float16 frs1 = check_nanbox_h(rs1);
-    float16 frs2 = check_nanbox_h(rs2);
+    float16 frs1 = check_nanbox_h(env, rs1);
+    float16 frs2 = check_nanbox_h(env, rs2);
     return float16_eq_quiet(frs1, frs2, &env->fp_status);
 }
 
-target_ulong helper_fclass_h(uint64_t rs1)
+target_ulong helper_fclass_h(CPURISCVState *env, uint64_t rs1)
 {
-    float16 frs1 = check_nanbox_h(rs1);
+    float16 frs1 = check_nanbox_h(env, rs1);
     return fclass_h(frs1);
 }
 
 target_ulong helper_fcvt_w_h(CPURISCVState *env, uint64_t rs1)
 {
-    float16 frs1 = check_nanbox_h(rs1);
+    float16 frs1 = check_nanbox_h(env, rs1);
     return float16_to_int32(frs1, &env->fp_status);
 }
 
 target_ulong helper_fcvt_wu_h(CPURISCVState *env, uint64_t rs1)
 {
-    float16 frs1 = check_nanbox_h(rs1);
+    float16 frs1 = check_nanbox_h(env, rs1);
     return (int32_t)float16_to_uint32(frs1, &env->fp_status);
 }
 
 target_ulong helper_fcvt_l_h(CPURISCVState *env, uint64_t rs1)
 {
-    float16 frs1 = check_nanbox_h(rs1);
+    float16 frs1 = check_nanbox_h(env, rs1);
     return float16_to_int64(frs1, &env->fp_status);
 }
 
 target_ulong helper_fcvt_lu_h(CPURISCVState *env, uint64_t rs1)
 {
-    float16 frs1 = check_nanbox_h(rs1);
+    float16 frs1 = check_nanbox_h(env, rs1);
     return float16_to_uint64(frs1, &env->fp_status);
 }
 
 uint64_t helper_fcvt_h_w(CPURISCVState *env, target_ulong rs1)
 {
-    return nanbox_h(int32_to_float16((int32_t)rs1, &env->fp_status));
+    return nanbox_h(env, int32_to_float16((int32_t)rs1, &env->fp_status));
 }
 
 uint64_t helper_fcvt_h_wu(CPURISCVState *env, target_ulong rs1)
 {
-    return nanbox_h(uint32_to_float16((uint32_t)rs1, &env->fp_status));
+    return nanbox_h(env, uint32_to_float16((uint32_t)rs1, &env->fp_status));
 }
 
 uint64_t helper_fcvt_h_l(CPURISCVState *env, target_ulong rs1)
 {
-    return nanbox_h(int64_to_float16(rs1, &env->fp_status));
+    return nanbox_h(env, int64_to_float16(rs1, &env->fp_status));
 }
 
 uint64_t helper_fcvt_h_lu(CPURISCVState *env, target_ulong rs1)
 {
-    return nanbox_h(uint64_to_float16(rs1, &env->fp_status));
+    return nanbox_h(env, uint64_to_float16(rs1, &env->fp_status));
 }
 
 uint64_t helper_fcvt_h_s(CPURISCVState *env, uint64_t rs1)
 {
     float32 frs1 = check_nanbox_s(env, rs1);
-    return nanbox_h(float32_to_float16(frs1, true, &env->fp_status));
+    return nanbox_h(env, float32_to_float16(frs1, true, &env->fp_status));
 }
 
 uint64_t helper_fcvt_s_h(CPURISCVState *env, uint64_t rs1)
 {
-    float16 frs1 = check_nanbox_h(rs1);
+    float16 frs1 = check_nanbox_h(env, rs1);
     return nanbox_s(env, float16_to_float32(frs1, true, &env->fp_status));
 }
 
 uint64_t helper_fcvt_h_d(CPURISCVState *env, uint64_t rs1)
 {
-    return nanbox_h(float64_to_float16(rs1, true, &env->fp_status));
+    return nanbox_h(env, float64_to_float16(rs1, true, &env->fp_status));
 }
 
 uint64_t helper_fcvt_d_h(CPURISCVState *env, uint64_t rs1)
 {
-    float16 frs1 = check_nanbox_h(rs1);
+    float16 frs1 = check_nanbox_h(env, rs1);
     return float16_to_float64(frs1, true, &env->fp_status);
 }
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 89195aad9d..26bbab2fab 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -90,7 +90,7 @@ DEF_HELPER_FLAGS_2(fcvt_h_w, TCG_CALL_NO_RWG, i64, env, tl)
 DEF_HELPER_FLAGS_2(fcvt_h_wu, TCG_CALL_NO_RWG, i64, env, tl)
 DEF_HELPER_FLAGS_2(fcvt_h_l, TCG_CALL_NO_RWG, i64, env, tl)
 DEF_HELPER_FLAGS_2(fcvt_h_lu, TCG_CALL_NO_RWG, i64, env, tl)
-DEF_HELPER_FLAGS_1(fclass_h, TCG_CALL_NO_RWG_SE, tl, i64)
+DEF_HELPER_FLAGS_2(fclass_h, TCG_CALL_NO_RWG_SE, tl, env, i64)
 
 /* Special functions */
 DEF_HELPER_2(csrr, tl, env, int)
diff --git a/target/riscv/insn_trans/trans_rvzfh.c.inc b/target/riscv/insn_trans/trans_rvzfh.c.inc
index 608c51da2c..5d07150cd0 100644
--- a/target/riscv/insn_trans/trans_rvzfh.c.inc
+++ b/target/riscv/insn_trans/trans_rvzfh.c.inc
@@ -22,12 +22,25 @@
     }                         \
 } while (0)
 
+#define REQUIRE_ZHINX_OR_ZFH(ctx) do { \
+    if (!ctx->cfg_ptr->ext_zhinx && !ctx->cfg_ptr->ext_zfh) { \
+        return false;                  \
+    }                                  \
+} while (0)
+
 #define REQUIRE_ZFH_OR_ZFHMIN(ctx) do {       \
     if (!(ctx->cfg_ptr->ext_zfh || ctx->cfg_ptr->ext_zfhmin)) { \
         return false;                         \
     }                                         \
 } while (0)
 
+#define REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN(ctx) do { \
+    if (!(ctx->cfg_ptr->ext_zfh || ctx->cfg_ptr->ext_zfhmin ||          \
+          ctx->cfg_ptr->ext_zhinx || ctx->cfg_ptr->ext_zhinxmin)) {     \
+        return false;                                        \
+    }                                                        \
+} while (0)
+
 static bool trans_flh(DisasContext *ctx, arg_flh *a)
 {
     TCGv_i64 dest;
@@ -73,11 +86,16 @@ static bool trans_fsh(DisasContext *ctx, arg_fsh *a)
 static bool trans_fmadd_h(DisasContext *ctx, arg_fmadd_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fmadd_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fmadd_h(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -85,11 +103,16 @@ static bool trans_fmadd_h(DisasContext *ctx, arg_fmadd_h *a)
 static bool trans_fmsub_h(DisasContext *ctx, arg_fmsub_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fmsub_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fmsub_h(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -97,11 +120,16 @@ static bool trans_fmsub_h(DisasContext *ctx, arg_fmsub_h *a)
 static bool trans_fnmsub_h(DisasContext *ctx, arg_fnmsub_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fnmsub_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fnmsub_h(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -109,11 +137,16 @@ static bool trans_fnmsub_h(DisasContext *ctx, arg_fnmsub_h *a)
 static bool trans_fnmadd_h(DisasContext *ctx, arg_fnmadd_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fnmadd_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fnmadd_h(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -121,11 +154,15 @@ static bool trans_fnmadd_h(DisasContext *ctx, arg_fnmadd_h *a)
 static bool trans_fadd_h(DisasContext *ctx, arg_fadd_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fadd_h(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_fadd_h(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -133,11 +170,15 @@ static bool trans_fadd_h(DisasContext *ctx, arg_fadd_h *a)
 static bool trans_fsub_h(DisasContext *ctx, arg_fsub_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fsub_h(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_fsub_h(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -145,11 +186,15 @@ static bool trans_fsub_h(DisasContext *ctx, arg_fsub_h *a)
 static bool trans_fmul_h(DisasContext *ctx, arg_fmul_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fmul_h(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_fmul_h(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -157,11 +202,15 @@ static bool trans_fmul_h(DisasContext *ctx, arg_fmul_h *a)
 static bool trans_fdiv_h(DisasContext *ctx, arg_fdiv_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fdiv_h(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_fdiv_h(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -169,10 +218,14 @@ static bool trans_fdiv_h(DisasContext *ctx, arg_fdiv_h *a)
 static bool trans_fsqrt_h(DisasContext *ctx, arg_fsqrt_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fsqrt_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fsqrt_h(dest, cpu_env, src1);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -180,23 +233,37 @@ static bool trans_fsqrt_h(DisasContext *ctx, arg_fsqrt_h *a)
 static bool trans_fsgnj_h(DisasContext *ctx, arg_fsgnj_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     if (a->rs1 == a->rs2) { /* FMOV */
-        gen_check_nanbox_h(cpu_fpr[a->rd], cpu_fpr[a->rs1]);
+        if (!ctx->cfg_ptr->ext_zfinx) {
+            gen_check_nanbox_h(dest, src1);
+        } else {
+            tcg_gen_ext16s_i64(dest, src1);
+        }
     } else {
-        TCGv_i64 rs1 = tcg_temp_new_i64();
-        TCGv_i64 rs2 = tcg_temp_new_i64();
-
-        gen_check_nanbox_h(rs1, cpu_fpr[a->rs1]);
-        gen_check_nanbox_h(rs2, cpu_fpr[a->rs2]);
-
-        /* This formulation retains the nanboxing of rs2. */
-        tcg_gen_deposit_i64(cpu_fpr[a->rd], rs2, rs1, 0, 15);
-        tcg_temp_free_i64(rs1);
-        tcg_temp_free_i64(rs2);
+        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
+
+        if (!ctx->cfg_ptr->ext_zfinx) {
+            TCGv_i64 rs1 = tcg_temp_new_i64();
+            TCGv_i64 rs2 = tcg_temp_new_i64();
+            gen_check_nanbox_h(rs1, src1);
+            gen_check_nanbox_h(rs2, src2);
+
+            /* This formulation retains the nanboxing of rs2 in normal 'Zfh'. */
+            tcg_gen_deposit_i64(dest, rs2, rs1, 0, 15);
+
+            tcg_temp_free_i64(rs1);
+            tcg_temp_free_i64(rs2);
+        } else {
+            tcg_gen_deposit_i64(dest, src2, src1, 0, 15);
+            tcg_gen_ext16s_i64(dest, dest);
+        }
     }
-
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -206,16 +273,29 @@ static bool trans_fsgnjn_h(DisasContext *ctx, arg_fsgnjn_h *a)
     TCGv_i64 rs1, rs2, mask;
 
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     rs1 = tcg_temp_new_i64();
-    gen_check_nanbox_h(rs1, cpu_fpr[a->rs1]);
+    if (!ctx->cfg_ptr->ext_zfinx) {
+        gen_check_nanbox_h(rs1, src1);
+    } else {
+        tcg_gen_mov_i64(rs1, src1);
+    }
 
     if (a->rs1 == a->rs2) { /* FNEG */
-        tcg_gen_xori_i64(cpu_fpr[a->rd], rs1, MAKE_64BIT_MASK(15, 1));
+        tcg_gen_xori_i64(dest, rs1, MAKE_64BIT_MASK(15, 1));
     } else {
+        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
         rs2 = tcg_temp_new_i64();
-        gen_check_nanbox_h(rs2, cpu_fpr[a->rs2]);
+
+        if (!ctx->cfg_ptr->ext_zfinx) {
+            gen_check_nanbox_h(rs2, src2);
+        } else {
+            tcg_gen_mov_i64(rs2, src2);
+        }
 
         /*
          * Replace bit 15 in rs1 with inverse in rs2.
@@ -224,12 +304,17 @@ static bool trans_fsgnjn_h(DisasContext *ctx, arg_fsgnjn_h *a)
         mask = tcg_const_i64(~MAKE_64BIT_MASK(15, 1));
         tcg_gen_not_i64(rs2, rs2);
         tcg_gen_andc_i64(rs2, rs2, mask);
-        tcg_gen_and_i64(rs1, mask, rs1);
-        tcg_gen_or_i64(cpu_fpr[a->rd], rs1, rs2);
+        tcg_gen_and_i64(dest, mask, rs1);
+        tcg_gen_or_i64(dest, dest, rs2);
 
         tcg_temp_free_i64(mask);
         tcg_temp_free_i64(rs2);
     }
+    /* sign-extend the result instead of nanboxing it if zfinx is enabled */
+    if (ctx->cfg_ptr->ext_zfinx) {
+        tcg_gen_ext16s_i64(dest, dest);
+    }
+    tcg_temp_free_i64(rs1);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -239,27 +324,44 @@ static bool trans_fsgnjx_h(DisasContext *ctx, arg_fsgnjx_h *a)
     TCGv_i64 rs1, rs2;
 
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     rs1 = tcg_temp_new_i64();
-    gen_check_nanbox_s(rs1, cpu_fpr[a->rs1]);
+    if (!ctx->cfg_ptr->ext_zfinx) {
+        gen_check_nanbox_h(rs1, src1);
+    } else {
+        tcg_gen_mov_i64(rs1, src1);
+    }
 
     if (a->rs1 == a->rs2) { /* FABS */
-        tcg_gen_andi_i64(cpu_fpr[a->rd], rs1, ~MAKE_64BIT_MASK(15, 1));
+        tcg_gen_andi_i64(dest, rs1, ~MAKE_64BIT_MASK(15, 1));
     } else {
+        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
         rs2 = tcg_temp_new_i64();
-        gen_check_nanbox_s(rs2, cpu_fpr[a->rs2]);
+
+        if (!ctx->cfg_ptr->ext_zfinx) {
+            gen_check_nanbox_h(rs2, src2);
+        } else {
+            tcg_gen_mov_i64(rs2, src2);
+        }
 
         /*
          * Xor bit 15 in rs1 with that in rs2.
          * This formulation retains the nanboxing of rs1.
          */
-        tcg_gen_andi_i64(rs2, rs2, MAKE_64BIT_MASK(15, 1));
-        tcg_gen_xor_i64(cpu_fpr[a->rd], rs1, rs2);
+        tcg_gen_andi_i64(dest, rs2, MAKE_64BIT_MASK(15, 1));
+        tcg_gen_xor_i64(dest, rs1, dest);
 
         tcg_temp_free_i64(rs2);
     }
-
+    /* sign-extend the result instead of nanboxing it if zfinx is enabled */
+    if (ctx->cfg_ptr->ext_zfinx) {
+        tcg_gen_ext16s_i64(dest, dest);
+    }
+    tcg_temp_free_i64(rs1);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -267,10 +369,14 @@ static bool trans_fsgnjx_h(DisasContext *ctx, arg_fsgnjx_h *a)
 static bool trans_fmin_h(DisasContext *ctx, arg_fmin_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
-    gen_helper_fmin_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                      cpu_fpr[a->rs2]);
+    gen_helper_fmin_h(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -278,10 +384,14 @@ static bool trans_fmin_h(DisasContext *ctx, arg_fmin_h *a)
 static bool trans_fmax_h(DisasContext *ctx, arg_fmax_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
-    gen_helper_fmax_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                      cpu_fpr[a->rs2]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
+
+    gen_helper_fmax_h(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -289,10 +399,14 @@ static bool trans_fmax_h(DisasContext *ctx, arg_fmax_h *a)
 static bool trans_fcvt_s_h(DisasContext *ctx, arg_fcvt_s_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH_OR_ZFHMIN(ctx);
+    REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_s_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_s_h(dest, cpu_env, src1);
+    gen_set_fpr_hs(ctx, a->rd, dest);
 
     mark_fs_dirty(ctx);
 
@@ -302,26 +416,32 @@ static bool trans_fcvt_s_h(DisasContext *ctx, arg_fcvt_s_h *a)
 static bool trans_fcvt_d_h(DisasContext *ctx, arg_fcvt_d_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH_OR_ZFHMIN(ctx);
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN(ctx);
+    REQUIRE_ZDINX_OR_D(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_d_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_d_h(dest, cpu_env, src1);
+    gen_set_fpr_d(ctx, a->rd, dest);
 
     mark_fs_dirty(ctx);
 
-
     return true;
 }
 
 static bool trans_fcvt_h_s(DisasContext *ctx, arg_fcvt_h_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH_OR_ZFHMIN(ctx);
+    REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN(ctx);
 
-    gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_h_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
+    gen_set_rm(ctx, a->rm);
+    gen_helper_fcvt_h_s(dest, cpu_env, src1);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
 
     return true;
@@ -330,12 +450,15 @@ static bool trans_fcvt_h_s(DisasContext *ctx, arg_fcvt_h_s *a)
 static bool trans_fcvt_h_d(DisasContext *ctx, arg_fcvt_h_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH_OR_ZFHMIN(ctx);
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN(ctx);
+    REQUIRE_ZDINX_OR_D(ctx);
 
-    gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_h_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
 
+    gen_set_rm(ctx, a->rm);
+    gen_helper_fcvt_h_d(dest, cpu_env, src1);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
 
     return true;
@@ -344,11 +467,13 @@ static bool trans_fcvt_h_d(DisasContext *ctx, arg_fcvt_h_d *a)
 static bool trans_feq_h(DisasContext *ctx, arg_feq_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
-    gen_helper_feq_h(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_feq_h(dest, cpu_env, src1, src2);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -356,11 +481,13 @@ static bool trans_feq_h(DisasContext *ctx, arg_feq_h *a)
 static bool trans_flt_h(DisasContext *ctx, arg_flt_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
-    gen_helper_flt_h(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_flt_h(dest, cpu_env, src1, src2);
     gen_set_gpr(ctx, a->rd, dest);
 
     return true;
@@ -369,11 +496,13 @@ static bool trans_flt_h(DisasContext *ctx, arg_flt_h *a)
 static bool trans_fle_h(DisasContext *ctx, arg_fle_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
-    gen_helper_fle_h(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_fle_h(dest, cpu_env, src1, src2);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -381,11 +510,12 @@ static bool trans_fle_h(DisasContext *ctx, arg_fle_h *a)
 static bool trans_fclass_h(DisasContext *ctx, arg_fclass_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
-    gen_helper_fclass_h(dest, cpu_fpr[a->rs1]);
+    gen_helper_fclass_h(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -393,12 +523,13 @@ static bool trans_fclass_h(DisasContext *ctx, arg_fclass_h *a)
 static bool trans_fcvt_w_h(DisasContext *ctx, arg_fcvt_w_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_w_h(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_w_h(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -406,12 +537,13 @@ static bool trans_fcvt_w_h(DisasContext *ctx, arg_fcvt_w_h *a)
 static bool trans_fcvt_wu_h(DisasContext *ctx, arg_fcvt_wu_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_wu_h(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_wu_h(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -419,12 +551,14 @@ static bool trans_fcvt_wu_h(DisasContext *ctx, arg_fcvt_wu_h *a)
 static bool trans_fcvt_h_w(DisasContext *ctx, arg_fcvt_h_w *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv t0 = get_gpr(ctx, a->rs1, EXT_SIGN);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_h_w(cpu_fpr[a->rd], cpu_env, t0);
+    gen_helper_fcvt_h_w(dest, cpu_env, t0);
+    gen_set_fpr_hs(ctx, a->rd, dest);
 
     mark_fs_dirty(ctx);
     return true;
@@ -433,12 +567,14 @@ static bool trans_fcvt_h_w(DisasContext *ctx, arg_fcvt_h_w *a)
 static bool trans_fcvt_h_wu(DisasContext *ctx, arg_fcvt_h_wu *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv t0 = get_gpr(ctx, a->rs1, EXT_SIGN);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_h_wu(cpu_fpr[a->rd], cpu_env, t0);
+    gen_helper_fcvt_h_wu(dest, cpu_env, t0);
+    gen_set_fpr_hs(ctx, a->rd, dest);
 
     mark_fs_dirty(ctx);
     return true;
@@ -482,12 +618,13 @@ static bool trans_fcvt_l_h(DisasContext *ctx, arg_fcvt_l_h *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_l_h(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_l_h(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -496,12 +633,13 @@ static bool trans_fcvt_lu_h(DisasContext *ctx, arg_fcvt_lu_h *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_lu_h(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_lu_h(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -510,12 +648,14 @@ static bool trans_fcvt_h_l(DisasContext *ctx, arg_fcvt_h_l *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv t0 = get_gpr(ctx, a->rs1, EXT_SIGN);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_h_l(cpu_fpr[a->rd], cpu_env, t0);
+    gen_helper_fcvt_h_l(dest, cpu_env, t0);
+    gen_set_fpr_hs(ctx, a->rd, dest);
 
     mark_fs_dirty(ctx);
     return true;
@@ -525,12 +665,14 @@ static bool trans_fcvt_h_lu(DisasContext *ctx, arg_fcvt_h_lu *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv t0 = get_gpr(ctx, a->rs1, EXT_SIGN);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_h_lu(cpu_fpr[a->rd], cpu_env, t0);
+    gen_helper_fcvt_h_lu(dest, cpu_env, t0);
+    gen_set_fpr_hs(ctx, a->rd, dest);
 
     mark_fs_dirty(ctx);
     return true;
diff --git a/target/riscv/internals.h b/target/riscv/internals.h
index 6237bb3115..dbb322bfa7 100644
--- a/target/riscv/internals.h
+++ b/target/riscv/internals.h
@@ -72,13 +72,23 @@ static inline float32 check_nanbox_s(CPURISCVState *env, uint64_t f)
     }
 }
 
-static inline uint64_t nanbox_h(float16 f)
+static inline uint64_t nanbox_h(CPURISCVState *env, float16 f)
 {
-    return f | MAKE_64BIT_MASK(16, 48);
+    /* for zfinx, the value is sign-extended instead of NaN-boxed */
+    if (RISCV_CPU(env_cpu(env))->cfg.ext_zfinx) {
+        return (int16_t)f;
+    } else {
+        return f | MAKE_64BIT_MASK(16, 48);
+    }
 }
 
-static inline float16 check_nanbox_h(uint64_t f)
+static inline float16 check_nanbox_h(CPURISCVState *env, uint64_t f)
 {
+    /* Skip the NaN-box check when zfinx is enabled */
+    if (RISCV_CPU(env_cpu(env))->cfg.ext_zfinx) {
+        return (uint16_t)f;
+    }
+
     uint64_t mask = MAKE_64BIT_MASK(16, 48);
 
     if (likely((f & mask) == mask)) {
-- 
2.17.1




* [PATCH v6 5/6] target/riscv: add support for zhinx/zhinxmin
@ 2022-02-11  4:39   ` Weiwei Li
  0 siblings, 0 replies; 22+ messages in thread
From: Weiwei Li @ 2022-02-11  4:39 UTC (permalink / raw)
  To: richard.henderson, palmer, alistair.francis, bin.meng,
	qemu-riscv, qemu-devel
  Cc: wangjunqiang, lazyparser, ardxwe, Weiwei Li

  - update extension checks to REQUIRE_ZHINX_OR_ZFH and REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN
  - update half-precision floating-point register read/write
  - disable the nanbox_h check when zfinx is enabled

Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/riscv/fpu_helper.c                 |  89 +++---
 target/riscv/helper.h                     |   2 +-
 target/riscv/insn_trans/trans_rvzfh.c.inc | 332 +++++++++++++++-------
 target/riscv/internals.h                  |  16 +-
 4 files changed, 296 insertions(+), 143 deletions(-)
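
Note for readers: the behavioural core of this patch is how a 16-bit
result is widened to 64 bits. With Zfh the value is NaN-boxed into an FPR;
with zfinx it is sign-extended into a GPR and the box check is skipped.
The standalone sketch below only illustrates that rule. The names box_h,
unbox_h and fsgnj_h_zfinx are hypothetical stand-ins for nanbox_h,
check_nanbox_h and the zfinx branch of trans_fsgnj_h, the bool parameter
replaces the cfg.ext_zfinx lookup, and the 0x7e00 fallback is assumed from
the RISC-V canonical-NaN rule because the hunk does not show that branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bits 16..63 all set: the NaN-box pattern around a 16-bit value. */
#define H_BOX_MASK 0xffffffffffff0000ULL
/* Canonical half-precision quiet NaN (assumed fallback, see note above). */
#define H_DEFAULT_NAN 0x7e00u

/* Widen a half-precision bit pattern to a 64-bit register value. */
static uint64_t box_h(bool zfinx, uint16_t f)
{
    if (zfinx) {
        /* zfinx: value lives in a GPR, so sign-extend from bit 15 */
        return (uint64_t)(int64_t)(int16_t)f;
    }
    /* Zfh: value lives in an FPR, so set all upper bits */
    return (uint64_t)f | H_BOX_MASK;
}

/* Narrow a 64-bit register value back to a half-precision bit pattern. */
static uint16_t unbox_h(bool zfinx, uint64_t r)
{
    if (zfinx) {
        return (uint16_t)r;          /* upper bits are ignored */
    }
    if ((r & H_BOX_MASK) == H_BOX_MASK) {
        return (uint16_t)r;          /* properly boxed */
    }
    return H_DEFAULT_NAN;            /* improperly boxed input */
}

/*
 * zfinx sign injection: magnitude (bits 0..14) from rs1, everything
 * else from rs2, then sign-extend from bit 15. This mirrors a 15-bit
 * deposit at offset 0 followed by a 16-bit sign extension.
 */
static uint64_t fsgnj_h_zfinx(uint64_t rs1, uint64_t rs2)
{
    uint64_t r = (rs2 & ~0x7fffULL) | (rs1 & 0x7fffULL);
    return (uint64_t)(int64_t)(int16_t)r;
}

/* Usage examples: 0x3c00 is 1.0 and 0xbc00 is -1.0 in half precision. */
static void box_h_examples(void)
{
    assert(box_h(false, 0x3c00) == 0xffffffffffff3c00ULL);
    assert(box_h(true, 0x3c00) == 0x0000000000003c00ULL);
    assert(box_h(true, 0xbc00) == 0xffffffffffffbc00ULL);
    assert(unbox_h(false, 0x3c00) == H_DEFAULT_NAN);
    assert(unbox_h(true, 0x3c00) == 0x3c00);
    assert(fsgnj_h_zfinx(0x3c00, box_h(true, 0xbc00)) == box_h(true, 0xbc00));
}
```

The same bit pattern 0x3c00 therefore reads back as 1.0 under zfinx but as
the default NaN under Zfh, which is why the helpers now take env.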

diff --git a/target/riscv/fpu_helper.c b/target/riscv/fpu_helper.c
index 63ca703459..5699c9517f 100644
--- a/target/riscv/fpu_helper.c
+++ b/target/riscv/fpu_helper.c
@@ -89,10 +89,11 @@ void helper_set_rod_rounding_mode(CPURISCVState *env)
 static uint64_t do_fmadd_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2,
                            uint64_t rs3, int flags)
 {
-    float16 frs1 = check_nanbox_h(rs1);
-    float16 frs2 = check_nanbox_h(rs2);
-    float16 frs3 = check_nanbox_h(rs3);
-    return nanbox_h(float16_muladd(frs1, frs2, frs3, flags, &env->fp_status));
+    float16 frs1 = check_nanbox_h(env, rs1);
+    float16 frs2 = check_nanbox_h(env, rs2);
+    float16 frs3 = check_nanbox_h(env, rs3);
+    return nanbox_h(env, float16_muladd(frs1, frs2, frs3, flags,
+                                        &env->fp_status));
 }
 
 static uint64_t do_fmadd_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2,
@@ -417,146 +418,146 @@ target_ulong helper_fclass_d(uint64_t frs1)
 
 uint64_t helper_fadd_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float16 frs1 = check_nanbox_h(rs1);
-    float16 frs2 = check_nanbox_h(rs2);
-    return nanbox_h(float16_add(frs1, frs2, &env->fp_status));
+    float16 frs1 = check_nanbox_h(env, rs1);
+    float16 frs2 = check_nanbox_h(env, rs2);
+    return nanbox_h(env, float16_add(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fsub_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float16 frs1 = check_nanbox_h(rs1);
-    float16 frs2 = check_nanbox_h(rs2);
-    return nanbox_h(float16_sub(frs1, frs2, &env->fp_status));
+    float16 frs1 = check_nanbox_h(env, rs1);
+    float16 frs2 = check_nanbox_h(env, rs2);
+    return nanbox_h(env, float16_sub(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fmul_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float16 frs1 = check_nanbox_h(rs1);
-    float16 frs2 = check_nanbox_h(rs2);
-    return nanbox_h(float16_mul(frs1, frs2, &env->fp_status));
+    float16 frs1 = check_nanbox_h(env, rs1);
+    float16 frs2 = check_nanbox_h(env, rs2);
+    return nanbox_h(env, float16_mul(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fdiv_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float16 frs1 = check_nanbox_h(rs1);
-    float16 frs2 = check_nanbox_h(rs2);
-    return nanbox_h(float16_div(frs1, frs2, &env->fp_status));
+    float16 frs1 = check_nanbox_h(env, rs1);
+    float16 frs2 = check_nanbox_h(env, rs2);
+    return nanbox_h(env, float16_div(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fmin_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float16 frs1 = check_nanbox_h(rs1);
-    float16 frs2 = check_nanbox_h(rs2);
-    return nanbox_h(env->priv_ver < PRIV_VERSION_1_11_0 ?
+    float16 frs1 = check_nanbox_h(env, rs1);
+    float16 frs2 = check_nanbox_h(env, rs2);
+    return nanbox_h(env, env->priv_ver < PRIV_VERSION_1_11_0 ?
                     float16_minnum(frs1, frs2, &env->fp_status) :
                     float16_minimum_number(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fmax_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float16 frs1 = check_nanbox_h(rs1);
-    float16 frs2 = check_nanbox_h(rs2);
-    return nanbox_h(env->priv_ver < PRIV_VERSION_1_11_0 ?
+    float16 frs1 = check_nanbox_h(env, rs1);
+    float16 frs2 = check_nanbox_h(env, rs2);
+    return nanbox_h(env, env->priv_ver < PRIV_VERSION_1_11_0 ?
                     float16_maxnum(frs1, frs2, &env->fp_status) :
                     float16_maximum_number(frs1, frs2, &env->fp_status));
 }
 
 uint64_t helper_fsqrt_h(CPURISCVState *env, uint64_t rs1)
 {
-    float16 frs1 = check_nanbox_h(rs1);
-    return nanbox_h(float16_sqrt(frs1, &env->fp_status));
+    float16 frs1 = check_nanbox_h(env, rs1);
+    return nanbox_h(env, float16_sqrt(frs1, &env->fp_status));
 }
 
 target_ulong helper_fle_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float16 frs1 = check_nanbox_h(rs1);
-    float16 frs2 = check_nanbox_h(rs2);
+    float16 frs1 = check_nanbox_h(env, rs1);
+    float16 frs2 = check_nanbox_h(env, rs2);
     return float16_le(frs1, frs2, &env->fp_status);
 }
 
 target_ulong helper_flt_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float16 frs1 = check_nanbox_h(rs1);
-    float16 frs2 = check_nanbox_h(rs2);
+    float16 frs1 = check_nanbox_h(env, rs1);
+    float16 frs2 = check_nanbox_h(env, rs2);
     return float16_lt(frs1, frs2, &env->fp_status);
 }
 
 target_ulong helper_feq_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
 {
-    float16 frs1 = check_nanbox_h(rs1);
-    float16 frs2 = check_nanbox_h(rs2);
+    float16 frs1 = check_nanbox_h(env, rs1);
+    float16 frs2 = check_nanbox_h(env, rs2);
     return float16_eq_quiet(frs1, frs2, &env->fp_status);
 }
 
-target_ulong helper_fclass_h(uint64_t rs1)
+target_ulong helper_fclass_h(CPURISCVState *env, uint64_t rs1)
 {
-    float16 frs1 = check_nanbox_h(rs1);
+    float16 frs1 = check_nanbox_h(env, rs1);
     return fclass_h(frs1);
 }
 
 target_ulong helper_fcvt_w_h(CPURISCVState *env, uint64_t rs1)
 {
-    float16 frs1 = check_nanbox_h(rs1);
+    float16 frs1 = check_nanbox_h(env, rs1);
     return float16_to_int32(frs1, &env->fp_status);
 }
 
 target_ulong helper_fcvt_wu_h(CPURISCVState *env, uint64_t rs1)
 {
-    float16 frs1 = check_nanbox_h(rs1);
+    float16 frs1 = check_nanbox_h(env, rs1);
     return (int32_t)float16_to_uint32(frs1, &env->fp_status);
 }
 
 target_ulong helper_fcvt_l_h(CPURISCVState *env, uint64_t rs1)
 {
-    float16 frs1 = check_nanbox_h(rs1);
+    float16 frs1 = check_nanbox_h(env, rs1);
     return float16_to_int64(frs1, &env->fp_status);
 }
 
 target_ulong helper_fcvt_lu_h(CPURISCVState *env, uint64_t rs1)
 {
-    float16 frs1 = check_nanbox_h(rs1);
+    float16 frs1 = check_nanbox_h(env, rs1);
     return float16_to_uint64(frs1, &env->fp_status);
 }
 
 uint64_t helper_fcvt_h_w(CPURISCVState *env, target_ulong rs1)
 {
-    return nanbox_h(int32_to_float16((int32_t)rs1, &env->fp_status));
+    return nanbox_h(env, int32_to_float16((int32_t)rs1, &env->fp_status));
 }
 
 uint64_t helper_fcvt_h_wu(CPURISCVState *env, target_ulong rs1)
 {
-    return nanbox_h(uint32_to_float16((uint32_t)rs1, &env->fp_status));
+    return nanbox_h(env, uint32_to_float16((uint32_t)rs1, &env->fp_status));
 }
 
 uint64_t helper_fcvt_h_l(CPURISCVState *env, target_ulong rs1)
 {
-    return nanbox_h(int64_to_float16(rs1, &env->fp_status));
+    return nanbox_h(env, int64_to_float16(rs1, &env->fp_status));
 }
 
 uint64_t helper_fcvt_h_lu(CPURISCVState *env, target_ulong rs1)
 {
-    return nanbox_h(uint64_to_float16(rs1, &env->fp_status));
+    return nanbox_h(env, uint64_to_float16(rs1, &env->fp_status));
 }
 
 uint64_t helper_fcvt_h_s(CPURISCVState *env, uint64_t rs1)
 {
     float32 frs1 = check_nanbox_s(env, rs1);
-    return nanbox_h(float32_to_float16(frs1, true, &env->fp_status));
+    return nanbox_h(env, float32_to_float16(frs1, true, &env->fp_status));
 }
 
 uint64_t helper_fcvt_s_h(CPURISCVState *env, uint64_t rs1)
 {
-    float16 frs1 = check_nanbox_h(rs1);
+    float16 frs1 = check_nanbox_h(env, rs1);
     return nanbox_s(env, float16_to_float32(frs1, true, &env->fp_status));
 }
 
 uint64_t helper_fcvt_h_d(CPURISCVState *env, uint64_t rs1)
 {
-    return nanbox_h(float64_to_float16(rs1, true, &env->fp_status));
+    return nanbox_h(env, float64_to_float16(rs1, true, &env->fp_status));
 }
 
 uint64_t helper_fcvt_d_h(CPURISCVState *env, uint64_t rs1)
 {
-    float16 frs1 = check_nanbox_h(rs1);
+    float16 frs1 = check_nanbox_h(env, rs1);
     return float16_to_float64(frs1, true, &env->fp_status);
 }
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 89195aad9d..26bbab2fab 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -90,7 +90,7 @@ DEF_HELPER_FLAGS_2(fcvt_h_w, TCG_CALL_NO_RWG, i64, env, tl)
 DEF_HELPER_FLAGS_2(fcvt_h_wu, TCG_CALL_NO_RWG, i64, env, tl)
 DEF_HELPER_FLAGS_2(fcvt_h_l, TCG_CALL_NO_RWG, i64, env, tl)
 DEF_HELPER_FLAGS_2(fcvt_h_lu, TCG_CALL_NO_RWG, i64, env, tl)
-DEF_HELPER_FLAGS_1(fclass_h, TCG_CALL_NO_RWG_SE, tl, i64)
+DEF_HELPER_FLAGS_2(fclass_h, TCG_CALL_NO_RWG_SE, tl, env, i64)
 
 /* Special functions */
 DEF_HELPER_2(csrr, tl, env, int)
diff --git a/target/riscv/insn_trans/trans_rvzfh.c.inc b/target/riscv/insn_trans/trans_rvzfh.c.inc
index 608c51da2c..5d07150cd0 100644
--- a/target/riscv/insn_trans/trans_rvzfh.c.inc
+++ b/target/riscv/insn_trans/trans_rvzfh.c.inc
@@ -22,12 +22,25 @@
     }                         \
 } while (0)
 
+#define REQUIRE_ZHINX_OR_ZFH(ctx) do { \
+    if (!ctx->cfg_ptr->ext_zhinx && !ctx->cfg_ptr->ext_zfh) { \
+        return false;                  \
+    }                                  \
+} while (0)
+
 #define REQUIRE_ZFH_OR_ZFHMIN(ctx) do {       \
     if (!(ctx->cfg_ptr->ext_zfh || ctx->cfg_ptr->ext_zfhmin)) { \
         return false;                         \
     }                                         \
 } while (0)
 
+#define REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN(ctx) do { \
+    if (!(ctx->cfg_ptr->ext_zfh || ctx->cfg_ptr->ext_zfhmin ||          \
+          ctx->cfg_ptr->ext_zhinx || ctx->cfg_ptr->ext_zhinxmin)) {     \
+        return false;                                        \
+    }                                                        \
+} while (0)
+
 static bool trans_flh(DisasContext *ctx, arg_flh *a)
 {
     TCGv_i64 dest;
@@ -73,11 +86,16 @@ static bool trans_fsh(DisasContext *ctx, arg_fsh *a)
 static bool trans_fmadd_h(DisasContext *ctx, arg_fmadd_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fmadd_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fmadd_h(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -85,11 +103,16 @@ static bool trans_fmadd_h(DisasContext *ctx, arg_fmadd_h *a)
 static bool trans_fmsub_h(DisasContext *ctx, arg_fmsub_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fmsub_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fmsub_h(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -97,11 +120,16 @@ static bool trans_fmsub_h(DisasContext *ctx, arg_fmsub_h *a)
 static bool trans_fnmsub_h(DisasContext *ctx, arg_fnmsub_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fnmsub_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fnmsub_h(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -109,11 +137,16 @@ static bool trans_fnmsub_h(DisasContext *ctx, arg_fnmsub_h *a)
 static bool trans_fnmadd_h(DisasContext *ctx, arg_fnmadd_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
+    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fnmadd_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
+    gen_helper_fnmadd_h(dest, cpu_env, src1, src2, src3);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -121,11 +154,15 @@ static bool trans_fnmadd_h(DisasContext *ctx, arg_fnmadd_h *a)
 static bool trans_fadd_h(DisasContext *ctx, arg_fadd_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fadd_h(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_fadd_h(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -133,11 +170,15 @@ static bool trans_fadd_h(DisasContext *ctx, arg_fadd_h *a)
 static bool trans_fsub_h(DisasContext *ctx, arg_fsub_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fsub_h(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_fsub_h(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -145,11 +186,15 @@ static bool trans_fsub_h(DisasContext *ctx, arg_fsub_h *a)
 static bool trans_fmul_h(DisasContext *ctx, arg_fmul_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fmul_h(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_fmul_h(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -157,11 +202,15 @@ static bool trans_fmul_h(DisasContext *ctx, arg_fmul_h *a)
 static bool trans_fdiv_h(DisasContext *ctx, arg_fdiv_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fdiv_h(cpu_fpr[a->rd], cpu_env,
-                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_fdiv_h(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -169,10 +218,14 @@ static bool trans_fdiv_h(DisasContext *ctx, arg_fdiv_h *a)
 static bool trans_fsqrt_h(DisasContext *ctx, arg_fsqrt_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fsqrt_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fsqrt_h(dest, cpu_env, src1);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -180,23 +233,37 @@ static bool trans_fsqrt_h(DisasContext *ctx, arg_fsqrt_h *a)
 static bool trans_fsgnj_h(DisasContext *ctx, arg_fsgnj_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     if (a->rs1 == a->rs2) { /* FMOV */
-        gen_check_nanbox_h(cpu_fpr[a->rd], cpu_fpr[a->rs1]);
+        if (!ctx->cfg_ptr->ext_zfinx) {
+            gen_check_nanbox_h(dest, src1);
+        } else {
+            tcg_gen_ext16s_i64(dest, src1);
+        }
     } else {
-        TCGv_i64 rs1 = tcg_temp_new_i64();
-        TCGv_i64 rs2 = tcg_temp_new_i64();
-
-        gen_check_nanbox_h(rs1, cpu_fpr[a->rs1]);
-        gen_check_nanbox_h(rs2, cpu_fpr[a->rs2]);
-
-        /* This formulation retains the nanboxing of rs2. */
-        tcg_gen_deposit_i64(cpu_fpr[a->rd], rs2, rs1, 0, 15);
-        tcg_temp_free_i64(rs1);
-        tcg_temp_free_i64(rs2);
+        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
+
+        if (!ctx->cfg_ptr->ext_zfinx) {
+            TCGv_i64 rs1 = tcg_temp_new_i64();
+            TCGv_i64 rs2 = tcg_temp_new_i64();
+            gen_check_nanbox_h(rs1, src1);
+            gen_check_nanbox_h(rs2, src2);
+
+            /* This formulation retains the nanboxing of rs2 in normal 'Zfh'. */
+            tcg_gen_deposit_i64(dest, rs2, rs1, 0, 15);
+
+            tcg_temp_free_i64(rs1);
+            tcg_temp_free_i64(rs2);
+        } else {
+            tcg_gen_deposit_i64(dest, src2, src1, 0, 15);
+            tcg_gen_ext16s_i64(dest, dest);
+        }
     }
-
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -206,16 +273,29 @@ static bool trans_fsgnjn_h(DisasContext *ctx, arg_fsgnjn_h *a)
     TCGv_i64 rs1, rs2, mask;
 
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     rs1 = tcg_temp_new_i64();
-    gen_check_nanbox_h(rs1, cpu_fpr[a->rs1]);
+    if (!ctx->cfg_ptr->ext_zfinx) {
+        gen_check_nanbox_h(rs1, src1);
+    } else {
+        tcg_gen_mov_i64(rs1, src1);
+    }
 
     if (a->rs1 == a->rs2) { /* FNEG */
-        tcg_gen_xori_i64(cpu_fpr[a->rd], rs1, MAKE_64BIT_MASK(15, 1));
+        tcg_gen_xori_i64(dest, rs1, MAKE_64BIT_MASK(15, 1));
     } else {
+        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
         rs2 = tcg_temp_new_i64();
-        gen_check_nanbox_h(rs2, cpu_fpr[a->rs2]);
+
+        if (!ctx->cfg_ptr->ext_zfinx) {
+            gen_check_nanbox_h(rs2, src2);
+        } else {
+            tcg_gen_mov_i64(rs2, src2);
+        }
 
         /*
          * Replace bit 15 in rs1 with inverse in rs2.
@@ -224,12 +304,17 @@ static bool trans_fsgnjn_h(DisasContext *ctx, arg_fsgnjn_h *a)
         mask = tcg_const_i64(~MAKE_64BIT_MASK(15, 1));
         tcg_gen_not_i64(rs2, rs2);
         tcg_gen_andc_i64(rs2, rs2, mask);
-        tcg_gen_and_i64(rs1, mask, rs1);
-        tcg_gen_or_i64(cpu_fpr[a->rd], rs1, rs2);
+        tcg_gen_and_i64(dest, mask, rs1);
+        tcg_gen_or_i64(dest, dest, rs2);
 
         tcg_temp_free_i64(mask);
         tcg_temp_free_i64(rs2);
     }
+    /* sign-extend instead of NaN-boxing the result when zfinx is enabled */
+    if (ctx->cfg_ptr->ext_zfinx) {
+        tcg_gen_ext16s_i64(dest, dest);
+    }
+    tcg_temp_free_i64(rs1);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -239,27 +324,44 @@ static bool trans_fsgnjx_h(DisasContext *ctx, arg_fsgnjx_h *a)
     TCGv_i64 rs1, rs2;
 
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     rs1 = tcg_temp_new_i64();
-    gen_check_nanbox_s(rs1, cpu_fpr[a->rs1]);
+    if (!ctx->cfg_ptr->ext_zfinx) {
+        gen_check_nanbox_h(rs1, src1);
+    } else {
+        tcg_gen_mov_i64(rs1, src1);
+    }
 
     if (a->rs1 == a->rs2) { /* FABS */
-        tcg_gen_andi_i64(cpu_fpr[a->rd], rs1, ~MAKE_64BIT_MASK(15, 1));
+        tcg_gen_andi_i64(dest, rs1, ~MAKE_64BIT_MASK(15, 1));
     } else {
+        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
         rs2 = tcg_temp_new_i64();
-        gen_check_nanbox_s(rs2, cpu_fpr[a->rs2]);
+
+        if (!ctx->cfg_ptr->ext_zfinx) {
+            gen_check_nanbox_h(rs2, src2);
+        } else {
+            tcg_gen_mov_i64(rs2, src2);
+        }
 
         /*
          * Xor bit 15 in rs1 with that in rs2.
          * This formulation retains the nanboxing of rs1.
          */
-        tcg_gen_andi_i64(rs2, rs2, MAKE_64BIT_MASK(15, 1));
-        tcg_gen_xor_i64(cpu_fpr[a->rd], rs1, rs2);
+        tcg_gen_andi_i64(dest, rs2, MAKE_64BIT_MASK(15, 1));
+        tcg_gen_xor_i64(dest, rs1, dest);
 
         tcg_temp_free_i64(rs2);
     }
-
+    /* sign-extend instead of NaN-boxing the result when zfinx is enabled */
+    if (ctx->cfg_ptr->ext_zfinx) {
+        tcg_gen_ext16s_i64(dest, dest);
+    }
+    tcg_temp_free_i64(rs1);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -267,10 +369,14 @@ static bool trans_fsgnjx_h(DisasContext *ctx, arg_fsgnjx_h *a)
 static bool trans_fmin_h(DisasContext *ctx, arg_fmin_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
-    gen_helper_fmin_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                      cpu_fpr[a->rs2]);
+    gen_helper_fmin_h(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -278,10 +384,14 @@ static bool trans_fmin_h(DisasContext *ctx, arg_fmin_h *a)
 static bool trans_fmax_h(DisasContext *ctx, arg_fmax_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
-    gen_helper_fmax_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
-                      cpu_fpr[a->rs2]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
+
+    gen_helper_fmax_h(dest, cpu_env, src1, src2);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
     return true;
 }
@@ -289,10 +399,14 @@ static bool trans_fmax_h(DisasContext *ctx, arg_fmax_h *a)
 static bool trans_fcvt_s_h(DisasContext *ctx, arg_fcvt_s_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH_OR_ZFHMIN(ctx);
+    REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_s_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_s_h(dest, cpu_env, src1);
+    gen_set_fpr_hs(ctx, a->rd, dest);
 
     mark_fs_dirty(ctx);
 
@@ -302,26 +416,32 @@ static bool trans_fcvt_s_h(DisasContext *ctx, arg_fcvt_s_h *a)
 static bool trans_fcvt_d_h(DisasContext *ctx, arg_fcvt_d_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH_OR_ZFHMIN(ctx);
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN(ctx);
+    REQUIRE_ZDINX_OR_D(ctx);
+
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_d_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_d_h(dest, cpu_env, src1);
+    gen_set_fpr_d(ctx, a->rd, dest);
 
     mark_fs_dirty(ctx);
 
-
     return true;
 }
 
 static bool trans_fcvt_h_s(DisasContext *ctx, arg_fcvt_h_s *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH_OR_ZFHMIN(ctx);
+    REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN(ctx);
 
-    gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_h_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
+    gen_set_rm(ctx, a->rm);
+    gen_helper_fcvt_h_s(dest, cpu_env, src1);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
 
     return true;
@@ -330,12 +450,15 @@ static bool trans_fcvt_h_s(DisasContext *ctx, arg_fcvt_h_s *a)
 static bool trans_fcvt_h_d(DisasContext *ctx, arg_fcvt_h_d *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH_OR_ZFHMIN(ctx);
-    REQUIRE_EXT(ctx, RVD);
+    REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN(ctx);
+    REQUIRE_ZDINX_OR_D(ctx);
 
-    gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_h_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
 
+    gen_set_rm(ctx, a->rm);
+    gen_helper_fcvt_h_d(dest, cpu_env, src1);
+    gen_set_fpr_hs(ctx, a->rd, dest);
     mark_fs_dirty(ctx);
 
     return true;
@@ -344,11 +467,13 @@ static bool trans_fcvt_h_d(DisasContext *ctx, arg_fcvt_h_d *a)
 static bool trans_feq_h(DisasContext *ctx, arg_feq_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
-    gen_helper_feq_h(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_feq_h(dest, cpu_env, src1, src2);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -356,11 +481,13 @@ static bool trans_feq_h(DisasContext *ctx, arg_feq_h *a)
 static bool trans_flt_h(DisasContext *ctx, arg_flt_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
-    gen_helper_flt_h(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_flt_h(dest, cpu_env, src1, src2);
     gen_set_gpr(ctx, a->rd, dest);
 
     return true;
@@ -369,11 +496,13 @@ static bool trans_flt_h(DisasContext *ctx, arg_flt_h *a)
 static bool trans_fle_h(DisasContext *ctx, arg_fle_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
+    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
 
-    gen_helper_fle_h(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
+    gen_helper_fle_h(dest, cpu_env, src1, src2);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -381,11 +510,12 @@ static bool trans_fle_h(DisasContext *ctx, arg_fle_h *a)
 static bool trans_fclass_h(DisasContext *ctx, arg_fclass_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
-    gen_helper_fclass_h(dest, cpu_fpr[a->rs1]);
+    gen_helper_fclass_h(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -393,12 +523,13 @@ static bool trans_fclass_h(DisasContext *ctx, arg_fclass_h *a)
 static bool trans_fcvt_w_h(DisasContext *ctx, arg_fcvt_w_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_w_h(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_w_h(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -406,12 +537,13 @@ static bool trans_fcvt_w_h(DisasContext *ctx, arg_fcvt_w_h *a)
 static bool trans_fcvt_wu_h(DisasContext *ctx, arg_fcvt_wu_h *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_wu_h(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_wu_h(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -419,12 +551,14 @@ static bool trans_fcvt_wu_h(DisasContext *ctx, arg_fcvt_wu_h *a)
 static bool trans_fcvt_h_w(DisasContext *ctx, arg_fcvt_h_w *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv t0 = get_gpr(ctx, a->rs1, EXT_SIGN);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_h_w(cpu_fpr[a->rd], cpu_env, t0);
+    gen_helper_fcvt_h_w(dest, cpu_env, t0);
+    gen_set_fpr_hs(ctx, a->rd, dest);
 
     mark_fs_dirty(ctx);
     return true;
@@ -433,12 +567,14 @@ static bool trans_fcvt_h_w(DisasContext *ctx, arg_fcvt_h_w *a)
 static bool trans_fcvt_h_wu(DisasContext *ctx, arg_fcvt_h_wu *a)
 {
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv t0 = get_gpr(ctx, a->rs1, EXT_SIGN);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_h_wu(cpu_fpr[a->rd], cpu_env, t0);
+    gen_helper_fcvt_h_wu(dest, cpu_env, t0);
+    gen_set_fpr_hs(ctx, a->rd, dest);
 
     mark_fs_dirty(ctx);
     return true;
@@ -482,12 +618,13 @@ static bool trans_fcvt_l_h(DisasContext *ctx, arg_fcvt_l_h *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_l_h(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_l_h(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -496,12 +633,13 @@ static bool trans_fcvt_lu_h(DisasContext *ctx, arg_fcvt_lu_h *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
     TCGv dest = dest_gpr(ctx, a->rd);
+    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_lu_h(dest, cpu_env, cpu_fpr[a->rs1]);
+    gen_helper_fcvt_lu_h(dest, cpu_env, src1);
     gen_set_gpr(ctx, a->rd, dest);
     return true;
 }
@@ -510,12 +648,14 @@ static bool trans_fcvt_h_l(DisasContext *ctx, arg_fcvt_h_l *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv t0 = get_gpr(ctx, a->rs1, EXT_SIGN);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_h_l(cpu_fpr[a->rd], cpu_env, t0);
+    gen_helper_fcvt_h_l(dest, cpu_env, t0);
+    gen_set_fpr_hs(ctx, a->rd, dest);
 
     mark_fs_dirty(ctx);
     return true;
@@ -525,12 +665,14 @@ static bool trans_fcvt_h_lu(DisasContext *ctx, arg_fcvt_h_lu *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_FPU;
-    REQUIRE_ZFH(ctx);
+    REQUIRE_ZHINX_OR_ZFH(ctx);
 
+    TCGv_i64 dest = dest_fpr(ctx, a->rd);
     TCGv t0 = get_gpr(ctx, a->rs1, EXT_SIGN);
 
     gen_set_rm(ctx, a->rm);
-    gen_helper_fcvt_h_lu(cpu_fpr[a->rd], cpu_env, t0);
+    gen_helper_fcvt_h_lu(dest, cpu_env, t0);
+    gen_set_fpr_hs(ctx, a->rd, dest);
 
     mark_fs_dirty(ctx);
     return true;
diff --git a/target/riscv/internals.h b/target/riscv/internals.h
index 6237bb3115..dbb322bfa7 100644
--- a/target/riscv/internals.h
+++ b/target/riscv/internals.h
@@ -72,13 +72,23 @@ static inline float32 check_nanbox_s(CPURISCVState *env, uint64_t f)
     }
 }
 
-static inline uint64_t nanbox_h(float16 f)
+static inline uint64_t nanbox_h(CPURISCVState *env, float16 f)
 {
-    return f | MAKE_64BIT_MASK(16, 48);
+    /* With zfinx the value is sign-extended rather than NaN-boxed. */
+    if (RISCV_CPU(env_cpu(env))->cfg.ext_zfinx) {
+        return (int16_t)f;
+    } else {
+        return f | MAKE_64BIT_MASK(16, 48);
+    }
 }
 
-static inline float16 check_nanbox_h(uint64_t f)
+static inline float16 check_nanbox_h(CPURISCVState *env, uint64_t f)
 {
+    /* Skip the NaN-box check when zfinx is enabled. */
+    if (RISCV_CPU(env_cpu(env))->cfg.ext_zfinx) {
+        return (uint16_t)f;
+    }
+
     uint64_t mask = MAKE_64BIT_MASK(16, 48);
 
     if (likely((f & mask) == mask)) {
-- 
2.17.1
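
Editor's note: the internals.h hunk above switches half-precision values from NaN-boxing to sign-extension when zfinx is enabled. The following is a minimal standalone sketch of that difference, not the QEMU code itself. The names box_h, unbox_h and box_h_selfcheck are hypothetical, the zfinx flag is passed as a plain bool instead of being read from cfg.ext_zfinx, and the 0x7E00 default quiet NaN returned for an improperly boxed input is an assumption, since the tail of check_nanbox_h is not shown in the hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bits 16..63, the same bits MAKE_64BIT_MASK(16, 48) selects. */
#define HBOX_MASK 0xFFFFFFFFFFFF0000ULL

/* Widen a 16-bit float pattern to a 64-bit register value. */
static uint64_t box_h(bool zfinx, uint16_t f)
{
    if (zfinx) {
        /* zfinx: replicate bit 15 into bits 16..63 (sign-extension). */
        return (uint64_t)(int64_t)(int16_t)f;
    }
    /* F/Zfh: set bits 16..63 to all ones (NaN-boxing). */
    return (uint64_t)f | HBOX_MASK;
}

/* Narrow a 64-bit register value back to a 16-bit float pattern. */
static uint16_t unbox_h(bool zfinx, uint64_t v)
{
    if (zfinx) {
        /* zfinx: no box check, just take the low 16 bits. */
        return (uint16_t)v;
    }
    if ((v & HBOX_MASK) == HBOX_MASK) {
        return (uint16_t)v;
    }
    /* Assumed behaviour: an unboxed input reads as the default qNaN. */
    return 0x7E00;
}

/* Small self-check using 1.0 (0x3C00) and -1.0 (0xBC00) in half precision. */
static void box_h_selfcheck(void)
{
    assert(box_h(false, 0x3C00) == 0xFFFFFFFFFFFF3C00ULL);
    assert(box_h(true, 0x3C00) == 0x0000000000003C00ULL);
    /* For negative values both schemes yield the same bit pattern. */
    assert(box_h(true, 0xBC00) == box_h(false, 0xBC00));
    assert(unbox_h(true, 0x3C00) == 0x3C00);
    assert(unbox_h(false, 0x3C00) == 0x7E00);
}
```

The sketch suggests why the box check has to be skipped under zfinx: a positive value such as 1.0 is stored with zero upper bits, so it would fail the all-ones test and read back as a NaN.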




* [PATCH v6 6/6] target/riscv: expose zfinx, zdinx, zhinx{min} properties
  2022-02-11  4:39 ` Weiwei Li
@ 2022-02-11  4:39   ` Weiwei Li
From: Weiwei Li @ 2022-02-11  4:39 UTC (permalink / raw)
  To: richard.henderson, palmer, alistair.francis, bin.meng,
	qemu-riscv, qemu-devel
  Cc: wangjunqiang, Weiwei Li, lazyparser, ardxwe

Co-authored-by: ardxwe <ardxwe@gmail.com>
Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
---
 target/riscv/cpu.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 55371b1aa5..ddda4906ff 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -795,6 +795,11 @@ static Property riscv_cpu_properties[] = {
     DEFINE_PROP_BOOL("zbc", RISCVCPU, cfg.ext_zbc, true),
     DEFINE_PROP_BOOL("zbs", RISCVCPU, cfg.ext_zbs, true),
 
+    DEFINE_PROP_BOOL("zdinx", RISCVCPU, cfg.ext_zdinx, false),
+    DEFINE_PROP_BOOL("zfinx", RISCVCPU, cfg.ext_zfinx, false),
+    DEFINE_PROP_BOOL("zhinx", RISCVCPU, cfg.ext_zhinx, false),
+    DEFINE_PROP_BOOL("zhinxmin", RISCVCPU, cfg.ext_zhinxmin, false),
+
     /* Vendor-specific custom extensions */
     DEFINE_PROP_BOOL("xventanacondops", RISCVCPU, cfg.ext_XVentanaCondOps, false),
 
-- 
2.17.1




* Re: [PATCH v6 3/6] target/riscv: add support for zfinx
  2022-02-11  4:39   ` Weiwei Li
@ 2022-02-28  3:55     ` Alistair Francis
From: Alistair Francis @ 2022-02-28  3:55 UTC (permalink / raw)
  To: Weiwei Li
  Cc: Wei Wu (吴伟),
	open list:RISC-V, wangjunqiang, Bin Meng, Richard Henderson,
	qemu-devel@nongnu.org Developers, ardxwe, Palmer Dabbelt,
	Alistair Francis

On Fri, Feb 11, 2022 at 2:41 PM Weiwei Li <liweiwei@iscas.ac.cn> wrote:
>
>   - update extension check REQUIRE_ZFINX_OR_F
>   - update single-precision floating-point register read/write
>   - disable nanbox_s check
>
> Co-authored-by: ardxwe <ardxwe@gmail.com>
> Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
> Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

Acked-by: Alistair Francis <alistair.francis@wdc.com>

Alistair
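
Editor's note: the "update extension check" item in the quoted commit message refers to the REQUIRE_FPU and REQUIRE_ZFINX_OR_F macros in the quoted diff below. As a rough sketch of the combined decision they make, the check can be written as a single predicate. The struct fp_ctx, its field names and fp_insn_allowed are hypothetical stand-ins for the DisasContext fields the macros read; this is an illustration of the logic, not code from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flattened view of the state the two macros consult. */
struct fp_ctx {
    bool ext_zfinx;      /* stands in for ctx->cfg_ptr->ext_zfinx */
    bool has_rvf;        /* stands in for what REQUIRE_EXT(ctx, RVF) tests */
    unsigned mstatus_fs; /* stands in for ctx->mstatus_fs, 0 meaning Off */
};

/* Returns true when a single-precision instruction may be translated. */
static bool fp_insn_allowed(const struct fp_ctx *c)
{
    /* REQUIRE_FPU: FS == Off rejects the insn unless zfinx is enabled. */
    if (c->mstatus_fs == 0 && !c->ext_zfinx) {
        return false;
    }
    /* REQUIRE_ZFINX_OR_F: without zfinx, the F extension is required. */
    if (!c->ext_zfinx && !c->has_rvf) {
        return false;
    }
    return true;
}

/* Self-check covering the four interesting configurations. */
static void fp_insn_selfcheck(void)
{
    struct fp_ctx zfinx_fs_off = { true, false, 0 };
    struct fp_ctx f_fs_off = { false, true, 0 };
    struct fp_ctx f_fs_on = { false, true, 1 };
    struct fp_ctx neither = { false, false, 1 };

    assert(fp_insn_allowed(&zfinx_fs_off)); /* zfinx ignores mstatus.FS */
    assert(!fp_insn_allowed(&f_fs_off));    /* F with FS Off is rejected */
    assert(fp_insn_allowed(&f_fs_on));
    assert(!fp_insn_allowed(&neither));
}
```

The first case matters because an earlier patch in this series hardwires mstatus.FS to zero when zfinx is enabled, so the FS test alone would otherwise reject every floating-point instruction.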

> ---
>  target/riscv/fpu_helper.c               |  89 +++----
>  target/riscv/helper.h                   |   2 +-
>  target/riscv/insn_trans/trans_rvf.c.inc | 314 ++++++++++++++++--------
>  target/riscv/internals.h                |  16 +-
>  target/riscv/translate.c                |  93 ++++++-
>  5 files changed, 369 insertions(+), 145 deletions(-)
>
> diff --git a/target/riscv/fpu_helper.c b/target/riscv/fpu_helper.c
> index 4a5982d594..63ca703459 100644
> --- a/target/riscv/fpu_helper.c
> +++ b/target/riscv/fpu_helper.c
> @@ -98,10 +98,11 @@ static uint64_t do_fmadd_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2,
>  static uint64_t do_fmadd_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2,
>                             uint64_t rs3, int flags)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> -    float32 frs2 = check_nanbox_s(rs2);
> -    float32 frs3 = check_nanbox_s(rs3);
> -    return nanbox_s(float32_muladd(frs1, frs2, frs3, flags, &env->fp_status));
> +    float32 frs1 = check_nanbox_s(env, rs1);
> +    float32 frs2 = check_nanbox_s(env, rs2);
> +    float32 frs3 = check_nanbox_s(env, rs3);
> +    return nanbox_s(env, float32_muladd(frs1, frs2, frs3, flags,
> +                                        &env->fp_status));
>  }
>
>  uint64_t helper_fmadd_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
> @@ -183,124 +184,124 @@ uint64_t helper_fnmadd_h(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
>
>  uint64_t helper_fadd_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> -    float32 frs2 = check_nanbox_s(rs2);
> -    return nanbox_s(float32_add(frs1, frs2, &env->fp_status));
> +    float32 frs1 = check_nanbox_s(env, rs1);
> +    float32 frs2 = check_nanbox_s(env, rs2);
> +    return nanbox_s(env, float32_add(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fsub_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> -    float32 frs2 = check_nanbox_s(rs2);
> -    return nanbox_s(float32_sub(frs1, frs2, &env->fp_status));
> +    float32 frs1 = check_nanbox_s(env, rs1);
> +    float32 frs2 = check_nanbox_s(env, rs2);
> +    return nanbox_s(env, float32_sub(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fmul_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> -    float32 frs2 = check_nanbox_s(rs2);
> -    return nanbox_s(float32_mul(frs1, frs2, &env->fp_status));
> +    float32 frs1 = check_nanbox_s(env, rs1);
> +    float32 frs2 = check_nanbox_s(env, rs2);
> +    return nanbox_s(env, float32_mul(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fdiv_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> -    float32 frs2 = check_nanbox_s(rs2);
> -    return nanbox_s(float32_div(frs1, frs2, &env->fp_status));
> +    float32 frs1 = check_nanbox_s(env, rs1);
> +    float32 frs2 = check_nanbox_s(env, rs2);
> +    return nanbox_s(env, float32_div(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fmin_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> -    float32 frs2 = check_nanbox_s(rs2);
> -    return nanbox_s(env->priv_ver < PRIV_VERSION_1_11_0 ?
> +    float32 frs1 = check_nanbox_s(env, rs1);
> +    float32 frs2 = check_nanbox_s(env, rs2);
> +    return nanbox_s(env, env->priv_ver < PRIV_VERSION_1_11_0 ?
>                      float32_minnum(frs1, frs2, &env->fp_status) :
>                      float32_minimum_number(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fmax_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> -    float32 frs2 = check_nanbox_s(rs2);
> -    return nanbox_s(env->priv_ver < PRIV_VERSION_1_11_0 ?
> +    float32 frs1 = check_nanbox_s(env, rs1);
> +    float32 frs2 = check_nanbox_s(env, rs2);
> +    return nanbox_s(env, env->priv_ver < PRIV_VERSION_1_11_0 ?
>                      float32_maxnum(frs1, frs2, &env->fp_status) :
>                      float32_maximum_number(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fsqrt_s(CPURISCVState *env, uint64_t rs1)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> -    return nanbox_s(float32_sqrt(frs1, &env->fp_status));
> +    float32 frs1 = check_nanbox_s(env, rs1);
> +    return nanbox_s(env, float32_sqrt(frs1, &env->fp_status));
>  }
>
>  target_ulong helper_fle_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> -    float32 frs2 = check_nanbox_s(rs2);
> +    float32 frs1 = check_nanbox_s(env, rs1);
> +    float32 frs2 = check_nanbox_s(env, rs2);
>      return float32_le(frs1, frs2, &env->fp_status);
>  }
>
>  target_ulong helper_flt_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> -    float32 frs2 = check_nanbox_s(rs2);
> +    float32 frs1 = check_nanbox_s(env, rs1);
> +    float32 frs2 = check_nanbox_s(env, rs2);
>      return float32_lt(frs1, frs2, &env->fp_status);
>  }
>
>  target_ulong helper_feq_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> -    float32 frs2 = check_nanbox_s(rs2);
> +    float32 frs1 = check_nanbox_s(env, rs1);
> +    float32 frs2 = check_nanbox_s(env, rs2);
>      return float32_eq_quiet(frs1, frs2, &env->fp_status);
>  }
>
>  target_ulong helper_fcvt_w_s(CPURISCVState *env, uint64_t rs1)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> +    float32 frs1 = check_nanbox_s(env, rs1);
>      return float32_to_int32(frs1, &env->fp_status);
>  }
>
>  target_ulong helper_fcvt_wu_s(CPURISCVState *env, uint64_t rs1)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> +    float32 frs1 = check_nanbox_s(env, rs1);
>      return (int32_t)float32_to_uint32(frs1, &env->fp_status);
>  }
>
>  target_ulong helper_fcvt_l_s(CPURISCVState *env, uint64_t rs1)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> +    float32 frs1 = check_nanbox_s(env, rs1);
>      return float32_to_int64(frs1, &env->fp_status);
>  }
>
>  target_ulong helper_fcvt_lu_s(CPURISCVState *env, uint64_t rs1)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> +    float32 frs1 = check_nanbox_s(env, rs1);
>      return float32_to_uint64(frs1, &env->fp_status);
>  }
>
>  uint64_t helper_fcvt_s_w(CPURISCVState *env, target_ulong rs1)
>  {
> -    return nanbox_s(int32_to_float32((int32_t)rs1, &env->fp_status));
> +    return nanbox_s(env, int32_to_float32((int32_t)rs1, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_s_wu(CPURISCVState *env, target_ulong rs1)
>  {
> -    return nanbox_s(uint32_to_float32((uint32_t)rs1, &env->fp_status));
> +    return nanbox_s(env, uint32_to_float32((uint32_t)rs1, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_s_l(CPURISCVState *env, target_ulong rs1)
>  {
> -    return nanbox_s(int64_to_float32(rs1, &env->fp_status));
> +    return nanbox_s(env, int64_to_float32(rs1, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_s_lu(CPURISCVState *env, target_ulong rs1)
>  {
> -    return nanbox_s(uint64_to_float32(rs1, &env->fp_status));
> +    return nanbox_s(env, uint64_to_float32(rs1, &env->fp_status));
>  }
>
> -target_ulong helper_fclass_s(uint64_t rs1)
> +target_ulong helper_fclass_s(CPURISCVState *env, uint64_t rs1)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> +    float32 frs1 = check_nanbox_s(env, rs1);
>      return fclass_s(frs1);
>  }
>
> @@ -340,12 +341,12 @@ uint64_t helper_fmax_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
>
>  uint64_t helper_fcvt_s_d(CPURISCVState *env, uint64_t rs1)
>  {
> -    return nanbox_s(float64_to_float32(rs1, &env->fp_status));
> +    return nanbox_s(env, float64_to_float32(rs1, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_d_s(CPURISCVState *env, uint64_t rs1)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> +    float32 frs1 = check_nanbox_s(env, rs1);
>      return float32_to_float64(frs1, &env->fp_status);
>  }
>
> @@ -539,14 +540,14 @@ uint64_t helper_fcvt_h_lu(CPURISCVState *env, target_ulong rs1)
>
>  uint64_t helper_fcvt_h_s(CPURISCVState *env, uint64_t rs1)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> +    float32 frs1 = check_nanbox_s(env, rs1);
>      return nanbox_h(float32_to_float16(frs1, true, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_s_h(CPURISCVState *env, uint64_t rs1)
>  {
>      float16 frs1 = check_nanbox_h(rs1);
> -    return nanbox_s(float16_to_float32(frs1, true, &env->fp_status));
> +    return nanbox_s(env, float16_to_float32(frs1, true, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_h_d(CPURISCVState *env, uint64_t rs1)
> diff --git a/target/riscv/helper.h b/target/riscv/helper.h
> index 72cc2582f4..89195aad9d 100644
> --- a/target/riscv/helper.h
> +++ b/target/riscv/helper.h
> @@ -38,7 +38,7 @@ DEF_HELPER_FLAGS_2(fcvt_s_w, TCG_CALL_NO_RWG, i64, env, tl)
>  DEF_HELPER_FLAGS_2(fcvt_s_wu, TCG_CALL_NO_RWG, i64, env, tl)
>  DEF_HELPER_FLAGS_2(fcvt_s_l, TCG_CALL_NO_RWG, i64, env, tl)
>  DEF_HELPER_FLAGS_2(fcvt_s_lu, TCG_CALL_NO_RWG, i64, env, tl)
> -DEF_HELPER_FLAGS_1(fclass_s, TCG_CALL_NO_RWG_SE, tl, i64)
> +DEF_HELPER_FLAGS_2(fclass_s, TCG_CALL_NO_RWG_SE, tl, env, i64)
>
>  /* Floating Point - Double Precision */
>  DEF_HELPER_FLAGS_3(fadd_d, TCG_CALL_NO_RWG, i64, env, i64, i64)
> diff --git a/target/riscv/insn_trans/trans_rvf.c.inc b/target/riscv/insn_trans/trans_rvf.c.inc
> index 0aac87f7db..a1d3eb52ad 100644
> --- a/target/riscv/insn_trans/trans_rvf.c.inc
> +++ b/target/riscv/insn_trans/trans_rvf.c.inc
> @@ -20,7 +20,14 @@
>
>  #define REQUIRE_FPU do {\
>      if (ctx->mstatus_fs == 0) \
> -        return false;                       \
> +        if (!ctx->cfg_ptr->ext_zfinx) \
> +            return false; \
> +} while (0)
> +
> +#define REQUIRE_ZFINX_OR_F(ctx) do {\
> +    if (!ctx->cfg_ptr->ext_zfinx) { \
> +        REQUIRE_EXT(ctx, RVF); \
> +    } \
>  } while (0)
>
>  static bool trans_flw(DisasContext *ctx, arg_flw *a)
> @@ -55,10 +62,16 @@ static bool trans_fsw(DisasContext *ctx, arg_fsw *a)
>  static bool trans_fmadd_s(DisasContext *ctx, arg_fmadd_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
> +
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fmadd_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fmadd_s(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -66,10 +79,16 @@ static bool trans_fmadd_s(DisasContext *ctx, arg_fmadd_s *a)
>  static bool trans_fmsub_s(DisasContext *ctx, arg_fmsub_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
> +
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fmsub_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fmsub_s(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -77,10 +96,16 @@ static bool trans_fmsub_s(DisasContext *ctx, arg_fmsub_s *a)
>  static bool trans_fnmsub_s(DisasContext *ctx, arg_fnmsub_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
> +
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fnmsub_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fnmsub_s(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -88,10 +113,16 @@ static bool trans_fnmsub_s(DisasContext *ctx, arg_fnmsub_s *a)
>  static bool trans_fnmadd_s(DisasContext *ctx, arg_fnmadd_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
> +
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fnmadd_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fnmadd_s(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -99,11 +130,15 @@ static bool trans_fnmadd_s(DisasContext *ctx, arg_fnmadd_s *a)
>  static bool trans_fadd_s(DisasContext *ctx, arg_fadd_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fadd_s(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_fadd_s(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -111,11 +146,15 @@ static bool trans_fadd_s(DisasContext *ctx, arg_fadd_s *a)
>  static bool trans_fsub_s(DisasContext *ctx, arg_fsub_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fsub_s(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_fsub_s(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -123,11 +162,15 @@ static bool trans_fsub_s(DisasContext *ctx, arg_fsub_s *a)
>  static bool trans_fmul_s(DisasContext *ctx, arg_fmul_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fmul_s(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_fmul_s(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -135,11 +178,15 @@ static bool trans_fmul_s(DisasContext *ctx, arg_fmul_s *a)
>  static bool trans_fdiv_s(DisasContext *ctx, arg_fdiv_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fdiv_s(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_fdiv_s(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -147,10 +194,14 @@ static bool trans_fdiv_s(DisasContext *ctx, arg_fdiv_s *a)
>  static bool trans_fsqrt_s(DisasContext *ctx, arg_fsqrt_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fsqrt_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fsqrt_s(dest, cpu_env, src1);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -158,22 +209,37 @@ static bool trans_fsqrt_s(DisasContext *ctx, arg_fsqrt_s *a)
>  static bool trans_fsgnj_s(DisasContext *ctx, arg_fsgnj_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      if (a->rs1 == a->rs2) { /* FMOV */
> -        gen_check_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rs1]);
> +        if (!ctx->cfg_ptr->ext_zfinx) {
> +            gen_check_nanbox_s(dest, src1);
> +        } else {
> +            tcg_gen_ext32s_i64(dest, src1);
> +        }
>      } else { /* FSGNJ */
> -        TCGv_i64 rs1 = tcg_temp_new_i64();
> -        TCGv_i64 rs2 = tcg_temp_new_i64();
> -
> -        gen_check_nanbox_s(rs1, cpu_fpr[a->rs1]);
> -        gen_check_nanbox_s(rs2, cpu_fpr[a->rs2]);
> -
> -        /* This formulation retains the nanboxing of rs2. */
> -        tcg_gen_deposit_i64(cpu_fpr[a->rd], rs2, rs1, 0, 31);
> -        tcg_temp_free_i64(rs1);
> -        tcg_temp_free_i64(rs2);
> +        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
> +
> +        if (!ctx->cfg_ptr->ext_zfinx) {
> +            TCGv_i64 rs1 = tcg_temp_new_i64();
> +            TCGv_i64 rs2 = tcg_temp_new_i64();
> +            gen_check_nanbox_s(rs1, src1);
> +            gen_check_nanbox_s(rs2, src2);
> +
> +            /* This formulation retains the nanboxing of rs2 in normal 'F'. */
> +            tcg_gen_deposit_i64(dest, rs2, rs1, 0, 31);
> +
> +            tcg_temp_free_i64(rs1);
> +            tcg_temp_free_i64(rs2);
> +        } else {
> +            tcg_gen_deposit_i64(dest, src2, src1, 0, 31);
> +            tcg_gen_ext32s_i64(dest, dest);
> +        }
>      }
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -183,16 +249,27 @@ static bool trans_fsgnjn_s(DisasContext *ctx, arg_fsgnjn_s *a)
>      TCGv_i64 rs1, rs2, mask;
>
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
> -    rs1 = tcg_temp_new_i64();
> -    gen_check_nanbox_s(rs1, cpu_fpr[a->rs1]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
> +    rs1 = tcg_temp_new_i64();
> +    if (!ctx->cfg_ptr->ext_zfinx) {
> +        gen_check_nanbox_s(rs1, src1);
> +    } else {
> +        tcg_gen_mov_i64(rs1, src1);
> +    }
>      if (a->rs1 == a->rs2) { /* FNEG */
> -        tcg_gen_xori_i64(cpu_fpr[a->rd], rs1, MAKE_64BIT_MASK(31, 1));
> +        tcg_gen_xori_i64(dest, rs1, MAKE_64BIT_MASK(31, 1));
>      } else {
> +        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>          rs2 = tcg_temp_new_i64();
> -        gen_check_nanbox_s(rs2, cpu_fpr[a->rs2]);
> +        if (!ctx->cfg_ptr->ext_zfinx) {
> +            gen_check_nanbox_s(rs2, src2);
> +        } else {
> +            tcg_gen_mov_i64(rs2, src2);
> +        }
>
>          /*
>           * Replace bit 31 in rs1 with inverse in rs2.
> @@ -200,13 +277,17 @@ static bool trans_fsgnjn_s(DisasContext *ctx, arg_fsgnjn_s *a)
>           */
>          mask = tcg_constant_i64(~MAKE_64BIT_MASK(31, 1));
>          tcg_gen_nor_i64(rs2, rs2, mask);
> -        tcg_gen_and_i64(rs1, mask, rs1);
> -        tcg_gen_or_i64(cpu_fpr[a->rd], rs1, rs2);
> +        tcg_gen_and_i64(dest, mask, rs1);
> +        tcg_gen_or_i64(dest, dest, rs2);
>
>          tcg_temp_free_i64(rs2);
>      }
> +    /* With zfinx, the result is sign-extended instead of NaN-boxed. */
> +    if (ctx->cfg_ptr->ext_zfinx) {
> +        tcg_gen_ext32s_i64(dest, dest);
> +    }
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      tcg_temp_free_i64(rs1);
> -
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -216,28 +297,45 @@ static bool trans_fsgnjx_s(DisasContext *ctx, arg_fsgnjx_s *a)
>      TCGv_i64 rs1, rs2;
>
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>      rs1 = tcg_temp_new_i64();
> -    gen_check_nanbox_s(rs1, cpu_fpr[a->rs1]);
> +
> +    if (!ctx->cfg_ptr->ext_zfinx) {
> +        gen_check_nanbox_s(rs1, src1);
> +    } else {
> +        tcg_gen_mov_i64(rs1, src1);
> +    }
>
>      if (a->rs1 == a->rs2) { /* FABS */
> -        tcg_gen_andi_i64(cpu_fpr[a->rd], rs1, ~MAKE_64BIT_MASK(31, 1));
> +        tcg_gen_andi_i64(dest, rs1, ~MAKE_64BIT_MASK(31, 1));
>      } else {
> +        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>          rs2 = tcg_temp_new_i64();
> -        gen_check_nanbox_s(rs2, cpu_fpr[a->rs2]);
> +
> +        if (!ctx->cfg_ptr->ext_zfinx) {
> +            gen_check_nanbox_s(rs2, src2);
> +        } else {
> +            tcg_gen_mov_i64(rs2, src2);
> +        }
>
>          /*
>           * Xor bit 31 in rs1 with that in rs2.
>           * This formulation retains the nanboxing of rs1.
>           */
> -        tcg_gen_andi_i64(rs2, rs2, MAKE_64BIT_MASK(31, 1));
> -        tcg_gen_xor_i64(cpu_fpr[a->rd], rs1, rs2);
> +        tcg_gen_andi_i64(dest, rs2, MAKE_64BIT_MASK(31, 1));
> +        tcg_gen_xor_i64(dest, rs1, dest);
>
>          tcg_temp_free_i64(rs2);
>      }
> +    /* With zfinx, the result is sign-extended instead of NaN-boxed. */
> +    if (ctx->cfg_ptr->ext_zfinx) {
> +        tcg_gen_ext32s_i64(dest, dest);
> +    }
>      tcg_temp_free_i64(rs1);
> -
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -245,10 +343,14 @@ static bool trans_fsgnjx_s(DisasContext *ctx, arg_fsgnjx_s *a)
>  static bool trans_fmin_s(DisasContext *ctx, arg_fmin_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
> -    gen_helper_fmin_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                      cpu_fpr[a->rs2]);
> +    gen_helper_fmin_s(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -256,10 +358,14 @@ static bool trans_fmin_s(DisasContext *ctx, arg_fmin_s *a)
>  static bool trans_fmax_s(DisasContext *ctx, arg_fmax_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
> -    gen_helper_fmax_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                      cpu_fpr[a->rs2]);
> +    gen_helper_fmax_s(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -267,12 +373,13 @@ static bool trans_fmax_s(DisasContext *ctx, arg_fmax_s *a)
>  static bool trans_fcvt_w_s(DisasContext *ctx, arg_fcvt_w_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_w_s(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_w_s(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -280,12 +387,13 @@ static bool trans_fcvt_w_s(DisasContext *ctx, arg_fcvt_w_s *a)
>  static bool trans_fcvt_wu_s(DisasContext *ctx, arg_fcvt_wu_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_wu_s(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_wu_s(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -294,14 +402,14 @@ static bool trans_fmv_x_w(DisasContext *ctx, arg_fmv_x_w *a)
>  {
>      /* NOTE: This was FMV.X.S in an earlier version of the ISA spec! */
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> -
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>  #if defined(TARGET_RISCV64)
> -    tcg_gen_ext32s_tl(dest, cpu_fpr[a->rs1]);
> +    tcg_gen_ext32s_tl(dest, src1);
>  #else
> -    tcg_gen_extrl_i64_i32(dest, cpu_fpr[a->rs1]);
> +    tcg_gen_extrl_i64_i32(dest, src1);
>  #endif
>
>      gen_set_gpr(ctx, a->rd, dest);
> @@ -311,11 +419,13 @@ static bool trans_fmv_x_w(DisasContext *ctx, arg_fmv_x_w *a)
>  static bool trans_feq_s(DisasContext *ctx, arg_feq_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
> -    gen_helper_feq_s(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_feq_s(dest, cpu_env, src1, src2);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -323,11 +433,13 @@ static bool trans_feq_s(DisasContext *ctx, arg_feq_s *a)
>  static bool trans_flt_s(DisasContext *ctx, arg_flt_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
> -    gen_helper_flt_s(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_flt_s(dest, cpu_env, src1, src2);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -335,11 +447,13 @@ static bool trans_flt_s(DisasContext *ctx, arg_flt_s *a)
>  static bool trans_fle_s(DisasContext *ctx, arg_fle_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
> -    gen_helper_fle_s(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_fle_s(dest, cpu_env, src1, src2);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -347,11 +461,12 @@ static bool trans_fle_s(DisasContext *ctx, arg_fle_s *a)
>  static bool trans_fclass_s(DisasContext *ctx, arg_fclass_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
> -    gen_helper_fclass_s(dest, cpu_fpr[a->rs1]);
> +    gen_helper_fclass_s(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -359,13 +474,14 @@ static bool trans_fclass_s(DisasContext *ctx, arg_fclass_s *a)
>  static bool trans_fcvt_s_w(DisasContext *ctx, arg_fcvt_s_w *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv src = get_gpr(ctx, a->rs1, EXT_SIGN);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_s_w(cpu_fpr[a->rd], cpu_env, src);
> -
> +    gen_helper_fcvt_s_w(dest, cpu_env, src);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -373,13 +489,14 @@ static bool trans_fcvt_s_w(DisasContext *ctx, arg_fcvt_s_w *a)
>  static bool trans_fcvt_s_wu(DisasContext *ctx, arg_fcvt_s_wu *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv src = get_gpr(ctx, a->rs1, EXT_ZERO);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_s_wu(cpu_fpr[a->rd], cpu_env, src);
> -
> +    gen_helper_fcvt_s_wu(dest, cpu_env, src);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -388,13 +505,14 @@ static bool trans_fmv_w_x(DisasContext *ctx, arg_fmv_w_x *a)
>  {
>      /* NOTE: This was FMV.S.X in an earlier version of the ISA spec! */
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv src = get_gpr(ctx, a->rs1, EXT_ZERO);
>
> -    tcg_gen_extu_tl_i64(cpu_fpr[a->rd], src);
> -    gen_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rd]);
> -
> +    tcg_gen_extu_tl_i64(dest, src);
> +    gen_nanbox_s(dest, dest);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -403,12 +521,13 @@ static bool trans_fcvt_l_s(DisasContext *ctx, arg_fcvt_l_s *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_l_s(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_l_s(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -417,12 +536,13 @@ static bool trans_fcvt_lu_s(DisasContext *ctx, arg_fcvt_lu_s *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_lu_s(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_lu_s(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -431,13 +551,14 @@ static bool trans_fcvt_s_l(DisasContext *ctx, arg_fcvt_s_l *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv src = get_gpr(ctx, a->rs1, EXT_SIGN);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_s_l(cpu_fpr[a->rd], cpu_env, src);
> -
> +    gen_helper_fcvt_s_l(dest, cpu_env, src);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -446,13 +567,14 @@ static bool trans_fcvt_s_lu(DisasContext *ctx, arg_fcvt_s_lu *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv src = get_gpr(ctx, a->rs1, EXT_ZERO);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_s_lu(cpu_fpr[a->rd], cpu_env, src);
> -
> +    gen_helper_fcvt_s_lu(dest, cpu_env, src);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> diff --git a/target/riscv/internals.h b/target/riscv/internals.h
> index 065e8162a2..6237bb3115 100644
> --- a/target/riscv/internals.h
> +++ b/target/riscv/internals.h
> @@ -46,13 +46,23 @@ enum {
>      RISCV_FRM_ROD = 8,  /* Round to Odd */
>  };
>
> -static inline uint64_t nanbox_s(float32 f)
> +static inline uint64_t nanbox_s(CPURISCVState *env, float32 f)
>  {
> -    return f | MAKE_64BIT_MASK(32, 32);
> +    /* the value is sign-extended instead of NaN-boxing for zfinx */
> +    if (RISCV_CPU(env_cpu(env))->cfg.ext_zfinx) {
> +        return (int32_t)f;
> +    } else {
> +        return f | MAKE_64BIT_MASK(32, 32);
> +    }
>  }
>
> -static inline float32 check_nanbox_s(uint64_t f)
> +static inline float32 check_nanbox_s(CPURISCVState *env, uint64_t f)
>  {
> +    /* Skip the NaN-boxing check when zfinx is enabled */
> +    if (RISCV_CPU(env_cpu(env))->cfg.ext_zfinx) {
> +        return (uint32_t)f;
> +    }
> +
>      uint64_t mask = MAKE_64BIT_MASK(32, 32);
>
>      if (likely((f & mask) == mask)) {
> diff --git a/target/riscv/translate.c b/target/riscv/translate.c
> index c7232de326..10cf37be41 100644
> --- a/target/riscv/translate.c
> +++ b/target/riscv/translate.c
> @@ -101,6 +101,9 @@ typedef struct DisasContext {
>      TCGv zero;
>      /* Space for 3 operands plus 1 extra for address computation. */
>      TCGv temp[4];
> +    /* Space for 4 operands (1 dest, up to 3 src) for FP computation */
> +    TCGv_i64 ftemp[4];
> +    uint8_t nftemp;
>      /* PointerMasking extension */
>      bool pm_mask_enabled;
>      bool pm_base_enabled;
> @@ -380,6 +383,86 @@ static void gen_set_gpr128(DisasContext *ctx, int reg_num, TCGv rl, TCGv rh)
>      }
>  }
>
> +static TCGv_i64 ftemp_new(DisasContext *ctx)
> +{
> +    assert(ctx->nftemp < ARRAY_SIZE(ctx->ftemp));
> +    return ctx->ftemp[ctx->nftemp++] = tcg_temp_new_i64();
> +}
> +
> +static TCGv_i64 get_fpr_hs(DisasContext *ctx, int reg_num)
> +{
> +    if (!ctx->cfg_ptr->ext_zfinx) {
> +        return cpu_fpr[reg_num];
> +    }
> +
> +    if (reg_num == 0) {
> +        return tcg_constant_i64(0);
> +    }
> +    switch (get_xl(ctx)) {
> +    case MXL_RV32:
> +#ifdef TARGET_RISCV32
> +    {
> +        TCGv_i64 t = ftemp_new(ctx);
> +        tcg_gen_ext_i32_i64(t, cpu_gpr[reg_num]);
> +        return t;
> +    }
> +#else
> +    /* fall through */
> +    case MXL_RV64:
> +        return cpu_gpr[reg_num];
> +#endif
> +    default:
> +        g_assert_not_reached();
> +    }
> +}
> +
> +static TCGv_i64 dest_fpr(DisasContext *ctx, int reg_num)
> +{
> +    if (!ctx->cfg_ptr->ext_zfinx) {
> +        return cpu_fpr[reg_num];
> +    }
> +
> +    if (reg_num == 0) {
> +        return ftemp_new(ctx);
> +    }
> +
> +    switch (get_xl(ctx)) {
> +    case MXL_RV32:
> +        return ftemp_new(ctx);
> +#ifdef TARGET_RISCV64
> +    case MXL_RV64:
> +        return cpu_gpr[reg_num];
> +#endif
> +    default:
> +        g_assert_not_reached();
> +    }
> +}
> +
> +/* Assume t is NaN-boxed (for normal F) or sign-extended (for zfinx) */
> +static void gen_set_fpr_hs(DisasContext *ctx, int reg_num, TCGv_i64 t)
> +{
> +    if (!ctx->cfg_ptr->ext_zfinx) {
> +        tcg_gen_mov_i64(cpu_fpr[reg_num], t);
> +        return;
> +    }
> +    if (reg_num != 0) {
> +        switch (get_xl(ctx)) {
> +        case MXL_RV32:
> +#ifdef TARGET_RISCV32
> +            tcg_gen_extrl_i64_i32(cpu_gpr[reg_num], t);
> +            break;
> +#else
> +        /* fall through */
> +        case MXL_RV64:
> +            tcg_gen_mov_i64(cpu_gpr[reg_num], t);
> +            break;
> +#endif
> +        default:
> +            g_assert_not_reached();
> +        }
> +    }
> +}
> +
>  static void gen_jal(DisasContext *ctx, int rd, target_ulong imm)
>  {
>      target_ulong next_pc;
> @@ -955,6 +1038,8 @@ static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
>      ctx->cs = cs;
>      ctx->ntemp = 0;
>      memset(ctx->temp, 0, sizeof(ctx->temp));
> +    ctx->nftemp = 0;
> +    memset(ctx->ftemp, 0, sizeof(ctx->ftemp));
>      ctx->pm_mask_enabled = FIELD_EX32(tb_flags, TB_FLAGS, PM_MASK_ENABLED);
>      ctx->pm_base_enabled = FIELD_EX32(tb_flags, TB_FLAGS, PM_BASE_ENABLED);
>      ctx->zero = tcg_constant_tl(0);
> @@ -976,16 +1061,22 @@ static void riscv_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
>      DisasContext *ctx = container_of(dcbase, DisasContext, base);
>      CPURISCVState *env = cpu->env_ptr;
>      uint16_t opcode16 = translator_lduw(env, &ctx->base, ctx->base.pc_next);
> +    int i;
>
>      ctx->ol = ctx->xl;
>      decode_opc(env, ctx, opcode16);
>      ctx->base.pc_next = ctx->pc_succ_insn;
>
> -    for (int i = ctx->ntemp - 1; i >= 0; --i) {
> +    for (i = ctx->ntemp - 1; i >= 0; --i) {
>          tcg_temp_free(ctx->temp[i]);
>          ctx->temp[i] = NULL;
>      }
>      ctx->ntemp = 0;
> +    for (i = ctx->nftemp - 1; i >= 0; --i) {
> +        tcg_temp_free_i64(ctx->ftemp[i]);
> +        ctx->ftemp[i] = NULL;
> +    }
> +    ctx->nftemp = 0;
>
>      if (ctx->base.is_jmp == DISAS_NEXT) {
>          target_ulong page_start;
> --
> 2.17.1
>
>
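
For readers skimming the internals.h hunk above, here is a minimal standalone sketch of the two register encodings it switches between. The names box_f32, unbox_f32 and the bool zfinx parameter are hypothetical stand-ins for nanbox_s(), check_nanbox_s() and the cfg.ext_zfinx lookup. The quiet-NaN fallback for improperly boxed input is not visible in the quoted hunk and is assumed here from the F-extension rule that such inputs are treated as the canonical NaN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define UPPER32 0xffffffff00000000ULL

/* Hypothetical stand-in for nanbox_s(); zfinx replaces cfg.ext_zfinx. */
static inline uint64_t box_f32(bool zfinx, uint32_t f)
{
    if (zfinx) {
        /* Zfinx: sign-extend bit 31 into the upper half of the register. */
        return (f & 0x80000000u) ? ((uint64_t)f | UPPER32) : (uint64_t)f;
    }
    /* Plain F: the upper 32 bits are all ones (NaN-boxing). */
    return (uint64_t)f | UPPER32;
}

/* Hypothetical stand-in for check_nanbox_s(). */
static inline uint32_t unbox_f32(bool zfinx, uint64_t r)
{
    if (zfinx) {
        /* Zfinx: no boxing check, the low 32 bits are used as-is. */
        return (uint32_t)r;
    }
    if ((r & UPPER32) == UPPER32) {
        return (uint32_t)r;
    }
    /* Assumption: improperly boxed input reads as the default quiet NaN. */
    return 0x7fc00000u;
}

/* Small self-check: 1.0f is 0x3f800000, -1.0f is 0xbf800000. */
static inline void box_demo(void)
{
    assert(box_f32(false, 0x3f800000u) == 0xffffffff3f800000ULL);
    assert(box_f32(true, 0x3f800000u) == 0x000000003f800000ULL);
    assert(box_f32(true, 0xbf800000u) == 0xffffffffbf800000ULL);
}
```

The practical difference shows up on reads: a register holding 0x000000003f800000 is an invalid box under F but a perfectly good 1.0f under Zfinx, which is why the check has to be bypassed rather than adjusted.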



* Re: [PATCH v6 3/6] target/riscv: add support for zfinx
@ 2022-02-28  3:55     ` Alistair Francis
  0 siblings, 0 replies; 22+ messages in thread
From: Alistair Francis @ 2022-02-28  3:55 UTC (permalink / raw)
  To: Weiwei Li
  Cc: Richard Henderson, Palmer Dabbelt, Alistair Francis, Bin Meng,
	open list:RISC-V, qemu-devel@nongnu.org Developers, wangjunqiang,
	Wei Wu (吴伟),
	ardxwe

On Fri, Feb 11, 2022 at 2:41 PM Weiwei Li <liweiwei@iscas.ac.cn> wrote:
>
>   - update extension check REQUIRE_ZFINX_OR_F
>   - update single-precision floating-point register read/write
>   - disable nanbox_s check
>
> Co-authored-by: ardxwe <ardxwe@gmail.com>
> Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
> Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

Acked-by: Alistair Francis <alistair.francis@wdc.com>

Alistair
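
As a reading aid for the sign-injection hunks quoted below, here is a plain C model of what the Zfinx paths of trans_fsgnj_s, trans_fsgnjn_s and trans_fsgnjx_s compute on 64-bit values. It is only a sketch: the function names are hypothetical, and the bit operations are transcribed by hand from the TCG ops in the patch (deposit, nor/and/or, andi/xor, then ext32s), not taken from QEMU.

```c
#include <assert.h>
#include <stdint.h>

#define SIGN32 0x80000000ULL  /* bit 31: sign of a single-precision value */
#define MAG32  0x7fffffffULL  /* bits 30..0: exponent and fraction */

/* Models tcg_gen_ext32s_i64: sign-extend bit 31 into bits 63..32. */
static inline uint64_t sext32(uint64_t v)
{
    uint64_t lo = v & 0xffffffffULL;
    return (lo & SIGN32) ? (lo | 0xffffffff00000000ULL) : lo;
}

/* FSGNJ.S: magnitude from rs1, sign from rs2 (deposit of the low 31 bits). */
static inline uint64_t fsgnj_s_zfinx(uint64_t rs1, uint64_t rs2)
{
    return sext32((rs2 & ~MAG32) | (rs1 & MAG32));
}

/* FSGNJN.S: magnitude from rs1, inverted sign of rs2. */
static inline uint64_t fsgnjn_s_zfinx(uint64_t rs1, uint64_t rs2)
{
    return sext32((rs1 & ~SIGN32) | (~rs2 & SIGN32));
}

/* FSGNJX.S: sign of rs1 XOR sign of rs2. */
static inline uint64_t fsgnjx_s_zfinx(uint64_t rs1, uint64_t rs2)
{
    return sext32(rs1 ^ (rs2 & SIGN32));
}

/* Self-check with 1.0f (0x3f800000) and -1.0f (0xbf800000). */
static inline void fsgnj_demo(void)
{
    const uint64_t one = 0x000000003f800000ULL;
    const uint64_t neg_one = 0xffffffffbf800000ULL;

    assert(fsgnj_s_zfinx(one, neg_one) == neg_one);      /* copy sign */
    assert(fsgnjn_s_zfinx(one, one) == neg_one);         /* FNEG */
    assert(fsgnjx_s_zfinx(neg_one, neg_one) == one);     /* FABS */
}
```

The FABS case is the one that shows why the trailing ext32s matters: clearing bit 31 of a negative, sign-extended input leaves the upper 32 bits set, so without the final sign extension the register would no longer hold a valid Zfinx encoding.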

> ---
>  target/riscv/fpu_helper.c               |  89 +++----
>  target/riscv/helper.h                   |   2 +-
>  target/riscv/insn_trans/trans_rvf.c.inc | 314 ++++++++++++++++--------
>  target/riscv/internals.h                |  16 +-
>  target/riscv/translate.c                |  93 ++++++-
>  5 files changed, 369 insertions(+), 145 deletions(-)
>
> diff --git a/target/riscv/fpu_helper.c b/target/riscv/fpu_helper.c
> index 4a5982d594..63ca703459 100644
> --- a/target/riscv/fpu_helper.c
> +++ b/target/riscv/fpu_helper.c
> @@ -98,10 +98,11 @@ static uint64_t do_fmadd_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2,
>  static uint64_t do_fmadd_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2,
>                             uint64_t rs3, int flags)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> -    float32 frs2 = check_nanbox_s(rs2);
> -    float32 frs3 = check_nanbox_s(rs3);
> -    return nanbox_s(float32_muladd(frs1, frs2, frs3, flags, &env->fp_status));
> +    float32 frs1 = check_nanbox_s(env, rs1);
> +    float32 frs2 = check_nanbox_s(env, rs2);
> +    float32 frs3 = check_nanbox_s(env, rs3);
> +    return nanbox_s(env, float32_muladd(frs1, frs2, frs3, flags,
> +                                        &env->fp_status));
>  }
>
>  uint64_t helper_fmadd_s(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
> @@ -183,124 +184,124 @@ uint64_t helper_fnmadd_h(CPURISCVState *env, uint64_t frs1, uint64_t frs2,
>
>  uint64_t helper_fadd_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> -    float32 frs2 = check_nanbox_s(rs2);
> -    return nanbox_s(float32_add(frs1, frs2, &env->fp_status));
> +    float32 frs1 = check_nanbox_s(env, rs1);
> +    float32 frs2 = check_nanbox_s(env, rs2);
> +    return nanbox_s(env, float32_add(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fsub_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> -    float32 frs2 = check_nanbox_s(rs2);
> -    return nanbox_s(float32_sub(frs1, frs2, &env->fp_status));
> +    float32 frs1 = check_nanbox_s(env, rs1);
> +    float32 frs2 = check_nanbox_s(env, rs2);
> +    return nanbox_s(env, float32_sub(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fmul_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> -    float32 frs2 = check_nanbox_s(rs2);
> -    return nanbox_s(float32_mul(frs1, frs2, &env->fp_status));
> +    float32 frs1 = check_nanbox_s(env, rs1);
> +    float32 frs2 = check_nanbox_s(env, rs2);
> +    return nanbox_s(env, float32_mul(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fdiv_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> -    float32 frs2 = check_nanbox_s(rs2);
> -    return nanbox_s(float32_div(frs1, frs2, &env->fp_status));
> +    float32 frs1 = check_nanbox_s(env, rs1);
> +    float32 frs2 = check_nanbox_s(env, rs2);
> +    return nanbox_s(env, float32_div(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fmin_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> -    float32 frs2 = check_nanbox_s(rs2);
> -    return nanbox_s(env->priv_ver < PRIV_VERSION_1_11_0 ?
> +    float32 frs1 = check_nanbox_s(env, rs1);
> +    float32 frs2 = check_nanbox_s(env, rs2);
> +    return nanbox_s(env, env->priv_ver < PRIV_VERSION_1_11_0 ?
>                      float32_minnum(frs1, frs2, &env->fp_status) :
>                      float32_minimum_number(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fmax_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> -    float32 frs2 = check_nanbox_s(rs2);
> -    return nanbox_s(env->priv_ver < PRIV_VERSION_1_11_0 ?
> +    float32 frs1 = check_nanbox_s(env, rs1);
> +    float32 frs2 = check_nanbox_s(env, rs2);
> +    return nanbox_s(env, env->priv_ver < PRIV_VERSION_1_11_0 ?
>                      float32_maxnum(frs1, frs2, &env->fp_status) :
>                      float32_maximum_number(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fsqrt_s(CPURISCVState *env, uint64_t rs1)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> -    return nanbox_s(float32_sqrt(frs1, &env->fp_status));
> +    float32 frs1 = check_nanbox_s(env, rs1);
> +    return nanbox_s(env, float32_sqrt(frs1, &env->fp_status));
>  }
>
>  target_ulong helper_fle_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> -    float32 frs2 = check_nanbox_s(rs2);
> +    float32 frs1 = check_nanbox_s(env, rs1);
> +    float32 frs2 = check_nanbox_s(env, rs2);
>      return float32_le(frs1, frs2, &env->fp_status);
>  }
>
>  target_ulong helper_flt_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> -    float32 frs2 = check_nanbox_s(rs2);
> +    float32 frs1 = check_nanbox_s(env, rs1);
> +    float32 frs2 = check_nanbox_s(env, rs2);
>      return float32_lt(frs1, frs2, &env->fp_status);
>  }
>
>  target_ulong helper_feq_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> -    float32 frs2 = check_nanbox_s(rs2);
> +    float32 frs1 = check_nanbox_s(env, rs1);
> +    float32 frs2 = check_nanbox_s(env, rs2);
>      return float32_eq_quiet(frs1, frs2, &env->fp_status);
>  }
>
>  target_ulong helper_fcvt_w_s(CPURISCVState *env, uint64_t rs1)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> +    float32 frs1 = check_nanbox_s(env, rs1);
>      return float32_to_int32(frs1, &env->fp_status);
>  }
>
>  target_ulong helper_fcvt_wu_s(CPURISCVState *env, uint64_t rs1)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> +    float32 frs1 = check_nanbox_s(env, rs1);
>      return (int32_t)float32_to_uint32(frs1, &env->fp_status);
>  }
>
>  target_ulong helper_fcvt_l_s(CPURISCVState *env, uint64_t rs1)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> +    float32 frs1 = check_nanbox_s(env, rs1);
>      return float32_to_int64(frs1, &env->fp_status);
>  }
>
>  target_ulong helper_fcvt_lu_s(CPURISCVState *env, uint64_t rs1)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> +    float32 frs1 = check_nanbox_s(env, rs1);
>      return float32_to_uint64(frs1, &env->fp_status);
>  }
>
>  uint64_t helper_fcvt_s_w(CPURISCVState *env, target_ulong rs1)
>  {
> -    return nanbox_s(int32_to_float32((int32_t)rs1, &env->fp_status));
> +    return nanbox_s(env, int32_to_float32((int32_t)rs1, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_s_wu(CPURISCVState *env, target_ulong rs1)
>  {
> -    return nanbox_s(uint32_to_float32((uint32_t)rs1, &env->fp_status));
> +    return nanbox_s(env, uint32_to_float32((uint32_t)rs1, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_s_l(CPURISCVState *env, target_ulong rs1)
>  {
> -    return nanbox_s(int64_to_float32(rs1, &env->fp_status));
> +    return nanbox_s(env, int64_to_float32(rs1, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_s_lu(CPURISCVState *env, target_ulong rs1)
>  {
> -    return nanbox_s(uint64_to_float32(rs1, &env->fp_status));
> +    return nanbox_s(env, uint64_to_float32(rs1, &env->fp_status));
>  }
>
> -target_ulong helper_fclass_s(uint64_t rs1)
> +target_ulong helper_fclass_s(CPURISCVState *env, uint64_t rs1)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> +    float32 frs1 = check_nanbox_s(env, rs1);
>      return fclass_s(frs1);
>  }
>
> @@ -340,12 +341,12 @@ uint64_t helper_fmax_d(CPURISCVState *env, uint64_t frs1, uint64_t frs2)
>
>  uint64_t helper_fcvt_s_d(CPURISCVState *env, uint64_t rs1)
>  {
> -    return nanbox_s(float64_to_float32(rs1, &env->fp_status));
> +    return nanbox_s(env, float64_to_float32(rs1, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_d_s(CPURISCVState *env, uint64_t rs1)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> +    float32 frs1 = check_nanbox_s(env, rs1);
>      return float32_to_float64(frs1, &env->fp_status);
>  }
>
> @@ -539,14 +540,14 @@ uint64_t helper_fcvt_h_lu(CPURISCVState *env, target_ulong rs1)
>
>  uint64_t helper_fcvt_h_s(CPURISCVState *env, uint64_t rs1)
>  {
> -    float32 frs1 = check_nanbox_s(rs1);
> +    float32 frs1 = check_nanbox_s(env, rs1);
>      return nanbox_h(float32_to_float16(frs1, true, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_s_h(CPURISCVState *env, uint64_t rs1)
>  {
>      float16 frs1 = check_nanbox_h(rs1);
> -    return nanbox_s(float16_to_float32(frs1, true, &env->fp_status));
> +    return nanbox_s(env, float16_to_float32(frs1, true, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_h_d(CPURISCVState *env, uint64_t rs1)
> diff --git a/target/riscv/helper.h b/target/riscv/helper.h
> index 72cc2582f4..89195aad9d 100644
> --- a/target/riscv/helper.h
> +++ b/target/riscv/helper.h
> @@ -38,7 +38,7 @@ DEF_HELPER_FLAGS_2(fcvt_s_w, TCG_CALL_NO_RWG, i64, env, tl)
>  DEF_HELPER_FLAGS_2(fcvt_s_wu, TCG_CALL_NO_RWG, i64, env, tl)
>  DEF_HELPER_FLAGS_2(fcvt_s_l, TCG_CALL_NO_RWG, i64, env, tl)
>  DEF_HELPER_FLAGS_2(fcvt_s_lu, TCG_CALL_NO_RWG, i64, env, tl)
> -DEF_HELPER_FLAGS_1(fclass_s, TCG_CALL_NO_RWG_SE, tl, i64)
> +DEF_HELPER_FLAGS_2(fclass_s, TCG_CALL_NO_RWG_SE, tl, env, i64)
>
>  /* Floating Point - Double Precision */
>  DEF_HELPER_FLAGS_3(fadd_d, TCG_CALL_NO_RWG, i64, env, i64, i64)
> diff --git a/target/riscv/insn_trans/trans_rvf.c.inc b/target/riscv/insn_trans/trans_rvf.c.inc
> index 0aac87f7db..a1d3eb52ad 100644
> --- a/target/riscv/insn_trans/trans_rvf.c.inc
> +++ b/target/riscv/insn_trans/trans_rvf.c.inc
> @@ -20,7 +20,14 @@
>
>  #define REQUIRE_FPU do {\
>      if (ctx->mstatus_fs == 0) \
> -        return false;                       \
> +        if (!ctx->cfg_ptr->ext_zfinx) \
> +            return false; \
> +} while (0)
> +
> +#define REQUIRE_ZFINX_OR_F(ctx) do {\
> +    if (!ctx->cfg_ptr->ext_zfinx) { \
> +        REQUIRE_EXT(ctx, RVF); \
> +    } \
>  } while (0)
>
>  static bool trans_flw(DisasContext *ctx, arg_flw *a)
> @@ -55,10 +62,16 @@ static bool trans_fsw(DisasContext *ctx, arg_fsw *a)
>  static bool trans_fmadd_s(DisasContext *ctx, arg_fmadd_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
> +
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fmadd_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fmadd_s(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -66,10 +79,16 @@ static bool trans_fmadd_s(DisasContext *ctx, arg_fmadd_s *a)
>  static bool trans_fmsub_s(DisasContext *ctx, arg_fmsub_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
> +
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fmsub_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fmsub_s(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -77,10 +96,16 @@ static bool trans_fmsub_s(DisasContext *ctx, arg_fmsub_s *a)
>  static bool trans_fnmsub_s(DisasContext *ctx, arg_fnmsub_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
> +
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fnmsub_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fnmsub_s(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -88,10 +113,16 @@ static bool trans_fnmsub_s(DisasContext *ctx, arg_fnmsub_s *a)
>  static bool trans_fnmadd_s(DisasContext *ctx, arg_fnmadd_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
> +
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fnmadd_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fnmadd_s(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -99,11 +130,15 @@ static bool trans_fnmadd_s(DisasContext *ctx, arg_fnmadd_s *a)
>  static bool trans_fadd_s(DisasContext *ctx, arg_fadd_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fadd_s(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_fadd_s(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -111,11 +146,15 @@ static bool trans_fadd_s(DisasContext *ctx, arg_fadd_s *a)
>  static bool trans_fsub_s(DisasContext *ctx, arg_fsub_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fsub_s(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_fsub_s(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -123,11 +162,15 @@ static bool trans_fsub_s(DisasContext *ctx, arg_fsub_s *a)
>  static bool trans_fmul_s(DisasContext *ctx, arg_fmul_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fmul_s(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_fmul_s(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -135,11 +178,15 @@ static bool trans_fmul_s(DisasContext *ctx, arg_fmul_s *a)
>  static bool trans_fdiv_s(DisasContext *ctx, arg_fdiv_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fdiv_s(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_fdiv_s(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -147,10 +194,14 @@ static bool trans_fdiv_s(DisasContext *ctx, arg_fdiv_s *a)
>  static bool trans_fsqrt_s(DisasContext *ctx, arg_fsqrt_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fsqrt_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fsqrt_s(dest, cpu_env, src1);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -158,22 +209,37 @@ static bool trans_fsqrt_s(DisasContext *ctx, arg_fsqrt_s *a)
>  static bool trans_fsgnj_s(DisasContext *ctx, arg_fsgnj_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      if (a->rs1 == a->rs2) { /* FMOV */
> -        gen_check_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rs1]);
> +        if (!ctx->cfg_ptr->ext_zfinx) {
> +            gen_check_nanbox_s(dest, src1);
> +        } else {
> +            tcg_gen_ext32s_i64(dest, src1);
> +        }
>      } else { /* FSGNJ */
> -        TCGv_i64 rs1 = tcg_temp_new_i64();
> -        TCGv_i64 rs2 = tcg_temp_new_i64();
> -
> -        gen_check_nanbox_s(rs1, cpu_fpr[a->rs1]);
> -        gen_check_nanbox_s(rs2, cpu_fpr[a->rs2]);
> -
> -        /* This formulation retains the nanboxing of rs2. */
> -        tcg_gen_deposit_i64(cpu_fpr[a->rd], rs2, rs1, 0, 31);
> -        tcg_temp_free_i64(rs1);
> -        tcg_temp_free_i64(rs2);
> +        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
> +
> +        if (!ctx->cfg_ptr->ext_zfinx) {
> +            TCGv_i64 rs1 = tcg_temp_new_i64();
> +            TCGv_i64 rs2 = tcg_temp_new_i64();
> +            gen_check_nanbox_s(rs1, src1);
> +            gen_check_nanbox_s(rs2, src2);
> +
> +            /* This formulation retains the nanboxing of rs2 in normal 'F'. */
> +            tcg_gen_deposit_i64(dest, rs2, rs1, 0, 31);
> +
> +            tcg_temp_free_i64(rs1);
> +            tcg_temp_free_i64(rs2);
> +        } else {
> +            tcg_gen_deposit_i64(dest, src2, src1, 0, 31);
> +            tcg_gen_ext32s_i64(dest, dest);
> +        }
>      }
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -183,16 +249,27 @@ static bool trans_fsgnjn_s(DisasContext *ctx, arg_fsgnjn_s *a)
>      TCGv_i64 rs1, rs2, mask;
>
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
> -    rs1 = tcg_temp_new_i64();
> -    gen_check_nanbox_s(rs1, cpu_fpr[a->rs1]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
> +    rs1 = tcg_temp_new_i64();
> +    if (!ctx->cfg_ptr->ext_zfinx) {
> +        gen_check_nanbox_s(rs1, src1);
> +    } else {
> +        tcg_gen_mov_i64(rs1, src1);
> +    }
>      if (a->rs1 == a->rs2) { /* FNEG */
> -        tcg_gen_xori_i64(cpu_fpr[a->rd], rs1, MAKE_64BIT_MASK(31, 1));
> +        tcg_gen_xori_i64(dest, rs1, MAKE_64BIT_MASK(31, 1));
>      } else {
> +        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>          rs2 = tcg_temp_new_i64();
> -        gen_check_nanbox_s(rs2, cpu_fpr[a->rs2]);
> +        if (!ctx->cfg_ptr->ext_zfinx) {
> +            gen_check_nanbox_s(rs2, src2);
> +        } else {
> +            tcg_gen_mov_i64(rs2, src2);
> +        }
>
>          /*
>           * Replace bit 31 in rs1 with inverse in rs2.
> @@ -200,13 +277,17 @@ static bool trans_fsgnjn_s(DisasContext *ctx, arg_fsgnjn_s *a)
>           */
>          mask = tcg_constant_i64(~MAKE_64BIT_MASK(31, 1));
>          tcg_gen_nor_i64(rs2, rs2, mask);
> -        tcg_gen_and_i64(rs1, mask, rs1);
> -        tcg_gen_or_i64(cpu_fpr[a->rd], rs1, rs2);
> +        tcg_gen_and_i64(dest, mask, rs1);
> +        tcg_gen_or_i64(dest, dest, rs2);
>
>          tcg_temp_free_i64(rs2);
>      }
> +    /* Sign-extend instead of NaN-boxing the result when zfinx is enabled */
> +    if (ctx->cfg_ptr->ext_zfinx) {
> +        tcg_gen_ext32s_i64(dest, dest);
> +    }
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      tcg_temp_free_i64(rs1);
> -
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -216,28 +297,45 @@ static bool trans_fsgnjx_s(DisasContext *ctx, arg_fsgnjx_s *a)
>      TCGv_i64 rs1, rs2;
>
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>      rs1 = tcg_temp_new_i64();
> -    gen_check_nanbox_s(rs1, cpu_fpr[a->rs1]);
> +
> +    if (!ctx->cfg_ptr->ext_zfinx) {
> +        gen_check_nanbox_s(rs1, src1);
> +    } else {
> +        tcg_gen_mov_i64(rs1, src1);
> +    }
>
>      if (a->rs1 == a->rs2) { /* FABS */
> -        tcg_gen_andi_i64(cpu_fpr[a->rd], rs1, ~MAKE_64BIT_MASK(31, 1));
> +        tcg_gen_andi_i64(dest, rs1, ~MAKE_64BIT_MASK(31, 1));
>      } else {
> +        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>          rs2 = tcg_temp_new_i64();
> -        gen_check_nanbox_s(rs2, cpu_fpr[a->rs2]);
> +
> +        if (!ctx->cfg_ptr->ext_zfinx) {
> +            gen_check_nanbox_s(rs2, src2);
> +        } else {
> +            tcg_gen_mov_i64(rs2, src2);
> +        }
>
>          /*
>           * Xor bit 31 in rs1 with that in rs2.
>           * This formulation retains the nanboxing of rs1.
>           */
> -        tcg_gen_andi_i64(rs2, rs2, MAKE_64BIT_MASK(31, 1));
> -        tcg_gen_xor_i64(cpu_fpr[a->rd], rs1, rs2);
> +        tcg_gen_andi_i64(dest, rs2, MAKE_64BIT_MASK(31, 1));
> +        tcg_gen_xor_i64(dest, rs1, dest);
>
>          tcg_temp_free_i64(rs2);
>      }
> +    /* sign-extend the result instead of NaN-boxing it if zfinx is enabled */
> +    if (ctx->cfg_ptr->ext_zfinx) {
> +        tcg_gen_ext32s_i64(dest, dest);
> +    }
>      tcg_temp_free_i64(rs1);
> -
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -245,10 +343,14 @@ static bool trans_fsgnjx_s(DisasContext *ctx, arg_fsgnjx_s *a)
>  static bool trans_fmin_s(DisasContext *ctx, arg_fmin_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
> -    gen_helper_fmin_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                      cpu_fpr[a->rs2]);
> +    gen_helper_fmin_s(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -256,10 +358,14 @@ static bool trans_fmin_s(DisasContext *ctx, arg_fmin_s *a)
>  static bool trans_fmax_s(DisasContext *ctx, arg_fmax_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
> -    gen_helper_fmax_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                      cpu_fpr[a->rs2]);
> +    gen_helper_fmax_s(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -267,12 +373,13 @@ static bool trans_fmax_s(DisasContext *ctx, arg_fmax_s *a)
>  static bool trans_fcvt_w_s(DisasContext *ctx, arg_fcvt_w_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_w_s(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_w_s(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -280,12 +387,13 @@ static bool trans_fcvt_w_s(DisasContext *ctx, arg_fcvt_w_s *a)
>  static bool trans_fcvt_wu_s(DisasContext *ctx, arg_fcvt_wu_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_wu_s(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_wu_s(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -294,14 +402,14 @@ static bool trans_fmv_x_w(DisasContext *ctx, arg_fmv_x_w *a)
>  {
>      /* NOTE: This was FMV.X.S in an earlier version of the ISA spec! */
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> -
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>  #if defined(TARGET_RISCV64)
> -    tcg_gen_ext32s_tl(dest, cpu_fpr[a->rs1]);
> +    tcg_gen_ext32s_tl(dest, src1);
>  #else
> -    tcg_gen_extrl_i64_i32(dest, cpu_fpr[a->rs1]);
> +    tcg_gen_extrl_i64_i32(dest, src1);
>  #endif
>
>      gen_set_gpr(ctx, a->rd, dest);
> @@ -311,11 +419,13 @@ static bool trans_fmv_x_w(DisasContext *ctx, arg_fmv_x_w *a)
>  static bool trans_feq_s(DisasContext *ctx, arg_feq_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
> -    gen_helper_feq_s(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_feq_s(dest, cpu_env, src1, src2);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -323,11 +433,13 @@ static bool trans_feq_s(DisasContext *ctx, arg_feq_s *a)
>  static bool trans_flt_s(DisasContext *ctx, arg_flt_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
> -    gen_helper_flt_s(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_flt_s(dest, cpu_env, src1, src2);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -335,11 +447,13 @@ static bool trans_flt_s(DisasContext *ctx, arg_flt_s *a)
>  static bool trans_fle_s(DisasContext *ctx, arg_fle_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
> -    gen_helper_fle_s(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_fle_s(dest, cpu_env, src1, src2);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -347,11 +461,12 @@ static bool trans_fle_s(DisasContext *ctx, arg_fle_s *a)
>  static bool trans_fclass_s(DisasContext *ctx, arg_fclass_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
> -    gen_helper_fclass_s(dest, cpu_fpr[a->rs1]);
> +    gen_helper_fclass_s(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -359,13 +474,14 @@ static bool trans_fclass_s(DisasContext *ctx, arg_fclass_s *a)
>  static bool trans_fcvt_s_w(DisasContext *ctx, arg_fcvt_s_w *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv src = get_gpr(ctx, a->rs1, EXT_SIGN);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_s_w(cpu_fpr[a->rd], cpu_env, src);
> -
> +    gen_helper_fcvt_s_w(dest, cpu_env, src);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -373,13 +489,14 @@ static bool trans_fcvt_s_w(DisasContext *ctx, arg_fcvt_s_w *a)
>  static bool trans_fcvt_s_wu(DisasContext *ctx, arg_fcvt_s_wu *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv src = get_gpr(ctx, a->rs1, EXT_ZERO);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_s_wu(cpu_fpr[a->rd], cpu_env, src);
> -
> +    gen_helper_fcvt_s_wu(dest, cpu_env, src);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -388,13 +505,14 @@ static bool trans_fmv_w_x(DisasContext *ctx, arg_fmv_w_x *a)
>  {
>      /* NOTE: This was FMV.S.X in an earlier version of the ISA spec! */
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv src = get_gpr(ctx, a->rs1, EXT_ZERO);
>
> -    tcg_gen_extu_tl_i64(cpu_fpr[a->rd], src);
> -    gen_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rd]);
> -
> +    tcg_gen_extu_tl_i64(dest, src);
> +    gen_nanbox_s(dest, dest);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -403,12 +521,13 @@ static bool trans_fcvt_l_s(DisasContext *ctx, arg_fcvt_l_s *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_l_s(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_l_s(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -417,12 +536,13 @@ static bool trans_fcvt_lu_s(DisasContext *ctx, arg_fcvt_lu_s *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_lu_s(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_lu_s(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -431,13 +551,14 @@ static bool trans_fcvt_s_l(DisasContext *ctx, arg_fcvt_s_l *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv src = get_gpr(ctx, a->rs1, EXT_SIGN);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_s_l(cpu_fpr[a->rd], cpu_env, src);
> -
> +    gen_helper_fcvt_s_l(dest, cpu_env, src);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -446,13 +567,14 @@ static bool trans_fcvt_s_lu(DisasContext *ctx, arg_fcvt_s_lu *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVF);
> +    REQUIRE_ZFINX_OR_F(ctx);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv src = get_gpr(ctx, a->rs1, EXT_ZERO);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_s_lu(cpu_fpr[a->rd], cpu_env, src);
> -
> +    gen_helper_fcvt_s_lu(dest, cpu_env, src);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> diff --git a/target/riscv/internals.h b/target/riscv/internals.h
> index 065e8162a2..6237bb3115 100644
> --- a/target/riscv/internals.h
> +++ b/target/riscv/internals.h
> @@ -46,13 +46,23 @@ enum {
>      RISCV_FRM_ROD = 8,  /* Round to Odd */
>  };
>
> -static inline uint64_t nanbox_s(float32 f)
> +static inline uint64_t nanbox_s(CPURISCVState *env, float32 f)
>  {
> -    return f | MAKE_64BIT_MASK(32, 32);
> +    /* the value is sign-extended instead of NaN-boxed for zfinx */
> +    if (RISCV_CPU(env_cpu(env))->cfg.ext_zfinx) {
> +        return (int32_t)f;
> +    } else {
> +        return f | MAKE_64BIT_MASK(32, 32);
> +    }
>  }
>
> -static inline float32 check_nanbox_s(uint64_t f)
> +static inline float32 check_nanbox_s(CPURISCVState *env, uint64_t f)
>  {
> +    /* Skip the NaN-boxing check when zfinx is enabled */
> +    if (RISCV_CPU(env_cpu(env))->cfg.ext_zfinx) {
> +        return (uint32_t)f;
> +    }
> +
>      uint64_t mask = MAKE_64BIT_MASK(32, 32);
>
>      if (likely((f & mask) == mask)) {
> diff --git a/target/riscv/translate.c b/target/riscv/translate.c
> index c7232de326..10cf37be41 100644
> --- a/target/riscv/translate.c
> +++ b/target/riscv/translate.c
> @@ -101,6 +101,9 @@ typedef struct DisasContext {
>      TCGv zero;
>      /* Space for 3 operands plus 1 extra for address computation. */
>      TCGv temp[4];
> +    /* Space for 4 operands (1 dest, <= 3 src) for floating-point ops */
> +    TCGv_i64 ftemp[4];
> +    uint8_t nftemp;
>      /* PointerMasking extension */
>      bool pm_mask_enabled;
>      bool pm_base_enabled;
> @@ -380,6 +383,86 @@ static void gen_set_gpr128(DisasContext *ctx, int reg_num, TCGv rl, TCGv rh)
>      }
>  }
>
> +static TCGv_i64 ftemp_new(DisasContext *ctx)
> +{
> +    assert(ctx->nftemp < ARRAY_SIZE(ctx->ftemp));
> +    return ctx->ftemp[ctx->nftemp++] = tcg_temp_new_i64();
> +}
> +
> +static TCGv_i64 get_fpr_hs(DisasContext *ctx, int reg_num)
> +{
> +    if (!ctx->cfg_ptr->ext_zfinx) {
> +        return cpu_fpr[reg_num];
> +    }
> +
> +    if (reg_num == 0) {
> +        return tcg_constant_i64(0);
> +    }
> +    switch (get_xl(ctx)) {
> +    case MXL_RV32:
> +#ifdef TARGET_RISCV32
> +    {
> +        TCGv_i64 t = ftemp_new(ctx);
> +        tcg_gen_ext_i32_i64(t, cpu_gpr[reg_num]);
> +        return t;
> +    }
> +#else
> +    /* fall through */
> +    case MXL_RV64:
> +        return cpu_gpr[reg_num];
> +#endif
> +    default:
> +        g_assert_not_reached();
> +    }
> +}
> +
> +static TCGv_i64 dest_fpr(DisasContext *ctx, int reg_num)
> +{
> +    if (!ctx->cfg_ptr->ext_zfinx) {
> +        return cpu_fpr[reg_num];
> +    }
> +
> +    if (reg_num == 0) {
> +        return ftemp_new(ctx);
> +    }
> +
> +    switch (get_xl(ctx)) {
> +    case MXL_RV32:
> +        return ftemp_new(ctx);
> +#ifdef TARGET_RISCV64
> +    case MXL_RV64:
> +        return cpu_gpr[reg_num];
> +#endif
> +    default:
> +        g_assert_not_reached();
> +    }
> +}
> +
> +/* assume t is NaN-boxed (normal case) or sign-extended (zfinx) */
> +static void gen_set_fpr_hs(DisasContext *ctx, int reg_num, TCGv_i64 t)
> +{
> +    if (!ctx->cfg_ptr->ext_zfinx) {
> +        tcg_gen_mov_i64(cpu_fpr[reg_num], t);
> +        return;
> +    }
> +    if (reg_num != 0) {
> +        switch (get_xl(ctx)) {
> +        case MXL_RV32:
> +#ifdef TARGET_RISCV32
> +            tcg_gen_extrl_i64_i32(cpu_gpr[reg_num], t);
> +            break;
> +#else
> +        /* fall through */
> +        case MXL_RV64:
> +            tcg_gen_mov_i64(cpu_gpr[reg_num], t);
> +            break;
> +#endif
> +        default:
> +            g_assert_not_reached();
> +        }
> +    }
> +}
> +
>  static void gen_jal(DisasContext *ctx, int rd, target_ulong imm)
>  {
>      target_ulong next_pc;
> @@ -955,6 +1038,8 @@ static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
>      ctx->cs = cs;
>      ctx->ntemp = 0;
>      memset(ctx->temp, 0, sizeof(ctx->temp));
> +    ctx->nftemp = 0;
> +    memset(ctx->ftemp, 0, sizeof(ctx->ftemp));
>      ctx->pm_mask_enabled = FIELD_EX32(tb_flags, TB_FLAGS, PM_MASK_ENABLED);
>      ctx->pm_base_enabled = FIELD_EX32(tb_flags, TB_FLAGS, PM_BASE_ENABLED);
>      ctx->zero = tcg_constant_tl(0);
> @@ -976,16 +1061,22 @@ static void riscv_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
>      DisasContext *ctx = container_of(dcbase, DisasContext, base);
>      CPURISCVState *env = cpu->env_ptr;
>      uint16_t opcode16 = translator_lduw(env, &ctx->base, ctx->base.pc_next);
> +    int i;
>
>      ctx->ol = ctx->xl;
>      decode_opc(env, ctx, opcode16);
>      ctx->base.pc_next = ctx->pc_succ_insn;
>
> -    for (int i = ctx->ntemp - 1; i >= 0; --i) {
> +    for (i = ctx->ntemp - 1; i >= 0; --i) {
>          tcg_temp_free(ctx->temp[i]);
>          ctx->temp[i] = NULL;
>      }
>      ctx->ntemp = 0;
> +    for (i = ctx->nftemp - 1; i >= 0; --i) {
> +        tcg_temp_free_i64(ctx->ftemp[i]);
> +        ctx->ftemp[i] = NULL;
> +    }
> +    ctx->nftemp = 0;
>
>      if (ctx->base.is_jmp == DISAS_NEXT) {
>          target_ulong page_start;
> --
> 2.17.1
>
>
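
The internals.h hunk quoted above switches between NaN-boxing and sign extension depending on whether zfinx is enabled. The following is a minimal standalone sketch of that difference, not the QEMU code itself: box_f32(), unbox_f32() and demo_boxing() are hypothetical names, and a plain bool stands in for the RISCV_CPU(env_cpu(env))->cfg.ext_zfinx lookup used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define UPPER32 0xffffffff00000000ULL

/* Hypothetical stand-in for nanbox_s(); 'zfinx' replaces the cfg lookup. */
static uint64_t box_f32(bool zfinx, uint32_t f)
{
    if (zfinx) {
        /* Zfinx: sign-extend bit 31 of the 32-bit pattern to 64 bits. */
        return (f & 0x80000000u) ? ((uint64_t)f | UPPER32) : (uint64_t)f;
    }
    /* F extension: NaN-box by setting the upper 32 bits to all ones. */
    return (uint64_t)f | UPPER32;
}

/* Hypothetical stand-in for check_nanbox_s(). */
static uint32_t unbox_f32(bool zfinx, uint64_t r)
{
    if (zfinx) {
        /* Zfinx: no NaN-box check, just take the low 32 bits. */
        return (uint32_t)r;
    }
    if ((r & UPPER32) == UPPER32) {
        return (uint32_t)r;
    }
    /* Not properly boxed: treat the input as the default quiet NaN. */
    return 0x7fc00000u;
}

/* Small usage example: 1.0f is 0x3f800000, -1.0f is 0xbf800000. */
static void demo_boxing(void)
{
    assert(box_f32(false, 0x3f800000u) == 0xffffffff3f800000ULL);
    assert(box_f32(true, 0x3f800000u) == 0x000000003f800000ULL);
    assert(box_f32(true, 0xbf800000u) == 0xffffffffbf800000ULL);
    /* An unboxed value reads as a NaN with F, but as itself with zfinx. */
    assert(unbox_f32(false, 0x000000003f800000ULL) == 0x7fc00000u);
    assert(unbox_f32(true, 0x000000003f800000ULL) == 0x3f800000u);
}
```

With the flag off, a positive single such as 1.0f is stored with all-ones upper bits, while with zfinx the upper bits simply follow the sign bit, which matches how other 32-bit values sit in a 64-bit integer register.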



* Re: [PATCH v6 4/6] target/riscv: add support for zdinx
  2022-02-11  4:39   ` Weiwei Li
@ 2022-02-28  4:01     ` Alistair Francis
  -1 siblings, 0 replies; 22+ messages in thread
From: Alistair Francis @ 2022-02-28  4:01 UTC (permalink / raw)
  To: Weiwei Li
  Cc: Wei Wu (吴伟),
	open list:RISC-V, wangjunqiang, Bin Meng, Richard Henderson,
	qemu-devel@nongnu.org Developers, ardxwe, Palmer Dabbelt,
	Alistair Francis

On Fri, Feb 11, 2022 at 2:45 PM Weiwei Li <liweiwei@iscas.ac.cn> wrote:
>
>   -- update extension check REQUIRE_ZDINX_OR_D
>   -- update double-precision floating-point register read/write
>
> Co-authored-by: ardxwe <ardxwe@gmail.com>
> Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
> Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alistair Francis <alistair.francis@wdc.com>

Alistair
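
For readers following the diff below: the REQUIRE_EVEN macro rejects odd register numbers when zdinx is enabled on RV32, and callers pass the bitwise OR of all register numbers so that one test of bit 0 covers every operand. A minimal sketch of that condition as a plain function, assuming hypothetical names (regs_ok_for_double and its bool parameters stand in for ctx->cfg_ptr->ext_zdinx and get_xl(ctx) == MXL_RV32):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical mirror of the REQUIRE_EVEN condition: returns false where
 * the macro would make the translator return false (illegal instruction).
 * 'regs_ored' is the OR of every register number to be checked; bit 0 of
 * the OR is set if and only if at least one register number is odd.
 */
static bool regs_ok_for_double(bool zdinx, bool rv32, unsigned regs_ored)
{
    if (zdinx && rv32 && (regs_ored & 0x1)) {
        return false;
    }
    return true;
}

/* Small usage example with register numbers as they appear in a->rd etc. */
static void demo_even_check(void)
{
    /* All even: accepted. */
    assert(regs_ok_for_double(true, true, 10 | 12 | 14));
    /* One odd operand is enough to reject the instruction. */
    assert(!regs_ok_for_double(true, true, 10 | 11 | 14));
    /* The restriction applies only to zdinx on RV32. */
    assert(regs_ok_for_double(true, false, 11));
    assert(regs_ok_for_double(false, true, 11));
}
```

OR-ing the operands first keeps each trans_* function to a single macro invocation, which is the "combine register pair check" change listed for v4 in the cover letter.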

> ---
>  target/riscv/insn_trans/trans_rvd.c.inc | 285 +++++++++++++++++-------
>  target/riscv/translate.c                |  52 +++++
>  2 files changed, 259 insertions(+), 78 deletions(-)
>
> diff --git a/target/riscv/insn_trans/trans_rvd.c.inc b/target/riscv/insn_trans/trans_rvd.c.inc
> index 091ed3a8ad..1397c1ce1c 100644
> --- a/target/riscv/insn_trans/trans_rvd.c.inc
> +++ b/target/riscv/insn_trans/trans_rvd.c.inc
> @@ -18,6 +18,19 @@
>   * this program.  If not, see <http://www.gnu.org/licenses/>.
>   */
>
> +#define REQUIRE_ZDINX_OR_D(ctx) do { \
> +    if (!ctx->cfg_ptr->ext_zdinx) { \
> +        REQUIRE_EXT(ctx, RVD); \
> +    } \
> +} while (0)
> +
> +#define REQUIRE_EVEN(ctx, reg) do { \
> +    if (ctx->cfg_ptr->ext_zdinx && (get_xl(ctx) == MXL_RV32) && \
> +        ((reg) & 0x1)) { \
> +        return false; \
> +    } \
> +} while (0)
> +
>  static bool trans_fld(DisasContext *ctx, arg_fld *a)
>  {
>      TCGv addr;
> @@ -47,10 +60,17 @@ static bool trans_fsd(DisasContext *ctx, arg_fsd *a)
>  static bool trans_fmadd_d(DisasContext *ctx, arg_fmadd_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2 | a->rs3);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_d(ctx, a->rs3);
> +
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fmadd_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fmadd_d(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -58,10 +78,17 @@ static bool trans_fmadd_d(DisasContext *ctx, arg_fmadd_d *a)
>  static bool trans_fmsub_d(DisasContext *ctx, arg_fmsub_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2 | a->rs3);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_d(ctx, a->rs3);
> +
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fmsub_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fmsub_d(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -69,10 +96,17 @@ static bool trans_fmsub_d(DisasContext *ctx, arg_fmsub_d *a)
>  static bool trans_fnmsub_d(DisasContext *ctx, arg_fnmsub_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2 | a->rs3);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_d(ctx, a->rs3);
> +
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fnmsub_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fnmsub_d(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -80,10 +114,17 @@ static bool trans_fnmsub_d(DisasContext *ctx, arg_fnmsub_d *a)
>  static bool trans_fnmadd_d(DisasContext *ctx, arg_fnmadd_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2 | a->rs3);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_d(ctx, a->rs3);
> +
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fnmadd_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fnmadd_d(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -91,12 +132,16 @@ static bool trans_fnmadd_d(DisasContext *ctx, arg_fnmadd_d *a)
>  static bool trans_fadd_d(DisasContext *ctx, arg_fadd_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
>
> -    gen_set_rm(ctx, a->rm);
> -    gen_helper_fadd_d(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
>
> +    gen_set_rm(ctx, a->rm);
> +    gen_helper_fadd_d(dest, cpu_env, src1, src2);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -104,12 +149,16 @@ static bool trans_fadd_d(DisasContext *ctx, arg_fadd_d *a)
>  static bool trans_fsub_d(DisasContext *ctx, arg_fsub_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
>
> -    gen_set_rm(ctx, a->rm);
> -    gen_helper_fsub_d(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
>
> +    gen_set_rm(ctx, a->rm);
> +    gen_helper_fsub_d(dest, cpu_env, src1, src2);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -117,12 +166,16 @@ static bool trans_fsub_d(DisasContext *ctx, arg_fsub_d *a)
>  static bool trans_fmul_d(DisasContext *ctx, arg_fmul_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
>
> -    gen_set_rm(ctx, a->rm);
> -    gen_helper_fmul_d(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
>
> +    gen_set_rm(ctx, a->rm);
> +    gen_helper_fmul_d(dest, cpu_env, src1, src2);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -130,12 +183,16 @@ static bool trans_fmul_d(DisasContext *ctx, arg_fmul_d *a)
>  static bool trans_fdiv_d(DisasContext *ctx, arg_fdiv_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
>
> -    gen_set_rm(ctx, a->rm);
> -    gen_helper_fdiv_d(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
>
> +    gen_set_rm(ctx, a->rm);
> +    gen_helper_fdiv_d(dest, cpu_env, src1, src2);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -143,23 +200,34 @@ static bool trans_fdiv_d(DisasContext *ctx, arg_fdiv_d *a)
>  static bool trans_fsqrt_d(DisasContext *ctx, arg_fsqrt_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1);
>
> -    gen_set_rm(ctx, a->rm);
> -    gen_helper_fsqrt_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
>
> +    gen_set_rm(ctx, a->rm);
> +    gen_helper_fsqrt_d(dest, cpu_env, src1);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
>
>  static bool trans_fsgnj_d(DisasContext *ctx, arg_fsgnj_d *a)
>  {
> +    REQUIRE_FPU;
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      if (a->rs1 == a->rs2) { /* FMOV */
> -        tcg_gen_mov_i64(cpu_fpr[a->rd], cpu_fpr[a->rs1]);
> +        dest = get_fpr_d(ctx, a->rs1);
>      } else {
> -        tcg_gen_deposit_i64(cpu_fpr[a->rd], cpu_fpr[a->rs2],
> -                            cpu_fpr[a->rs1], 0, 63);
> +        TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +        TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
> +        tcg_gen_deposit_i64(dest, src2, src1, 0, 63);
>      }
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -167,15 +235,22 @@ static bool trans_fsgnj_d(DisasContext *ctx, arg_fsgnj_d *a)
>  static bool trans_fsgnjn_d(DisasContext *ctx, arg_fsgnjn_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +
>      if (a->rs1 == a->rs2) { /* FNEG */
> -        tcg_gen_xori_i64(cpu_fpr[a->rd], cpu_fpr[a->rs1], INT64_MIN);
> +        tcg_gen_xori_i64(dest, src1, INT64_MIN);
>      } else {
> +        TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
>          TCGv_i64 t0 = tcg_temp_new_i64();
> -        tcg_gen_not_i64(t0, cpu_fpr[a->rs2]);
> -        tcg_gen_deposit_i64(cpu_fpr[a->rd], t0, cpu_fpr[a->rs1], 0, 63);
> +        tcg_gen_not_i64(t0, src2);
> +        tcg_gen_deposit_i64(dest, t0, src1, 0, 63);
>          tcg_temp_free_i64(t0);
>      }
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -183,15 +258,22 @@ static bool trans_fsgnjn_d(DisasContext *ctx, arg_fsgnjn_d *a)
>  static bool trans_fsgnjx_d(DisasContext *ctx, arg_fsgnjx_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +
>      if (a->rs1 == a->rs2) { /* FABS */
> -        tcg_gen_andi_i64(cpu_fpr[a->rd], cpu_fpr[a->rs1], ~INT64_MIN);
> +        tcg_gen_andi_i64(dest, src1, ~INT64_MIN);
>      } else {
> +        TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
>          TCGv_i64 t0 = tcg_temp_new_i64();
> -        tcg_gen_andi_i64(t0, cpu_fpr[a->rs2], INT64_MIN);
> -        tcg_gen_xor_i64(cpu_fpr[a->rd], cpu_fpr[a->rs1], t0);
> +        tcg_gen_andi_i64(t0, src2, INT64_MIN);
> +        tcg_gen_xor_i64(dest, src1, t0);
>          tcg_temp_free_i64(t0);
>      }
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -199,11 +281,15 @@ static bool trans_fsgnjx_d(DisasContext *ctx, arg_fsgnjx_d *a)
>  static bool trans_fmin_d(DisasContext *ctx, arg_fmin_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
>
> -    gen_helper_fmin_d(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
>
> +    gen_helper_fmin_d(dest, cpu_env, src1, src2);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -211,11 +297,15 @@ static bool trans_fmin_d(DisasContext *ctx, arg_fmin_d *a)
>  static bool trans_fmax_d(DisasContext *ctx, arg_fmax_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
>
> -    gen_helper_fmax_d(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
>
> +    gen_helper_fmax_d(dest, cpu_env, src1, src2);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -223,11 +313,15 @@ static bool trans_fmax_d(DisasContext *ctx, arg_fmax_d *a)
>  static bool trans_fcvt_s_d(DisasContext *ctx, arg_fcvt_s_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rs1);
>
> -    gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_s_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
>
> +    gen_set_rm(ctx, a->rm);
> +    gen_helper_fcvt_s_d(dest, cpu_env, src1);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -235,11 +329,15 @@ static bool trans_fcvt_s_d(DisasContext *ctx, arg_fcvt_s_d *a)
>  static bool trans_fcvt_d_s(DisasContext *ctx, arg_fcvt_d_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd);
>
> -    gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_d_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
> +    gen_set_rm(ctx, a->rm);
> +    gen_helper_fcvt_d_s(dest, cpu_env, src1);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -247,11 +345,14 @@ static bool trans_fcvt_d_s(DisasContext *ctx, arg_fcvt_d_s *a)
>  static bool trans_feq_d(DisasContext *ctx, arg_feq_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rs1 | a->rs2);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
>
> -    gen_helper_feq_d(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_feq_d(dest, cpu_env, src1, src2);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -259,11 +360,14 @@ static bool trans_feq_d(DisasContext *ctx, arg_feq_d *a)
>  static bool trans_flt_d(DisasContext *ctx, arg_flt_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rs1 | a->rs2);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
>
> -    gen_helper_flt_d(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_flt_d(dest, cpu_env, src1, src2);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -271,11 +375,14 @@ static bool trans_flt_d(DisasContext *ctx, arg_flt_d *a)
>  static bool trans_fle_d(DisasContext *ctx, arg_fle_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rs1 | a->rs2);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
>
> -    gen_helper_fle_d(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_fle_d(dest, cpu_env, src1, src2);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -283,11 +390,13 @@ static bool trans_fle_d(DisasContext *ctx, arg_fle_d *a)
>  static bool trans_fclass_d(DisasContext *ctx, arg_fclass_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rs1);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
>
> -    gen_helper_fclass_d(dest, cpu_fpr[a->rs1]);
> +    gen_helper_fclass_d(dest, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -295,12 +404,14 @@ static bool trans_fclass_d(DisasContext *ctx, arg_fclass_d *a)
>  static bool trans_fcvt_w_d(DisasContext *ctx, arg_fcvt_w_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rs1);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_w_d(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_w_d(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -308,12 +419,14 @@ static bool trans_fcvt_w_d(DisasContext *ctx, arg_fcvt_w_d *a)
>  static bool trans_fcvt_wu_d(DisasContext *ctx, arg_fcvt_wu_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rs1);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_wu_d(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_wu_d(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -321,12 +434,15 @@ static bool trans_fcvt_wu_d(DisasContext *ctx, arg_fcvt_wu_d *a)
>  static bool trans_fcvt_d_w(DisasContext *ctx, arg_fcvt_d_w *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv src = get_gpr(ctx, a->rs1, EXT_SIGN);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_d_w(cpu_fpr[a->rd], cpu_env, src);
> +    gen_helper_fcvt_d_w(dest, cpu_env, src);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>
>      mark_fs_dirty(ctx);
>      return true;
> @@ -335,12 +451,15 @@ static bool trans_fcvt_d_w(DisasContext *ctx, arg_fcvt_d_w *a)
>  static bool trans_fcvt_d_wu(DisasContext *ctx, arg_fcvt_d_wu *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv src = get_gpr(ctx, a->rs1, EXT_ZERO);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_d_wu(cpu_fpr[a->rd], cpu_env, src);
> +    gen_helper_fcvt_d_wu(dest, cpu_env, src);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>
>      mark_fs_dirty(ctx);
>      return true;
> @@ -350,12 +469,14 @@ static bool trans_fcvt_l_d(DisasContext *ctx, arg_fcvt_l_d *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rs1);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_l_d(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_l_d(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -364,12 +485,14 @@ static bool trans_fcvt_lu_d(DisasContext *ctx, arg_fcvt_lu_d *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rs1);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_lu_d(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_lu_d(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -392,12 +515,15 @@ static bool trans_fcvt_d_l(DisasContext *ctx, arg_fcvt_d_l *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv src = get_gpr(ctx, a->rs1, EXT_SIGN);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_d_l(cpu_fpr[a->rd], cpu_env, src);
> +    gen_helper_fcvt_d_l(dest, cpu_env, src);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>
>      mark_fs_dirty(ctx);
>      return true;
> @@ -407,12 +533,15 @@ static bool trans_fcvt_d_lu(DisasContext *ctx, arg_fcvt_d_lu *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv src = get_gpr(ctx, a->rs1, EXT_ZERO);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_d_lu(cpu_fpr[a->rd], cpu_env, src);
> +    gen_helper_fcvt_d_lu(dest, cpu_env, src);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>
>      mark_fs_dirty(ctx);
>      return true;
> diff --git a/target/riscv/translate.c b/target/riscv/translate.c
> index 10cf37be41..fac998a6b5 100644
> --- a/target/riscv/translate.c
> +++ b/target/riscv/translate.c
> @@ -416,6 +416,31 @@ static TCGv_i64 get_fpr_hs(DisasContext *ctx, int reg_num)
>      }
>  }
>
> +static TCGv_i64 get_fpr_d(DisasContext *ctx, int reg_num)
> +{
> +    if (!ctx->cfg_ptr->ext_zfinx) {
> +        return cpu_fpr[reg_num];
> +    }
> +
> +    if (reg_num == 0) {
> +        return tcg_constant_i64(0);
> +    }
> +    switch (get_xl(ctx)) {
> +    case MXL_RV32:
> +    {
> +        TCGv_i64 t = ftemp_new(ctx);
> +        tcg_gen_concat_tl_i64(t, cpu_gpr[reg_num], cpu_gpr[reg_num + 1]);
> +        return t;
> +    }
> +#ifdef TARGET_RISCV64
> +    case MXL_RV64:
> +        return cpu_gpr[reg_num];
> +#endif
> +    default:
> +        g_assert_not_reached();
> +    }
> +}
> +
>  static TCGv_i64 dest_fpr(DisasContext *ctx, int reg_num)
>  {
>      if (!ctx->cfg_ptr->ext_zfinx) {
> @@ -463,6 +488,33 @@ static void gen_set_fpr_hs(DisasContext *ctx, int reg_num, TCGv_i64 t)
>      }
>  }
>
> +static void gen_set_fpr_d(DisasContext *ctx, int reg_num, TCGv_i64 t)
> +{
> +    if (!ctx->cfg_ptr->ext_zfinx) {
> +        tcg_gen_mov_i64(cpu_fpr[reg_num], t);
> +        return;
> +    }
> +
> +    if (reg_num != 0) {
> +        switch (get_xl(ctx)) {
> +        case MXL_RV32:
> +#ifdef TARGET_RISCV32
> +            tcg_gen_extr_i64_i32(cpu_gpr[reg_num], cpu_gpr[reg_num + 1], t);
> +            break;
> +#else
> +            tcg_gen_ext32s_i64(cpu_gpr[reg_num], t);
> +            tcg_gen_sari_i64(cpu_gpr[reg_num + 1], t, 32);
> +            break;
> +        case MXL_RV64:
> +            tcg_gen_mov_i64(cpu_gpr[reg_num], t);
> +            break;
> +#endif
> +        default:
> +            g_assert_not_reached();
> +        }
> +    }
> +}
> +
>  static void gen_jal(DisasContext *ctx, int rd, target_ulong imm)
>  {
>      target_ulong next_pc;
> --
> 2.17.1
>
>
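
For readers following the RV32 path of get_fpr_d() and gen_set_fpr_d() quoted above: a minimal stand-alone sketch of the register-pair layout those helpers appear to implement, with the low 32 bits in x[reg] and the high 32 bits in x[reg + 1]. The Rv32Regs type and the read_pair()/write_pair() names are hypothetical and exist only for illustration. This is plain C over an array, not TCG code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an RV32 integer register file (x0..x31). */
typedef struct {
    uint32_t x[32];
} Rv32Regs;

/* Read a 64-bit value from the pair (reg, reg + 1); reg holds the low half. */
static uint64_t read_pair(const Rv32Regs *r, unsigned reg)
{
    assert(reg < 32 && (reg & 1) == 0); /* odd numbers are rejected at decode */
    if (reg == 0) {
        return 0;                       /* the x0 pair reads as zero */
    }
    return (uint64_t)r->x[reg] | ((uint64_t)r->x[reg + 1] << 32);
}

/* Write a 64-bit value to the pair (reg, reg + 1). */
static void write_pair(Rv32Regs *r, unsigned reg, uint64_t v)
{
    assert(reg < 32 && (reg & 1) == 0);
    if (reg == 0) {
        return;                         /* writes to the x0 pair are dropped */
    }
    r->x[reg] = (uint32_t)v;            /* low 32 bits */
    r->x[reg + 1] = (uint32_t)(v >> 32);/* high 32 bits */
}
```

Under these assumptions, writing 0x4000000000000001 to the pair at x10 leaves 0x00000001 in x10 and 0x40000000 in x11, and reading the pair back returns the original value.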



* Re: [PATCH v6 4/6] target/riscv: add support for zdinx
@ 2022-02-28  4:01     ` Alistair Francis
  0 siblings, 0 replies; 22+ messages in thread
From: Alistair Francis @ 2022-02-28  4:01 UTC (permalink / raw)
  To: Weiwei Li
  Cc: Richard Henderson, Palmer Dabbelt, Alistair Francis, Bin Meng,
	open list:RISC-V, qemu-devel@nongnu.org Developers, wangjunqiang,
	Wei Wu (吴伟),
	ardxwe

On Fri, Feb 11, 2022 at 2:45 PM Weiwei Li <liweiwei@iscas.ac.cn> wrote:
>
>   -- update the extension check to REQUIRE_ZDINX_OR_D
>   -- update double-precision floating-point register read/write
>
> Co-authored-by: ardxwe <ardxwe@gmail.com>
> Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
> Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alistair Francis <alistair.francis@wdc.com>

Alistair

> ---
>  target/riscv/insn_trans/trans_rvd.c.inc | 285 +++++++++++++++++-------
>  target/riscv/translate.c                |  52 +++++
>  2 files changed, 259 insertions(+), 78 deletions(-)
>
> diff --git a/target/riscv/insn_trans/trans_rvd.c.inc b/target/riscv/insn_trans/trans_rvd.c.inc
> index 091ed3a8ad..1397c1ce1c 100644
> --- a/target/riscv/insn_trans/trans_rvd.c.inc
> +++ b/target/riscv/insn_trans/trans_rvd.c.inc
> @@ -18,6 +18,19 @@
>   * this program.  If not, see <http://www.gnu.org/licenses/>.
>   */
>
> +#define REQUIRE_ZDINX_OR_D(ctx) do { \
> +    if (!ctx->cfg_ptr->ext_zdinx) { \
> +        REQUIRE_EXT(ctx, RVD); \
> +    } \
> +} while (0)
> +
> +#define REQUIRE_EVEN(ctx, reg) do { \
> +    if (ctx->cfg_ptr->ext_zdinx && (get_xl(ctx) == MXL_RV32) && \
> +        ((reg) & 0x1)) { \
> +        return false; \
> +    } \
> +} while (0)
> +
>  static bool trans_fld(DisasContext *ctx, arg_fld *a)
>  {
>      TCGv addr;
> @@ -47,10 +60,17 @@ static bool trans_fsd(DisasContext *ctx, arg_fsd *a)
>  static bool trans_fmadd_d(DisasContext *ctx, arg_fmadd_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2 | a->rs3);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_d(ctx, a->rs3);
> +
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fmadd_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fmadd_d(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -58,10 +78,17 @@ static bool trans_fmadd_d(DisasContext *ctx, arg_fmadd_d *a)
>  static bool trans_fmsub_d(DisasContext *ctx, arg_fmsub_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2 | a->rs3);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_d(ctx, a->rs3);
> +
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fmsub_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fmsub_d(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -69,10 +96,17 @@ static bool trans_fmsub_d(DisasContext *ctx, arg_fmsub_d *a)
>  static bool trans_fnmsub_d(DisasContext *ctx, arg_fnmsub_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2 | a->rs3);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_d(ctx, a->rs3);
> +
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fnmsub_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fnmsub_d(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -80,10 +114,17 @@ static bool trans_fnmsub_d(DisasContext *ctx, arg_fnmsub_d *a)
>  static bool trans_fnmadd_d(DisasContext *ctx, arg_fnmadd_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2 | a->rs3);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_d(ctx, a->rs3);
> +
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fnmadd_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fnmadd_d(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -91,12 +132,16 @@ static bool trans_fnmadd_d(DisasContext *ctx, arg_fnmadd_d *a)
>  static bool trans_fadd_d(DisasContext *ctx, arg_fadd_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
>
> -    gen_set_rm(ctx, a->rm);
> -    gen_helper_fadd_d(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
>
> +    gen_set_rm(ctx, a->rm);
> +    gen_helper_fadd_d(dest, cpu_env, src1, src2);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -104,12 +149,16 @@ static bool trans_fadd_d(DisasContext *ctx, arg_fadd_d *a)
>  static bool trans_fsub_d(DisasContext *ctx, arg_fsub_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
>
> -    gen_set_rm(ctx, a->rm);
> -    gen_helper_fsub_d(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
>
> +    gen_set_rm(ctx, a->rm);
> +    gen_helper_fsub_d(dest, cpu_env, src1, src2);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -117,12 +166,16 @@ static bool trans_fsub_d(DisasContext *ctx, arg_fsub_d *a)
>  static bool trans_fmul_d(DisasContext *ctx, arg_fmul_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
>
> -    gen_set_rm(ctx, a->rm);
> -    gen_helper_fmul_d(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
>
> +    gen_set_rm(ctx, a->rm);
> +    gen_helper_fmul_d(dest, cpu_env, src1, src2);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -130,12 +183,16 @@ static bool trans_fmul_d(DisasContext *ctx, arg_fmul_d *a)
>  static bool trans_fdiv_d(DisasContext *ctx, arg_fdiv_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
>
> -    gen_set_rm(ctx, a->rm);
> -    gen_helper_fdiv_d(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
>
> +    gen_set_rm(ctx, a->rm);
> +    gen_helper_fdiv_d(dest, cpu_env, src1, src2);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -143,23 +200,34 @@ static bool trans_fdiv_d(DisasContext *ctx, arg_fdiv_d *a)
>  static bool trans_fsqrt_d(DisasContext *ctx, arg_fsqrt_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1);
>
> -    gen_set_rm(ctx, a->rm);
> -    gen_helper_fsqrt_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
>
> +    gen_set_rm(ctx, a->rm);
> +    gen_helper_fsqrt_d(dest, cpu_env, src1);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
>
>  static bool trans_fsgnj_d(DisasContext *ctx, arg_fsgnj_d *a)
>  {
> +    REQUIRE_FPU;
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      if (a->rs1 == a->rs2) { /* FMOV */
> -        tcg_gen_mov_i64(cpu_fpr[a->rd], cpu_fpr[a->rs1]);
> +        dest = get_fpr_d(ctx, a->rs1);
>      } else {
> -        tcg_gen_deposit_i64(cpu_fpr[a->rd], cpu_fpr[a->rs2],
> -                            cpu_fpr[a->rs1], 0, 63);
> +        TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +        TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
> +        tcg_gen_deposit_i64(dest, src2, src1, 0, 63);
>      }
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -167,15 +235,22 @@ static bool trans_fsgnj_d(DisasContext *ctx, arg_fsgnj_d *a)
>  static bool trans_fsgnjn_d(DisasContext *ctx, arg_fsgnjn_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +
>      if (a->rs1 == a->rs2) { /* FNEG */
> -        tcg_gen_xori_i64(cpu_fpr[a->rd], cpu_fpr[a->rs1], INT64_MIN);
> +        tcg_gen_xori_i64(dest, src1, INT64_MIN);
>      } else {
> +        TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
>          TCGv_i64 t0 = tcg_temp_new_i64();
> -        tcg_gen_not_i64(t0, cpu_fpr[a->rs2]);
> -        tcg_gen_deposit_i64(cpu_fpr[a->rd], t0, cpu_fpr[a->rs1], 0, 63);
> +        tcg_gen_not_i64(t0, src2);
> +        tcg_gen_deposit_i64(dest, t0, src1, 0, 63);
>          tcg_temp_free_i64(t0);
>      }
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -183,15 +258,22 @@ static bool trans_fsgnjn_d(DisasContext *ctx, arg_fsgnjn_d *a)
>  static bool trans_fsgnjx_d(DisasContext *ctx, arg_fsgnjx_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +
>      if (a->rs1 == a->rs2) { /* FABS */
> -        tcg_gen_andi_i64(cpu_fpr[a->rd], cpu_fpr[a->rs1], ~INT64_MIN);
> +        tcg_gen_andi_i64(dest, src1, ~INT64_MIN);
>      } else {
> +        TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
>          TCGv_i64 t0 = tcg_temp_new_i64();
> -        tcg_gen_andi_i64(t0, cpu_fpr[a->rs2], INT64_MIN);
> -        tcg_gen_xor_i64(cpu_fpr[a->rd], cpu_fpr[a->rs1], t0);
> +        tcg_gen_andi_i64(t0, src2, INT64_MIN);
> +        tcg_gen_xor_i64(dest, src1, t0);
>          tcg_temp_free_i64(t0);
>      }
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -199,11 +281,15 @@ static bool trans_fsgnjx_d(DisasContext *ctx, arg_fsgnjx_d *a)
>  static bool trans_fmin_d(DisasContext *ctx, arg_fmin_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
>
> -    gen_helper_fmin_d(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
>
> +    gen_helper_fmin_d(dest, cpu_env, src1, src2);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -211,11 +297,15 @@ static bool trans_fmin_d(DisasContext *ctx, arg_fmin_d *a)
>  static bool trans_fmax_d(DisasContext *ctx, arg_fmax_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd | a->rs1 | a->rs2);
>
> -    gen_helper_fmax_d(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
>
> +    gen_helper_fmax_d(dest, cpu_env, src1, src2);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -223,11 +313,15 @@ static bool trans_fmax_d(DisasContext *ctx, arg_fmax_d *a)
>  static bool trans_fcvt_s_d(DisasContext *ctx, arg_fcvt_s_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rs1);
>
> -    gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_s_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
>
> +    gen_set_rm(ctx, a->rm);
> +    gen_helper_fcvt_s_d(dest, cpu_env, src1);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -235,11 +329,15 @@ static bool trans_fcvt_s_d(DisasContext *ctx, arg_fcvt_s_d *a)
>  static bool trans_fcvt_d_s(DisasContext *ctx, arg_fcvt_d_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd);
>
> -    gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_d_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
> +    gen_set_rm(ctx, a->rm);
> +    gen_helper_fcvt_d_s(dest, cpu_env, src1);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -247,11 +345,14 @@ static bool trans_fcvt_d_s(DisasContext *ctx, arg_fcvt_d_s *a)
>  static bool trans_feq_d(DisasContext *ctx, arg_feq_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rs1 | a->rs2);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
>
> -    gen_helper_feq_d(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_feq_d(dest, cpu_env, src1, src2);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -259,11 +360,14 @@ static bool trans_feq_d(DisasContext *ctx, arg_feq_d *a)
>  static bool trans_flt_d(DisasContext *ctx, arg_flt_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rs1 | a->rs2);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
>
> -    gen_helper_flt_d(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_flt_d(dest, cpu_env, src1, src2);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -271,11 +375,14 @@ static bool trans_flt_d(DisasContext *ctx, arg_flt_d *a)
>  static bool trans_fle_d(DisasContext *ctx, arg_fle_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rs1 | a->rs2);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_d(ctx, a->rs2);
>
> -    gen_helper_fle_d(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_fle_d(dest, cpu_env, src1, src2);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -283,11 +390,13 @@ static bool trans_fle_d(DisasContext *ctx, arg_fle_d *a)
>  static bool trans_fclass_d(DisasContext *ctx, arg_fclass_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rs1);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
>
> -    gen_helper_fclass_d(dest, cpu_fpr[a->rs1]);
> +    gen_helper_fclass_d(dest, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -295,12 +404,14 @@ static bool trans_fclass_d(DisasContext *ctx, arg_fclass_d *a)
>  static bool trans_fcvt_w_d(DisasContext *ctx, arg_fcvt_w_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rs1);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_w_d(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_w_d(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -308,12 +419,14 @@ static bool trans_fcvt_w_d(DisasContext *ctx, arg_fcvt_w_d *a)
>  static bool trans_fcvt_wu_d(DisasContext *ctx, arg_fcvt_wu_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rs1);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_wu_d(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_wu_d(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -321,12 +434,15 @@ static bool trans_fcvt_wu_d(DisasContext *ctx, arg_fcvt_wu_d *a)
>  static bool trans_fcvt_d_w(DisasContext *ctx, arg_fcvt_d_w *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv src = get_gpr(ctx, a->rs1, EXT_SIGN);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_d_w(cpu_fpr[a->rd], cpu_env, src);
> +    gen_helper_fcvt_d_w(dest, cpu_env, src);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>
>      mark_fs_dirty(ctx);
>      return true;
> @@ -335,12 +451,15 @@ static bool trans_fcvt_d_w(DisasContext *ctx, arg_fcvt_d_w *a)
>  static bool trans_fcvt_d_wu(DisasContext *ctx, arg_fcvt_d_wu *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv src = get_gpr(ctx, a->rs1, EXT_ZERO);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_d_wu(cpu_fpr[a->rd], cpu_env, src);
> +    gen_helper_fcvt_d_wu(dest, cpu_env, src);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>
>      mark_fs_dirty(ctx);
>      return true;
> @@ -350,12 +469,14 @@ static bool trans_fcvt_l_d(DisasContext *ctx, arg_fcvt_l_d *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rs1);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_l_d(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_l_d(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -364,12 +485,14 @@ static bool trans_fcvt_lu_d(DisasContext *ctx, arg_fcvt_lu_d *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rs1);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_lu_d(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_lu_d(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -392,12 +515,15 @@ static bool trans_fcvt_d_l(DisasContext *ctx, arg_fcvt_d_l *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv src = get_gpr(ctx, a->rs1, EXT_SIGN);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_d_l(cpu_fpr[a->rd], cpu_env, src);
> +    gen_helper_fcvt_d_l(dest, cpu_env, src);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>
>      mark_fs_dirty(ctx);
>      return true;
> @@ -407,12 +533,15 @@ static bool trans_fcvt_d_lu(DisasContext *ctx, arg_fcvt_d_lu *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +    REQUIRE_EVEN(ctx, a->rd);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv src = get_gpr(ctx, a->rs1, EXT_ZERO);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_d_lu(cpu_fpr[a->rd], cpu_env, src);
> +    gen_helper_fcvt_d_lu(dest, cpu_env, src);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>
>      mark_fs_dirty(ctx);
>      return true;
> diff --git a/target/riscv/translate.c b/target/riscv/translate.c
> index 10cf37be41..fac998a6b5 100644
> --- a/target/riscv/translate.c
> +++ b/target/riscv/translate.c
> @@ -416,6 +416,31 @@ static TCGv_i64 get_fpr_hs(DisasContext *ctx, int reg_num)
>      }
>  }
>
> +static TCGv_i64 get_fpr_d(DisasContext *ctx, int reg_num)
> +{
> +    if (!ctx->cfg_ptr->ext_zfinx) {
> +        return cpu_fpr[reg_num];
> +    }
> +
> +    if (reg_num == 0) {
> +        return tcg_constant_i64(0);
> +    }
> +    switch (get_xl(ctx)) {
> +    case MXL_RV32:
> +    {
> +        TCGv_i64 t = ftemp_new(ctx);
> +        tcg_gen_concat_tl_i64(t, cpu_gpr[reg_num], cpu_gpr[reg_num + 1]);
> +        return t;
> +    }
> +#ifdef TARGET_RISCV64
> +    case MXL_RV64:
> +        return cpu_gpr[reg_num];
> +#endif
> +    default:
> +        g_assert_not_reached();
> +    }
> +}
> +
>  static TCGv_i64 dest_fpr(DisasContext *ctx, int reg_num)
>  {
>      if (!ctx->cfg_ptr->ext_zfinx) {
> @@ -463,6 +488,33 @@ static void gen_set_fpr_hs(DisasContext *ctx, int reg_num, TCGv_i64 t)
>      }
>  }
>
> +static void gen_set_fpr_d(DisasContext *ctx, int reg_num, TCGv_i64 t)
> +{
> +    if (!ctx->cfg_ptr->ext_zfinx) {
> +        tcg_gen_mov_i64(cpu_fpr[reg_num], t);
> +        return;
> +    }
> +
> +    if (reg_num != 0) {
> +        switch (get_xl(ctx)) {
> +        case MXL_RV32:
> +#ifdef TARGET_RISCV32
> +            tcg_gen_extr_i64_i32(cpu_gpr[reg_num], cpu_gpr[reg_num + 1], t);
> +            break;
> +#else
> +            tcg_gen_ext32s_i64(cpu_gpr[reg_num], t);
> +            tcg_gen_sari_i64(cpu_gpr[reg_num + 1], t, 32);
> +            break;
> +        case MXL_RV64:
> +            tcg_gen_mov_i64(cpu_gpr[reg_num], t);
> +            break;
> +#endif
> +        default:
> +            g_assert_not_reached();
> +        }
> +    }
> +}
> +
>  static void gen_jal(DisasContext *ctx, int rd, target_ulong imm)
>  {
>      target_ulong next_pc;
> --
> 2.17.1
>
>
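
A note on the REQUIRE_EVEN() calls quoted above: they OR the operand register numbers together before testing bit 0, so a single test covers every operand (bit 0 of the OR is set exactly when at least one register number is odd). A minimal sketch of that idea in plain C follows. The function names and the two bool parameters are hypothetical stand-ins for ctx->cfg_ptr->ext_zdinx and the get_xl(ctx) == MXL_RV32 test, not QEMU code.

```c
#include <assert.h>
#include <stdbool.h>

/* True when every register number OR-ed into `regs` is even. */
static bool all_even(unsigned regs)
{
    return (regs & 0x1) == 0;
}

/*
 * Hypothetical decode-time check: register pairs are only needed for
 * Zdinx on RV32, so the even-number rule applies only in that case.
 */
static bool pair_operands_ok(bool zdinx, bool rv32,
                             unsigned rd, unsigned rs1, unsigned rs2)
{
    assert(rd < 32 && rs1 < 32 && rs2 < 32);
    if (!(zdinx && rv32)) {
        return true;                /* no pairing, any register is fine */
    }
    return all_even(rd | rs1 | rs2);
}
```

With Zdinx on RV32, operands (x10, x12, x14) pass and (x10, x13, x14) fail. With either condition false, the check accepts any register numbers.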



* Re: [PATCH v6 5/6] target/riscv: add support for zhinx/zhinxmin
  2022-02-11  4:39   ` Weiwei Li
@ 2022-02-28  4:09     ` Alistair Francis
  -1 siblings, 0 replies; 22+ messages in thread
From: Alistair Francis @ 2022-02-28  4:09 UTC (permalink / raw)
  To: Weiwei Li
  Cc: Wei Wu (吴伟),
	open list:RISC-V, wangjunqiang, Bin Meng, Richard Henderson,
	qemu-devel@nongnu.org Developers, ardxwe, Palmer Dabbelt,
	Alistair Francis

On Fri, Feb 11, 2022 at 2:45 PM Weiwei Li <liweiwei@iscas.ac.cn> wrote:
>
>   - update the extension checks REQUIRE_ZHINX_OR_ZFH and REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN
>   - update half-precision floating-point register read/write
>   - disable the nanbox_h check
>
> Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
> Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

Acked-by: Alistair Francis <alistair.francis@wdc.com>

Alistair
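
As background for the "disable nanbox_h check" item in the commit message: a minimal sketch of conventional NaN-boxing for a half-precision value held in a 64-bit FP register, and of how the check could be skipped when values live in integer registers. The assumptions here are the usual RISC-V rule (upper 48 bits all ones, otherwise the operand is treated as the canonical NaN 0x7e00) and a hypothetical `zfinx` flag parameter. The function names are illustrative and this is not a copy of the code in internals.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define H_BOX_MASK     0xffffffffffff0000ULL /* upper 48 bits */
#define H_DEFAULT_QNAN 0x7e00u               /* canonical half-precision NaN */

/* NaN-box a half-precision value into a 64-bit register image. */
static uint64_t box_h(uint16_t f)
{
    return (uint64_t)f | H_BOX_MASK;
}

/*
 * Extract a half-precision operand. With the hypothetical zfinx flag set
 * the upper bits are ignored; otherwise an improperly boxed value is
 * replaced by the canonical NaN.
 */
static uint16_t unbox_h(bool zfinx, uint64_t v)
{
    if (zfinx) {
        return (uint16_t)v;
    }
    return (v & H_BOX_MASK) == H_BOX_MASK ? (uint16_t)v : H_DEFAULT_QNAN;
}

/* Sanity helper: boxing followed by a checked unbox returns the input. */
static void check_roundtrip(uint16_t f)
{
    assert(unbox_h(false, box_h(f)) == f);
}
```

For example, the raw value 0x3c00 (1.0 in half precision) without the upper bits set unboxes to 0x7e00 when the check is active, and to 0x3c00 when the flag disables it.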

> ---
>  target/riscv/fpu_helper.c                 |  89 +++---
>  target/riscv/helper.h                     |   2 +-
>  target/riscv/insn_trans/trans_rvzfh.c.inc | 332 +++++++++++++++-------
>  target/riscv/internals.h                  |  16 +-
>  4 files changed, 296 insertions(+), 143 deletions(-)
>
> diff --git a/target/riscv/fpu_helper.c b/target/riscv/fpu_helper.c
> index 63ca703459..5699c9517f 100644
> --- a/target/riscv/fpu_helper.c
> +++ b/target/riscv/fpu_helper.c
> @@ -89,10 +89,11 @@ void helper_set_rod_rounding_mode(CPURISCVState *env)
>  static uint64_t do_fmadd_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2,
>                             uint64_t rs3, int flags)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> -    float16 frs2 = check_nanbox_h(rs2);
> -    float16 frs3 = check_nanbox_h(rs3);
> -    return nanbox_h(float16_muladd(frs1, frs2, frs3, flags, &env->fp_status));
> +    float16 frs1 = check_nanbox_h(env, rs1);
> +    float16 frs2 = check_nanbox_h(env, rs2);
> +    float16 frs3 = check_nanbox_h(env, rs3);
> +    return nanbox_h(env, float16_muladd(frs1, frs2, frs3, flags,
> +                                        &env->fp_status));
>  }
>
>  static uint64_t do_fmadd_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2,
> @@ -417,146 +418,146 @@ target_ulong helper_fclass_d(uint64_t frs1)
>
>  uint64_t helper_fadd_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> -    float16 frs2 = check_nanbox_h(rs2);
> -    return nanbox_h(float16_add(frs1, frs2, &env->fp_status));
> +    float16 frs1 = check_nanbox_h(env, rs1);
> +    float16 frs2 = check_nanbox_h(env, rs2);
> +    return nanbox_h(env, float16_add(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fsub_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> -    float16 frs2 = check_nanbox_h(rs2);
> -    return nanbox_h(float16_sub(frs1, frs2, &env->fp_status));
> +    float16 frs1 = check_nanbox_h(env, rs1);
> +    float16 frs2 = check_nanbox_h(env, rs2);
> +    return nanbox_h(env, float16_sub(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fmul_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> -    float16 frs2 = check_nanbox_h(rs2);
> -    return nanbox_h(float16_mul(frs1, frs2, &env->fp_status));
> +    float16 frs1 = check_nanbox_h(env, rs1);
> +    float16 frs2 = check_nanbox_h(env, rs2);
> +    return nanbox_h(env, float16_mul(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fdiv_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> -    float16 frs2 = check_nanbox_h(rs2);
> -    return nanbox_h(float16_div(frs1, frs2, &env->fp_status));
> +    float16 frs1 = check_nanbox_h(env, rs1);
> +    float16 frs2 = check_nanbox_h(env, rs2);
> +    return nanbox_h(env, float16_div(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fmin_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> -    float16 frs2 = check_nanbox_h(rs2);
> -    return nanbox_h(env->priv_ver < PRIV_VERSION_1_11_0 ?
> +    float16 frs1 = check_nanbox_h(env, rs1);
> +    float16 frs2 = check_nanbox_h(env, rs2);
> +    return nanbox_h(env, env->priv_ver < PRIV_VERSION_1_11_0 ?
>                      float16_minnum(frs1, frs2, &env->fp_status) :
>                      float16_minimum_number(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fmax_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> -    float16 frs2 = check_nanbox_h(rs2);
> -    return nanbox_h(env->priv_ver < PRIV_VERSION_1_11_0 ?
> +    float16 frs1 = check_nanbox_h(env, rs1);
> +    float16 frs2 = check_nanbox_h(env, rs2);
> +    return nanbox_h(env, env->priv_ver < PRIV_VERSION_1_11_0 ?
>                      float16_maxnum(frs1, frs2, &env->fp_status) :
>                      float16_maximum_number(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fsqrt_h(CPURISCVState *env, uint64_t rs1)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> -    return nanbox_h(float16_sqrt(frs1, &env->fp_status));
> +    float16 frs1 = check_nanbox_h(env, rs1);
> +    return nanbox_h(env, float16_sqrt(frs1, &env->fp_status));
>  }
>
>  target_ulong helper_fle_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> -    float16 frs2 = check_nanbox_h(rs2);
> +    float16 frs1 = check_nanbox_h(env, rs1);
> +    float16 frs2 = check_nanbox_h(env, rs2);
>      return float16_le(frs1, frs2, &env->fp_status);
>  }
>
>  target_ulong helper_flt_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> -    float16 frs2 = check_nanbox_h(rs2);
> +    float16 frs1 = check_nanbox_h(env, rs1);
> +    float16 frs2 = check_nanbox_h(env, rs2);
>      return float16_lt(frs1, frs2, &env->fp_status);
>  }
>
>  target_ulong helper_feq_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> -    float16 frs2 = check_nanbox_h(rs2);
> +    float16 frs1 = check_nanbox_h(env, rs1);
> +    float16 frs2 = check_nanbox_h(env, rs2);
>      return float16_eq_quiet(frs1, frs2, &env->fp_status);
>  }
>
> -target_ulong helper_fclass_h(uint64_t rs1)
> +target_ulong helper_fclass_h(CPURISCVState *env, uint64_t rs1)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> +    float16 frs1 = check_nanbox_h(env, rs1);
>      return fclass_h(frs1);
>  }
>
>  target_ulong helper_fcvt_w_h(CPURISCVState *env, uint64_t rs1)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> +    float16 frs1 = check_nanbox_h(env, rs1);
>      return float16_to_int32(frs1, &env->fp_status);
>  }
>
>  target_ulong helper_fcvt_wu_h(CPURISCVState *env, uint64_t rs1)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> +    float16 frs1 = check_nanbox_h(env, rs1);
>      return (int32_t)float16_to_uint32(frs1, &env->fp_status);
>  }
>
>  target_ulong helper_fcvt_l_h(CPURISCVState *env, uint64_t rs1)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> +    float16 frs1 = check_nanbox_h(env, rs1);
>      return float16_to_int64(frs1, &env->fp_status);
>  }
>
>  target_ulong helper_fcvt_lu_h(CPURISCVState *env, uint64_t rs1)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> +    float16 frs1 = check_nanbox_h(env, rs1);
>      return float16_to_uint64(frs1, &env->fp_status);
>  }
>
>  uint64_t helper_fcvt_h_w(CPURISCVState *env, target_ulong rs1)
>  {
> -    return nanbox_h(int32_to_float16((int32_t)rs1, &env->fp_status));
> +    return nanbox_h(env, int32_to_float16((int32_t)rs1, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_h_wu(CPURISCVState *env, target_ulong rs1)
>  {
> -    return nanbox_h(uint32_to_float16((uint32_t)rs1, &env->fp_status));
> +    return nanbox_h(env, uint32_to_float16((uint32_t)rs1, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_h_l(CPURISCVState *env, target_ulong rs1)
>  {
> -    return nanbox_h(int64_to_float16(rs1, &env->fp_status));
> +    return nanbox_h(env, int64_to_float16(rs1, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_h_lu(CPURISCVState *env, target_ulong rs1)
>  {
> -    return nanbox_h(uint64_to_float16(rs1, &env->fp_status));
> +    return nanbox_h(env, uint64_to_float16(rs1, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_h_s(CPURISCVState *env, uint64_t rs1)
>  {
>      float32 frs1 = check_nanbox_s(env, rs1);
> -    return nanbox_h(float32_to_float16(frs1, true, &env->fp_status));
> +    return nanbox_h(env, float32_to_float16(frs1, true, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_s_h(CPURISCVState *env, uint64_t rs1)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> +    float16 frs1 = check_nanbox_h(env, rs1);
>      return nanbox_s(env, float16_to_float32(frs1, true, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_h_d(CPURISCVState *env, uint64_t rs1)
>  {
> -    return nanbox_h(float64_to_float16(rs1, true, &env->fp_status));
> +    return nanbox_h(env, float64_to_float16(rs1, true, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_d_h(CPURISCVState *env, uint64_t rs1)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> +    float16 frs1 = check_nanbox_h(env, rs1);
>      return float16_to_float64(frs1, true, &env->fp_status);
>  }
> diff --git a/target/riscv/helper.h b/target/riscv/helper.h
> index 89195aad9d..26bbab2fab 100644
> --- a/target/riscv/helper.h
> +++ b/target/riscv/helper.h
> @@ -90,7 +90,7 @@ DEF_HELPER_FLAGS_2(fcvt_h_w, TCG_CALL_NO_RWG, i64, env, tl)
>  DEF_HELPER_FLAGS_2(fcvt_h_wu, TCG_CALL_NO_RWG, i64, env, tl)
>  DEF_HELPER_FLAGS_2(fcvt_h_l, TCG_CALL_NO_RWG, i64, env, tl)
>  DEF_HELPER_FLAGS_2(fcvt_h_lu, TCG_CALL_NO_RWG, i64, env, tl)
> -DEF_HELPER_FLAGS_1(fclass_h, TCG_CALL_NO_RWG_SE, tl, i64)
> +DEF_HELPER_FLAGS_2(fclass_h, TCG_CALL_NO_RWG_SE, tl, env, i64)
>
>  /* Special functions */
>  DEF_HELPER_2(csrr, tl, env, int)
> diff --git a/target/riscv/insn_trans/trans_rvzfh.c.inc b/target/riscv/insn_trans/trans_rvzfh.c.inc
> index 608c51da2c..5d07150cd0 100644
> --- a/target/riscv/insn_trans/trans_rvzfh.c.inc
> +++ b/target/riscv/insn_trans/trans_rvzfh.c.inc
> @@ -22,12 +22,25 @@
>      }                         \
>  } while (0)
>
> +#define REQUIRE_ZHINX_OR_ZFH(ctx) do { \
> +    if (!ctx->cfg_ptr->ext_zhinx && !ctx->cfg_ptr->ext_zfh) { \
> +        return false;                  \
> +    }                                  \
> +} while (0)
> +
>  #define REQUIRE_ZFH_OR_ZFHMIN(ctx) do {       \
>      if (!(ctx->cfg_ptr->ext_zfh || ctx->cfg_ptr->ext_zfhmin)) { \
>          return false;                         \
>      }                                         \
>  } while (0)
>
> +#define REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN(ctx) do { \
> +    if (!(ctx->cfg_ptr->ext_zfh || ctx->cfg_ptr->ext_zfhmin ||          \
> +          ctx->cfg_ptr->ext_zhinx || ctx->cfg_ptr->ext_zhinxmin)) {     \
> +        return false;                                        \
> +    }                                                        \
> +} while (0)
> +
>  static bool trans_flh(DisasContext *ctx, arg_flh *a)
>  {
>      TCGv_i64 dest;
> @@ -73,11 +86,16 @@ static bool trans_fsh(DisasContext *ctx, arg_fsh *a)
>  static bool trans_fmadd_h(DisasContext *ctx, arg_fmadd_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fmadd_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fmadd_h(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -85,11 +103,16 @@ static bool trans_fmadd_h(DisasContext *ctx, arg_fmadd_h *a)
>  static bool trans_fmsub_h(DisasContext *ctx, arg_fmsub_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fmsub_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fmsub_h(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -97,11 +120,16 @@ static bool trans_fmsub_h(DisasContext *ctx, arg_fmsub_h *a)
>  static bool trans_fnmsub_h(DisasContext *ctx, arg_fnmsub_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fnmsub_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fnmsub_h(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -109,11 +137,16 @@ static bool trans_fnmsub_h(DisasContext *ctx, arg_fnmsub_h *a)
>  static bool trans_fnmadd_h(DisasContext *ctx, arg_fnmadd_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fnmadd_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fnmadd_h(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -121,11 +154,15 @@ static bool trans_fnmadd_h(DisasContext *ctx, arg_fnmadd_h *a)
>  static bool trans_fadd_h(DisasContext *ctx, arg_fadd_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fadd_h(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_fadd_h(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -133,11 +170,15 @@ static bool trans_fadd_h(DisasContext *ctx, arg_fadd_h *a)
>  static bool trans_fsub_h(DisasContext *ctx, arg_fsub_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fsub_h(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_fsub_h(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -145,11 +186,15 @@ static bool trans_fsub_h(DisasContext *ctx, arg_fsub_h *a)
>  static bool trans_fmul_h(DisasContext *ctx, arg_fmul_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fmul_h(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_fmul_h(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -157,11 +202,15 @@ static bool trans_fmul_h(DisasContext *ctx, arg_fmul_h *a)
>  static bool trans_fdiv_h(DisasContext *ctx, arg_fdiv_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fdiv_h(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_fdiv_h(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -169,10 +218,14 @@ static bool trans_fdiv_h(DisasContext *ctx, arg_fdiv_h *a)
>  static bool trans_fsqrt_h(DisasContext *ctx, arg_fsqrt_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fsqrt_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fsqrt_h(dest, cpu_env, src1);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -180,23 +233,37 @@ static bool trans_fsqrt_h(DisasContext *ctx, arg_fsqrt_h *a)
>  static bool trans_fsgnj_h(DisasContext *ctx, arg_fsgnj_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      if (a->rs1 == a->rs2) { /* FMOV */
> -        gen_check_nanbox_h(cpu_fpr[a->rd], cpu_fpr[a->rs1]);
> +        if (!ctx->cfg_ptr->ext_zfinx) {
> +            gen_check_nanbox_h(dest, src1);
> +        } else {
> +            tcg_gen_ext16s_i64(dest, src1);
> +        }
>      } else {
> -        TCGv_i64 rs1 = tcg_temp_new_i64();
> -        TCGv_i64 rs2 = tcg_temp_new_i64();
> -
> -        gen_check_nanbox_h(rs1, cpu_fpr[a->rs1]);
> -        gen_check_nanbox_h(rs2, cpu_fpr[a->rs2]);
> -
> -        /* This formulation retains the nanboxing of rs2. */
> -        tcg_gen_deposit_i64(cpu_fpr[a->rd], rs2, rs1, 0, 15);
> -        tcg_temp_free_i64(rs1);
> -        tcg_temp_free_i64(rs2);
> +        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
> +
> +        if (!ctx->cfg_ptr->ext_zfinx) {
> +            TCGv_i64 rs1 = tcg_temp_new_i64();
> +            TCGv_i64 rs2 = tcg_temp_new_i64();
> +            gen_check_nanbox_h(rs1, src1);
> +            gen_check_nanbox_h(rs2, src2);
> +
> +            /* This formulation retains the nanboxing of rs2 in normal 'Zfh'. */
> +            tcg_gen_deposit_i64(dest, rs2, rs1, 0, 15);
> +
> +            tcg_temp_free_i64(rs1);
> +            tcg_temp_free_i64(rs2);
> +        } else {
> +            tcg_gen_deposit_i64(dest, src2, src1, 0, 15);
> +            tcg_gen_ext16s_i64(dest, dest);
> +        }
>      }
> -
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -206,16 +273,29 @@ static bool trans_fsgnjn_h(DisasContext *ctx, arg_fsgnjn_h *a)
>      TCGv_i64 rs1, rs2, mask;
>
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      rs1 = tcg_temp_new_i64();
> -    gen_check_nanbox_h(rs1, cpu_fpr[a->rs1]);
> +    if (!ctx->cfg_ptr->ext_zfinx) {
> +        gen_check_nanbox_h(rs1, src1);
> +    } else {
> +        tcg_gen_mov_i64(rs1, src1);
> +    }
>
>      if (a->rs1 == a->rs2) { /* FNEG */
> -        tcg_gen_xori_i64(cpu_fpr[a->rd], rs1, MAKE_64BIT_MASK(15, 1));
> +        tcg_gen_xori_i64(dest, rs1, MAKE_64BIT_MASK(15, 1));
>      } else {
> +        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>          rs2 = tcg_temp_new_i64();
> -        gen_check_nanbox_h(rs2, cpu_fpr[a->rs2]);
> +
> +        if (!ctx->cfg_ptr->ext_zfinx) {
> +            gen_check_nanbox_h(rs2, src2);
> +        } else {
> +            tcg_gen_mov_i64(rs2, src2);
> +        }
>
>          /*
>           * Replace bit 15 in rs1 with inverse in rs2.
> @@ -224,12 +304,17 @@ static bool trans_fsgnjn_h(DisasContext *ctx, arg_fsgnjn_h *a)
>          mask = tcg_const_i64(~MAKE_64BIT_MASK(15, 1));
>          tcg_gen_not_i64(rs2, rs2);
>          tcg_gen_andc_i64(rs2, rs2, mask);
> -        tcg_gen_and_i64(rs1, mask, rs1);
> -        tcg_gen_or_i64(cpu_fpr[a->rd], rs1, rs2);
> +        tcg_gen_and_i64(dest, mask, rs1);
> +        tcg_gen_or_i64(dest, dest, rs2);
>
>          tcg_temp_free_i64(mask);
>          tcg_temp_free_i64(rs2);
>      }
> +    /* sign-extend the result instead of NaN-boxing it when zfinx is enabled */
> +    if (ctx->cfg_ptr->ext_zfinx) {
> +        tcg_gen_ext16s_i64(dest, dest);
> +    }
> +    tcg_temp_free_i64(rs1);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -239,27 +324,44 @@ static bool trans_fsgnjx_h(DisasContext *ctx, arg_fsgnjx_h *a)
>      TCGv_i64 rs1, rs2;
>
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      rs1 = tcg_temp_new_i64();
> -    gen_check_nanbox_s(rs1, cpu_fpr[a->rs1]);
> +    if (!ctx->cfg_ptr->ext_zfinx) {
> +        gen_check_nanbox_h(rs1, src1);
> +    } else {
> +        tcg_gen_mov_i64(rs1, src1);
> +    }
>
>      if (a->rs1 == a->rs2) { /* FABS */
> -        tcg_gen_andi_i64(cpu_fpr[a->rd], rs1, ~MAKE_64BIT_MASK(15, 1));
> +        tcg_gen_andi_i64(dest, rs1, ~MAKE_64BIT_MASK(15, 1));
>      } else {
> +        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>          rs2 = tcg_temp_new_i64();
> -        gen_check_nanbox_s(rs2, cpu_fpr[a->rs2]);
> +
> +        if (!ctx->cfg_ptr->ext_zfinx) {
> +            gen_check_nanbox_h(rs2, src2);
> +        } else {
> +            tcg_gen_mov_i64(rs2, src2);
> +        }
>
>          /*
>           * Xor bit 15 in rs1 with that in rs2.
>           * This formulation retains the nanboxing of rs1.
>           */
> -        tcg_gen_andi_i64(rs2, rs2, MAKE_64BIT_MASK(15, 1));
> -        tcg_gen_xor_i64(cpu_fpr[a->rd], rs1, rs2);
> +        tcg_gen_andi_i64(dest, rs2, MAKE_64BIT_MASK(15, 1));
> +        tcg_gen_xor_i64(dest, rs1, dest);
>
>          tcg_temp_free_i64(rs2);
>      }
> -
> +    /* sign-extend the result instead of NaN-boxing it when zfinx is enabled */
> +    if (ctx->cfg_ptr->ext_zfinx) {
> +        tcg_gen_ext16s_i64(dest, dest);
> +    }
> +    tcg_temp_free_i64(rs1);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -267,10 +369,14 @@ static bool trans_fsgnjx_h(DisasContext *ctx, arg_fsgnjx_h *a)
>  static bool trans_fmin_h(DisasContext *ctx, arg_fmin_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
> -    gen_helper_fmin_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                      cpu_fpr[a->rs2]);
> +    gen_helper_fmin_h(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -278,10 +384,14 @@ static bool trans_fmin_h(DisasContext *ctx, arg_fmin_h *a)
>  static bool trans_fmax_h(DisasContext *ctx, arg_fmax_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
> -    gen_helper_fmax_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                      cpu_fpr[a->rs2]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
> +
> +    gen_helper_fmax_h(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -289,10 +399,14 @@ static bool trans_fmax_h(DisasContext *ctx, arg_fmax_h *a)
>  static bool trans_fcvt_s_h(DisasContext *ctx, arg_fcvt_s_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH_OR_ZFHMIN(ctx);
> +    REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_s_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_s_h(dest, cpu_env, src1);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>
>      mark_fs_dirty(ctx);
>
> @@ -302,26 +416,32 @@ static bool trans_fcvt_s_h(DisasContext *ctx, arg_fcvt_s_h *a)
>  static bool trans_fcvt_d_h(DisasContext *ctx, arg_fcvt_d_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH_OR_ZFHMIN(ctx);
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN(ctx);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_d_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_d_h(dest, cpu_env, src1);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>
>      mark_fs_dirty(ctx);
>
> -
>      return true;
>  }
>
>  static bool trans_fcvt_h_s(DisasContext *ctx, arg_fcvt_h_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH_OR_ZFHMIN(ctx);
> +    REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN(ctx);
>
> -    gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_h_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
> +    gen_set_rm(ctx, a->rm);
> +    gen_helper_fcvt_h_s(dest, cpu_env, src1);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>
>      return true;
> @@ -330,12 +450,15 @@ static bool trans_fcvt_h_s(DisasContext *ctx, arg_fcvt_h_s *a)
>  static bool trans_fcvt_h_d(DisasContext *ctx, arg_fcvt_h_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH_OR_ZFHMIN(ctx);
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN(ctx);
> +    REQUIRE_ZDINX_OR_D(ctx);
>
> -    gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_h_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
>
> +    gen_set_rm(ctx, a->rm);
> +    gen_helper_fcvt_h_d(dest, cpu_env, src1);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>
>      return true;
> @@ -344,11 +467,13 @@ static bool trans_fcvt_h_d(DisasContext *ctx, arg_fcvt_h_d *a)
>  static bool trans_feq_h(DisasContext *ctx, arg_feq_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
> -    gen_helper_feq_h(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_feq_h(dest, cpu_env, src1, src2);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -356,11 +481,13 @@ static bool trans_feq_h(DisasContext *ctx, arg_feq_h *a)
>  static bool trans_flt_h(DisasContext *ctx, arg_flt_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
> -    gen_helper_flt_h(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_flt_h(dest, cpu_env, src1, src2);
>      gen_set_gpr(ctx, a->rd, dest);
>
>      return true;
> @@ -369,11 +496,13 @@ static bool trans_flt_h(DisasContext *ctx, arg_flt_h *a)
>  static bool trans_fle_h(DisasContext *ctx, arg_fle_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
> -    gen_helper_fle_h(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_fle_h(dest, cpu_env, src1, src2);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -381,11 +510,12 @@ static bool trans_fle_h(DisasContext *ctx, arg_fle_h *a)
>  static bool trans_fclass_h(DisasContext *ctx, arg_fclass_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
> -    gen_helper_fclass_h(dest, cpu_fpr[a->rs1]);
> +    gen_helper_fclass_h(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -393,12 +523,13 @@ static bool trans_fclass_h(DisasContext *ctx, arg_fclass_h *a)
>  static bool trans_fcvt_w_h(DisasContext *ctx, arg_fcvt_w_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_w_h(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_w_h(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -406,12 +537,13 @@ static bool trans_fcvt_w_h(DisasContext *ctx, arg_fcvt_w_h *a)
>  static bool trans_fcvt_wu_h(DisasContext *ctx, arg_fcvt_wu_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_wu_h(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_wu_h(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -419,12 +551,14 @@ static bool trans_fcvt_wu_h(DisasContext *ctx, arg_fcvt_wu_h *a)
>  static bool trans_fcvt_h_w(DisasContext *ctx, arg_fcvt_h_w *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv t0 = get_gpr(ctx, a->rs1, EXT_SIGN);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_h_w(cpu_fpr[a->rd], cpu_env, t0);
> +    gen_helper_fcvt_h_w(dest, cpu_env, t0);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>
>      mark_fs_dirty(ctx);
>      return true;
> @@ -433,12 +567,14 @@ static bool trans_fcvt_h_w(DisasContext *ctx, arg_fcvt_h_w *a)
>  static bool trans_fcvt_h_wu(DisasContext *ctx, arg_fcvt_h_wu *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv t0 = get_gpr(ctx, a->rs1, EXT_SIGN);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_h_wu(cpu_fpr[a->rd], cpu_env, t0);
> +    gen_helper_fcvt_h_wu(dest, cpu_env, t0);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>
>      mark_fs_dirty(ctx);
>      return true;
> @@ -482,12 +618,13 @@ static bool trans_fcvt_l_h(DisasContext *ctx, arg_fcvt_l_h *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_l_h(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_l_h(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -496,12 +633,13 @@ static bool trans_fcvt_lu_h(DisasContext *ctx, arg_fcvt_lu_h *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_lu_h(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_lu_h(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -510,12 +648,14 @@ static bool trans_fcvt_h_l(DisasContext *ctx, arg_fcvt_h_l *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv t0 = get_gpr(ctx, a->rs1, EXT_SIGN);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_h_l(cpu_fpr[a->rd], cpu_env, t0);
> +    gen_helper_fcvt_h_l(dest, cpu_env, t0);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>
>      mark_fs_dirty(ctx);
>      return true;
> @@ -525,12 +665,14 @@ static bool trans_fcvt_h_lu(DisasContext *ctx, arg_fcvt_h_lu *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv t0 = get_gpr(ctx, a->rs1, EXT_SIGN);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_h_lu(cpu_fpr[a->rd], cpu_env, t0);
> +    gen_helper_fcvt_h_lu(dest, cpu_env, t0);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>
>      mark_fs_dirty(ctx);
>      return true;
> diff --git a/target/riscv/internals.h b/target/riscv/internals.h
> index 6237bb3115..dbb322bfa7 100644
> --- a/target/riscv/internals.h
> +++ b/target/riscv/internals.h
> @@ -72,13 +72,23 @@ static inline float32 check_nanbox_s(CPURISCVState *env, uint64_t f)
>      }
>  }
>
> -static inline uint64_t nanbox_h(float16 f)
> +static inline uint64_t nanbox_h(CPURISCVState *env, float16 f)
>  {
> -    return f | MAKE_64BIT_MASK(16, 48);
> +    /* the value is sign-extended instead of NaN-boxed for zfinx */
> +    if (RISCV_CPU(env_cpu(env))->cfg.ext_zfinx) {
> +        return (int16_t)f;
> +    } else {
> +        return f | MAKE_64BIT_MASK(16, 48);
> +    }
>  }
>
> -static inline float16 check_nanbox_h(uint64_t f)
> +static inline float16 check_nanbox_h(CPURISCVState *env, uint64_t f)
>  {
> +    /* Skip the nanbox check when zfinx is enabled */
> +    if (RISCV_CPU(env_cpu(env))->cfg.ext_zfinx) {
> +        return (uint16_t)f;
> +    }
> +
>      uint64_t mask = MAKE_64BIT_MASK(16, 48);
>
>      if (likely((f & mask) == mask)) {
> --
> 2.17.1
>
>
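
For readers following the internals.h hunk quoted above, here is a minimal
standalone sketch of the two register encodings it switches between:
NaN-boxing for Zfh and sign extension for Zhinx. The names box_h, unbox_h
and the zfinx flag are hypothetical stand-ins for nanbox_h, check_nanbox_h
and cfg.ext_zfinx. The 0x7E00 fallback (the canonical half-precision quiet
NaN) is an assumption, since the quoted hunk ends before that branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bits 63..16 set: the NaN-box pattern for a 16-bit value in a 64-bit FPR. */
#define H_BOX_MASK 0xFFFFFFFFFFFF0000ULL

/* Hypothetical stand-in for nanbox_h(): widen a raw binary16 bit pattern. */
static uint64_t box_h(uint16_t f, bool zfinx)
{
    if (zfinx) {
        /* Integer-register convention: sign-extend from bit 15. */
        return (uint64_t)(int64_t)(int16_t)f;
    }
    return (uint64_t)f | H_BOX_MASK;
}

/* Hypothetical stand-in for check_nanbox_h(): narrow back to binary16. */
static uint16_t unbox_h(uint64_t r, bool zfinx)
{
    if (zfinx) {
        return (uint16_t)r;      /* upper bits are ignored, no check */
    }
    if ((r & H_BOX_MASK) == H_BOX_MASK) {
        return (uint16_t)r;      /* properly boxed */
    }
    return 0x7E00;               /* assumed: canonical quiet NaN */
}

/* Worked examples: 0x3C00 is +1.0 and 0xBC00 is -1.0 in binary16. */
static void box_h_examples(void)
{
    /* Negative values look identical in both encodings. */
    assert(box_h(0xBC00, true) == 0xFFFFFFFFFFFFBC00ULL);
    assert(box_h(0xBC00, false) == 0xFFFFFFFFFFFFBC00ULL);
    /* Positive values differ in the upper 48 bits. */
    assert(box_h(0x3C00, true) == 0x0000000000003C00ULL);
    assert(box_h(0x3C00, false) == 0xFFFFFFFFFFFF3C00ULL);
    /* A value that is not boxed reads as NaN under Zfh only. */
    assert(unbox_h(0x3C00, false) == 0x7E00);
    assert(unbox_h(0x3C00, true) == 0x3C00);
}
```

The positive case is the one that matters: a sign-extended +1.0 fails the
NaN-box test, which would explain why the patch skips the check entirely
when zfinx is enabled rather than reusing the Zfh path.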

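
The zfinx branches of the sign-injection translators quoted above
(trans_fsgnj_h, trans_fsgnjn_h, trans_fsgnjx_h) can be modelled in plain C
as well. This is only a sketch of the bit manipulation: the function names
are hypothetical, the TCG ops are replaced by ordinary integer arithmetic,
and the registers are assumed to be 64 bits wide.

```c
#include <assert.h>
#include <stdint.h>

#define H_SIGN 0x8000u   /* bit 15: sign bit of a binary16 value */

/* Models tcg_gen_ext16s_i64: sign-extend the low 16 bits. */
static int64_t sext16(uint64_t v)
{
    return (int64_t)(int16_t)(uint16_t)v;
}

/* FSGNJ.H: magnitude (bits 14..0) from rs1, sign from rs2. */
static int64_t fsgnj_h_zhinx(uint64_t rs1, uint64_t rs2)
{
    return sext16((rs1 & 0x7FFF) | (rs2 & H_SIGN));
}

/* FSGNJN.H: magnitude from rs1, inverted sign of rs2 (FNEG if rs1 == rs2). */
static int64_t fsgnjn_h_zhinx(uint64_t rs1, uint64_t rs2)
{
    return sext16((rs1 & 0x7FFF) | (~rs2 & H_SIGN));
}

/* FSGNJX.H: sign of rs1 XORed with sign of rs2 (FABS if rs1 == rs2). */
static int64_t fsgnjx_h_zhinx(uint64_t rs1, uint64_t rs2)
{
    return sext16(rs1 ^ (rs2 & H_SIGN));
}

/* Worked examples: 0x3C00 is +1.0 (15360), 0xBC00 is -1.0 (-17408). */
static void fsgnj_examples(void)
{
    assert(fsgnj_h_zhinx(0x3C00, 0xBC00) == -17408);  /* copy sign */
    assert(fsgnjn_h_zhinx(0x3C00, 0x3C00) == -17408); /* FNEG */
    assert(fsgnjx_h_zhinx(0xBC00, 0xBC00) == 15360);  /* FABS */
    assert(fsgnjx_h_zhinx(0xBC00, 0x3C00) == -17408); /* sign unchanged */
}
```

In every case the final step is the sign extension, so whatever sits in the
upper 48 bits of the source registers has no effect on the result.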


* Re: [PATCH v6 5/6] target/riscv: add support for zhinx/zhinxmin
@ 2022-02-28  4:09     ` Alistair Francis
  0 siblings, 0 replies; 22+ messages in thread
From: Alistair Francis @ 2022-02-28  4:09 UTC (permalink / raw)
  To: Weiwei Li
  Cc: Richard Henderson, Palmer Dabbelt, Alistair Francis, Bin Meng,
	open list:RISC-V, qemu-devel@nongnu.org Developers, wangjunqiang,
	Wei Wu (吴伟),
	ardxwe

On Fri, Feb 11, 2022 at 2:45 PM Weiwei Li <liweiwei@iscas.ac.cn> wrote:
>
>   - update the extension checks to REQUIRE_ZHINX_OR_ZFH and REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN
>   - update half-precision floating-point register read/write
>   - disable the nanbox_h check when zfinx is enabled
>
> Signed-off-by: Weiwei Li <liweiwei@iscas.ac.cn>
> Signed-off-by: Junqiang Wang <wangjunqiang@iscas.ac.cn>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

Acked-by: Alistair Francis <alistair.francis@wdc.com>

Alistair

> ---
>  target/riscv/fpu_helper.c                 |  89 +++---
>  target/riscv/helper.h                     |   2 +-
>  target/riscv/insn_trans/trans_rvzfh.c.inc | 332 +++++++++++++++-------
>  target/riscv/internals.h                  |  16 +-
>  4 files changed, 296 insertions(+), 143 deletions(-)
>
> diff --git a/target/riscv/fpu_helper.c b/target/riscv/fpu_helper.c
> index 63ca703459..5699c9517f 100644
> --- a/target/riscv/fpu_helper.c
> +++ b/target/riscv/fpu_helper.c
> @@ -89,10 +89,11 @@ void helper_set_rod_rounding_mode(CPURISCVState *env)
>  static uint64_t do_fmadd_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2,
>                             uint64_t rs3, int flags)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> -    float16 frs2 = check_nanbox_h(rs2);
> -    float16 frs3 = check_nanbox_h(rs3);
> -    return nanbox_h(float16_muladd(frs1, frs2, frs3, flags, &env->fp_status));
> +    float16 frs1 = check_nanbox_h(env, rs1);
> +    float16 frs2 = check_nanbox_h(env, rs2);
> +    float16 frs3 = check_nanbox_h(env, rs3);
> +    return nanbox_h(env, float16_muladd(frs1, frs2, frs3, flags,
> +                                        &env->fp_status));
>  }
>
>  static uint64_t do_fmadd_s(CPURISCVState *env, uint64_t rs1, uint64_t rs2,
> @@ -417,146 +418,146 @@ target_ulong helper_fclass_d(uint64_t frs1)
>
>  uint64_t helper_fadd_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> -    float16 frs2 = check_nanbox_h(rs2);
> -    return nanbox_h(float16_add(frs1, frs2, &env->fp_status));
> +    float16 frs1 = check_nanbox_h(env, rs1);
> +    float16 frs2 = check_nanbox_h(env, rs2);
> +    return nanbox_h(env, float16_add(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fsub_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> -    float16 frs2 = check_nanbox_h(rs2);
> -    return nanbox_h(float16_sub(frs1, frs2, &env->fp_status));
> +    float16 frs1 = check_nanbox_h(env, rs1);
> +    float16 frs2 = check_nanbox_h(env, rs2);
> +    return nanbox_h(env, float16_sub(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fmul_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> -    float16 frs2 = check_nanbox_h(rs2);
> -    return nanbox_h(float16_mul(frs1, frs2, &env->fp_status));
> +    float16 frs1 = check_nanbox_h(env, rs1);
> +    float16 frs2 = check_nanbox_h(env, rs2);
> +    return nanbox_h(env, float16_mul(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fdiv_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> -    float16 frs2 = check_nanbox_h(rs2);
> -    return nanbox_h(float16_div(frs1, frs2, &env->fp_status));
> +    float16 frs1 = check_nanbox_h(env, rs1);
> +    float16 frs2 = check_nanbox_h(env, rs2);
> +    return nanbox_h(env, float16_div(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fmin_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> -    float16 frs2 = check_nanbox_h(rs2);
> -    return nanbox_h(env->priv_ver < PRIV_VERSION_1_11_0 ?
> +    float16 frs1 = check_nanbox_h(env, rs1);
> +    float16 frs2 = check_nanbox_h(env, rs2);
> +    return nanbox_h(env, env->priv_ver < PRIV_VERSION_1_11_0 ?
>                      float16_minnum(frs1, frs2, &env->fp_status) :
>                      float16_minimum_number(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fmax_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> -    float16 frs2 = check_nanbox_h(rs2);
> -    return nanbox_h(env->priv_ver < PRIV_VERSION_1_11_0 ?
> +    float16 frs1 = check_nanbox_h(env, rs1);
> +    float16 frs2 = check_nanbox_h(env, rs2);
> +    return nanbox_h(env, env->priv_ver < PRIV_VERSION_1_11_0 ?
>                      float16_maxnum(frs1, frs2, &env->fp_status) :
>                      float16_maximum_number(frs1, frs2, &env->fp_status));
>  }
>
>  uint64_t helper_fsqrt_h(CPURISCVState *env, uint64_t rs1)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> -    return nanbox_h(float16_sqrt(frs1, &env->fp_status));
> +    float16 frs1 = check_nanbox_h(env, rs1);
> +    return nanbox_h(env, float16_sqrt(frs1, &env->fp_status));
>  }
>
>  target_ulong helper_fle_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> -    float16 frs2 = check_nanbox_h(rs2);
> +    float16 frs1 = check_nanbox_h(env, rs1);
> +    float16 frs2 = check_nanbox_h(env, rs2);
>      return float16_le(frs1, frs2, &env->fp_status);
>  }
>
>  target_ulong helper_flt_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> -    float16 frs2 = check_nanbox_h(rs2);
> +    float16 frs1 = check_nanbox_h(env, rs1);
> +    float16 frs2 = check_nanbox_h(env, rs2);
>      return float16_lt(frs1, frs2, &env->fp_status);
>  }
>
>  target_ulong helper_feq_h(CPURISCVState *env, uint64_t rs1, uint64_t rs2)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> -    float16 frs2 = check_nanbox_h(rs2);
> +    float16 frs1 = check_nanbox_h(env, rs1);
> +    float16 frs2 = check_nanbox_h(env, rs2);
>      return float16_eq_quiet(frs1, frs2, &env->fp_status);
>  }
>
> -target_ulong helper_fclass_h(uint64_t rs1)
> +target_ulong helper_fclass_h(CPURISCVState *env, uint64_t rs1)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> +    float16 frs1 = check_nanbox_h(env, rs1);
>      return fclass_h(frs1);
>  }
>
>  target_ulong helper_fcvt_w_h(CPURISCVState *env, uint64_t rs1)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> +    float16 frs1 = check_nanbox_h(env, rs1);
>      return float16_to_int32(frs1, &env->fp_status);
>  }
>
>  target_ulong helper_fcvt_wu_h(CPURISCVState *env, uint64_t rs1)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> +    float16 frs1 = check_nanbox_h(env, rs1);
>      return (int32_t)float16_to_uint32(frs1, &env->fp_status);
>  }
>
>  target_ulong helper_fcvt_l_h(CPURISCVState *env, uint64_t rs1)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> +    float16 frs1 = check_nanbox_h(env, rs1);
>      return float16_to_int64(frs1, &env->fp_status);
>  }
>
>  target_ulong helper_fcvt_lu_h(CPURISCVState *env, uint64_t rs1)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> +    float16 frs1 = check_nanbox_h(env, rs1);
>      return float16_to_uint64(frs1, &env->fp_status);
>  }
>
>  uint64_t helper_fcvt_h_w(CPURISCVState *env, target_ulong rs1)
>  {
> -    return nanbox_h(int32_to_float16((int32_t)rs1, &env->fp_status));
> +    return nanbox_h(env, int32_to_float16((int32_t)rs1, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_h_wu(CPURISCVState *env, target_ulong rs1)
>  {
> -    return nanbox_h(uint32_to_float16((uint32_t)rs1, &env->fp_status));
> +    return nanbox_h(env, uint32_to_float16((uint32_t)rs1, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_h_l(CPURISCVState *env, target_ulong rs1)
>  {
> -    return nanbox_h(int64_to_float16(rs1, &env->fp_status));
> +    return nanbox_h(env, int64_to_float16(rs1, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_h_lu(CPURISCVState *env, target_ulong rs1)
>  {
> -    return nanbox_h(uint64_to_float16(rs1, &env->fp_status));
> +    return nanbox_h(env, uint64_to_float16(rs1, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_h_s(CPURISCVState *env, uint64_t rs1)
>  {
>      float32 frs1 = check_nanbox_s(env, rs1);
> -    return nanbox_h(float32_to_float16(frs1, true, &env->fp_status));
> +    return nanbox_h(env, float32_to_float16(frs1, true, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_s_h(CPURISCVState *env, uint64_t rs1)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> +    float16 frs1 = check_nanbox_h(env, rs1);
>      return nanbox_s(env, float16_to_float32(frs1, true, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_h_d(CPURISCVState *env, uint64_t rs1)
>  {
> -    return nanbox_h(float64_to_float16(rs1, true, &env->fp_status));
> +    return nanbox_h(env, float64_to_float16(rs1, true, &env->fp_status));
>  }
>
>  uint64_t helper_fcvt_d_h(CPURISCVState *env, uint64_t rs1)
>  {
> -    float16 frs1 = check_nanbox_h(rs1);
> +    float16 frs1 = check_nanbox_h(env, rs1);
>      return float16_to_float64(frs1, true, &env->fp_status);
>  }
> diff --git a/target/riscv/helper.h b/target/riscv/helper.h
> index 89195aad9d..26bbab2fab 100644
> --- a/target/riscv/helper.h
> +++ b/target/riscv/helper.h
> @@ -90,7 +90,7 @@ DEF_HELPER_FLAGS_2(fcvt_h_w, TCG_CALL_NO_RWG, i64, env, tl)
>  DEF_HELPER_FLAGS_2(fcvt_h_wu, TCG_CALL_NO_RWG, i64, env, tl)
>  DEF_HELPER_FLAGS_2(fcvt_h_l, TCG_CALL_NO_RWG, i64, env, tl)
>  DEF_HELPER_FLAGS_2(fcvt_h_lu, TCG_CALL_NO_RWG, i64, env, tl)
> -DEF_HELPER_FLAGS_1(fclass_h, TCG_CALL_NO_RWG_SE, tl, i64)
> +DEF_HELPER_FLAGS_2(fclass_h, TCG_CALL_NO_RWG_SE, tl, env, i64)
>
>  /* Special functions */
>  DEF_HELPER_2(csrr, tl, env, int)
> diff --git a/target/riscv/insn_trans/trans_rvzfh.c.inc b/target/riscv/insn_trans/trans_rvzfh.c.inc
> index 608c51da2c..5d07150cd0 100644
> --- a/target/riscv/insn_trans/trans_rvzfh.c.inc
> +++ b/target/riscv/insn_trans/trans_rvzfh.c.inc
> @@ -22,12 +22,25 @@
>      }                         \
>  } while (0)
>
> +#define REQUIRE_ZHINX_OR_ZFH(ctx) do { \
> +    if (!ctx->cfg_ptr->ext_zhinx && !ctx->cfg_ptr->ext_zfh) { \
> +        return false;                  \
> +    }                                  \
> +} while (0)
> +
>  #define REQUIRE_ZFH_OR_ZFHMIN(ctx) do {       \
>      if (!(ctx->cfg_ptr->ext_zfh || ctx->cfg_ptr->ext_zfhmin)) { \
>          return false;                         \
>      }                                         \
>  } while (0)
>
> +#define REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN(ctx) do { \
> +    if (!(ctx->cfg_ptr->ext_zfh || ctx->cfg_ptr->ext_zfhmin ||          \
> +          ctx->cfg_ptr->ext_zhinx || ctx->cfg_ptr->ext_zhinxmin)) {     \
> +        return false;                                        \
> +    }                                                        \
> +} while (0)
> +
>  static bool trans_flh(DisasContext *ctx, arg_flh *a)
>  {
>      TCGv_i64 dest;
> @@ -73,11 +86,16 @@ static bool trans_fsh(DisasContext *ctx, arg_fsh *a)
>  static bool trans_fmadd_h(DisasContext *ctx, arg_fmadd_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fmadd_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fmadd_h(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -85,11 +103,16 @@ static bool trans_fmadd_h(DisasContext *ctx, arg_fmadd_h *a)
>  static bool trans_fmsub_h(DisasContext *ctx, arg_fmsub_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fmsub_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                       cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fmsub_h(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -97,11 +120,16 @@ static bool trans_fmsub_h(DisasContext *ctx, arg_fmsub_h *a)
>  static bool trans_fnmsub_h(DisasContext *ctx, arg_fnmsub_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fnmsub_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fnmsub_h(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -109,11 +137,16 @@ static bool trans_fnmsub_h(DisasContext *ctx, arg_fnmsub_h *a)
>  static bool trans_fnmadd_h(DisasContext *ctx, arg_fnmadd_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
> +    TCGv_i64 src3 = get_fpr_hs(ctx, a->rs3);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fnmadd_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                        cpu_fpr[a->rs2], cpu_fpr[a->rs3]);
> +    gen_helper_fnmadd_h(dest, cpu_env, src1, src2, src3);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -121,11 +154,15 @@ static bool trans_fnmadd_h(DisasContext *ctx, arg_fnmadd_h *a)
>  static bool trans_fadd_h(DisasContext *ctx, arg_fadd_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fadd_h(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_fadd_h(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -133,11 +170,15 @@ static bool trans_fadd_h(DisasContext *ctx, arg_fadd_h *a)
>  static bool trans_fsub_h(DisasContext *ctx, arg_fsub_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fsub_h(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_fsub_h(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -145,11 +186,15 @@ static bool trans_fsub_h(DisasContext *ctx, arg_fsub_h *a)
>  static bool trans_fmul_h(DisasContext *ctx, arg_fmul_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fmul_h(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_fmul_h(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -157,11 +202,15 @@ static bool trans_fmul_h(DisasContext *ctx, arg_fmul_h *a)
>  static bool trans_fdiv_h(DisasContext *ctx, arg_fdiv_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fdiv_h(cpu_fpr[a->rd], cpu_env,
> -                      cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_fdiv_h(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -169,10 +218,14 @@ static bool trans_fdiv_h(DisasContext *ctx, arg_fdiv_h *a)
>  static bool trans_fsqrt_h(DisasContext *ctx, arg_fsqrt_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fsqrt_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fsqrt_h(dest, cpu_env, src1);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -180,23 +233,37 @@ static bool trans_fsqrt_h(DisasContext *ctx, arg_fsqrt_h *a)
>  static bool trans_fsgnj_h(DisasContext *ctx, arg_fsgnj_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      if (a->rs1 == a->rs2) { /* FMOV */
> -        gen_check_nanbox_h(cpu_fpr[a->rd], cpu_fpr[a->rs1]);
> +        if (!ctx->cfg_ptr->ext_zfinx) {
> +            gen_check_nanbox_h(dest, src1);
> +        } else {
> +            tcg_gen_ext16s_i64(dest, src1);
> +        }
>      } else {
> -        TCGv_i64 rs1 = tcg_temp_new_i64();
> -        TCGv_i64 rs2 = tcg_temp_new_i64();
> -
> -        gen_check_nanbox_h(rs1, cpu_fpr[a->rs1]);
> -        gen_check_nanbox_h(rs2, cpu_fpr[a->rs2]);
> -
> -        /* This formulation retains the nanboxing of rs2. */
> -        tcg_gen_deposit_i64(cpu_fpr[a->rd], rs2, rs1, 0, 15);
> -        tcg_temp_free_i64(rs1);
> -        tcg_temp_free_i64(rs2);
> +        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
> +
> +        if (!ctx->cfg_ptr->ext_zfinx) {
> +            TCGv_i64 rs1 = tcg_temp_new_i64();
> +            TCGv_i64 rs2 = tcg_temp_new_i64();
> +            gen_check_nanbox_h(rs1, src1);
> +            gen_check_nanbox_h(rs2, src2);
> +
> +            /* This formulation retains the nanboxing of rs2 in normal 'Zfh'. */
> +            tcg_gen_deposit_i64(dest, rs2, rs1, 0, 15);
> +
> +            tcg_temp_free_i64(rs1);
> +            tcg_temp_free_i64(rs2);
> +        } else {
> +            tcg_gen_deposit_i64(dest, src2, src1, 0, 15);
> +            tcg_gen_ext16s_i64(dest, dest);
> +        }
>      }
> -
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -206,16 +273,29 @@ static bool trans_fsgnjn_h(DisasContext *ctx, arg_fsgnjn_h *a)
>      TCGv_i64 rs1, rs2, mask;
>
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      rs1 = tcg_temp_new_i64();
> -    gen_check_nanbox_h(rs1, cpu_fpr[a->rs1]);
> +    if (!ctx->cfg_ptr->ext_zfinx) {
> +        gen_check_nanbox_h(rs1, src1);
> +    } else {
> +        tcg_gen_mov_i64(rs1, src1);
> +    }
>
>      if (a->rs1 == a->rs2) { /* FNEG */
> -        tcg_gen_xori_i64(cpu_fpr[a->rd], rs1, MAKE_64BIT_MASK(15, 1));
> +        tcg_gen_xori_i64(dest, rs1, MAKE_64BIT_MASK(15, 1));
>      } else {
> +        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>          rs2 = tcg_temp_new_i64();
> -        gen_check_nanbox_h(rs2, cpu_fpr[a->rs2]);
> +
> +        if (!ctx->cfg_ptr->ext_zfinx) {
> +            gen_check_nanbox_h(rs2, src2);
> +        } else {
> +            tcg_gen_mov_i64(rs2, src2);
> +        }
>
>          /*
>           * Replace bit 15 in rs1 with inverse in rs2.
> @@ -224,12 +304,17 @@ static bool trans_fsgnjn_h(DisasContext *ctx, arg_fsgnjn_h *a)
>          mask = tcg_const_i64(~MAKE_64BIT_MASK(15, 1));
>          tcg_gen_not_i64(rs2, rs2);
>          tcg_gen_andc_i64(rs2, rs2, mask);
> -        tcg_gen_and_i64(rs1, mask, rs1);
> -        tcg_gen_or_i64(cpu_fpr[a->rd], rs1, rs2);
> +        tcg_gen_and_i64(dest, mask, rs1);
> +        tcg_gen_or_i64(dest, dest, rs2);
>
>          tcg_temp_free_i64(mask);
>          tcg_temp_free_i64(rs2);
>      }
> +    /* sign-extend instead of NaN-boxing the result when zfinx is enabled */
> +    if (ctx->cfg_ptr->ext_zfinx) {
> +        tcg_gen_ext16s_i64(dest, dest);
> +    }
> +    tcg_temp_free_i64(rs1);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -239,27 +324,44 @@ static bool trans_fsgnjx_h(DisasContext *ctx, arg_fsgnjx_h *a)
>      TCGv_i64 rs1, rs2;
>
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      rs1 = tcg_temp_new_i64();
> -    gen_check_nanbox_s(rs1, cpu_fpr[a->rs1]);
> +    if (!ctx->cfg_ptr->ext_zfinx) {
> +        gen_check_nanbox_h(rs1, src1);
> +    } else {
> +        tcg_gen_mov_i64(rs1, src1);
> +    }
>
>      if (a->rs1 == a->rs2) { /* FABS */
> -        tcg_gen_andi_i64(cpu_fpr[a->rd], rs1, ~MAKE_64BIT_MASK(15, 1));
> +        tcg_gen_andi_i64(dest, rs1, ~MAKE_64BIT_MASK(15, 1));
>      } else {
> +        TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>          rs2 = tcg_temp_new_i64();
> -        gen_check_nanbox_s(rs2, cpu_fpr[a->rs2]);
> +
> +        if (!ctx->cfg_ptr->ext_zfinx) {
> +            gen_check_nanbox_h(rs2, src2);
> +        } else {
> +            tcg_gen_mov_i64(rs2, src2);
> +        }
>
>          /*
>           * Xor bit 15 in rs1 with that in rs2.
>           * This formulation retains the nanboxing of rs1.
>           */
> -        tcg_gen_andi_i64(rs2, rs2, MAKE_64BIT_MASK(15, 1));
> -        tcg_gen_xor_i64(cpu_fpr[a->rd], rs1, rs2);
> +        tcg_gen_andi_i64(dest, rs2, MAKE_64BIT_MASK(15, 1));
> +        tcg_gen_xor_i64(dest, rs1, dest);
>
>          tcg_temp_free_i64(rs2);
>      }
> -
> +    /* sign-extend instead of NaN-boxing the result when zfinx is enabled */
> +    if (ctx->cfg_ptr->ext_zfinx) {
> +        tcg_gen_ext16s_i64(dest, dest);
> +    }
> +    tcg_temp_free_i64(rs1);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -267,10 +369,14 @@ static bool trans_fsgnjx_h(DisasContext *ctx, arg_fsgnjx_h *a)
>  static bool trans_fmin_h(DisasContext *ctx, arg_fmin_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
> -    gen_helper_fmin_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                      cpu_fpr[a->rs2]);
> +    gen_helper_fmin_h(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -278,10 +384,14 @@ static bool trans_fmin_h(DisasContext *ctx, arg_fmin_h *a)
>  static bool trans_fmax_h(DisasContext *ctx, arg_fmax_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
> -    gen_helper_fmax_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1],
> -                      cpu_fpr[a->rs2]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
> +
> +    gen_helper_fmax_h(dest, cpu_env, src1, src2);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>      return true;
>  }
> @@ -289,10 +399,14 @@ static bool trans_fmax_h(DisasContext *ctx, arg_fmax_h *a)
>  static bool trans_fcvt_s_h(DisasContext *ctx, arg_fcvt_s_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH_OR_ZFHMIN(ctx);
> +    REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_s_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_s_h(dest, cpu_env, src1);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>
>      mark_fs_dirty(ctx);
>
> @@ -302,26 +416,32 @@ static bool trans_fcvt_s_h(DisasContext *ctx, arg_fcvt_s_h *a)
>  static bool trans_fcvt_d_h(DisasContext *ctx, arg_fcvt_d_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH_OR_ZFHMIN(ctx);
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN(ctx);
> +    REQUIRE_ZDINX_OR_D(ctx);
> +
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_d_h(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_d_h(dest, cpu_env, src1);
> +    gen_set_fpr_d(ctx, a->rd, dest);
>
>      mark_fs_dirty(ctx);
>
> -
>      return true;
>  }
>
>  static bool trans_fcvt_h_s(DisasContext *ctx, arg_fcvt_h_s *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH_OR_ZFHMIN(ctx);
> +    REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN(ctx);
>
> -    gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_h_s(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
> +    gen_set_rm(ctx, a->rm);
> +    gen_helper_fcvt_h_s(dest, cpu_env, src1);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>
>      return true;
> @@ -330,12 +450,15 @@ static bool trans_fcvt_h_s(DisasContext *ctx, arg_fcvt_h_s *a)
>  static bool trans_fcvt_h_d(DisasContext *ctx, arg_fcvt_h_d *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH_OR_ZFHMIN(ctx);
> -    REQUIRE_EXT(ctx, RVD);
> +    REQUIRE_ZFH_OR_ZFHMIN_OR_ZHINX_OR_ZHINXMIN(ctx);
> +    REQUIRE_ZDINX_OR_D(ctx);
>
> -    gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_h_d(cpu_fpr[a->rd], cpu_env, cpu_fpr[a->rs1]);
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_d(ctx, a->rs1);
>
> +    gen_set_rm(ctx, a->rm);
> +    gen_helper_fcvt_h_d(dest, cpu_env, src1);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>      mark_fs_dirty(ctx);
>
>      return true;
> @@ -344,11 +467,13 @@ static bool trans_fcvt_h_d(DisasContext *ctx, arg_fcvt_h_d *a)
>  static bool trans_feq_h(DisasContext *ctx, arg_feq_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
> -    gen_helper_feq_h(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_feq_h(dest, cpu_env, src1, src2);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -356,11 +481,13 @@ static bool trans_feq_h(DisasContext *ctx, arg_feq_h *a)
>  static bool trans_flt_h(DisasContext *ctx, arg_flt_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
> -    gen_helper_flt_h(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_flt_h(dest, cpu_env, src1, src2);
>      gen_set_gpr(ctx, a->rd, dest);
>
>      return true;
> @@ -369,11 +496,13 @@ static bool trans_flt_h(DisasContext *ctx, arg_flt_h *a)
>  static bool trans_fle_h(DisasContext *ctx, arg_fle_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
> +    TCGv_i64 src2 = get_fpr_hs(ctx, a->rs2);
>
> -    gen_helper_fle_h(dest, cpu_env, cpu_fpr[a->rs1], cpu_fpr[a->rs2]);
> +    gen_helper_fle_h(dest, cpu_env, src1, src2);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -381,11 +510,12 @@ static bool trans_fle_h(DisasContext *ctx, arg_fle_h *a)
>  static bool trans_fclass_h(DisasContext *ctx, arg_fclass_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
> -    gen_helper_fclass_h(dest, cpu_fpr[a->rs1]);
> +    gen_helper_fclass_h(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -393,12 +523,13 @@ static bool trans_fclass_h(DisasContext *ctx, arg_fclass_h *a)
>  static bool trans_fcvt_w_h(DisasContext *ctx, arg_fcvt_w_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_w_h(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_w_h(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -406,12 +537,13 @@ static bool trans_fcvt_w_h(DisasContext *ctx, arg_fcvt_w_h *a)
>  static bool trans_fcvt_wu_h(DisasContext *ctx, arg_fcvt_wu_h *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_wu_h(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_wu_h(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -419,12 +551,14 @@ static bool trans_fcvt_wu_h(DisasContext *ctx, arg_fcvt_wu_h *a)
>  static bool trans_fcvt_h_w(DisasContext *ctx, arg_fcvt_h_w *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv t0 = get_gpr(ctx, a->rs1, EXT_SIGN);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_h_w(cpu_fpr[a->rd], cpu_env, t0);
> +    gen_helper_fcvt_h_w(dest, cpu_env, t0);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>
>      mark_fs_dirty(ctx);
>      return true;
> @@ -433,12 +567,14 @@ static bool trans_fcvt_h_w(DisasContext *ctx, arg_fcvt_h_w *a)
>  static bool trans_fcvt_h_wu(DisasContext *ctx, arg_fcvt_h_wu *a)
>  {
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv t0 = get_gpr(ctx, a->rs1, EXT_SIGN);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_h_wu(cpu_fpr[a->rd], cpu_env, t0);
> +    gen_helper_fcvt_h_wu(dest, cpu_env, t0);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>
>      mark_fs_dirty(ctx);
>      return true;
> @@ -482,12 +618,13 @@ static bool trans_fcvt_l_h(DisasContext *ctx, arg_fcvt_l_h *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_l_h(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_l_h(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -496,12 +633,13 @@ static bool trans_fcvt_lu_h(DisasContext *ctx, arg_fcvt_lu_h *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
>      TCGv dest = dest_gpr(ctx, a->rd);
> +    TCGv_i64 src1 = get_fpr_hs(ctx, a->rs1);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_lu_h(dest, cpu_env, cpu_fpr[a->rs1]);
> +    gen_helper_fcvt_lu_h(dest, cpu_env, src1);
>      gen_set_gpr(ctx, a->rd, dest);
>      return true;
>  }
> @@ -510,12 +648,14 @@ static bool trans_fcvt_h_l(DisasContext *ctx, arg_fcvt_h_l *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv t0 = get_gpr(ctx, a->rs1, EXT_SIGN);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_h_l(cpu_fpr[a->rd], cpu_env, t0);
> +    gen_helper_fcvt_h_l(dest, cpu_env, t0);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>
>      mark_fs_dirty(ctx);
>      return true;
> @@ -525,12 +665,14 @@ static bool trans_fcvt_h_lu(DisasContext *ctx, arg_fcvt_h_lu *a)
>  {
>      REQUIRE_64BIT(ctx);
>      REQUIRE_FPU;
> -    REQUIRE_ZFH(ctx);
> +    REQUIRE_ZHINX_OR_ZFH(ctx);
>
> +    TCGv_i64 dest = dest_fpr(ctx, a->rd);
>      TCGv t0 = get_gpr(ctx, a->rs1, EXT_SIGN);
>
>      gen_set_rm(ctx, a->rm);
> -    gen_helper_fcvt_h_lu(cpu_fpr[a->rd], cpu_env, t0);
> +    gen_helper_fcvt_h_lu(dest, cpu_env, t0);
> +    gen_set_fpr_hs(ctx, a->rd, dest);
>
>      mark_fs_dirty(ctx);
>      return true;
> diff --git a/target/riscv/internals.h b/target/riscv/internals.h
> index 6237bb3115..dbb322bfa7 100644
> --- a/target/riscv/internals.h
> +++ b/target/riscv/internals.h
> @@ -72,13 +72,23 @@ static inline float32 check_nanbox_s(CPURISCVState *env, uint64_t f)
>      }
>  }
>
> -static inline uint64_t nanbox_h(float16 f)
> +static inline uint64_t nanbox_h(CPURISCVState *env, float16 f)
>  {
> -    return f | MAKE_64BIT_MASK(16, 48);
> +    /* the value is sign-extended instead of NaN-boxing for zfinx */
> +    if (RISCV_CPU(env_cpu(env))->cfg.ext_zfinx) {
> +        return (int16_t)f;
> +    } else {
> +        return f | MAKE_64BIT_MASK(16, 48);
> +    }
>  }
>
> -static inline float16 check_nanbox_h(uint64_t f)
> +static inline float16 check_nanbox_h(CPURISCVState *env, uint64_t f)
>  {
> +    /* Disable nanbox check when enable zfinx */
> +    if (RISCV_CPU(env_cpu(env))->cfg.ext_zfinx) {
> +        return (uint16_t)f;
> +    }
> +
>      uint64_t mask = MAKE_64BIT_MASK(16, 48);
>
>      if (likely((f & mask) == mask)) {
> --
> 2.17.1
>
>


* Re: [PATCH v6 0/6] support subsets of Float-Point in Integer Registers extensions
  2022-02-11  4:39 ` Weiwei Li
@ 2022-02-28  8:27   ` Alistair Francis
  -1 siblings, 0 replies; 22+ messages in thread
From: Alistair Francis @ 2022-02-28  8:27 UTC (permalink / raw)
  To: Weiwei Li
  Cc: Wei Wu (吴伟),
	open list:RISC-V, wangjunqiang, Bin Meng, Richard Henderson,
	qemu-devel@nongnu.org Developers, ardxwe, Palmer Dabbelt,
	Alistair Francis

On Fri, Feb 11, 2022 at 2:49 PM Weiwei Li <liweiwei@iscas.ac.cn> wrote:
>
> This patchset implements the RISC-V Floating-Point in Integer Registers extensions (version 1.0), which include the Zfinx, Zdinx, Zhinx and Zhinxmin extensions.
>
> Specification:
> https://github.com/riscv/riscv-zfinx/blob/main/zfinx-1.0.0.pdf
>
> The port is available here:
> https://github.com/plctlab/plct-qemu/tree/plct-zfinx-upstream-v6
>
> To test this implementation, pass the cpu properties 'zfinx=true,zdinx=true,zhinx=true,zhinxmin=true' together with 'g=false,f=false,d=false,Zfh=false,Zfhmin=false'.
> This implementation passes the gcc tests; CI results can be found at https://ci.rvperf.org/job/plct-qemu-zfinx-upstream/.
>
> v6:
> * rename flags Z*inx to z*inx
> * rebase on riscv-to-apply.next
>
> v5:
> * put the definitions of ftemp and nftemp together and add comments for them
> * separate the declaration of variable i from the loop
>
> v4:
> * combine register pair check for rv32 zdinx
> * clear mstatus.FS when RVF is disabled by write_misa
>
> v3:
> * delete unused reset for mstatus.FS
> * use positive test for RVF instead of negative test for ZFINX
> * replace get_ol with get_xl
> * use tcg_gen_concat_tl_i64 to unify tcg_gen_concat_i32_i64 and tcg_gen_deposit_i64
>
> v2:
> * hardwire mstatus.FS to zero when zfinx is enabled
> * do the register-pair check at the beginning of translation
> * optimize the partial implementation as suggested
>
> Weiwei Li (6):
>   target/riscv: add cfg properties for zfinx, zdinx and zhinx{min}
>   target/riscv: hardwire mstatus.FS to zero when enable zfinx
>   target/riscv: add support for zfinx
>   target/riscv: add support for zdinx
>   target/riscv: add support for zhinx/zhinxmin
>   target/riscv: expose zfinx, zdinx, zhinx{min} properties

Thanks!

Applied to riscv-to-apply.next

Alistair

>
>  target/riscv/cpu.c                        |  17 ++
>  target/riscv/cpu.h                        |   4 +
>  target/riscv/cpu_helper.c                 |   6 +-
>  target/riscv/csr.c                        |  25 +-
>  target/riscv/fpu_helper.c                 | 178 ++++++------
>  target/riscv/helper.h                     |   4 +-
>  target/riscv/insn_trans/trans_rvd.c.inc   | 285 ++++++++++++++-----
>  target/riscv/insn_trans/trans_rvf.c.inc   | 314 +++++++++++++-------
>  target/riscv/insn_trans/trans_rvzfh.c.inc | 332 +++++++++++++++-------
>  target/riscv/internals.h                  |  32 ++-
>  target/riscv/translate.c                  | 149 +++++++++-
>  11 files changed, 974 insertions(+), 372 deletions(-)
>
> --
> 2.17.1
>
>


Thread overview: 22+ messages
2022-02-11  4:39 [PATCH v6 0/6] support subsets of Float-Point in Integer Registers extensions Weiwei Li
2022-02-11  4:39 ` Weiwei Li
2022-02-11  4:39 ` [PATCH v6 1/6] target/riscv: add cfg properties for zfinx, zdinx and zhinx{min} Weiwei Li
2022-02-11  4:39   ` Weiwei Li
2022-02-11  4:39 ` [PATCH v6 2/6] target/riscv: hardwire mstatus.FS to zero when enable zfinx Weiwei Li
2022-02-11  4:39   ` Weiwei Li
2022-02-11  4:39 ` [PATCH v6 3/6] target/riscv: add support for zfinx Weiwei Li
2022-02-11  4:39   ` Weiwei Li
2022-02-28  3:55   ` Alistair Francis
2022-02-28  3:55     ` Alistair Francis
2022-02-11  4:39 ` [PATCH v6 4/6] target/riscv: add support for zdinx Weiwei Li
2022-02-11  4:39   ` Weiwei Li
2022-02-28  4:01   ` Alistair Francis
2022-02-28  4:01     ` Alistair Francis
2022-02-11  4:39 ` [PATCH v6 5/6] target/riscv: add support for zhinx/zhinxmin Weiwei Li
2022-02-11  4:39   ` Weiwei Li
2022-02-28  4:09   ` Alistair Francis
2022-02-28  4:09     ` Alistair Francis
2022-02-11  4:39 ` [PATCH v6 6/6] target/riscv: expose zfinx, zdinx, zhinx{min} properties Weiwei Li
2022-02-11  4:39   ` Weiwei Li
2022-02-28  8:27 ` [PATCH v6 0/6] support subsets of Float-Point in Integer Registers extensions Alistair Francis
2022-02-28  8:27   ` Alistair Francis