* [PULL 00/26] riscv-to-apply queue
@ 2021-10-07  6:47 Alistair Francis
  2021-10-07  6:47 ` [PULL 01/26] target/riscv: Introduce temporary in gen_add_uw() Alistair Francis
                   ` (26 more replies)
  0 siblings, 27 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell; +Cc: alistair23, Alistair Francis

From: Alistair Francis <alistair.francis@wdc.com>

The following changes since commit ca61fa4b803e5d0abaf6f1ceb690f23bb78a4def:

  Merge remote-tracking branch 'remotes/quic/tags/pull-hex-20211006' into staging (2021-10-06 12:11:14 -0700)

are available in the Git repository at:

  git@github.com:alistair23/qemu.git tags/pull-riscv-to-apply-20211007

for you to fetch changes up to 9ae6ecd848dcd1b32003526ab65a0d4c644dfb07:

  hw/riscv: shakti_c: Mark as not user creatable (2021-10-07 08:41:33 +1000)

----------------------------------------------------------------
Third RISC-V PR for QEMU 6.2

 - Add Zb[abcs] instruction support
 - Remove RVB support
 - Bug fix of setting mstatus_hs.[SD|FS] bits
 - Mark some UART devices as 'input'
 - QOMify PolarFire MMUART
 - Fixes for sifive PDMA
 - Mark shakti_c as not user creatable

----------------------------------------------------------------
Alistair Francis (1):
      hw/riscv: shakti_c: Mark as not user creatable

Bin Meng (5):
      hw/char: ibex_uart: Register device in 'input' category
      hw/char: shakti_uart: Register device in 'input' category
      hw/char: sifive_uart: Register device in 'input' category
      hw/dma: sifive_pdma: Fix Control.claim bit detection
      hw/dma: sifive_pdma: Don't run DMA when channel is disclaimed

Frank Chang (1):
      target/riscv: Set mstatus_hs.[SD|FS] bits if Clean and V=1 in mark_fs_dirty()

Philipp Tomsich (16):
      target/riscv: Introduce temporary in gen_add_uw()
      target/riscv: fix clzw implementation to operate on arg1
      target/riscv: clwz must ignore high bits (use shift-left & changed logic)
      target/riscv: Add x-zba, x-zbb, x-zbc and x-zbs properties
      target/riscv: Reassign instructions to the Zba-extension
      target/riscv: Remove the W-form instructions from Zbs
      target/riscv: Remove shift-one instructions (proposed Zbo in pre-0.93 draft-B)
      target/riscv: Reassign instructions to the Zbs-extension
      target/riscv: Add instructions of the Zbc-extension
      target/riscv: Reassign instructions to the Zbb-extension
      target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci
      target/riscv: Add a REQUIRE_32BIT macro
      target/riscv: Add rev8 instruction, removing grev/grevi
      target/riscv: Add zext.h instructions to Zbb, removing pack/packu/packh
      target/riscv: Remove RVB (replaced by Zb[abcs])
      disas/riscv: Add Zb[abcs] instructions

Philippe Mathieu-Daudé (3):
      hw/char/mchp_pfsoc_mmuart: Simplify MCHP_PFSOC_MMUART_REG definition
      hw/char/mchp_pfsoc_mmuart: Use a MemoryRegion container
      hw/char/mchp_pfsoc_mmuart: QOM'ify PolarFire MMUART

 include/hw/char/mchp_pfsoc_mmuart.h     |  17 +-
 target/riscv/cpu.h                      |  11 +-
 target/riscv/helper.h                   |   6 +-
 target/riscv/insn32.decode              | 115 ++++-----
 disas/riscv.c                           | 157 +++++++++++-
 hw/char/ibex_uart.c                     |   1 +
 hw/char/mchp_pfsoc_mmuart.c             | 116 +++++++--
 hw/char/shakti_uart.c                   |   1 +
 hw/char/sifive_uart.c                   |   1 +
 hw/dma/sifive_pdma.c                    |  13 +-
 hw/riscv/shakti_c.c                     |   7 +
 target/riscv/bitmanip_helper.c          |  65 +----
 target/riscv/cpu.c                      |  30 +--
 target/riscv/translate.c                |  36 ++-
 target/riscv/insn_trans/trans_rvb.c.inc | 419 ++++++++++----------------------
 15 files changed, 516 insertions(+), 479 deletions(-)



* [PULL 01/26] target/riscv: Introduce temporary in gen_add_uw()
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 02/26] target/riscv: fix clzw implementation to operate on arg1 Alistair Francis
                   ` (25 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Philipp Tomsich, Alistair Francis, Bin Meng,
	Richard Henderson

From: Philipp Tomsich <philipp.tomsich@vrull.eu>

Following the recent changes in translate.c, gen_add_uw() causes
failures on CF3 and SPEC2017 due to the reuse of arg1.  Fix these
regressions by introducing a temporary.
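
The aliasing problem can be illustrated outside of TCG with a minimal
C sketch. It assumes (this is an illustration, not QEMU code) that arg1
may refer to the guest source register itself, so writing to it changes
guest state. The function names and the pointer-based register model
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy shape: zero-extends the source operand in place, so the
 * caller's rs1 value is destroyed as a side effect. */
static void add_uw_clobbering(uint64_t *ret, uint64_t *arg1,
                              const uint64_t *arg2)
{
    *arg1 = (uint32_t)*arg1;
    *ret = *arg1 + *arg2;
}

/* Fixed shape: the zero-extended value lives in a temporary and the
 * source operand is only read. */
static void add_uw_with_temp(uint64_t *ret, const uint64_t *arg1,
                             const uint64_t *arg2)
{
    uint64_t t = (uint32_t)*arg1;
    *ret = t + *arg2;
}

/* Hypothetical self-check comparing both shapes. */
static void add_uw_check(void)
{
    uint64_t rs1 = 0xffffffff00000001ULL, rs2 = 2, rd = 0;

    add_uw_clobbering(&rd, &rs1, &rs2);
    assert(rd == 3);
    assert(rs1 == 1);                       /* rs1 was modified */

    rs1 = 0xffffffff00000001ULL;
    add_uw_with_temp(&rd, &rs1, &rs2);
    assert(rd == 3);
    assert(rs1 == 0xffffffff00000001ULL);   /* rs1 is preserved */
}
```

Both shapes compute the same sum; only the second leaves the source
register untouched, which is the property the temporary restores.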

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210911140016.834071-2-philipp.tomsich@vrull.eu
Fixes: 191d1dafae9c ("target/riscv: Add DisasExtend to gen_arith*")
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 target/riscv/insn_trans/trans_rvb.c.inc | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
index b72e76255c..c0a6e25826 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -624,8 +624,10 @@ GEN_TRANS_SHADD_UW(3)
 
 static void gen_add_uw(TCGv ret, TCGv arg1, TCGv arg2)
 {
-    tcg_gen_ext32u_tl(arg1, arg1);
-    tcg_gen_add_tl(ret, arg1, arg2);
+    TCGv t = tcg_temp_new();
+    tcg_gen_ext32u_tl(t, arg1);
+    tcg_gen_add_tl(ret, t, arg2);
+    tcg_temp_free(t);
 }
 
 static bool trans_add_uw(DisasContext *ctx, arg_add_uw *a)
-- 
2.31.1




* [PULL 02/26] target/riscv: fix clzw implementation to operate on arg1
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
  2021-10-07  6:47 ` [PULL 01/26] target/riscv: Introduce temporary in gen_add_uw() Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 03/26] target/riscv: clwz must ignore high bits (use shift-left & changed logic) Alistair Francis
                   ` (24 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Philipp Tomsich, Alistair Francis, Bin Meng,
	Richard Henderson

From: Philipp Tomsich <philipp.tomsich@vrull.eu>

The refactored gen_clzw() uses ret as its argument, instead of arg1.
Fix it.

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20210911140016.834071-3-philipp.tomsich@vrull.eu
Fixes: 60903915050 ("target/riscv: Add DisasExtend to gen_unary")
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 target/riscv/insn_trans/trans_rvb.c.inc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
index c0a6e25826..6c85c89f6d 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -349,7 +349,7 @@ GEN_TRANS_SHADD(3)
 
 static void gen_clzw(TCGv ret, TCGv arg1)
 {
-    tcg_gen_clzi_tl(ret, ret, 64);
+    tcg_gen_clzi_tl(ret, arg1, 64);
     tcg_gen_subi_tl(ret, ret, 32);
 }
 
-- 
2.31.1




* [PULL 03/26] target/riscv: clwz must ignore high bits (use shift-left & changed logic)
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
  2021-10-07  6:47 ` [PULL 01/26] target/riscv: Introduce temporary in gen_add_uw() Alistair Francis
  2021-10-07  6:47 ` [PULL 02/26] target/riscv: fix clzw implementation to operate on arg1 Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 04/26] target/riscv: Add x-zba, x-zbb, x-zbc and x-zbs properties Alistair Francis
                   ` (23 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Philipp Tomsich, LIU Zhiwei, Alistair Francis

From: Philipp Tomsich <philipp.tomsich@vrull.eu>

Assume clzw is executed on a register that is not sign-extended, such
as in the following sequence, which uses (1ULL << 63) | 392 as the
operand to clzw:
	bseti	a2, zero, 63
	addi	a2, a2, 392
	clzw    a3, a2
The correct result of clzw would be 23, but the current implementation
returns -32 (as it performs a 64-bit clz, which finds 0 leading zero
bits, and then subtracts 32).

Fix this by changing the implementation to:
 1. shift the original register up by 32
 2. perform a target-length (64-bit) clz
 3. return 32 if no bits are set

Marking this instruction as 'w-form' (i.e., setting ctx->w) would not
correctly model the behaviour, as the instruction should not perform
a zero-extension on the input (after all, it is not a .uw instruction)
and the result is always in the range 0..32 (so neither a sign-extension
nor a zero-extension on the result will ever be needed).  Consequently,
we do not set ctx->w and mark the instruction as EXT_NONE.
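
The difference between the two approaches can be checked with a small
C model of an RV64 register. This is a sketch under stated assumptions,
not the TCG code: the helper names are hypothetical, and the "old"
model assumes a 64-bit clz that yields 64 for a zero input.

```c
#include <assert.h>
#include <stdint.h>

/* Count leading zeros of a non-zero 64-bit value (portable loop). */
static int clz64_nonzero(uint64_t x)
{
    int n = 0;
    while ((x >> 63) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Old shape: 64-bit clz over the whole register, then subtract 32. */
static int64_t clzw_old(uint64_t rs1)
{
    int n = rs1 ? clz64_nonzero(rs1) : 64;
    return (int64_t)n - 32;
}

/* New shape: shift the low word to the top, do a 64-bit clz and
 * return 32 when no bits are set. */
static int64_t clzw_new(uint64_t rs1)
{
    uint64_t t = rs1 << 32;
    return t ? clz64_nonzero(t) : 32;
}

/* Hypothetical self-check using the operand from the sequence above. */
static void clzw_check(void)
{
    uint64_t a2 = (1ULL << 63) | 392;

    assert(clzw_old(a2) == -32);   /* high bit leaks into the count */
    assert(clzw_new(a2) == 23);    /* only the low 32 bits matter */
    assert(clzw_new(0) == 32);     /* empty low word */
}
```

Because the shift discards the upper 32 bits, the model needs no
extension of the input, and every result lies in 0..32.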

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Reviewed-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
Message-id: 20210911140016.834071-4-philipp.tomsich@vrull.eu
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 target/riscv/insn_trans/trans_rvb.c.inc | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
index 6c85c89f6d..73d1e45026 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -349,15 +349,17 @@ GEN_TRANS_SHADD(3)
 
 static void gen_clzw(TCGv ret, TCGv arg1)
 {
-    tcg_gen_clzi_tl(ret, arg1, 64);
-    tcg_gen_subi_tl(ret, ret, 32);
+    TCGv t = tcg_temp_new();
+    tcg_gen_shli_tl(t, arg1, 32);
+    tcg_gen_clzi_tl(ret, t, 32);
+    tcg_temp_free(t);
 }
 
 static bool trans_clzw(DisasContext *ctx, arg_clzw *a)
 {
     REQUIRE_64BIT(ctx);
     REQUIRE_EXT(ctx, RVB);
-    return gen_unary(ctx, a, EXT_ZERO, gen_clzw);
+    return gen_unary(ctx, a, EXT_NONE, gen_clzw);
 }
 
 static void gen_ctzw(TCGv ret, TCGv arg1)
-- 
2.31.1




* [PULL 04/26] target/riscv: Add x-zba, x-zbb, x-zbc and x-zbs properties
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (2 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 03/26] target/riscv: clwz must ignore high bits (use shift-left & changed logic) Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 05/26] target/riscv: Reassign instructions to the Zba-extension Alistair Francis
                   ` (22 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Philipp Tomsich, Richard Henderson, Alistair Francis,
	Bin Meng

From: Philipp Tomsich <philipp.tomsich@vrull.eu>

The bit-manipulation ISA extensions will be ratified as individual
small extension packages instead of a large B-extension.  The first
new instructions through the door (these have completed public review)
are Zb[abcs].

This adds new 'x-zba', 'x-zbb', 'x-zbc' and 'x-zbs' properties for
these in target/riscv/cpu.[ch].

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 20210911140016.834071-5-philipp.tomsich@vrull.eu
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 target/riscv/cpu.h | 4 ++++
 target/riscv/cpu.c | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 5896aca346..1a38723f2c 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -293,6 +293,10 @@ struct RISCVCPU {
         bool ext_u;
         bool ext_h;
         bool ext_v;
+        bool ext_zba;
+        bool ext_zbb;
+        bool ext_zbc;
+        bool ext_zbs;
         bool ext_counters;
         bool ext_ifencei;
         bool ext_icsr;
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 7c626d89cd..785a3a8d19 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -617,6 +617,10 @@ static Property riscv_cpu_properties[] = {
     DEFINE_PROP_BOOL("u", RISCVCPU, cfg.ext_u, true),
     /* This is experimental so mark with 'x-' */
     DEFINE_PROP_BOOL("x-b", RISCVCPU, cfg.ext_b, false),
+    DEFINE_PROP_BOOL("x-zba", RISCVCPU, cfg.ext_zba, false),
+    DEFINE_PROP_BOOL("x-zbb", RISCVCPU, cfg.ext_zbb, false),
+    DEFINE_PROP_BOOL("x-zbc", RISCVCPU, cfg.ext_zbc, false),
+    DEFINE_PROP_BOOL("x-zbs", RISCVCPU, cfg.ext_zbs, false),
     DEFINE_PROP_BOOL("x-h", RISCVCPU, cfg.ext_h, false),
     DEFINE_PROP_BOOL("x-v", RISCVCPU, cfg.ext_v, false),
     DEFINE_PROP_BOOL("Counters", RISCVCPU, cfg.ext_counters, true),
-- 
2.31.1




* [PULL 05/26] target/riscv: Reassign instructions to the Zba-extension
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (3 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 04/26] target/riscv: Add x-zba, x-zbb, x-zbc and x-zbs properties Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 06/26] target/riscv: Remove the W-form instructions from Zbs Alistair Francis
                   ` (21 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Philipp Tomsich, Richard Henderson, Alistair Francis,
	Bin Meng

From: Philipp Tomsich <philipp.tomsich@vrull.eu>

The following instructions are part of Zba:
 - add.uw (RV64-only)
 - sh[123]add (RV32 and RV64)
 - sh[123]add.uw (RV64-only)
 - slli.uw (RV64-only)
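
For orientation, the arithmetic these instructions perform on RV64 can
be modelled in plain C. This is a reference sketch based on the Zba
instruction definitions, not the TCG implementation; the function names
are hypothetical and registers are modelled as uint64_t values.

```c
#include <assert.h>
#include <stdint.h>

/* sh1add/sh2add/sh3add: rd = rs2 + (rs1 << n), with n in {1, 2, 3}. */
static uint64_t shnadd(uint64_t rs1, uint64_t rs2, unsigned n)
{
    return rs2 + (rs1 << n);
}

/* add.uw: rd = rs2 + zero-extended low word of rs1. */
static uint64_t add_uw(uint64_t rs1, uint64_t rs2)
{
    return rs2 + (uint64_t)(uint32_t)rs1;
}

/* sh[123]add.uw: as shnadd, but rs1 is zero-extended from 32 bits. */
static uint64_t shnadd_uw(uint64_t rs1, uint64_t rs2, unsigned n)
{
    return rs2 + ((uint64_t)(uint32_t)rs1 << n);
}

/* slli.uw: zero-extend the low word of rs1, then shift left. */
static uint64_t slli_uw(uint64_t rs1, unsigned shamt)
{
    return (uint64_t)(uint32_t)rs1 << shamt;
}

/* Hypothetical self-check of the model. */
static void zba_check(void)
{
    assert(shnadd(3, 0x1000, 2) == 0x100c);
    assert(add_uw(0xffffffff00000001ULL, 2) == 3);
    assert(shnadd_uw(0xffffffff80000000ULL, 0, 1) == 0x100000000ULL);
    assert(slli_uw(0xffffffffffffffffULL, 4) == 0xffffffff0ULL);
}
```

The .uw forms differ from their plain counterparts only in the
zero-extension of rs1, which is why they exist on RV64 alone.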

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 20210911140016.834071-6-philipp.tomsich@vrull.eu
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 target/riscv/insn32.decode              | 20 ++++++++++++--------
 target/riscv/insn_trans/trans_rvb.c.inc | 16 +++++++++++-----
 2 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 2cd921d51c..86f1166dab 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -660,6 +660,18 @@ vamomaxd_v      10100 . . ..... ..... 111 ..... 0101111 @r_wdvm
 vamominud_v     11000 . . ..... ..... 111 ..... 0101111 @r_wdvm
 vamomaxud_v     11100 . . ..... ..... 111 ..... 0101111 @r_wdvm
 
+# *** RV32 Zba Standard Extension ***
+sh1add     0010000 .......... 010 ..... 0110011 @r
+sh2add     0010000 .......... 100 ..... 0110011 @r
+sh3add     0010000 .......... 110 ..... 0110011 @r
+
+# *** RV64 Zba Standard Extension (in addition to RV32 Zba) ***
+add_uw     0000100 .......... 000 ..... 0111011 @r
+sh1add_uw  0010000 .......... 010 ..... 0111011 @r
+sh2add_uw  0010000 .......... 100 ..... 0111011 @r
+sh3add_uw  0010000 .......... 110 ..... 0111011 @r
+slli_uw    00001 ............ 001 ..... 0011011 @sh
+
 # *** RV32B Standard Extension ***
 clz        011000 000000 ..... 001 ..... 0010011 @r2
 ctz        011000 000001 ..... 001 ..... 0010011 @r2
@@ -687,9 +699,6 @@ ror        0110000 .......... 101 ..... 0110011 @r
 rol        0110000 .......... 001 ..... 0110011 @r
 grev       0110100 .......... 101 ..... 0110011 @r
 gorc       0010100 .......... 101 ..... 0110011 @r
-sh1add     0010000 .......... 010 ..... 0110011 @r
-sh2add     0010000 .......... 100 ..... 0110011 @r
-sh3add     0010000 .......... 110 ..... 0110011 @r
 
 bseti      00101. ........... 001 ..... 0010011 @sh
 bclri      01001. ........... 001 ..... 0010011 @sh
@@ -718,10 +727,6 @@ rorw       0110000 .......... 101 ..... 0111011 @r
 rolw       0110000 .......... 001 ..... 0111011 @r
 grevw      0110100 .......... 101 ..... 0111011 @r
 gorcw      0010100 .......... 101 ..... 0111011 @r
-sh1add_uw  0010000 .......... 010 ..... 0111011 @r
-sh2add_uw  0010000 .......... 100 ..... 0111011 @r
-sh3add_uw  0010000 .......... 110 ..... 0111011 @r
-add_uw     0000100 .......... 000 ..... 0111011 @r
 
 bsetiw     0010100 .......... 001 ..... 0011011 @sh5
 bclriw     0100100 .......... 001 ..... 0011011 @sh5
@@ -732,4 +737,3 @@ roriw      0110000 .......... 101 ..... 0011011 @sh5
 greviw     0110100 .......... 101 ..... 0011011 @sh5
 gorciw     0010100 .......... 101 ..... 0011011 @sh5
 
-slli_uw    00001. ........... 001 ..... 0011011 @sh
diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
index 73d1e45026..fd549c7b0f 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -1,8 +1,9 @@
 /*
- * RISC-V translation routines for the RVB Standard Extension.
+ * RISC-V translation routines for the RVB draft and Zba Standard Extension.
  *
  * Copyright (c) 2020 Kito Cheng, kito.cheng@sifive.com
  * Copyright (c) 2020 Frank Chang, frank.chang@sifive.com
+ * Copyright (c) 2021 Philipp Tomsich, philipp.tomsich@vrull.eu
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -17,6 +18,11 @@
  * this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#define REQUIRE_ZBA(ctx) do {                    \
+    if (!RISCV_CPU(ctx->cs)->cfg.ext_zba) {      \
+        return false;                            \
+    }                                            \
+} while (0)
 
 static void gen_clz(TCGv ret, TCGv arg1)
 {
@@ -339,7 +345,7 @@ GEN_SHADD(3)
 #define GEN_TRANS_SHADD(SHAMT)                                             \
 static bool trans_sh##SHAMT##add(DisasContext *ctx, arg_sh##SHAMT##add *a) \
 {                                                                          \
-    REQUIRE_EXT(ctx, RVB);                                                 \
+    REQUIRE_ZBA(ctx);                                                      \
     return gen_arith(ctx, a, EXT_NONE, gen_sh##SHAMT##add);                \
 }
 
@@ -616,7 +622,7 @@ static bool trans_sh##SHAMT##add_uw(DisasContext *ctx,        \
                                     arg_sh##SHAMT##add_uw *a) \
 {                                                             \
     REQUIRE_64BIT(ctx);                                       \
-    REQUIRE_EXT(ctx, RVB);                                    \
+    REQUIRE_ZBA(ctx);                                         \
     return gen_arith(ctx, a, EXT_NONE, gen_sh##SHAMT##add_uw);  \
 }
 
@@ -635,7 +641,7 @@ static void gen_add_uw(TCGv ret, TCGv arg1, TCGv arg2)
 static bool trans_add_uw(DisasContext *ctx, arg_add_uw *a)
 {
     REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBA(ctx);
     return gen_arith(ctx, a, EXT_NONE, gen_add_uw);
 }
 
@@ -647,6 +653,6 @@ static void gen_slli_uw(TCGv dest, TCGv src, target_long shamt)
 static bool trans_slli_uw(DisasContext *ctx, arg_slli_uw *a)
 {
     REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBA(ctx);
     return gen_shift_imm_fn(ctx, a, EXT_NONE, gen_slli_uw);
 }
-- 
2.31.1




* [PULL 06/26] target/riscv: Remove the W-form instructions from Zbs
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (4 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 05/26] target/riscv: Reassign instructions to the Zba-extension Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 07/26] target/riscv: Remove shift-one instructions (proposed Zbo in pre-0.93 draft-B) Alistair Francis
                   ` (20 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Philipp Tomsich, Richard Henderson, Alistair Francis,
	Bin Meng

From: Philipp Tomsich <philipp.tomsich@vrull.eu>

Zbs 1.0.0 (just as the 0.93 draft-B before) does not provide for W-form
instructions for Zbs (single-bit instructions).  Remove them.

Note that these instructions had already been removed for the 0.93
version of the draft-B extension and have not been present in the
binutils patches circulating in January 2021.

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 20210911140016.834071-7-philipp.tomsich@vrull.eu
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 target/riscv/insn32.decode              |  7 ----
 target/riscv/insn_trans/trans_rvb.c.inc | 56 -------------------------
 2 files changed, 63 deletions(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 86f1166dab..b499691a9e 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -717,10 +717,6 @@ cpopw      0110000 00010 ..... 001 ..... 0011011 @r2
 
 packw      0000100 .......... 100 ..... 0111011 @r
 packuw     0100100 .......... 100 ..... 0111011 @r
-bsetw      0010100 .......... 001 ..... 0111011 @r
-bclrw      0100100 .......... 001 ..... 0111011 @r
-binvw      0110100 .......... 001 ..... 0111011 @r
-bextw      0100100 .......... 101 ..... 0111011 @r
 slow       0010000 .......... 001 ..... 0111011 @r
 srow       0010000 .......... 101 ..... 0111011 @r
 rorw       0110000 .......... 101 ..... 0111011 @r
@@ -728,9 +724,6 @@ rolw       0110000 .......... 001 ..... 0111011 @r
 grevw      0110100 .......... 101 ..... 0111011 @r
 gorcw      0010100 .......... 101 ..... 0111011 @r
 
-bsetiw     0010100 .......... 001 ..... 0011011 @sh5
-bclriw     0100100 .......... 001 ..... 0011011 @sh5
-binviw     0110100 .......... 001 ..... 0011011 @sh5
 sloiw      0010000 .......... 001 ..... 0011011 @sh5
 sroiw      0010000 .......... 101 ..... 0011011 @sh5
 roriw      0110000 .......... 101 ..... 0011011 @sh5
diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
index fd549c7b0f..fbe1c3b410 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -420,62 +420,6 @@ static bool trans_packuw(DisasContext *ctx, arg_packuw *a)
     return gen_arith(ctx, a, EXT_NONE, gen_packuw);
 }
 
-static bool trans_bsetw(DisasContext *ctx, arg_bsetw *a)
-{
-    REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
-    ctx->w = true;
-    return gen_shift(ctx, a, EXT_NONE, gen_bset);
-}
-
-static bool trans_bsetiw(DisasContext *ctx, arg_bsetiw *a)
-{
-    REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
-    ctx->w = true;
-    return gen_shift_imm_tl(ctx, a, EXT_NONE, gen_bset);
-}
-
-static bool trans_bclrw(DisasContext *ctx, arg_bclrw *a)
-{
-    REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
-    ctx->w = true;
-    return gen_shift(ctx, a, EXT_NONE, gen_bclr);
-}
-
-static bool trans_bclriw(DisasContext *ctx, arg_bclriw *a)
-{
-    REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
-    ctx->w = true;
-    return gen_shift_imm_tl(ctx, a, EXT_NONE, gen_bclr);
-}
-
-static bool trans_binvw(DisasContext *ctx, arg_binvw *a)
-{
-    REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
-    ctx->w = true;
-    return gen_shift(ctx, a, EXT_NONE, gen_binv);
-}
-
-static bool trans_binviw(DisasContext *ctx, arg_binviw *a)
-{
-    REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
-    ctx->w = true;
-    return gen_shift_imm_tl(ctx, a, EXT_NONE, gen_binv);
-}
-
-static bool trans_bextw(DisasContext *ctx, arg_bextw *a)
-{
-    REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
-    ctx->w = true;
-    return gen_shift(ctx, a, EXT_NONE, gen_bext);
-}
-
 static bool trans_slow(DisasContext *ctx, arg_slow *a)
 {
     REQUIRE_64BIT(ctx);
-- 
2.31.1




* [PULL 07/26] target/riscv: Remove shift-one instructions (proposed Zbo in pre-0.93 draft-B)
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (5 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 06/26] target/riscv: Remove the W-form instructions from Zbs Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 08/26] target/riscv: Reassign instructions to the Zbs-extension Alistair Francis
                   ` (19 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Philipp Tomsich, Richard Henderson, Alistair Francis,
	Bin Meng

From: Philipp Tomsich <philipp.tomsich@vrull.eu>

The Zb[abcs] ratification package does not include the proposed
shift-one instructions. There is currently no clear plan as to whether
these (or variants of them) will be ratified as Zbo (or a different
extension) or what the timeframe for such a decision could be.

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 20210911140016.834071-8-philipp.tomsich@vrull.eu
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 target/riscv/insn32.decode              |  8 ---
 target/riscv/insn_trans/trans_rvb.c.inc | 70 -------------------------
 2 files changed, 78 deletions(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index b499691a9e..e0f6e315a2 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -693,8 +693,6 @@ bset       0010100 .......... 001 ..... 0110011 @r
 bclr       0100100 .......... 001 ..... 0110011 @r
 binv       0110100 .......... 001 ..... 0110011 @r
 bext       0100100 .......... 101 ..... 0110011 @r
-slo        0010000 .......... 001 ..... 0110011 @r
-sro        0010000 .......... 101 ..... 0110011 @r
 ror        0110000 .......... 101 ..... 0110011 @r
 rol        0110000 .......... 001 ..... 0110011 @r
 grev       0110100 .......... 101 ..... 0110011 @r
@@ -704,8 +702,6 @@ bseti      00101. ........... 001 ..... 0010011 @sh
 bclri      01001. ........... 001 ..... 0010011 @sh
 binvi      01101. ........... 001 ..... 0010011 @sh
 bexti      01001. ........... 101 ..... 0010011 @sh
-sloi       00100. ........... 001 ..... 0010011 @sh
-sroi       00100. ........... 101 ..... 0010011 @sh
 rori       01100. ........... 101 ..... 0010011 @sh
 grevi      01101. ........... 101 ..... 0010011 @sh
 gorci      00101. ........... 101 ..... 0010011 @sh
@@ -717,15 +713,11 @@ cpopw      0110000 00010 ..... 001 ..... 0011011 @r2
 
 packw      0000100 .......... 100 ..... 0111011 @r
 packuw     0100100 .......... 100 ..... 0111011 @r
-slow       0010000 .......... 001 ..... 0111011 @r
-srow       0010000 .......... 101 ..... 0111011 @r
 rorw       0110000 .......... 101 ..... 0111011 @r
 rolw       0110000 .......... 001 ..... 0111011 @r
 grevw      0110100 .......... 101 ..... 0111011 @r
 gorcw      0010100 .......... 101 ..... 0111011 @r
 
-sloiw      0010000 .......... 001 ..... 0011011 @sh5
-sroiw      0010000 .......... 101 ..... 0011011 @sh5
 roriw      0110000 .......... 101 ..... 0011011 @sh5
 greviw     0110100 .......... 101 ..... 0011011 @sh5
 gorciw     0010100 .......... 101 ..... 0011011 @sh5
diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
index fbe1c3b410..a5bf40f95b 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -237,44 +237,6 @@ static bool trans_bexti(DisasContext *ctx, arg_bexti *a)
     return gen_shift_imm_tl(ctx, a, EXT_NONE, gen_bext);
 }
 
-static void gen_slo(TCGv ret, TCGv arg1, TCGv arg2)
-{
-    tcg_gen_not_tl(ret, arg1);
-    tcg_gen_shl_tl(ret, ret, arg2);
-    tcg_gen_not_tl(ret, ret);
-}
-
-static bool trans_slo(DisasContext *ctx, arg_slo *a)
-{
-    REQUIRE_EXT(ctx, RVB);
-    return gen_shift(ctx, a, EXT_NONE, gen_slo);
-}
-
-static bool trans_sloi(DisasContext *ctx, arg_sloi *a)
-{
-    REQUIRE_EXT(ctx, RVB);
-    return gen_shift_imm_tl(ctx, a, EXT_NONE, gen_slo);
-}
-
-static void gen_sro(TCGv ret, TCGv arg1, TCGv arg2)
-{
-    tcg_gen_not_tl(ret, arg1);
-    tcg_gen_shr_tl(ret, ret, arg2);
-    tcg_gen_not_tl(ret, ret);
-}
-
-static bool trans_sro(DisasContext *ctx, arg_sro *a)
-{
-    REQUIRE_EXT(ctx, RVB);
-    return gen_shift(ctx, a, EXT_ZERO, gen_sro);
-}
-
-static bool trans_sroi(DisasContext *ctx, arg_sroi *a)
-{
-    REQUIRE_EXT(ctx, RVB);
-    return gen_shift_imm_tl(ctx, a, EXT_ZERO, gen_sro);
-}
-
 static bool trans_ror(DisasContext *ctx, arg_ror *a)
 {
     REQUIRE_EXT(ctx, RVB);
@@ -420,38 +382,6 @@ static bool trans_packuw(DisasContext *ctx, arg_packuw *a)
     return gen_arith(ctx, a, EXT_NONE, gen_packuw);
 }
 
-static bool trans_slow(DisasContext *ctx, arg_slow *a)
-{
-    REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
-    ctx->w = true;
-    return gen_shift(ctx, a, EXT_NONE, gen_slo);
-}
-
-static bool trans_sloiw(DisasContext *ctx, arg_sloiw *a)
-{
-    REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
-    ctx->w = true;
-    return gen_shift_imm_tl(ctx, a, EXT_NONE, gen_slo);
-}
-
-static bool trans_srow(DisasContext *ctx, arg_srow *a)
-{
-    REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
-    ctx->w = true;
-    return gen_shift(ctx, a, EXT_ZERO, gen_sro);
-}
-
-static bool trans_sroiw(DisasContext *ctx, arg_sroiw *a)
-{
-    REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
-    ctx->w = true;
-    return gen_shift_imm_tl(ctx, a, EXT_ZERO, gen_sro);
-}
-
 static void gen_rorw(TCGv ret, TCGv arg1, TCGv arg2)
 {
     TCGv_i32 t1 = tcg_temp_new_i32();
-- 
2.31.1




* [PULL 08/26] target/riscv: Reassign instructions to the Zbs-extension
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (6 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 07/26] target/riscv: Remove shift-one instructions (proposed Zbo in pre-0.93 draft-B) Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 09/26] target/riscv: Add instructions of the Zbc-extension Alistair Francis
                   ` (18 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Philipp Tomsich, Richard Henderson, Alistair Francis,
	Bin Meng

From: Philipp Tomsich <philipp.tomsich@vrull.eu>

The following instructions are part of Zbs:
 - b{set,clr,ext,inv}
 - b{set,clr,ext,inv}i
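
As a reference for what these single-bit instructions compute, here is
a plain C model for RV64. It is a sketch based on the Zbs instruction
definitions rather than the TCG code; the function names and the XLEN
macro are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define XLEN 64u

/* Single-bit mask; the bit index is taken modulo XLEN. */
static uint64_t bit_mask(uint64_t rs2)
{
    return 1ULL << (rs2 & (XLEN - 1));
}

/* bset: set the selected bit. */
static uint64_t bset(uint64_t rs1, uint64_t rs2)
{
    return rs1 | bit_mask(rs2);
}

/* bclr: clear the selected bit. */
static uint64_t bclr(uint64_t rs1, uint64_t rs2)
{
    return rs1 & ~bit_mask(rs2);
}

/* binv: invert the selected bit. */
static uint64_t binv(uint64_t rs1, uint64_t rs2)
{
    return rs1 ^ bit_mask(rs2);
}

/* bext: extract the selected bit into bit 0. */
static uint64_t bext(uint64_t rs1, uint64_t rs2)
{
    return (rs1 >> (rs2 & (XLEN - 1))) & 1;
}

/* Hypothetical self-check of the model. */
static void zbs_check(void)
{
    assert(bset(0, 63) == 0x8000000000000000ULL);
    assert(bset(0, 64) == 1);          /* index wraps: 64 & 63 == 0 */
    assert(bclr(0xff, 0) == 0xfe);
    assert(binv(0x5, 1) == 0x7);
    assert(bext(0x8, 3) == 1);
}
```

The immediate forms behave the same way, with the shift amount encoded
in the instruction taking the place of rs2.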

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 20210911140016.834071-9-philipp.tomsich@vrull.eu
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 target/riscv/insn32.decode              | 17 +++++++++--------
 target/riscv/insn_trans/trans_rvb.c.inc | 25 +++++++++++++++----------
 2 files changed, 24 insertions(+), 18 deletions(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index e0f6e315a2..35a3563ff4 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -689,19 +689,11 @@ min        0000101 .......... 100 ..... 0110011 @r
 minu       0000101 .......... 101 ..... 0110011 @r
 max        0000101 .......... 110 ..... 0110011 @r
 maxu       0000101 .......... 111 ..... 0110011 @r
-bset       0010100 .......... 001 ..... 0110011 @r
-bclr       0100100 .......... 001 ..... 0110011 @r
-binv       0110100 .......... 001 ..... 0110011 @r
-bext       0100100 .......... 101 ..... 0110011 @r
 ror        0110000 .......... 101 ..... 0110011 @r
 rol        0110000 .......... 001 ..... 0110011 @r
 grev       0110100 .......... 101 ..... 0110011 @r
 gorc       0010100 .......... 101 ..... 0110011 @r
 
-bseti      00101. ........... 001 ..... 0010011 @sh
-bclri      01001. ........... 001 ..... 0010011 @sh
-binvi      01101. ........... 001 ..... 0010011 @sh
-bexti      01001. ........... 101 ..... 0010011 @sh
 rori       01100. ........... 101 ..... 0010011 @sh
 grevi      01101. ........... 101 ..... 0010011 @sh
 gorci      00101. ........... 101 ..... 0010011 @sh
@@ -722,3 +714,12 @@ roriw      0110000 .......... 101 ..... 0011011 @sh5
 greviw     0110100 .......... 101 ..... 0011011 @sh5
 gorciw     0010100 .......... 101 ..... 0011011 @sh5
 
+# *** RV32 Zbs Standard Extension ***
+bclr       0100100 .......... 001 ..... 0110011 @r
+bclri      01001. ........... 001 ..... 0010011 @sh
+bext       0100100 .......... 101 ..... 0110011 @r
+bexti      01001. ........... 101 ..... 0010011 @sh
+binv       0110100 .......... 001 ..... 0110011 @r
+binvi      01101. ........... 001 ..... 0010011 @sh
+bset       0010100 .......... 001 ..... 0110011 @r
+bseti      00101. ........... 001 ..... 0010011 @sh
diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
index a5bf40f95b..861364e3e5 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -1,5 +1,5 @@
 /*
- * RISC-V translation routines for the RVB draft and Zba Standard Extension.
+ * RISC-V translation routines for the RVB draft Zb[as] Standard Extension.
  *
  * Copyright (c) 2020 Kito Cheng, kito.cheng@sifive.com
  * Copyright (c) 2020 Frank Chang, frank.chang@sifive.com
@@ -24,11 +24,16 @@
     }                                            \
 } while (0)
 
+#define REQUIRE_ZBS(ctx) do {                    \
+    if (!RISCV_CPU(ctx->cs)->cfg.ext_zbs) {      \
+        return false;                            \
+    }                                            \
+} while (0)
+
 static void gen_clz(TCGv ret, TCGv arg1)
 {
     tcg_gen_clzi_tl(ret, arg1, TARGET_LONG_BITS);
 }
-
 static bool trans_clz(DisasContext *ctx, arg_clz *a)
 {
     REQUIRE_EXT(ctx, RVB);
@@ -165,13 +170,13 @@ static void gen_bset(TCGv ret, TCGv arg1, TCGv shamt)
 
 static bool trans_bset(DisasContext *ctx, arg_bset *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBS(ctx);
     return gen_shift(ctx, a, EXT_NONE, gen_bset);
 }
 
 static bool trans_bseti(DisasContext *ctx, arg_bseti *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBS(ctx);
     return gen_shift_imm_tl(ctx, a, EXT_NONE, gen_bset);
 }
 
@@ -187,13 +192,13 @@ static void gen_bclr(TCGv ret, TCGv arg1, TCGv shamt)
 
 static bool trans_bclr(DisasContext *ctx, arg_bclr *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBS(ctx);
     return gen_shift(ctx, a, EXT_NONE, gen_bclr);
 }
 
 static bool trans_bclri(DisasContext *ctx, arg_bclri *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBS(ctx);
     return gen_shift_imm_tl(ctx, a, EXT_NONE, gen_bclr);
 }
 
@@ -209,13 +214,13 @@ static void gen_binv(TCGv ret, TCGv arg1, TCGv shamt)
 
 static bool trans_binv(DisasContext *ctx, arg_binv *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBS(ctx);
     return gen_shift(ctx, a, EXT_NONE, gen_binv);
 }
 
 static bool trans_binvi(DisasContext *ctx, arg_binvi *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBS(ctx);
     return gen_shift_imm_tl(ctx, a, EXT_NONE, gen_binv);
 }
 
@@ -227,13 +232,13 @@ static void gen_bext(TCGv ret, TCGv arg1, TCGv shamt)
 
 static bool trans_bext(DisasContext *ctx, arg_bext *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBS(ctx);
     return gen_shift(ctx, a, EXT_NONE, gen_bext);
 }
 
 static bool trans_bexti(DisasContext *ctx, arg_bexti *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBS(ctx);
     return gen_shift_imm_tl(ctx, a, EXT_NONE, gen_bext);
 }
 
-- 
2.31.1




* [PULL 09/26] target/riscv: Add instructions of the Zbc-extension
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (7 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 08/26] target/riscv: Reassign instructions to the Zbs-extension Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 10/26] target/riscv: Reassign instructions to the Zbb-extension Alistair Francis
                   ` (17 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Philipp Tomsich, Richard Henderson, Alistair Francis

From: Philipp Tomsich <philipp.tomsich@vrull.eu>

The following instructions are part of Zbc:
 - clmul
 - clmulh
 - clmulr

Note that these instructions were already defined in the pre-0.93 and
the 0.93 draft-B proposals, but had been omitted in the earlier
addition of draft-B to QEMU.
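
As a rough reference model, the three carry-less multiplies can be
written in plain C as below. The sketch assumes XLEN = 64 and uses
hypothetical function names (clmul64, clmulh64, clmulr64); it is meant
to illustrate the semantics, not to mirror the helpers verbatim.

```c
#include <stdint.h>

/* Low 64 bits of the carry-less (XOR) product of rs1 and rs2. */
uint64_t clmul64(uint64_t rs1, uint64_t rs2)
{
    uint64_t result = 0;

    for (int i = 0; i < 64; i++) {
        if ((rs2 >> i) & 1) {
            result ^= rs1 << i;
        }
    }
    return result;
}

/* High 64 bits of the 128-bit carry-less product. */
uint64_t clmulh64(uint64_t rs1, uint64_t rs2)
{
    uint64_t result = 0;

    /* i starts at 1: bit 0 of rs2 contributes nothing to the high half. */
    for (int i = 1; i < 64; i++) {
        if ((rs2 >> i) & 1) {
            result ^= rs1 >> (64 - i);
        }
    }
    return result;
}

/* Bits 126..63 of the product ("reversed" carry-less multiply). */
uint64_t clmulr64(uint64_t rs1, uint64_t rs2)
{
    uint64_t result = 0;

    for (int i = 0; i < 64; i++) {
        if ((rs2 >> i) & 1) {
            result ^= rs1 >> (64 - i - 1);
        }
    }
    return result;
}
```

Under this model clmulh64(a, b) equals clmulr64(a, b) >> 1 for all
inputs, which is why one out-of-line helper plus a right shift by one
is enough to cover both clmulr and clmulh.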

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210911140016.834071-10-philipp.tomsich@vrull.eu
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 target/riscv/helper.h                   |  2 ++
 target/riscv/insn32.decode              |  5 ++++
 target/riscv/bitmanip_helper.c          | 27 +++++++++++++++++++++
 target/riscv/insn_trans/trans_rvb.c.inc | 32 ++++++++++++++++++++++++-
 4 files changed, 65 insertions(+), 1 deletion(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 460eee9988..8a318a2dbc 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -63,6 +63,8 @@ DEF_HELPER_FLAGS_2(grev, TCG_CALL_NO_RWG_SE, tl, tl, tl)
 DEF_HELPER_FLAGS_2(grevw, TCG_CALL_NO_RWG_SE, tl, tl, tl)
 DEF_HELPER_FLAGS_2(gorc, TCG_CALL_NO_RWG_SE, tl, tl, tl)
 DEF_HELPER_FLAGS_2(gorcw, TCG_CALL_NO_RWG_SE, tl, tl, tl)
+DEF_HELPER_FLAGS_2(clmul, TCG_CALL_NO_RWG_SE, tl, tl, tl)
+DEF_HELPER_FLAGS_2(clmulr, TCG_CALL_NO_RWG_SE, tl, tl, tl)
 
 /* Special functions */
 DEF_HELPER_2(csrr, tl, env, int)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 35a3563ff4..1658bb4217 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -714,6 +714,11 @@ roriw      0110000 .......... 101 ..... 0011011 @sh5
 greviw     0110100 .......... 101 ..... 0011011 @sh5
 gorciw     0010100 .......... 101 ..... 0011011 @sh5
 
+# *** RV32 Zbc Standard Extension ***
+clmul      0000101 .......... 001 ..... 0110011 @r
+clmulh     0000101 .......... 011 ..... 0110011 @r
+clmulr     0000101 .......... 010 ..... 0110011 @r
+
 # *** RV32 Zbs Standard Extension ***
 bclr       0100100 .......... 001 ..... 0110011 @r
 bclri      01001. ........... 001 ..... 0010011 @sh
diff --git a/target/riscv/bitmanip_helper.c b/target/riscv/bitmanip_helper.c
index 5b2f795d03..73be5a81c7 100644
--- a/target/riscv/bitmanip_helper.c
+++ b/target/riscv/bitmanip_helper.c
@@ -3,6 +3,7 @@
  *
  * Copyright (c) 2020 Kito Cheng, kito.cheng@sifive.com
  * Copyright (c) 2020 Frank Chang, frank.chang@sifive.com
+ * Copyright (c) 2021 Philipp Tomsich, philipp.tomsich@vrull.eu
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms and conditions of the GNU General Public License,
@@ -88,3 +89,29 @@ target_ulong HELPER(gorcw)(target_ulong rs1, target_ulong rs2)
 {
     return do_gorc(rs1, rs2, 32);
 }
+
+target_ulong HELPER(clmul)(target_ulong rs1, target_ulong rs2)
+{
+    target_ulong result = 0;
+
+    for (int i = 0; i < TARGET_LONG_BITS; i++) {
+        if ((rs2 >> i) & 1) {
+            result ^= (rs1 << i);
+        }
+    }
+
+    return result;
+}
+
+target_ulong HELPER(clmulr)(target_ulong rs1, target_ulong rs2)
+{
+    target_ulong result = 0;
+
+    for (int i = 0; i < TARGET_LONG_BITS; i++) {
+        if ((rs2 >> i) & 1) {
+            result ^= (rs1 >> (TARGET_LONG_BITS - i - 1));
+        }
+    }
+
+    return result;
+}
diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
index 861364e3e5..2eb5fa3640 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -1,5 +1,5 @@
 /*
- * RISC-V translation routines for the RVB draft Zb[as] Standard Extension.
+ * RISC-V translation routines for the Zb[acs] Standard Extension.
  *
  * Copyright (c) 2020 Kito Cheng, kito.cheng@sifive.com
  * Copyright (c) 2020 Frank Chang, frank.chang@sifive.com
@@ -24,6 +24,12 @@
     }                                            \
 } while (0)
 
+#define REQUIRE_ZBC(ctx) do {                    \
+    if (!RISCV_CPU(ctx->cs)->cfg.ext_zbc) {      \
+        return false;                            \
+    }                                            \
+} while (0)
+
 #define REQUIRE_ZBS(ctx) do {                    \
     if (!RISCV_CPU(ctx->cs)->cfg.ext_zbs) {      \
         return false;                            \
@@ -535,3 +541,27 @@ static bool trans_slli_uw(DisasContext *ctx, arg_slli_uw *a)
     REQUIRE_ZBA(ctx);
     return gen_shift_imm_fn(ctx, a, EXT_NONE, gen_slli_uw);
 }
+
+static bool trans_clmul(DisasContext *ctx, arg_clmul *a)
+{
+    REQUIRE_ZBC(ctx);
+    return gen_arith(ctx, a, EXT_NONE, gen_helper_clmul);
+}
+
+static void gen_clmulh(TCGv dst, TCGv src1, TCGv src2)
+{
+     gen_helper_clmulr(dst, src1, src2);
+     tcg_gen_shri_tl(dst, dst, 1);
+}
+
+static bool trans_clmulh(DisasContext *ctx, arg_clmulr *a)
+{
+    REQUIRE_ZBC(ctx);
+    return gen_arith(ctx, a, EXT_NONE, gen_clmulh);
+}
+
+static bool trans_clmulr(DisasContext *ctx, arg_clmulh *a)
+{
+    REQUIRE_ZBC(ctx);
+    return gen_arith(ctx, a, EXT_NONE, gen_helper_clmulr);
+}
-- 
2.31.1




* [PULL 10/26] target/riscv: Reassign instructions to the Zbb-extension
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (8 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 09/26] target/riscv: Add instructions of the Zbc-extension Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 11/26] target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci Alistair Francis
                   ` (16 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Philipp Tomsich, Richard Henderson, Alistair Francis,
	Bin Meng

From: Philipp Tomsich <philipp.tomsich@vrull.eu>

This reassigns the instructions that are part of Zbb into it, with the
notable exceptions of the instructions (rev8, zext.h and orc.b) that
changed due to gorci, grevi and pack not being part of Zb[abcs].

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 20210911140016.834071-11-philipp.tomsich@vrull.eu
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 target/riscv/insn32.decode              | 40 ++++++++++---------
 target/riscv/insn_trans/trans_rvb.c.inc | 51 ++++++++++++++-----------
 2 files changed, 50 insertions(+), 41 deletions(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 1658bb4217..a509cfee11 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -672,45 +672,47 @@ sh2add_uw  0010000 .......... 100 ..... 0111011 @r
 sh3add_uw  0010000 .......... 110 ..... 0111011 @r
 slli_uw    00001 ............ 001 ..... 0011011 @sh
 
-# *** RV32B Standard Extension ***
+# *** RV32 Zbb Standard Extension ***
+andn       0100000 .......... 111 ..... 0110011 @r
 clz        011000 000000 ..... 001 ..... 0010011 @r2
-ctz        011000 000001 ..... 001 ..... 0010011 @r2
 cpop       011000 000010 ..... 001 ..... 0010011 @r2
+ctz        011000 000001 ..... 001 ..... 0010011 @r2
+max        0000101 .......... 110 ..... 0110011 @r
+maxu       0000101 .......... 111 ..... 0110011 @r
+min        0000101 .......... 100 ..... 0110011 @r
+minu       0000101 .......... 101 ..... 0110011 @r
+orn        0100000 .......... 110 ..... 0110011 @r
+rol        0110000 .......... 001 ..... 0110011 @r
+ror        0110000 .......... 101 ..... 0110011 @r
+rori       01100 ............ 101 ..... 0010011 @sh
 sext_b     011000 000100 ..... 001 ..... 0010011 @r2
 sext_h     011000 000101 ..... 001 ..... 0010011 @r2
-
-andn       0100000 .......... 111 ..... 0110011 @r
-orn        0100000 .......... 110 ..... 0110011 @r
 xnor       0100000 .......... 100 ..... 0110011 @r
+
+# *** RV64 Zbb Standard Extension (in addition to RV32 Zbb) ***
+clzw       0110000 00000 ..... 001 ..... 0011011 @r2
+ctzw       0110000 00001 ..... 001 ..... 0011011 @r2
+cpopw      0110000 00010 ..... 001 ..... 0011011 @r2
+rolw       0110000 .......... 001 ..... 0111011 @r
+roriw      0110000 .......... 101 ..... 0011011 @sh5
+rorw       0110000 .......... 101 ..... 0111011 @r
+
+# *** RV32B Standard Extension ***
 pack       0000100 .......... 100 ..... 0110011 @r
 packu      0100100 .......... 100 ..... 0110011 @r
 packh      0000100 .......... 111 ..... 0110011 @r
-min        0000101 .......... 100 ..... 0110011 @r
-minu       0000101 .......... 101 ..... 0110011 @r
-max        0000101 .......... 110 ..... 0110011 @r
-maxu       0000101 .......... 111 ..... 0110011 @r
-ror        0110000 .......... 101 ..... 0110011 @r
-rol        0110000 .......... 001 ..... 0110011 @r
 grev       0110100 .......... 101 ..... 0110011 @r
 gorc       0010100 .......... 101 ..... 0110011 @r
 
-rori       01100. ........... 101 ..... 0010011 @sh
 grevi      01101. ........... 101 ..... 0010011 @sh
 gorci      00101. ........... 101 ..... 0010011 @sh
 
 # *** RV64B Standard Extension (in addition to RV32B) ***
-clzw       0110000 00000 ..... 001 ..... 0011011 @r2
-ctzw       0110000 00001 ..... 001 ..... 0011011 @r2
-cpopw      0110000 00010 ..... 001 ..... 0011011 @r2
-
 packw      0000100 .......... 100 ..... 0111011 @r
 packuw     0100100 .......... 100 ..... 0111011 @r
-rorw       0110000 .......... 101 ..... 0111011 @r
-rolw       0110000 .......... 001 ..... 0111011 @r
 grevw      0110100 .......... 101 ..... 0111011 @r
 gorcw      0010100 .......... 101 ..... 0111011 @r
 
-roriw      0110000 .......... 101 ..... 0011011 @sh5
 greviw     0110100 .......... 101 ..... 0011011 @sh5
 gorciw     0010100 .......... 101 ..... 0011011 @sh5
 
diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
index 2eb5fa3640..bdfb495f24 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -1,5 +1,5 @@
 /*
- * RISC-V translation routines for the Zb[acs] Standard Extension.
+ * RISC-V translation routines for the Zb[abcs] Standard Extension.
  *
  * Copyright (c) 2020 Kito Cheng, kito.cheng@sifive.com
  * Copyright (c) 2020 Frank Chang, frank.chang@sifive.com
@@ -24,6 +24,12 @@
     }                                            \
 } while (0)
 
+#define REQUIRE_ZBB(ctx) do {                    \
+    if (!RISCV_CPU(ctx->cs)->cfg.ext_zbb) {      \
+        return false;                            \
+    }                                            \
+} while (0)
+
 #define REQUIRE_ZBC(ctx) do {                    \
     if (!RISCV_CPU(ctx->cs)->cfg.ext_zbc) {      \
         return false;                            \
@@ -40,9 +46,10 @@ static void gen_clz(TCGv ret, TCGv arg1)
 {
     tcg_gen_clzi_tl(ret, arg1, TARGET_LONG_BITS);
 }
+
 static bool trans_clz(DisasContext *ctx, arg_clz *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBB(ctx);
     return gen_unary(ctx, a, EXT_ZERO, gen_clz);
 }
 
@@ -53,31 +60,31 @@ static void gen_ctz(TCGv ret, TCGv arg1)
 
 static bool trans_ctz(DisasContext *ctx, arg_ctz *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBB(ctx);
     return gen_unary(ctx, a, EXT_ZERO, gen_ctz);
 }
 
 static bool trans_cpop(DisasContext *ctx, arg_cpop *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBB(ctx);
     return gen_unary(ctx, a, EXT_ZERO, tcg_gen_ctpop_tl);
 }
 
 static bool trans_andn(DisasContext *ctx, arg_andn *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBB(ctx);
     return gen_arith(ctx, a, EXT_NONE, tcg_gen_andc_tl);
 }
 
 static bool trans_orn(DisasContext *ctx, arg_orn *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBB(ctx);
     return gen_arith(ctx, a, EXT_NONE, tcg_gen_orc_tl);
 }
 
 static bool trans_xnor(DisasContext *ctx, arg_xnor *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBB(ctx);
     return gen_arith(ctx, a, EXT_NONE, tcg_gen_eqv_tl);
 }
 
@@ -124,37 +131,37 @@ static bool trans_packh(DisasContext *ctx, arg_packh *a)
 
 static bool trans_min(DisasContext *ctx, arg_min *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBB(ctx);
     return gen_arith(ctx, a, EXT_SIGN, tcg_gen_smin_tl);
 }
 
 static bool trans_max(DisasContext *ctx, arg_max *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBB(ctx);
     return gen_arith(ctx, a, EXT_SIGN, tcg_gen_smax_tl);
 }
 
 static bool trans_minu(DisasContext *ctx, arg_minu *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBB(ctx);
     return gen_arith(ctx, a, EXT_SIGN, tcg_gen_umin_tl);
 }
 
 static bool trans_maxu(DisasContext *ctx, arg_maxu *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBB(ctx);
     return gen_arith(ctx, a, EXT_SIGN, tcg_gen_umax_tl);
 }
 
 static bool trans_sext_b(DisasContext *ctx, arg_sext_b *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBB(ctx);
     return gen_unary(ctx, a, EXT_NONE, tcg_gen_ext8s_tl);
 }
 
 static bool trans_sext_h(DisasContext *ctx, arg_sext_h *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBB(ctx);
     return gen_unary(ctx, a, EXT_NONE, tcg_gen_ext16s_tl);
 }
 
@@ -250,19 +257,19 @@ static bool trans_bexti(DisasContext *ctx, arg_bexti *a)
 
 static bool trans_ror(DisasContext *ctx, arg_ror *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBB(ctx);
     return gen_shift(ctx, a, EXT_NONE, tcg_gen_rotr_tl);
 }
 
 static bool trans_rori(DisasContext *ctx, arg_rori *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBB(ctx);
     return gen_shift_imm_fn(ctx, a, EXT_NONE, tcg_gen_rotri_tl);
 }
 
 static bool trans_rol(DisasContext *ctx, arg_rol *a)
 {
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBB(ctx);
     return gen_shift(ctx, a, EXT_NONE, tcg_gen_rotl_tl);
 }
 
@@ -337,7 +344,7 @@ static void gen_clzw(TCGv ret, TCGv arg1)
 static bool trans_clzw(DisasContext *ctx, arg_clzw *a)
 {
     REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBB(ctx);
     return gen_unary(ctx, a, EXT_NONE, gen_clzw);
 }
 
@@ -350,14 +357,14 @@ static void gen_ctzw(TCGv ret, TCGv arg1)
 static bool trans_ctzw(DisasContext *ctx, arg_ctzw *a)
 {
     REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBB(ctx);
     return gen_unary(ctx, a, EXT_NONE, gen_ctzw);
 }
 
 static bool trans_cpopw(DisasContext *ctx, arg_cpopw *a)
 {
     REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBB(ctx);
     ctx->w = true;
     return gen_unary(ctx, a, EXT_ZERO, tcg_gen_ctpop_tl);
 }
@@ -414,7 +421,7 @@ static void gen_rorw(TCGv ret, TCGv arg1, TCGv arg2)
 static bool trans_rorw(DisasContext *ctx, arg_rorw *a)
 {
     REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBB(ctx);
     ctx->w = true;
     return gen_shift(ctx, a, EXT_NONE, gen_rorw);
 }
@@ -422,7 +429,7 @@ static bool trans_rorw(DisasContext *ctx, arg_rorw *a)
 static bool trans_roriw(DisasContext *ctx, arg_roriw *a)
 {
     REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBB(ctx);
     ctx->w = true;
     return gen_shift_imm_tl(ctx, a, EXT_NONE, gen_rorw);
 }
@@ -448,7 +455,7 @@ static void gen_rolw(TCGv ret, TCGv arg1, TCGv arg2)
 static bool trans_rolw(DisasContext *ctx, arg_rolw *a)
 {
     REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
+    REQUIRE_ZBB(ctx);
     ctx->w = true;
     return gen_shift(ctx, a, EXT_NONE, gen_rolw);
 }
-- 
2.31.1




* [PULL 11/26] target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (9 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 10/26] target/riscv: Reassign instructions to the Zbb-extension Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-13  9:36   ` Vincent Palatin
  2021-10-07  6:47 ` [PULL 12/26] target/riscv: Add a REQUIRE_32BIT macro Alistair Francis
                   ` (15 subsequent siblings)
  26 siblings, 1 reply; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Philipp Tomsich, Richard Henderson, Alistair Francis

From: Philipp Tomsich <philipp.tomsich@vrull.eu>

The 1.0.0 version of Zbb does not contain gorc/gorci.  Instead, an
orc.b instruction (equivalent to the orc.b pseudo-instruction built on
gorci from pre-0.93 draft-B) is available, mainly targeting
string-processing workloads.

This commit adds the new orc.b instruction and removes gorc/gorci.
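
To make the intended semantics concrete: orc.b turns every non-zero
byte into 0xff and leaves zero bytes as 0x00. The sketch below gives a
byte-by-byte reference and one possible branch-free formulation for a
64-bit register. Both function names are hypothetical, and the
branch-free variant is one carry-safe way to express the operation,
not necessarily the sequence emitted by this patch.

```c
#include <stdint.h>

/* Reference model: handle each of the eight bytes separately. */
uint64_t orc_b_ref(uint64_t x)
{
    uint64_t result = 0;

    for (int i = 0; i < 8; i++) {
        if ((x >> (8 * i)) & 0xff) {
            result |= UINT64_C(0xff) << (8 * i);
        }
    }
    return result;
}

/* Branch-free variant that never carries or borrows across bytes. */
uint64_t orc_b_swar(uint64_t x)
{
    const uint64_t low7 = UINT64_C(0x7f7f7f7f7f7f7f7f);
    uint64_t t;

    /* The msb of each byte becomes 1 if any of its low 7 bits is set. */
    t = (x & low7) + low7;
    /* Also cover bytes whose only set bit is the msb itself. */
    t |= x;
    /* Move each byte's msb down to its lsb: 0x01 per non-zero byte. */
    t = (t & ~low7) >> 7;
    /* Replicate the lsb of each byte across the byte. */
    return t * 0xff;
}
```

For string processing, a word contains a NUL byte exactly when the
orc.b result is not all-ones.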

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210911140016.834071-12-philipp.tomsich@vrull.eu
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 target/riscv/helper.h                   |  2 --
 target/riscv/insn32.decode              |  6 +---
 target/riscv/bitmanip_helper.c          | 26 -----------------
 target/riscv/insn_trans/trans_rvb.c.inc | 39 +++++++++++--------------
 4 files changed, 18 insertions(+), 55 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 8a318a2dbc..a9bda2c8ac 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -61,8 +61,6 @@ DEF_HELPER_FLAGS_1(fclass_d, TCG_CALL_NO_RWG_SE, tl, i64)
 /* Bitmanip */
 DEF_HELPER_FLAGS_2(grev, TCG_CALL_NO_RWG_SE, tl, tl, tl)
 DEF_HELPER_FLAGS_2(grevw, TCG_CALL_NO_RWG_SE, tl, tl, tl)
-DEF_HELPER_FLAGS_2(gorc, TCG_CALL_NO_RWG_SE, tl, tl, tl)
-DEF_HELPER_FLAGS_2(gorcw, TCG_CALL_NO_RWG_SE, tl, tl, tl)
 DEF_HELPER_FLAGS_2(clmul, TCG_CALL_NO_RWG_SE, tl, tl, tl)
 DEF_HELPER_FLAGS_2(clmulr, TCG_CALL_NO_RWG_SE, tl, tl, tl)
 
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index a509cfee11..59202196dc 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -681,6 +681,7 @@ max        0000101 .......... 110 ..... 0110011 @r
 maxu       0000101 .......... 111 ..... 0110011 @r
 min        0000101 .......... 100 ..... 0110011 @r
 minu       0000101 .......... 101 ..... 0110011 @r
+orc_b      001010 000111 ..... 101 ..... 0010011 @r2
 orn        0100000 .......... 110 ..... 0110011 @r
 rol        0110000 .......... 001 ..... 0110011 @r
 ror        0110000 .......... 101 ..... 0110011 @r
@@ -702,19 +703,14 @@ pack       0000100 .......... 100 ..... 0110011 @r
 packu      0100100 .......... 100 ..... 0110011 @r
 packh      0000100 .......... 111 ..... 0110011 @r
 grev       0110100 .......... 101 ..... 0110011 @r
-gorc       0010100 .......... 101 ..... 0110011 @r
-
 grevi      01101. ........... 101 ..... 0010011 @sh
-gorci      00101. ........... 101 ..... 0010011 @sh
 
 # *** RV64B Standard Extension (in addition to RV32B) ***
 packw      0000100 .......... 100 ..... 0111011 @r
 packuw     0100100 .......... 100 ..... 0111011 @r
 grevw      0110100 .......... 101 ..... 0111011 @r
-gorcw      0010100 .......... 101 ..... 0111011 @r
 
 greviw     0110100 .......... 101 ..... 0011011 @sh5
-gorciw     0010100 .......... 101 ..... 0011011 @sh5
 
 # *** RV32 Zbc Standard Extension ***
 clmul      0000101 .......... 001 ..... 0110011 @r
diff --git a/target/riscv/bitmanip_helper.c b/target/riscv/bitmanip_helper.c
index 73be5a81c7..bb48388fcd 100644
--- a/target/riscv/bitmanip_helper.c
+++ b/target/riscv/bitmanip_helper.c
@@ -64,32 +64,6 @@ target_ulong HELPER(grevw)(target_ulong rs1, target_ulong rs2)
     return do_grev(rs1, rs2, 32);
 }
 
-static target_ulong do_gorc(target_ulong rs1,
-                            target_ulong rs2,
-                            int bits)
-{
-    target_ulong x = rs1;
-    int i, shift;
-
-    for (i = 0, shift = 1; shift < bits; i++, shift <<= 1) {
-        if (rs2 & shift) {
-            x |= do_swap(x, adjacent_masks[i], shift);
-        }
-    }
-
-    return x;
-}
-
-target_ulong HELPER(gorc)(target_ulong rs1, target_ulong rs2)
-{
-    return do_gorc(rs1, rs2, TARGET_LONG_BITS);
-}
-
-target_ulong HELPER(gorcw)(target_ulong rs1, target_ulong rs2)
-{
-    return do_gorc(rs1, rs2, 32);
-}
-
 target_ulong HELPER(clmul)(target_ulong rs1, target_ulong rs2)
 {
     target_ulong result = 0;
diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
index bdfb495f24..d32af5915a 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -295,16 +295,27 @@ static bool trans_grevi(DisasContext *ctx, arg_grevi *a)
     return gen_shift_imm_fn(ctx, a, EXT_NONE, gen_grevi);
 }
 
-static bool trans_gorc(DisasContext *ctx, arg_gorc *a)
+static void gen_orc_b(TCGv ret, TCGv source1)
 {
-    REQUIRE_EXT(ctx, RVB);
-    return gen_shift(ctx, a, EXT_ZERO, gen_helper_gorc);
+    TCGv  tmp = tcg_temp_new();
+    TCGv  ones = tcg_constant_tl(dup_const_tl(MO_8, 0x01));
+
+    /* Set lsb in each byte if the byte was zero. */
+    tcg_gen_sub_tl(tmp, source1, ones);
+    tcg_gen_andc_tl(tmp, tmp, source1);
+    tcg_gen_shri_tl(tmp, tmp, 7);
+    tcg_gen_andc_tl(tmp, ones, tmp);
+
+    /* Replicate the lsb of each byte across the byte. */
+    tcg_gen_muli_tl(ret, tmp, 0xff);
+
+    tcg_temp_free(tmp);
 }
 
-static bool trans_gorci(DisasContext *ctx, arg_gorci *a)
+static bool trans_orc_b(DisasContext *ctx, arg_orc_b *a)
 {
-    REQUIRE_EXT(ctx, RVB);
-    return gen_shift_imm_tl(ctx, a, EXT_ZERO, gen_helper_gorc);
+    REQUIRE_ZBB(ctx);
+    return gen_unary(ctx, a, EXT_ZERO, gen_orc_b);
 }
 
 #define GEN_SHADD(SHAMT)                                       \
@@ -476,22 +487,6 @@ static bool trans_greviw(DisasContext *ctx, arg_greviw *a)
     return gen_shift_imm_tl(ctx, a, EXT_ZERO, gen_helper_grev);
 }
 
-static bool trans_gorcw(DisasContext *ctx, arg_gorcw *a)
-{
-    REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
-    ctx->w = true;
-    return gen_shift(ctx, a, EXT_ZERO, gen_helper_gorc);
-}
-
-static bool trans_gorciw(DisasContext *ctx, arg_gorciw *a)
-{
-    REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
-    ctx->w = true;
-    return gen_shift_imm_tl(ctx, a, EXT_ZERO, gen_helper_gorc);
-}
-
 #define GEN_SHADD_UW(SHAMT)                                       \
 static void gen_sh##SHAMT##add_uw(TCGv ret, TCGv arg1, TCGv arg2) \
 {                                                                 \
-- 
2.31.1




* [PULL 12/26] target/riscv: Add a REQUIRE_32BIT macro
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (10 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 11/26] target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 13/26] target/riscv: Add rev8 instruction, removing grev/grevi Alistair Francis
                   ` (14 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Philipp Tomsich, Richard Henderson, Alistair Francis,
	Bin Meng

From: Philipp Tomsich <philipp.tomsich@vrull.eu>

With the changes to Zb[abcs], there are some encodings that are
different in RV64 and RV32 (e.g., for rev8 and zext.h). For these,
we'll need a helper macro allowing us to select on RV32, as well.
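
The guard pattern can be illustrated outside of QEMU with a small
standalone sketch. DemoContext, the DEMO_* macros and the demo_trans_*
functions are hypothetical stand-ins for DisasContext, the REQUIRE_*
macros and the per-instruction translators; only the early-return
idiom itself is taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the translator context. */
typedef struct {
    bool is_rv32;
} DemoContext;

/* Reject the instruction (return false) unless translating for RV32. */
#define DEMO_REQUIRE_32BIT(ctx) do { \
    if (!(ctx)->is_rv32) {           \
        return false;                \
    }                                \
} while (0)

/* Reject the instruction (return false) unless translating for RV64. */
#define DEMO_REQUIRE_64BIT(ctx) do { \
    if ((ctx)->is_rv32) {            \
        return false;                \
    }                                \
} while (0)

/* An encoding that is only valid on RV32. */
static bool demo_trans_rv32_only(DemoContext *ctx)
{
    DEMO_REQUIRE_32BIT(ctx);
    return true;    /* the real translator would emit code here */
}

/* The RV64 counterpart, which uses a different encoding. */
static bool demo_trans_rv64_only(DemoContext *ctx)
{
    DEMO_REQUIRE_64BIT(ctx);
    return true;
}
```

Returning false from a translator function makes the decoder treat the
encoding as an illegal instruction, so two encodings of the same
mnemonic can each be restricted to their own base ISA.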

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 20210911140016.834071-13-philipp.tomsich@vrull.eu
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 target/riscv/translate.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 74b33fa3c9..b2d3444bc5 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -337,6 +337,12 @@ EX_SH(12)
     }                              \
 } while (0)
 
+#define REQUIRE_32BIT(ctx) do { \
+    if (!is_32bit(ctx)) {       \
+        return false;           \
+    }                           \
+} while (0)
+
 #define REQUIRE_64BIT(ctx) do { \
     if (is_32bit(ctx)) {        \
         return false;           \
-- 
2.31.1




* [PULL 13/26] target/riscv: Add rev8 instruction, removing grev/grevi
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (11 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 12/26] target/riscv: Add a REQUIRE_32BIT macro Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 14/26] target/riscv: Add zext.h instructions to Zbb, removing pack/packu/packh Alistair Francis
                   ` (13 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Philipp Tomsich, Richard Henderson, Alistair Francis

From: Philipp Tomsich <philipp.tomsich@vrull.eu>

The 1.0.0 version of Zbb does not contain grev/grevi.  Instead, a
rev8 instruction (equivalent to the rev8 pseudo-instruction built on
grevi from pre-0.93 draft-B) is available.

This commit adds the new rev8 instruction and removes grev/grevi.

Note that there is no W-form of this instruction (both a
sign-extending and zero-extending 32-bit version can easily be
synthesized by following rev8 with either a srai or srli instruction
on RV64) and that the opcode encodings for rev8 in RV32 and RV64 are
different.
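
The synthesis of the 32-bit forms mentioned above can be sketched in
plain C for RV64 as follows. The function names are hypothetical and
the code only models the architectural results: rev8 followed by
"srli rd, rd, 32" gives the zero-extended byte swap of the low word,
and rev8 followed by "srai rd, rd, 32" gives the sign-extended one.

```c
#include <stdint.h>

/* rev8 on RV64: reverse the order of all eight bytes. */
uint64_t rev8_64(uint64_t x)
{
    uint64_t result = 0;

    for (int i = 0; i < 8; i++) {
        result = (result << 8) | ((x >> (8 * i)) & 0xff);
    }
    return result;
}

/* rev8 + srli 32: byte-swapped low word, zero-extended to 64 bits. */
uint64_t rev8w_zext(uint64_t x)
{
    return rev8_64(x) >> 32;
}

/* rev8 + srai 32: byte-swapped low word, sign-extended to 64 bits. */
uint64_t rev8w_sext(uint64_t x)
{
    uint64_t hi = rev8_64(x) >> 32;

    /* Portable sign extension from bit 31 using unsigned arithmetic. */
    return (hi ^ UINT64_C(0x80000000)) - UINT64_C(0x80000000);
}
```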

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210911140016.834071-14-philipp.tomsich@vrull.eu
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 target/riscv/helper.h                   |  2 --
 target/riscv/insn32.decode              | 12 ++++----
 target/riscv/bitmanip_helper.c          | 40 -------------------------
 target/riscv/insn_trans/trans_rvb.c.inc | 40 +++++--------------------
 4 files changed, 15 insertions(+), 79 deletions(-)

diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index a9bda2c8ac..c7a5376227 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -59,8 +59,6 @@ DEF_HELPER_FLAGS_2(fcvt_d_lu, TCG_CALL_NO_RWG, i64, env, tl)
 DEF_HELPER_FLAGS_1(fclass_d, TCG_CALL_NO_RWG_SE, tl, i64)
 
 /* Bitmanip */
-DEF_HELPER_FLAGS_2(grev, TCG_CALL_NO_RWG_SE, tl, tl, tl)
-DEF_HELPER_FLAGS_2(grevw, TCG_CALL_NO_RWG_SE, tl, tl, tl)
 DEF_HELPER_FLAGS_2(clmul, TCG_CALL_NO_RWG_SE, tl, tl, tl)
 DEF_HELPER_FLAGS_2(clmulr, TCG_CALL_NO_RWG_SE, tl, tl, tl)
 
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 59202196dc..901a66c0f5 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -683,6 +683,9 @@ min        0000101 .......... 100 ..... 0110011 @r
 minu       0000101 .......... 101 ..... 0110011 @r
 orc_b      001010 000111 ..... 101 ..... 0010011 @r2
 orn        0100000 .......... 110 ..... 0110011 @r
+# The encoding for rev8 differs between RV32 and RV64.
+# rev8_32 denotes the RV32 variant.
+rev8_32    011010 011000 ..... 101 ..... 0010011 @r2
 rol        0110000 .......... 001 ..... 0110011 @r
 ror        0110000 .......... 101 ..... 0110011 @r
 rori       01100 ............ 101 ..... 0010011 @sh
@@ -694,6 +697,10 @@ xnor       0100000 .......... 100 ..... 0110011 @r
 clzw       0110000 00000 ..... 001 ..... 0011011 @r2
 ctzw       0110000 00001 ..... 001 ..... 0011011 @r2
 cpopw      0110000 00010 ..... 001 ..... 0011011 @r2
+# The encoding for rev8 differs between RV32 and RV64.
+# When executing on RV64, the encoding used in RV32 is an illegal
+# instruction, so we use different handler functions to differentiate.
+rev8_64    011010 111000 ..... 101 ..... 0010011 @r2
 rolw       0110000 .......... 001 ..... 0111011 @r
 roriw      0110000 .......... 101 ..... 0011011 @sh5
 rorw       0110000 .......... 101 ..... 0111011 @r
@@ -702,15 +709,10 @@ rorw       0110000 .......... 101 ..... 0111011 @r
 pack       0000100 .......... 100 ..... 0110011 @r
 packu      0100100 .......... 100 ..... 0110011 @r
 packh      0000100 .......... 111 ..... 0110011 @r
-grev       0110100 .......... 101 ..... 0110011 @r
-grevi      01101. ........... 101 ..... 0010011 @sh
 
 # *** RV64B Standard Extension (in addition to RV32B) ***
 packw      0000100 .......... 100 ..... 0111011 @r
 packuw     0100100 .......... 100 ..... 0111011 @r
-grevw      0110100 .......... 101 ..... 0111011 @r
-
-greviw     0110100 .......... 101 ..... 0011011 @sh5
 
 # *** RV32 Zbc Standard Extension ***
 clmul      0000101 .......... 001 ..... 0110011 @r
diff --git a/target/riscv/bitmanip_helper.c b/target/riscv/bitmanip_helper.c
index bb48388fcd..f1b5e5549f 100644
--- a/target/riscv/bitmanip_helper.c
+++ b/target/riscv/bitmanip_helper.c
@@ -24,46 +24,6 @@
 #include "exec/helper-proto.h"
 #include "tcg/tcg.h"
 
-static const uint64_t adjacent_masks[] = {
-    dup_const(MO_8, 0x55),
-    dup_const(MO_8, 0x33),
-    dup_const(MO_8, 0x0f),
-    dup_const(MO_16, 0xff),
-    dup_const(MO_32, 0xffff),
-    UINT32_MAX
-};
-
-static inline target_ulong do_swap(target_ulong x, uint64_t mask, int shift)
-{
-    return ((x & mask) << shift) | ((x & ~mask) >> shift);
-}
-
-static target_ulong do_grev(target_ulong rs1,
-                            target_ulong rs2,
-                            int bits)
-{
-    target_ulong x = rs1;
-    int i, shift;
-
-    for (i = 0, shift = 1; shift < bits; i++, shift <<= 1) {
-        if (rs2 & shift) {
-            x = do_swap(x, adjacent_masks[i], shift);
-        }
-    }
-
-    return x;
-}
-
-target_ulong HELPER(grev)(target_ulong rs1, target_ulong rs2)
-{
-    return do_grev(rs1, rs2, TARGET_LONG_BITS);
-}
-
-target_ulong HELPER(grevw)(target_ulong rs1, target_ulong rs2)
-{
-    return do_grev(rs1, rs2, 32);
-}
-
 target_ulong HELPER(clmul)(target_ulong rs1, target_ulong rs2)
 {
     target_ulong result = 0;
diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
index d32af5915a..48a7c9ca5e 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -273,26 +273,18 @@ static bool trans_rol(DisasContext *ctx, arg_rol *a)
     return gen_shift(ctx, a, EXT_NONE, tcg_gen_rotl_tl);
 }
 
-static bool trans_grev(DisasContext *ctx, arg_grev *a)
+static bool trans_rev8_32(DisasContext *ctx, arg_rev8_32 *a)
 {
-    REQUIRE_EXT(ctx, RVB);
-    return gen_shift(ctx, a, EXT_NONE, gen_helper_grev);
-}
-
-static void gen_grevi(TCGv dest, TCGv src, target_long shamt)
-{
-    if (shamt == TARGET_LONG_BITS - 8) {
-        /* rev8, byte swaps */
-        tcg_gen_bswap_tl(dest, src);
-    } else {
-        gen_helper_grev(dest, src, tcg_constant_tl(shamt));
-    }
+    REQUIRE_32BIT(ctx);
+    REQUIRE_ZBB(ctx);
+    return gen_unary(ctx, a, EXT_NONE, tcg_gen_bswap_tl);
 }
 
-static bool trans_grevi(DisasContext *ctx, arg_grevi *a)
+static bool trans_rev8_64(DisasContext *ctx, arg_rev8_64 *a)
 {
-    REQUIRE_EXT(ctx, RVB);
-    return gen_shift_imm_fn(ctx, a, EXT_NONE, gen_grevi);
+    REQUIRE_64BIT(ctx);
+    REQUIRE_ZBB(ctx);
+    return gen_unary(ctx, a, EXT_NONE, tcg_gen_bswap_tl);
 }
 
 static void gen_orc_b(TCGv ret, TCGv source1)
@@ -471,22 +463,6 @@ static bool trans_rolw(DisasContext *ctx, arg_rolw *a)
     return gen_shift(ctx, a, EXT_NONE, gen_rolw);
 }
 
-static bool trans_grevw(DisasContext *ctx, arg_grevw *a)
-{
-    REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
-    ctx->w = true;
-    return gen_shift(ctx, a, EXT_ZERO, gen_helper_grev);
-}
-
-static bool trans_greviw(DisasContext *ctx, arg_greviw *a)
-{
-    REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
-    ctx->w = true;
-    return gen_shift_imm_tl(ctx, a, EXT_ZERO, gen_helper_grev);
-}
-
 #define GEN_SHADD_UW(SHAMT)                                       \
 static void gen_sh##SHAMT##add_uw(TCGv ret, TCGv arg1, TCGv arg2) \
 {                                                                 \
-- 
2.31.1




* [PULL 14/26] target/riscv: Add zext.h instructions to Zbb, removing pack/packu/packh
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (12 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 13/26] target/riscv: Add rev8 instruction, removing grev/grevi Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 15/26] target/riscv: Remove RVB (replaced by Zb[abcs]) Alistair Francis
                   ` (12 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Philipp Tomsich, Richard Henderson, Alistair Francis

From: Philipp Tomsich <philipp.tomsich@vrull.eu>

The 1.0.0 version of Zbb does not contain pack/packu/packh. However, a
zext.h instruction (built on pack/packh from pre-0.93 draft-B) is
available.

This commit adds zext.h and removes the pack* instructions.

Note that the encodings for zext.h are different between RV32 and
RV64, which is handled through REQUIRE_32BIT and REQUIRE_64BIT.
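
To make the encoding difference concrete, the following C sketch
matches the two bit patterns added to insn32.decode by this patch
(opcode 0110011 for RV32, 0111011 for RV64). The function name
is_zext_h and the mask/match constants are illustrative assumptions
derived from those patterns; they are not part of QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fixed bits: funct7 [31:25], rs2 [24:20], funct3 [14:12], opcode [6:0]. */
#define ZEXT_H_MASK     UINT32_C(0xFFF0707F)
/* 0000100 00000 ..... 100 ..... 0110011 (RV32 variant) */
#define ZEXT_H_MATCH_32 UINT32_C(0x08004033)
/* 0000100 00000 ..... 100 ..... 0111011 (RV64 variant) */
#define ZEXT_H_MATCH_64 UINT32_C(0x0800403B)

/* Return true if insn is the zext.h encoding valid for the given XLEN. */
static bool is_zext_h(uint32_t insn, int xlen)
{
    assert(xlen == 32 || xlen == 64);
    uint32_t match = (xlen == 32) ? ZEXT_H_MATCH_32 : ZEXT_H_MATCH_64;
    return (insn & ZEXT_H_MASK) == match;
}
```

For example, 0x0805C533 (rd = a0, rs1 = a1 in the RV32 pattern) is
accepted only for xlen 32, while 0x0805C53B is accepted only for
xlen 64.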

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210911140016.834071-15-philipp.tomsich@vrull.eu
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 target/riscv/insn32.decode              | 12 ++--
 target/riscv/insn_trans/trans_rvb.c.inc | 86 ++++---------------------
 2 files changed, 21 insertions(+), 77 deletions(-)

diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 901a66c0f5..affb99b3e6 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -692,6 +692,9 @@ rori       01100 ............ 101 ..... 0010011 @sh
 sext_b     011000 000100 ..... 001 ..... 0010011 @r2
 sext_h     011000 000101 ..... 001 ..... 0010011 @r2
 xnor       0100000 .......... 100 ..... 0110011 @r
+# The encoding for zext.h differs between RV32 and RV64.
+# zext_h_32 denotes the RV32 variant.
+zext_h_32  0000100 00000 ..... 100 ..... 0110011 @r2
 
 # *** RV64 Zbb Standard Extension (in addition to RV32 Zbb) ***
 clzw       0110000 00000 ..... 001 ..... 0011011 @r2
@@ -704,15 +707,14 @@ rev8_64    011010 111000 ..... 101 ..... 0010011 @r2
 rolw       0110000 .......... 001 ..... 0111011 @r
 roriw      0110000 .......... 101 ..... 0011011 @sh5
 rorw       0110000 .......... 101 ..... 0111011 @r
+# The encoding for zext.h differs between RV32 and RV64.
+# When executing on RV64, the encoding used in RV32 is an illegal
+# instruction, so we use different handler functions to differentiate.
+zext_h_64  0000100 00000 ..... 100 ..... 0111011 @r2
 
 # *** RV32B Standard Extension ***
-pack       0000100 .......... 100 ..... 0110011 @r
-packu      0100100 .......... 100 ..... 0110011 @r
-packh      0000100 .......... 111 ..... 0110011 @r
 
 # *** RV64B Standard Extension (in addition to RV32B) ***
-packw      0000100 .......... 100 ..... 0111011 @r
-packuw     0100100 .......... 100 ..... 0111011 @r
 
 # *** RV32 Zbc Standard Extension ***
 clmul      0000101 .......... 001 ..... 0110011 @r
diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
index 48a7c9ca5e..185c3e9a60 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -88,47 +88,6 @@ static bool trans_xnor(DisasContext *ctx, arg_xnor *a)
     return gen_arith(ctx, a, EXT_NONE, tcg_gen_eqv_tl);
 }
 
-static void gen_pack(TCGv ret, TCGv arg1, TCGv arg2)
-{
-    tcg_gen_deposit_tl(ret, arg1, arg2,
-                       TARGET_LONG_BITS / 2,
-                       TARGET_LONG_BITS / 2);
-}
-
-static bool trans_pack(DisasContext *ctx, arg_pack *a)
-{
-    REQUIRE_EXT(ctx, RVB);
-    return gen_arith(ctx, a, EXT_NONE, gen_pack);
-}
-
-static void gen_packu(TCGv ret, TCGv arg1, TCGv arg2)
-{
-    TCGv t = tcg_temp_new();
-    tcg_gen_shri_tl(t, arg1, TARGET_LONG_BITS / 2);
-    tcg_gen_deposit_tl(ret, arg2, t, 0, TARGET_LONG_BITS / 2);
-    tcg_temp_free(t);
-}
-
-static bool trans_packu(DisasContext *ctx, arg_packu *a)
-{
-    REQUIRE_EXT(ctx, RVB);
-    return gen_arith(ctx, a, EXT_NONE, gen_packu);
-}
-
-static void gen_packh(TCGv ret, TCGv arg1, TCGv arg2)
-{
-    TCGv t = tcg_temp_new();
-    tcg_gen_ext8u_tl(t, arg2);
-    tcg_gen_deposit_tl(ret, arg1, t, 8, TARGET_LONG_BITS - 8);
-    tcg_temp_free(t);
-}
-
-static bool trans_packh(DisasContext *ctx, arg_packh *a)
-{
-    REQUIRE_EXT(ctx, RVB);
-    return gen_arith(ctx, a, EXT_NONE, gen_packh);
-}
-
 static bool trans_min(DisasContext *ctx, arg_min *a)
 {
     REQUIRE_ZBB(ctx);
@@ -336,6 +295,20 @@ GEN_TRANS_SHADD(1)
 GEN_TRANS_SHADD(2)
 GEN_TRANS_SHADD(3)
 
+static bool trans_zext_h_32(DisasContext *ctx, arg_zext_h_32 *a)
+{
+    REQUIRE_32BIT(ctx);
+    REQUIRE_ZBB(ctx);
+    return gen_unary(ctx, a, EXT_NONE, tcg_gen_ext16u_tl);
+}
+
+static bool trans_zext_h_64(DisasContext *ctx, arg_zext_h_64 *a)
+{
+    REQUIRE_64BIT(ctx);
+    REQUIRE_ZBB(ctx);
+    return gen_unary(ctx, a, EXT_NONE, tcg_gen_ext16u_tl);
+}
+
 static void gen_clzw(TCGv ret, TCGv arg1)
 {
     TCGv t = tcg_temp_new();
@@ -372,37 +345,6 @@ static bool trans_cpopw(DisasContext *ctx, arg_cpopw *a)
     return gen_unary(ctx, a, EXT_ZERO, tcg_gen_ctpop_tl);
 }
 
-static void gen_packw(TCGv ret, TCGv arg1, TCGv arg2)
-{
-    TCGv t = tcg_temp_new();
-    tcg_gen_ext16s_tl(t, arg2);
-    tcg_gen_deposit_tl(ret, arg1, t, 16, 48);
-    tcg_temp_free(t);
-}
-
-static bool trans_packw(DisasContext *ctx, arg_packw *a)
-{
-    REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
-    return gen_arith(ctx, a, EXT_NONE, gen_packw);
-}
-
-static void gen_packuw(TCGv ret, TCGv arg1, TCGv arg2)
-{
-    TCGv t = tcg_temp_new();
-    tcg_gen_shri_tl(t, arg1, 16);
-    tcg_gen_deposit_tl(ret, arg2, t, 0, 16);
-    tcg_gen_ext32s_tl(ret, ret);
-    tcg_temp_free(t);
-}
-
-static bool trans_packuw(DisasContext *ctx, arg_packuw *a)
-{
-    REQUIRE_64BIT(ctx);
-    REQUIRE_EXT(ctx, RVB);
-    return gen_arith(ctx, a, EXT_NONE, gen_packuw);
-}
-
 static void gen_rorw(TCGv ret, TCGv arg1, TCGv arg2)
 {
     TCGv_i32 t1 = tcg_temp_new_i32();
-- 
2.31.1




* [PULL 15/26] target/riscv: Remove RVB (replaced by Zb[abcs])
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (13 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 14/26] target/riscv: Add zext.h instructions to Zbb, removing pack/packu/packh Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 16/26] disas/riscv: Add Zb[abcs] instructions Alistair Francis
                   ` (11 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Philipp Tomsich, Richard Henderson, Alistair Francis,
	Bin Meng

From: Philipp Tomsich <philipp.tomsich@vrull.eu>

With everything classified as Zb[abcs] and pre-0.93 draft-B
instructions that are not part of Zb[abcs] removed, we can remove the
remaining support code for RVB.

Note that RVB has been retired for good and misa.B will mean neither
'some' nor 'all of' Zb*:
  https://lists.riscv.org/g/tech-bitmanip/message/532

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-id: 20210911140016.834071-16-philipp.tomsich@vrull.eu
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 target/riscv/cpu.h         |  3 ---
 target/riscv/insn32.decode |  4 ----
 target/riscv/cpu.c         | 26 --------------------------
 3 files changed, 33 deletions(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 1a38723f2c..bd519c9090 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -67,7 +67,6 @@
 #define RVS RV('S')
 #define RVU RV('U')
 #define RVH RV('H')
-#define RVB RV('B')
 
 /* S extension denotes that Supervisor mode exists, however it is possible
    to have a core that support S mode but does not have an MMU and there
@@ -83,7 +82,6 @@ enum {
 #define PRIV_VERSION_1_10_0 0x00011000
 #define PRIV_VERSION_1_11_0 0x00011100
 
-#define BEXT_VERSION_0_93_0 0x00009300
 #define VEXT_VERSION_0_07_1 0x00000701
 
 enum {
@@ -288,7 +286,6 @@ struct RISCVCPU {
         bool ext_f;
         bool ext_d;
         bool ext_c;
-        bool ext_b;
         bool ext_s;
         bool ext_u;
         bool ext_h;
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index affb99b3e6..2f251dac1b 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -712,10 +712,6 @@ rorw       0110000 .......... 101 ..... 0111011 @r
 # instruction, so we use different handler functions to differentiate.
 zext_h_64  0000100 00000 ..... 100 ..... 0111011 @r2
 
-# *** RV32B Standard Extension ***
-
-# *** RV64B Standard Extension (in addition to RV32B) ***
-
 # *** RV32 Zbc Standard Extension ***
 clmul      0000101 .......... 001 ..... 0110011 @r
 clmulh     0000101 .......... 011 ..... 0110011 @r
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 785a3a8d19..1d69d1887e 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -127,11 +127,6 @@ static void set_priv_version(CPURISCVState *env, int priv_ver)
     env->priv_ver = priv_ver;
 }
 
-static void set_bext_version(CPURISCVState *env, int bext_ver)
-{
-    env->bext_ver = bext_ver;
-}
-
 static void set_vext_version(CPURISCVState *env, int vext_ver)
 {
     env->vext_ver = vext_ver;
@@ -496,25 +491,6 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
         if (cpu->cfg.ext_h) {
             target_misa |= RVH;
         }
-        if (cpu->cfg.ext_b) {
-            int bext_version = BEXT_VERSION_0_93_0;
-            target_misa |= RVB;
-
-            if (cpu->cfg.bext_spec) {
-                if (!g_strcmp0(cpu->cfg.bext_spec, "v0.93")) {
-                    bext_version = BEXT_VERSION_0_93_0;
-                } else {
-                    error_setg(errp,
-                           "Unsupported bitmanip spec version '%s'",
-                           cpu->cfg.bext_spec);
-                    return;
-                }
-            } else {
-                qemu_log("bitmanip version is not specified, "
-                         "use the default value v0.93\n");
-            }
-            set_bext_version(env, bext_version);
-        }
         if (cpu->cfg.ext_v) {
             int vext_version = VEXT_VERSION_0_07_1;
             target_misa |= RVV;
@@ -616,7 +592,6 @@ static Property riscv_cpu_properties[] = {
     DEFINE_PROP_BOOL("s", RISCVCPU, cfg.ext_s, true),
     DEFINE_PROP_BOOL("u", RISCVCPU, cfg.ext_u, true),
     /* This is experimental so mark with 'x-' */
-    DEFINE_PROP_BOOL("x-b", RISCVCPU, cfg.ext_b, false),
     DEFINE_PROP_BOOL("x-zba", RISCVCPU, cfg.ext_zba, false),
     DEFINE_PROP_BOOL("x-zbb", RISCVCPU, cfg.ext_zbb, false),
     DEFINE_PROP_BOOL("x-zbc", RISCVCPU, cfg.ext_zbc, false),
@@ -627,7 +602,6 @@ static Property riscv_cpu_properties[] = {
     DEFINE_PROP_BOOL("Zifencei", RISCVCPU, cfg.ext_ifencei, true),
     DEFINE_PROP_BOOL("Zicsr", RISCVCPU, cfg.ext_icsr, true),
     DEFINE_PROP_STRING("priv_spec", RISCVCPU, cfg.priv_spec),
-    DEFINE_PROP_STRING("bext_spec", RISCVCPU, cfg.bext_spec),
     DEFINE_PROP_STRING("vext_spec", RISCVCPU, cfg.vext_spec),
     DEFINE_PROP_UINT16("vlen", RISCVCPU, cfg.vlen, 128),
     DEFINE_PROP_UINT16("elen", RISCVCPU, cfg.elen, 64),
-- 
2.31.1




* [PULL 16/26] disas/riscv: Add Zb[abcs] instructions
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (14 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 15/26] target/riscv: Remove RVB (replaced by Zb[abcs]) Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 17/26] target/riscv: Set mstatus_hs.[SD|FS] bits if Clean and V=1 in mark_fs_dirty() Alistair Francis
                   ` (10 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell; +Cc: alistair23, Philipp Tomsich, Alistair Francis

From: Philipp Tomsich <philipp.tomsich@vrull.eu>

With the addition of Zb[abcs], we also need to add disassembler
support for these new instructions.
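
The numeric case labels this patch adds for the register-register
opcodes appear to be (funct7 << 3) | funct3. The key computation
itself is not visible in the hunks below, so the following C sketch is
an assumption inferred from those labels; op_key and encode_r are
hypothetical helpers, not functions from disas/riscv.c.

```c
#include <assert.h>
#include <stdint.h>

/* Build an R-type instruction word with only funct7/funct3/opcode set. */
static uint32_t encode_r(uint32_t funct7, uint32_t funct3)
{
    assert(funct7 < 128 && funct3 < 8);
    return (funct7 << 25) | (funct3 << 12) | 0x33;   /* opcode 0110011 */
}

/* Assumed switch key: funct7 in bits 9:3, funct3 in bits 2:0. */
static unsigned op_key(uint32_t inst)
{
    return ((inst >> 22) & 0x3F8) | ((inst >> 12) & 0x7);
}
```

With the encodings from insn32.decode, rol (funct7 0110000, funct3
001) gives 385, xnor (0100000, 100) gives 260, min (0000101, 100)
gives 44 and clmul (0000101, 001) gives 41, which agrees with the
labels used for those instructions in the patch.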

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210911140016.834071-17-philipp.tomsich@vrull.eu
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 disas/riscv.c | 157 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 154 insertions(+), 3 deletions(-)

diff --git a/disas/riscv.c b/disas/riscv.c
index 278d9be924..793ad14c27 100644
--- a/disas/riscv.c
+++ b/disas/riscv.c
@@ -478,6 +478,49 @@ typedef enum {
     rv_op_fsflags = 316,
     rv_op_fsrmi = 317,
     rv_op_fsflagsi = 318,
+    rv_op_bseti = 319,
+    rv_op_bclri = 320,
+    rv_op_binvi = 321,
+    rv_op_bexti = 322,
+    rv_op_rori = 323,
+    rv_op_clz = 324,
+    rv_op_ctz = 325,
+    rv_op_cpop = 326,
+    rv_op_sext_h = 327,
+    rv_op_sext_b = 328,
+    rv_op_xnor = 329,
+    rv_op_orn = 330,
+    rv_op_andn = 331,
+    rv_op_rol = 332,
+    rv_op_ror = 333,
+    rv_op_sh1add = 334,
+    rv_op_sh2add = 335,
+    rv_op_sh3add = 336,
+    rv_op_sh1add_uw = 337,
+    rv_op_sh2add_uw = 338,
+    rv_op_sh3add_uw = 339,
+    rv_op_clmul = 340,
+    rv_op_clmulr = 341,
+    rv_op_clmulh = 342,
+    rv_op_min = 343,
+    rv_op_minu = 344,
+    rv_op_max = 345,
+    rv_op_maxu = 346,
+    rv_op_clzw = 347,
+    rv_op_ctzw = 348,
+    rv_op_cpopw = 349,
+    rv_op_slli_uw = 350,
+    rv_op_add_uw = 351,
+    rv_op_rolw = 352,
+    rv_op_rorw = 353,
+    rv_op_rev8 = 354,
+    rv_op_zext_h = 355,
+    rv_op_roriw = 356,
+    rv_op_orc_b = 357,
+    rv_op_bset = 358,
+    rv_op_bclr = 359,
+    rv_op_binv = 360,
+    rv_op_bext = 361,
 } rv_op;
 
 /* structures */
@@ -1117,6 +1160,49 @@ const rv_opcode_data opcode_data[] = {
     { "fsflags", rv_codec_i_csr, rv_fmt_rd_rs1, NULL, 0, 0, 0 },
     { "fsrmi", rv_codec_i_csr, rv_fmt_rd_zimm, NULL, 0, 0, 0 },
     { "fsflagsi", rv_codec_i_csr, rv_fmt_rd_zimm, NULL, 0, 0, 0 },
+    { "bseti", rv_codec_i_sh7, rv_fmt_rd_rs1_imm, NULL, 0, 0, 0 },
+    { "bclri", rv_codec_i_sh7, rv_fmt_rd_rs1_imm, NULL, 0, 0, 0 },
+    { "binvi", rv_codec_i_sh7, rv_fmt_rd_rs1_imm, NULL, 0, 0, 0 },
+    { "bexti", rv_codec_i_sh7, rv_fmt_rd_rs1_imm, NULL, 0, 0, 0 },
+    { "rori", rv_codec_i_sh7, rv_fmt_rd_rs1_imm, NULL, 0, 0, 0 },
+    { "clz", rv_codec_r, rv_fmt_rd_rs1, NULL, 0, 0, 0 },
+    { "ctz", rv_codec_r, rv_fmt_rd_rs1, NULL, 0, 0, 0 },
+    { "cpop", rv_codec_r, rv_fmt_rd_rs1, NULL, 0, 0, 0 },
+    { "sext.h", rv_codec_r, rv_fmt_rd_rs1, NULL, 0, 0, 0 },
+    { "sext.b", rv_codec_r, rv_fmt_rd_rs1, NULL, 0, 0, 0 },
+    { "xnor", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "orn", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "andn", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "rol", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "ror", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "sh1add", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "sh2add", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "sh3add", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "sh1add.uw", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "sh2add.uw", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "sh3add.uw", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "clmul", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "clmulr", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "clmulh", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "min", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "minu", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "max", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "maxu", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "clzw", rv_codec_r, rv_fmt_rd_rs1, NULL, 0, 0, 0 },
+    { "ctzw", rv_codec_r, rv_fmt_rd_rs1, NULL, 0, 0, 0 },
+    { "cpopw", rv_codec_r, rv_fmt_rd_rs1, NULL, 0, 0, 0 },
+    { "slli.uw", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "add.uw", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "rolw", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "rorw", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "rev8", rv_codec_r, rv_fmt_rd_rs1, NULL, 0, 0, 0 },
+    { "zext.h", rv_codec_r, rv_fmt_rd_rs1, NULL, 0, 0, 0 },
+    { "roriw", rv_codec_i_sh5, rv_fmt_rd_rs1_imm, NULL, 0, 0, 0 },
+    { "orc.b", rv_codec_r, rv_fmt_rd_rs1, NULL, 0, 0, 0 },
+    { "bset", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "bclr", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "binv", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
+    { "bext", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
 };
 
 /* CSR names */
@@ -1507,7 +1593,20 @@ static void decode_inst_opcode(rv_decode *dec, rv_isa isa)
             case 0: op = rv_op_addi; break;
             case 1:
                 switch (((inst >> 27) & 0b11111)) {
-                case 0: op = rv_op_slli; break;
+                case 0b00000: op = rv_op_slli; break;
+                case 0b00101: op = rv_op_bseti; break;
+                case 0b01001: op = rv_op_bclri; break;
+                case 0b01101: op = rv_op_binvi; break;
+                case 0b01100:
+                    switch (((inst >> 20) & 0b1111111)) {
+                    case 0b0000000: op = rv_op_clz; break;
+                    case 0b0000001: op = rv_op_ctz; break;
+                    case 0b0000010: op = rv_op_cpop; break;
+                      /* 0b0000011 */
+                    case 0b0000100: op = rv_op_sext_b; break;
+                    case 0b0000101: op = rv_op_sext_h; break;
+                    }
+                    break;
                 }
                 break;
             case 2: op = rv_op_slti; break;
@@ -1515,8 +1614,16 @@ static void decode_inst_opcode(rv_decode *dec, rv_isa isa)
             case 4: op = rv_op_xori; break;
             case 5:
                 switch (((inst >> 27) & 0b11111)) {
-                case 0: op = rv_op_srli; break;
-                case 8: op = rv_op_srai; break;
+                case 0b00000: op = rv_op_srli; break;
+                case 0b00101: op = rv_op_orc_b; break;
+                case 0b01000: op = rv_op_srai; break;
+                case 0b01001: op = rv_op_bexti; break;
+                case 0b01100: op = rv_op_rori; break;
+                case 0b01101:
+                    switch ((inst >> 20) & 0b1111111) {
+                    case 0b0111000: op = rv_op_rev8; break;
+                    }
+                    break;
                 }
                 break;
             case 6: op = rv_op_ori; break;
@@ -1530,12 +1637,21 @@ static void decode_inst_opcode(rv_decode *dec, rv_isa isa)
             case 1:
                 switch (((inst >> 25) & 0b1111111)) {
                 case 0: op = rv_op_slliw; break;
+                case 4: op = rv_op_slli_uw; break;
+                case 48:
+                    switch ((inst >> 20) & 0b11111) {
+                    case 0b00000: op = rv_op_clzw; break;
+                    case 0b00001: op = rv_op_ctzw; break;
+                    case 0b00010: op = rv_op_cpopw; break;
+                    }
+                    break;
                 }
                 break;
             case 5:
                 switch (((inst >> 25) & 0b1111111)) {
                 case 0: op = rv_op_srliw; break;
                 case 32: op = rv_op_sraiw; break;
+                case 48: op = rv_op_roriw; break;
                 }
                 break;
             }
@@ -1623,8 +1739,32 @@ static void decode_inst_opcode(rv_decode *dec, rv_isa isa)
             case 13: op = rv_op_divu; break;
             case 14: op = rv_op_rem; break;
             case 15: op = rv_op_remu; break;
+            case 36:
+                switch ((inst >> 20) & 0b11111) {
+                case 0: op = rv_op_zext_h; break;
+                }
+                break;
+            case 41: op = rv_op_clmul; break;
+            case 42: op = rv_op_clmulr; break;
+            case 43: op = rv_op_clmulh; break;
+            case 44: op = rv_op_min; break;
+            case 45: op = rv_op_minu; break;
+            case 46: op = rv_op_max; break;
+            case 47: op = rv_op_maxu; break;
+            case 130: op = rv_op_sh1add; break;
+            case 132: op = rv_op_sh2add; break;
+            case 134: op = rv_op_sh3add; break;
+            case 161: op = rv_op_bset; break;
             case 256: op = rv_op_sub; break;
+            case 260: op = rv_op_xnor; break;
             case 261: op = rv_op_sra; break;
+            case 262: op = rv_op_orn; break;
+            case 263: op = rv_op_andn; break;
+            case 289: op = rv_op_bclr; break;
+            case 293: op = rv_op_bext; break;
+            case 385: op = rv_op_rol; break;
+            case 389: op = rv_op_ror; break;
+            case 417: op = rv_op_binv; break;
             }
             break;
         case 13: op = rv_op_lui; break;
@@ -1638,8 +1778,19 @@ static void decode_inst_opcode(rv_decode *dec, rv_isa isa)
             case 13: op = rv_op_divuw; break;
             case 14: op = rv_op_remw; break;
             case 15: op = rv_op_remuw; break;
+            case 32: op = rv_op_add_uw; break;
+            case 36:
+                switch ((inst >> 20) & 0b11111) {
+                case 0: op = rv_op_zext_h; break;
+                }
+                break;
+            case 130: op = rv_op_sh1add_uw; break;
+            case 132: op = rv_op_sh2add_uw; break;
+            case 134: op = rv_op_sh3add_uw; break;
             case 256: op = rv_op_subw; break;
             case 261: op = rv_op_sraw; break;
+            case 385: op = rv_op_rolw; break;
+            case 389: op = rv_op_rorw; break;
             }
             break;
         case 16:
-- 
2.31.1




* [PULL 17/26] target/riscv: Set mstatus_hs.[SD|FS] bits if Clean and V=1 in mark_fs_dirty()
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (15 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 16/26] disas/riscv: Add Zb[abcs] instructions Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 18/26] hw/char: ibex_uart: Register device in 'input' category Alistair Francis
                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Frank Chang, Vincent Chen, Richard Henderson,
	Alistair Francis

From: Frank Chang <frank.chang@sifive.com>

When V=1, both vsstatus.FS and HS-level sstatus.FS are in effect.
Modifying the floating-point state when V=1 causes both fields to
be set to 3 (Dirty).

However, it's possible that HS-level sstatus.FS is Clean and VS-level
vsstatus.FS is Dirty at the time mark_fs_dirty() is called when V=1.
We can't early return for this case because we still need to set
sstatus.FS to Dirty according to spec.
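
The case described above can be reproduced with a small stand-alone C
model. This is only a sketch under simplifying assumptions: it tracks
the two FS fields as plain integers at run time, ignores the SD bit,
and does not model the translation-time caching done in DisasContext.
The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum { FS_OFF = 0, FS_INITIAL = 1, FS_CLEAN = 2, FS_DIRTY = 3 };

struct fp_status_model {
    bool virt_enabled;   /* V=1 */
    unsigned fs;         /* FS of the active mstatus (vsstatus.FS if V=1) */
    unsigned hs_fs;      /* FS of mstatus_hs (HS-level sstatus.FS) */
};

/* Old behaviour: the early return skips hs_fs when fs is already Dirty. */
static void mark_fs_dirty_old(struct fp_status_model *s)
{
    assert(s->fs <= FS_DIRTY && s->hs_fs <= FS_DIRTY);
    if (s->fs == FS_DIRTY) {
        return;
    }
    s->fs = FS_DIRTY;
    if (s->virt_enabled) {
        s->hs_fs = FS_DIRTY;
    }
}

/* New behaviour: each field is checked and updated independently. */
static void mark_fs_dirty_new(struct fp_status_model *s)
{
    assert(s->fs <= FS_DIRTY && s->hs_fs <= FS_DIRTY);
    if (s->fs != FS_DIRTY) {
        s->fs = FS_DIRTY;
    }
    if (s->virt_enabled && s->hs_fs != FS_DIRTY) {
        s->hs_fs = FS_DIRTY;
    }
}
```

Starting from V=1, fs Dirty and hs_fs Clean, the old variant leaves
hs_fs Clean, whereas the new variant sets it to Dirty.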

Signed-off-by: Frank Chang <frank.chang@sifive.com>
Reviewed-by: Vincent Chen <vincent.chen@sifive.com>
Tested-by: Vincent Chen <vincent.chen@sifive.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210921020234.123448-1-frank.chang@sifive.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 target/riscv/cpu.h       |  4 ++++
 target/riscv/translate.c | 30 +++++++++++++++++-------------
 2 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index bd519c9090..9e55b2f5b1 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -393,6 +393,7 @@ FIELD(TB_FLAGS, SEW, 5, 3)
 FIELD(TB_FLAGS, VILL, 8, 1)
 /* Is a Hypervisor instruction load/store allowed? */
 FIELD(TB_FLAGS, HLSX, 9, 1)
+FIELD(TB_FLAGS, MSTATUS_HS_FS, 10, 2)
 
 bool riscv_cpu_is_32bit(CPURISCVState *env);
 
@@ -449,6 +450,9 @@ static inline void cpu_get_tb_cpu_state(CPURISCVState *env, target_ulong *pc,
                 get_field(env->hstatus, HSTATUS_HU))) {
             flags = FIELD_DP32(flags, TB_FLAGS, HLSX, 1);
         }
+
+        flags = FIELD_DP32(flags, TB_FLAGS, MSTATUS_HS_FS,
+                           get_field(env->mstatus_hs, MSTATUS_FS));
     }
 #endif
 
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index b2d3444bc5..d2442f0cf5 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -58,6 +58,7 @@ typedef struct DisasContext {
     target_ulong misa;
     uint32_t opcode;
     uint32_t mstatus_fs;
+    uint32_t mstatus_hs_fs;
     uint32_t mem_idx;
     /* Remember the rounding mode encoded in the previous fp instruction,
        which we have already installed into env->fp_status.  Or -1 for
@@ -280,27 +281,29 @@ static void gen_jal(DisasContext *ctx, int rd, target_ulong imm)
 static void mark_fs_dirty(DisasContext *ctx)
 {
     TCGv tmp;
-    target_ulong sd;
+    target_ulong sd = is_32bit(ctx) ? MSTATUS32_SD : MSTATUS64_SD;
 
-    if (ctx->mstatus_fs == MSTATUS_FS) {
-        return;
-    }
-    /* Remember the state change for the rest of the TB.  */
-    ctx->mstatus_fs = MSTATUS_FS;
+    if (ctx->mstatus_fs != MSTATUS_FS) {
+        /* Remember the state change for the rest of the TB. */
+        ctx->mstatus_fs = MSTATUS_FS;
 
-    tmp = tcg_temp_new();
-    sd = is_32bit(ctx) ? MSTATUS32_SD : MSTATUS64_SD;
+        tmp = tcg_temp_new();
+        tcg_gen_ld_tl(tmp, cpu_env, offsetof(CPURISCVState, mstatus));
+        tcg_gen_ori_tl(tmp, tmp, MSTATUS_FS | sd);
+        tcg_gen_st_tl(tmp, cpu_env, offsetof(CPURISCVState, mstatus));
+        tcg_temp_free(tmp);
+    }
 
-    tcg_gen_ld_tl(tmp, cpu_env, offsetof(CPURISCVState, mstatus));
-    tcg_gen_ori_tl(tmp, tmp, MSTATUS_FS | sd);
-    tcg_gen_st_tl(tmp, cpu_env, offsetof(CPURISCVState, mstatus));
+    if (ctx->virt_enabled && ctx->mstatus_hs_fs != MSTATUS_FS) {
+        /* Remember the state change for the rest of the TB. */
+        ctx->mstatus_hs_fs = MSTATUS_FS;
 
-    if (ctx->virt_enabled) {
+        tmp = tcg_temp_new();
         tcg_gen_ld_tl(tmp, cpu_env, offsetof(CPURISCVState, mstatus_hs));
         tcg_gen_ori_tl(tmp, tmp, MSTATUS_FS | sd);
         tcg_gen_st_tl(tmp, cpu_env, offsetof(CPURISCVState, mstatus_hs));
+        tcg_temp_free(tmp);
     }
-    tcg_temp_free(tmp);
 }
 #else
 static inline void mark_fs_dirty(DisasContext *ctx) { }
@@ -539,6 +542,7 @@ static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
     ctx->frm = -1;  /* unknown rounding mode */
     ctx->ext_ifencei = cpu->cfg.ext_ifencei;
     ctx->vlen = cpu->cfg.vlen;
+    ctx->mstatus_hs_fs = FIELD_EX32(tb_flags, TB_FLAGS, MSTATUS_HS_FS);
     ctx->hlsx = FIELD_EX32(tb_flags, TB_FLAGS, HLSX);
     ctx->vill = FIELD_EX32(tb_flags, TB_FLAGS, VILL);
     ctx->sew = FIELD_EX32(tb_flags, TB_FLAGS, SEW);
-- 
2.31.1




* [PULL 18/26] hw/char: ibex_uart: Register device in 'input' category
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (16 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 17/26] target/riscv: Set mstatus_hs.[SD|FS] bits if Clean and V=1 in mark_fs_dirty() Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 19/26] hw/char: shakti_uart: " Alistair Francis
                   ` (8 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Bin Meng, Philippe Mathieu-Daudé, Alistair Francis

From: Bin Meng <bmeng.cn@gmail.com>

The category of ibex_uart device is not set. Put it into the
'input' category.

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210926105003.2716-1-bmeng.cn@gmail.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 hw/char/ibex_uart.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/char/ibex_uart.c b/hw/char/ibex_uart.c
index 9b0a817713..e58181fcf4 100644
--- a/hw/char/ibex_uart.c
+++ b/hw/char/ibex_uart.c
@@ -550,6 +550,7 @@ static void ibex_uart_class_init(ObjectClass *klass, void *data)
     dc->realize = ibex_uart_realize;
     dc->vmsd = &vmstate_ibex_uart;
     device_class_set_props(dc, ibex_uart_properties);
+    set_bit(DEVICE_CATEGORY_INPUT, dc->categories);
 }
 
 static const TypeInfo ibex_uart_info = {
-- 
2.31.1




* [PULL 19/26] hw/char: shakti_uart: Register device in 'input' category
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (17 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 18/26] hw/char: ibex_uart: Register device in 'input' category Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 20/26] hw/char: sifive_uart: " Alistair Francis
                   ` (7 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Bin Meng, Philippe Mathieu-Daudé, Alistair Francis

From: Bin Meng <bmeng.cn@gmail.com>

The category of shakti_uart device is not set. Put it into the
'input' category.

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210926105003.2716-2-bmeng.cn@gmail.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 hw/char/shakti_uart.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/char/shakti_uart.c b/hw/char/shakti_uart.c
index 6870821325..98b142c7df 100644
--- a/hw/char/shakti_uart.c
+++ b/hw/char/shakti_uart.c
@@ -168,6 +168,7 @@ static void shakti_uart_class_init(ObjectClass *klass, void *data)
     dc->reset = shakti_uart_reset;
     dc->realize = shakti_uart_realize;
     device_class_set_props(dc, shakti_uart_properties);
+    set_bit(DEVICE_CATEGORY_INPUT, dc->categories);
 }
 
 static const TypeInfo shakti_uart_info = {
-- 
2.31.1




* [PULL 20/26] hw/char: sifive_uart: Register device in 'input' category
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (18 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 19/26] hw/char: shakti_uart: " Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 21/26] hw/char/mchp_pfsoc_mmuart: Simplify MCHP_PFSOC_MMUART_REG definition Alistair Francis
                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Bin Meng, Philippe Mathieu-Daudé, Alistair Francis

From: Bin Meng <bmeng.cn@gmail.com>

The category of sifive_uart device is not set. Put it into the
'input' category.

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210926105003.2716-3-bmeng.cn@gmail.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 hw/char/sifive_uart.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/char/sifive_uart.c b/hw/char/sifive_uart.c
index 278e21c434..1c75f792b3 100644
--- a/hw/char/sifive_uart.c
+++ b/hw/char/sifive_uart.c
@@ -248,6 +248,7 @@ static void sifive_uart_class_init(ObjectClass *oc, void *data)
     rc->phases.enter = sifive_uart_reset_enter;
     rc->phases.hold  = sifive_uart_reset_hold;
     device_class_set_props(dc, sifive_uart_properties);
+    set_bit(DEVICE_CATEGORY_INPUT, dc->categories);
 }
 
 static const TypeInfo sifive_uart_info = {
-- 
2.31.1




* [PULL 21/26] hw/char/mchp_pfsoc_mmuart: Simplify MCHP_PFSOC_MMUART_REG definition
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (19 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 20/26] hw/char: sifive_uart: " Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 22/26] hw/char/mchp_pfsoc_mmuart: Use a MemoryRegion container Alistair Francis
                   ` (5 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Philippe Mathieu-Daudé, Bin Meng, Alistair Francis

From: Philippe Mathieu-Daudé <f4bug@amsat.org>

The current MCHP_PFSOC_MMUART_REG_SIZE definition represents the
size occupied by all the registers. However, all registers are
32-bit wide, and the MemoryRegionOps handlers are restricted to
32-bit accesses:

  static const MemoryRegionOps mchp_pfsoc_mmuart_ops = {
      .read = mchp_pfsoc_mmuart_read,
      .write = mchp_pfsoc_mmuart_write,
      .impl = {
          .min_access_size = 4,
          .max_access_size = 4,
      },

Avoid being triskaidekaphobic: simplify by using the number of
registers.
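
As a rough illustration of the index arithmetic (a standalone sketch,
not the QEMU code itself): with 32-bit registers, a byte offset maps
to a register index by shifting right by 2, and the bounds check can
then compare against the register count. The names REG_COUNT and
reg_index() are hypothetical and exist only for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror MCHP_PFSOC_MMUART_REG_COUNT (13 registers). */
#define REG_COUNT 13

/*
 * Hypothetical helper: convert a byte offset into a 32-bit register
 * index. Returns false when the offset lies beyond the last register.
 */
static bool reg_index(uint64_t byte_offset, size_t *index)
{
    uint64_t idx = byte_offset >> 2;    /* 4 bytes per register */

    if (idx >= REG_COUNT) {
        return false;
    }
    *index = (size_t)idx;
    return true;
}
```

Under these assumptions, byte offset 48 is the last valid register
(index 12), while byte offset 52 (13 * 4, the old REG_SIZE value) is
rejected.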

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Bin Meng <bin.meng@windriver.com>
Tested-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210925133407.1259392-2-f4bug@amsat.org
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 include/hw/char/mchp_pfsoc_mmuart.h |  4 ++--
 hw/char/mchp_pfsoc_mmuart.c         | 14 ++++++++------
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/include/hw/char/mchp_pfsoc_mmuart.h b/include/hw/char/mchp_pfsoc_mmuart.h
index f61990215f..9c012e6c97 100644
--- a/include/hw/char/mchp_pfsoc_mmuart.h
+++ b/include/hw/char/mchp_pfsoc_mmuart.h
@@ -30,7 +30,7 @@
 
 #include "hw/char/serial.h"
 
-#define MCHP_PFSOC_MMUART_REG_SIZE  52
+#define MCHP_PFSOC_MMUART_REG_COUNT 13
 
 typedef struct MchpPfSoCMMUartState {
     MemoryRegion iomem;
@@ -39,7 +39,7 @@ typedef struct MchpPfSoCMMUartState {
 
     SerialMM *serial;
 
-    uint32_t reg[MCHP_PFSOC_MMUART_REG_SIZE / sizeof(uint32_t)];
+    uint32_t reg[MCHP_PFSOC_MMUART_REG_COUNT];
 } MchpPfSoCMMUartState;
 
 /**
diff --git a/hw/char/mchp_pfsoc_mmuart.c b/hw/char/mchp_pfsoc_mmuart.c
index 2facf85c2d..584e7fec17 100644
--- a/hw/char/mchp_pfsoc_mmuart.c
+++ b/hw/char/mchp_pfsoc_mmuart.c
@@ -29,13 +29,14 @@ static uint64_t mchp_pfsoc_mmuart_read(void *opaque, hwaddr addr, unsigned size)
 {
     MchpPfSoCMMUartState *s = opaque;
 
-    if (addr >= MCHP_PFSOC_MMUART_REG_SIZE) {
+    addr >>= 2;
+    if (addr >= MCHP_PFSOC_MMUART_REG_COUNT) {
         qemu_log_mask(LOG_GUEST_ERROR, "%s: read: addr=0x%" HWADDR_PRIx "\n",
-                      __func__, addr);
+                      __func__, addr << 2);
         return 0;
     }
 
-    return s->reg[addr / sizeof(uint32_t)];
+    return s->reg[addr];
 }
 
 static void mchp_pfsoc_mmuart_write(void *opaque, hwaddr addr,
@@ -44,13 +45,14 @@ static void mchp_pfsoc_mmuart_write(void *opaque, hwaddr addr,
     MchpPfSoCMMUartState *s = opaque;
     uint32_t val32 = (uint32_t)value;
 
-    if (addr >= MCHP_PFSOC_MMUART_REG_SIZE) {
+    addr >>= 2;
+    if (addr >= MCHP_PFSOC_MMUART_REG_COUNT) {
         qemu_log_mask(LOG_GUEST_ERROR, "%s: bad write: addr=0x%" HWADDR_PRIx
-                      " v=0x%x\n", __func__, addr, val32);
+                      " v=0x%x\n", __func__, addr << 2, val32);
         return;
     }
 
-    s->reg[addr / sizeof(uint32_t)] = val32;
+    s->reg[addr] = val32;
 }
 
 static const MemoryRegionOps mchp_pfsoc_mmuart_ops = {
-- 
2.31.1




* [PULL 22/26] hw/char/mchp_pfsoc_mmuart: Use a MemoryRegion container
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (20 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 21/26] hw/char/mchp_pfsoc_mmuart: Simplify MCHP_PFSOC_MMUART_REG definition Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 23/26] hw/char/mchp_pfsoc_mmuart: QOM'ify PolarFire MMUART Alistair Francis
                   ` (4 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Philippe Mathieu-Daudé, Bin Meng, Alistair Francis

From: Philippe Mathieu-Daudé <f4bug@amsat.org>

Our device has 2 different I/O regions:
- a 16550 UART mapped for 32-bit accesses
- 13 extra registers

Instead of mapping each region on the main bus, introduce
a container, map the 2 device regions on the container,
and map the container on the main bus.

Before:

  (qemu) info mtree
    ...
    0000000020100000-000000002010001f (prio 0, i/o): serial
    0000000020100020-000000002010101f (prio 0, i/o): mchp.pfsoc.mmuart
    0000000020102000-000000002010201f (prio 0, i/o): serial
    0000000020102020-000000002010301f (prio 0, i/o): mchp.pfsoc.mmuart
    0000000020104000-000000002010401f (prio 0, i/o): serial
    0000000020104020-000000002010501f (prio 0, i/o): mchp.pfsoc.mmuart
    0000000020106000-000000002010601f (prio 0, i/o): serial
    0000000020106020-000000002010701f (prio 0, i/o): mchp.pfsoc.mmuart

After:

  (qemu) info mtree
    ...
    0000000020100000-0000000020100fff (prio 0, i/o): mchp.pfsoc.mmuart
      0000000020100000-000000002010001f (prio 0, i/o): serial
      0000000020100020-0000000020100fff (prio 0, i/o): mchp.pfsoc.mmuart.regs
    0000000020102000-0000000020102fff (prio 0, i/o): mchp.pfsoc.mmuart
      0000000020102000-000000002010201f (prio 0, i/o): serial
      0000000020102020-0000000020102fff (prio 0, i/o): mchp.pfsoc.mmuart.regs
    0000000020104000-0000000020104fff (prio 0, i/o): mchp.pfsoc.mmuart
      0000000020104000-000000002010401f (prio 0, i/o): serial
      0000000020104020-0000000020104fff (prio 0, i/o): mchp.pfsoc.mmuart.regs
    0000000020106000-0000000020106fff (prio 0, i/o): mchp.pfsoc.mmuart
      0000000020106000-000000002010601f (prio 0, i/o): serial
      0000000020106020-0000000020106fff (prio 0, i/o): mchp.pfsoc.mmuart.regs
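
The layout shown in the "After" listing can be modelled with a small
standalone sketch. This is a toy model of the address split only; it
is not how QEMU's memory API dispatches accesses. The enum and the
mmuart_resolve() function are hypothetical names; the 0x1000 container
size and the 0x20 register offset are taken from the listing above.

```c
#include <stdint.h>

#define CONTAINER_SIZE 0x1000u  /* size of the mchp.pfsoc.mmuart container */
#define REGS_OFFSET    0x20u    /* where mchp.pfsoc.mmuart.regs starts */

/* Hypothetical labels for the two subregions of the container. */
enum mmuart_target { TARGET_NONE, TARGET_SERIAL, TARGET_REGS };

/*
 * Given the container base and an absolute address, report which
 * subregion the address falls in and the offset inside that subregion.
 */
static enum mmuart_target mmuart_resolve(uint64_t base, uint64_t addr,
                                         uint64_t *sub_offset)
{
    uint64_t off;

    if (addr < base || addr - base >= CONTAINER_SIZE) {
        return TARGET_NONE;             /* outside the container */
    }
    off = addr - base;
    if (off < REGS_OFFSET) {
        *sub_offset = off;              /* 16550 UART window */
        return TARGET_SERIAL;
    }
    *sub_offset = off - REGS_OFFSET;    /* extra registers */
    return TARGET_REGS;
}
```

With base 0x20100000, address 0x2010001f resolves to the serial window
and 0x20100020 to offset 0 of the extra registers, which matches the
ranges in the listing.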

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Bin Meng <bin.meng@windriver.com>
Message-id: 20210925133407.1259392-3-f4bug@amsat.org
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 include/hw/char/mchp_pfsoc_mmuart.h |  1 +
 hw/char/mchp_pfsoc_mmuart.c         | 11 ++++++++---
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/include/hw/char/mchp_pfsoc_mmuart.h b/include/hw/char/mchp_pfsoc_mmuart.h
index 9c012e6c97..864ac1a36b 100644
--- a/include/hw/char/mchp_pfsoc_mmuart.h
+++ b/include/hw/char/mchp_pfsoc_mmuart.h
@@ -33,6 +33,7 @@
 #define MCHP_PFSOC_MMUART_REG_COUNT 13
 
 typedef struct MchpPfSoCMMUartState {
+    MemoryRegion container;
     MemoryRegion iomem;
     hwaddr base;
     qemu_irq irq;
diff --git a/hw/char/mchp_pfsoc_mmuart.c b/hw/char/mchp_pfsoc_mmuart.c
index 584e7fec17..ea58655976 100644
--- a/hw/char/mchp_pfsoc_mmuart.c
+++ b/hw/char/mchp_pfsoc_mmuart.c
@@ -25,6 +25,8 @@
 #include "chardev/char.h"
 #include "hw/char/mchp_pfsoc_mmuart.h"
 
+#define REGS_OFFSET 0x20
+
 static uint64_t mchp_pfsoc_mmuart_read(void *opaque, hwaddr addr, unsigned size)
 {
     MchpPfSoCMMUartState *s = opaque;
@@ -72,16 +74,19 @@ MchpPfSoCMMUartState *mchp_pfsoc_mmuart_create(MemoryRegion *sysmem,
 
     s = g_new0(MchpPfSoCMMUartState, 1);
 
+    memory_region_init(&s->container, NULL, "mchp.pfsoc.mmuart", 0x1000);
+
     memory_region_init_io(&s->iomem, NULL, &mchp_pfsoc_mmuart_ops, s,
-                          "mchp.pfsoc.mmuart", 0x1000);
+                          "mchp.pfsoc.mmuart.regs", 0x1000 - REGS_OFFSET);
+    memory_region_add_subregion(&s->container, REGS_OFFSET, &s->iomem);
 
     s->base = base;
     s->irq = irq;
 
-    s->serial = serial_mm_init(sysmem, base, 2, irq, 399193, chr,
+    s->serial = serial_mm_init(&s->container, 0, 2, irq, 399193, chr,
                                DEVICE_LITTLE_ENDIAN);
 
-    memory_region_add_subregion(sysmem, base + 0x20, &s->iomem);
+    memory_region_add_subregion(sysmem, base, &s->container);
 
     return s;
 }
-- 
2.31.1




* [PULL 23/26] hw/char/mchp_pfsoc_mmuart: QOM'ify PolarFire MMUART
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (21 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 22/26] hw/char/mchp_pfsoc_mmuart: Use a MemoryRegion container Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 24/26] hw/dma: sifive_pdma: Fix Control.claim bit detection Alistair Francis
                   ` (3 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Philippe Mathieu-Daudé, Bin Meng, Alistair Francis

From: Philippe Mathieu-Daudé <f4bug@amsat.org>

- Embed SerialMM in MchpPfSoCMMUartState and QOM-initialize it
- Alias SERIAL_MM 'chardev' property on MCHP_PFSOC_UART
- Forward SerialMM sysbus IRQ in mchp_pfsoc_mmuart_realize()
- Add DeviceReset() method
- Add vmstate structure for migration
- Register device in 'input' category
- Keep mchp_pfsoc_mmuart_create() behavior

Note, serial_mm_init() calls qdev_set_legacy_instance_id().
This call is only needed for backwards-compatibility of incoming
migration data with old versions of QEMU which implemented migration
of devices with hand-rolled code. Since this device didn't previously
handle migration at all, it doesn't need to set the legacy
instance ID.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Bin Meng <bin.meng@windriver.com>
Tested-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210925133407.1259392-4-f4bug@amsat.org
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 include/hw/char/mchp_pfsoc_mmuart.h | 12 +++-
 hw/char/mchp_pfsoc_mmuart.c         | 97 +++++++++++++++++++++++++----
 2 files changed, 93 insertions(+), 16 deletions(-)

diff --git a/include/hw/char/mchp_pfsoc_mmuart.h b/include/hw/char/mchp_pfsoc_mmuart.h
index 864ac1a36b..b0e14ca355 100644
--- a/include/hw/char/mchp_pfsoc_mmuart.h
+++ b/include/hw/char/mchp_pfsoc_mmuart.h
@@ -28,17 +28,23 @@
 #ifndef HW_MCHP_PFSOC_MMUART_H
 #define HW_MCHP_PFSOC_MMUART_H
 
+#include "hw/sysbus.h"
 #include "hw/char/serial.h"
 
 #define MCHP_PFSOC_MMUART_REG_COUNT 13
 
+#define TYPE_MCHP_PFSOC_UART "mchp.pfsoc.uart"
+OBJECT_DECLARE_SIMPLE_TYPE(MchpPfSoCMMUartState, MCHP_PFSOC_UART)
+
 typedef struct MchpPfSoCMMUartState {
+    /*< private >*/
+    SysBusDevice parent_obj;
+
+    /*< public >*/
     MemoryRegion container;
     MemoryRegion iomem;
-    hwaddr base;
-    qemu_irq irq;
 
-    SerialMM *serial;
+    SerialMM serial_mm;
 
     uint32_t reg[MCHP_PFSOC_MMUART_REG_COUNT];
 } MchpPfSoCMMUartState;
diff --git a/hw/char/mchp_pfsoc_mmuart.c b/hw/char/mchp_pfsoc_mmuart.c
index ea58655976..22f3e78eb9 100644
--- a/hw/char/mchp_pfsoc_mmuart.c
+++ b/hw/char/mchp_pfsoc_mmuart.c
@@ -22,8 +22,10 @@
 
 #include "qemu/osdep.h"
 #include "qemu/log.h"
-#include "chardev/char.h"
+#include "qapi/error.h"
+#include "migration/vmstate.h"
 #include "hw/char/mchp_pfsoc_mmuart.h"
+#include "hw/qdev-properties.h"
 
 #define REGS_OFFSET 0x20
 
@@ -67,26 +69,95 @@ static const MemoryRegionOps mchp_pfsoc_mmuart_ops = {
     },
 };
 
-MchpPfSoCMMUartState *mchp_pfsoc_mmuart_create(MemoryRegion *sysmem,
-    hwaddr base, qemu_irq irq, Chardev *chr)
+static void mchp_pfsoc_mmuart_reset(DeviceState *dev)
+{
+    MchpPfSoCMMUartState *s = MCHP_PFSOC_UART(dev);
+
+    memset(s->reg, 0, sizeof(s->reg));
+    device_cold_reset(DEVICE(&s->serial_mm));
+}
+
+static void mchp_pfsoc_mmuart_init(Object *obj)
 {
-    MchpPfSoCMMUartState *s;
+    MchpPfSoCMMUartState *s = MCHP_PFSOC_UART(obj);
 
-    s = g_new0(MchpPfSoCMMUartState, 1);
+    object_initialize_child(obj, "serial-mm", &s->serial_mm, TYPE_SERIAL_MM);
+    object_property_add_alias(obj, "chardev", OBJECT(&s->serial_mm), "chardev");
+}
 
-    memory_region_init(&s->container, NULL, "mchp.pfsoc.mmuart", 0x1000);
+static void mchp_pfsoc_mmuart_realize(DeviceState *dev, Error **errp)
+{
+    MchpPfSoCMMUartState *s = MCHP_PFSOC_UART(dev);
 
-    memory_region_init_io(&s->iomem, NULL, &mchp_pfsoc_mmuart_ops, s,
+    qdev_prop_set_uint8(DEVICE(&s->serial_mm), "regshift", 2);
+    qdev_prop_set_uint32(DEVICE(&s->serial_mm), "baudbase", 399193);
+    qdev_prop_set_uint8(DEVICE(&s->serial_mm), "endianness",
+                        DEVICE_LITTLE_ENDIAN);
+    if (!sysbus_realize(SYS_BUS_DEVICE(&s->serial_mm), errp)) {
+        return;
+    }
+
+    sysbus_pass_irq(SYS_BUS_DEVICE(dev), SYS_BUS_DEVICE(&s->serial_mm));
+
+    memory_region_init(&s->container, OBJECT(s), "mchp.pfsoc.mmuart", 0x1000);
+    sysbus_init_mmio(SYS_BUS_DEVICE(dev), &s->container);
+
+    memory_region_add_subregion(&s->container, 0,
+                    sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->serial_mm), 0));
+
+    memory_region_init_io(&s->iomem, OBJECT(s), &mchp_pfsoc_mmuart_ops, s,
                           "mchp.pfsoc.mmuart.regs", 0x1000 - REGS_OFFSET);
     memory_region_add_subregion(&s->container, REGS_OFFSET, &s->iomem);
+}
 
-    s->base = base;
-    s->irq = irq;
+static const VMStateDescription mchp_pfsoc_mmuart_vmstate = {
+    .name = "mchp.pfsoc.uart",
+    .version_id = 0,
+    .minimum_version_id = 0,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32_ARRAY(reg, MchpPfSoCMMUartState,
+                             MCHP_PFSOC_MMUART_REG_COUNT),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static void mchp_pfsoc_mmuart_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+
+    dc->realize = mchp_pfsoc_mmuart_realize;
+    dc->reset = mchp_pfsoc_mmuart_reset;
+    dc->vmsd = &mchp_pfsoc_mmuart_vmstate;
+    set_bit(DEVICE_CATEGORY_INPUT, dc->categories);
+}
+
+static const TypeInfo mchp_pfsoc_mmuart_info = {
+    .name          = TYPE_MCHP_PFSOC_UART,
+    .parent        = TYPE_SYS_BUS_DEVICE,
+    .instance_size = sizeof(MchpPfSoCMMUartState),
+    .instance_init = mchp_pfsoc_mmuart_init,
+    .class_init    = mchp_pfsoc_mmuart_class_init,
+};
+
+static void mchp_pfsoc_mmuart_register_types(void)
+{
+    type_register_static(&mchp_pfsoc_mmuart_info);
+}
+
+type_init(mchp_pfsoc_mmuart_register_types)
+
+MchpPfSoCMMUartState *mchp_pfsoc_mmuart_create(MemoryRegion *sysmem,
+                                               hwaddr base,
+                                               qemu_irq irq, Chardev *chr)
+{
+    DeviceState *dev = qdev_new(TYPE_MCHP_PFSOC_UART);
+    SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
 
-    s->serial = serial_mm_init(&s->container, 0, 2, irq, 399193, chr,
-                               DEVICE_LITTLE_ENDIAN);
+    qdev_prop_set_chr(dev, "chardev", chr);
+    sysbus_realize(sbd, &error_fatal);
 
-    memory_region_add_subregion(sysmem, base, &s->container);
+    memory_region_add_subregion(sysmem, base, sysbus_mmio_get_region(sbd, 0));
+    sysbus_connect_irq(sbd, 0, irq);
 
-    return s;
+    return MCHP_PFSOC_UART(dev);
 }
-- 
2.31.1




* [PULL 24/26] hw/dma: sifive_pdma: Fix Control.claim bit detection
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (22 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 23/26] hw/char/mchp_pfsoc_mmuart: QOM'ify PolarFire MMUART Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 25/26] hw/dma: sifive_pdma: Don't run DMA when channel is disclaimed Alistair Francis
                   ` (2 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Bin Meng, Philippe Mathieu-Daudé, Alistair Francis

From: Bin Meng <bmeng.cn@gmail.com>

At present the code detects whether the DMA channel is claimed by:

  claimed = !!s->chan[ch].control & CONTROL_CLAIM;

As ! has higher precedence than & (bitwise and), this is essentially

  claimed = (!!s->chan[ch].control) & CONTROL_CLAIM;

which is wrong, as any bit set in the control register will
produce a result of a claimed channel.
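
The difference can be seen in a small standalone sketch. The function
names are hypothetical, and the bit values (claim = bit 0, run = bit 1)
are an assumption made for this example rather than something stated
in this message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define CONTROL_CLAIM 0x1u
#define CONTROL_RUN   0x2u

/*
 * What the unparenthesized expression parses as: the register is
 * first reduced to 0 or 1, then masked with CONTROL_CLAIM.
 */
static bool claimed_buggy(uint32_t control)
{
    return (!!control) & CONTROL_CLAIM;
}

/* The fixed form: mask first, then normalize to 0 or 1. */
static bool claimed_fixed(uint32_t control)
{
    return !!(control & CONTROL_CLAIM);
}
```

Under these assumptions, a register holding only the run bit (0x2) is
reported as claimed by the first form and as not claimed by the second.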

Fixes: de7c7988d25d ("hw/dma: sifive_pdma: reset Next* registers when Control.claim is set")
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210927072124.1564129-1-bmeng.cn@gmail.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 hw/dma/sifive_pdma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/dma/sifive_pdma.c b/hw/dma/sifive_pdma.c
index b4fd40573a..b8ec7621f3 100644
--- a/hw/dma/sifive_pdma.c
+++ b/hw/dma/sifive_pdma.c
@@ -243,7 +243,7 @@ static void sifive_pdma_write(void *opaque, hwaddr offset,
     offset &= 0xfff;
     switch (offset) {
     case DMA_CONTROL:
-        claimed = !!s->chan[ch].control & CONTROL_CLAIM;
+        claimed = !!(s->chan[ch].control & CONTROL_CLAIM);
 
         if (!claimed && (value & CONTROL_CLAIM)) {
             /* reset Next* registers */
-- 
2.31.1




* [PULL 25/26] hw/dma: sifive_pdma: Don't run DMA when channel is disclaimed
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (23 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 24/26] hw/dma: sifive_pdma: Fix Control.claim bit detection Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07  6:47 ` [PULL 26/26] hw/riscv: shakti_c: Mark as not user creatable Alistair Francis
  2021-10-07 17:25 ` [PULL 00/26] riscv-to-apply queue Richard Henderson
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell; +Cc: alistair23, Bin Meng, Alistair Francis

From: Bin Meng <bmeng.cn@gmail.com>

If the Control.run bit is set without preserving the Control.claim
bit, the DMA transfer shall not be started.

The following results were obtained by testing PDMA in U-Boot on the
Unleashed board:

=> mw.l 0x3000000 0x0                      <= Disclaim channel 0
=> mw.l 0x3000000 0x1                      <= Claim channel 0
=> mw.l 0x3000004 0x55000000               <= wsize = rsize = 5 (2^5 = 32 bytes)
=> mw.q 0x3000008 0x2                      <= NextBytes = 2
=> mw.q 0x3000010 0x84000000               <= NextDestination = 0x84000000
=> mw.q 0x3000018 0x84001000               <= NextSource = 0x84001000
=> mw.l 0x84000000 0x87654321              <= Fill test data to dst
=> mw.l 0x84001000 0x12345678              <= Fill test data to src
=> md.l 0x84000000 1; md.l 0x84001000 1    <= Dump src/dst memory contents
84000000: 87654321                               !Ce.
84001000: 12345678                               xV4.
=> md.l 0x3000000 8                        <= Dump PDMA status
03000000: 00000001 55000000 00000002 00000000    .......U........
03000010: 84000000 00000000 84001000 00000000    ................
=> mw.l 0x3000000 0x2                      <= Set channel 0 run bit only
=> md.l 0x3000000 8                        <= Dump PDMA status
03000000: 00000000 55000000 00000002 00000000    .......U........
03000010: 84000000 00000000 84001000 00000000    ................
=> md.l 0x84000000 1; md.l 0x84001000 1    <= Dump src/dst memory contents
84000000: 87654321                               !Ce.
84001000: 12345678                               xV4.
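
The control-register behaviour exercised above can be sketched as a
standalone function. This is a simplified model that follows the logic
of the patch below, not the device code itself: the function name and
the start_dma output are hypothetical, the reset of the Next*
registers is omitted, and the bit values (claim = 0x1, run = 0x2) are
inferred from the U-Boot commands above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values inferred from the U-Boot session (assumption). */
#define CONTROL_CLAIM 0x1u
#define CONTROL_RUN   0x2u

/*
 * Hypothetical model of a write to the Control register.
 * Returns the new register value; *start_dma says whether a
 * transfer would be started by this write.
 */
static uint32_t pdma_control_write(uint32_t control, uint32_t value,
                                   bool *start_dma)
{
    bool claimed = !!(control & CONTROL_CLAIM);
    bool run = !!(control & CONTROL_RUN);

    *start_dma = false;

    /* The claim bit can only be cleared when run is low. */
    if (run && !(value & CONTROL_CLAIM)) {
        value |= CONTROL_CLAIM;
    }

    control = value;

    /*
     * No transfer if the channel was not claimed beforehand, or if
     * this write disclaims the channel while run was low.
     */
    if (!claimed || (!run && !(value & CONTROL_CLAIM))) {
        return control & ~CONTROL_RUN;
    }

    *start_dma = (control & CONTROL_RUN) != 0;
    return control;
}
```

In this model, writing 0x2 to a claimed, idle channel (Control = 0x1)
leaves Control at 0x0 and starts no transfer, which agrees with the
register dump above.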

Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20210927072124.1564129-2-bmeng.cn@gmail.com
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 hw/dma/sifive_pdma.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/hw/dma/sifive_pdma.c b/hw/dma/sifive_pdma.c
index b8ec7621f3..85fe34f5f3 100644
--- a/hw/dma/sifive_pdma.c
+++ b/hw/dma/sifive_pdma.c
@@ -232,7 +232,7 @@ static void sifive_pdma_write(void *opaque, hwaddr offset,
 {
     SiFivePDMAState *s = opaque;
     int ch = SIFIVE_PDMA_CHAN_NO(offset);
-    bool claimed;
+    bool claimed, run;
 
     if (ch >= SIFIVE_PDMA_CHANS) {
         qemu_log_mask(LOG_GUEST_ERROR, "%s: Invalid channel no %d\n",
@@ -244,6 +244,7 @@ static void sifive_pdma_write(void *opaque, hwaddr offset,
     switch (offset) {
     case DMA_CONTROL:
         claimed = !!(s->chan[ch].control & CONTROL_CLAIM);
+        run = !!(s->chan[ch].control & CONTROL_RUN);
 
         if (!claimed && (value & CONTROL_CLAIM)) {
             /* reset Next* registers */
@@ -254,13 +255,19 @@ static void sifive_pdma_write(void *opaque, hwaddr offset,
             s->chan[ch].next_src = 0;
         }
 
+        /* claim bit can only be cleared when run is low */
+        if (run && !(value & CONTROL_CLAIM)) {
+            value |= CONTROL_CLAIM;
+        }
+
         s->chan[ch].control = value;
 
         /*
          * If channel was not claimed before run bit is set,
+         * or if the channel is disclaimed when run was low,
          * DMA won't run.
          */
-        if (!claimed) {
+        if (!claimed || (!run && !(value & CONTROL_CLAIM))) {
             s->chan[ch].control &= ~CONTROL_RUN;
             return;
         }
-- 
2.31.1




* [PULL 26/26] hw/riscv: shakti_c: Mark as not user creatable
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (24 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 25/26] hw/dma: sifive_pdma: Don't run DMA when channel is disclaimed Alistair Francis
@ 2021-10-07  6:47 ` Alistair Francis
  2021-10-07 17:25 ` [PULL 00/26] riscv-to-apply queue Richard Henderson
  26 siblings, 0 replies; 37+ messages in thread
From: Alistair Francis @ 2021-10-07  6:47 UTC (permalink / raw)
  To: qemu-devel, peter.maydell
  Cc: alistair23, Alistair Francis, Philippe Mathieu-Daudé, Bin Meng

From: Alistair Francis <alistair.francis@wdc.com>

Mark the shakti_c machine as not user creatable.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/639
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <c617a04d4e3dd041a3427b47a1b1d5ab475a2edd.1632871759.git.alistair.francis@wdc.com>
---
 hw/riscv/shakti_c.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/hw/riscv/shakti_c.c b/hw/riscv/shakti_c.c
index 2f084d3c8d..d7d1f91fa5 100644
--- a/hw/riscv/shakti_c.c
+++ b/hw/riscv/shakti_c.c
@@ -150,6 +150,13 @@ static void shakti_c_soc_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
     dc->realize = shakti_c_soc_state_realize;
+    /*
+     * Reasons:
+     *     - Creates CPUs in riscv_hart_realize(), and can create unintended
+     *       CPUs
+     *     - Uses serial_hds in realize function, thus can't be used twice
+     */
+    dc->user_creatable = false;
 }
 
 static void shakti_c_soc_instance_init(Object *obj)
-- 
2.31.1




* Re: [PULL 00/26] riscv-to-apply queue
  2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
                   ` (25 preceding siblings ...)
  2021-10-07  6:47 ` [PULL 26/26] hw/riscv: shakti_c: Mark as not user creatable Alistair Francis
@ 2021-10-07 17:25 ` Richard Henderson
  26 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2021-10-07 17:25 UTC (permalink / raw)
  To: Alistair Francis, qemu-devel, peter.maydell; +Cc: alistair23, Alistair Francis

On 10/6/21 11:47 PM, Alistair Francis wrote:
> From: Alistair Francis <alistair.francis@wdc.com>
> 
> The following changes since commit ca61fa4b803e5d0abaf6f1ceb690f23bb78a4def:
> 
>    Merge remote-tracking branch 'remotes/quic/tags/pull-hex-20211006' into staging (2021-10-06 12:11:14 -0700)
> 
> are available in the Git repository at:
> 
>    git@github.com:alistair23/qemu.git tags/pull-riscv-to-apply-20211007
> 
> for you to fetch changes up to 9ae6ecd848dcd1b32003526ab65a0d4c644dfb07:
> 
>    hw/riscv: shakti_c: Mark as not user creatable (2021-10-07 08:41:33 +1000)
> 
> ----------------------------------------------------------------
> Third RISC-V PR for QEMU 6.2
> 
>   - Add Zb[abcs] instruction support
>   - Remove RVB support
>   - Bug fix of setting mstatus_hs.[SD|FS] bits
>   - Mark some UART devices as 'input'
>   - QOMify PolarFire MMUART
>   - Fixes for sifive PDMA
>   - Mark shakti_c as not user creatable
> 
> ----------------------------------------------------------------
> Alistair Francis (1):
>        hw/riscv: shakti_c: Mark as not user creatable
> 
> Bin Meng (5):
>        hw/char: ibex_uart: Register device in 'input' category
>        hw/char: shakti_uart: Register device in 'input' category
>        hw/char: sifive_uart: Register device in 'input' category
>        hw/dma: sifive_pdma: Fix Control.claim bit detection
>        hw/dma: sifive_pdma: Don't run DMA when channel is disclaimed
> 
> Frank Chang (1):
>        target/riscv: Set mstatus_hs.[SD|FS] bits if Clean and V=1 in mark_fs_dirty()
> 
> Philipp Tomsich (16):
>        target/riscv: Introduce temporary in gen_add_uw()
>        target/riscv: fix clzw implementation to operate on arg1
>        target/riscv: clwz must ignore high bits (use shift-left & changed logic)
>        target/riscv: Add x-zba, x-zbb, x-zbc and x-zbs properties
>        target/riscv: Reassign instructions to the Zba-extension
>        target/riscv: Remove the W-form instructions from Zbs
>        target/riscv: Remove shift-one instructions (proposed Zbo in pre-0.93 draft-B)
>        target/riscv: Reassign instructions to the Zbs-extension
>        target/riscv: Add instructions of the Zbc-extension
>        target/riscv: Reassign instructions to the Zbb-extension
>        target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci
>        target/riscv: Add a REQUIRE_32BIT macro
>        target/riscv: Add rev8 instruction, removing grev/grevi
>        target/riscv: Add zext.h instructions to Zbb, removing pack/packu/packh
>        target/riscv: Remove RVB (replaced by Zb[abcs])
>        disas/riscv: Add Zb[abcs] instructions
> 
> Philippe Mathieu-Daudé (3):
>        hw/char/mchp_pfsoc_mmuart: Simplify MCHP_PFSOC_MMUART_REG definition
>        hw/char/mchp_pfsoc_mmuart: Use a MemoryRegion container
>        hw/char/mchp_pfsoc_mmuart: QOM'ify PolarFire MMUART
> 
>   include/hw/char/mchp_pfsoc_mmuart.h     |  17 +-
>   target/riscv/cpu.h                      |  11 +-
>   target/riscv/helper.h                   |   6 +-
>   target/riscv/insn32.decode              | 115 ++++-----
>   disas/riscv.c                           | 157 +++++++++++-
>   hw/char/ibex_uart.c                     |   1 +
>   hw/char/mchp_pfsoc_mmuart.c             | 116 +++++++--
>   hw/char/shakti_uart.c                   |   1 +
>   hw/char/sifive_uart.c                   |   1 +
>   hw/dma/sifive_pdma.c                    |  13 +-
>   hw/riscv/shakti_c.c                     |   7 +
>   target/riscv/bitmanip_helper.c          |  65 +----
>   target/riscv/cpu.c                      |  30 +--
>   target/riscv/translate.c                |  36 ++-
>   target/riscv/insn_trans/trans_rvb.c.inc | 419 ++++++++++----------------------
>   15 files changed, 516 insertions(+), 479 deletions(-)

Applied, thanks.

Remember to update the wiki for the user-facing changes.


r~



* Re: [PULL 11/26] target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci
  2021-10-07  6:47 ` [PULL 11/26] target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci Alistair Francis
@ 2021-10-13  9:36   ` Vincent Palatin
  2021-10-13  9:37     ` [PATCH v1A] target/riscv: fix orc.b instruction in the Zbb extension Vincent Palatin
                       ` (2 more replies)
  0 siblings, 3 replies; 37+ messages in thread
From: Vincent Palatin @ 2021-10-13  9:36 UTC (permalink / raw)
  To: Philipp Tomsich, qemu-devel
  Cc: Alistair Francis, Peter Maydell, Alistair Francis,
	Richard Henderson, alistair23

On Thu, Oct 7, 2021 at 8:58 AM Alistair Francis
<alistair.francis@opensource.wdc.com> wrote:
>
> From: Philipp Tomsich <philipp.tomsich@vrull.eu>
>
> The 1.0.0 version of Zbb does not contain gorc/gorci.  Instead, a
> orc.b instruction (equivalent to the orc.b pseudo-instruction built on
> gorci from pre-0.93 draft-B) is available, mainly targeting
> string-processing workloads.
>
> This commit adds the new orc.b instruction and removed gorc/gorci.
>
> Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
> Message-id: 20210911140016.834071-12-philipp.tomsich@vrull.eu
> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> ---
>  target/riscv/helper.h                   |  2 --
>  target/riscv/insn32.decode              |  6 +---
>  target/riscv/bitmanip_helper.c          | 26 -----------------
>  target/riscv/insn_trans/trans_rvb.c.inc | 39 +++++++++++--------------
>  4 files changed, 18 insertions(+), 55 deletions(-)
>
> diff --git a/target/riscv/helper.h b/target/riscv/helper.h
> index 8a318a2dbc..a9bda2c8ac 100644
> --- a/target/riscv/helper.h
> +++ b/target/riscv/helper.h
> @@ -61,8 +61,6 @@ DEF_HELPER_FLAGS_1(fclass_d, TCG_CALL_NO_RWG_SE, tl, i64)
>  /* Bitmanip */
>  DEF_HELPER_FLAGS_2(grev, TCG_CALL_NO_RWG_SE, tl, tl, tl)
>  DEF_HELPER_FLAGS_2(grevw, TCG_CALL_NO_RWG_SE, tl, tl, tl)
> -DEF_HELPER_FLAGS_2(gorc, TCG_CALL_NO_RWG_SE, tl, tl, tl)
> -DEF_HELPER_FLAGS_2(gorcw, TCG_CALL_NO_RWG_SE, tl, tl, tl)
>  DEF_HELPER_FLAGS_2(clmul, TCG_CALL_NO_RWG_SE, tl, tl, tl)
>  DEF_HELPER_FLAGS_2(clmulr, TCG_CALL_NO_RWG_SE, tl, tl, tl)
>
> diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
> index a509cfee11..59202196dc 100644
> --- a/target/riscv/insn32.decode
> +++ b/target/riscv/insn32.decode
> @@ -681,6 +681,7 @@ max        0000101 .......... 110 ..... 0110011 @r
>  maxu       0000101 .......... 111 ..... 0110011 @r
>  min        0000101 .......... 100 ..... 0110011 @r
>  minu       0000101 .......... 101 ..... 0110011 @r
> +orc_b      001010 000111 ..... 101 ..... 0010011 @r2
>  orn        0100000 .......... 110 ..... 0110011 @r
>  rol        0110000 .......... 001 ..... 0110011 @r
>  ror        0110000 .......... 101 ..... 0110011 @r
> @@ -702,19 +703,14 @@ pack       0000100 .......... 100 ..... 0110011 @r
>  packu      0100100 .......... 100 ..... 0110011 @r
>  packh      0000100 .......... 111 ..... 0110011 @r
>  grev       0110100 .......... 101 ..... 0110011 @r
> -gorc       0010100 .......... 101 ..... 0110011 @r
> -
>  grevi      01101. ........... 101 ..... 0010011 @sh
> -gorci      00101. ........... 101 ..... 0010011 @sh
>
>  # *** RV64B Standard Extension (in addition to RV32B) ***
>  packw      0000100 .......... 100 ..... 0111011 @r
>  packuw     0100100 .......... 100 ..... 0111011 @r
>  grevw      0110100 .......... 101 ..... 0111011 @r
> -gorcw      0010100 .......... 101 ..... 0111011 @r
>
>  greviw     0110100 .......... 101 ..... 0011011 @sh5
> -gorciw     0010100 .......... 101 ..... 0011011 @sh5
>
>  # *** RV32 Zbc Standard Extension ***
>  clmul      0000101 .......... 001 ..... 0110011 @r
> diff --git a/target/riscv/bitmanip_helper.c b/target/riscv/bitmanip_helper.c
> index 73be5a81c7..bb48388fcd 100644
> --- a/target/riscv/bitmanip_helper.c
> +++ b/target/riscv/bitmanip_helper.c
> @@ -64,32 +64,6 @@ target_ulong HELPER(grevw)(target_ulong rs1, target_ulong rs2)
>      return do_grev(rs1, rs2, 32);
>  }
>
> -static target_ulong do_gorc(target_ulong rs1,
> -                            target_ulong rs2,
> -                            int bits)
> -{
> -    target_ulong x = rs1;
> -    int i, shift;
> -
> -    for (i = 0, shift = 1; shift < bits; i++, shift <<= 1) {
> -        if (rs2 & shift) {
> -            x |= do_swap(x, adjacent_masks[i], shift);
> -        }
> -    }
> -
> -    return x;
> -}
> -
> -target_ulong HELPER(gorc)(target_ulong rs1, target_ulong rs2)
> -{
> -    return do_gorc(rs1, rs2, TARGET_LONG_BITS);
> -}
> -
> -target_ulong HELPER(gorcw)(target_ulong rs1, target_ulong rs2)
> -{
> -    return do_gorc(rs1, rs2, 32);
> -}
> -
>  target_ulong HELPER(clmul)(target_ulong rs1, target_ulong rs2)
>  {
>      target_ulong result = 0;
> diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
> index bdfb495f24..d32af5915a 100644
> --- a/target/riscv/insn_trans/trans_rvb.c.inc
> +++ b/target/riscv/insn_trans/trans_rvb.c.inc
> @@ -295,16 +295,27 @@ static bool trans_grevi(DisasContext *ctx, arg_grevi *a)
>      return gen_shift_imm_fn(ctx, a, EXT_NONE, gen_grevi);
>  }
>
> -static bool trans_gorc(DisasContext *ctx, arg_gorc *a)
> +static void gen_orc_b(TCGv ret, TCGv source1)
>  {
> -    REQUIRE_EXT(ctx, RVB);
> -    return gen_shift(ctx, a, EXT_ZERO, gen_helper_gorc);
> +    TCGv  tmp = tcg_temp_new();
> +    TCGv  ones = tcg_constant_tl(dup_const_tl(MO_8, 0x01));
> +
> +    /* Set lsb in each byte if the byte was zero. */
> +    tcg_gen_sub_tl(tmp, source1, ones);
> +    tcg_gen_andc_tl(tmp, tmp, source1);
> +    tcg_gen_shri_tl(tmp, tmp, 7);
> +    tcg_gen_andc_tl(tmp, ones, tmp);
> +
> +    /* Replicate the lsb of each byte across the byte. */
> +    tcg_gen_muli_tl(ret, tmp, 0xff);
> +
> +    tcg_temp_free(tmp);
>  }

It seems there is a bug in the current orc.b implementation: the
following 7 hexadecimal patterns return one wrong byte (0x00 instead
of 0xff):
orc.b(0x............01..) = 0x............00.. (instead of 0x............ff..)
orc.b(0x..........01....) = 0x..........00.... (instead of 0x..........ff....)
orc.b(0x........01......) = 0x........00...... (instead of 0x........ff......)
orc.b(0x......01........) = 0x......00........ (instead of 0x......ff........)
orc.b(0x....01..........) = 0x....00.......... (instead of 0x....ff..........)
orc.b(0x..01............) = 0x..00............ (instead of 0x..ff............)
orc.b(0x01..............) = 0x00.............. (instead of 0xff..............)
(see test cases below)

The issue seems to be related to the propagation of the carry.
I had a hard time fixing it. With some help, I have added a prolog
which basically computes:
((source1 | (source1 << 1)) & ~ones) in order to avoid the carry.
I will send the patch as a follow-up in this thread as 'Patch 1A'.
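
To make the failure easier to reproduce outside of QEMU, here is a minimal
plain-C sketch of the two sequences. It is only a model, assuming a 64-bit
target: the function names orc_b_current() and orc_b_patch_1a() and the ONES
macro are made up for illustration, and the statements simply mirror the
tcg_gen_ calls one by one rather than being the translator code itself.

```c
#include <stdint.h>

/* Hypothetical stand-in for dup_const_tl(MO_8, 0x01) on a 64-bit target. */
#define ONES 0x0101010101010101ULL

/* Model of the sequence currently emitted by gen_orc_b(). */
static uint64_t orc_b_current(uint64_t x)
{
    uint64_t tmp = (x - ONES) & ~x;  /* sub, andc: the borrow crosses bytes */
    tmp >>= 7;                       /* move each byte's bit 7 down to bit 0 */
    tmp = ONES & ~tmp;               /* lsb set where the byte looked non-zero */
    return tmp * 0xff;               /* replicate the lsb across the byte */
}

/* Same sequence, preceded by the prolog used in 'Patch 1A'. */
static uint64_t orc_b_patch_1a(uint64_t x)
{
    /* A non-zero byte never ends up as 0x01, so it cannot absorb a borrow. */
    uint64_t y = (x | (x << 1)) & ~ONES;
    uint64_t tmp = (y - ONES) & ~y;
    tmp >>= 7;
    tmp = ONES & ~tmp;
    return tmp * 0xff;
}
```

With x = 0x0100, the zero byte 0 borrows from byte 1, the 0x01 there wraps to
0xff, and bit 7 then wrongly flags byte 1 as zero. The prolog turns every
non-zero byte into a value >= 2, which survives the borrow.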

But it's notably less optimized than the current code, so feel free
to come up with better options.
Actually, my initial stab at fixing it was a more straightforward but
less astute 'divide and conquer' method, where bits are or'ed by pairs,
then the pairs are or'ed by pairs, and so on, using the following formula:
tmp = source1 | (source1 >> 1)
tmp = tmp | (tmp >> 2)
tmp = tmp | (tmp >> 4)
ret = tmp & 0x0101010101010101
ret = ret * 0xff
It's notably less optimized than the current code when converted to
tcg_gen_ primitives, but on par with the fixed version.
I'm also sending it in this thread as 'Patch 1B', as I feel it's slightly
easier to grasp.
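
For reference, the divide and conquer formula can be sketched in plain C as
below. This is only an illustration assuming a 64-bit register; the function
name orc_b_divide_and_conquer() is made up and does not exist in QEMU.

```c
#include <stdint.h>

/* Plain-C model of the 'divide and conquer' formula (hypothetical name). */
static uint64_t orc_b_divide_and_conquer(uint64_t x)
{
    uint64_t tmp = x | (x >> 1);   /* bit i = OR of input bits i..i+1 */
    tmp |= tmp >> 2;               /* bit i = OR of input bits i..i+3 */
    tmp |= tmp >> 4;               /* bit i = OR of input bits i..i+7 */
    tmp &= 0x0101010101010101ULL;  /* keep bit 0 of each byte only */
    return tmp * 0xff;             /* replicate the lsb across the byte */
}
```

The right shifts do pull bits of the next byte into the upper bits of each
byte, but bit 0 of a byte only ever collects bits 0..7 of that same byte, and
the final mask drops everything else.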


Test cases run on current implementation:
orc.b(0x0000000000000000) = 0x0000000000000000   OK (expect 0x0000000000000000)
orc.b(0x0000000000000001) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
orc.b(0x0000000000000002) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
orc.b(0x0000000000000004) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
orc.b(0x0000000000000008) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
orc.b(0x0000000000000010) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
orc.b(0x0000000000000020) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
orc.b(0x0000000000000040) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
orc.b(0x0000000000000080) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
orc.b(0x0000000000000100) = 0x0000000000000000 FAIL (expect 0x000000000000ff00)
orc.b(0x0000000000000200) = 0x000000000000ff00   OK (expect 0x000000000000ff00)
orc.b(0x0000000000000400) = 0x000000000000ff00   OK (expect 0x000000000000ff00)
orc.b(0x0000000000000800) = 0x000000000000ff00   OK (expect 0x000000000000ff00)
orc.b(0x0000000000001000) = 0x000000000000ff00   OK (expect 0x000000000000ff00)
orc.b(0x0000000000002000) = 0x000000000000ff00   OK (expect 0x000000000000ff00)
orc.b(0x0000000000004000) = 0x000000000000ff00   OK (expect 0x000000000000ff00)
orc.b(0x0000000000008000) = 0x000000000000ff00   OK (expect 0x000000000000ff00)
orc.b(0x0000000000010000) = 0x0000000000000000 FAIL (expect 0x0000000000ff0000)
orc.b(0x0000000000020000) = 0x0000000000ff0000   OK (expect 0x0000000000ff0000)
orc.b(0x0000000001000000) = 0x0000000000000000 FAIL (expect 0x00000000ff000000)
orc.b(0x0000000002000000) = 0x00000000ff000000   OK (expect 0x00000000ff000000)
orc.b(0x0000000100000000) = 0x0000000000000000 FAIL (expect 0x000000ff00000000)
orc.b(0x0000000200000000) = 0x000000ff00000000   OK (expect 0x000000ff00000000)
orc.b(0x0000010000000000) = 0x0000000000000000 FAIL (expect 0x0000ff0000000000)
orc.b(0x0000020000000000) = 0x0000ff0000000000   OK (expect 0x0000ff0000000000)
orc.b(0x0001000000000000) = 0x0000000000000000 FAIL (expect 0x00ff000000000000)
orc.b(0x0002000000000000) = 0x00ff000000000000   OK (expect 0x00ff000000000000)
orc.b(0x0100000000000000) = 0x0000000000000000 FAIL (expect 0xff00000000000000)
orc.b(0x0200000000000000) = 0xff00000000000000   OK (expect 0xff00000000000000)
orc.b(0x0400000000000000) = 0xff00000000000000   OK (expect 0xff00000000000000)
orc.b(0x0800000000000000) = 0xff00000000000000   OK (expect 0xff00000000000000)
orc.b(0x1000000000000000) = 0xff00000000000000   OK (expect 0xff00000000000000)
orc.b(0x2000000000000000) = 0xff00000000000000   OK (expect 0xff00000000000000)
orc.b(0x4000000000000000) = 0xff00000000000000   OK (expect 0xff00000000000000)
orc.b(0x8000000000000000) = 0xff00000000000000   OK (expect 0xff00000000000000)
orc.b(0xffffffffffffffff) = 0xffffffffffffffff   OK (expect 0xffffffffffffffff)
orc.b(0x00ff00ff00ff00ff) = 0x00ff00ff00ff00ff   OK (expect 0x00ff00ff00ff00ff)
orc.b(0xff00ff00ff00ff00) = 0xff00ff00ff00ff00   OK (expect 0xff00ff00ff00ff00)
orc.b(0x0001000100010001) = 0x00000000000000ff FAIL (expect 0x00ff00ff00ff00ff)
orc.b(0x0100010001000100) = 0x0000000000000000 FAIL (expect 0xff00ff00ff00ff00)
orc.b(0x8040201008040201) = 0xffffffffffffffff   OK (expect 0xffffffffffffffff)
orc.b(0x0804020180402010) = 0xffffffffffffffff   OK (expect 0xffffffffffffffff)
orc.b(0x010055aa00401100) = 0x0000ffff00ffff00 FAIL (expect 0xff00ffff00ffff00)
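
The 'expect' column above follows from the definition of orc.b: each byte of
the result is 0xff if any bit of the corresponding input byte is set, and
0x00 otherwise. A byte-by-byte reference like the sketch below can be used to
recompute it. The name orc_b_reference() is hypothetical, and the sketch
assumes a 64-bit register; it is not the harness that produced the table.

```c
#include <stdint.h>

/* Byte-by-byte reference for orc.b on a 64-bit value (hypothetical name). */
static uint64_t orc_b_reference(uint64_t x)
{
    uint64_t ret = 0;

    for (int shift = 0; shift < 64; shift += 8) {
        if ((x >> shift) & 0xff) {
            /* Any set bit in this byte turns the whole byte into 0xff. */
            ret |= 0xffULL << shift;
        }
    }
    return ret;
}
```

Comparing any candidate sequence against this loop over the patterns listed
above is enough to catch the 0x01 cases.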


>
> -static bool trans_gorci(DisasContext *ctx, arg_gorci *a)
> +static bool trans_orc_b(DisasContext *ctx, arg_orc_b *a)
>  {
> -    REQUIRE_EXT(ctx, RVB);
> -    return gen_shift_imm_tl(ctx, a, EXT_ZERO, gen_helper_gorc);
> +    REQUIRE_ZBB(ctx);
> +    return gen_unary(ctx, a, EXT_ZERO, gen_orc_b);
>  }
>
>  #define GEN_SHADD(SHAMT)                                       \
> @@ -476,22 +487,6 @@ static bool trans_greviw(DisasContext *ctx, arg_greviw *a)
>      return gen_shift_imm_tl(ctx, a, EXT_ZERO, gen_helper_grev);
>  }
>
> -static bool trans_gorcw(DisasContext *ctx, arg_gorcw *a)
> -{
> -    REQUIRE_64BIT(ctx);
> -    REQUIRE_EXT(ctx, RVB);
> -    ctx->w = true;
> -    return gen_shift(ctx, a, EXT_ZERO, gen_helper_gorc);
> -}
> -
> -static bool trans_gorciw(DisasContext *ctx, arg_gorciw *a)
> -{
> -    REQUIRE_64BIT(ctx);
> -    REQUIRE_EXT(ctx, RVB);
> -    ctx->w = true;
> -    return gen_shift_imm_tl(ctx, a, EXT_ZERO, gen_helper_gorc);
> -}
> -
>  #define GEN_SHADD_UW(SHAMT)                                       \
>  static void gen_sh##SHAMT##add_uw(TCGv ret, TCGv arg1, TCGv arg2) \
>  {                                                                 \
> --
> 2.31.1
>
>



* [PATCH v1A] target/riscv: fix orc.b instruction in the Zbb extension
  2021-10-13  9:36   ` Vincent Palatin
@ 2021-10-13  9:37     ` Vincent Palatin
  2021-10-13  9:38     ` [PATCH v1B] " Vincent Palatin
  2021-10-13 13:12     ` [PULL 11/26] target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci Philipp Tomsich
  2 siblings, 0 replies; 37+ messages in thread
From: Vincent Palatin @ 2021-10-13  9:37 UTC (permalink / raw)
  To: philipp.tomsich, qemu-devel
  Cc: alistair23, richard.henderson, alistair.francis, vpalatin, peter.maydell

The implementation was failing for the following 7 hexadecimal patterns
which return one wrong byte (0x00 instead of 0xff):
orc.b(0x............01..) = 0x............00.. (instead of 0x............ff..)
orc.b(0x..........01....) = 0x..........00.... (instead of 0x..........ff....)
orc.b(0x........01......) = 0x........00...... (instead of 0x........ff......)
orc.b(0x......01........) = 0x......00........ (instead of 0x......ff........)
orc.b(0x....01..........) = 0x....00.......... (instead of 0x....ff..........)
orc.b(0x..01............) = 0x..00............ (instead of 0x..ff............)
orc.b(0x01..............) = 0x00.............. (instead of 0xff..............)

Try to keep the carry from propagating and triggering the incorrect
results.

Signed-off-by: Vincent Palatin <vpalatin@rivosinc.com>
---
 target/riscv/insn_trans/trans_rvb.c.inc | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
index 185c3e9a60..b9fc272789 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -249,11 +249,17 @@ static bool trans_rev8_64(DisasContext *ctx, arg_rev8_64 *a)
 static void gen_orc_b(TCGv ret, TCGv source1)
 {
     TCGv  tmp = tcg_temp_new();
+    TCGv  tmp2 = tcg_temp_new();
     TCGv  ones = tcg_constant_tl(dup_const_tl(MO_8, 0x01));
 
+    /* avoid carry propagation */
+    tcg_gen_shli_tl(tmp, source1, 1);
+    tcg_gen_or_tl(tmp, source1, tmp);
+    tcg_gen_andc_tl(tmp2, tmp, ones);
+
     /* Set lsb in each byte if the byte was zero. */
-    tcg_gen_sub_tl(tmp, source1, ones);
-    tcg_gen_andc_tl(tmp, tmp, source1);
+    tcg_gen_sub_tl(tmp, tmp2, ones);
+    tcg_gen_andc_tl(tmp, tmp, tmp2);
     tcg_gen_shri_tl(tmp, tmp, 7);
     tcg_gen_andc_tl(tmp, ones, tmp);
 
@@ -261,6 +267,7 @@ static void gen_orc_b(TCGv ret, TCGv source1)
     tcg_gen_muli_tl(ret, tmp, 0xff);
 
     tcg_temp_free(tmp);
+    tcg_temp_free(tmp2);
 }
 
 static bool trans_orc_b(DisasContext *ctx, arg_orc_b *a)
-- 
2.25.1




* [PATCH v1B] target/riscv: fix orc.b instruction in the Zbb extension
  2021-10-13  9:36   ` Vincent Palatin
  2021-10-13  9:37     ` [PATCH v1A] target/riscv: fix orc.b instruction in the Zbb extension Vincent Palatin
@ 2021-10-13  9:38     ` Vincent Palatin
  2021-10-13 13:12     ` [PULL 11/26] target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci Philipp Tomsich
  2 siblings, 0 replies; 37+ messages in thread
From: Vincent Palatin @ 2021-10-13  9:38 UTC (permalink / raw)
  To: philipp.tomsich, qemu-devel
  Cc: alistair23, richard.henderson, alistair.francis, vpalatin, peter.maydell

The implementation was failing for the following 7 hexadecimal patterns
which return one wrong byte (0x00 instead of 0xff):
orc.b(0x............01..) = 0x............00.. (instead of 0x............ff..)
orc.b(0x..........01....) = 0x..........00.... (instead of 0x..........ff....)
orc.b(0x........01......) = 0x........00...... (instead of 0x........ff......)
orc.b(0x......01........) = 0x......00........ (instead of 0x......ff........)
orc.b(0x....01..........) = 0x....00.......... (instead of 0x....ff..........)
orc.b(0x..01............) = 0x..00............ (instead of 0x..ff............)
orc.b(0x01..............) = 0x00.............. (instead of 0xff..............)

Implement a simpler but less astute/optimized 'divide and conquer' method
where bits are or'ed by pairs, then the pairs are or'ed by pair ...

Signed-off-by: Vincent Palatin <vpalatin@rivosinc.com>
---
 target/riscv/insn_trans/trans_rvb.c.inc | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
index 185c3e9a60..04f795652d 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -249,18 +249,26 @@ static bool trans_rev8_64(DisasContext *ctx, arg_rev8_64 *a)
 static void gen_orc_b(TCGv ret, TCGv source1)
 {
     TCGv  tmp = tcg_temp_new();
+    TCGv  shifted = tcg_temp_new();
     TCGv  ones = tcg_constant_tl(dup_const_tl(MO_8, 0x01));
 
-    /* Set lsb in each byte if the byte was zero. */
-    tcg_gen_sub_tl(tmp, source1, ones);
-    tcg_gen_andc_tl(tmp, tmp, source1);
-    tcg_gen_shri_tl(tmp, tmp, 7);
-    tcg_gen_andc_tl(tmp, ones, tmp);
+    /*
+     * Divide and conquer: show one byte of the word in the comments,
+     * with U meaning Useful or'ed bit, X Junk content bit, . don't care.
+     */
+    tcg_gen_shri_tl(shifted, source1, 1);
+    tcg_gen_or_tl(tmp, source1, shifted); /* tmp[15:8] = XU.U.U.U */
+    tcg_gen_shri_tl(shifted, tmp, 2);
+    tcg_gen_or_tl(tmp, shifted, tmp);     /* tmp[15:8] = XXXU...U */
+    tcg_gen_shri_tl(shifted, tmp, 4);
+    tcg_gen_or_tl(tmp, shifted, tmp);     /* tmp[15:8] = XXXXXXXU */
+    tcg_gen_and_tl(tmp, ones, tmp);       /* tmp[15:8] = 0000000U */
 
     /* Replicate the lsb of each byte across the byte. */
     tcg_gen_muli_tl(ret, tmp, 0xff);
 
     tcg_temp_free(tmp);
+    tcg_temp_free(shifted);
 }
 
 static bool trans_orc_b(DisasContext *ctx, arg_orc_b *a)
-- 
2.25.1




* Re: [PULL 11/26] target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci
  2021-10-13  9:36   ` Vincent Palatin
  2021-10-13  9:37     ` [PATCH v1A] target/riscv: fix orc.b instruction in the Zbb extension Vincent Palatin
  2021-10-13  9:38     ` [PATCH v1B] " Vincent Palatin
@ 2021-10-13 13:12     ` Philipp Tomsich
  2021-10-13 13:44       ` Vincent Palatin
  2 siblings, 1 reply; 37+ messages in thread
From: Philipp Tomsich @ 2021-10-13 13:12 UTC (permalink / raw)
  To: Vincent Palatin
  Cc: Peter Maydell, Richard Henderson,
	qemu-devel@nongnu.org Developers, Alistair Francis,
	Alistair Francis, Alistair Francis

I had a much simpler version initially (using 3 x mask/shift/or, for
12 instructions after setup of constants), but took up the suggestion
to optimize based on haszero(v)...
Indeed, this appears not to do what we expect when only 0x01 is set
in a byte.

The less optimized form, with a single constant, that would still do
what we want is:
   /* set high-bit for non-zero bytes */
   constant = dup_const_tl(MO_8, 0x7f);
   tmp = v & constant;   // AND
   tmp += constant;      // ADD
   tmp |= v;             // OR
   /* extract high-bit to low-bit, for each byte */
   tmp &= ~constant;     // ANDC
   tmp >>= 7;            // SHR
   /* multiply with 0xff to populate entire byte where the low-bit is set */
   tmp *= 0xff;          // MUL

I'll submit a patch with this one later today, once I've had a chance to
run it through a full test.
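
As a quick sanity check ahead of that, the steps above can be written as
plain C along these lines. This is only a sketch assuming a 64-bit register:
the function name orc_b_high_bit() is invented here, and the literal
0x7f7f7f7f7f7f7f7f stands in for dup_const_tl(MO_8, 0x7f).

```c
#include <stdint.h>

/* Plain-C model of the single-constant sequence (hypothetical name). */
static uint64_t orc_b_high_bit(uint64_t v)
{
    const uint64_t constant = 0x7f7f7f7f7f7f7f7fULL;
    uint64_t tmp;

    /* Set the high bit of each non-zero byte. */
    tmp = v & constant;   /* low 7 bits of each byte */
    tmp += constant;      /* at most 0xfe per byte, so no carry out */
    tmp |= v;             /* also catch bytes that only have bit 7 set */

    /* Move the high bit of each byte down to its low bit. */
    tmp &= ~constant;
    tmp >>= 7;

    /* Expand each 0x01 into 0xff. */
    return tmp * 0xff;
}
```

Because each byte is masked to 7 bits before the addition, the sum never
leaves its byte, which is what the haszero(v)-based variant was missing.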

Thanks,
Philipp.


On Wed, 13 Oct 2021 at 11:36, Vincent Palatin <vpalatin@rivosinc.com> wrote:
>
> On Thu, Oct 7, 2021 at 8:58 AM Alistair Francis
> <alistair.francis@opensource.wdc.com> wrote:
> >
> > From: Philipp Tomsich <philipp.tomsich@vrull.eu>
> >
> > The 1.0.0 version of Zbb does not contain gorc/gorci.  Instead, a
> > orc.b instruction (equivalent to the orc.b pseudo-instruction built on
> > gorci from pre-0.93 draft-B) is available, mainly targeting
> > string-processing workloads.
> >
> > This commit adds the new orc.b instruction and removed gorc/gorci.
> >
> > Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> > Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> > Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
> > Message-id: 20210911140016.834071-12-philipp.tomsich@vrull.eu
> > Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> > ---
> >  target/riscv/helper.h                   |  2 --
> >  target/riscv/insn32.decode              |  6 +---
> >  target/riscv/bitmanip_helper.c          | 26 -----------------
> >  target/riscv/insn_trans/trans_rvb.c.inc | 39 +++++++++++--------------
> >  4 files changed, 18 insertions(+), 55 deletions(-)
> >
> > diff --git a/target/riscv/helper.h b/target/riscv/helper.h
> > index 8a318a2dbc..a9bda2c8ac 100644
> > --- a/target/riscv/helper.h
> > +++ b/target/riscv/helper.h
> > @@ -61,8 +61,6 @@ DEF_HELPER_FLAGS_1(fclass_d, TCG_CALL_NO_RWG_SE, tl, i64)
> >  /* Bitmanip */
> >  DEF_HELPER_FLAGS_2(grev, TCG_CALL_NO_RWG_SE, tl, tl, tl)
> >  DEF_HELPER_FLAGS_2(grevw, TCG_CALL_NO_RWG_SE, tl, tl, tl)
> > -DEF_HELPER_FLAGS_2(gorc, TCG_CALL_NO_RWG_SE, tl, tl, tl)
> > -DEF_HELPER_FLAGS_2(gorcw, TCG_CALL_NO_RWG_SE, tl, tl, tl)
> >  DEF_HELPER_FLAGS_2(clmul, TCG_CALL_NO_RWG_SE, tl, tl, tl)
> >  DEF_HELPER_FLAGS_2(clmulr, TCG_CALL_NO_RWG_SE, tl, tl, tl)
> >
> > diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
> > index a509cfee11..59202196dc 100644
> > --- a/target/riscv/insn32.decode
> > +++ b/target/riscv/insn32.decode
> > @@ -681,6 +681,7 @@ max        0000101 .......... 110 ..... 0110011 @r
> >  maxu       0000101 .......... 111 ..... 0110011 @r
> >  min        0000101 .......... 100 ..... 0110011 @r
> >  minu       0000101 .......... 101 ..... 0110011 @r
> > +orc_b      001010 000111 ..... 101 ..... 0010011 @r2
> >  orn        0100000 .......... 110 ..... 0110011 @r
> >  rol        0110000 .......... 001 ..... 0110011 @r
> >  ror        0110000 .......... 101 ..... 0110011 @r
> > @@ -702,19 +703,14 @@ pack       0000100 .......... 100 ..... 0110011 @r
> >  packu      0100100 .......... 100 ..... 0110011 @r
> >  packh      0000100 .......... 111 ..... 0110011 @r
> >  grev       0110100 .......... 101 ..... 0110011 @r
> > -gorc       0010100 .......... 101 ..... 0110011 @r
> > -
> >  grevi      01101. ........... 101 ..... 0010011 @sh
> > -gorci      00101. ........... 101 ..... 0010011 @sh
> >
> >  # *** RV64B Standard Extension (in addition to RV32B) ***
> >  packw      0000100 .......... 100 ..... 0111011 @r
> >  packuw     0100100 .......... 100 ..... 0111011 @r
> >  grevw      0110100 .......... 101 ..... 0111011 @r
> > -gorcw      0010100 .......... 101 ..... 0111011 @r
> >
> >  greviw     0110100 .......... 101 ..... 0011011 @sh5
> > -gorciw     0010100 .......... 101 ..... 0011011 @sh5
> >
> >  # *** RV32 Zbc Standard Extension ***
> >  clmul      0000101 .......... 001 ..... 0110011 @r
> > diff --git a/target/riscv/bitmanip_helper.c b/target/riscv/bitmanip_helper.c
> > index 73be5a81c7..bb48388fcd 100644
> > --- a/target/riscv/bitmanip_helper.c
> > +++ b/target/riscv/bitmanip_helper.c
> > @@ -64,32 +64,6 @@ target_ulong HELPER(grevw)(target_ulong rs1, target_ulong rs2)
> >      return do_grev(rs1, rs2, 32);
> >  }
> >
> > -static target_ulong do_gorc(target_ulong rs1,
> > -                            target_ulong rs2,
> > -                            int bits)
> > -{
> > -    target_ulong x = rs1;
> > -    int i, shift;
> > -
> > -    for (i = 0, shift = 1; shift < bits; i++, shift <<= 1) {
> > -        if (rs2 & shift) {
> > -            x |= do_swap(x, adjacent_masks[i], shift);
> > -        }
> > -    }
> > -
> > -    return x;
> > -}
> > -
> > -target_ulong HELPER(gorc)(target_ulong rs1, target_ulong rs2)
> > -{
> > -    return do_gorc(rs1, rs2, TARGET_LONG_BITS);
> > -}
> > -
> > -target_ulong HELPER(gorcw)(target_ulong rs1, target_ulong rs2)
> > -{
> > -    return do_gorc(rs1, rs2, 32);
> > -}
> > -
> >  target_ulong HELPER(clmul)(target_ulong rs1, target_ulong rs2)
> >  {
> >      target_ulong result = 0;
> > diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
> > index bdfb495f24..d32af5915a 100644
> > --- a/target/riscv/insn_trans/trans_rvb.c.inc
> > +++ b/target/riscv/insn_trans/trans_rvb.c.inc
> > @@ -295,16 +295,27 @@ static bool trans_grevi(DisasContext *ctx, arg_grevi *a)
> >      return gen_shift_imm_fn(ctx, a, EXT_NONE, gen_grevi);
> >  }
> >
> > -static bool trans_gorc(DisasContext *ctx, arg_gorc *a)
> > +static void gen_orc_b(TCGv ret, TCGv source1)
> >  {
> > -    REQUIRE_EXT(ctx, RVB);
> > -    return gen_shift(ctx, a, EXT_ZERO, gen_helper_gorc);
> > +    TCGv  tmp = tcg_temp_new();
> > +    TCGv  ones = tcg_constant_tl(dup_const_tl(MO_8, 0x01));
> > +
> > +    /* Set lsb in each byte if the byte was zero. */
> > +    tcg_gen_sub_tl(tmp, source1, ones);
> > +    tcg_gen_andc_tl(tmp, tmp, source1);
> > +    tcg_gen_shri_tl(tmp, tmp, 7);
> > +    tcg_gen_andc_tl(tmp, ones, tmp);
> > +
> > +    /* Replicate the lsb of each byte across the byte. */
> > +    tcg_gen_muli_tl(ret, tmp, 0xff);
> > +
> > +    tcg_temp_free(tmp);
> >  }
>
> It seems there is a bug in the current orc.b implementation: the
> following 7 hexadecimal patterns return one wrong byte (0x00 instead
> of 0xff):
> orc.b(0x............01..) = 0x............00.. (instead of 0x............ff..)
> orc.b(0x..........01....) = 0x..........00.... (instead of 0x..........ff....)
> orc.b(0x........01......) = 0x........00...... (instead of 0x........ff......)
> orc.b(0x......01........) = 0x......00........ (instead of 0x......ff........)
> orc.b(0x....01..........) = 0x....00.......... (instead of 0x....ff..........)
> orc.b(0x..01............) = 0x..00............ (instead of 0x..ff............)
> orc.b(0x01..............) = 0x00.............. (instead of 0xff..............)
> (see test cases below)
>
> The issue seems to be related to the propagation of the carry.
> I had a hard time fixing it. With some help, I have added a prolog
> which basically computes:
> ((source1 | (source1 << 1)) & ~ones) in order to avoid the carry.
> I will send the patch as a follow-up in this thread as 'Patch 1A'.
>
> But it's notably less optimized than the current code, so feel free
> to come up with better options.
> Actually, my initial stab at fixing it was a more straightforward but
> less astute 'divide and conquer' method, where bits are or'ed by pairs,
> then the pairs are or'ed by pairs, and so on, using the following formula:
> tmp = source1 | (source1 >> 1)
> tmp = tmp | (tmp >> 2)
> tmp = tmp | (tmp >> 4)
> ret = tmp & 0x0101010101010101
> ret = ret * 0xff
> It's notably less optimized than the current code when converted to
> tcg_gen_ primitives, but on par with the fixed version.
> I'm also sending it in this thread as 'Patch 1B', as I feel it's slightly
> easier to grasp.
>
>
> Test cases run on current implementation:
> orc.b(0x0000000000000000) = 0x0000000000000000   OK (expect 0x0000000000000000)
> orc.b(0x0000000000000001) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
> orc.b(0x0000000000000002) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
> orc.b(0x0000000000000004) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
> orc.b(0x0000000000000008) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
> orc.b(0x0000000000000010) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
> orc.b(0x0000000000000020) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
> orc.b(0x0000000000000040) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
> orc.b(0x0000000000000080) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
> orc.b(0x0000000000000100) = 0x0000000000000000 FAIL (expect 0x000000000000ff00)
> orc.b(0x0000000000000200) = 0x000000000000ff00   OK (expect 0x000000000000ff00)
> orc.b(0x0000000000000400) = 0x000000000000ff00   OK (expect 0x000000000000ff00)
> orc.b(0x0000000000000800) = 0x000000000000ff00   OK (expect 0x000000000000ff00)
> orc.b(0x0000000000001000) = 0x000000000000ff00   OK (expect 0x000000000000ff00)
> orc.b(0x0000000000002000) = 0x000000000000ff00   OK (expect 0x000000000000ff00)
> orc.b(0x0000000000004000) = 0x000000000000ff00   OK (expect 0x000000000000ff00)
> orc.b(0x0000000000008000) = 0x000000000000ff00   OK (expect 0x000000000000ff00)
> orc.b(0x0000000000010000) = 0x0000000000000000 FAIL (expect 0x0000000000ff0000)
> orc.b(0x0000000000020000) = 0x0000000000ff0000   OK (expect 0x0000000000ff0000)
> orc.b(0x0000000001000000) = 0x0000000000000000 FAIL (expect 0x00000000ff000000)
> orc.b(0x0000000002000000) = 0x00000000ff000000   OK (expect 0x00000000ff000000)
> orc.b(0x0000000100000000) = 0x0000000000000000 FAIL (expect 0x000000ff00000000)
> orc.b(0x0000000200000000) = 0x000000ff00000000   OK (expect 0x000000ff00000000)
> orc.b(0x0000010000000000) = 0x0000000000000000 FAIL (expect 0x0000ff0000000000)
> orc.b(0x0000020000000000) = 0x0000ff0000000000   OK (expect 0x0000ff0000000000)
> orc.b(0x0001000000000000) = 0x0000000000000000 FAIL (expect 0x00ff000000000000)
> orc.b(0x0002000000000000) = 0x00ff000000000000   OK (expect 0x00ff000000000000)
> orc.b(0x0100000000000000) = 0x0000000000000000 FAIL (expect 0xff00000000000000)
> orc.b(0x0200000000000000) = 0xff00000000000000   OK (expect 0xff00000000000000)
> orc.b(0x0400000000000000) = 0xff00000000000000   OK (expect 0xff00000000000000)
> orc.b(0x0800000000000000) = 0xff00000000000000   OK (expect 0xff00000000000000)
> orc.b(0x1000000000000000) = 0xff00000000000000   OK (expect 0xff00000000000000)
> orc.b(0x2000000000000000) = 0xff00000000000000   OK (expect 0xff00000000000000)
> orc.b(0x4000000000000000) = 0xff00000000000000   OK (expect 0xff00000000000000)
> orc.b(0x8000000000000000) = 0xff00000000000000   OK (expect 0xff00000000000000)
> orc.b(0xffffffffffffffff) = 0xffffffffffffffff   OK (expect 0xffffffffffffffff)
> orc.b(0x00ff00ff00ff00ff) = 0x00ff00ff00ff00ff   OK (expect 0x00ff00ff00ff00ff)
> orc.b(0xff00ff00ff00ff00) = 0xff00ff00ff00ff00   OK (expect 0xff00ff00ff00ff00)
> orc.b(0x0001000100010001) = 0x00000000000000ff FAIL (expect 0x00ff00ff00ff00ff)
> orc.b(0x0100010001000100) = 0x0000000000000000 FAIL (expect 0xff00ff00ff00ff00)
> orc.b(0x8040201008040201) = 0xffffffffffffffff   OK (expect 0xffffffffffffffff)
> orc.b(0x0804020180402010) = 0xffffffffffffffff   OK (expect 0xffffffffffffffff)
> orc.b(0x010055aa00401100) = 0x0000ffff00ffff00 FAIL (expect 0xff00ffff00ffff00)
>
>
> >
> > -static bool trans_gorci(DisasContext *ctx, arg_gorci *a)
> > +static bool trans_orc_b(DisasContext *ctx, arg_orc_b *a)
> >  {
> > -    REQUIRE_EXT(ctx, RVB);
> > -    return gen_shift_imm_tl(ctx, a, EXT_ZERO, gen_helper_gorc);
> > +    REQUIRE_ZBB(ctx);
> > +    return gen_unary(ctx, a, EXT_ZERO, gen_orc_b);
> >  }
> >
> >  #define GEN_SHADD(SHAMT)                                       \
> > @@ -476,22 +487,6 @@ static bool trans_greviw(DisasContext *ctx, arg_greviw *a)
> >      return gen_shift_imm_tl(ctx, a, EXT_ZERO, gen_helper_grev);
> >  }
> >
> > -static bool trans_gorcw(DisasContext *ctx, arg_gorcw *a)
> > -{
> > -    REQUIRE_64BIT(ctx);
> > -    REQUIRE_EXT(ctx, RVB);
> > -    ctx->w = true;
> > -    return gen_shift(ctx, a, EXT_ZERO, gen_helper_gorc);
> > -}
> > -
> > -static bool trans_gorciw(DisasContext *ctx, arg_gorciw *a)
> > -{
> > -    REQUIRE_64BIT(ctx);
> > -    REQUIRE_EXT(ctx, RVB);
> > -    ctx->w = true;
> > -    return gen_shift_imm_tl(ctx, a, EXT_ZERO, gen_helper_gorc);
> > -}
> > -
> >  #define GEN_SHADD_UW(SHAMT)                                       \
> >  static void gen_sh##SHAMT##add_uw(TCGv ret, TCGv arg1, TCGv arg2) \
> >  {                                                                 \
> > --
> > 2.31.1
> >
> >


* Re: [PULL 11/26] target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci
  2021-10-13 13:12     ` [PULL 11/26] target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci Philipp Tomsich
@ 2021-10-13 13:44       ` Vincent Palatin
  2021-10-13 13:49         ` Philipp Tomsich
  0 siblings, 1 reply; 37+ messages in thread
From: Vincent Palatin @ 2021-10-13 13:44 UTC (permalink / raw)
  To: Philipp Tomsich
  Cc: Peter Maydell, Richard Henderson,
	qemu-devel@nongnu.org Developers, Alistair Francis,
	Alistair Francis, Alistair Francis

On Wed, Oct 13, 2021 at 3:13 PM Philipp Tomsich
<philipp.tomsich@vrull.eu> wrote:
>
> I had a much simpler version initially (using 3 x mask/shift/or, for
> 12 instructions after setup of constants), but took up the suggestion
> to optimize based on haszero(v)...
> Indeed this appears to not do what we expect, when there's only 0x01
> set in a byte.
>
> The less optimized form, with a single constant, that would still do
> what we want is:
>    /* set high-bit for non-zero bytes */
>    constant = dup_const_tl(MO_8, 0x7f);
>    tmp = v & constant;   // AND
>    tmp += constant;      // ADD
>    tmp |= v;             // OR
>    /* extract high-bit to low-bit, for each byte */
>    tmp &= ~constant;     // ANDC
>    tmp >>= 7;            // SHR
>    /* multiply with 0xff to populate entire byte where the low-bit is set */
>    tmp *= 0xff;          // MUL
>
> I'll submit a patch with this one later today, once I had a chance to
> pass this through a full test.


Thanks for the insight.

I have tried it, implemented as:
```
static void gen_orc_b(TCGv ret, TCGv source1)
{
    TCGv  tmp = tcg_temp_new();
    TCGv  constant = tcg_constant_tl(dup_const_tl(MO_8, 0x7f));

    /* set high-bit for non-zero bytes */
    tcg_gen_and_tl(tmp, source1, constant);
    tcg_gen_add_tl(tmp, tmp, constant);
    tcg_gen_or_tl(tmp, tmp, source1);
    /* extract high-bit to low-bit, for each byte */
    tcg_gen_andc_tl(tmp, tmp, constant);
    tcg_gen_shri_tl(tmp, tmp, 7);

    /* Replicate the lsb of each byte across the byte. */
    tcg_gen_muli_tl(ret, tmp, 0xff);

    tcg_temp_free(tmp);
}
```

It does pass my own test sequences.
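
For reference, the difference between the merged sequence and the
0x7f-based one can be reproduced outside of TCG. What follows is only a
plain C sketch on uint64_t (assuming a 64-bit target_ulong); the names
orc_b_ref, orc_b_merged, orc_b_fixed and orc_b_self_check are made up
for illustration and are not part of QEMU:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reference semantics: a non-zero byte becomes 0xff, a zero byte stays 0x00. */
static uint64_t orc_b_ref(uint64_t v)
{
    uint64_t r = 0;

    for (int i = 0; i < 8; i++) {
        if ((v >> (8 * i)) & 0xff) {
            r |= (uint64_t)0xff << (8 * i);
        }
    }
    return r;
}

/* Sequence from the merged patch (sub, andc, shri, andc, muli). */
static uint64_t orc_b_merged(uint64_t v)
{
    const uint64_t ones = 0x0101010101010101ULL;
    uint64_t tmp = (v - ones) & ~v;

    tmp >>= 7;
    tmp = ones & ~tmp;
    return tmp * 0xff;
}

/* Sequence based on the 0x7f constant (and, add, or, andc, shri, muli). */
static uint64_t orc_b_fixed(uint64_t v)
{
    const uint64_t c = 0x7f7f7f7f7f7f7f7fULL;
    uint64_t tmp = (v & c) + c;   /* bit 7 set if any of bits 0..6 is set */

    tmp |= v;                     /* also cover bytes with only bit 7 set */
    tmp &= ~c;                    /* keep bit 7 of each byte */
    tmp >>= 7;                    /* move it to bit 0 of each byte */
    return tmp * 0xff;            /* replicate bit 0 across the byte */
}

void orc_b_self_check(void)
{
    static const uint64_t patterns[] = {
        0, ~0ULL, 0x0001000100010001ULL, 0x0100010001000100ULL,
        0x8040201008040201ULL, 0x010055aa00401100ULL,
    };

    for (int i = 0; i < 64; i++) {
        assert(orc_b_fixed(1ULL << i) == orc_b_ref(1ULL << i));
    }
    for (size_t i = 0; i < sizeof(patterns) / sizeof(patterns[0]); i++) {
        assert(orc_b_fixed(patterns[i]) == orc_b_ref(patterns[i]));
    }
    /* The merged sequence loses a byte holding 0x01 above a zero byte. */
    assert(orc_b_merged(0x0000000000000100ULL) == 0);
    assert(orc_b_merged(0x0000000000000100ULL) != orc_b_ref(0x100));
}
```

In orc_b_merged() the borrow of the subtraction leaves a zero byte and
reaches the byte above it, so a byte holding exactly 0x01 ends up looking
like a zero byte. In orc_b_fixed() each per-byte sum is at most 0xfe, so
nothing crosses a byte boundary.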


>
> On Wed, 13 Oct 2021 at 11:36, Vincent Palatin <vpalatin@rivosinc.com> wrote:
> >
> > On Thu, Oct 7, 2021 at 8:58 AM Alistair Francis
> > <alistair.francis@opensource.wdc.com> wrote:
> > >
> > > From: Philipp Tomsich <philipp.tomsich@vrull.eu>
> > >
> > > The 1.0.0 version of Zbb does not contain gorc/gorci.  Instead, a
> > > orc.b instruction (equivalent to the orc.b pseudo-instruction built on
> > > gorci from pre-0.93 draft-B) is available, mainly targeting
> > > string-processing workloads.
> > >
> > > This commit adds the new orc.b instruction and removed gorc/gorci.
> > >
> > > Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> > > Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> > > Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
> > > Message-id: 20210911140016.834071-12-philipp.tomsich@vrull.eu
> > > Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
> > > ---
> > >  target/riscv/helper.h                   |  2 --
> > >  target/riscv/insn32.decode              |  6 +---
> > >  target/riscv/bitmanip_helper.c          | 26 -----------------
> > >  target/riscv/insn_trans/trans_rvb.c.inc | 39 +++++++++++--------------
> > >  4 files changed, 18 insertions(+), 55 deletions(-)
> > >
> > > diff --git a/target/riscv/helper.h b/target/riscv/helper.h
> > > index 8a318a2dbc..a9bda2c8ac 100644
> > > --- a/target/riscv/helper.h
> > > +++ b/target/riscv/helper.h
> > > @@ -61,8 +61,6 @@ DEF_HELPER_FLAGS_1(fclass_d, TCG_CALL_NO_RWG_SE, tl, i64)
> > >  /* Bitmanip */
> > >  DEF_HELPER_FLAGS_2(grev, TCG_CALL_NO_RWG_SE, tl, tl, tl)
> > >  DEF_HELPER_FLAGS_2(grevw, TCG_CALL_NO_RWG_SE, tl, tl, tl)
> > > -DEF_HELPER_FLAGS_2(gorc, TCG_CALL_NO_RWG_SE, tl, tl, tl)
> > > -DEF_HELPER_FLAGS_2(gorcw, TCG_CALL_NO_RWG_SE, tl, tl, tl)
> > >  DEF_HELPER_FLAGS_2(clmul, TCG_CALL_NO_RWG_SE, tl, tl, tl)
> > >  DEF_HELPER_FLAGS_2(clmulr, TCG_CALL_NO_RWG_SE, tl, tl, tl)
> > >
> > > diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
> > > index a509cfee11..59202196dc 100644
> > > --- a/target/riscv/insn32.decode
> > > +++ b/target/riscv/insn32.decode
> > > @@ -681,6 +681,7 @@ max        0000101 .......... 110 ..... 0110011 @r
> > >  maxu       0000101 .......... 111 ..... 0110011 @r
> > >  min        0000101 .......... 100 ..... 0110011 @r
> > >  minu       0000101 .......... 101 ..... 0110011 @r
> > > +orc_b      001010 000111 ..... 101 ..... 0010011 @r2
> > >  orn        0100000 .......... 110 ..... 0110011 @r
> > >  rol        0110000 .......... 001 ..... 0110011 @r
> > >  ror        0110000 .......... 101 ..... 0110011 @r
> > > @@ -702,19 +703,14 @@ pack       0000100 .......... 100 ..... 0110011 @r
> > >  packu      0100100 .......... 100 ..... 0110011 @r
> > >  packh      0000100 .......... 111 ..... 0110011 @r
> > >  grev       0110100 .......... 101 ..... 0110011 @r
> > > -gorc       0010100 .......... 101 ..... 0110011 @r
> > > -
> > >  grevi      01101. ........... 101 ..... 0010011 @sh
> > > -gorci      00101. ........... 101 ..... 0010011 @sh
> > >
> > >  # *** RV64B Standard Extension (in addition to RV32B) ***
> > >  packw      0000100 .......... 100 ..... 0111011 @r
> > >  packuw     0100100 .......... 100 ..... 0111011 @r
> > >  grevw      0110100 .......... 101 ..... 0111011 @r
> > > -gorcw      0010100 .......... 101 ..... 0111011 @r
> > >
> > >  greviw     0110100 .......... 101 ..... 0011011 @sh5
> > > -gorciw     0010100 .......... 101 ..... 0011011 @sh5
> > >
> > >  # *** RV32 Zbc Standard Extension ***
> > >  clmul      0000101 .......... 001 ..... 0110011 @r
> > > diff --git a/target/riscv/bitmanip_helper.c b/target/riscv/bitmanip_helper.c
> > > index 73be5a81c7..bb48388fcd 100644
> > > --- a/target/riscv/bitmanip_helper.c
> > > +++ b/target/riscv/bitmanip_helper.c
> > > @@ -64,32 +64,6 @@ target_ulong HELPER(grevw)(target_ulong rs1, target_ulong rs2)
> > >      return do_grev(rs1, rs2, 32);
> > >  }
> > >
> > > -static target_ulong do_gorc(target_ulong rs1,
> > > -                            target_ulong rs2,
> > > -                            int bits)
> > > -{
> > > -    target_ulong x = rs1;
> > > -    int i, shift;
> > > -
> > > -    for (i = 0, shift = 1; shift < bits; i++, shift <<= 1) {
> > > -        if (rs2 & shift) {
> > > -            x |= do_swap(x, adjacent_masks[i], shift);
> > > -        }
> > > -    }
> > > -
> > > -    return x;
> > > -}
> > > -
> > > -target_ulong HELPER(gorc)(target_ulong rs1, target_ulong rs2)
> > > -{
> > > -    return do_gorc(rs1, rs2, TARGET_LONG_BITS);
> > > -}
> > > -
> > > -target_ulong HELPER(gorcw)(target_ulong rs1, target_ulong rs2)
> > > -{
> > > -    return do_gorc(rs1, rs2, 32);
> > > -}
> > > -
> > >  target_ulong HELPER(clmul)(target_ulong rs1, target_ulong rs2)
> > >  {
> > >      target_ulong result = 0;
> > > diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
> > > index bdfb495f24..d32af5915a 100644
> > > --- a/target/riscv/insn_trans/trans_rvb.c.inc
> > > +++ b/target/riscv/insn_trans/trans_rvb.c.inc
> > > @@ -295,16 +295,27 @@ static bool trans_grevi(DisasContext *ctx, arg_grevi *a)
> > >      return gen_shift_imm_fn(ctx, a, EXT_NONE, gen_grevi);
> > >  }
> > >
> > > -static bool trans_gorc(DisasContext *ctx, arg_gorc *a)
> > > +static void gen_orc_b(TCGv ret, TCGv source1)
> > >  {
> > > -    REQUIRE_EXT(ctx, RVB);
> > > -    return gen_shift(ctx, a, EXT_ZERO, gen_helper_gorc);
> > > +    TCGv  tmp = tcg_temp_new();
> > > +    TCGv  ones = tcg_constant_tl(dup_const_tl(MO_8, 0x01));
> > > +
> > > +    /* Set lsb in each byte if the byte was zero. */
> > > +    tcg_gen_sub_tl(tmp, source1, ones);
> > > +    tcg_gen_andc_tl(tmp, tmp, source1);
> > > +    tcg_gen_shri_tl(tmp, tmp, 7);
> > > +    tcg_gen_andc_tl(tmp, ones, tmp);
> > > +
> > > +    /* Replicate the lsb of each byte across the byte. */
> > > +    tcg_gen_muli_tl(ret, tmp, 0xff);
> > > +
> > > +    tcg_temp_free(tmp);
> > >  }
> >
> > It seems there is a bug in the current orc.b implementation,
> > the following 7 hexadecimal patterns return one wrong byte (0x00
> > instead of 0xff):
> > orc.b(0x............01..) = 0x............00.. (instead of 0x............ff..)
> > orc.b(0x..........01....) = 0x..........00.... (instead of 0x..........ff....)
> > orc.b(0x........01......) = 0x........00...... (instead of 0x........ff......)
> > orc.b(0x......01........) = 0x......00........ (instead of 0x......ff........)
> > orc.b(0x....01..........) = 0x....00.......... (instead of 0x....ff..........)
> > orc.b(0x..01............) = 0x..00............ (instead of 0x..ff............)
> > orc.b(0x01..............) = 0x00.............. (instead of 0xff..............)
> > (see test cases below)
> >
> > The issue seems to be related to the propagation of the carry.
> > I had a hard time fixing it. With some help, I have added a prolog
> > which basically computes:
> > (source1 | ((source1 << 1) & ~ones)) in order to avoid the carry.
> > I will send the patch as a follow-up in this thread as 'Patch 1A'.
> >
> > But it's notably less optimized than the current code,  so feel free
> > to come up with better options.
> > Actually my initial stab at fixing it was a more straightforward but
> > less astute 'divide and conquer' method, where bits are or'ed by pairs,
> > then the pairs are or'ed by pairs, and so on, using the following formula:
> > tmp = source1 | (source1 >> 1)
> > tmp = tmp | (tmp >> 2)
> > tmp = tmp | (tmp >> 4)
> > ret = tmp & 0x0101010101010101
> > ret = ret * 0xff
> > It is notably less optimized than the current code when converted to
> > tcg_gen_ primitives, but on par with the fixed version.
> > I'm also sending it in this thread as 'Patch 1B', as I feel it's slightly
> > easier to grasp.
> >
> >
> > Test cases run on current implementation:
> > orc.b(0x0000000000000000) = 0x0000000000000000   OK (expect 0x0000000000000000)
> > orc.b(0x0000000000000001) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
> > orc.b(0x0000000000000002) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
> > orc.b(0x0000000000000004) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
> > orc.b(0x0000000000000008) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
> > orc.b(0x0000000000000010) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
> > orc.b(0x0000000000000020) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
> > orc.b(0x0000000000000040) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
> > orc.b(0x0000000000000080) = 0x00000000000000ff   OK (expect 0x00000000000000ff)
> > orc.b(0x0000000000000100) = 0x0000000000000000 FAIL (expect 0x000000000000ff00)
> > orc.b(0x0000000000000200) = 0x000000000000ff00   OK (expect 0x000000000000ff00)
> > orc.b(0x0000000000000400) = 0x000000000000ff00   OK (expect 0x000000000000ff00)
> > orc.b(0x0000000000000800) = 0x000000000000ff00   OK (expect 0x000000000000ff00)
> > orc.b(0x0000000000001000) = 0x000000000000ff00   OK (expect 0x000000000000ff00)
> > orc.b(0x0000000000002000) = 0x000000000000ff00   OK (expect 0x000000000000ff00)
> > orc.b(0x0000000000004000) = 0x000000000000ff00   OK (expect 0x000000000000ff00)
> > orc.b(0x0000000000008000) = 0x000000000000ff00   OK (expect 0x000000000000ff00)
> > orc.b(0x0000000000010000) = 0x0000000000000000 FAIL (expect 0x0000000000ff0000)
> > orc.b(0x0000000000020000) = 0x0000000000ff0000   OK (expect 0x0000000000ff0000)
> > orc.b(0x0000000001000000) = 0x0000000000000000 FAIL (expect 0x00000000ff000000)
> > orc.b(0x0000000002000000) = 0x00000000ff000000   OK (expect 0x00000000ff000000)
> > orc.b(0x0000000100000000) = 0x0000000000000000 FAIL (expect 0x000000ff00000000)
> > orc.b(0x0000000200000000) = 0x000000ff00000000   OK (expect 0x000000ff00000000)
> > orc.b(0x0000010000000000) = 0x0000000000000000 FAIL (expect 0x0000ff0000000000)
> > orc.b(0x0000020000000000) = 0x0000ff0000000000   OK (expect 0x0000ff0000000000)
> > orc.b(0x0001000000000000) = 0x0000000000000000 FAIL (expect 0x00ff000000000000)
> > orc.b(0x0002000000000000) = 0x00ff000000000000   OK (expect 0x00ff000000000000)
> > orc.b(0x0100000000000000) = 0x0000000000000000 FAIL (expect 0xff00000000000000)
> > orc.b(0x0200000000000000) = 0xff00000000000000   OK (expect 0xff00000000000000)
> > orc.b(0x0400000000000000) = 0xff00000000000000   OK (expect 0xff00000000000000)
> > orc.b(0x0800000000000000) = 0xff00000000000000   OK (expect 0xff00000000000000)
> > orc.b(0x1000000000000000) = 0xff00000000000000   OK (expect 0xff00000000000000)
> > orc.b(0x2000000000000000) = 0xff00000000000000   OK (expect 0xff00000000000000)
> > orc.b(0x4000000000000000) = 0xff00000000000000   OK (expect 0xff00000000000000)
> > orc.b(0x8000000000000000) = 0xff00000000000000   OK (expect 0xff00000000000000)
> > orc.b(0xffffffffffffffff) = 0xffffffffffffffff   OK (expect 0xffffffffffffffff)
> > orc.b(0x00ff00ff00ff00ff) = 0x00ff00ff00ff00ff   OK (expect 0x00ff00ff00ff00ff)
> > orc.b(0xff00ff00ff00ff00) = 0xff00ff00ff00ff00   OK (expect 0xff00ff00ff00ff00)
> > orc.b(0x0001000100010001) = 0x00000000000000ff FAIL (expect 0x00ff00ff00ff00ff)
> > orc.b(0x0100010001000100) = 0x0000000000000000 FAIL (expect 0xff00ff00ff00ff00)
> > orc.b(0x8040201008040201) = 0xffffffffffffffff   OK (expect 0xffffffffffffffff)
> > orc.b(0x0804020180402010) = 0xffffffffffffffff   OK (expect 0xffffffffffffffff)
> > orc.b(0x010055aa00401100) = 0x0000ffff00ffff00 FAIL (expect 0xff00ffff00ffff00)
> >
> >
> > >
> > > -static bool trans_gorci(DisasContext *ctx, arg_gorci *a)
> > > +static bool trans_orc_b(DisasContext *ctx, arg_orc_b *a)
> > >  {
> > > -    REQUIRE_EXT(ctx, RVB);
> > > -    return gen_shift_imm_tl(ctx, a, EXT_ZERO, gen_helper_gorc);
> > > +    REQUIRE_ZBB(ctx);
> > > +    return gen_unary(ctx, a, EXT_ZERO, gen_orc_b);
> > >  }
> > >
> > >  #define GEN_SHADD(SHAMT)                                       \
> > > @@ -476,22 +487,6 @@ static bool trans_greviw(DisasContext *ctx, arg_greviw *a)
> > >      return gen_shift_imm_tl(ctx, a, EXT_ZERO, gen_helper_grev);
> > >  }
> > >
> > > -static bool trans_gorcw(DisasContext *ctx, arg_gorcw *a)
> > > -{
> > > -    REQUIRE_64BIT(ctx);
> > > -    REQUIRE_EXT(ctx, RVB);
> > > -    ctx->w = true;
> > > -    return gen_shift(ctx, a, EXT_ZERO, gen_helper_gorc);
> > > -}
> > > -
> > > -static bool trans_gorciw(DisasContext *ctx, arg_gorciw *a)
> > > -{
> > > -    REQUIRE_64BIT(ctx);
> > > -    REQUIRE_EXT(ctx, RVB);
> > > -    ctx->w = true;
> > > -    return gen_shift_imm_tl(ctx, a, EXT_ZERO, gen_helper_gorc);
> > > -}
> > > -
> > >  #define GEN_SHADD_UW(SHAMT)                                       \
> > >  static void gen_sh##SHAMT##add_uw(TCGv ret, TCGv arg1, TCGv arg2) \
> > >  {                                                                 \
> > > --
> > > 2.31.1
> > >
> > >


* Re: [PULL 11/26] target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci
  2021-10-13 13:44       ` Vincent Palatin
@ 2021-10-13 13:49         ` Philipp Tomsich
  2021-10-13 16:20           ` Vineet Gupta
  0 siblings, 1 reply; 37+ messages in thread
From: Philipp Tomsich @ 2021-10-13 13:49 UTC (permalink / raw)
  To: Vincent Palatin
  Cc: Peter Maydell, Richard Henderson,
	qemu-devel@nongnu.org Developers, Alistair Francis,
	Alistair Francis, Alistair Francis

On Wed, 13 Oct 2021 at 15:44, Vincent Palatin <vpalatin@rivosinc.com> wrote:
>
> On Wed, Oct 13, 2021 at 3:13 PM Philipp Tomsich
> <philipp.tomsich@vrull.eu> wrote:
> >
> > I had a much simpler version initially (using 3 x mask/shift/or, for
> > 12 instructions after setup of constants), but took up the suggestion
> > to optimize based on haszero(v)...
> > Indeed this appears to not do what we expect, when there's only 0x01
> > set in a byte.
> >
> > The less optimized form, with a single constant, that would still do
> > what we want is:
> >    /* set high-bit for non-zero bytes */
> >    constant = dup_const_tl(MO_8, 0x7f);
> >    tmp = v & constant;   // AND
> >    tmp += constant;      // ADD
> >    tmp |= v;             // OR
> >    /* extract high-bit to low-bit, for each byte */
> >    tmp &= ~constant;     // ANDC
> >    tmp >>= 7;            // SHR
> >    /* multiply with 0xff to populate entire byte where the low-bit is set */
> >    tmp *= 0xff;          // MUL
> >
> > I'll submit a patch with this one later today, once I had a chance to
> > pass this through a full test.
>
>
> Thanks for the insight.
>
> I have tried it, implemented as:
> ```
> static void gen_orc_b(TCGv ret, TCGv source1)
> {
>     TCGv  tmp = tcg_temp_new();
>     TCGv  constant = tcg_constant_tl(dup_const_tl(MO_8, 0x7f));
>
>     /* set high-bit for non-zero bytes */
>     tcg_gen_and_tl(tmp, source1, constant);
>     tcg_gen_add_tl(tmp, tmp, constant);
>     tcg_gen_or_tl(tmp, tmp, source1);
>     /* extract high-bit to low-bit, for each byte */
>     tcg_gen_andc_tl(tmp, tmp, constant);
>     tcg_gen_shri_tl(tmp, tmp, 7);
>
>     /* Replicate the lsb of each byte across the byte. */
>     tcg_gen_muli_tl(ret, tmp, 0xff);
>
>     tcg_temp_free(tmp);
> }
> ```
>
> It does pass my own test sequences.

I am running it against SPEC at the moment, using optimized
strlen/strcpy/strcmp functions using orc.b.
The verdict on that should be available later today...
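
As an aside on why orc.b matters for these string functions, here is a
sketch of the idea only, not the code under test. orc_b() below is a
software stand-in for the instruction, and first_zero_byte() and
strlen_orc_b() are hypothetical helper names. A real implementation
would presumably use ctz on the inverted result and aligned word loads
instead of the byte loops used here for portability:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Software stand-in for orc.b: non-zero byte -> 0xff, zero byte -> 0x00. */
static uint64_t orc_b(uint64_t v)
{
    const uint64_t c = 0x7f7f7f7f7f7f7f7fULL;
    uint64_t tmp = (((v & c) + c) | v) & ~c;

    return (tmp >> 7) * 0xff;
}

/* Index (0..7) of the lowest zero byte in w, or 8 if w has no zero byte. */
static unsigned first_zero_byte(uint64_t w)
{
    uint64_t zeros = ~orc_b(w);   /* 0xff exactly where w has a zero byte */
    unsigned idx = 0;

    if (zeros == 0) {
        return 8;
    }
    while ((zeros & 0xff) == 0) {
        zeros >>= 8;
        idx++;
    }
    return idx;
}

/*
 * Length of the string in buf, scanning 8 bytes at a time.
 * Assumption for this sketch: buf has cap readable bytes and cap is a
 * multiple of 8. Returns cap if no terminator is found.
 */
size_t strlen_orc_b(const char *buf, size_t cap)
{
    for (size_t off = 0; off + 8 <= cap; off += 8) {
        uint64_t w = 0;

        /* Assemble the word in little-endian order, whatever the host is. */
        for (int i = 0; i < 8; i++) {
            w |= (uint64_t)(unsigned char)buf[off + i] << (8 * i);
        }

        unsigned idx = first_zero_byte(w);
        if (idx < 8) {
            return off + idx;
        }
    }
    return cap;
}
```

The point is that one orc.b per word turns "is there a NUL in these 8
bytes, and where" into a single inverted mask, so a wrong orc.b result
shows up directly as a wrong string length.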

Philipp.


* Re: [PULL 11/26] target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci
  2021-10-13 13:49         ` Philipp Tomsich
@ 2021-10-13 16:20           ` Vineet Gupta
  2021-10-13 16:51             ` Richard Henderson
  0 siblings, 1 reply; 37+ messages in thread
From: Vineet Gupta @ 2021-10-13 16:20 UTC (permalink / raw)
  To: Philipp Tomsich, Vincent Palatin
  Cc: Peter Maydell, Anup Patel, Richard Henderson,
	qemu-devel@nongnu.org Developers, Alistair Francis,
	Alistair Francis, Alistair Francis, Palmer Dabbelt, Jim Wilson

On 10/13/21 6:49 AM, Philipp Tomsich wrote:
> On Wed, 13 Oct 2021 at 15:44, Vincent Palatin <vpalatin@rivosinc.com> wrote:
>>
>> On Wed, Oct 13, 2021 at 3:13 PM Philipp Tomsich
>> <philipp.tomsich@vrull.eu> wrote:
>>>
>>> I had a much simpler version initially (using 3 x mask/shift/or, for
>>> 12 instructions after setup of constants), but took up the suggestion
>>> to optimize based on haszero(v)...
>>> Indeed this appears to not do what we expect, when there's only 0x01
>>> set in a byte.
>>>
>>> The less optimized form, with a single constant, that would still do
>>> what we want is:
>>>     /* set high-bit for non-zero bytes */
>>>     constant = dup_const_tl(MO_8, 0x7f);
>>>     tmp = v & constant;   // AND
>>>     tmp += constant;      // ADD
>>>     tmp |= v;             // OR
>>>     /* extract high-bit to low-bit, for each byte */
>>>     tmp &= ~constant;     // ANDC
>>>     tmp >>= 7;            // SHR
>>>     /* multiply with 0xff to populate entire byte where the low-bit is set */
>>>     tmp *= 0xff;          // MUL
>>>
>>> I'll submit a patch with this one later today, once I had a chance to
>>> pass this through a full test.
>>
>>
>> Thanks for the insight.
>>
>> I have tried it, implemented as:
>> ```
>> static void gen_orc_b(TCGv ret, TCGv source1)
>> {
>>      TCGv  tmp = tcg_temp_new();
>>      TCGv  constant = tcg_constant_tl(dup_const_tl(MO_8, 0x7f));
>>
>>      /* set high-bit for non-zero bytes */
>>      tcg_gen_and_tl(tmp, source1, constant);
>>      tcg_gen_add_tl(tmp, tmp, constant);
>>      tcg_gen_or_tl(tmp, tmp, source1);
>>      /* extract high-bit to low-bit, for each byte */
>>      tcg_gen_andc_tl(tmp, tmp, constant);
>>      tcg_gen_shri_tl(tmp, tmp, 7);
>>
>>      /* Replicate the lsb of each byte across the byte. */
>>      tcg_gen_muli_tl(ret, tmp, 0xff);
>>
>>      tcg_temp_free(tmp);
>> }
>> ```
>>
>> It does pass my own test sequences.
> 
> I am running it against SPEC at the moment, using optimized
> strlen/strcpy/strcmp functions using orc.b.
> The verdict on that should be available later today...

Off topic but related: for Zb (and similar things in the future), what's
the strategy for change management/discovery? I understand you can
hardcode things for a quick test, but a proper glibc implementation
would use an IFUNC, and there seems to be no architectural way per the
spec (for software/kernel) to discover this.

The same issue arises when building the Linux kernel with Zb: how do we
make sure that the hardware/sim supports Zb when running the
corresponding software?

It seems some generic discovery/enumeration scheme is in the works, but
what should we do in the interim?

Thx,
-Vineet


* Re: [PULL 11/26] target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci
  2021-10-13 16:20           ` Vineet Gupta
@ 2021-10-13 16:51             ` Richard Henderson
  2021-10-13 17:00               ` Philipp Tomsich
  0 siblings, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2021-10-13 16:51 UTC (permalink / raw)
  To: Vineet Gupta, Philipp Tomsich, Vincent Palatin
  Cc: Alistair Francis, Peter Maydell, Anup Patel,
	qemu-devel@nongnu.org Developers, Alistair Francis,
	Alistair Francis, Palmer Dabbelt, Jim Wilson

On 10/13/21 9:20 AM, Vineet Gupta wrote:
> off topic but relates, for Zb (and similar things in the future) whats the strategy for 
> change management/discovery. I understand you can hardcode things for quick test, but for 
> a proper glibc implementation this would be an IFUNC but there seems to be no 
> architectural way per spec (for software/kernel) to discover this.

Since the architecture restricted access to these CSRs, you do have to coordinate with the 
kernel.

There is an AT_HWCAP value that is given to userland, but it is currently masked to only 
provide a few of the MISA bits.  This will need to be extended for both V and Zb.  It 
doesn't help that Zb has been split into lots of smaller extensions, which (if done 
simplistically) will quickly consume all of the bits within AT_HWCAP.

So: I strongly suggest that RISC-V spend a few moments considering a way to represent this 
that will easily support the myriad extensions.  One possibility is to add more AT_* 
entries straight away -- AT_HWCAP_ZB, which contains one bit for all of the Zb[abcs] 
extensions.  Possibly set the "main" AT_HWCAP "b" bit if Zb is present at some minimal level.
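
To illustrate the current situation from userland, here is a minimal
sketch that reads AT_HWCAP with getauxval() and decodes the single-letter
bits. It assumes Linux on RISC-V, where bit n of AT_HWCAP stands for the
extension letter 'a' + n (on other architectures the bits mean something
else). hwcap_has_letter() and print_isa_letters() are illustrative names.
Nothing here can report Zb[abcs], and AT_HWCAP_ZB is only a suggestion,
so the sketch does not use it:

```c
#include <assert.h>
#include <stdio.h>
#include <sys/auxv.h>

/* Return 1 if the bit for the given lowercase extension letter is set. */
static int hwcap_has_letter(unsigned long hwcap, char letter)
{
    assert(letter >= 'a' && letter <= 'z');
    return (int)((hwcap >> (letter - 'a')) & 1UL);
}

/* Print the single-letter extensions the kernel exposes via AT_HWCAP. */
void print_isa_letters(void)
{
    unsigned long hwcap = getauxval(AT_HWCAP);

    for (char c = 'a'; c <= 'z'; c++) {
        if (hwcap_has_letter(hwcap, c)) {
            putchar(c);
        }
    }
    putchar('\n');
}
```

With 26 letter positions already spoken for in this layout, one bit per
Zb sub-extension in the same word would not leave much room, which is the
concern about consuming AT_HWCAP.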

> Same issue is with building linux kernel with Zb - how do we make sure that hardware/sim 
> supports Zb when running corresponding software.

On the kernel side this is easier -- read the CSRs then patch the kernel.
There are existing ways to manage this sort of thing.


r~


* Re: [PULL 11/26] target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci
  2021-10-13 16:51             ` Richard Henderson
@ 2021-10-13 17:00               ` Philipp Tomsich
  0 siblings, 0 replies; 37+ messages in thread
From: Philipp Tomsich @ 2021-10-13 17:00 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Peter Maydell, Alistair Francis, Anup Patel, Vineet Gupta,
	qemu-devel@nongnu.org Developers, Alistair Francis,
	Alistair Francis, Palmer Dabbelt, Vincent Palatin, Jim Wilson

On Wed, 13 Oct 2021 at 18:51, Richard Henderson <
richard.henderson@linaro.org> wrote:

> On 10/13/21 9:20 AM, Vineet Gupta wrote:
> > off topic but relates, for Zb (and similar things in the future) whats
> the strategy for
> > change management/discovery. I understand you can hardcode things for
> quick test, but for
> > a proper glibc implementation this would be an IFUNC but there seems to
> be no
> > architectural way per spec (for software/kernel) to discover this.
>
> Since the architecture restricted access to these CSRs, you do have to
> coordinate with the
> kernel.
>

Zb[abcs] will not be discoverable via MISA bits.
A unified low-level discovery mechanism (and a way to inject this
information into userspace via the auxiliary vector) is being developed
at the moment.

> There is an AT_HWCAP value that is given to userland, but it is currently
> masked to only
> provide a few of the MISA bits.  This will need to be extended for both V
> and Zb.  It
> doesn't help that Zb has been split into lots of smaller extensions, which
> (if done
> simplistically) will quickly consume all of the bits within AT_HWCAP.
>

It looks like HWCAP, HWCAP2, AT_PLATFORM and AT_BASE_PLATFORM will be
used.
Kito presented the (then current) state of thinking at the Linux Plumbers
Conference…


> So: I strongly suggest that RISC-V spend a few moments considering a way
> to represent this
> that will easily support the myriad extensions.  One possibility is to add
> more AT_*
> entries straight away -- AT_HWCAP_ZB, which contains one bit for all of
> the Zb[abcs]
> extensions.  Possibly set the "main" AT_HWCAP "b" bit if Zb is present at
> some minimal level.
>
> > Same issue is with building linux kernel with Zb - how do we make sure
> that hardware/sim
> > supports Zb when running corresponding software.
>
> On the kernel side this is easier -- read the CSRs then patch the kernel.
> There are existing ways to manage this sort of thing.
>
>
> r~
>

end of thread, other threads:[~2021-10-13 17:03 UTC | newest]

Thread overview: 37+ messages
2021-10-07  6:47 [PULL 00/26] riscv-to-apply queue Alistair Francis
2021-10-07  6:47 ` [PULL 01/26] target/riscv: Introduce temporary in gen_add_uw() Alistair Francis
2021-10-07  6:47 ` [PULL 02/26] target/riscv: fix clzw implementation to operate on arg1 Alistair Francis
2021-10-07  6:47 ` [PULL 03/26] target/riscv: clwz must ignore high bits (use shift-left & changed logic) Alistair Francis
2021-10-07  6:47 ` [PULL 04/26] target/riscv: Add x-zba, x-zbb, x-zbc and x-zbs properties Alistair Francis
2021-10-07  6:47 ` [PULL 05/26] target/riscv: Reassign instructions to the Zba-extension Alistair Francis
2021-10-07  6:47 ` [PULL 06/26] target/riscv: Remove the W-form instructions from Zbs Alistair Francis
2021-10-07  6:47 ` [PULL 07/26] target/riscv: Remove shift-one instructions (proposed Zbo in pre-0.93 draft-B) Alistair Francis
2021-10-07  6:47 ` [PULL 08/26] target/riscv: Reassign instructions to the Zbs-extension Alistair Francis
2021-10-07  6:47 ` [PULL 09/26] target/riscv: Add instructions of the Zbc-extension Alistair Francis
2021-10-07  6:47 ` [PULL 10/26] target/riscv: Reassign instructions to the Zbb-extension Alistair Francis
2021-10-07  6:47 ` [PULL 11/26] target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci Alistair Francis
2021-10-13  9:36   ` Vincent Palatin
2021-10-13  9:37     ` [PATCH v1A] target/riscv: fix orc.b instruction in the Zbb extension Vincent Palatin
2021-10-13  9:38     ` [PATCH v1B] " Vincent Palatin
2021-10-13 13:12     ` [PULL 11/26] target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci Philipp Tomsich
2021-10-13 13:44       ` Vincent Palatin
2021-10-13 13:49         ` Philipp Tomsich
2021-10-13 16:20           ` Vineet Gupta
2021-10-13 16:51             ` Richard Henderson
2021-10-13 17:00               ` Philipp Tomsich
2021-10-07  6:47 ` [PULL 12/26] target/riscv: Add a REQUIRE_32BIT macro Alistair Francis
2021-10-07  6:47 ` [PULL 13/26] target/riscv: Add rev8 instruction, removing grev/grevi Alistair Francis
2021-10-07  6:47 ` [PULL 14/26] target/riscv: Add zext.h instructions to Zbb, removing pack/packu/packh Alistair Francis
2021-10-07  6:47 ` [PULL 15/26] target/riscv: Remove RVB (replaced by Zb[abcs]) Alistair Francis
2021-10-07  6:47 ` [PULL 16/26] disas/riscv: Add Zb[abcs] instructions Alistair Francis
2021-10-07  6:47 ` [PULL 17/26] target/riscv: Set mstatus_hs.[SD|FS] bits if Clean and V=1 in mark_fs_dirty() Alistair Francis
2021-10-07  6:47 ` [PULL 18/26] hw/char: ibex_uart: Register device in 'input' category Alistair Francis
2021-10-07  6:47 ` [PULL 19/26] hw/char: shakti_uart: " Alistair Francis
2021-10-07  6:47 ` [PULL 20/26] hw/char: sifive_uart: " Alistair Francis
2021-10-07  6:47 ` [PULL 21/26] hw/char/mchp_pfsoc_mmuart: Simplify MCHP_PFSOC_MMUART_REG definition Alistair Francis
2021-10-07  6:47 ` [PULL 22/26] hw/char/mchp_pfsoc_mmuart: Use a MemoryRegion container Alistair Francis
2021-10-07  6:47 ` [PULL 23/26] hw/char/mchp_pfsoc_mmuart: QOM'ify PolarFire MMUART Alistair Francis
2021-10-07  6:47 ` [PULL 24/26] hw/dma: sifive_pdma: Fix Control.claim bit detection Alistair Francis
2021-10-07  6:47 ` [PULL 25/26] hw/dma: sifive_pdma: Don't run DMA when channel is disclaimed Alistair Francis
2021-10-07  6:47 ` [PULL 26/26] hw/riscv: shakti_c: Mark as not user creatable Alistair Francis
2021-10-07 17:25 ` [PULL 00/26] riscv-to-apply queue Richard Henderson