* [PATCH] target/riscv: Fix orc.b implementation
@ 2021-10-13 18:41 Philipp Tomsich
2021-10-13 19:12 ` Vincent Palatin
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Philipp Tomsich @ 2021-10-13 18:41 UTC
To: qemu-devel
Cc: Alistair Francis, Richard Henderson, Philipp Tomsich, Vincent Palatin
The earlier implementation fell into a corner case for bytes that were
0x01, giving a wrong result (but not affecting our application test
cases for strings, as an ASCII value 0x01 is rare in those...).
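For illustration, the corner case can be reproduced outside of TCG. The
following is only a sketch in plain C that mirrors the removed TCG ops on a
64-bit value; the names orc_b_old and orc_b_old_check and the fixed 64-bit
width are assumptions made for this example and are not part of the patch.
The subtraction borrows across byte boundaries, so a 0x01 byte sitting
directly above a 0x00 byte is treated as if it were zero.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Plain-C rendering of the removed sequence (hypothetical helper, 64-bit
 * only). Note that tcg_gen_andc_tl(d, a, b) computes d = a & ~b.
 */
static uint64_t orc_b_old(uint64_t x)
{
    const uint64_t ones = 0x0101010101010101ULL;
    uint64_t tmp;

    tmp = x - ones;       /* a borrow can cross into the next byte */
    tmp &= ~x;            /* msb set if the byte is taken to be zero */
    tmp >>= 7;            /* move that msb down to the lsb */
    tmp = ones & ~tmp;    /* lsb set if the byte is taken to be non-zero */
    return tmp * 0xff;    /* replicate the lsb across the byte */
}

/* Hypothetical self-check documenting the observed behaviour. */
void orc_b_old_check(void)
{
    /* A lone 0x01 in the lowest byte is still handled correctly. */
    assert(orc_b_old(0x01ULL) == 0xffULL);
    /* 0x01 above a zero byte: the expected result is 0xff00, but we get 0. */
    assert(orc_b_old(0x0100ULL) == 0x0ULL);
}
```

In this sketch the wrong result only shows up when the byte below the 0x01
byte generates a borrow, which may explain why string-oriented tests did not
catch it.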
This changes the algorithm to:
1. Mask out the high-bit of each byte (so that each byte is <= 127).
2. Add 127 to each byte (i.e. if the low 7 bits are not 0, this will overflow
into the highest bit of each byte).
3. Bitwise-or the original value back in (to cover those cases where the
source byte was exactly 128) to saturate the high-bit.
4. Shift-and-mask (implemented as a mask-and-shift) to extract the MSB of
each byte into its LSB.
5. Multiply with 0xff to fan out the LSB to all bits of each byte.
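The five steps above can be sketched in plain C as follows. This is a
minimal illustration on a 64-bit value, not the TCG code from the patch;
the names orc_b_new, orc_b_ref and orc_b_new_check are hypothetical, and
the byte-wise reference loop is only there to compare against.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: every non-zero byte becomes 0xff, every zero byte stays 0x00. */
static uint64_t orc_b_ref(uint64_t x)
{
    uint64_t r = 0;

    for (int i = 0; i < 64; i += 8) {
        if ((x >> i) & 0xff) {
            r |= (uint64_t)0xff << i;
        }
    }
    return r;
}

/* Steps 1-5 applied to a 64-bit value (hypothetical helper). */
static uint64_t orc_b_new(uint64_t x)
{
    const uint64_t low7 = 0x7f7f7f7f7f7f7f7fULL;
    uint64_t tmp;

    tmp = x & low7;       /* 1. each byte is now <= 0x7f */
    tmp += low7;          /* 2. sum is <= 0xfe, so no carry leaves a byte */
    tmp |= x;             /* 3. covers bytes such as 0x80 (low 7 bits zero) */
    tmp &= ~low7;         /* 4. keep only the msb of each byte ... */
    tmp >>= 7;            /*    ... and move it to the lsb */
    return tmp * 0xff;    /* 5. fan the lsb out to all bits of the byte */
}

/* Hypothetical self-check against the byte-wise reference. */
void orc_b_new_check(void)
{
    static const uint64_t inputs[] = {
        0x0ULL, 0x01ULL, 0x0100ULL, 0x80ULL, 0x8000000000000001ULL,
        0x00ff00010080007fULL, 0xffffffffffffffffULL,
    };

    for (unsigned i = 0; i < sizeof(inputs) / sizeof(inputs[0]); i++) {
        assert(orc_b_new(inputs[i]) == orc_b_ref(inputs[i]));
    }
}
```

Because step 1 caps every byte at 0x7f before the addition, no carry can
reach a neighbouring byte, so each byte is evaluated independently of the
others.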
Fixes: d7a4fcb034 ("target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci")
Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Reported-by: Vincent Palatin <vpalatin@rivosinc.com>
---
target/riscv/insn_trans/trans_rvb.c.inc | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/target/riscv/insn_trans/trans_rvb.c.inc b/target/riscv/insn_trans/trans_rvb.c.inc
index 185c3e9a60..3095624f32 100644
--- a/target/riscv/insn_trans/trans_rvb.c.inc
+++ b/target/riscv/insn_trans/trans_rvb.c.inc
@@ -249,13 +249,16 @@ static bool trans_rev8_64(DisasContext *ctx, arg_rev8_64 *a)
 static void gen_orc_b(TCGv ret, TCGv source1)
 {
     TCGv tmp = tcg_temp_new();
-    TCGv ones = tcg_constant_tl(dup_const_tl(MO_8, 0x01));
+    TCGv low7 = tcg_constant_tl(dup_const_tl(MO_8, 0x7f));
 
-    /* Set lsb in each byte if the byte was zero. */
-    tcg_gen_sub_tl(tmp, source1, ones);
-    tcg_gen_andc_tl(tmp, tmp, source1);
+    /* Set msb in each byte if the byte was non-zero. */
+    tcg_gen_and_tl(tmp, source1, low7);
+    tcg_gen_add_tl(tmp, tmp, low7);
+    tcg_gen_or_tl(tmp, tmp, source1);
+
+    /* Extract the msb to the lsb in each byte */
+    tcg_gen_andc_tl(tmp, tmp, low7);
     tcg_gen_shri_tl(tmp, tmp, 7);
-    tcg_gen_andc_tl(tmp, ones, tmp);
 
     /* Replicate the lsb of each byte across the byte. */
     tcg_gen_muli_tl(ret, tmp, 0xff);
--
2.25.1
* Re: [PATCH] target/riscv: Fix orc.b implementation
From: Vincent Palatin @ 2021-10-13 19:12 UTC
To: Philipp Tomsich
Cc: Alistair Francis, Richard Henderson, qemu-devel@nongnu.org Developers
On Wed, Oct 13, 2021 at 8:41 PM Philipp Tomsich
<philipp.tomsich@vrull.eu> wrote:
>
> The earlier implementation fell into a corner case for bytes that were
> 0x01, giving a wrong result (but not affecting our application test
> cases for strings, as an ASCII value 0x01 is rare in those...).
>
> This changes the algorithm to:
> 1. Mask out the high-bit of each byte (so that each byte is <= 127).
> 2. Add 127 to each byte (i.e. if the low 7 bits are not 0, this will overflow
> into the highest bit of each byte).
> 3. Bitwise-or the original value back in (to cover those cases where the
> source byte was exactly 128) to saturate the high-bit.
> 4. Shift-and-mask (implemented as a mask-and-shift) to extract the MSB of
> each byte into its LSB.
> 5. Multiply with 0xff to fan out the LSB to all bits of each byte.
>
> Fixes: d7a4fcb034 ("target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci")
>
> Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> Reported-by: Vincent Palatin <vpalatin@rivosinc.com>
>
Tested-by: Vincent Palatin <vpalatin@rivosinc.com>
* Re: [PATCH] target/riscv: Fix orc.b implementation
From: Richard Henderson @ 2021-10-13 19:57 UTC
To: Philipp Tomsich, qemu-devel; +Cc: Alistair Francis, Vincent Palatin
On 10/13/21 11:41 AM, Philipp Tomsich wrote:
> The earlier implementation fell into a corner case for bytes that were
> 0x01, giving a wrong result (but not affecting our application test
> cases for strings, as an ASCII value 0x01 is rare in those...).
>
> This changes the algorithm to:
> 1. Mask out the high-bit of each byte (so that each byte is <= 127).
> 2. Add 127 to each byte (i.e. if the low 7 bits are not 0, this will overflow
> into the highest bit of each byte).
> 3. Bitwise-or the original value back in (to cover those cases where the
> source byte was exactly 128) to saturate the high-bit.
> 4. Shift-and-mask (implemented as a mask-and-shift) to extract the MSB of
> each byte into its LSB.
> 5. Multiply with 0xff to fan out the LSB to all bits of each byte.
>
> Fixes: d7a4fcb034 ("target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci")
>
> Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> Reported-by: Vincent Palatin <vpalatin@rivosinc.com>
>
> ---
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
* Re: [PATCH] target/riscv: Fix orc.b implementation
From: Alistair Francis @ 2021-10-15 5:28 UTC
To: Philipp Tomsich
Cc: Richard Henderson, Alistair Francis,
qemu-devel@nongnu.org Developers, Vincent Palatin
On Thu, Oct 14, 2021 at 4:43 AM Philipp Tomsich
<philipp.tomsich@vrull.eu> wrote:
>
> The earlier implementation fell into a corner case for bytes that were
> 0x01, giving a wrong result (but not affecting our application test
> cases for strings, as an ASCII value 0x01 is rare in those...).
>
> This changes the algorithm to:
> 1. Mask out the high-bit of each byte (so that each byte is <= 127).
> 2. Add 127 to each byte (i.e. if the low 7 bits are not 0, this will overflow
> into the highest bit of each byte).
> 3. Bitwise-or the original value back in (to cover those cases where the
> source byte was exactly 128) to saturate the high-bit.
> 4. Shift-and-mask (implemented as a mask-and-shift) to extract the MSB of
> each byte into its LSB.
> 5. Multiply with 0xff to fan out the LSB to all bits of each byte.
>
> Fixes: d7a4fcb034 ("target/riscv: Add orc.b instruction for Zbb, removing gorc/gorci")
>
> Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> Reported-by: Vincent Palatin <vpalatin@rivosinc.com>
Thanks!
Applied to riscv-to-apply.next
Alistair