From: liweiwei <liweiwei@iscas.ac.cn>
To: Richard Henderson <richard.henderson@linaro.org>,
	palmer@dabbelt.com, alistair.francis@wdc.com,
	bin.meng@windriver.com, qemu-riscv@nongnu.org,
	qemu-devel@nongnu.org
Cc: wangjunqiang@iscas.ac.cn, lazyparser@gmail.com,
	luruibo2000@163.com, lustrew@foxmail.com
Subject: Re: [RFC 2/6] target/riscv: rvk: add implementation of instructions for Zbk* - reuse partial instructions of Zbb/Zbc extensions - add brev8 packh, unzip, zip, etc.
Date: Wed, 3 Nov 2021 08:56:03 +0800
Message-ID: <1be2c279-2d6b-1acb-c216-a598f02d43e1@iscas.ac.cn>
In-Reply-To: <5523b929-316e-a119-af1a-2a4aba4ee86d@linaro.org>

Thanks for your suggestions.

On 2021/11/2 11:44 PM, Richard Henderson wrote:
> On 11/1/21 11:11 PM, liweiwei wrote:
>> Signed-off-by: liweiwei <liweiwei@iscas.ac.cn>
>> Signed-off-by: wangjunqiang <wangjunqiang@iscas.ac.cn>
>
> You managed to get the whole patch description into the subject line.
> Please break it up.
>
OK.
>> +target_ulong HELPER(grev)(target_ulong rs1, target_ulong rs2)
>> +{
>> +    return do_grev(rs1, rs2, TARGET_LONG_BITS);
>> +}
>
> Are we expecting to see the full grev instruction at any point? If 
> not, we can certainly implement Zbk with a simpler implementation.
My main reason for adding this helper was that grev may be added to the
B extension later, in which case it could be reused. However, it has no
benefit at present, so I'll replace it with a simpler implementation.
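
As an illustration of what a simpler implementation might look like, here
is a minimal standalone sketch of brev8 (reverse the bits within each
byte), which is the only grev variant besides rev8 that Zbkb needs. The
name brev8_sketch is hypothetical, and uint64_t stands in for
target_ulong; this is not the code from the patch.

```c
#include <stdint.h>

/* Reverse the bit order inside every byte of x (the brev8 operation). */
static uint64_t brev8_sketch(uint64_t x)
{
    /* Swap adjacent bits, then adjacent bit pairs, then nibbles. */
    x = ((x & 0x5555555555555555ULL) << 1) | ((x >> 1) & 0x5555555555555555ULL);
    x = ((x & 0x3333333333333333ULL) << 2) | ((x >> 2) & 0x3333333333333333ULL);
    x = ((x & 0x0F0F0F0F0F0F0F0FULL) << 4) | ((x >> 4) & 0x0F0F0F0F0F0F0F0FULL);
    return x;
}
```

Three fixed swap stages replace the data-dependent loop of a general
grev, so no shift-amount operand is needed.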
>
>> +target_ulong HELPER(xperm)(target_ulong rs1, target_ulong rs2, 
>> uint32_t sz_log2)
>> +{
>> +    target_ulong r = 0;
>> +    target_ulong sz = 1LL << sz_log2;
>> +    target_ulong mask = (1LL << sz) - 1;
>> +    for (int i = 0; i < TARGET_LONG_BITS; i += sz) {
>> +        target_ulong pos = ((rs2 >> i) & mask) << sz_log2;
>> +        if (pos < sizeof(target_ulong) * 8) {
>> +            r |= ((rs1 >> pos) & mask) << i;
>> +        }
>> +    }
>> +    return r;
>> +}
>
> This could become a static inline do_xperm, and provide two specific 
> xperm4 and xperm8 helpers; the compiler would fold all of the sz_log2 
> stuff into a more efficient implementation.
OK.
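
A rough sketch of the suggested split, assuming a 64-bit register width:
do_xperm is the shared static inline, and the two wrappers pass a
constant sz_log2 so the compiler can fold the size arithmetic. The
wrapper names xperm4_sketch and xperm8_sketch are hypothetical, and
uint64_t stands in for target_ulong.

```c
#include <stdint.h>

/* Generic lookup: each sz-bit field of rs2 selects an sz-bit field of rs1. */
static inline uint64_t do_xperm(uint64_t rs1, uint64_t rs2, unsigned sz_log2)
{
    const unsigned sz = 1u << sz_log2;
    const uint64_t mask = (1ULL << sz) - 1;
    uint64_t r = 0;

    for (unsigned i = 0; i < 64; i += sz) {
        /* Bit position in rs1 of the field selected by this index. */
        uint64_t pos = ((rs2 >> i) & mask) << sz_log2;
        /* Out-of-range indexes leave the result field as zero. */
        if (pos < 64) {
            r |= ((rs1 >> pos) & mask) << i;
        }
    }
    return r;
}

/* Nibble-wise lookup: sz_log2 is the constant 2. */
static uint64_t xperm4_sketch(uint64_t rs1, uint64_t rs2)
{
    return do_xperm(rs1, rs2, 2);
}

/* Byte-wise lookup: sz_log2 is the constant 3. */
static uint64_t xperm8_sketch(uint64_t rs1, uint64_t rs2)
{
    return do_xperm(rs1, rs2, 3);
}
```

With this shape the translator no longer needs to pass a size operand to
the helper at run time.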
>
>> +target_ulong HELPER(unshfl)(target_ulong rs1,
>> +                            target_ulong rs2)
>> +{
>> +    target_ulong x = rs1;
>> +    int i, shift;
>> +    int bits = TARGET_LONG_BITS >> 1;
>> +    for (i = 0, shift = 1; shift < bits; i++, shift <<= 1) {
>> +        if (rs2 & shift) {
>> +            x = do_shuf_stage(x, shuf_masks[i], shuf_masks[i] >> 
>> shift, shift);
>> +        }
>> +    }
>> +    return x;
>> +}
>> +
>> +target_ulong HELPER(shfl)(target_ulong rs1,
>> +                          target_ulong rs2)
>> +{
>> +    target_ulong x = rs1;
>> +    int i, shift;
>> +    shift = TARGET_LONG_BITS >> 2;
>> +    i = (shift == 8) ? 3 : 4;
>> +    for (; i >= 0; i--, shift >>= 1) {
>> +        if (rs2 & shift) {
>> +            x = do_shuf_stage(x, shuf_masks[i], shuf_masks[i] >> 
>> shift, shift);
>> +        }
>> +    }
>> +    return x;
>> +}
>
> Similar comment as for grev.
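
For reference, Zbkb only needs the full-width zip and unzip forms of
shfl/unshfl, and those exist on RV32 only. A minimal bit-by-bit sketch of
their semantics follows; the function names are hypothetical and the
loops are written for clarity rather than speed.

```c
#include <stdint.h>

/* zip: bit i of the low half goes to bit 2i, bit i of the high half to 2i+1. */
static uint32_t zip32_sketch(uint32_t x)
{
    uint32_t r = 0;
    for (int i = 0; i < 16; i++) {
        r |= ((x >> i) & 1u) << (2 * i);
        r |= ((x >> (i + 16)) & 1u) << (2 * i + 1);
    }
    return r;
}

/* unzip: the inverse, gathering even bits low and odd bits high. */
static uint32_t unzip32_sketch(uint32_t x)
{
    uint32_t r = 0;
    for (int i = 0; i < 16; i++) {
        r |= ((x >> (2 * i)) & 1u) << i;
        r |= ((x >> (2 * i + 1)) & 1u) << (i + 16);
    }
    return r;
}
```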
>
>> +# The encoding for zext.h differs between RV32 and RV64.
>> +# zext_h_32 denotes the RV32 variant.
>> +{
>> +  zext_h_32  0000100 00000 ..... 100 ..... 0110011 @r2
>> +  pack       0000100 ..... ..... 100 ..... 0110011 @r
>> +}
>
> Note to self: improve tcg_gen_deposit to notice zeros, so that the 
> more general pack compiles to zero-extension.
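
To make the relationship concrete: on RV32, zext.h shares its encoding
with pack when rs2 is x0, so a pack with a zero second operand is a
16-bit zero-extension. The sketch below models the deposit operation in
plain C; deposit32_sketch and pack32_sketch are hypothetical names and
this is not the TCG implementation.

```c
#include <stdint.h>

/* Replace len bits of base, starting at bit ofs, with the low len bits of val. */
static uint32_t deposit32_sketch(uint32_t base, uint32_t val,
                                 unsigned ofs, unsigned len)
{
    uint32_t field = (len == 32) ? ~0u : ((1u << len) - 1);
    uint32_t mask = field << ofs;
    return (base & ~mask) | ((val << ofs) & mask);
}

/* RV32 pack: low half of rs1 stays low, low half of rs2 becomes the high half. */
static uint32_t pack32_sketch(uint32_t rs1, uint32_t rs2)
{
    return deposit32_sketch(rs1, rs2, 16, 16);
}
```

Calling pack32_sketch(x, 0) clears the upper 16 bits of x, which is the
zero-extension a deposit of a known-zero value could be reduced to.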
>
>> @@ -556,6 +563,81 @@ static bool gen_unary_per_ol(DisasContext *ctx, 
>> arg_r2 *a, DisasExtend ext,
>>       return gen_unary(ctx, a, ext, f_tl);
>>   }
>>   +static bool gen_xperm(DisasContext *ctx, arg_r *a, int32_t size)
>> +{
>> +    TCGv dest = dest_gpr(ctx, a->rd);
>> +    TCGv src1 = get_gpr(ctx, a->rs1, EXT_NONE);
>> +    TCGv src2 = get_gpr(ctx, a->rs2, EXT_NONE);
>> +
>> +    TCGv_i32 sz = tcg_const_i32(size);
>> +    gen_helper_xperm(dest, src1, src2, sz);
>> +
>> +    gen_set_gpr(ctx, a->rd, dest);
>> +    tcg_temp_free_i32(sz);
>> +    return true;
>> +}
>> +
>> +static bool gen_grevi(DisasContext *ctx, arg_r2 *a, int shamt)
>> +{
>> +    TCGv dest = dest_gpr(ctx, a->rd);
>> +    TCGv src1 = get_gpr(ctx, a->rs1, EXT_NONE);
>> +
>> +    if (shamt == (TARGET_LONG_BITS - 8)) {
>> +        /* rev8, byte swaps */
>> +        tcg_gen_bswap_tl(dest, src1);
>> +    } else {
>> +        TCGv src2 = tcg_temp_new();
>> +        tcg_gen_movi_tl(src2, shamt);
>> +        gen_helper_grev(dest, src1, src2);
>> +        tcg_temp_free(src2);
>> +    }
>> +
>> +    gen_set_gpr(ctx, a->rd, dest);
>> +    return true;
>> +}
>> +
>> +static void gen_pack(TCGv ret, TCGv src1, TCGv src2)
>> +{
>> +    tcg_gen_deposit_tl(ret, src1, src2,
>> +                       TARGET_LONG_BITS / 2,
>> +                       TARGET_LONG_BITS / 2);
>> +}
>> +
>> +static void gen_packh(TCGv ret, TCGv src1, TCGv src2)
>> +{
>> +    TCGv t = tcg_temp_new();
>> +    tcg_gen_ext8u_tl(t, src2);
>> +    tcg_gen_deposit_tl(ret, src1, t, 8, TARGET_LONG_BITS - 8);
>> +    tcg_temp_free(t);
>> +}
>> +
>> +static void gen_packw(TCGv ret, TCGv src1, TCGv src2)
>> +{
>> +    TCGv t = tcg_temp_new();
>> +    tcg_gen_ext16s_tl(t, src2);
>> +    tcg_gen_deposit_tl(ret, src1, t, 16, 48);
>> +    tcg_temp_free(t);
>> +}
>> +
>> +static bool gen_shufi(DisasContext *ctx, arg_r2 *a, int shamt,
>> +                       void(*func)(TCGv, TCGv, TCGv))
>> +{
>> +    if (shamt >= TARGET_LONG_BITS / 2) {
>> +        return false;
>> +    }
>> +
>> +    TCGv dest = dest_gpr(ctx, a->rd);
>> +    TCGv src1 = get_gpr(ctx, a->rs1, EXT_NONE);
>> +    TCGv src2 = tcg_temp_new();
>> +
>> +    tcg_gen_movi_tl(src2, shamt);
>> +    (*func)(dest, src1, src2);
>> +
>> +    gen_set_gpr(ctx, a->rd, dest);
>> +    tcg_temp_free(src2);
>> +    return true;
>> +}
>
> All of the gen functions belong in insn_trans/trans_rvb.c.inc.
OK. I'll move them to insn_trans/trans_rvb.c.inc.
>
>
> r~



