From: liweiwei <liweiwei@iscas.ac.cn>
To: Richard Henderson <richard.henderson@linaro.org>, palmer@dabbelt.com, alistair.francis@wdc.com, bin.meng@windriver.com, qemu-riscv@nongnu.org, qemu-devel@nongnu.org
Cc: wangjunqiang@iscas.ac.cn, lazyparser@gmail.com, luruibo2000@163.com, lustrew@foxmail.com
Subject: Re: [RFC 2/6] target/riscv: rvk: add implementation of instructions for Zbk* - reuse partial instructions of Zbb/Zbc extensions - add brev8 packh, unzip, zip, etc.
Date: Wed, 3 Nov 2021 08:56:03 +0800
Message-ID: <1be2c279-2d6b-1acb-c216-a598f02d43e1@iscas.ac.cn>
In-Reply-To: <5523b929-316e-a119-af1a-2a4aba4ee86d@linaro.org>

Thanks for your suggestions.

On 2021/11/2 11:44 PM, Richard Henderson wrote:
> On 11/1/21 11:11 PM, liweiwei wrote:
>> Signed-off-by: liweiwei <liweiwei@iscas.ac.cn>
>> Signed-off-by: wangjunqiang <wangjunqiang@iscas.ac.cn>
>
> You managed to get the whole patch description into the subject line.
> Please break it up.
>

OK.

>> +target_ulong HELPER(grev)(target_ulong rs1, target_ulong rs2)
>> +{
>> +    return do_grev(rs1, rs2, TARGET_LONG_BITS);
>> +}
>
> Are we expecting to see the full grev instruction at any point?  If
> not, we can certainly implement Zbk with a simpler implementation.

I added this helper mainly so that it could be reused if grev is added
to the B extension later. However, it brings no benefit at present, so
I'll replace it with a simpler implementation.
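As an illustration of what a simpler implementation could look like, here is a minimal sketch for brev8 alone (reverse the bits inside each byte), which is the grev case with shift amount 7. It is not the code from this series: the name `brev8_model` and the fixed `uint64_t` width are assumptions made so the example stands alone, whereas a QEMU helper would use `target_ulong` and the `HELPER()` macro.

```c
#include <stdint.h>

/*
 * Hypothetical standalone model of brev8: reverse the bit order within
 * every byte of x. Three swap stages (1, 2 and 4 bits) are enough, so
 * no generic grev loop over a runtime shift amount is needed.
 */
static inline uint64_t brev8_model(uint64_t x)
{
    /* swap adjacent bits */
    x = ((x >> 1) & 0x5555555555555555ULL) | ((x & 0x5555555555555555ULL) << 1);
    /* swap adjacent bit pairs */
    x = ((x >> 2) & 0x3333333333333333ULL) | ((x & 0x3333333333333333ULL) << 2);
    /* swap the two nibbles of each byte */
    x = ((x >> 4) & 0x0F0F0F0F0F0F0F0FULL) | ((x & 0x0F0F0F0F0F0F0F0FULL) << 4);
    return x;
}
```

The rev8 case needs no helper at all, since the quoted gen_grevi already maps it to tcg_gen_bswap_tl.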
>
>> +target_ulong HELPER(xperm)(target_ulong rs1, target_ulong rs2,
>> uint32_t sz_log2)
>> +{
>> +    target_ulong r = 0;
>> +    target_ulong sz = 1LL << sz_log2;
>> +    target_ulong mask = (1LL << sz) - 1;
>> +    for (int i = 0; i < TARGET_LONG_BITS; i += sz) {
>> +        target_ulong pos = ((rs2 >> i) & mask) << sz_log2;
>> +        if (pos < sizeof(target_ulong) * 8) {
>> +            r |= ((rs1 >> pos) & mask) << i;
>> +        }
>> +    }
>> +    return r;
>> +}
>
> This could become a static inline do_xperm, and provide two specific
> xperm4 and xperm8 helpers; the compiler would fold all of the sz_log2
> stuff into a more efficient implementation.

OK.

>
>> +target_ulong HELPER(unshfl)(target_ulong rs1,
>> +                            target_ulong rs2)
>> +{
>> +    target_ulong x = rs1;
>> +    int i, shift;
>> +    int bits = TARGET_LONG_BITS >> 1;
>> +    for (i = 0, shift = 1; shift < bits; i++, shift <<= 1) {
>> +        if (rs2 & shift) {
>> +            x = do_shuf_stage(x, shuf_masks[i], shuf_masks[i] >>
>> shift, shift);
>> +        }
>> +    }
>> +    return x;
>> +}
>> +
>> +target_ulong HELPER(shfl)(target_ulong rs1,
>> +                          target_ulong rs2)
>> +{
>> +    target_ulong x = rs1;
>> +    int i, shift;
>> +    shift = TARGET_LONG_BITS >> 2;
>> +    i = (shift == 8) ? 3 : 4;
>> +    for (; i >= 0; i--, shift >>= 1) {
>> +        if (rs2 & shift) {
>> +            x = do_shuf_stage(x, shuf_masks[i], shuf_masks[i] >>
>> shift, shift);
>> +        }
>> +    }
>> +    return x;
>> +}
>
> Similar comment as for grev.
>
>> +# The encoding for zext.h differs between RV32 and RV64.
>> +# zext_h_32 denotes the RV32 variant.
>> +{
>> +  zext_h_32  0000100 00000 ..... 100 ..... 0110011 @r2
>> +  pack       0000100 ..... ..... 100 ..... 0110011 @r
>> +}
>
> Note to self: improve tcg_gen_deposit to notice zeros, so that the
> more general pack compiles to zero-extension.
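The do_xperm split suggested in the review could be sketched roughly as follows. This is a standalone model under stated assumptions, not the eventual patch: it fixes the register width at 64 bits with `uint64_t` instead of `target_ulong`, and the wrapper names `xperm4_model` and `xperm8_model` are hypothetical stand-ins for what would be `HELPER(xperm4)` and `HELPER(xperm8)` in QEMU.

```c
#include <stdint.h>

/*
 * Generic lookup: every sz-bit element of rs2 is an index that selects
 * an sz-bit element of rs1; an index past the end of rs1 yields zero.
 * Each wrapper passes a constant sz_log2, so once this function is
 * inlined the compiler can fold sz, mask and the loop bounds.
 */
static inline uint64_t do_xperm(uint64_t rs1, uint64_t rs2, uint32_t sz_log2)
{
    uint64_t r = 0;
    uint32_t sz = 1u << sz_log2;           /* element width in bits */
    uint64_t mask = (1ULL << sz) - 1;      /* low sz bits set */

    for (uint32_t i = 0; i < 64; i += sz) {
        /* element index scaled to a bit position in rs1 */
        uint64_t pos = ((rs2 >> i) & mask) << sz_log2;
        if (pos < 64) {
            r |= ((rs1 >> pos) & mask) << i;
        }
    }
    return r;
}

/* Nibble-wise variant (element width 4 bits). */
static inline uint64_t xperm4_model(uint64_t rs1, uint64_t rs2)
{
    return do_xperm(rs1, rs2, 2);
}

/* Byte-wise variant (element width 8 bits). */
static inline uint64_t xperm8_model(uint64_t rs1, uint64_t rs2)
{
    return do_xperm(rs1, rs2, 3);
}
```

With two dedicated helpers, the translator no longer has to materialise the element size as a runtime TCGv_i32 argument the way the quoted gen_xperm does.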
>
>> @@ -556,6 +563,81 @@ static bool gen_unary_per_ol(DisasContext *ctx,
>> arg_r2 *a, DisasExtend ext,
>>       return gen_unary(ctx, a, ext, f_tl);
>>   }
>> +static bool gen_xperm(DisasContext *ctx, arg_r *a, int32_t size)
>> +{
>> +    TCGv dest = dest_gpr(ctx, a->rd);
>> +    TCGv src1 = get_gpr(ctx, a->rs1, EXT_NONE);
>> +    TCGv src2 = get_gpr(ctx, a->rs2, EXT_NONE);
>> +
>> +    TCGv_i32 sz = tcg_const_i32(size);
>> +    gen_helper_xperm(dest, src1, src2, sz);
>> +
>> +    gen_set_gpr(ctx, a->rd, dest);
>> +    tcg_temp_free_i32(sz);
>> +    return true;
>> +}
>> +
>> +static bool gen_grevi(DisasContext *ctx, arg_r2 *a, int shamt)
>> +{
>> +    TCGv dest = dest_gpr(ctx, a->rd);
>> +    TCGv src1 = get_gpr(ctx, a->rs1, EXT_NONE);
>> +
>> +    if (shamt == (TARGET_LONG_BITS - 8)) {
>> +        /* rev8, byte swaps */
>> +        tcg_gen_bswap_tl(dest, src1);
>> +    } else {
>> +        TCGv src2 = tcg_temp_new();
>> +        tcg_gen_movi_tl(src2, shamt);
>> +        gen_helper_grev(dest, src1, src2);
>> +        tcg_temp_free(src2);
>> +    }
>> +
>> +    gen_set_gpr(ctx, a->rd, dest);
>> +    return true;
>> +}
>> +
>> +static void gen_pack(TCGv ret, TCGv src1, TCGv src2)
>> +{
>> +    tcg_gen_deposit_tl(ret, src1, src2,
>> +                       TARGET_LONG_BITS / 2,
>> +                       TARGET_LONG_BITS / 2);
>> +}
>> +
>> +static void gen_packh(TCGv ret, TCGv src1, TCGv src2)
>> +{
>> +    TCGv t = tcg_temp_new();
>> +    tcg_gen_ext8u_tl(t, src2);
>> +    tcg_gen_deposit_tl(ret, src1, t, 8, TARGET_LONG_BITS - 8);
>> +    tcg_temp_free(t);
>> +}
>> +
>> +static void gen_packw(TCGv ret, TCGv src1, TCGv src2)
>> +{
>> +    TCGv t = tcg_temp_new();
>> +    tcg_gen_ext16s_tl(t, src2);
>> +    tcg_gen_deposit_tl(ret, src1, t, 16, 48);
>> +    tcg_temp_free(t);
>> +}
>> +
>> +static bool gen_shufi(DisasContext *ctx, arg_r2 *a, int shamt,
>> +                       void(*func)(TCGv, TCGv, TCGv))
>> +{
>> +    if (shamt >= TARGET_LONG_BITS / 2) {
>> +        return false;
>> +    }
>> +
>> +    TCGv dest = dest_gpr(ctx, a->rd);
>> +    TCGv src1 = get_gpr(ctx, a->rs1, EXT_NONE);
>> +    TCGv src2 = tcg_temp_new();
>> +
>> +    tcg_gen_movi_tl(src2, shamt);
>> +    (*func)(dest, src1, src2);
>> +
>> +    gen_set_gpr(ctx, a->rd, dest);
>> +    tcg_temp_free(src2);
>> +    return true;
>> +}
>
> All of the gen functions belong in insn_trans/trans_rvb.c.inc.

OK. I'll move them to insn_trans/trans_rvb.c.inc.

>
>
> r~
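For readers following the deposit-based pack generators quoted above, the values they compute can be modelled in plain C. The sketch below assumes a 64-bit target; `deposit_model`, `pack_model`, `packh_model` and `packw_model` are hypothetical names for this illustration and are not QEMU functions. The signed 16-bit cast in `packw_model` relies on the usual two's-complement conversion behaviour of mainstream compilers.

```c
#include <stdint.h>

/*
 * Model of a deposit operation: replace the len-bit field of base that
 * starts at bit ofs with the low len bits of field.
 */
static inline uint64_t deposit_model(uint64_t base, uint64_t field,
                                     unsigned ofs, unsigned len)
{
    uint64_t mask = (len == 64 ? ~0ULL : ((1ULL << len) - 1)) << ofs;
    return (base & ~mask) | ((field << ofs) & mask);
}

/* pack on a 64-bit target: low half of rs2 goes above low half of rs1. */
static inline uint64_t pack_model(uint64_t rs1, uint64_t rs2)
{
    return deposit_model(rs1, rs2, 32, 32);
}

/* packh: low byte of rs2 above low byte of rs1, upper bits zero. */
static inline uint64_t packh_model(uint64_t rs1, uint64_t rs2)
{
    return deposit_model(rs1, rs2 & 0xFF, 8, 56);
}

/* packw: low 16 bits of rs2 above low 16 bits of rs1, sign-extended. */
static inline uint64_t packw_model(uint64_t rs1, uint64_t rs2)
{
    uint64_t t = (uint64_t)(int64_t)(int16_t)rs2;  /* sign-extend 16 bits */
    return deposit_model(rs1, t, 16, 48);
}
```

In this model, `pack_model(x, 0)` simply clears the upper half of `x`, which is the zero-extension behaviour the "note to self" in the review refers to.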