From: "Alex Bennée" <alex.bennee@linaro.org>
To: Richard Henderson <richard.henderson@linaro.org>
Cc: qemu-devel@nongnu.org
Subject: Re: [PATCH 11/20] tcg/i386: Implement avx512 immediate rotate
Date: Wed, 02 Feb 2022 14:05:24 +0000
Message-ID: <87wniduzko.fsf@linaro.org>
In-Reply-To: <20211218194250.247633-12-richard.henderson@linaro.org>


Richard Henderson <richard.henderson@linaro.org> writes:

> AVX512VL has VPROLD and VPROLQ, layered onto the same
> opcode as PSHIFTD, but requires EVEX encoding and W.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  tcg/i386/tcg-target.h     |  2 +-
>  tcg/i386/tcg-target.c.inc | 15 +++++++++++++--
>  2 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
> index 12d098ad6c..38c09fd66c 100644
> --- a/tcg/i386/tcg-target.h
> +++ b/tcg/i386/tcg-target.h
> @@ -195,7 +195,7 @@ extern bool have_movbe;
>  #define TCG_TARGET_HAS_not_vec          0
>  #define TCG_TARGET_HAS_neg_vec          0
>  #define TCG_TARGET_HAS_abs_vec          1
> -#define TCG_TARGET_HAS_roti_vec         0
> +#define TCG_TARGET_HAS_roti_vec         have_avx512vl
>  #define TCG_TARGET_HAS_rots_vec         0
>  #define TCG_TARGET_HAS_rotv_vec         0
>  #define TCG_TARGET_HAS_shi_vec          1
> diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
> index c4e6f2e5ea..5ab7c4c0fa 100644
> --- a/tcg/i386/tcg-target.c.inc
> +++ b/tcg/i386/tcg-target.c.inc
> @@ -361,7 +361,7 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
>  #define OPC_PSHUFLW     (0x70 | P_EXT | P_SIMDF2)
>  #define OPC_PSHUFHW     (0x70 | P_EXT | P_SIMDF3)
>  #define OPC_PSHIFTW_Ib  (0x71 | P_EXT | P_DATA16) /* /2 /6 /4 */
> -#define OPC_PSHIFTD_Ib  (0x72 | P_EXT | P_DATA16) /* /2 /6 /4 */
> +#define OPC_PSHIFTD_Ib  (0x72 | P_EXT | P_DATA16) /* /1 /2 /6 /4 */
>  #define OPC_PSHIFTQ_Ib  (0x73 | P_EXT | P_DATA16) /* /2 /6 /4 */
>  #define OPC_PSLLW       (0xf1 | P_EXT | P_DATA16)
>  #define OPC_PSLLD       (0xf2 | P_EXT | P_DATA16)
> @@ -2906,6 +2906,14 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
>              insn |= P_VEXW | P_EVEX;
>          }
>          sub = 4;
> +        goto gen_shift;
> +    case INDEX_op_rotli_vec:
> +        insn = OPC_PSHIFTD_Ib | P_EVEX;  /* VPROL[DQ] */
> +        if (vece == MO_64) {
> +            insn |= P_VEXW;
> +        }
> +        sub = 1;
> +        goto gen_shift;

This could just be a /* fall-through */, although given the number of
gotos this switch statement is accumulating I'm not sure it makes much
difference.

Is there any reason why gen_shift couldn't be pushed into a helper
function so we just had:

    static void tcg_out_vec_shift(TCGContext *s, TCGType type, unsigned vece,
                                  int insn, int sub,
                                  TCGArg a0, TCGArg a1, TCGArg a2)
    {
        tcg_debug_assert(vece != MO_8);
        if (type == TCG_TYPE_V256) {
            insn |= P_VEXL;   /* 256-bit vector length */
        }
        tcg_out_vex_modrm(s, insn, sub, a0, a1);
        tcg_out8(s, a2);      /* immediate count */
    }

    ...

    case INDEX_op_rotli_vec:
        insn = OPC_PSHIFTD_Ib | P_EVEX;  /* VPROL[DQ] */
        if (vece == MO_64) {
            insn |= P_VEXW;
        }
        tcg_out_vec_shift(s, type, vece, insn, 1, a0, a1, a2);
        break;

Surely the compiler would inline it if needed (and even if it didn't, is
the code generation so critical that we care about a few cycles)?
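
To make the shape of that refactor concrete outside the QEMU tree, here is
a standalone toy sketch. It is not QEMU code: every name in it (ToyBuf,
toy_out8, toy_out_vec_shift, toy_out_vec_op, the TOY_* constants) is
hypothetical, and the "prefix" byte is a made-up stand-in for the real
VEX/EVEX prefix. The only parts taken from the patch are the 0x72 opcode
byte, the /1, /2 and /6 sub-opcodes carried in the ModRM reg field, and
the idea of OR-ing W and L flags into insn before the shared tail runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are QEMU's real definitions. */
enum { TOY_MO_8, TOY_MO_16, TOY_MO_32, TOY_MO_64 };
enum { TOY_OP_SHLI, TOY_OP_SHRI, TOY_OP_ROTLI };

#define TOY_FLAG_W 0x100  /* plays the role of P_VEXW */
#define TOY_FLAG_L 0x200  /* plays the role of P_VEXL */

typedef struct {
    uint8_t buf[16];
    size_t len;
} ToyBuf;

static void toy_out8(ToyBuf *b, uint8_t v)
{
    assert(b->len < sizeof(b->buf));
    b->buf[b->len++] = v;
}

/*
 * Shared tail that each case would otherwise reach via "goto gen_shift".
 * Emits: toy prefix (flags + dst), opcode byte, ModRM, imm8.
 */
static void toy_out_vec_shift(ToyBuf *b, int is_256, unsigned vece,
                              int insn, int sub, int dst, int src, int imm)
{
    assert(vece != TOY_MO_8);
    if (is_256) {
        insn |= TOY_FLAG_L;
    }
    toy_out8(b, (uint8_t)(((insn >> 8) << 4) | (dst & 15)));
    toy_out8(b, (uint8_t)insn);                              /* opcode */
    toy_out8(b, (uint8_t)(0xC0 | (sub << 3) | (src & 7)));   /* reg = /sub */
    toy_out8(b, (uint8_t)imm);
}

/* The switch now calls the helper and breaks; no label, no goto. */
static void toy_out_vec_op(ToyBuf *b, int opc, int is_256, unsigned vece,
                           int dst, int src, int imm)
{
    int insn = 0x72;  /* opcode byte shared by the dword immediate group */

    switch (opc) {
    case TOY_OP_SHRI:
        assert(vece == TOY_MO_32);  /* toy limitation: dword shifts only */
        toy_out_vec_shift(b, is_256, vece, insn, 2, dst, src, imm);
        break;
    case TOY_OP_SHLI:
        assert(vece == TOY_MO_32);
        toy_out_vec_shift(b, is_256, vece, insn, 6, dst, src, imm);
        break;
    case TOY_OP_ROTLI:
        if (vece == TOY_MO_64) {
            insn |= TOY_FLAG_W;     /* qword form selected by W */
        }
        toy_out_vec_shift(b, is_256, vece, insn, 1, dst, src, imm);
        break;
    default:
        assert(0);
    }
}
```

Each case sets up its own insn and sub-opcode and then hands off to the
helper, so the control flow stays local to the case; whether that is worth
the churn in the real switch is of course the maintainer's call.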


-- 
Alex Bennée


