All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lucas Mateus Martins Araujo e Castro <lucas.araujo@eldorado.org.br>
To: Richard Henderson <richard.henderson@linaro.org>, qemu-ppc@nongnu.org
Cc: "open list:All patches CC here" <qemu-devel@nongnu.org>,
	"Greg Kurz" <groug@kaod.org>,
	"Daniel Henrique Barboza" <danielhb413@gmail.com>,
	"Cédric Le Goater" <clg@kaod.org>,
	"David Gibson" <david@gibson.dropbear.id.au>
Subject: Re: [RFC PATCH 2/7] target/ppc: Implemented xvi*ger* instructions
Date: Wed, 27 Apr 2022 17:24:48 -0300	[thread overview]
Message-ID: <d160958f-6703-8af7-aa8f-f3843d9b1066@eldorado.org.br> (raw)
In-Reply-To: <7f3f38ab-9a15-3202-5c15-8159e95af6ab@linaro.org>

[-- Attachment #1: Type: text/plain, Size: 5246 bytes --]


On 26/04/2022 20:40, Richard Henderson wrote:
>
> On 4/26/22 05:50, Lucas Mateus Castro(alqotel) wrote:
>> +%xx_at          23:3 !function=times_4
>> +@XX3_at         ...... ... .. ..... ..... ........ ... &XX3 
>> xt=%xx_at xb=%xx_xb
>
> Hmm.  Depends, I suppose on whether you want acc[0-7] or vsr[0-28]
I mostly used VSR function here, but since I'll change the patch 1 to 
your suggestion (which will require creating acc_full_offset) I'll make 
a few changes to create some functions for the accumulator
>
>> +/*
>> + * Packed VSX Integer GER Flags
>> + * 00 - no accumulation no saturation
>> + * 01 - accumulate but no saturation
>> + * 10 - no accumulation but with saturation
>> + * 11 - accumulate with saturation
>> + */
>> +static inline bool get_sat(uint32_t flags)
>> +{
>> +    return flags & 0x2;
>> +}
>> +
>> +static inline bool get_acc(uint32_t flags)
>> +{
>> +    return flags & 0x1;
>> +}
>
> Better to have separate helpers for these?  They'd be immediate 
> operands to the function
> replacing XVIGER (see below) and thus optimize well.
Do you mean different functions or a function that receives packed_flags 
along with the callback functions?
>
>> +#define GET_VsrN(a, i) (extract32(a->VsrB((i) / 2), (i) % 2 ? 4 : 0, 
>> 4))
>> +#define GET_VsrB(a, i) a->VsrB(i)
>> +#define GET_VsrH(a, i) a->VsrH(i)
>> +
>> +#define GET_VsrSN(a, i) (sextract32(a->VsrSB((i) / 2), (i) % 2 ? 4 : 
>> 0, 4))
>> +#define GET_VsrSB(a, i) a->VsrSB(i)
>> +#define GET_VsrSH(a, i) a->VsrSH(i)
>
> These can be made into functions of the form
>
>     typedef int32_t xviger_extract(ppc_vsr_t *a, int i);
>
In this case it'd be necessary to receive 2 xviger_extract functions 
since XVI8GER4* multiply one value as signed and the other as unsigned 
(and other integer GER treat both as signed).

An alternative would be to isolate the innermost loop into a different 
function, like:

     typedef int64_t do_ger(int32_t a, int32_t b, int32_t at, int32_t pmsk);

     static int64_t ger_rank4(int32_t a, int32_t b, int32_t at, int32_t 
mask)
     {
         int64_t psum = 0, i;
         for (i = 0; i < 4; i++, mask >>= 1) {
             if (mask & 1) {
                 psum += (sextract32(a, i * 8, 8)) * (extract32(b, i * 
8, 8));
            }
         }
         return psum;
     }

That way we could avoid having 'rank' as a parameter, what do you think?

>
>
>> diff --git a/target/ppc/internal.h b/target/ppc/internal.h
>> index 8094e0b033..a994d98238 100644
>> --- a/target/ppc/internal.h
>> +++ b/target/ppc/internal.h
>> @@ -291,4 +291,32 @@ G_NORETURN void 
>> ppc_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
>>                                               uintptr_t retaddr);
>>   #endif
>>
>> +/*
>> + * Auxiliary functions to pack/unpack masks for GER instructions.
>> + *
>> + * Packed format:
>> + *  Bits 0-3: xmsk
>> + *  Bits 4-7: ymsk
>> + *  Bits 8-15: pmsk
>> + */
>> +static inline uint8_t ger_get_xmsk(uint32_t packed_masks)
>> +{
>> +    return packed_masks & 0xF;
>> +}
>> +
>> +static inline uint8_t ger_get_ymsk(uint32_t packed_masks)
>> +{
>> +    return (packed_masks >> 4) & 0xF;
>> +}
>> +
>> +static inline uint8_t ger_get_pmsk(uint32_t packed_masks)
>> +{
>> +    return (packed_masks >> 8) & 0xFF;
>> +}
>> +
>> +static inline int ger_pack_masks(int pmsk, int ymsk, int xmsk)
>> +{
>> +    return (pmsk & 0xFF) << 8 | (ymsk & 0xF) << 4 | (xmsk & 0xF);
>> +}
>
> Use hw/registerfields.h.  C.f. PREDDESC in target/arm/internals.h.
Ok, will do
>
>> +static bool do_ger_XX3(DisasContext *ctx, arg_XX3 *a, uint32_t op,
>> +                             void (*helper)(TCGv_env, TCGv_i32, 
>> TCGv_i32,
>> +                                            TCGv_i32, TCGv_i32, 
>> TCGv_i32))
>> +{
>> +    uint32_t mask;
>> +    REQUIRE_INSNS_FLAGS2(ctx, ISA310);
>> +    REQUIRE_VSX(ctx);
>> +    if (unlikely((a->xa / 4 == a->xt / 4) || (a->xb / 4 == a->xt / 
>> 4))) {
>> +        gen_invalid(ctx);
>> +        return true;
>> +    }
>> +
>> +    mask = 0xFFFFFFFF;
>> +    helper(cpu_env, tcg_constant_i32(a->xa), tcg_constant_i32(a->xb),
>> +           tcg_constant_i32(a->xt), tcg_constant_i32(mask),
>> +           tcg_constant_i32(op));
>> +    return true;
>> +}
>
> Why are you passing register numbers instead of pointers, like 
> everywhere else?
Because here we are not working only with 1 register per register 
number, the ACC uses 4 and the XVF64GER* needs to use XA and XA+1, and 
while VSR is an array so I could do ppc_vsr_ptr+1 I thought it was 
better not to access memory I was not given a pointer to, so I passed XA 
so I can request cpu_vsr_ptr(env, xa) and cpu_vsr_ptr(env, xa + 1)
>
>
> r~
-- 
Lucas Mateus M. Araujo e Castro
Instituto de Pesquisas ELDORADO 
<https://www.eldorado.org.br/?utm_campaign=assinatura_de_e-mail&utm_medium=email&utm_source=RD+Station>
Departamento Computação Embarcada
Analista de Software Trainee
Aviso Legal - Disclaimer <https://www.eldorado.org.br/disclaimer.html>

[-- Attachment #2: Type: text/html, Size: 8574 bytes --]

  reply	other threads:[~2022-04-27 20:26 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-26 12:50 [RFC PATCH 0/7] VSX MMA Implementation Lucas Mateus Castro(alqotel)
2022-04-26 12:50 ` [RFC PATCH 1/7] target/ppc: Implement xxm[tf]acc and xxsetaccz Lucas Mateus Castro(alqotel)
2022-04-26 22:59   ` Richard Henderson
2022-04-26 12:50 ` [RFC PATCH 2/7] target/ppc: Implemented xvi*ger* instructions Lucas Mateus Castro(alqotel)
2022-04-26 23:40   ` Richard Henderson
2022-04-27 20:24     ` Lucas Mateus Martins Araujo e Castro [this message]
2022-04-27 22:28       ` Richard Henderson
2022-04-26 12:50 ` [RFC PATCH 3/7] target/ppc: Implemented pmxvi*ger* instructions Lucas Mateus Castro(alqotel)
2022-04-26 12:50 ` [RFC PATCH 4/7] target/ppc: Implemented xvf*ger* Lucas Mateus Castro(alqotel)
2022-04-27  0:09   ` Richard Henderson
2022-04-26 12:50 ` [RFC PATCH 5/7] target/ppc: Implemented xvf16ger* Lucas Mateus Castro(alqotel)
2022-04-27  0:26   ` Richard Henderson
2022-04-27 21:11     ` Lucas Mateus Martins Araujo e Castro
2022-04-27 22:30       ` Richard Henderson
2022-04-26 12:50 ` [RFC PATCH 6/7] target/ppc: Implemented pmxvf*ger* Lucas Mateus Castro(alqotel)
2022-04-27  0:33   ` Richard Henderson
2022-04-26 12:50 ` [RFC PATCH 7/7] target/ppc: Implemented [pm]xvbf16ger2* Lucas Mateus Castro(alqotel)
2022-04-27  6:21 ` [RFC PATCH 0/7] VSX MMA Implementation Joel Stanley
2022-04-27  7:10   ` Cédric Le Goater
2022-05-05  6:06     ` Joel Stanley
2022-04-28 14:05 ` Lucas Mateus Martins Araujo e Castro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d160958f-6703-8af7-aa8f-f3843d9b1066@eldorado.org.br \
    --to=lucas.araujo@eldorado.org.br \
    --cc=clg@kaod.org \
    --cc=danielhb413@gmail.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=groug@kaod.org \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-ppc@nongnu.org \
    --cc=richard.henderson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.