From: Lucas Mateus Martins Araujo e Castro <lucas.araujo@eldorado.org.br>
To: Richard Henderson <richard.henderson@linaro.org>, qemu-ppc@nongnu.org
Cc: "open list:All patches CC here" <qemu-devel@nongnu.org>,
"Greg Kurz" <groug@kaod.org>,
"Daniel Henrique Barboza" <danielhb413@gmail.com>,
"Cédric Le Goater" <clg@kaod.org>,
"David Gibson" <david@gibson.dropbear.id.au>
Subject: Re: [RFC PATCH 2/7] target/ppc: Implemented xvi*ger* instructions
Date: Wed, 27 Apr 2022 17:24:48 -0300 [thread overview]
Message-ID: <d160958f-6703-8af7-aa8f-f3843d9b1066@eldorado.org.br> (raw)
In-Reply-To: <7f3f38ab-9a15-3202-5c15-8159e95af6ab@linaro.org>
[-- Attachment #1: Type: text/plain, Size: 5246 bytes --]
On 26/04/2022 20:40, Richard Henderson wrote:
>
> On 4/26/22 05:50, Lucas Mateus Castro(alqotel) wrote:
>> +%xx_at 23:3 !function=times_4
>> +@XX3_at ...... ... .. ..... ..... ........ ... &XX3
>> xt=%xx_at xb=%xx_xb
>
> Hmm. Depends, I suppose on whether you want acc[0-7] or vsr[0-28]
I mostly used VSR function here, but since I'll change the patch 1 to
your suggestion (which will require creating acc_full_offset) I'll make
a few changes to create some functions for the accumulator
>
>> +/*
>> + * Packed VSX Integer GER Flags
>> + * 00 - no accumulation no saturation
>> + * 01 - accumulate but no saturation
>> + * 10 - no accumulation but with saturation
>> + * 11 - accumulate with saturation
>> + */
>> +static inline bool get_sat(uint32_t flags)
>> +{
>> + return flags & 0x2;
>> +}
>> +
>> +static inline bool get_acc(uint32_t flags)
>> +{
>> + return flags & 0x1;
>> +}
>
> Better to have separate helpers for these? They'd be immediate
> operands to the function
> replacing XVIGER (see below) and thus optimize well.
Do you mean different functions or a function that receives packed_flags
along with the callback functions?
>
>> +#define GET_VsrN(a, i) (extract32(a->VsrB((i) / 2), (i) % 2 ? 4 : 0,
>> 4))
>> +#define GET_VsrB(a, i) a->VsrB(i)
>> +#define GET_VsrH(a, i) a->VsrH(i)
>> +
>> +#define GET_VsrSN(a, i) (sextract32(a->VsrSB((i) / 2), (i) % 2 ? 4 :
>> 0, 4))
>> +#define GET_VsrSB(a, i) a->VsrSB(i)
>> +#define GET_VsrSH(a, i) a->VsrSH(i)
>
> These can be made into functions of the form
>
> typedef int32_t xviger_extract(ppc_vsr_t *a, int i);
>
In this case it'd be necessary to receive 2 xviger_extract functions
since XVI8GER4* multiply one value as signed and the other as unsigned
(and other integer GER treat both as signed).
An alternative would be to isolate the innermost loop into a different
function, like:
typedef int64_t do_ger(int32_t a, int32_t b, int32_t at, int32_t pmsk);
static int64_t ger_rank4(int32_t a, int32_t b, int32_t at, int32_t
mask)
{
int64_t psum = 0, i;
for (i = 0; i < 4; i++, mask >>= 1) {
if (mask & 1) {
psum += (sextract32(a, i * 8, 8)) * (extract32(b, i *
8, 8));
}
}
return psum;
}
That way we could avoid having 'rank' as a parameter, what do you think?
>
>
>> diff --git a/target/ppc/internal.h b/target/ppc/internal.h
>> index 8094e0b033..a994d98238 100644
>> --- a/target/ppc/internal.h
>> +++ b/target/ppc/internal.h
>> @@ -291,4 +291,32 @@ G_NORETURN void
>> ppc_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
>> uintptr_t retaddr);
>> #endif
>>
>> +/*
>> + * Auxiliary functions to pack/unpack masks for GER instructions.
>> + *
>> + * Packed format:
>> + * Bits 0-3: xmsk
>> + * Bits 4-7: ymsk
>> + * Bits 8-15: pmsk
>> + */
>> +static inline uint8_t ger_get_xmsk(uint32_t packed_masks)
>> +{
>> + return packed_masks & 0xF;
>> +}
>> +
>> +static inline uint8_t ger_get_ymsk(uint32_t packed_masks)
>> +{
>> + return (packed_masks >> 4) & 0xF;
>> +}
>> +
>> +static inline uint8_t ger_get_pmsk(uint32_t packed_masks)
>> +{
>> + return (packed_masks >> 8) & 0xFF;
>> +}
>> +
>> +static inline int ger_pack_masks(int pmsk, int ymsk, int xmsk)
>> +{
>> + return (pmsk & 0xFF) << 8 | (ymsk & 0xF) << 4 | (xmsk & 0xF);
>> +}
>
> Use hw/registerfields.h. C.f. PREDDESC in target/arm/internals.h.
Ok, will do
>
>> +static bool do_ger_XX3(DisasContext *ctx, arg_XX3 *a, uint32_t op,
>> + void (*helper)(TCGv_env, TCGv_i32,
>> TCGv_i32,
>> + TCGv_i32, TCGv_i32,
>> TCGv_i32))
>> +{
>> + uint32_t mask;
>> + REQUIRE_INSNS_FLAGS2(ctx, ISA310);
>> + REQUIRE_VSX(ctx);
>> + if (unlikely((a->xa / 4 == a->xt / 4) || (a->xb / 4 == a->xt /
>> 4))) {
>> + gen_invalid(ctx);
>> + return true;
>> + }
>> +
>> + mask = 0xFFFFFFFF;
>> + helper(cpu_env, tcg_constant_i32(a->xa), tcg_constant_i32(a->xb),
>> + tcg_constant_i32(a->xt), tcg_constant_i32(mask),
>> + tcg_constant_i32(op));
>> + return true;
>> +}
>
> Why are you passing register numbers instead of pointers, like
> everywhere else?
Because here we are not working only with 1 register per register
number, the ACC uses 4 and the XVF64GER* needs to use XA and XA+1, and
while VSR is an array so I could do ppc_vsr_ptr+1 I thought it was
better not to access memory I was not given a pointer to, so I passed XA
so I can request cpu_vsr_ptr(env, xa) and cpu_vsr_ptr(env, xa + 1)
>
>
> r~
--
Lucas Mateus M. Araujo e Castro
Instituto de Pesquisas ELDORADO
<https://www.eldorado.org.br/?utm_campaign=assinatura_de_e-mail&utm_medium=email&utm_source=RD+Station>
Departamento Computação Embarcada
Analista de Software Trainee
Aviso Legal - Disclaimer <https://www.eldorado.org.br/disclaimer.html>
[-- Attachment #2: Type: text/html, Size: 8574 bytes --]
next prev parent reply other threads:[~2022-04-27 20:26 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-26 12:50 [RFC PATCH 0/7] VSX MMA Implementation Lucas Mateus Castro(alqotel)
2022-04-26 12:50 ` [RFC PATCH 1/7] target/ppc: Implement xxm[tf]acc and xxsetaccz Lucas Mateus Castro(alqotel)
2022-04-26 22:59 ` Richard Henderson
2022-04-26 12:50 ` [RFC PATCH 2/7] target/ppc: Implemented xvi*ger* instructions Lucas Mateus Castro(alqotel)
2022-04-26 23:40 ` Richard Henderson
2022-04-27 20:24 ` Lucas Mateus Martins Araujo e Castro [this message]
2022-04-27 22:28 ` Richard Henderson
2022-04-26 12:50 ` [RFC PATCH 3/7] target/ppc: Implemented pmxvi*ger* instructions Lucas Mateus Castro(alqotel)
2022-04-26 12:50 ` [RFC PATCH 4/7] target/ppc: Implemented xvf*ger* Lucas Mateus Castro(alqotel)
2022-04-27 0:09 ` Richard Henderson
2022-04-26 12:50 ` [RFC PATCH 5/7] target/ppc: Implemented xvf16ger* Lucas Mateus Castro(alqotel)
2022-04-27 0:26 ` Richard Henderson
2022-04-27 21:11 ` Lucas Mateus Martins Araujo e Castro
2022-04-27 22:30 ` Richard Henderson
2022-04-26 12:50 ` [RFC PATCH 6/7] target/ppc: Implemented pmxvf*ger* Lucas Mateus Castro(alqotel)
2022-04-27 0:33 ` Richard Henderson
2022-04-26 12:50 ` [RFC PATCH 7/7] target/ppc: Implemented [pm]xvbf16ger2* Lucas Mateus Castro(alqotel)
2022-04-27 6:21 ` [RFC PATCH 0/7] VSX MMA Implementation Joel Stanley
2022-04-27 7:10 ` Cédric Le Goater
2022-05-05 6:06 ` Joel Stanley
2022-04-28 14:05 ` Lucas Mateus Martins Araujo e Castro
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d160958f-6703-8af7-aa8f-f3843d9b1066@eldorado.org.br \
--to=lucas.araujo@eldorado.org.br \
--cc=clg@kaod.org \
--cc=danielhb413@gmail.com \
--cc=david@gibson.dropbear.id.au \
--cc=groug@kaod.org \
--cc=qemu-devel@nongnu.org \
--cc=qemu-ppc@nongnu.org \
--cc=richard.henderson@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.