From: Mateja Marjanovic <Mateja.Marjanovic@rt-rk.com>
To: Aleksandar Markovic <aleksandar.m.mail@gmail.com>
Cc: "QEMU Developers" <qemu-devel@nongnu.org>,
	"Aleksandar Rikalo" <arikalo@wavecomp.com>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Philippe Mathieu-Daudé" <philmd@redhat.com>,
	"Aleksandar Markovic" <amarkovic@wavecomp.com>,
	"Aurelien Jarno" <aurelien@aurel32.net>
Subject: Re: [Qemu-devel] [PATCH v6 2/4] target/mips: Optimize ILVEV.<B|H|W|D> MSA instructions
Date: Mon, 15 Apr 2019 15:48:01 +0200
Message-ID: <9ab9a488-0640-444c-a761-4e46771d1d4a@rt-rk.com>
In-Reply-To: <CAL1e-=i8US6_BnsksTFwJq-hrMN5t77ysRhPdsotsRvGch5jiw@mail.gmail.com>


On 13.4.19. 18:05, Aleksandar Markovic wrote:
> On Thu, Apr 4, 2019 at 3:18 PM Mateja Marjanovic
> <mateja.marjanovic@rt-rk.com> wrote:
>> From: Mateja Marjanovic <Mateja.Marjanovic@rt-rk.com>
>>
>> Optimize set of MSA instructions ILVEV.<B|H|W|D>, using
>> directly tcg registers and performing logic on them
>> instead of using helpers.
>>
>> In the following table, the first column is the performance
>> before this patch. The second represents the performance,
>> after converting from helpers to tcg, but without using
>> tcg_gen_deposit function. The third one is the solution
>> which is implemented in this patch.
>>
>> Performance measurement is done by executing the
>> instructions a large number of times on a computer
> What is the exact number of times?
I will state the exact number of iterations from now on.
>
>> with Intel Core i7-3770 CPU @ 3.40GHz×8.
>>
>> ============================================================
>> || instr    ||   before    || no-deposit ||  with-deposit ||
>> ============================================================
>> || ilvev.b  ||  126.92 ms  ||  24.52 ms  ||   24.43 ms    ||
>> || ilvev.h  ||   93.67 ms  ||  23.92 ms  ||   23.86 ms    ||
>> || ilvev.w  ||  117.86 ms  ||  23.83 ms  ||   22.17 ms    ||
>> || ilvev.d  ||   45.49 ms  ||  19.74 ms  ||   19.71 ms    ||
>> ============================================================
>>
> "With-deposit" for ilvev.w can't be the same as in the previous
> version (22.17 ms), since you eliminated one tcg_gen_andi_i64() in
> this version compared to the previous one. "With-deposit" for ilvev.b
> and ilvev.h also can't be the same. It looks like you just copy-pasted
> the numbers. Please retest the performance and attach the accurate
> numbers.
My mistake, I will retest and attach the accurate numbers in v7.
>
> Also, there should be five columns and their meanings should be:
>
>    - instruction
>    - before
>    - no-deposit-no-mask-as-tcg-constant
>    - with-deposit-no-mask-as-tcg-constant
>    - with-deposit-with-mask-as-tcg-constant (final)
Alright, but the deposit function and the mask as a tcg constant
are optimizations for two different cases. The deposit function
is used only for word elements, and the mask as a tcg constant
only for halfword and byte elements.
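
To illustrate the word case, here is a minimal sketch in plain C (not
QEMU code). It models one 64-bit half of the vector register as a
uint64_t and compares the and/and/shift/or sequence with a deposit of
the low 32 bits of ws into bits 32..63 of wt. The function names are
made up for this example, and the deposit is only modelled here, not
taken from the tcg implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the no-deposit sequence: two ANDs, one shift, one OR. */
static uint64_t ilvev_w_no_deposit(uint64_t ws, uint64_t wt)
{
    const uint64_t mask = 0x00000000ffffffffULL;
    uint64_t t1 = wt & mask;   /* even word of wt stays in the low half */
    uint64_t t2 = ws & mask;   /* even word of ws ... */
    t2 <<= 32;                 /* ... moves to the high half */
    return t1 | t2;
}

/*
 * Model of a deposit with offset 32 and length 32: keep the low
 * 32 bits of wt and overwrite bits 32..63 with the low 32 bits of ws.
 */
static uint64_t ilvev_w_deposit(uint64_t ws, uint64_t wt)
{
    return (wt & 0x00000000ffffffffULL) | (ws << 32);
}

/* Both models must agree for any pair of inputs. */
static void check_ilvev_w(uint64_t ws, uint64_t wt)
{
    assert(ilvev_w_no_deposit(ws, wt) == ilvev_w_deposit(ws, wt));
}
```

Both models give the same result, so the deposit only reduces the
number of generated operations for ILVEV.W. It does not help the byte
and halfword cases, where several fields per doubleword are moved at
once.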
>> No-deposit column and with-deposit column have the
>> same statistical values in every row, except ILVEV.W,
>> which is the only function which uses the deposit
>> function.
>>
>> No-deposit version of the ILVEV.W implementation:
>>
>> static inline void gen_ilvev_w(CPUMIPSState *env, uint32_t wd,
>>                                 uint32_t ws, uint32_t wt)
>> {
>>      TCGv_i64 t1 = tcg_temp_new_i64();
>>      TCGv_i64 t2 = tcg_temp_new_i64();
>>      uint64_t mask = 0x00000000ffffffffULL;
>>
>>      tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
>>      tcg_gen_andi_i64(t2, msa_wr_d[ws * 2], mask);
>>      tcg_gen_shli_i64(t2, t2, 32);
>>      tcg_gen_or_i64(msa_wr_d[wd * 2], t1, t2);
>>
>>      tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
>>      tcg_gen_andi_i64(t2, msa_wr_d[ws * 2 + 1], mask);
>>      tcg_gen_shli_i64(t2, t2, 32);
>>      tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t1, t2);
>>
>>      tcg_temp_free_i64(t1);
>>      tcg_temp_free_i64(t2);
>> }
>>
>> Suggested-by: Richard Henderson <richard.henderson@linaro.org>
> You forgot Philippe.
I will add Philippe, of course.
>
>> Signed-off-by: Mateja Marjanovic <mateja.marjanovic@rt-rk.com>
>> ---
>>   target/mips/helper.h     |   1 -
>>   target/mips/msa_helper.c |   9 -----
>>   target/mips/translate.c  | 101 ++++++++++++++++++++++++++++++++++++++++++++++-
>>   3 files changed, 100 insertions(+), 11 deletions(-)
>>
>> diff --git a/target/mips/helper.h b/target/mips/helper.h
>> index 02e16c7..82f6a40 100644
>> --- a/target/mips/helper.h
>> +++ b/target/mips/helper.h
>> @@ -864,7 +864,6 @@ DEF_HELPER_5(msa_pckev_df, void, env, i32, i32, i32, i32)
>>   DEF_HELPER_5(msa_pckod_df, void, env, i32, i32, i32, i32)
>>   DEF_HELPER_5(msa_ilvl_df, void, env, i32, i32, i32, i32)
>>   DEF_HELPER_5(msa_ilvr_df, void, env, i32, i32, i32, i32)
>> -DEF_HELPER_5(msa_ilvev_df, void, env, i32, i32, i32, i32)
>>   DEF_HELPER_5(msa_vshf_df, void, env, i32, i32, i32, i32)
>>   DEF_HELPER_5(msa_srar_df, void, env, i32, i32, i32, i32)
>>   DEF_HELPER_5(msa_srlr_df, void, env, i32, i32, i32, i32)
>> diff --git a/target/mips/msa_helper.c b/target/mips/msa_helper.c
>> index a7ea6aa..d5c3842 100644
>> --- a/target/mips/msa_helper.c
>> +++ b/target/mips/msa_helper.c
>> @@ -1197,15 +1197,6 @@ MSA_FN_DF(ilvl_df)
>>       } while (0)
>>   MSA_FN_DF(ilvr_df)
>>   #undef MSA_DO
>> -
>> -#define MSA_DO(DF)                      \
>> -    do {                                \
>> -        pwx->DF[2*i]   = pwt->DF[2*i];  \
>> -        pwx->DF[2*i+1] = pws->DF[2*i];  \
>> -    } while (0)
>> -MSA_FN_DF(ilvev_df)
>> -#undef MSA_DO
>> -
>>   #undef MSA_LOOP_COND
>>
>>   #define MSA_LOOP_COND(DF) \
>> diff --git a/target/mips/translate.c b/target/mips/translate.c
>> index df685e4..3057669 100644
>> --- a/target/mips/translate.c
>> +++ b/target/mips/translate.c
>> @@ -28973,6 +28973,90 @@ static inline void gen_ilvod_d(CPUMIPSState *env, uint32_t wd,
>>       tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2 + 1]);
>>   }
>>
>> +/*
>> + * [MSA] ILVEV.B wd, ws, wt
>> + *
>> + *   Vector Interleave Even (byte data elements)
>> + *
>> + */
>> +static inline void gen_ilvev_b(CPUMIPSState *env, uint32_t wd,
>> +                               uint32_t ws, uint32_t wt)
>> +{
>> +    TCGv_i64 t1 = tcg_temp_new_i64();
>> +    TCGv_i64 t2 = tcg_temp_new_i64();
>> +    TCGv_i64 mask = tcg_const_i64(0x00ff00ff00ff00ffULL);
>> +
>> +    tcg_gen_and_i64(t1, msa_wr_d[wt * 2], mask);
>> +    tcg_gen_and_i64(t2, msa_wr_d[ws * 2], mask);
>> +    tcg_gen_shli_i64(t2, t2, 8);
>> +    tcg_gen_or_i64(msa_wr_d[wd * 2], t1, t2);
>> +
>> +    tcg_gen_and_i64(t1, msa_wr_d[wt * 2 + 1], mask);
>> +    tcg_gen_and_i64(t2, msa_wr_d[ws * 2 + 1], mask);
>> +    tcg_gen_shli_i64(t2, t2, 8);
>> +    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t1, t2);
>> +
>> +    tcg_temp_free_i64(mask);
>> +    tcg_temp_free_i64(t1);
>> +    tcg_temp_free_i64(t2);
>> +}
>> +
>> +/*
>> + * [MSA] ILVEV.H wd, ws, wt
>> + *
>> + *   Vector Interleave Even (halfword data elements)
>> + *
>> + */
>> +static inline void gen_ilvev_h(CPUMIPSState *env, uint32_t wd,
>> +                               uint32_t ws, uint32_t wt)
>> +{
>> +    TCGv_i64 t1 = tcg_temp_new_i64();
>> +    TCGv_i64 t2 = tcg_temp_new_i64();
>> +    TCGv_i64 mask = tcg_const_i64(0x0000ffff0000ffffULL);
>> +
>> +    tcg_gen_and_i64(t1, msa_wr_d[wt * 2], mask);
>> +    tcg_gen_and_i64(t2, msa_wr_d[ws * 2], mask);
>> +    tcg_gen_shli_i64(t2, t2, 16);
>> +    tcg_gen_or_i64(msa_wr_d[wd * 2], t1, t2);
>> +
>> +    tcg_gen_and_i64(t1, msa_wr_d[wt * 2 + 1], mask);
>> +    tcg_gen_and_i64(t2, msa_wr_d[ws * 2 + 1], mask);
>> +    tcg_gen_shli_i64(t2, t2, 16);
>> +    tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t1, t2);
>> +
>> +    tcg_temp_free_i64(mask);
>> +    tcg_temp_free_i64(t1);
>> +    tcg_temp_free_i64(t2);
>> +}
> Please apply Philippe's refactoring for the preceding two functions.
I will do that in v7.
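
For reference, the byte and halfword variants differ only in the mask
and the shift amount. Below is a minimal plain-C sketch (not QEMU code)
that checks the mask/shift/or pattern against an element-wise model of
the removed helper loop, on one 64-bit half of the register. The
function names are hypothetical, and the sketch assumes element 0 is
the least significant element of the doubleword.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Element-wise model of the removed helper on one doubleword:
 *   out[2*i]   = wt[2*i]
 *   out[2*i+1] = ws[2*i]
 * 'bits' is the element width (8 or 16 in this sketch).
 */
static uint64_t ilvev_ref(uint64_t ws, uint64_t wt, unsigned bits)
{
    const uint64_t elem_mask = (1ULL << bits) - 1;
    uint64_t out = 0;

    for (unsigned pos = 0; pos < 64; pos += 2 * bits) {
        out |= ((wt >> pos) & elem_mask) << pos;           /* even slot */
        out |= ((ws >> pos) & elem_mask) << (pos + bits);  /* odd slot  */
    }
    return out;
}

/* Model of the mask/shift/or sequence used by the generated code. */
static uint64_t ilvev_masked(uint64_t ws, uint64_t wt,
                             uint64_t mask, unsigned shift)
{
    return (wt & mask) | ((ws & mask) << shift);
}

/* Compare both models for the byte and the halfword variants. */
static void check_ilvev_bh(uint64_t ws, uint64_t wt)
{
    assert(ilvev_masked(ws, wt, 0x00ff00ff00ff00ffULL, 8)
           == ilvev_ref(ws, wt, 8));
    assert(ilvev_masked(ws, wt, 0x0000ffff0000ffffULL, 16)
           == ilvev_ref(ws, wt, 16));
}
```

Since only the mask and the shift change between the two variants, a
single parameterized function can cover both.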
>
>> +
>> +/*
>> + * [MSA] ILVEV.W wd, ws, wt
>> + *
>> + *   Vector Interleave Even (word data elements)
>> + *
>> + */
>> +static inline void gen_ilvev_w(CPUMIPSState *env, uint32_t wd,
>> +                               uint32_t ws, uint32_t wt)
>> +{
>> +    tcg_gen_deposit_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2],
>> +                        msa_wr_d[ws * 2], 32, 32);
>> +    tcg_gen_deposit_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[wt * 2 + 1],
>> +                        msa_wr_d[ws * 2 + 1], 32, 32);
>> +}
>> +
>> +/*
>> + * [MSA] ILVEV.D wd, ws, wt
>> + *
>> + *   Vector Interleave Even (Doubleword data elements)
> Doubleword -> doubleword
It will be changed in v7.
>
>> + *
>> + */
>> +static inline void gen_ilvev_d(CPUMIPSState *env, uint32_t wd,
>> +                               uint32_t ws, uint32_t wt)
>> +{
>> +    tcg_gen_mov_i64(msa_wr_d[wd * 2 + 1], msa_wr_d[ws * 2]);
>> +    tcg_gen_mov_i64(msa_wr_d[wd * 2], msa_wr_d[wt * 2]);
>> +}
>> +
>>   static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
>>   {
>>   #define MASK_MSA_3R(op)    (MASK_MSA_MINOR(op) | (op & (0x7 << 23)))
>> @@ -29129,7 +29213,22 @@ static void gen_msa_3r(CPUMIPSState *env, DisasContext *ctx)
>>           gen_helper_msa_mod_s_df(cpu_env, tdf, twd, tws, twt);
>>           break;
>>       case OPC_ILVEV_df:
>> -        gen_helper_msa_ilvev_df(cpu_env, tdf, twd, tws, twt);
>> +        switch (df) {
>> +        case DF_BYTE:
>> +            gen_ilvev_b(env, wd, ws, wt);
>> +            break;
>> +        case DF_HALF:
>> +            gen_ilvev_h(env, wd, ws, wt);
>> +            break;
>> +        case DF_WORD:
>> +            gen_ilvev_w(env, wd, ws, wt);
>> +            break;
>> +        case DF_DOUBLE:
>> +            gen_ilvev_d(env, wd, ws, wt);
>> +            break;
>> +        default:
>> +            assert(0);
>> +        }
>>           break;
>>       case OPC_BINSR_df:
>>           gen_helper_msa_binsr_df(cpu_env, tdf, twd, tws, twt);
>> --
>> 2.7.4
>>
>>
> Thanks,
> Aleksandar
Thanks,
Mateja
